UNICA IRIS Institutional Research Information System

A Fourier spectral pattern analysis to design credit scoring models

Saia, Roberto;Carta, Salvatore

2017-01-01

Abstract

The growth of consumer credit has made it necessary to research increasingly effective credit scoring models. Such models are usually trained on past loan applications and then evaluate new ones on the basis of certain criteria. Although the state of the art offers several approaches to defining them, the task remains hard for several reasons. The most important are the imbalance between default and non-default cases, which reduces the effectiveness of almost all techniques, and data heterogeneity, which makes it difficult to define a model able to evaluate all new loan applications effectively. The approach proposed in this paper addresses both problems by moving the evaluation process from the canonical time domain to the frequency domain, using a model based on past non-default loan applications. Because it exploits only one class of data, it overcomes the data imbalance problem, and it yields a model that is less influenced by data heterogeneity. The experiments show interesting results: the proposed approach achieves performance close to or better than that of one of the best state-of-the-art approaches to credit scoring, random forests, even though it operates proactively, exploiting only past non-default cases.
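The idea of scoring in the frequency domain from non-default cases only can be illustrated with a small sketch. The abstract does not give the procedure, so everything below is an assumption made for illustration and not the authors' method: each application is treated as a numeric feature vector, its pattern is the magnitude of its discrete Fourier transform, patterns are compared by cosine similarity, and the acceptance threshold is the mean pairwise similarity among past non-default cases. The function names (`magnitude_spectrum`, `build_model`, `is_accepted`) and the sample data are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(values):
    """Magnitudes of the discrete Fourier transform of a feature vector."""
    n = len(values)
    return [
        abs(sum(v * cmath.exp(-2j * math.pi * k * t / n)
                for t, v in enumerate(values)))
        for k in range(n)
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_model(non_default_rows):
    """Build reference spectra and a threshold from non-default cases only.

    Assumption: the threshold is the mean pairwise similarity of the
    reference spectra. At least two rows are required.
    """
    spectra = [magnitude_spectrum(row) for row in non_default_rows]
    sims = [
        cosine_similarity(spectra[i], spectra[j])
        for i in range(len(spectra))
        for j in range(i + 1, len(spectra))
    ]
    return spectra, sum(sims) / len(sims)


def mean_similarity(row, spectra):
    """Average spectral similarity between a new row and the reference spectra."""
    s = magnitude_spectrum(row)
    return sum(cosine_similarity(s, ref) for ref in spectra) / len(spectra)


def is_accepted(row, spectra, threshold):
    """Accept a new application if it resembles past non-default cases closely enough."""
    return mean_similarity(row, spectra) >= threshold


# Hypothetical past non-default applications (already numeric).
past = [[1, 2, 3, 4], [2, 3, 4, 5], [1, 2, 3, 5]]
spectra, threshold = build_model(past)
print(is_accepted([2, 4, 6, 8], spectra, threshold))
print(is_accepted([5, 0, 5, 0], spectra, threshold))
```

In this toy setup no default case is needed to build the model, which is the property the abstract relies on to sidestep class imbalance. A real feature set would also need consistent ordering and scaling before the transform is applied.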


Year: 2017
ISBN: 9781450352437
Keywords: Business intelligence; Classification; Credit scoring; Imbalanced datasets; Metrics; Human-computer interaction; Computer networks and communications; Software
Type: 4.1 Conference proceedings contribution

Files in this item:

File	Size	Format
iml2017.pdf (post-print version, AAM; archive managers only)	238.68 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/257023
