A Fourier spectral pattern analysis to design credit scoring models
Saia, Roberto; Carta, Salvatore
2017-01-01
Abstract
The growth of consumer credit has made it necessary to develop increasingly effective credit scoring models. Such models are usually trained on past loan applications and evaluate new ones on the basis of certain criteria. Although the state of the art offers several approaches for defining them, the task remains challenging for several reasons. The most important are the data imbalance between default and non-default cases, which reduces the effectiveness of almost all techniques, and data heterogeneity, which makes it difficult to define a model able to evaluate all new loan applications effectively. The approach proposed in this paper addresses these problems by moving the evaluation process from the canonical time domain to the frequency domain, using a model based on past non-default loan applications. This overcomes the data imbalance problem by exploiting only one class of data, and yields a model that is less influenced by data heterogeneity. The experiments show interesting results: the proposed approach achieves performance close to or better than that of random forests, one of the best state-of-the-art approaches to credit scoring, even though it operates proactively, exploiting only past non-default cases.
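The idea of evaluating applications in the frequency domain against a model built only from non-default cases can be sketched roughly as follows. The abstract does not specify how the transform is applied, which similarity measure is used, or how the decision is taken, so everything here is an assumption made for illustration: each feature vector is read as a short sequence, its FFT magnitude is used as the spectral pattern, patterns are compared by cosine similarity, and a fixed threshold decides the outcome. The function names (`spectrum`, `mean_similarity`, `is_reliable`), the toy data, and the threshold value are hypothetical.

```python
import numpy as np


def spectrum(features):
    """Magnitude spectrum of one feature vector, read as a sequence (assumption)."""
    return np.abs(np.fft.rfft(np.asarray(features, dtype=float)))


def mean_similarity(candidate, non_default_cases):
    """Average cosine similarity between the candidate's spectrum and the
    spectrum of each past non-default application."""
    c = spectrum(candidate)
    sims = []
    for row in non_default_cases:
        r = spectrum(row)
        denom = np.linalg.norm(c) * np.linalg.norm(r)
        # Guard against all-zero vectors, whose spectrum has zero norm.
        sims.append(float(c @ r / denom) if denom > 0 else 0.0)
    return float(np.mean(sims))


def is_reliable(candidate, non_default_cases, threshold):
    """Accept the candidate when its spectral pattern is close enough, on
    average, to the patterns of past non-default applications."""
    return mean_similarity(candidate, non_default_cases) >= threshold


# Made-up, already normalized feature vectors; not data from the paper.
past_non_default = np.array([
    [0.20, 0.80, 0.30, 0.70],
    [0.25, 0.75, 0.35, 0.65],
    [0.15, 0.85, 0.25, 0.75],
])
similar = [0.22, 0.78, 0.32, 0.68]
different = [0.90, 0.10, 0.95, 0.05]

score_similar = mean_similarity(similar, past_non_default)
score_different = mean_similarity(different, past_non_default)
```

Because only the non-default class is used to build the reference patterns, no default examples are needed at training time, which is the property the abstract relies on to sidestep class imbalance. In practice the threshold would have to be tuned on held-out data rather than fixed by hand.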