A combined entropy-based approach for a proactive credit scoring

Carta, Salvatore Mario; Castelo Branco Ferreira, Anselmo; Reforgiato Recupero, Diego Angelo Gaetano; Saia, Roberto
2020-01-01

Abstract

Lenders, such as credit card companies and banks, use credit scores to evaluate the risk posed by lending money to consumers and thereby mitigate losses due to bad debt. Within the financial technology domain, an ideal approach should operate proactively, without needing to know the behavior of non-reliable users. In practice, this does not happen, because the most widely used techniques need to train their models on both reliable and non-reliable data in order to classify new samples. Such a scenario can be affected by the cold-start problem, in which non-reliable examples are scarce or entirely absent from the dataset, and it is further worsened by the potentially unbalanced distribution of the data, which reduces classification performance. In this paper, we overcome these issues by proposing a proactive approach based on a combined entropy-based method that is trained considering only reliable cases and the sample under investigation. Experiments on different real-world datasets show performance competitive with several state-of-the-art approaches that use the entire dataset of reliable and unreliable cases.
2020
Business intelligence; Credit scoring; Data mining; Entropy; FinTech; Trust management
Files in this item:
preprint.pdf (pre-print version; Adobe PDF; 475.08 kB; access restricted to archive managers)

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/278985
Citations
  • PMC: n/a
  • Scopus: 26
  • ISI: 18