From Payment Services Directive 2 (PSD2) to Credit Scoring: A Case Study on an Italian Banking Institution
Saia R.; Giuliani A.; Pompianu L.; Carta S.
2021-01-01
Abstract
The Payment Services Directive 2 (PSD2), recently issued by the European Union, allows banks to share their customers' data if the customers authorize the operation. On the one hand, this opportunity offers interesting prospects to financial operators, allowing them to evaluate customers' reliability (Credit Scoring) even in the absence of the canonical information typically used (e.g., age, current job, total income, or previous loans). On the other hand, state-of-the-art approaches and strategies still train their Credit Scoring models using the canonical information. This scenario is further worsened by the scarcity of proper datasets for research purposes and by the class imbalance between reliable and unreliable cases, which undermines the reliability of the classification models trained on this information. This work experimentally investigates the possibility of defining a Credit Scoring model based on a customer's bank transactions instead of the canonical information, compares the performance of the two models (canonical and transaction-based), and proposes an approach to improve the performance of the transaction-based model. The results show the feasibility of a Credit Scoring model based only on banking transactions, and the possibility of improving its performance by introducing simple meta-features.