Credit Default Risk Analysis Using Machine Learning Algorithms with Hyperparameter Optimization

Juan Inga; Erwin Sacoto-Cabrera

doi:10.1007/978-3-031-24327-1_8

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Machine learning models are an important tool: they provide a scientific method to identify potential debtors early and to predict which clients are more likely to default on their debts, improving the accuracy of credit risk assessment in financial companies. The purpose of this study was to analyze the performance of three gradient boosting algorithms (CatBoost, LightGBM, and XGBoost) in predicting customer default risk, and the ability of the RandomUnderSampler sampling technique to address imbalanced credit risk classes. An exploratory analysis of the data set was carried out first, followed by data preprocessing and, finally, training with hyperparameter tuning using the GridSearchCV method, with the aim of identifying the largest number of clients with credit risk. The models are evaluated on a consumer credit data set using the metrics of sensitivity, specificity, and precision. Among the algorithms compared, XGBoost outperformed the LightGBM and CatBoost models. The experimental results confirmed that the XGBoost model performs better for credit risk prediction with historical data.
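The two building blocks named in the abstract that are easiest to show in isolation are random undersampling of the majority class and the three evaluation metrics. The following is a minimal, standard-library-only sketch of those ideas, not the authors' code: the function names `random_undersample` and `binary_metrics`, the toy labels, and the choice of label 1 as the "default" class are all assumptions made for illustration. The study itself reports using RandomUnderSampler and GridSearchCV, which are not reproduced here.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows so every class is reduced to the smallest class size.

    Hypothetical helper illustrating the idea behind random undersampling.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample without replacement so no row is duplicated.
        for row in rng.sample(members, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def binary_metrics(y_true, y_pred, positive=1):
    """Compute sensitivity, specificity and precision from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of defaulters that were caught
        "specificity": ratio(tn, tn + fp),  # share of non-defaulters cleared
        "precision": ratio(tp, tp + fp),    # share of flagged clients who default
    }


if __name__ == "__main__":
    # Toy imbalanced data: 3 defaulters (1) and 7 non-defaulters (0).
    rows = list(range(10))
    labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    _, balanced_labels = random_undersample(rows, labels, seed=1)
    print(Counter(balanced_labels))

    # Hypothetical predictions for the same 10 clients.
    predictions = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    print(binary_metrics(labels, predictions))
```

Sensitivity is the metric most directly tied to the stated goal of identifying the largest number of clients with credit risk, since it measures the fraction of actual defaulters that the model flags.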

Original language	English
Title of host publication	Intelligent Technologies
Subtitle of host publication	Design and Applications for Society - Proceedings of CITIS 2022
Editors	Vladimir Robles-Bykbaev, Josefa Mula, Gilberto Reynoso-Meza
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	81-95
Number of pages	15
ISBN (Print)	9783031243264
DOIs	https://doi.org/10.1007/978-3-031-24327-1_8
State	Published - 2023
Event	8th International Conference on Science, Technology and Innovation for Society, CITIS 2022 - Guayaquil, Ecuador; Duration: 22 Jun 2022 → 24 Jun 2022

Publication series

Name	Lecture Notes in Networks and Systems
Volume	607 LNNS
ISSN (Print)	2367-3370
ISSN (Electronic)	2367-3389

Conference

Conference	8th International Conference on Science, Technology and Innovation for Society, CITIS 2022
Country/Territory	Ecuador
City	Guayaquil
Period	22/06/22 → 24/06/22

Bibliographical note

Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords

Binary classification
Credit risk
Gradient boosting
Machine learning

Access to Document

10.1007/978-3-031-24327-1_8

Cite this

Inga, J., & Sacoto-Cabrera, E. (2023). Credit Default Risk Analysis Using Machine Learning Algorithms with Hyperparameter Optimization. In V. Robles-Bykbaev, J. Mula, & G. Reynoso-Meza (Eds.), Intelligent Technologies: Design and Applications for Society - Proceedings of CITIS 2022 (pp. 81-95). (Lecture Notes in Networks and Systems; Vol. 607 LNNS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-24327-1_8

Inga, Juan ; Sacoto-Cabrera, Erwin. / Credit Default Risk Analysis Using Machine Learning Algorithms with Hyperparameter Optimization. Intelligent Technologies: Design and Applications for Society - Proceedings of CITIS 2022. editor / Vladimir Robles-Bykbaev ; Josefa Mula ; Gilberto Reynoso-Meza. Springer Science and Business Media Deutschland GmbH, 2023. pp. 81-95 (Lecture Notes in Networks and Systems).

@inproceedings{380d721195994d1c9a17c62d5f941094,

title = "Credit Default Risk Analysis Using Machine Learning Algorithms with Hyperparameter Optimization",

abstract = "Machine learning models are an important tool: they provide a scientific method to identify potential debtors early and to predict which clients are more likely to default on their debts, improving the accuracy of credit risk assessment in financial companies. The purpose of this study was to analyze the performance of three gradient boosting algorithms (CatBoost, LightGBM, and XGBoost) in predicting customer default risk, and the ability of the RandomUnderSampler sampling technique to address imbalanced credit risk classes. An exploratory analysis of the data set was carried out first, followed by data preprocessing and, finally, training with hyperparameter tuning using the GridSearchCV method, with the aim of identifying the largest number of clients with credit risk. The models are evaluated on a consumer credit data set using the metrics of sensitivity, specificity, and precision. Among the algorithms compared, XGBoost outperformed the LightGBM and CatBoost models. The experimental results confirmed that the XGBoost model performs better for credit risk prediction with historical data.",

keywords = "Binary classification, Credit risk, Gradient boosting, Machine learning",

author = "Juan Inga and Erwin Sacoto-Cabrera",

note = "Publisher Copyright: {\textcopyright} 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.; 8th International Conference on Science, Technology and Innovation for Society, CITIS 2022 ; Conference date: 22-06-2022 Through 24-06-2022",

year = "2023",

doi = "10.1007/978-3-031-24327-1_8",

language = "English",

isbn = "9783031243264",

series = "Lecture Notes in Networks and Systems",

volume = "607",

publisher = "Springer Science and Business Media Deutschland GmbH",

pages = "81--95",

editor = "Vladimir Robles-Bykbaev and Josefa Mula and Gilberto Reynoso-Meza",

booktitle = "Intelligent Technologies",

address = "Germany",

}

Inga, J & Sacoto-Cabrera, E 2023, Credit Default Risk Analysis Using Machine Learning Algorithms with Hyperparameter Optimization. in V Robles-Bykbaev, J Mula & G Reynoso-Meza (eds), Intelligent Technologies: Design and Applications for Society - Proceedings of CITIS 2022. Lecture Notes in Networks and Systems, vol. 607 LNNS, Springer Science and Business Media Deutschland GmbH, pp. 81-95, 8th International Conference on Science, Technology and Innovation for Society, CITIS 2022, Guayaquil, Ecuador, 22/06/22. https://doi.org/10.1007/978-3-031-24327-1_8

Credit Default Risk Analysis Using Machine Learning Algorithms with Hyperparameter Optimization. / Inga, Juan ; Sacoto-Cabrera, Erwin.
Intelligent Technologies: Design and Applications for Society - Proceedings of CITIS 2022. ed. / Vladimir Robles-Bykbaev; Josefa Mula; Gilberto Reynoso-Meza. Springer Science and Business Media Deutschland GmbH, 2023. p. 81-95 (Lecture Notes in Networks and Systems; Vol. 607 LNNS).

TY - GEN

T1 - Credit Default Risk Analysis Using Machine Learning Algorithms with Hyperparameter Optimization

AU - Inga, Juan

AU - Sacoto-Cabrera, Erwin

PY - 2023

Y1 - 2023

N2 - Machine learning models are an important tool: they provide a scientific method to identify potential debtors early and to predict which clients are more likely to default on their debts, improving the accuracy of credit risk assessment in financial companies. The purpose of this study was to analyze the performance of three gradient boosting algorithms (CatBoost, LightGBM, and XGBoost) in predicting customer default risk, and the ability of the RandomUnderSampler sampling technique to address imbalanced credit risk classes. An exploratory analysis of the data set was carried out first, followed by data preprocessing and, finally, training with hyperparameter tuning using the GridSearchCV method, with the aim of identifying the largest number of clients with credit risk. The models are evaluated on a consumer credit data set using the metrics of sensitivity, specificity, and precision. Among the algorithms compared, XGBoost outperformed the LightGBM and CatBoost models. The experimental results confirmed that the XGBoost model performs better for credit risk prediction with historical data.

KW - Binary classification

KW - Credit risk

KW - Gradient boosting

KW - Machine learning

UR - http://www.scopus.com/inward/record.url?scp=85151047863&partnerID=8YFLogxK

U2 - 10.1007/978-3-031-24327-1_8

DO - 10.1007/978-3-031-24327-1_8

M3 - Conference contribution

AN - SCOPUS:85151047863

SN - 9783031243264

T3 - Lecture Notes in Networks and Systems

SP - 81

EP - 95

BT - Intelligent Technologies

A2 - Robles-Bykbaev, Vladimir

A2 - Mula, Josefa

A2 - Reynoso-Meza, Gilberto

PB - Springer Science and Business Media Deutschland GmbH

T2 - 8th International Conference on Science, Technology and Innovation for Society, CITIS 2022

Y2 - 22 June 2022 through 24 June 2022

ER -

Inga J, Sacoto-Cabrera E. Credit Default Risk Analysis Using Machine Learning Algorithms with Hyperparameter Optimization. In Robles-Bykbaev V, Mula J, Reynoso-Meza G, editors, Intelligent Technologies: Design and Applications for Society - Proceedings of CITIS 2022. Springer Science and Business Media Deutschland GmbH. 2023. p. 81-95. (Lecture Notes in Networks and Systems). doi: 10.1007/978-3-031-24327-1_8
