
A Structured Approach to Software Defect Classification and Explanation: Random Forest and Gradient Boosting Ensembles with a Focus on Prediction Interpretability

Research output: Chapter in book/report/conference proceeding; Conference contribution; Peer-reviewed

Abstract

Software defect prediction is crucial for reducing costs and improving quality. According to a Cutter Consortium report, software defects cause an estimated annual loss of $1.56 trillion in global productivity. Additionally, Tricentis reported that over 30% of software development projects failed due to undetected defects. Undetected defects can increase maintenance costs, delay deliveries, and compromise security, particularly in critical applications such as financial or medical systems. A significant challenge is dealing with imbalanced data, where there are more defect-free modules than defective ones, making detection difficult. This study proposes a four-phase approach: loading and transforming data, using balancing techniques, applying machine learning models, and explaining predictions. Techniques such as SMOTE, ADASYN, and RandomUnderSampling were used to balance the data, applied to models like Random Forest, Gradient Boosting, and SVM. The JM1 dataset, containing software quality metrics and 80% defect-free modules, was used for analysis. Data preprocessing involved imputation, encoding, and normalization. Results show that Random Forest and Gradient Boosting, combined with balancing techniques, achieved the best performance in defect identification. In the future, advanced algorithms such as XGBoost and LightGBM will be explored, and parameter optimization will be conducted to further enhance results. This approach aims to improve defect detection in software and to be applied in other fields.
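The balancing phase mentioned in the abstract can be illustrated with a minimal sketch of random undersampling, the simplest of the three techniques named. This is not the authors' implementation: the function name `random_undersample`, the toy data, and the seed are assumptions made for illustration, and only the Python standard library is used.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly keep as many rows of each class as the smallest class has.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)

    # Group feature rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is reduced to the size of the minority class.
    target = min(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data with an 80/20 split: 8 defect-free modules (0), 2 defective (1).
rows = [[i, i * 2] for i in range(10)]
labels = [0] * 8 + [1] * 2

balanced_rows, balanced_labels = random_undersample(rows, labels)
counts = Counter(balanced_labels)
```

After the call, both classes hold the same number of rows, at the cost of discarding majority-class data. Oversampling methods such as SMOTE and ADASYN instead synthesize additional minority-class samples.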

Original language: English
Title of host publication: Proceedings of 10th International Congress on Information and Communication Technology - ICICT 2025
Editors: Xin-She Yang, Simon Sherratt, Nilanjan Dey, Amit Joshi
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 409-420
Number of pages: 12
ISBN (print): 9789819664405
DOI
Status: Published - 2025
Event: 10th International Congress on Information and Communication Technology, ICICT 2025 - London, United Kingdom
Duration: 18 Feb 2025 to 21 Feb 2025

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 1416 LNNS
ISSN (print): 2367-3370
ISSN (electronic): 2367-3389

Conference

Conference: 10th International Congress on Information and Communication Technology, ICICT 2025
Country/Territory: United Kingdom
City: London
Period: 18/02/25 to 21/02/25

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
