Abstract
This paper addresses the identification of risk factors associated with diabetes using machine learning techniques. The method combines rigorous data preparation with an exhaustive evaluation of multiple algorithms, with the aim of optimizing predictive accuracy and making the results easier to interpret. The study is organized in three phases. Data preparation: starting from a public dataset, the data are loaded, the variables are analyzed in detail, and denoising and data transformation are carried out, so that the information is of high quality and ready for exploratory and predictive analysis. Classifier testing: a range of machine learning algorithms is evaluated, from classical approaches to more advanced methods, namely J48, KNN, Linear Regression, Multi-Layer Perceptron (MLP), AdaBoost, XGBoost, CatBoost, Gradient Boosting, LightGBM and Random Forest. In this phase, exploratory and predictive analysis is performed to measure the performance of the methods on seven key metrics. Selection of the best method: the results identify the best-performing method. In this case, Gradient Boosting and Random Forest proved to be the most efficient, while the MLP showed the lowest performance in both the training and testing phases. This integrated approach supports efficient extraction of knowledge from the data and provides a detailed comparison of the methods, making it possible to identify the one most suitable for this type of problem.
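The selection phase described in the abstract, ranking classifiers by their scores on a set of metrics, might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name its seven metrics, so accuracy, precision, recall and F1 are assumed here; `model_a` and `model_b` are hypothetical placeholders rather than the classifiers evaluated in the paper; and the labels are toy values, not the study's data.

```python
from typing import Dict, List, Tuple


def binary_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def rank_models(
    y_true: List[int],
    predictions: Dict[str, List[int]],
    key: str = "f1",
) -> List[Tuple[str, Dict[str, float]]]:
    """Score every model and sort from best to worst on the chosen metric."""
    scores = {name: binary_metrics(y_true, y_pred)
              for name, y_pred in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1][key], reverse=True)


# Toy labels and predictions (illustrative only, not results from the paper).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "model_a": [1, 0, 1, 0, 0, 0, 1, 0],  # hypothetical classifier
    "model_b": [1, 1, 0, 1, 0, 1, 1, 0],  # hypothetical classifier
}

ranking = rank_models(y_true, predictions, key="f1")
best_name, best_scores = ranking[0]
print(best_name, best_scores)
```

Ranking on a single metric is a simplification; a comparison across several metrics, as the abstract describes, would need a rule for combining or prioritizing them.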
| Original language | English |
|---|---|
| Title of host publication | Proceedings of 10th International Congress on Information and Communication Technology - ICICT 2025 |
| Editors | Xin-She Yang, Simon Sherratt, Nilanjan Dey, Amit Joshi |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 539-552 |
| Number of pages | 14 |
| ISBN (print) | 9789819664405 |
| DOI | |
| Status | Published - 2025 |
| Event | 10th International Congress on Information and Communication Technology, ICICT 2025 - London, United Kingdom. Duration: 18 Feb 2025 → 21 Feb 2025 |
Publication series
| Name | Lecture Notes in Networks and Systems |
|---|---|
| Volume | 1416 LNNS |
| ISSN (print) | 2367-3370 |
| ISSN (electronic) | 2367-3389 |
Conference
| Conference | 10th International Congress on Information and Communication Technology, ICICT 2025 |
|---|---|
| Country/Territory | United Kingdom |
| City | London |
| Period | 18/02/25 → 21/02/25 |
Bibliographical note
Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
UN Sustainable Development Goals (SDGs)
This output contributes to the following Sustainable Development Goals:
- SDG 3: Good Health and Well-being