A New Architecture for Diabetes Prediction Using Data Mining, Deep Learning, and Ensemble Algorithms

Adolfo Jara-Gavilanes, Romel Ávila-Faicán, Remigio Hurtado Ortiz

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Diagnosing diabetes at an early stage is a major challenge. This is a serious health problem because, if it is not treated early, diabetes is a severe cause of death and can trigger many secondary diseases that impair the well-being of the patient. In this paper, we present a new method to accurately predict this disease using data mining, deep learning, and ensemble algorithms. Data mining covers preprocessing the data to make it more comprehensible and extracting insights from the dataset. The architecture is divided into seven steps. First, the dataset is loaded. Second, the variables are analyzed to understand their value for predicting diabetes. Third, noise is removed from the dataset by deleting empty entries. Fourth, the variables are transformed and scaled. Fifth, an exploratory analysis is performed to examine the correlations between the variables. Sixth, the following predictive methods are applied: random forest, artificial neural network, and AdaBoost. Finally, the results are presented and explained. To implement this method, we used a public dataset from Kaggle called "diabetes dataset". The method achieved high accuracy, precision, and recall, which helps demonstrate its effectiveness. This work could serve as a basis for further research on the disease, such as predicting the type of diabetes a patient has, and the approach can be applied to other health problems. Furthermore, additional predictive methods should be evaluated in an attempt to achieve higher accuracy.
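As a rough illustration of the data-handling steps named in the abstract (removing empty entries, scaling, checking correlations, and reporting accuracy, precision, and recall), here is a minimal standard-library sketch. It is not the authors' implementation. The function names, the min-max scaling choice, the Pearson correlation, and the toy records are all assumptions made for illustration, and the model-fitting step (random forest, artificial neural network, AdaBoost) is deliberately left out.

```python
import math

def drop_incomplete(rows):
    """Step 3 (assumed form): discard records containing an empty (None) value."""
    return [r for r in rows if all(v is not None for v in r)]

def min_max_scale(rows):
    """Step 4 (assumed form): rescale every feature column to [0, 1].

    The last column is treated as the class label and is left unchanged.
    """
    cols = list(zip(*rows))
    scaled_cols = []
    for col in cols[:-1]:
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0  # avoid division by zero for constant columns
        scaled_cols.append([(v - lo) / span for v in col])
    scaled_cols.append(list(cols[-1]))
    return [list(r) for r in zip(*scaled_cols)]

def pearson(xs, ys):
    """Step 5 (assumed form): Pearson correlation between two columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def classification_metrics(y_true, y_pred):
    """Step 7: accuracy, precision, and recall for binary labels (1 = diabetic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical toy records (glucose, BMI, label); NOT taken from the Kaggle dataset.
records = [
    (150.0, 35.0, 1),
    (90.0, 25.0, 0),
    (None, 30.0, 1),   # incomplete record, removed in step 3
    (100.0, 27.0, 0),
    (140.0, 40.0, 1),
]

clean = drop_incomplete(records)
scaled = min_max_scale(clean)
glucose = [r[0] for r in scaled]
labels = [r[2] for r in scaled]
glucose_label_corr = pearson(glucose, labels)

# Placeholder predictions standing in for a trained model's output.
example_predictions = [1, 0, 0, 0]
metrics = classification_metrics(labels, example_predictions)
```

In practice, a library such as scikit-learn would usually supply the scaling, the three classifiers, and the metric functions. The sketch only makes the arithmetic behind each step explicit.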

Original language: English
Title of host publication: Proceedings of 8th International Congress on Information and Communication Technology - ICICT 2023
Editors: Xin-She Yang, R. Simon Sherratt, Nilanjan Dey, Amit Joshi
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 203-216
Number of pages: 14
ISBN (Print): 9789819930425
DOIs
State: Published - 2024
Event: 8th International Congress on Information and Communication Technology, ICICT 2023 - London, United Kingdom
Duration: 20 Feb 2023 to 23 Feb 2023

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 695 LNNS
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389

Conference

Conference: 8th International Congress on Information and Communication Technology, ICICT 2023
Country/Territory: United Kingdom
City: London
Period: 20/02/23 to 23/02/23

Bibliographical note

Publisher Copyright:
© 2024, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Keywords

  • AdaBoost
  • Artificial neural networks
  • Data preprocessing
  • Data science
  • Diabetes
  • Predictive analysis
  • Random forest
