Data Analysis Architecture using Techniques of Machine Learning for the Prediction of the Quality of Blood Donations against the Hepatitis C Virus

Paul Idrovo-Berrezueta, Denys Dutan-Sanchez, Remigio Hurtado-Ortiz, Vladimir Robles-Bykbaev

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

The World Health Organization (WHO) currently faces difficulties in improving access to safe blood. According to the WHO, of the millions of blood donations collected, one in four donations in low-income countries is made where not all donated blood is tested. This is a serious problem, because a hospital cannot assure a patient that the blood he or she receives is safe. To address it, we propose a method based on CRISP-DM. We first prepare the dataset by removing null values, transforming it with one-hot encoding, analyzing it with Principal Component Analysis (PCA) while retaining 85% of the variance, and oversampling the class we have chosen. Once the dataset has been preprocessed, we use machine learning techniques to help evaluate whether a donor's blood is qualified for use. We applied several techniques: Random Forest, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and an Artificial Neural Network (ANN). As a final step, we interpreted the results and concluded that the classifier with the highest precision is Random Forest. For this research we used a public dataset gathered by a university in Germany. This investigation aims to help improve the detection of hepatitis C in low-income countries and, in turn, access to safe blood for the patients who need it. In addition, this data analysis method can be applied in future investigations, for which we encourage testing other techniques or models.
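The preprocessing steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the column names (`sex`, `alt`, `label`), the toy rows, and the eigenvalues are hypothetical, and the classifiers themselves are left out because they would normally come from a machine learning library. The sketch covers null removal, one-hot encoding, choosing how many principal components retain 85% of the variance, random oversampling, and the precision metric used to compare classifiers.

```python
import random


def drop_rows_with_nulls(rows):
    """Keep only the rows in which no value is None."""
    return [r for r in rows if all(v is not None for v in r.values())]


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != column}
        for c in categories:
            new_row[f"{column}_{c}"] = 1 if r[column] == c else 0
        encoded.append(new_row)
    return encoded


def components_for_variance(eigenvalues, threshold=0.85):
    """Smallest number of principal components whose cumulative
    explained-variance ratio reaches the threshold."""
    total = sum(eigenvalues)
    running = 0.0
    for k, ev in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += ev
        if running / total >= threshold:
            return k
    return len(eigenvalues)


def random_oversample(rows, label, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until
    every class is as large as the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for r in rows:
        by_class.setdefault(r[label], []).append(r)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


def precision(y_true, y_pred, positive=1):
    """Fraction of predicted positives that are truly positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical toy records: label 1 marks a donor flagged for hepatitis C.
rows = [
    {"sex": "m", "alt": 7.7, "label": 0},
    {"sex": "f", "alt": None, "label": 0},   # dropped: contains a null
    {"sex": "f", "alt": 18.0, "label": 0},
    {"sex": "m", "alt": 36.2, "label": 0},
    {"sex": "m", "alt": 95.1, "label": 1},
]
clean = drop_rows_with_nulls(rows)
encoded = one_hot(clean, "sex")
balanced = random_oversample(encoded, "label")
n_components = components_for_variance([5.0, 3.0, 1.0, 1.0])
```

With a library available, the same flow would typically continue by fitting each of the four classifiers on the balanced, PCA-reduced data and comparing their precision on held-out records.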

Original language: English
Title of host publication: 2022 IEEE International Autumn Meeting on Power, Electronics and Computing, ROPEC 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665458924
State: Published - 2022
Event: 2022 IEEE International Autumn Meeting on Power, Electronics and Computing, ROPEC 2022 - Ixtapa, Mexico
Duration: 9 Nov 2022 → 11 Nov 2022

Bibliographical note

Publisher Copyright:
© 2022 IEEE.

Keywords

  • Artificial Neural Network
  • Blood Donor
  • Data science
  • Hepatitis C
  • K-Nearest-Neighbor
  • Machine learning
  • Neural Network
  • Oversizing
  • Principal Component Analysis
  • Random Forest
  • Support Vector Machine
