
Evaluation of Data Balancing Methods for the Classification of Digital Mammography Images with Benign and Malignant Breast Lesions Using Machine Learning

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Breast cancer is highly prevalent and a leading cause of cancer-related death in women. Early detection through mammographic imaging is critical but challenging because of subjectivity among doctors and the complex clinical context. Additionally, image datasets commonly exhibit class imbalance, posing a greater challenge than classification problems in other fields. In this work, we explore various class-balancing techniques to enhance the predictive performance of machine learning models. We use the publicly available dataset “The mini-MIAS database of mammograms” (Suckling et al. in The mammographic image analysis society digital mammogram database. University of Essex, 1994 [1]) to train SVM and CNN models, comparing their performance with and without class-balancing preprocessing and ensemble methods. The impact on classification is assessed using accuracy, F1-score, sensitivity, and specificity. The experiments presented lay the foundation for addressing issues with imbalanced datasets in the context of automated detection of anomalies in mammograms. These findings can be extended to test other class-balancing strategies and data preprocessing approaches.
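As a rough illustration of the ideas named in the abstract, the following minimal sketch shows one possible balancing step (random oversampling) and how accuracy, F1-score, sensitivity, and specificity follow from a binary confusion matrix. The abstract does not say which balancing techniques the authors used, so random oversampling is an assumption here, as are the function names `random_oversample` and `binary_metrics` and the convention that label 1 marks the malignant (positive) class.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    has as many samples as the largest class. Illustrative only; not
    necessarily the balancing method used in the paper."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, F1-score, sensitivity, and specificity.
    `positive` is the label treated as the positive class
    (assumed here to be 1 = malignant)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
        "sensitivity": sensitivity,
        "specificity": specificity,
    }
```

In a setup like this, oversampling would normally be applied to the training split only, so that duplicated samples do not leak into the evaluation data. On an imbalanced test set, accuracy alone can look high even when sensitivity is poor, which is why the per-class rates are reported alongside it.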

Original language: English
Title of host publication: Proceedings of 9th International Congress on Information and Communication Technology - ICICT 2024
Editors: Xin-She Yang, Simon Sherratt, Nilanjan Dey, Amit Joshi
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 473-481
Number of pages: 9
ISBN (Print): 9789819733019
State: Published - 2024
Event: 9th International Congress on Information and Communication Technology, ICICT 2024 - London, United Kingdom
Duration: 19 Feb 2024 to 22 Feb 2024

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 1003 LNNS
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389

Conference

Conference: 9th International Congress on Information and Communication Technology, ICICT 2024
Country/Territory: United Kingdom
City: London
Period: 19/02/24 to 22/02/24

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • Classification metrics
  • Data science
  • Deep learning
  • Imbalanced datasets
  • Mammography

CACES Knowledge Areas

  • 116A Computer Science
