Abstract
Breast cancer is highly prevalent and a leading cause of cancer-related death in women. Early detection through mammographic imaging is critical but challenging because of subjectivity among doctors and the complex clinical context. Additionally, image datasets commonly exhibit class imbalance, posing a greater challenge than classification problems in other fields. In this work, we explore various class-balancing techniques to enhance the predictive performance of machine learning models. We use the publicly available dataset “The mini-MIAS database of mammograms” (Suckling et al. in The mammographic image analysis society digital mammogram database. University of Essex, 1994 [1]) to train SVM and CNN models, comparing their performance with and without class-balancing preprocessing and ensemble methods. We measure the impact on classification using accuracy, F1-score, sensitivity, and specificity. The experiments presented lay the foundation for addressing imbalanced datasets in the automated detection of anomalies in mammograms. These findings can be extended to test other class-balancing strategies and data preprocessing approaches.
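To make the evaluation setup in the abstract concrete, the following is a minimal sketch of one possible class-balancing step and of the four metrics it names. The abstract does not say which balancing techniques were used, so random oversampling is an assumption chosen purely for illustration; the function names `random_oversample` and `binary_metrics`, the toy labels, and the choice of label `1` (for example, malignant) as the positive class are likewise hypothetical and not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    has as many samples as the largest class.
    Illustrative assumption: not necessarily a technique used in the paper."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, specificity and F1 from the confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # true positive rate (recall)
    specificity = ratio(tn, tn + fp)  # true negative rate
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Toy imbalanced labels: six negatives, two positives (hypothetical data).
train_x = ["n1", "n2", "n3", "n4", "n5", "n6", "p1", "p2"]
train_y = [0, 0, 0, 0, 0, 0, 1, 1]
balanced_x, balanced_y = random_oversample(train_x, train_y)
print(Counter(balanced_y))

# Hypothetical predictions on a held-out set, scored with the four metrics.
print(binary_metrics([1, 1, 0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0, 1]))
```

Balancing of this kind should be applied only to the training split, so that the test set keeps the original class distribution. Reporting sensitivity and specificity alongside accuracy matters on imbalanced data because a model that always predicts the majority class can reach high accuracy with zero sensitivity.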
| Original language | English |
|---|---|
| Title of host publication | Proceedings of 9th International Congress on Information and Communication Technology - ICICT 2024 |
| Editors | Xin-She Yang, Simon Sherratt, Nilanjan Dey, Amit Joshi |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 473-481 |
| Number of pages | 9 |
| ISBN (Print) | 9789819733019 |
| DOIs | |
| State | Published - 2024 |
| Event | 9th International Congress on Information and Communication Technology, ICICT 2024 - London, United Kingdom. Duration: 19 Feb 2024 → 22 Feb 2024 |
Publication series
| Name | Lecture Notes in Networks and Systems |
|---|---|
| Volume | 1003 LNNS |
| ISSN (Print) | 2367-3370 |
| ISSN (Electronic) | 2367-3389 |
Conference
| Conference | 9th International Congress on Information and Communication Technology, ICICT 2024 |
|---|---|
| Country/Territory | United Kingdom |
| City | London |
| Period | 19/02/24 → 22/02/24 |
Bibliographical note
Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- Classification metrics
- Data science
- Deep learning
- Imbalanced datasets
- Mammography
CACES Knowledge Areas
- 116A Computer Science
Fingerprint
Dive into the research topics of 'Evaluation of Data Balancing Methods for the Classification of Digital Mammography Images with Benign and Malignant Breast Lesions Using Machine Learning'. Together they form a unique fingerprint.
Projects
- 1 Finished
- Development of Models and Software with Artificial Intelligence and Machine Learning for Decision Support in Cancer Diagnosis and Treatment
Robles Bykbaev, V. E. (Col), Bojorque Chasi, R. X. (Col), Hurtado Ortiz, R. I. (PI), Salamea Cordero, P. A. (Col), Sanmartin Quituisaca, J. A. (Student), Azuero Ambrosi, P. E. (Student), Crespo Sarango, L. A. (Student), Loaiza Martinez, M. D. L. (Col), Tapia Vasquez, J. D. (Student), Baculima Suárez, J. A. (Student), Novillo Quinde, E. G. (Student), Pañora Uruchima, J. F. (Student) & Sigua Calle, P. M. (Student)
18/01/24 → 1/08/25
Project: Research and Development