A cascaded classification approach using transfer learning and feature engineering for improved breast cancer classification

(1) * Chokri Ferkous (Laboratoire des Sciences et Technologies de l'Information et de la Communication LabSTIC, Université 8 Mai 1945 Guelma, Algeria)
(2) Ouissal Fadel (Department of Computer Science, Université 8 Mai 1945 Guelma, Algeria)
(3) Abderrahmane Kefali (Department of Computer Science, Université 8 Mai 1945 Guelma, Algeria)
(4) Hayet-Farida Merouani (Department of Computer Science, Université Badji Mokhtar, Annaba, Algeria)
*corresponding author

Abstract


The primary objective of this study is to design a cascaded classification framework that integrates deep-learning representations with handcrafted and clinical features to enhance the reliability and accuracy of breast cancer detection in mammographic screening. A multi-source mammography dataset comprising four databases was used to ensure diversity and reduce bias. The proposed system operates in two stages. In the first stage, three transfer learning models (VGG16, ResNet50, and EfficientNet_B0) were evaluated using ROC-AUC, PR-AUC, calibration curves, and bootstrap confidence intervals. EfficientNet_B0, which achieved the best balance between discrimination and calibration, was selected as the feature extractor. In the second stage, the malignancy probability from the first stage was combined with Haralick texture features, patient age, and breast density, and the combined feature set was classified using SVM, Random Forest, MLP, Decision Tree, and Logistic Regression. Model robustness was verified through multi-run experiments (five random seeds) and subgroup analyses by age and density. Among the CNN models, EfficientNet_B0 yielded the best performance (accuracy = 0.9438, ROC-AUC = 0.944, PR-AUC = 0.960). In the second stage, Random Forest achieved the highest accuracy (0.9556 ± 0.002), but SVM obtained the highest mean ROC-AUC (0.980 ± 0.001) with stable accuracy (0.9539 ± 0.001) and the most statistically significant p-values, indicating superior robustness and generalization. The proposed cascaded framework effectively combines deep, handcrafted, and clinical features to improve mammogram classification performance. The SVM-based model demonstrates strong calibration, stability, and subgroup consistency, highlighting its potential for deployment in computer-aided mammography screening systems that assist radiologists in early breast cancer detection.
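
The abstract reports ROC-AUC values with bootstrap confidence intervals but does not describe the procedure. The following is a minimal sketch of one common approach, the percentile bootstrap, written in plain Python. The function names (`roc_auc`, `bootstrap_auc_ci`), the number of resamples, the seed, and the toy labels and scores are all illustrative assumptions, not the authors' implementation or data.

```python
import random


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Return (point_estimate, lower, upper) using the percentile bootstrap.

    n_boot, alpha, and seed are hypothetical defaults chosen for illustration.
    """
    point = roc_auc(labels, scores)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        # Resample cases with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample holds a single class
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper


if __name__ == "__main__":
    # Toy data for illustration only: 1 = malignant, 0 = benign.
    toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
    toy_scores = [0.10, 0.40, 0.35, 0.80, 0.45, 0.70, 0.90, 0.65]
    auc, lo, hi = bootstrap_auc_ci(toy_labels, toy_scores)
    print(f"ROC-AUC = {auc:.4f}, 95% CI = [{lo:.4f}, {hi:.4f}]")
```

On real screening data the resampling would typically be done per patient or per exam rather than per image, so that correlated views of the same breast stay in the same resample; whether the study did this is not stated in the abstract.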

Keywords


Machine Learning; Deep Learning; Transfer Learning; Cascaded classification; Mammography; Haralick features

   

DOI

https://doi.org/10.26555/ijain.v12i1.1670
      


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

___________________________________________________________
International Journal of Advances in Intelligent Informatics
ISSN 2442-6571  (print) | 2548-3161 (online)
Organized by UAD and ASCEE Computer Society
Published by Universitas Ahmad Dahlan
W: http://ijain.org
E: info@ijain.org (paper handling issues)
 andri.pranolo.id@ieee.org (publication issues)
