Deep neural network-based interpretable prediction model for survival outcomes in female breast cancer patients: Integrating biomechanical perspectives with clinicopathological features
Abstract
Background: This study integrates biomechanical perspectives with clinicopathological data to develop a deep neural network (DNN) model for survival prediction in female breast cancer patients. By linking tumor size and lymph node status to biomechanical drivers such as solid stress and cell migration forces, we aim to uncover the mechanobiological mechanisms underlying prognostic heterogeneity.
Methods: We analyzed data from 37,917 patients in the Surveillance, Epidemiology, and End Results (SEER) database, encompassing clinical characteristics, pathological features, and treatment details. The DNN, which incorporates an attention mechanism, was evaluated using accuracy, precision, recall, F1 score, and area under the curve (AUC). Interpretability techniques were applied to identify prognostic factors.
Results: The DNN model achieved F1 scores of 0.928 and 0.935 on the validation and test sets, respectively, with an AUC of 0.96, surpassing traditional models. Key factors identified included regional lymph node positivity, tumor size, and tumor grade, with regional lymph node positivity showing a notable negative correlation with survival.
Conclusions: DNN models with attention mechanisms demonstrate superior predictive performance and offer valuable interpretability for identifying critical prognostic factors.
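The evaluation metrics named in the Methods can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes binary survival labels (1 = positive class), a 0.5 decision threshold, and that AUC refers to the area under the ROC curve, none of which is stated in the abstract. The function names and the toy labels and scores are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 at an assumed decision threshold."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data for illustration only; these are not values from the study.
toy_labels = [1, 1, 1, 0, 0, 1, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]
toy_metrics = classification_metrics(toy_labels, toy_scores)
toy_auc = roc_auc(toy_labels, toy_scores)
```

The pairwise AUC is quadratic in the number of samples, which keeps the sketch short; for a cohort of the size reported here, a rank-based implementation from an established library would be the practical choice.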
Copyright (c) 2025 Author(s)

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright on all articles published in this journal is retained by the author(s), who grant the publisher the right to publish the article as its original publisher.
Articles published in this journal are licensed under a Creative Commons Attribution 4.0 International License, which means they can be shared, adapted, and distributed provided that the original published version is cited.