TY - JOUR
T1 - Machine Learning Algorithm for Distinguishing Ductal Carcinoma In Situ from Invasive Breast Cancer
AU - Vy, Vu Pham Thao
AU - Yao, Melissa Min Szu
AU - Le, Nguyen Quoc Khanh
AU - Chan, Wing P.
N1 - Funding Information:
Funding: This work was financially supported by the Higher Education Sprout Project by the Ministry of Education (MOE) in Taiwan, grant numbers DP2-109-21121-01-A-03 and DP2-110-21121-01-A-03.
Publisher Copyright:
© 2022 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2022/5/1
Y1 - 2022/5/1
AB - Purpose: Given that early identification of breast cancer type allows for less-invasive therapies, we aimed to develop a machine learning model to discriminate between ductal carcinoma in situ (DCIS) and minimally invasive breast cancer (MIBC). Methods: In this retrospective study, we collected the health records of 420 women who underwent biopsy to confirm breast cancer between 2010 and 2020. A trained XGBoost algorithm was used to classify cancers as either DCIS or MIBC using clinical characteristics, mammographic findings, ultrasonographic findings, and histopathological features. Its performance was measured against other methods using area under the receiver operating characteristic curve (AUC), sensitivity, specificity, accuracy, precision, and F1 score. Results: Of the 420 patients (mean [standard deviation] age, 57.1 [12.0] years), 357 were used to train the model and 63 to test it. The model performed well when feature importance was determined, reaching an accuracy of 0.84 (95% confidence interval [CI], 0.76–0.91), an AUC of 0.93 (95% CI, 0.87–0.95), a specificity of 0.75 (95% CI, 0.67–0.83), and a sensitivity of 0.91 (95% CI, 0.76–0.94). Conclusion: The XGBoost model, combining clinical, mammographic, ultrasonographic, and histopathologic findings, can be used to discriminate DCIS from MIBC with an accuracy equivalent to that of experienced radiologists, thereby giving patients the widest range of therapeutic options.
KW - breast cancer
KW - ductal carcinoma in situ
KW - mammographic
KW - minimally invasive breast cancer
KW - ultrasonographic
KW - XGBoost
UR - http://www.scopus.com/inward/record.url?scp=85129966574&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85129966574&partnerID=8YFLogxK
U2 - 10.3390/cancers14102437
DO - 10.3390/cancers14102437
M3 - Article
AN - SCOPUS:85129966574
SN - 2072-6694
VL - 14
JO - Cancers
JF - Cancers
IS - 10
M1 - 2437
ER -