Background: Artificial intelligence approaches can integrate complex features and can be used to predict a patient’s risk of developing lung cancer, thereby decreasing the need for unnecessary and expensive diagnostic interventions.

Objective: The aim of this study was to use electronic medical records (EMRs) to prescreen patients who are at risk of developing lung cancer.

Methods: We randomly selected 2 million participants from the Taiwan National Health Insurance Research Database who received care between 1999 and 2013. We built a predictive lung cancer screening model with neural networks that were trained and validated using pre-2012 data, and we tested the model prospectively on post-2012 data. An age- and gender-matched subgroup that was 10 times larger than the original lung cancer group was used to assess the predictive power of the EMR. Discrimination (area under the receiver operating characteristic curve [AUC]) and calibration analyses were performed.

Results: The analysis included 11,617 patients with lung cancer and 1,423,154 control patients. The model achieved AUCs of 0.90 for the overall population and 0.87 in patients ≥55 years of age. The AUC in the matched subgroup was 0.82. The positive predictive value was highest (14.3%) among people aged ≥55 years with a pre-existing history of lung disease.

Conclusions: Our model achieved excellent performance in predicting lung cancer within 1 year and has potential to be deployed for digital patient screening. Convolutional neural networks facilitate the effective use of EMRs to identify individuals at high risk for developing lung cancer.
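The two headline metrics in the abstract, AUC and positive predictive value (PPV), can be illustrated with a minimal sketch. This is not the authors' code: the function names `auc` and `ppv`, the toy scores, and the 0.5 threshold are all assumptions made for illustration. Only the case and control counts come from the abstract, and the prevalence derived from them describes the whole analyzed cohort, not the test set or any subgroup.

```python
def auc(scores, labels):
    """AUC as the probability that a randomly chosen case scores higher
    than a randomly chosen control (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ppv(scores, labels, threshold):
    """Fraction of patients flagged at or above `threshold` who are true cases."""
    flagged = [y for s, y in zip(scores, labels) if s >= threshold]
    return sum(flagged) / len(flagged) if flagged else float("nan")


# Hypothetical toy data (NOT from the study): 3 cases, 5 controls.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]

toy_auc = auc(toy_scores, toy_labels)
toy_ppv = ppv(toy_scores, toy_labels, threshold=0.5)  # threshold is an assumption

# Cohort-level prevalence, using the counts reported in the abstract.
cases, controls = 11_617, 1_423_154
prevalence = cases / (cases + controls)

print(f"toy AUC = {toy_auc:.3f}, toy PPV = {toy_ppv:.3f}")
print(f"cohort prevalence = {prevalence:.2%}")
```

Because PPV depends on how common the outcome is in the screened group, it is reported per subgroup in the abstract, whereas AUC is independent of prevalence.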

Original language: English
Article number: e26256
Journal: Journal of Medical Internet Research
Issue number: 8
Publication status: Published - Aug 2021


Keywords

  • Artificial intelligence
  • Electronic medical record
  • Lung cancer screening

ASJC Scopus subject areas

  • Health Informatics


Title: Artificial intelligence-based prediction of lung cancer risk using nonimaging electronic medical records: Deep learning approach
