Early identification of epidermal growth factor receptor (EGFR) and Kirsten rat sarcoma viral oncogene homolog (KRAS) mutations is crucial for selecting a therapeutic strategy for patients with non-small-cell lung cancer (NSCLC). We proposed a machine learning-based model for feature selection and prediction of EGFR and KRAS mutations in patients with NSCLC by including the least number of the most semantic radiomics features. We included a cohort of 161 patients from 211 patients with NSCLC from The Cancer Imaging Archive (TCIA) and analyzed 161 low-dose computed tomography (LDCT) images for detecting EGFR and KRAS mutations. A total of 851 radiomics features, which were classified into 9 categories, were obtained through manual segmentation and radiomics feature extraction from LDCT. We evaluated our models using a validation set consisting of 18 patients derived from the same TCIA dataset. The results showed that the genetic algorithm plus XGBoost classifier exhibited the most favorable performance, with an accuracy of 0.836 and 0.86 for detecting EGFR and KRAS mutations, respectively. We demonstrated that a noninvasive machine learning-based model including the least number of the most semantic radiomics signatures could robustly predict EGFR and KRAS mutations in patients with NSCLC.
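The feature-selection idea in the abstract, a genetic algorithm searching for a small feature subset that is scored by a classifier, can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes synthetic data in place of the 851 radiomics features, swaps in a simple nearest-centroid classifier for XGBoost so that the example needs only the Python standard library, and uses hypothetical names and settings throughout (`make_data`, `centroid_accuracy`, `genetic_select`, the penalty weight, population size, and mutation rate).

```python
import random

def make_data(n_samples=60, n_features=12, informative=(0, 3), seed=0):
    """Synthetic stand-in for a radiomics table (assumption, not TCIA data).
    Only the `informative` columns carry class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 3.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X_train, y_train, X_val, y_val, mask):
    """Validation accuracy of a nearest-centroid classifier (a stand-in for
    XGBoost) restricted to the features switched on in `mask`."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_val, y_val):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, mu))
                for c, mu in centroids.items()}
        correct += int(min(dist, key=dist.get) == t)
    return correct / len(y_val)

def fitness(mask, data, penalty=0.01):
    """Accuracy minus a small per-feature penalty, favoring compact subsets."""
    return centroid_accuracy(*data, mask) - penalty * sum(mask)

def genetic_select(data, n_features, pop_size=20, generations=30,
                   mut_rate=0.05, seed=1):
    """Evolve binary feature masks; return the fittest mask found."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, data), reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # bit-flip mutation
            child = [bit ^ 1 if rng.random() < mut_rate else bit for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, data))

X, y = make_data()
data = (X[:40], y[:40], X[40:], y[40:])  # simple train/validation split
best_mask = genetic_select(data, n_features=12)
selected = [j for j, keep in enumerate(best_mask) if keep]
best_acc = centroid_accuracy(*data, best_mask)
print("selected feature indices:", selected, "validation accuracy:", best_acc)
```

The per-feature penalty in `fitness` is one simple way to push the search toward the smallest useful subset. In a setting closer to the abstract, the scoring function would instead train and validate a gradient-boosted classifier on the selected columns.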

Original language: English
Article number: 9254
Journal: International Journal of Molecular Sciences
Issue number: 17
Publication status: Published - Sept 1 2021


Keywords

  • EGFR mutation
  • eXtreme Gradient Boosting
  • Feature selection
  • Genetic algorithm
  • KRAS mutation
  • Low-dose computed tomography
  • Machine learning
  • Non-small-cell lung carcinoma
  • Radiogenomics

ASJC Scopus subject areas

  • Catalysis
  • Molecular Biology
  • Spectroscopy
  • Computer Science Applications
  • Physical and Theoretical Chemistry
  • Organic Chemistry
  • Inorganic Chemistry


