Predicting the Mortality of ICU Patients by Topic Model with Machine-Learning Techniques

Chih Chou Chiu, Chung Min Wu, Te Nien Chien, Ling Jing Kao, Jiantai Timothy Qiu

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)


Predicting patients’ vital signs is a critical issue in intensive care unit (ICU) research. Early prediction of ICU mortality can reduce overall mortality and the cost of treating complications. Some studies have predicted mortality from electronic health record (EHR) data by using machine learning models. However, semi-structured data (i.e., patients’ diagnosis data and inspection reports) are rarely used in these models. This study utilized data from the Medical Information Mart for Intensive Care III (MIMIC-III). We used a latent Dirichlet allocation (LDA) model to group the text of the semi-structured data into topics, and then built and compared five machine learning models: classification and regression trees (CART), logistic regression (LR), multivariate adaptive regression splines (MARS), random forest (RF), and gradient boosting (GB). A total of 46,520 ICU patients from MIMIC-III were included, with a mortality rate of 11.5%. Our results revealed that the semi-structured data (diagnosis data and inspection reports) of ICU patients contain useful information that can assist clinicians in making critical clinical decisions. In our comparison of the five models, the GB model showed the best performance, with the highest area under the receiver operating characteristic curve (AUROC) (0.9280), specificity (93.16%), and sensitivity (83.25%). The RF, LR, and MARS models (AUROCs of 0.9096, 0.8987, and 0.8935, respectively) outperformed CART (0.8511). Overall, the GB model predicted the mortality of ICU patients better than the other four models (CART, LR, MARS, and RF). The analysis results could be used to develop a clinically useful decision support system.
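The three metrics reported in the abstract (AUROC, sensitivity, and specificity) can be illustrated with a minimal sketch in plain Python. This is not the authors' evaluation code: the function names `auroc` and `sensitivity_specificity`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration only.

```python
from typing import Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win
    (the Mann-Whitney U formulation of the area under the ROC curve).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity); score >= threshold predicts death.

    The 0.5 default threshold is an assumption, not a value from the study.
    """
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data (not from MIMIC-III): 1 = died, 0 = survived.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

if __name__ == "__main__":
    sens, spec = sensitivity_specificity(toy_labels, toy_scores)
    print(f"AUROC={auroc(toy_labels, toy_scores):.4f} "
          f"sensitivity={sens:.4f} specificity={spec:.4f}")
```

On the toy data, 14 of the 15 positive-negative pairs are ranked correctly, so the AUROC is 14/15; at the assumed threshold, sensitivity is 2/3 and specificity is 4/5. AUROC is independent of any threshold, whereas sensitivity and specificity depend on the cut-off chosen, which is why model comparisons usually report all three.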

Original language: English
Article number: 1087
Journal: Healthcare (Switzerland)
Issue number: 6
Publication status: Published - Jun 2022


Keywords

  • electronic health records
  • intensive care units
  • latent dirichlet allocation
  • machine learning
  • topic model

ASJC Scopus subject areas

  • Leadership and Management
  • Health Policy
  • Health Informatics
  • Health Information Management


