Deep-ADCA: Development and Validation of Deep Learning Model for Automated Diagnosis Code Assignment Using Clinical Notes in Electronic Medical Records

Jakir Hossain Bhuiyan Masud, Chiang Shun, Chen Cheng Kuo, Md Mohaimenul Islam, Chih Yang Yeh, Hsuan Chia Yang, Ming Chin Lin

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)


International Classification of Diseases (ICD) codes are used to improve clinical, financial, and administrative performance. Inaccurate ICD coding can lower the quality of care and delay or prevent reimbursement. However, selecting the appropriate ICD code from a patient’s clinical history is time-consuming and requires expert knowledge. The rapid spread of electronic medical records (EMRs) has generated a large amount of clinical data and provides an opportunity to predict ICD codes using deep learning models. The main objective of this study was to use a deep learning-based natural language processing (NLP) model to accurately predict ICD-10 codes, which could help providers make better clinical decisions and improve their level of service. We retrospectively collected clinical notes from five outpatient departments (OPDs) at one university teaching hospital between January 2016 and December 2016. We applied NLP techniques, including global vectors (GloVe), word-to-vector (word2vec), and embedding techniques, to process the data. The dataset was split into independent training and testing sets comprising 90% and 10% of the data, respectively. A convolutional neural network (CNN) model was developed, and its performance was measured using precision, recall, and F-score. A total of 21,953 medical records were collected from 5016 patients. The performance of the CNN model across the five departments was clinically satisfactory (precision: 0.50 to 0.69; recall: 0.78 to 0.91). The model performed best for the cardiology department, with a precision of 69%, a recall of 89%, and an F-score of 78%. The CNN model for predicting ICD-10 codes provides an opportunity to improve the quality of care. Implementing this model in real-world clinical settings could reduce the manual coding workload, enhance the efficiency of clinical coding, and support physicians in making better clinical decisions.
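As a quick consistency check on the reported cardiology figures, the sketch below recomputes the F-score from the stated precision and recall. It assumes the abstract's "F-score" is the balanced F1 (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_score` and the `beta` parameter are illustrative and do not come from the study.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1.0 gives the harmonic mean (F1).

    Assumption: the abstract's "F-score" is F1. This helper is illustrative
    and is not taken from the study's code.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both inputs are zero: define the score as 0 to avoid division by zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Cardiology department figures reported in the abstract.
cardiology_f1 = f_score(precision=0.69, recall=0.89)
print(f"{cardiology_f1:.2f}")  # prints 0.78
```

Under that assumption, the recomputed value rounds to 0.78, matching the F-score of 78% reported for cardiology.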

Original language: English
Article number: 707
Journal: Journal of Personalized Medicine
Issue number: 5
Publication status: Published - May 2022


Keywords

  • clinical note
  • convolutional neural network
  • diagnosis codes
  • medication lists
  • natural language processing

ASJC Scopus subject areas

  • Medicine (miscellaneous)

