TY - JOUR
T1 - Section heading recognition in electronic health records using conditional random fields
AU - Chen, Chih Wei
AU - Chang, Nai Wen
AU - Chang, Yung Chun
AU - Dai, Hong Jie
N1 - Publisher Copyright: © Springer International Publishing Switzerland 2014.
PY - 2014
Y1 - 2014
AB - Electronic health records (EHRs) contain a wealth of information, such as discharge diagnoses, laboratory results, and pharmacy orders, which can be used to support clinical decision support systems and enable clinical and translational research. Unfortunately, the information is represented in a highly heterogeneous semi-structured or unstructured format with author- and domain-specific idiosyncrasies, acronyms, and abbreviations. To take full advantage of health data, researchers have applied text-mining techniques to recognize named entities (NEs) mentioned in EHRs. However, clinical data cannot be interpreted from the NE level alone. For instance, a disease mentioned in the past medical history section has a different clinical significance from the same disease mentioned in the family medical history section. To obtain high-quality information and improve the understanding of clinical records, this work developed a machine learning-based section heading recognition system and evaluated its performance on a manually annotated corpus. The experimental results showed that the machine learning-based system achieved a satisfactory F-score of 0.939, which outperformed a dictionary-based system by 0.321.
KW - Electronic health record
KW - Information extraction
KW - Natural language processing
KW - Section recognition
UR - http://www.scopus.com/inward/record.url?scp=84911940910&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84911940910&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-13987-6_5
DO - 10.1007/978-3-319-13987-6_5
M3 - Article
AN - SCOPUS:84911940910
SN - 0302-9743
VL - 8916
SP - 47
EP - 55
JO - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
JF - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER -