TY - JOUR
T1 - TEMPTING system
T2 - A hybrid method of rule and machine learning for temporal relation extraction in patient discharge summaries
AU - Chang, Yung Chun
AU - Dai, Hong Jie
AU - Wu, Johnny Chi Yang
AU - Chen, Jian Ming
AU - Tsai, Richard Tzong Han
AU - Hsu, Wen Lian
N1 - Funding Information:
This research was supported by Informatics for Integrating Biology and the Bedside (i2b2), award number 2U54LM008748 from the NIH/National Library of Medicine (NLM); by the National Heart, Lung and Blood Institute (NHLBI); and by award number 1R13LM01141101 from the NIH NLM. The content is solely the responsibility of the authors and does not necessarily reflect the official views of the NLM, NHLBI, or the National Institutes of Health. The study was conducted under the “III Innovative and Prospective Technologies Project” of the Institute for Information Industry, which is subsidized by the Ministry of Economic Affairs of the Republic of China. This research was also supported by the National Science Council of Taiwan under Grants NSC 101-2319-B-010-002, NSC 102-2319-B-010-002, and NSC-102-2218-E-038-001, and by research grant TMU101-AE1-B55 of Taipei Medical University.
PY - 2013
Y1 - 2013
AB - Patient discharge summaries provide detailed medical information about individuals who have been hospitalized. To make a precise and legitimate assessment of the abundant data, a proper time layout of the sequence of relevant events should be compiled and used to drive a patient-specific timeline, which could further assist medical personnel in making clinical decisions. The process of identifying the chronological order of entities is called temporal relation extraction. In this paper, we propose a hybrid method to identify appropriate temporal links between a pair of entities. The method combines two approaches: one is rule-based and the other is based on the maximum entropy model. We develop an integration algorithm to fuse the results of the two approaches. All rules and the integration algorithm are formally stated so that one can easily reproduce the system and results. To optimize the system's configuration, we used the 2012 i2b2 challenge TLINK track dataset and applied threefold cross-validation to the training set. Then, we evaluated its performance on the training and test datasets. The experimental results show that the proposed TEMPTING (TEMPoral relaTion extractING) system (ranked seventh) achieved an F-score of 0.563, which was at least 30% better than that of the baseline system, which randomly selects TLINK candidates from all pairs and assigns the TLINK types. The TEMPTING system using the hybrid method also outperformed the stage-based TEMPTING system. Its F-scores were 3.51% and 0.97% better than those of the stage-based system on the training set and test set, respectively.
KW - Hybrid method
KW - Maximum entropy
KW - Natural language processing
KW - Temporal relation extraction
KW - Text mining
UR - http://www.scopus.com/inward/record.url?scp=84897052209&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84897052209&partnerID=8YFLogxK
U2 - 10.1016/j.jbi.2013.09.007
DO - 10.1016/j.jbi.2013.09.007
M3 - Article
C2 - 24060600
AN - SCOPUS:84897052209
SN - 1532-0464
VL - 46
SP - S54-S62
JO - Journal of Biomedical Informatics
JF - Journal of Biomedical Informatics
IS - SUPPL.
ER -