TY - GEN
T1 - Extracting eligibility criteria from the narrative text of scientific research articles
AU - Lin, Ching Yun
AU - Liou, Der Ming
AU - Pan, Mei Lien
N1 - Publisher Copyright: © 2017 International Medical Informatics Association (IMIA) and IOS Press.
PY - 2017/1/1
Y1 - 2017/1/1
AB - Eligibility criteria across hundreds of National Health Insurance Research Database (NHIRD) research papers share similar constituent elements, such as demographic characteristics or diagnostic codes. Results for the same disease can vary between studies because the criteria statements differ, so a tool for analyzing narrative patterns would help summarize the knowledge implicitly contained in the eligibility criteria. In this study, we developed a series of R-based text processing methods to extract the narrative eligibility criteria in NHIRD papers by simplifying the article titles and content paragraphs, identifying medical concepts and abbreviations, and then detecting basic demographic characteristics and ICD-9-CM diagnosis codes. Although there is still room for improvement in study type identification, the high performance in classifying the study type, detecting age restrictions, and extracting ICD-9-CM codes shows the system's usefulness for the analysis of eligibility criteria.
KW - Database
KW - Natural language processing
UR - http://www.scopus.com/inward/record.url?scp=85040510156&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85040510156&partnerID=8YFLogxK
U2 - 10.3233/978-1-61499-830-3-481
DO - 10.3233/978-1-61499-830-3-481
M3 - Conference contribution
C2 - 29295141
AN - SCOPUS:85040510156
T3 - Studies in Health Technology and Informatics
SP - 481
EP - 485
BT - MEDINFO 2017
A2 - Zhao, Dongsheng
A2 - Gundlapalli, Adi V.
A2 - Jaulent, Marie-Christine
PB - IOS Press
T2 - 16th World Congress of Medical and Health Informatics: Precision Healthcare through Informatics, MedInfo 2017
Y2 - 21 August 2017 through 25 August 2017
ER -