TY - JOUR
T1 - PICO element detection in medical text without metadata
T2 - Are first sentences enough?
AU - Huang, Ke Chun
AU - Chiang, I-Jen
AU - Xiao, Furen
AU - Liao, Chun Chih
AU - Liu, Charles Chih Ho
AU - Wong, Jau Min
PY - 2013/10
Y1 - 2013/10
N2 - Efficient identification of patient, intervention, comparison, and outcome (PICO) components in medical articles is helpful in evidence-based medicine. The purpose of this study is to clarify whether first sentences of these components are good enough to train naive Bayes classifiers for sentence-level PICO element detection. We extracted 19,854 structured abstracts of randomized controlled trials with any P/I/O label from PubMed for naive Bayes classifiers training. Performances of classifiers trained by first sentences of each section (CF) and those trained by all sentences (CA) were compared using all sentences by ten-fold cross-validation. The results measured by recall, precision, and F-measures show that there are no significant differences in performance between CF and CA for detection of O-element (F-measure = 0.731 ± 0.009 vs. 0.738 ± 0.010, p = 0.123). However, CA perform better for I-elements, in terms of recall (0.752 ± 0.012 vs. 0.620 ± 0.007, p < 0.001) and F-measures (0.728 ± 0.006 vs. 0.662 ± 0.007, p < 0.001). For P-elements, CF have higher precision (0.714 ± 0.009 vs. 0.665 ± 0.010, p < 0.001), but lower recall (0.766 ± 0.013 vs. 0.811 ± 0.012, p < 0.001). CF are not always better than CA in sentence-level PICO element detection. Their performance varies in detecting different elements.
AB - Efficient identification of patient, intervention, comparison, and outcome (PICO) components in medical articles is helpful in evidence-based medicine. The purpose of this study is to clarify whether first sentences of these components are good enough to train naive Bayes classifiers for sentence-level PICO element detection. We extracted 19,854 structured abstracts of randomized controlled trials with any P/I/O label from PubMed for naive Bayes classifiers training. Performances of classifiers trained by first sentences of each section (CF) and those trained by all sentences (CA) were compared using all sentences by ten-fold cross-validation. The results measured by recall, precision, and F-measures show that there are no significant differences in performance between CF and CA for detection of O-element (F-measure = 0.731 ± 0.009 vs. 0.738 ± 0.010, p = 0.123). However, CA perform better for I-elements, in terms of recall (0.752 ± 0.012 vs. 0.620 ± 0.007, p < 0.001) and F-measures (0.728 ± 0.006 vs. 0.662 ± 0.007, p < 0.001). For P-elements, CF have higher precision (0.714 ± 0.009 vs. 0.665 ± 0.010, p < 0.001), but lower recall (0.766 ± 0.013 vs. 0.811 ± 0.012, p < 0.001). CF are not always better than CA in sentence-level PICO element detection. Their performance varies in detecting different elements.
KW - Evidence-based medicine
KW - Information extraction
KW - Information retrieval
KW - Natural language processing
KW - Question answering
KW - Text mining
UR - http://www.scopus.com/inward/record.url?scp=84883806449&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84883806449&partnerID=8YFLogxK
U2 - 10.1016/j.jbi.2013.07.009
DO - 10.1016/j.jbi.2013.07.009
M3 - Article
C2 - 23899909
AN - SCOPUS:84883806449
SN - 1532-0464
VL - 46
SP - 940
EP - 946
JO - Journal of Biomedical Informatics
JF - Journal of Biomedical Informatics
IS - 5
ER -