TY - JOUR
T1 - Feature engineering for recognizing adverse drug reactions from Twitter posts
AU - Dai, Hong Jie
AU - Touray, Musa
AU - Jonnagaddala, Jitendra
AU - Syed-Abdul, Shabbir
N1 - Publisher Copyright: © 2016 by the authors.
PY - 2016/5/25
Y1 - 2016/5/25
N2 - Social media platforms are emerging digital communication channels that provide an easy way for common people to share their health and medication experiences online. With more people discussing their health information online publicly, social media platforms present a rich source of information for exploring adverse drug reactions (ADRs). ADRs are major public health problems that result in deaths and hospitalizations of millions of people. Unfortunately, not all ADRs are identified before a drug is made available in the market. In this study, an ADR event monitoring system is developed which can recognize ADR mentions from a tweet and classify its assertion. We explored several entity recognition features, feature conjunctions, and feature selection and analyzed their characteristics and impacts on the recognition of ADRs, which have never been studied previously. The results demonstrate that the entity recognition performance for ADR can achieve an F-score of 0.562 on the PSB Social Media Mining shared task dataset, which outperforms the partial-matching-based method by 0.122. After feature selection, the F-score can be further improved by 0.026. This novel technique of text mining utilizing shared online social media data will open an array of opportunities for researchers to explore various health related issues.
AB - Social media platforms are emerging digital communication channels that provide an easy way for common people to share their health and medication experiences online. With more people discussing their health information online publicly, social media platforms present a rich source of information for exploring adverse drug reactions (ADRs). ADRs are major public health problems that result in deaths and hospitalizations of millions of people. Unfortunately, not all ADRs are identified before a drug is made available in the market. In this study, an ADR event monitoring system is developed which can recognize ADR mentions from a tweet and classify its assertion. We explored several entity recognition features, feature conjunctions, and feature selection and analyzed their characteristics and impacts on the recognition of ADRs, which have never been studied previously. The results demonstrate that the entity recognition performance for ADR can achieve an F-score of 0.562 on the PSB Social Media Mining shared task dataset, which outperforms the partial-matching-based method by 0.122. After feature selection, the F-score can be further improved by 0.026. This novel technique of text mining utilizing shared online social media data will open an array of opportunities for researchers to explore various health related issues.
KW - Adverse drug reactions
KW - Named entity recognition
KW - Natural language processing
KW - Social media
KW - Word embedding
UR - http://www.scopus.com/inward/record.url?scp=84976524192&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84976524192&partnerID=8YFLogxK
U2 - 10.3390/info7020027
DO - 10.3390/info7020027
M3 - Article
AN - SCOPUS:84976524192
SN - 2078-2489
VL - 7
JO - Information (Switzerland)
JF - Information (Switzerland)
IS - 2
M1 - 27
ER -