TY - JOUR
T1 - A frame-based approach for reference metadata extraction
AU - Hsieh, Yu Lun
AU - Liu, Shih Hung
AU - Yang, Ting Hao
AU - Chen, Yu Hsuan
AU - Chang, Yung Chun
AU - Hsieh, Gladys
AU - Shih, Cheng Wei
AU - Lu, Chun Hung
AU - Hsu, Wen Lian
N1 - Publisher Copyright: © Springer International Publishing Switzerland 2014.
PY - 2014
Y1 - 2014
N2 - In this paper, we propose a novel frame-based approach (FBA) and use reference metadata extraction as a case study to demonstrate its advantages. The main contributions of this research are threefold. First, the new frame matching algorithm, based on sequence alignment, can compensate for the shortcomings of traditional rule-based approaches, in which rule matching lacks flexibility and generality. Second, approximate matching is adopted to capture reasonable abbreviations or errors in the input reference string, further increasing the coverage of the frames. Third, experiments conducted on extensive datasets show that the same knowledge framework performed equally well on various untrained domains. Compared with a widely used machine learning method, Conditional Random Fields (CRFs), the FBA can drastically reduce the average field error rate across all four independent test sets by 70% (2.24% vs. 7.54%).
AB - In this paper, we propose a novel frame-based approach (FBA) and use reference metadata extraction as a case study to demonstrate its advantages. The main contributions of this research are threefold. First, the new frame matching algorithm, based on sequence alignment, can compensate for the shortcomings of traditional rule-based approaches, in which rule matching lacks flexibility and generality. Second, approximate matching is adopted to capture reasonable abbreviations or errors in the input reference string, further increasing the coverage of the frames. Third, experiments conducted on extensive datasets show that the same knowledge framework performed equally well on various untrained domains. Compared with a widely used machine learning method, Conditional Random Fields (CRFs), the FBA can drastically reduce the average field error rate across all four independent test sets by 70% (2.24% vs. 7.54%).
KW - Frame-based approach
KW - Knowledge representation
KW - Reference metadata extraction
UR - http://www.scopus.com/inward/record.url?scp=84911938604&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84911938604&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-13987-6_15
DO - 10.1007/978-3-319-13987-6_15
M3 - Article
AN - SCOPUS:84911938604
SN - 0302-9743
VL - 8916
SP - 154
EP - 163
JO - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
JF - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER -