ETRI-Knowledge Sharing Platform


Detail

SVM-Based Biological Named Entity Recognition Using Minimum Edit-Distance Feature Boosted by Virtual Examples
Cited 4 times in Scopus
Authors
Eun Ji Yi, Gary Geunbae Lee, Yu Song, Soo Jun Park
Issue Date
2004-05
Citation
International Joint Conference on Natural Language Processing (IJCNLP) 2004 (LNCS 3248), v.3248, pp.807-814
Publisher
Springer
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1007/978-3-540-30211-7_86
Abstract
In this paper, we propose two independent solutions to the problems of spelling variants and the lack of annotated corpora, which are the main difficulties in SVM (Support Vector Machine) and other machine-learning-based biological named entity recognition. To resolve the problem of spelling variants, we propose the use of edit distance as a feature for the SVM. To resolve the lack-of-corpus problem, we propose the use of virtual examples, by which the annotated corpus can be expanded automatically, quickly, and easily. The experimental results show that introducing edit distance produces some improvement, and that the model trained on the corpus expanded by virtual examples outperforms the model trained on the original corpus. Finally, we achieved an F-measure of 71.46% (64.03% precision, 80.84% recall) in five-category named entity recognition on the GENIA corpus (version 3.0). © Springer-Verlag Berlin Heidelberg 2005.
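The abstract does not spell out how the edit-distance feature is computed, so the following is only a minimal sketch under stated assumptions: a token is compared against a small, hypothetical lexicon of known entity names, and the smallest length-normalised Levenshtein distance is used as a numeric feature. The function names, the lexicon entries, and the normalisation are illustrative choices, not the authors' implementation. The last function checks that the reported precision and recall are consistent with the reported F-measure, assuming the balanced (harmonic mean) definition.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def min_edit_distance_feature(token: str, lexicon: Sequence[str]) -> float:
    """Smallest length-normalised distance from token to any lexicon entry.

    Assumption: the lexicon is non-empty and comparison is case-insensitive.
    """
    t = token.lower()
    return min(
        edit_distance(t, entry.lower()) / max(len(t), len(entry), 1)
        for entry in lexicon
    )


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical lexicon, for illustration only.
LEXICON = ["interleukin-2", "NF-kappa B", "tumor necrosis factor"]

if __name__ == "__main__":
    # A spelling variant stays close to its lexicon entry.
    print(min_edit_distance_feature("Interleukin 2", LEXICON))
    # Consistency check on the figures reported in the abstract.
    print(round(f_measure(64.03, 80.84), 2))  # 71.46
```

A feature of this kind gives spelling variants such as "Interleukin 2" and "interleukin-2" similar values even when their surface forms differ, which is the motivation the abstract gives for using edit distance.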
KSP Keywords
Distance Feature, F-measure, High Performance, Named Entity Recognition, Support Vector Machine (SVM), Annotated Corpus, Machine Learning