ETRI-Knowledge Sharing Platform

Details

Conference paper: Normalization of Gene/Protein Names in Biological Literatures using Vector-Space Model
Cited 4 times in Scopus
Authors
임준호, 장현철, 임재수, 박수준
Publication date
2007-08
Source
International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS) 2007, pp.390-393
DOI
https://dx.doi.org/10.1109/IEMBS.2007.4352306
Contract project
07MB2700, Module System for Ubiquitous Health Care, 박선희
Abstract
As the volume of biological literature grows exponentially, the need for text mining systems increases. In text mining, normalization is the task of mapping gene/protein names to entries in a database. It is necessary for combining information extracted from different publications and for curating a database or an ontology from the literature. Previous normalization research relied on direct comparison between database entries and the names found in the literature, but this approach is weak against the highly variable gene/protein names that appear in the literature. Therefore, in this paper, we propose a normalization method using the Vector-Space Model. For each gene/protein name, we rank identifiers using the Vector-Space Model and select the identifier most similar to the name. Experimental results show that the proposed method achieves an F-measure of 70.7%. © 2007 IEEE.
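The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which features or term weights the authors used, so this example assumes token-level TF-IDF vectors and cosine similarity, a common form of the vector-space model. The lexicon, the `GENE:000x` identifiers, and all function names are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicon: identifier -> known names (not taken from the paper).
LEXICON = {
    "GENE:0001": ["interleukin 2 receptor alpha", "IL2RA", "IL-2R alpha", "CD25"],
    "GENE:0002": ["interleukin 2", "IL2", "IL-2", "T-cell growth factor"],
    "GENE:0003": ["tumor protein p53", "TP53", "p53"],
}

def tokenize(name):
    # Lowercase, then split into alphabetic and numeric runs ("IL-2R" -> il, 2, r).
    return re.findall(r"[a-z]+|[0-9]+", name.lower())

def build_index(lexicon):
    # One "document" per identifier: the tokens of all of its known names.
    docs = {gid: Counter(t for name in names for t in tokenize(name))
            for gid, names in lexicon.items()}
    # Document frequency: in how many identifier documents each token occurs.
    df = Counter(t for tf in docs.values() for t in tf)
    n = len(docs)
    # Smoothed IDF (assumed weighting) so tokens shared by all documents keep weight.
    idf = {t: math.log(n / count) + 1.0 for t, count in df.items()}
    vectors = {gid: {t: f * idf[t] for t, f in tf.items()} for gid, tf in docs.items()}
    return vectors, idf

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_identifiers(mention, vectors, idf):
    # Tokens unseen in the lexicon carry no weight and are dropped from the query.
    tf = Counter(tokenize(mention))
    query = {t: f * idf[t] for t, f in tf.items() if t in idf}
    scores = [(cosine(query, vec), gid) for gid, vec in vectors.items()]
    # Highest similarity first; ties are broken by identifier for determinism.
    return sorted(scores, key=lambda s: (-s[0], s[1]))

if __name__ == "__main__":
    vectors, idf = build_index(LEXICON)
    for score, gid in rank_identifiers("IL-2 receptor alpha chain", vectors, idf):
        print(f"{gid}\t{score:.3f}")
```

In this sketch a mention such as "IL-2 receptor alpha chain" still matches its identifier even though no lexicon name equals it exactly, which is the kind of name variation that direct string comparison handles poorly.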
KSP suggested keywords
Experimental Result, F-measure, Mining area, Mining system, Normalization method, comparison method, direct comparison, space model, text mining