ETRI-Knowledge Sharing Platform


Details

Journal article: Probabilistic Co-Relevance for Query-Sensitive Similarity Measurement in Information Retrieval
Cited 4 times in Scopus
Author
나승훈
Publication date
2013-03
Source
Information Processing & Management, v.49 no.2, pp.558-575
ISSN
0306-4573
Publisher
Elsevier
DOI
https://dx.doi.org/10.1016/j.ipm.2012.10.002
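Project
12VS1500, Development of core technology for automatic interpretation and translation software for tourism and international events, with 90%-level interpretation accuracy and easy multilingual extension based on knowledge learning, 김영길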
Abstract
Interdocument similarities are the fundamental information source required in cluster-based retrieval, which is an advanced retrieval approach that significantly improves performance during information retrieval (IR). An effective similarity metric is query-sensitive similarity, which was introduced by Tombros and van Rijsbergen as a method to more directly satisfy the cluster hypothesis that forms the basis of cluster-based retrieval. Although this method is reported to be effective, existing applications of query-specific similarity are still limited to vector space models wherein there is no connection to probabilistic approaches. We suggest a probabilistic framework that defines query-sensitive similarity based on probabilistic co-relevance, where the similarity between two documents is proportional to the probability that they are both co-relevant to a specific given query. We further simplify the proposed co-relevance-based similarity by decomposing it into two separate relevance models. We then formulate all the requisite components for the proposed similarity metric in terms of scoring functions used by language modeling methods. Experimental results obtained using standard TREC test collections consistently showed that the proposed query-sensitive similarity measure performs better than term-based similarity and existing query-sensitive similarity in the context of Voorhees' nearest neighbor test (NNT). © 2012 Elsevier Ltd. All rights reserved.
Keywords
Cluster hypothesis, Cluster-based retrieval, Inter-document similarity, Probabilistic co-relevance, Query-sensitive similarity
KSP suggested keywords
Cluster-based retrieval, Information retrieval (IR), Information sources, Modeling method, Probabilistic approach, Probabilistic framework, Scoring functions, Similarity Measurement, Similarity Metric, Test collections, Vector space model (VSM)