ETRI-Knowledge Sharing Platform

Details

Journal article: Selectivity Estimation Using Frequent Itemset Mining
Authors
엄보윤, Christopher Jermaine, 이춘화
Publication date
2015-02
Source
한국지식정보기술학회논문지, v.10 no.1, pp.69-78
ISSN
1975-7700
Publisher
한국지식정보기술학회
Funded project
14MR3300, Development of open-screen service platform technology through distribution of and collaboration among multiple atypical screens, 이현우
Abstract
Query optimization is an important function of a database management system, because overall query execution time can be significantly affected by the quality of the plan that the query optimizer chooses. Under cost-based optimization, the optimizer estimates the cost of every candidate query plan from the underlying data distribution, as summarized in synopses of the database relations. The most common synopses in commercial databases have been histograms. However, when attribute values are correlated, one-dimensional histograms can provide poor estimation quality. Motivated by this, we propose a new approach that performs more accurate selectivity estimation, even for correlated data. To capture the correlation that may exist in the data, we adopt well-known data mining techniques and use frequent itemset mining to extract attribute values that frequently occur together. Through experimentation, we found that our approach is effective in modeling correlations and that it approximates intermediate relations more accurately. In particular, it gives precise estimates for correlated data.
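The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes a toy relation held as a list of dictionaries, a brute-force itemset counter in place of a real miner such as Apriori, and hypothetical names (`mine_frequent_itemsets`, `independence_estimate`, `itemset_estimate`, `cars`). The independence estimate stands in for what per-attribute one-dimensional histograms would yield on a conjunctive equality predicate.

```python
from collections import Counter
from itertools import combinations


def mine_frequent_itemsets(rows, min_support):
    """Count every set of (attribute, value) pairs that co-occurs in a row
    and keep the sets seen in at least min_support rows.
    Brute force for illustration only; a real miner would prune candidates."""
    counts = Counter()
    for row in rows:
        items = sorted(row.items())
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {itemset: c for itemset, c in counts.items() if c >= min_support}


def independence_estimate(rows, predicate):
    """Multiply per-attribute selectivities, assuming attributes are independent."""
    n = len(rows)
    selectivity = 1.0
    for attr, value in predicate.items():
        selectivity *= sum(1 for r in rows if r[attr] == value) / n
    return selectivity


def itemset_estimate(rows, frequent, predicate):
    """Use the stored co-occurrence count when the predicate is a frequent
    itemset; otherwise fall back to the independence assumption."""
    key = frozenset(predicate.items())
    if key in frequent:
        return frequent[key] / len(rows)
    return independence_estimate(rows, predicate)


# Hypothetical correlated data: the model determines the make.
cars = (
    [{"make": "Honda", "model": "Civic"}] * 4
    + [{"make": "Ford", "model": "Focus"}] * 4
    + [{"make": "Honda", "model": "Accord"}] * 2
)
query = {"make": "Honda", "model": "Civic"}
frequent = mine_frequent_itemsets(cars, min_support=3)

true_selectivity = sum(1 for r in cars if r == query) / len(cars)
print(f"true:         {true_selectivity:.2f}")
print(f"independence: {independence_estimate(cars, query):.2f}")
print(f"itemset:      {itemset_estimate(cars, frequent, query):.2f}")
```

On this toy relation, 4 of 10 rows match the query. The independence estimate multiplies 6/10 by 4/10 and so underestimates the result, while the itemset-based estimate reads the co-occurrence count directly. Predicates that are not frequent itemsets fall back to the independence estimate in this sketch; how the paper handles that case is not stated in the abstract.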
KSP suggested keywords
Correlated data, Data Distribution, Data mining(DM), Database Management System, Estimation quality, Frequent itemset mining, Important function, New approach, One-dimensional, Query Optimization