ETRI-Knowledge Sharing Platform

Projected Clustering for Categorical Datasets
Cited 21 times in Scopus
Authors
Min Ho Kim, R.S. Ramakrishna
Issue Date
2006-09
Citation
Pattern Recognition Letters, v.27, no.12, pp.1405-1417
ISSN
0167-8655
Publisher
Elsevier
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.1016/j.patrec.2006.01.011
Abstract
This paper deals with the problem of clustering categorical datasets. Categorical data typically suffer from limited measuring levels and exhibit sparsity in a space of very high dimension. Conventional dissimilarity measures are, therefore, inadequate. We propose a new clustering algorithm based on projected clustering. The proposed algorithm, although hierarchical in essence, avoids the characteristic error propagation through reassignment and deletion of bad clusters. We also propose new indices for cluster validation in categorical datasets, an area that is almost unexplored. We present techniques for finding the optimal number of clusters and for initializing cluster centers. Experimental results demonstrate the effectiveness of the proposed clustering algorithm. The cluster validation for categorical datasets is also shown to be quite efficient. © 2006 Elsevier B.V. All rights reserved.
KSP Keywords
Categorical datasets, Cluster validation, Clustering algorithm, High dimension, Optimal number of clusters, Projected clustering, Dissimilarity measure, Error propagation