A new cell-based clustering method for large, high-dimensional data in data mining applications

Authors:
Jae-Woo Chang;Du-Seok Jin
Affiliations:
Chonbuk National University, Chonju, Chonbuk 561-756, South Korea; Korea Institute of Science and Technology Information, Yusong, Taejon 305-333, South Korea
Venue:
Proceedings of the 2002 ACM symposium on Applied computing
Year:
2002

Citing 4
Cited 10

A cost model for nearest neighbor search in high-dimensional data space

PODS '97 Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Automatic subspace clustering of high dimensional data for data mining applications

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Data mining: concepts and techniques
STING: A Statistical Information Grid Approach to Spatial Data Mining

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases

Subspace clustering for high dimensional data: a review

ACM SIGKDD Explorations Newsletter - Special issue on learning from imbalanced datasets
Learning correlations using the mixture-of-subsets model

ACM Transactions on Knowledge Discovery from Data (TKDD)
SS-ClusterTree: a subspace clustering based indexing algorithm over high-dimensional image features

CIVR '08 Proceedings of the 2008 international conference on Content-based image and video retrieval
Automatic parameter determination in subspace clustering with gravitation function

Proceedings of the Fourteenth International Database Engineering & Applications Symposium
Making interval-based clustering rank-aware

Proceedings of the 14th International Conference on Extending Database Technology
A new clustering algorithm with the convergence proof

KES'11 Proceedings of the 15th international conference on Knowledge-based and intelligent information and engineering systems - Volume Part I
Feature interaction in subspace clustering using the Choquet integral

Pattern Recognition
Clustering in applications with multiple data sources-A mutual subspace clustering approach

Neurocomputing
Novel soft subspace clustering with multi-objective evolutionary approach for high-dimensional data

Pattern Recognition
A clustering ensemble framework based on elite selection of weighted clusters

Advances in Data Analysis and Classification

Abstract

Data mining applications increasingly have to handle large amounts of high-dimensional data. However, most clustering methods for data mining do not work efficiently on large, high-dimensional data because of the so-called 'curse of dimensionality' [1] and the limits of available memory. In this paper, we propose a new cell-based clustering method that is more efficient for large, high-dimensional data than existing clustering methods. Our method provides an efficient cell creation algorithm based on a space-partitioning technique and uses a filtering-based index structure built on an approximation technique. Finally, we compare the performance of our cell-based clustering method with the CLIQUE method in terms of cluster construction time, precision, and retrieval time. The experimental results show that our clustering method achieves better performance on cluster construction time and retrieval time.
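
To make the general idea of cell-based clustering concrete, the following is a minimal sketch in Python. It is not the cell creation algorithm or the filtering-based index proposed in the paper, whose details the abstract does not give. It assumes points normalized to [0, 1) in every dimension, an equal-width partition of each dimension, and a simple density threshold; the names `cell_of`, `dense_cells`, `group_adjacent`, `intervals`, and `min_points` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Sequence, Set, Tuple

Cell = Tuple[int, ...]


def cell_of(point: Sequence[float], intervals: int) -> Cell:
    """Map a point in [0, 1)^d to the index of the grid cell containing it."""
    # Assumption: equal-width partitioning; clamp so 1.0 falls in the last cell.
    return tuple(min(int(x * intervals), intervals - 1) for x in point)


def dense_cells(points: Iterable[Sequence[float]],
                intervals: int,
                min_points: int) -> Dict[Cell, int]:
    """Count points per occupied cell and keep cells with at least min_points."""
    # Only occupied cells are stored, so memory grows with the data,
    # not with the total number of cells (intervals ** d).
    counts = Counter(cell_of(p, intervals) for p in points)
    return {cell: n for cell, n in counts.items() if n >= min_points}


def group_adjacent(cells: Iterable[Cell]) -> List[Set[Cell]]:
    """Merge dense cells that share a face into clusters (flood fill)."""
    remaining = set(cells)
    clusters: List[Set[Cell]] = []
    while remaining:
        seed = remaining.pop()
        cluster = {seed}
        stack = [seed]
        while stack:
            current = stack.pop()
            # Visit the two face neighbours along every dimension.
            for dim in range(len(current)):
                for step in (-1, 1):
                    neighbour = (current[:dim]
                                 + (current[dim] + step,)
                                 + current[dim + 1:])
                    if neighbour in remaining:
                        remaining.remove(neighbour)
                        cluster.add(neighbour)
                        stack.append(neighbour)
        clusters.append(cluster)
    return clusters
```

For example, calling `dense_cells(points, intervals=10, min_points=2)` on a small two-dimensional data set and passing the result to `group_adjacent` yields one set of cells per cluster. Storing only occupied cells is one common way to limit memory use when the number of possible cells grows exponentially with the dimensionality.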