Automatic parameter determination in subspace clustering with gravitation function

Authors:
Jiwu Zhao
Affiliations:
Heinrich-Heine University, Duesseldorf, Germany
Venue:
Proceedings of the Fourteenth International Database Engineering & Applications Symposium
Year:
2010

Abstract

Data mining is the process of discovering and exploiting hidden patterns in data. Clustering, an important data mining task, divides observations into groups (clusters) such that observations in the same cluster are similar to each other and observations in different clusters are dissimilar. Subspace clustering searches for clusters within subspaces of a data set, so clusters can be found not only in the full space but also in subsets of its dimensions. Well-known subspace clustering methods share a common problem: determining their parameters. To address this issue, this article introduces a new subspace clustering method based on the bottom-up approach. In contrast to other methods, it applies a gravitation function to select data and dimensions by means of a self-comparison technique. The new method can determine its parameters simply and independently of the amount of data, which makes subspace clustering more practical.
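
The statement that clusters may exist in a subspace even when the full space shows no clear grouping can be illustrated with a small sketch. This is not the gravitation-based method of the article, whose function is not defined in the abstract. The toy data, the function name largest_gap_ratio, and the gap-based score are all assumptions made purely for illustration.

```python
def largest_gap_ratio(values):
    """Largest gap between consecutive sorted values, as a fraction of the range.

    Hypothetical helper for illustration only: a value near 1 suggests two
    well-separated groups along this dimension, a small value suggests an
    even spread.
    """
    ordered = sorted(values)
    span = ordered[-1] - ordered[0]
    if span == 0:
        return 0.0
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return max(gaps) / span


# Assumed toy data: x forms two tight groups, y is spread evenly.
points = [
    (0.10, 0.0), (0.12, 0.2), (0.11, 0.4), (0.13, 0.6),
    (0.90, 0.1), (0.92, 0.3), (0.91, 0.5), (0.93, 0.7),
]

xs = [p[0] for p in points]
ys = [p[1] for p in points]

gap_x = largest_gap_ratio(xs)  # large: the subspace {x} shows two groups
gap_y = largest_gap_ratio(ys)  # small: the subspace {y} shows no grouping

print(f"x: {gap_x:.2f}, y: {gap_y:.2f}")
```

In this toy setting the grouping is visible only along x, so a method restricted to the full two-dimensional space would have its distances diluted by the uninformative y dimension. A bottom-up subspace method typically starts from such one-dimensional findings and combines dimensions from there.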