A Human-Computer Interactive Method for Projected Clustering

Authors: Charu C. Aggarwal
Venue: IEEE Transactions on Knowledge and Data Engineering
Year: 2004

Abstract

Clustering is a central task in data mining applications such as customer segmentation. High-dimensional data has always been a challenge for clustering algorithms because of the inherent sparsity of the points. Techniques have therefore recently been proposed to find clusters in hidden subspaces of the data. However, since the behavior of the data can vary considerably across subspaces, it is often difficult to define the notion of a cluster using simple mathematical formalizations. The widely used practice of treating clustering as the exact problem of optimizing an arbitrarily chosen objective function can often lead to misleading results. In fact, the proper clustering definition may vary not only with the application and data set but also with the perceptions of the end user, which makes it difficult to separate the definition of the clustering problem from those perceptions. In this paper, we propose a system that performs high-dimensional clustering through cooperation between the human and the computer. The complex task of cluster creation is accomplished through a combination of human intuition and the computational support provided by the computer. The result is a system that leverages the best abilities of both the human and the computer for solving the clustering problem.