NP-hard problems in hierarchical-tree clustering
Acta Informatica
Algorithms for clustering data
Algorithms for clustering data
The Strength of Weak Learnability
Machine Learning
Machine Learning
Multilevel hypergraph partitioning: application in VLSI domain
DAC '97 Proceedings of the 34th annual Design Automation Conference
Automatic subspace clustering of high dimensional data for data mining applications
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
OPTICS: ordering points to identify the clustering structure
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Fast algorithms for projected clustering
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs
SIAM Journal on Scientific Computing
Clustering through decision tree construction
Proceedings of the ninth international conference on Information and knowledge management
Information Retrieval
Multi-Objective Optimization Using Evolutionary Algorithms
Multi-Objective Optimization Using Evolutionary Algorithms
A Monte Carlo algorithm for fast projective clustering
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
When Is "Nearest Neighbor" Meaningful?
ICDT '99 Proceedings of the 7th International Conference on Database Theory
Refining Initial Points for K-Means Clustering
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
What Is the Nearest Neighbor in High Dimensional Spaces?
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Voting-Merging: An Ensemble Method for Clustering
ICANN '01 Proceedings of the International Conference on Artificial Neural Networks
Finding Consistent Clusters in Data Partitions
MCS '01 Proceedings of the Second International Workshop on Multiple Classifier Systems
An Adaptive Meta-Clustering Approach: Combining the Information from Different Clustering Results
CSB '02 Proceedings of the IEEE Computer Society Conference on Bioinformatics
Data Clustering Using Evidence Accumulation
ICPR '02 Proceedings of the 16th International Conference on Pattern Recognition (ICPR'02) - Volume 4
Cluster ensembles --- a knowledge reuse framework for combining multiple partitions
The Journal of Machine Learning Research
Bagging for Path-Based Clustering
IEEE Transactions on Pattern Analysis and Machine Intelligence
RCV1: A New Benchmark Collection for Text Categorization Research
The Journal of Machine Learning Research
Subspace clustering for high dimensional data: a review
ACM SIGKDD Explorations Newsletter - Special issue on learning from imbalanced datasets
Solving cluster ensemble problems by bipartite graph partitioning
ICML '04 Proceedings of the twenty-first international conference on Machine learning
HARP: A Practical Projected Clustering Algorithm
IEEE Transactions on Knowledge and Data Engineering
Density Connected Clustering with Local Subspace Preferences
ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
SCHISM: A New Approach for Interesting Subspace Mining
ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
Iterative Projected Clustering by Subspace Mining
IEEE Transactions on Knowledge and Data Engineering
Projective Clustering by Histograms
IEEE Transactions on Knowledge and Data Engineering
Combining multiple clustering systems
PKDD '04 Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases
On Discovery of Extremely Low-Dimensional Clusters Using Semi-Supervised Projected Clustering
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Clustering Ensembles: Models of Consensus and Weak Partitions
IEEE Transactions on Pattern Analysis and Machine Intelligence
Comparing clusterings: an axiomatic view
ICML '05 Proceedings of the 22nd international conference on Machine learning
A Generic Framework for Efficient Subspace Clustering of High-Dimensional Data
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Comparing Subspace Clusterings
IEEE Transactions on Knowledge and Data Engineering
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
ACM Transactions on Knowledge Discovery from Data (TKDD)
An aggregated clustering approach using multi-ant colonies algorithms
Pattern Recognition
Locally adaptive metrics for clustering high dimensional data
Data Mining and Knowledge Discovery
Data Clustering: Theory, Algorithms, and Applications (ASA-SIAM Series on Statistics and Applied Probability)
Muiltiobjective optimization using nondominated sorting in genetic algorithms
Evolutionary Computation
Knowledge and Information Systems
Solving Consensus and Semi-supervised Clustering Problems Using Nonnegative Matrix Factorization
ICDM '07 Proceedings of the 2007 Seventh IEEE International Conference on Data Mining
ICDM '07 Proceedings of the 2007 Seventh IEEE International Conference on Data Mining
EDSC: efficient density-based subspace clustering
Proceedings of the 17th ACM conference on Information and knowledge management
Weighted cluster ensembles: Methods and analysis
ACM Transactions on Knowledge Discovery from Data (TKDD)
ACM Transactions on Knowledge Discovery from Data (TKDD)
A Probability Model for Projective Clustering on High Dimensional Data
ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
Subspace and projected clustering: experimental evaluation and analysis
Knowledge and Information Systems
Projective Clustering Ensembles
ICDM '09 Proceedings of the 2009 Ninth IEEE International Conference on Data Mining
ICDM '09 Proceedings of the 2009 Ninth IEEE International Conference on Data Mining
Evaluating clustering in subspace projections of high dimensional data
Proceedings of the VLDB Endowment
Finding natural clusters using multi-clusterer combiner based on shared nearest neighbors
MCS'03 Proceedings of the 4th international conference on Multiple classifier systems
Detection and visualization of subspace cluster hierarchies
DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications
Nonparametric Bayesian clustering ensembles
ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part III
Enhancing Single-Objective Projective Clustering Ensembles
ICDM '10 Proceedings of the 2010 IEEE International Conference on Data Mining
Statistical Analysis and Data Mining
A review: accuracy optimization in clustering ensembles using genetic algorithms
Artificial Intelligence Review
Advancing data clustering via projective clustering ensembles
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
The role of hubness in clustering high-dimensional data
PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part I
DB-CSC: a density-based approach for subspace clustering in graphs with feature vectors
ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part I
Scalable density-based subspace clustering
Proceedings of the 20th ACM international conference on Information and knowledge management
External evaluation measures for subspace clustering
Proceedings of the 20th ACM international conference on Information and knowledge management
Finding hierarchies of subspace clusters
PKDD'06 Proceedings of the 10th European conference on Principle and Practice of Knowledge Discovery in Databases
A fast and elitist multiobjective genetic algorithm: NSGA-II
IEEE Transactions on Evolutionary Computation
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
A considerable amount of work has been done in data clustering research during the last four decades, and myriad methods have been proposed focusing on different data types, proximity functions, cluster representation models, and cluster presentation. However, clustering remains a challenging problem due to its ill-posed nature: it is well known that off-the-shelf clustering methods may discover different patterns in a given set of data, mainly because every clustering algorithm has its own bias resulting from the optimization of different criteria. This bias becomes even more important because, in almost all real-world applications, data is inherently high-dimensional and multiple clustering solutions might be available for the same data collection. In this respect, the problems of projective clustering and clustering ensembles have recently been defined to deal with the high-dimensionality and multiple-clusterings issues, respectively. Nevertheless, although these two issues are often encountered together, existing approaches to the two problems have been developed independently of each other. In our earlier work (Gullo et al., Proceedings of the International Conference on Data Mining (ICDM), 2009a), we introduced a novel clustering problem, called projective clustering ensembles (PCE): given a set (ensemble) of projective clustering solutions, the goal is to derive a projective consensus clustering, i.e., a projective clustering that complies with the information on the object-to-cluster and feature-to-cluster assignments given in the ensemble. In this paper, we enhance our previous study and provide theoretical and experimental insights into the PCE problem. PCE is formalized as an optimization problem and is designed to satisfy desirable requirements: independence from the specific clustering ensemble algorithm, the ability to handle hard as well as soft data clustering, and support for different feature weightings.
Two PCE formulations are defined: a two-objective optimization problem, in which the two objective functions respectively account for the object- and feature-based representations of the solutions in the ensemble, and a single-objective optimization problem, in which the object- and feature-based representations are embedded into a single function that measures the distance error between the projective consensus clustering and the projective ensemble. The significance of the proposed methods for solving the PCE problem is shown through an extensive experimental evaluation on several datasets, in comparison with projective clustering and clustering ensemble baselines.
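To make the distinction between the two formulations concrete, the following is a minimal illustrative sketch, not the definitions used in the paper. It assumes each projective clustering is a pair of matrices: `gamma` (objects × clusters, object-to-cluster assignments) and `delta` (clusters × features, feature-to-cluster weights). The function names, the squared-difference error measures, and the choice of summing the two errors to obtain a single objective are all assumptions made for illustration.

```python
import numpy as np

def object_view(gamma):
    """Object-based representation: n x n co-association matrix.
    It is unchanged when cluster labels are permuted."""
    return gamma @ gamma.T

def feature_view(gamma, delta):
    """Feature-based representation: n x d matrix holding, for each object,
    the feature weights of the cluster(s) it belongs to."""
    return gamma @ delta

def pce_errors(candidate, ensemble):
    """Return (object_error, feature_error) of a candidate consensus,
    averaged over the ensemble. The squared differences are an assumed,
    illustrative error measure."""
    g, d = candidate
    obj_err = np.mean([np.sum((object_view(g) - object_view(ge)) ** 2)
                       for ge, _ in ensemble])
    feat_err = np.mean([np.sum((feature_view(g, d) - feature_view(ge, de)) ** 2)
                        for ge, de in ensemble])
    return float(obj_err), float(feat_err)

def single_objective(candidate, ensemble):
    """Assumed scalarization: embed both errors into one value by summing."""
    return sum(pce_errors(candidate, ensemble))

# Toy data (hypothetical): 4 objects, 3 features, 2 clusters.
gamma_a = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
delta_a = np.array([[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]])
# Same clustering as (gamma_a, delta_a) with the cluster labels swapped.
gamma_b = gamma_a[:, ::-1].copy()
delta_b = delta_a[::-1].copy()
ensemble = [(gamma_a, delta_a), (gamma_b, delta_b)]

# A candidate that groups the objects differently.
gamma_c = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)
candidate_good = (gamma_a, delta_a)
candidate_bad = (gamma_c, delta_a)
```

In a two-objective setting the pair returned by `pce_errors` would be handed to a multi-objective optimizer, whereas `single_objective` collapses it into one value to minimize. Both views are built from matrix products so that relabeling the clusters of an ensemble member does not change the error.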