Ensemble methods create solutions to learning problems by constructing a set of individual (different) solutions, and subsequently suitably aggregating these, e.g., by weighted averaging of the predictions in regression, or by taking a weighted vote on the predictions in classification. Such methods, which include Bayesian model averaging, bagging and boosting, have already become very popular for supervised learning problems. For clustering, using ensembles can help to improve the quality and robustness of the results, to re-use existing "knowledge", and to deal with data-distributed situations where not all objects or features are simultaneously available for computations. Aggregation strategies can be based on the idea of minimizing "average" dissimilarity. If only the individual cluster memberships are used, this leads to an optimization problem which in general is computationally hard. For a specific similarity measure which in the crisp case uses overall discordance (modulo relabeling), the characterization of the optimal solution allows the construction of a greedy forward aggregation algorithm ("voting") which performs well on a number of clustering problems. Alternative aggregation strategies can be based on re-clustering the objects according to the rate of co-labeling, or by clustering the collection of memberships of all objects grouped according to the labels. We conclude with an outlook on possible further research on cluster ensembles.
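One of the aggregation strategies mentioned above, re-clustering the objects according to the rate of co-labeling, can be sketched in a few lines. The sketch below is illustrative only and not the paper's algorithm: it builds the co-labeling (co-association) rates from base partitions given as label vectors, then links object pairs whose rate exceeds a threshold and takes connected components; the threshold value and the union-find grouping are assumptions made for the example.

```python
import numpy as np

def co_association(labelings):
    """Rate at which each pair of objects receives the same label
    across the base clusterings (rows of `labelings`)."""
    labelings = np.asarray(labelings)          # shape (m, n): m clusterings, n objects
    m, n = labelings.shape
    C = np.zeros((n, n))
    for lab in labelings:
        C += (lab[:, None] == lab[None, :])    # 1 where a pair is co-labeled
    return C / m

def consensus_clusters(labelings, threshold=0.5):
    """Re-cluster by linking pairs co-labeled more often than `threshold`,
    i.e. connected components of the thresholded co-association graph."""
    C = co_association(labelings)
    n = C.shape[0]
    adj = C > threshold
    # simple union-find to collect connected components
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]      # path halving
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            if adj[i, j]:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri
    roots = [find(i) for i in range(n)]
    # relabel components as 0, 1, 2, ... in order of first appearance
    relabel = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots]
```

Note that the result is invariant to the arbitrary label names used by the individual clusterings: three base partitions `[0,0,0,1,1,1]`, `[1,1,1,0,0,0]`, and `[2,2,0,1,1,1]` of six objects agree pairwise often enough that the consensus recovers the two groups `{0,1,2}` and `{3,4,5}`, even though no two of them use the same labels.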