Cluster ensemble methods have emerged as powerful techniques that aggregate several input data clusterings into a single output clustering with improved robustness and stability. In particular, link-based similarity techniques have recently been introduced and shown to outperform the conventional co-association method. Their potential and applicability are, however, limited by their underlying time complexity. In light of this shortcoming, this paper presents two approximate approaches that mitigate the problem of time complexity: the approximate algorithm approach (Approximate SimRank Based Similarity matrix) and the approximate data approach (Prototype-based cluster ensemble model). The first approach decreases the computational requirement of the existing link-based technique; the second reduces the size of the problem by finding a smaller, representative, approximate dataset, derived by a density-biased sampling technique. The advantages of both approximate approaches are empirically demonstrated over 22 datasets (both artificial and real data), with statistical comparisons of performance (at the 95% confidence level) on three well-known validity criteria. The results of these experiments suggest that approximate techniques can efficiently help scale up the application of link-based similarity methods to a wider range of data sizes.
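For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of the conventional co-association matrix, in which the similarity of two data points is the fraction of base clusterings that place them in the same cluster. It does not implement the link-based or approximate methods of the paper; the function name `co_association` and the toy labelings are assumptions made for illustration only.

```python
def co_association(clusterings):
    """Build the n x n co-association matrix from a list of label lists.

    clusterings: list of m labelings, each a list of n cluster labels.
    Entry [i][j] is the fraction of labelings in which points i and j
    share a cluster. (Illustrative helper, not taken from the paper.)
    """
    m = len(clusterings)
    n = len(clusterings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in clusterings:
        if len(labels) != n:
            raise ValueError("all clusterings must label the same n points")
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    # Normalise the co-occurrence counts by the number of base clusterings.
    return [[counts[i][j] / m for j in range(n)] for i in range(n)]


# Three hypothetical base clusterings of four data points.
base_clusterings = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
ca = co_association(base_clusterings)
```

Cluster labels are compared only within a single labeling, so the arbitrary numbering of clusters across base clusterings does not matter. A consensus clustering is typically obtained by applying a similarity-based clustering algorithm to such a matrix.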