Many clustering algorithms have been developed, and researchers need to be able to compare their effectiveness. For some clustering problems, such as web page clustering, different algorithms produce clusterings with different characteristics: coarse vs. fine granularity, disjoint vs. overlapping, flat vs. hierarchical. Because no existing evaluation method can evaluate clusterings with different characteristics, research results in this area cannot be compared with one another. QC4 addresses this by providing a new structure for defining general ideal clusterings and new measurements for evaluating clusterings with different characteristics with respect to a general ideal clustering. The paper describes QC4 and evaluates it within the web clustering domain by comparing it to existing evaluation measurements on synthetic test cases and on real-world web page clustering tasks. In the synthetic test cases, only QC4 copes correctly with overlapping clusters, hierarchical clusterings, and all the difficult boundary cases. In the real-world tasks, which represent simple clustering situations, QC4 is mostly consistent with the existing measurements and reaches better conclusions in some cases.