Reexamining the cluster hypothesis: scatter/gather on retrieval results
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Web document clustering: a feasibility demonstration
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Grouper: a dynamic clustering interface to Web search results
WWW '99 Proceedings of the eighth international conference on World Wide Web
Generating hierarchical summaries for web searches
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Proceedings of the 13th international conference on World Wide Web
Learning to cluster web search results
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
A personalized search engine based on web-snippet hierarchical clustering
WWW '05 Special interest tracks and posters of the 14th international conference on World Wide Web
A search result clustering method using informatively named entities
Proceedings of the 7th annual ACM international workshop on Web information and data management
Journal of Intelligent Information Systems
Hi-index | 0.00 |
Search result clustering can help users quickly browse through the documents returned by search engine. Traditional clustering techniques are inadequate since they don't generate clusters with highly readable names. Label-based clustering is quite promising, which usually takes n-gram (usually bi-gram) as label candidates. However, meaningless n-grams are not removed from the candidates. In this paper, DF, user log and query context are introduced as label ranking features. An integrated model is used to combine these three features for cluster label ranking. Further more, a novel graph based clustering algorithm (GBCA) for hierarchical clustering is proposed. Experiments indicate that the cluster label extraction makes a great improvement (about 8%) over the baseline in precision, and GBCA outperforms STC and Snaket in F-Measure.