A Novel Method for Hierarchical Clustering of Search Results

Authors:
Gang Zhang;Yue Liu;Songbo Tan;Xueqi Cheng
Venue:
WI-IATW '07 Proceedings of the 2007 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Workshops
Year:
2007

Citing 8
Cited 1

Reexamining the cluster hypothesis: scatter/gather on retrieval results

SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Web document clustering: a feasibility demonstration

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Grouper: a dynamic clustering interface to Web search results

WWW '99 Proceedings of the eighth international conference on World Wide Web
Generating hierarchical summaries for web searches

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
A hierarchical monothetic document clustering algorithm for summarization and browsing search results

Proceedings of the 13th international conference on World Wide Web
Learning to cluster web search results

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
A personalized search engine based on web-snippet hierarchical clustering

WWW '05 Special interest tracks and posters of the 14th international conference on World Wide Web
A search result clustering method using informatively named entities

Proceedings of the 7th annual ACM international workshop on Web information and data management

A transduction-based approach to fuzzy clustering, relevance ranking and cluster label generation on web search results

Journal of Intelligent Information Systems

Abstract

Search result clustering helps users quickly browse the documents returned by a search engine. Traditional clustering techniques are inadequate because they do not generate clusters with highly readable names. Label-based clustering, which usually takes n-grams (typically bigrams) as label candidates, is quite promising. However, such methods do not remove meaningless n-grams from the candidates. In this paper, document frequency (DF), user logs, and query context are introduced as label ranking features, and an integrated model combines these three features for cluster label ranking. Furthermore, a novel graph-based clustering algorithm (GBCA) for hierarchical clustering is proposed. Experiments indicate that the cluster label extraction improves precision by about 8% over the baseline, and that GBCA outperforms STC and Snaket in F-measure.
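
The abstract does not say how the three label ranking features are combined, so the following is only an illustrative sketch, not the authors' integrated model. It assumes bigram candidates drawn from result snippets and a weighted linear sum of three normalized scores. The function names, the weights, and the query-context proxy (whether a candidate shares a term with the query) are all hypothetical choices made for this example.

```python
from collections import Counter


def bigrams(text):
    """Return the set of word bigrams in a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return {" ".join(pair) for pair in zip(tokens, tokens[1:])}


def rank_labels(snippets, query_log, query, weights=(0.5, 0.3, 0.2)):
    """Rank bigram label candidates by a weighted sum of three features.

    Assumed features (not taken from the paper):
      - DF: fraction of the maximum snippet document frequency
      - user log: fraction of the maximum count among logged queries
      - query context: 1.0 if the candidate shares a term with the query
    """
    # Document frequency: number of snippets containing each bigram.
    df = Counter()
    for snippet in snippets:
        df.update(bigrams(snippet))

    # User-log frequency: number of logged queries containing each bigram.
    log_counts = Counter()
    for logged_query in query_log:
        log_counts.update(bigrams(logged_query))

    query_terms = set(query.lower().split())
    max_df = max(df.values(), default=1)
    max_log = max(log_counts.values(), default=1)
    w_df, w_log, w_ctx = weights

    scores = {}
    for candidate, count in df.items():
        context = 1.0 if query_terms & set(candidate.split()) else 0.0
        scores[candidate] = (
            w_df * count / max_df
            + w_log * log_counts[candidate] / max_log
            + w_ctx * context
        )

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))


if __name__ == "__main__":
    snippets = ["jaguar car dealer", "jaguar car price", "jaguar cat habitat"]
    print(rank_labels(snippets, ["jaguar car"], "jaguar"))
```

In this sketch a candidate that is frequent in the snippets, appears in logged queries, and overlaps the query rises to the top, while a bigram supported by only one signal ranks lower. A learned model could replace the fixed weights.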