This paper details a modular, self-contained web search results clustering system that enhances search results by (i) clustering the lists of web documents returned by queries to search engines, and (ii) ranking the results and labeling the resulting clusters, using a calculated relevance value as the degree of membership in each cluster. In addition, we demonstrate an external evaluation method based on precision for comparing fuzzy clustering techniques, as well as internal measures suitable for working on non-training data. The built-in label generator uses the membership degrees and relevance values to weight the most relevant results more heavily. The membership degrees of documents in fuzzy clusters also facilitate effective detection and removal of overly similar clusters. To achieve this, our transduction-based clustering algorithm (TCA) and its fuzzy counterpart (FTCA) employ a transduction-based relevance model (TRM) to consider local relationships between web documents. Tests on five real-world and synthetic datasets show favorable results compared with the established label-based clustering algorithms Suffix Tree Clustering (STC) and Lingo.
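
As an illustration of how membership degrees can support the removal of overly similar clusters, the following is a minimal sketch and not the paper's FTCA procedure. It assumes each cluster is represented by a vector of per-document membership degrees, and it uses cosine similarity with a fixed threshold as a stand-in similarity test; the function names, the threshold of 0.9, and the sample data are all hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length membership vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def drop_similar_clusters(memberships, threshold=0.9):
    """Keep a cluster only if it is not too similar to one already kept.

    memberships: dict mapping a cluster label to a list of membership
    degrees, one per document (hypothetical representation).
    threshold: assumed cut-off above which two clusters count as duplicates.
    """
    kept = {}
    # Visit clusters with the largest total membership first, so the
    # "heavier" cluster of a near-duplicate pair is the one retained.
    for label, vec in sorted(memberships.items(), key=lambda kv: -sum(kv[1])):
        if all(cosine(vec, other) < threshold for other in kept.values()):
            kept[label] = vec
    return kept


# Made-up membership degrees for four documents in three clusters.
memberships = {
    "jaguar car": [0.9, 0.8, 0.1, 0.0],
    "jaguar cars": [0.8, 0.9, 0.2, 0.0],
    "jaguar animal": [0.0, 0.1, 0.9, 0.7],
}
result = drop_similar_clusters(memberships)
print(sorted(result))
```

In this made-up example the two car-related clusters have nearly identical membership vectors, so only the one with the larger total membership survives alongside the animal cluster. A crisp (non-fuzzy) clustering would have to fall back on set overlap for the same test, which is why graded membership degrees make the comparison more informative.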