Web search result clustering helps users navigate to the results they need. With suffix tree clustering (STC), search results can be clustered quickly and automatically, and each cluster is labeled with a common phrase. Because a suffix tree requires a large amount of memory, other approaches with lower memory requirements have been proposed. Unlike those algorithms, however, STC is incremental, which makes it a promising approach for handling the long lists of snippets returned by search engines. In this paper we propose an approach to clustering and labeling web search results that is based on a new suffix tree data structure. The approach is an incremental, linear-time algorithm with significantly lower memory requirements. It also labels every final cluster with a common phrase, so it is suitable for quick browsing by users. Experimental results show that the new approach outperforms conventional web search result clustering.
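The idea of grouping snippets by a shared phrase and using that phrase as the cluster label can be illustrated with a minimal sketch. This is not the data structure proposed in the paper: as a simplifying assumption, a plain dictionary of word n-grams stands in for the suffix tree, and the names `phrase_clusters`, `label_clusters`, `max_len`, and `min_docs` are hypothetical. The dictionary version is neither linear time nor memory efficient; it only shows how snippets can be indexed one at a time and how common phrases become labels.

```python
from collections import defaultdict


def phrase_clusters(snippets, max_len=3, min_docs=2):
    """Map each shared word phrase to the snippets that contain it.

    Assumption: a dict of n-grams replaces the suffix tree used by STC.
    """
    phrase_docs = defaultdict(set)
    for doc_id, text in enumerate(snippets):
        words = text.lower().split()
        # Each snippet is indexed as it arrives (the incremental step).
        for start in range(len(words)):
            stop = min(start + max_len, len(words))
            for end in range(start + 1, stop + 1):
                phrase_docs[tuple(words[start:end])].add(doc_id)
    # Keep only phrases shared by at least min_docs snippets.
    return {
        " ".join(phrase): sorted(docs)
        for phrase, docs in phrase_docs.items()
        if len(docs) >= min_docs
    }


def label_clusters(clusters):
    """For each distinct set of snippets, keep the longest shared phrase as label."""
    best = {}
    for phrase, docs in clusters.items():
        key = tuple(docs)
        if key not in best or len(phrase.split()) > len(best[key].split()):
            best[key] = phrase
    return {label: list(docs) for docs, label in best.items()}


snippets = [
    "suffix tree clustering of web results",
    "fast suffix tree clustering",
    "web results from search engines",
]
labeled = label_clusters(phrase_clusters(snippets))
```

Here `labeled` maps each label to the indices of the snippets that share it, so the first two snippets are grouped under their longest common phrase and the first and third under theirs. A real STC implementation would also merge overlapping base clusters and score them, which this sketch omits.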