Web Search Results Clustering Based on a Novel Suffix Tree Structure

  • Authors:
  • Junze Wang;Yijun Mo;Benxiong Huang;Jie Wen;Li He

  • Affiliations:
  • Institute of Communication Software and Switch Technology, Huazhong University of Science and Technology, Wuhan, China 430074;Institute of Communication Software and Switch Technology, Huazhong University of Science and Technology, Wuhan, China 430074;Institute of Communication Software and Switch Technology, Huazhong University of Science and Technology, Wuhan, China 430074;Institute of Communication Software and Switch Technology, Huazhong University of Science and Technology, Wuhan, China 430074;Institute of Communication Software and Switch Technology, Huazhong University of Science and Technology, Wuhan, China 430074

  • Venue:
  • ATC '08 Proceedings of the 5th international conference on Autonomic and Trusted Computing
  • Year:
  • 2008

Abstract

Clustering web search results helps users navigate to the results they need. With suffix tree clustering (STC), search results can be clustered quickly and automatically, and each cluster is labeled with a common phrase. Because a suffix tree requires a large amount of memory, other approaches with lower memory requirements have been proposed. Unlike those algorithms, however, STC is incremental, which makes it a promising approach for the long lists of snippets returned by search engines. In this paper we propose an approach for web search results clustering and labeling based on a new suffix tree data structure. The approach is an incremental, linear-time algorithm with significantly lower memory requirements. It also labels every final cluster with a common phrase, making it suitable for quick browsing by users. Experimental results show that the new approach performs better than conventional web search results clustering.
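
To make the idea of phrase-labeled clusters concrete, the following is a minimal Python sketch of how snippets that share a phrase can be grouped incrementally. It is an illustration only, not the data structure proposed in the paper: it swaps the suffix tree for a plain dictionary of word-level phrases, so it has neither the linear-time behavior nor the memory savings described above. The names `PhraseIndex`, `phrases`, `max_len`, and `min_docs`, as well as the scoring rule (number of snippets times phrase length), are assumptions made for this example.

```python
from collections import defaultdict


def phrases(snippet, max_len=3):
    """Yield every word-level phrase of up to max_len words in a snippet."""
    words = snippet.lower().split()
    for start in range(len(words)):
        for end in range(start + 1, min(start + max_len, len(words)) + 1):
            yield " ".join(words[start:end])


class PhraseIndex:
    """Hypothetical stand-in for a suffix tree: maps phrase -> snippet ids."""

    def __init__(self, max_len=3):
        self.max_len = max_len
        self.index = defaultdict(set)
        self.count = 0

    def add(self, snippet):
        """Insert one snippet incrementally and return its id."""
        doc_id = self.count
        self.count += 1
        for phrase in phrases(snippet, self.max_len):
            self.index[phrase].add(doc_id)
        return doc_id

    def base_clusters(self, min_docs=2):
        """Return (label, ids) pairs for phrases shared by >= min_docs snippets.

        Clusters are ordered by an assumed score: snippets * words in phrase.
        """
        clusters = [
            (phrase, sorted(ids))
            for phrase, ids in self.index.items()
            if len(ids) >= min_docs
        ]
        clusters.sort(key=lambda c: (-len(c[1]) * len(c[0].split()), c[0]))
        return clusters


index = PhraseIndex()
for snippet in ["suffix tree clustering", "fast suffix tree", "web search results"]:
    index.add(snippet)  # snippets can be added one at a time as they arrive

print(index.base_clusters()[0])  # ('suffix tree', [0, 1])
```

Each snippet is indexed as it arrives, which mirrors the incremental property mentioned in the abstract, and the shared phrase itself serves as the cluster label. A real suffix tree would avoid storing every phrase explicitly, which is where the memory concern comes from.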