A Concept-Driven Algorithm for Clustering Search Results

  • Authors: Stanislaw Osinski; Dawid Weiss
  • Affiliations: Poznan University of Technology; Poznan University of Technology
  • Venue: IEEE Intelligent Systems
  • Year: 2005

Abstract

Most search engines return search results in a single-dimensional ranking of relevance to a user's query. Although this method works well for specific information needs, it often fails when users submit broad, ambiguous queries, seeking a general cross-section of topics related to the query. Search result clustering has successfully served this purpose in both commercial and scientific systems. The proposed method separates search results (document references) into meaningful groups. Unlike previous clustering techniques that use some proximity measure between documents, this method tries to discover meaningful phrases that can become cluster descriptions and only then assign documents to those phrases to form clusters. This idea is the core of the Lingo algorithm, which combines common phrase discovery and latent semantic indexing techniques. Clusters created by Lingo are compared to those created by the classic suffix-tree clustering algorithm.
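
The description-first idea in the abstract (find candidate phrases first, then attach documents to them) can be illustrated with a minimal Python sketch. This is not the authors' Lingo implementation: the latent semantic indexing step is left out and replaced by plain cosine similarity between term-count vectors, and every name here (`tokenize`, `frequent_phrases`, `cluster`, `STOPWORDS`, the `threshold` value, the "Other" group) is an assumption made for illustration only.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "to", "is", "on", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def frequent_phrases(docs, max_len=3, min_docs=2):
    """Map each word n-gram (1..max_len words) to its document frequency,
    keeping n-grams found in at least min_docs documents that neither
    start nor end with a stop word."""
    doc_freq = Counter()
    for tokens in docs:
        seen = set()
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                gram = tuple(tokens[i:i + n])
                if gram[0] not in STOPWORDS and gram[-1] not in STOPWORDS:
                    seen.add(gram)
        doc_freq.update(seen)  # each document counts a phrase once
    return {g: df for g, df in doc_freq.items() if df >= min_docs}


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def cluster(snippets, query, threshold=0.3):
    """Pick label phrases first, then assign snippets to those labels."""
    docs = [tokenize(s) for s in snippets]
    query_terms = set(tokenize(query))

    # Step 1: choose labels. Phrases containing a query term are dropped
    # because they cannot separate results; longer, more frequent phrases
    # win, and phrases whose words are all inside a chosen label are skipped.
    phrases = frequent_phrases(docs)
    candidates = {p: df for p, df in phrases.items() if query_terms.isdisjoint(p)}
    ranked = sorted(candidates, key=lambda p: (-len(p), -candidates[p], p))
    labels = []
    for phrase in ranked:
        if not any(set(phrase) <= set(chosen) for chosen in labels):
            labels.append(phrase)

    # Step 2: only now assign documents to labels. A snippet may join
    # several clusters; unmatched snippets go to "Other".
    vectors = [Counter(t for t in tokens if t not in STOPWORDS) for tokens in docs]
    clusters = {" ".join(label): [] for label in labels}
    assigned = set()
    for label in labels:
        label_vec = Counter(label)
        for i, vec in enumerate(vectors):
            if cosine(vec, label_vec) >= threshold:
                clusters[" ".join(label)].append(i)
                assigned.add(i)
    clusters["Other"] = [i for i in range(len(docs)) if i not in assigned]
    return clusters


# Hypothetical snippets for the ambiguous query "jaguar".
snippets = [
    "Jaguar cars official site luxury sports cars",
    "Jaguar cars dealer used luxury sports cars",
    "Jaguar big cat facts habitat",
    "The jaguar is a big cat of the Americas",
    "Jaguar Mac OS X release",
]
print(cluster(snippets, "jaguar"))
# {'luxury sports cars': [0, 1], 'big cat': [2, 3], 'Other': [4]}
```

In this sketch the labels are fixed before any document is grouped, so every cluster has a readable description by construction. A fuller treatment would compare phrases and documents in a reduced concept space obtained through latent semantic indexing, which is the part omitted here.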