A New Search Results Clustering Algorithm Based on Formal Concept Analysis

  • Authors:
  • Yun Zhang;Boqin Feng;Yewei Xue

  • Venue:
  • FSKD '08 Proceedings of the 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery - Volume 02
  • Year:
  • 2008

Abstract

Organizing web search results into a hierarchy of topics and subtopics facilitates browsing the collection and locating results of interest. In this paper, we propose a new method based on formal concept analysis (FCA) to build a two-level hierarchy for the search results retrieved for a query. After formal concepts are extracted using FCA, a new algorithm extracts the concepts most relevant to the query, and a two-level hierarchy is built and presented to the user. Evaluating the quality of the resulting clusters is a non-trivial task. Two improved objective metrics of clustering quality, ANMI@K and ANCE@K, are proposed in this paper. We compare our method with three other search results clustering (SRC) algorithms, Suffix Tree Clustering (STC), Lingo, and Vivisimo, using a comprehensive set of documents obtained from the Open Directory Project hierarchy as a benchmark. In addition to the comparison based on objective measures, we also subjectively analyze the properties of the cluster labels produced by the different SRC algorithms. The experimental results show that our method outperforms the other three SRC algorithms and is helpful for browsing and locating results of interest.
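
To illustrate the FCA step mentioned in the abstract, the following is a minimal sketch of how formal concepts can be enumerated from a small document-term context. It is not the authors' algorithm: the abstract does not specify the extraction procedure, the relevance ranking, or the hierarchy construction, so none of those are reproduced here. The function names (`extent`, `intent`, `formal_concepts`) and the toy data are assumptions made for illustration, and the brute-force enumeration over term subsets is only practical for very small contexts.

```python
from itertools import combinations

def extent(context, terms):
    """Documents that contain every term in `terms`."""
    return frozenset(d for d, ts in context.items() if terms <= ts)

def intent(context, docs, all_terms):
    """Terms shared by every document in `docs` (all terms if `docs` is empty)."""
    shared = set(all_terms)
    for d in docs:
        shared &= context[d]
    return frozenset(shared)

def formal_concepts(context):
    """Enumerate all formal concepts (extent, intent) by brute force.

    For every subset B of terms, the pair (B', B'') is a formal concept,
    and every concept arises this way. Exponential in the number of terms.
    """
    all_terms = frozenset().union(*context.values())
    concepts = set()
    for size in range(len(all_terms) + 1):
        for combo in combinations(sorted(all_terms), size):
            docs = extent(context, frozenset(combo))
            concepts.add((docs, intent(context, docs, all_terms)))
    return concepts

# Hypothetical toy context: search results (documents) mapped to their terms.
context = {
    "d1": {"jaguar", "car"},
    "d2": {"jaguar", "car", "dealer"},
    "d3": {"jaguar", "cat"},
}

concepts = formal_concepts(context)
# Print concepts from the largest extent to the smallest.
for docs, terms in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(docs), sorted(terms))
```

Each concept pairs a set of documents with the terms they all share, so a concept such as documents d1 and d2 with terms "car" and "jaguar" is a natural candidate for a labeled cluster. Selecting the concepts most relevant to the query and arranging them into two levels would be a separate step.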