A search result clustering method using informatively named entities

  • Authors: Hiroyuki Toda; Ryoji Kataoka

  • Affiliations: NTT Corporation, Kanagawa, Japan; NTT Corporation, Kanagawa, Japan

  • Venue: Proceedings of the 7th annual ACM international workshop on Web information and data management

  • Year: 2005

Abstract

Clustering search results helps the user get an overview of the information returned. In this paper, we treat the clustering task as indexing the search results. Here, an index means a structured list of labels that makes it easier for the user to comprehend the labels and the search results. To realize this goal, we make three proposals. The first is to use Named Entity (NE) extraction for term extraction. The second is a new label selection criterion based on a term's importance in the search results and on the relation between terms and the search query. The third is label categorization using the category information of labels, which is generated by NE extraction. We implement a prototype system based on these proposals and find that it performs much better than existing methods. This paper focuses on news articles.
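
As a rough illustration of how the three proposals could fit together, the following Python sketch builds a categorized label index from search hits. It is not the authors' implementation. The abstract does not give the scoring formula, so the criterion here (document frequency within the results, with labels that merely restate the query dropped) is an assumption. The function name `build_index`, the `top_k` parameter, and the category names are hypothetical, and the named entities are assumed to come already extracted from some NE tagger.

```python
from collections import Counter, defaultdict


def build_index(results, query, top_k=2):
    """Build a categorized label index from search results.

    results: one list per search hit, each holding (entity, category)
             pairs assumed to come from an external NE extractor.
    query:   the search query string.
    top_k:   hypothetical cap on labels kept per category.
    """
    query_terms = {t.lower() for t in query.split()}
    doc_freq = Counter()
    category_of = {}

    # Proposal 1 (assumed input): named entities serve as candidate labels.
    for entities in results:
        for entity, category in set(entities):  # count once per hit
            doc_freq[entity] += 1
            category_of[entity] = category

    # Proposal 2 (assumed criterion): importance is the share of hits that
    # mention the entity; entities made up only of query words are skipped
    # because they do not help the user narrow the results.
    index = defaultdict(list)
    for entity, df in doc_freq.items():
        if set(entity.lower().split()) <= query_terms:
            continue
        index[category_of[entity]].append((entity, df / len(results)))

    # Proposal 3: group labels by NE category, best-scoring first
    # (ties broken alphabetically).
    return {
        category: sorted(labels, key=lambda p: (-p[1], p[0]))[:top_k]
        for category, labels in index.items()
    }


if __name__ == "__main__":
    hits = [
        [("Tokyo", "LOCATION"), ("NTT", "ORGANIZATION")],
        [("Tokyo", "LOCATION"), ("Osaka", "LOCATION")],
        [("NTT", "ORGANIZATION"), ("Tokyo", "LOCATION"), ("Kyoto", "LOCATION")],
    ]
    for category, labels in build_index(hits, "NTT").items():
        print(category, [name for name, _ in labels])
```

In this toy input the query term "NTT" is excluded as a label, and the remaining entities are listed under their NE category in order of how many hits mention them.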