Automatic selection of noun phrases as document descriptors in an FCA-Based information retrieval system

  • Authors: Juan M. Cigarrán; Anselmo Peñas; Julio Gonzalo; Felisa Verdejo

  • Affiliation (all authors): Dept. Lenguajes y Sistemas Informáticos, E.T.S.I. Informática, UNED, Madrid, Spain

  • Venue: ICFCA'05 Proceedings of the Third International Conference on Formal Concept Analysis
  • Year: 2005

Abstract

Automatic attribute selection is a critical step when using Formal Concept Analysis (FCA) in a free-text document retrieval framework. Optimal attributes as document descriptors should produce smaller, clearer, and more browsable concept lattices with better clustering features. In this paper we focus on the automatic selection of noun phrases as document descriptors to build an FCA-based information retrieval (IR) framework. We present three phrase selection strategies, which are evaluated using the Lattice Distillation Factor and the Minimal Browsing Area evaluation measures. Noun phrases are shown to produce lattices with good clustering properties, with the advantage (over simple terms) of being better intensional descriptors from the user's point of view.
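
To make the setting concrete, the following is a minimal sketch of how documents described by noun phrases give rise to formal concepts, the nodes of a concept lattice. It is not the authors' system: the toy context, the document ids, the phrases, and the function names are all hypothetical, and neither the three phrase selection strategies nor the two evaluation measures are implemented here. The enumeration is brute force over every subset of phrases, so it is only suitable for very small examples.

```python
from itertools import combinations

# Hypothetical toy context: document id -> noun phrases chosen as descriptors.
context = {
    "d1": {"concept lattice", "information retrieval"},
    "d2": {"concept lattice", "noun phrase"},
    "d3": {"information retrieval", "noun phrase"},
    "d4": {"concept lattice", "information retrieval", "noun phrase"},
}


def extent(phrases, context):
    """Return the documents that carry every phrase in `phrases`."""
    return frozenset(d for d, descr in context.items() if phrases <= descr)


def intent(docs, context):
    """Return the phrases shared by every document in `docs`."""
    shared = set(frozenset().union(*context.values()))
    for d in docs:
        shared &= context[d]
    return frozenset(shared)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every phrase subset.

    Exponential in the number of phrases; illustration only.
    """
    all_phrases = sorted(frozenset().union(*context.values()))
    concepts = set()
    for size in range(len(all_phrases) + 1):
        for subset in combinations(all_phrases, size):
            ext = extent(frozenset(subset), context)
            concepts.add((ext, intent(ext, context)))
    return concepts


if __name__ == "__main__":
    # Print concepts from the most general (largest extent) downwards.
    for ext, itt in sorted(formal_concepts(context), key=lambda c: -len(c[0])):
        print(sorted(ext), "<->", sorted(itt))
```

Each printed pair links a set of documents to the noun phrases they all share. The number of such pairs grows with the number and distribution of descriptors, which is why the choice of descriptors affects how large and how browsable the resulting lattice is.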