Retrieval of Relevant Concepts from a Text Collection

  • Authors:
  • Henry Anaya-Sánchez; Rafael Berlanga-Llavori; Aurora Pons-Porrata

  • Affiliations:
  • Center of Pattern Recognition and Data Mining, Universidad de Oriente, Santiago de Cuba, Cuba; Departament de Llenguatges i Sistemes Informàtics, Universitat Jaume I, Castelló, Spain; Center of Pattern Recognition and Data Mining, Universidad de Oriente, Santiago de Cuba, Cuba

  • Venue:
  • Current Topics in Artificial Intelligence
  • Year:
  • 2007

Abstract

This paper addresses the characterization of a large text collection by introducing a method for retrieving sets of relevant WordNet concepts as descriptors of the collection's contents. The method combines models for identifying interesting word co-occurrences with an extension of a word sense disambiguation algorithm in order to retrieve the concepts that best fit the collection's topics. The retrieved concepts can include multi-word nominal concepts that do not appear explicitly in the texts. We evaluate our proposal using extensions of recall and precision that are also introduced in this paper.