A Document Descriptor Extractor Based on Relevant Expressions

  • Authors: Joaquim Ferreira Silva; Gabriel Pereira Lopes

  • Affiliations: DI/FCT Universidade Nova de Lisboa, Caparica, Portugal 2829-516 (both authors)

  • Venue: EPIA '09 Proceedings of the 14th Portuguese Conference on Artificial Intelligence: Progress in Artificial Intelligence
  • Year: 2009

Abstract

People are often asked to associate keywords with documents so that applications can access a summary of each document's core content. This was the main motivation for working on an approach that may help move from this manual procedure to an automatic one. Since Relevant Expressions (REs), or multi-word term expressions, can be extracted automatically using the LocalMaxs algorithm, the most relevant ones can be used to describe the core content of each document. In this paper we present a language-independent approach for the automatic generation of document descriptors. Results are shown for three different European languages, and different metrics for selecting the most informative REs of each document are compared.
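
The idea of selecting the most informative REs per document can be illustrated with a minimal sketch. The abstract does not name the metrics the paper compares, so this sketch uses plain tf-idf as a stand-in. It also assumes the candidate expressions are already available, for example from a prior LocalMaxs run, which is not implemented here. The function names `count_occurrences` and `top_descriptors`, the toy documents, and the candidate list are all hypothetical.

```python
import math

def count_occurrences(tokens, expression):
    """Count how often a multi-word expression occurs in a token list."""
    words = expression.split()
    n = len(words)
    return sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words)

def top_descriptors(docs, candidates, k=1):
    """Rank candidate expressions in each document by tf-idf and keep the top k.

    tf-idf is an assumption of this sketch, not the metric from the paper.
    Tokenization is a plain whitespace split, so no language-specific
    resources (stop lists, taggers) are involved.
    """
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    counts = {
        name: {c: count_occurrences(tokens, c) for c in candidates}
        for name, tokens in tokenized.items()
    }
    # Document frequency: number of documents containing each candidate.
    df = {c: sum(1 for name in counts if counts[name][c] > 0) for c in candidates}

    result = {}
    for name, tokens in tokenized.items():
        scores = {}
        for c in candidates:
            tf = counts[name][c]
            if tf > 0:
                scores[c] = (tf / len(tokens)) * math.log(len(docs) / df[c])
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda c: (-scores[c], c))
        result[name] = ranked[:k]
    return result

# Toy corpus and hand-picked candidates (illustrative only).
docs = {
    "d1": "the common fisheries policy sets quotas and the common fisheries policy applies to member states",
    "d2": "member states shall adopt the monetary policy and the monetary policy is binding",
    "d3": "member states report on the common fisheries policy",
}
candidates = ["common fisheries policy", "monetary policy", "member states"]

descriptors = top_descriptors(docs, candidates, k=1)
for name in sorted(descriptors):
    print(name, descriptors[name])
```

In this toy corpus "member states" occurs in every document, so its inverse document frequency is zero and it is never ranked first. That is the behaviour one would want from any descriptor-selection metric: expressions that do not discriminate between documents should not be chosen to describe them.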