A Document Descriptor Extractor Based on Relevant Expressions

Authors:
Joaquim Ferreira Silva; Gabriel Pereira Lopes
Affiliations:
DI/FCT Universidade Nova de Lisboa, Caparica, Portugal 2829-516; DI/FCT Universidade Nova de Lisboa, Caparica, Portugal 2829-516
Venue:
EPIA '09 Proceedings of the 14th Portuguese Conference on Artificial Intelligence: Progress in Artificial Intelligence
Year:
2009

Abstract

People are often asked to assign keywords to documents so that applications can access a summary of each document's core content. This motivated our work on an approach that may help replace this manual procedure with an automatic one. Since Relevant Expressions (REs), or multi-word term expressions, can be extracted automatically using the LocalMaxs algorithm, the most relevant ones can be used to describe the core content of each document. In this paper we present a language-independent approach to the automatic generation of document descriptors. Results are shown for three different European languages, and different metrics for selecting the most informative REs of each document are compared.
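
As a rough illustration of the selection step described in the abstract, the sketch below ranks candidate expressions for each document and keeps the top few as descriptors. It assumes the candidates have already been extracted (LocalMaxs itself is not implemented here), and it uses tf-idf purely as a stand-in scoring metric, since the abstract does not name the metrics compared in the paper. The function name `rank_descriptors`, the `top_k` parameter and the sample corpus are hypothetical.

```python
import math
from collections import Counter


def rank_descriptors(doc_expressions, top_k=3):
    """Rank candidate expressions per document by tf-idf (an assumed metric).

    doc_expressions maps a document id to the list of candidate
    multi-word expressions found in it, repeats included.
    Returns a dict mapping each document id to its top_k expressions.
    """
    n_docs = len(doc_expressions)

    # Document frequency: in how many documents each expression occurs.
    df = Counter()
    for exprs in doc_expressions.values():
        df.update(set(exprs))

    ranked = {}
    for doc_id, exprs in doc_expressions.items():
        tf = Counter(exprs)
        total = len(exprs)
        scores = {
            expr: (count / total) * math.log(n_docs / df[expr])
            for expr, count in tf.items()
        }
        # Highest score first; ties broken alphabetically for determinism.
        ranked[doc_id] = sorted(scores, key=lambda e: (-scores[e], e))[:top_k]
    return ranked


# Hypothetical sample data: expressions assumed to be already extracted.
corpus = {
    "d1": ["common market", "common market", "fishing quota", "member states"],
    "d2": ["member states", "data protection", "data protection"],
    "d3": ["member states", "common market"],
}
descriptors = rank_descriptors(corpus, top_k=2)
```

Because the scoring uses only counts of the expressions themselves, nothing in the sketch depends on a particular language. An expression that occurs in every document, such as "member states" in the sample, scores zero and falls to the bottom of each ranking.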