Instance pruning by filtering uninformative words: an information extraction case study

  • Authors: Alfio Massimiliano Gliozzo; Claudio Giuliano; Raffaella Rinaldi

  • Affiliations: Istituto per la Ricerca Scientifica e Tecnologica, ITC-irst, Trento, Italy (all three authors)

  • Venue: CICLing'05 Proceedings of the 6th international conference on Computational Linguistics and Intelligent Text Processing

  • Year: 2005

Abstract

In this paper we present a novel instance pruning technique for Information Extraction (IE). Our technique filters uninformative words out of texts, on the assumption that words that are very frequent in the language carry no specific information about the text in which they appear, so they are unlikely to be (part of) relevant entities. Experiments on two benchmark datasets show that computation time can be significantly reduced without any significant decrease in prediction accuracy. We also report an improvement in accuracy for one task.
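
The general idea of frequency-based instance pruning can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how word frequency is estimated or which cutoff is used, so the relative-frequency threshold `min_rel_freq`, the toy reference corpus, and the function names `frequent_words` and `prune` are all assumptions made for illustration.

```python
from collections import Counter


def frequent_words(reference_tokens, min_rel_freq):
    """Return the set of words whose relative frequency in a reference
    corpus is at least min_rel_freq (a hypothetical cutoff)."""
    counts = Counter(tok.lower() for tok in reference_tokens)
    total = sum(counts.values())
    return {word for word, c in counts.items() if c / total >= min_rel_freq}


def prune(tokens, uninformative):
    """Keep only (position, token) pairs whose token is not in the
    uninformative set; only these would be passed on to a classifier."""
    return [(i, tok) for i, tok in enumerate(tokens)
            if tok.lower() not in uninformative]


# Toy reference corpus standing in for a large sample of the language.
reference = ("the speaker of the seminar is in the room "
             "and the talk is at noon").split()

uninformative = frequent_words(reference, min_rel_freq=0.1)
candidates = prune("The seminar is at noon".split(), uninformative)
print(candidates)
```

Original token positions are kept in the output so that any entity boundaries predicted on the surviving tokens can still be mapped back to the unpruned text. In this sketch, fewer tokens reach the learner, which is where a reduction in computation time would come from.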