Tools for nominalization: an alternative for lexical normalization

  • Authors:
  • Marco Antonio Insaurriaga Gonzalez; Vera Lúcia Strube de Lima; José Valdeni de Lima

  • Affiliations:
  • PUCRS – Faculdade de Informática, Porto Alegre, Brazil; PUCRS – Faculdade de Informática, Porto Alegre, Brazil; UFRGS – Instituto de Informática, Porto Alegre, Brazil

  • Venue:
  • PROPOR'06: Proceedings of the 7th International Conference on Computational Processing of the Portuguese Language
  • Year:
  • 2006

Abstract

Recognizing morphological variation and conceptual proximity between words is crucial for tasks that rely on lexical normalization, such as term generation and matching in information retrieval. We present tools that automatically perform nominalization as a form of lexical normalization for Portuguese. We compare three alternative strategies (stemming, lemmatization, and our proposal, nominalization) and show through an experimental evaluation that nominalization, used as lexical normalization, improves the performance of a probabilistic information retrieval approach for Portuguese.
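
To make the contrast between the three strategies concrete, the following is a minimal toy sketch. It is not the tools described in the paper. The lookup tables, the suffix list, and the function names `stem`, `lemmatize`, and `nominalize` are all hypothetical, and nominalization is assumed here to mean mapping a word to a related noun.

```python
# Toy illustration only: every table and rule below is a hypothetical
# stand-in, not the resources or algorithms used by the authors.

# Hypothetical lemma table: inflected form -> dictionary form.
LEMMAS = {
    "construíram": "construir",
    "construiu": "construir",
    "construções": "construção",
}

# Hypothetical nominalization table: verb lemma -> related noun.
NOMINALIZATIONS = {
    "construir": "construção",
    "produzir": "produção",
}

# Crude suffix list for the toy stemmer, tried in this order.
SUFFIXES = ("íram", "ções", "ção", "iu", "ir")


def stem(word):
    """Strip the first matching suffix, keeping at least 3 characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def lemmatize(word):
    """Map an inflected form to its lemma; unknown words pass through."""
    return LEMMAS.get(word, word)


def nominalize(word):
    """Lemmatize, then map the lemma to a related noun when one is listed."""
    lemma = lemmatize(word)
    return NOMINALIZATIONS.get(lemma, lemma)


if __name__ == "__main__":
    for w in ("construíram", "construções"):
        print(w, stem(w), lemmatize(w), nominalize(w))
```

In this sketch, lemmatization keeps the verb and noun forms apart ("construir" versus "construção"), while nominalization maps both to the same noun, which is one plausible reason a noun-based index term could help term matching. Whether the paper's tools work this way is an assumption.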