Named entity recognition for Catalan using Spanish resources

  • Authors:
  • Xavier Carreras; Lluís Màrquez; Lluís Padró

  • Affiliations:
  • Universitat Politècnica de Catalunya, Jordi Girona, Barcelona; Universitat Politècnica de Catalunya, Jordi Girona, Barcelona; Universitat Politècnica de Catalunya, Jordi Girona, Barcelona

  • Venue:
  • EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 1
  • Year:
  • 2003

Abstract

This work studies Named Entity Recognition (NER) for Catalan without using any annotated resources for that language. The approach is based on machine learning techniques and exploits Spanish resources, either by first training models for Spanish and then translating them into Catalan, or by directly training bilingual models. The resulting models are then retrained on unlabelled Catalan data using bootstrapping techniques. Extensive experiments on real data show that the resulting NER systems achieve competitive results.