Self-training and co-training applied to Spanish named entity recognition

  • Authors:
  • Zornitsa Kozareva; Boyan Bonev; Andres Montoyo

  • Affiliations:
  • Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante, Spain (all three authors)

  • Venue:
  • MICAI'05: Proceedings of the 4th Mexican International Conference on Advances in Artificial Intelligence
  • Year:
  • 2005

Abstract

The paper discusses the use of unlabeled data for Spanish Named Entity Recognition. Two techniques are used: self-training to detect the entities in the text and co-training to classify the detected entities. We introduce a new co-training algorithm, which applies voting techniques to decide which unlabeled examples should be added to the training set at each iteration. We also propose a way to improve the performance of entity detection. A brief comparative study with existing co-training algorithms is presented.
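
The abstract does not give the details of the voting scheme, so the following is only a minimal sketch of the general idea of co-training with voting, not the authors' algorithm. It assumes a toy nearest-centroid classifier, numeric feature vectors split into hypothetical "views", and a rule that an unlabeled example joins the training set only when at least `min_votes` classifiers agree on its label. The names `train_centroids`, `predict`, `project`, and `co_train_with_voting`, and the labels `PER` and `LOC`, are illustrative assumptions.

```python
from collections import Counter


def project(features, view):
    """Keep only the feature positions that belong to one view."""
    return tuple(features[i] for i in view)


def train_centroids(examples):
    """Toy classifier (an assumption): one mean vector per label."""
    sums, counts = {}, Counter()
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def co_train_with_voting(labeled, unlabeled, views, rounds=5, min_votes=None):
    """Grow the labeled set with examples that enough classifiers agree on.

    labeled:   list of (features, label) pairs
    unlabeled: list of feature tuples
    views:     list of index tuples, one per classifier
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    if min_votes is None:
        min_votes = len(views)  # assumption: require unanimous agreement

    for _ in range(rounds):
        if not pool:
            break
        # Retrain one classifier per view on the current labeled set.
        models = [
            train_centroids([(project(f, view), y) for f, y in labeled])
            for view in views
        ]
        remaining, added = [], 0
        for features in pool:
            votes = Counter(
                predict(model, project(features, view))
                for model, view in zip(models, views)
            )
            label, count = votes.most_common(1)[0]
            if count >= min_votes:
                labeled.append((features, label))
                added += 1
            else:
                remaining.append(features)
        pool = remaining
        if added == 0:  # nothing new was agreed on, so stop early
            break
    return labeled, pool


seed = [((0.0, 0.0), "PER"), ((10.0, 10.0), "LOC")]
candidates = [(1.0, 1.0), (9.0, 9.0), (1.0, 9.0)]
labeled, pool = co_train_with_voting(seed, candidates, views=[(0,), (1,)])
```

In this toy run the two classifiers agree on the first two candidates, which are added to the training set, and disagree on the third, which stays in the unlabeled pool. A real system would replace the numeric toy features with token-level features and would likely use a stronger classifier and a confidence threshold.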