Using machine learning techniques to build a comma checker for Basque

  • Authors:
  • Iñaki Alegria;Bertol Arrieta;Arantza Diaz de Ilarraza;Eli Izagirre;Montse Maritxalar

  • Affiliations:
  • University of the Basque Country, Basque Country, Spain;University of the Basque Country, Basque Country, Spain;University of the Basque Country, Basque Country, Spain;University of the Basque Country, Basque Country, Spain;University of the Basque Country, Basque Country, Spain

  • Venue:
  • COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
  • Year:
  • 2006

Abstract

In this paper, we describe research on using machine learning techniques to build a comma checker to be integrated into a grammar checker for Basque. After several experiments, and trained on a small corpus of 100,000 words, the system correctly predicts where commas should not be placed with a precision of 96% and a recall of 98%. For the task of placing commas, it achieves a precision of 70% and a recall of 49%. Finally, we show that these results can be improved by training on a larger and more homogeneous corpus, that is, a larger corpus written by a single author.
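
The abstract reports precision and recall separately for the two decisions (no comma vs. comma). As a minimal sketch of how such per-class figures are computed, the Python below assumes each token position is labelled 1 if a comma follows it and 0 otherwise. The function name `precision_recall` and the toy label sequences are hypothetical, invented for illustration only; they are not the authors' code or data.

```python
from typing import Sequence, Tuple


def precision_recall(gold: Sequence[int], pred: Sequence[int], label: int) -> Tuple[float, float]:
    """Return (precision, recall) for one class label.

    precision = TP / (TP + FP): of the positions predicted as `label`, how many are right.
    recall    = TP / (TP + FN): of the positions that truly are `label`, how many were found.
    """
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data (assumption, not from the paper): 1 = comma after the token, 0 = no comma.
gold = [0, 0, 1, 0, 0, 1, 0, 0, 0, 1]
pred = [0, 0, 1, 0, 0, 0, 0, 1, 0, 1]

comma_p, comma_r = precision_recall(gold, pred, label=1)
no_comma_p, no_comma_r = precision_recall(gold, pred, label=0)
print(f"comma:    precision={comma_p:.2f} recall={comma_r:.2f}")
print(f"no comma: precision={no_comma_p:.2f} recall={no_comma_r:.2f}")
```

Because most token positions carry no comma, the "no comma" class tends to score much higher than the "comma" class, which is why the two are reported separately.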