The role of algorithm bias vs information source in learning algorithms for Morphosyntactic Disambiguation

  • Authors:
  • Guy De Pauw; Walter Daelemans

  • Affiliations:
  • University of Antwerp, Antwerpen, Belgium; University of Antwerp, Antwerpen, Belgium

  • Venue:
  • CoNLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7
  • Year:
  • 2000

Abstract

Morphosyntactic Disambiguation (Part of Speech tagging) is a useful benchmark problem for system comparison because it is typical of a large class of Natural Language Processing (NLP) problems that can be defined as disambiguation in local context. This paper adds to the literature on the systematic and objective evaluation of different methods for automatically learning this type of disambiguation. We systematically compare two inductive learning approaches to tagging: MX-POST (based on maximum entropy modeling) and MBT (based on memory-based learning). We investigate the effect of different sources of information on accuracy when the two approaches are compared under the same conditions. Results indicate that previously observed differences in accuracy can be attributed largely to differences in the information sources used, rather than to algorithm bias.