The role of algorithm bias vs information source in learning algorithms for Morphosyntactic Disambiguation

Authors:
Guy De Pauw; Walter Daelemans
Affiliations:
University of Antwerp, Antwerpen, Belgium; University of Antwerp, Antwerpen, Belgium
Venue:
CoNLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7
Year:
2000


Abstract

Morphosyntactic Disambiguation (part-of-speech tagging) is a useful benchmark problem for system comparison because it is typical of a large class of Natural Language Processing (NLP) problems that can be defined as disambiguation in local context. This paper adds to the literature on the systematic and objective evaluation of different methods for automatically learning this type of disambiguation problem. We systematically compare two inductive learning approaches to tagging: MX-POST (based on maximum entropy modeling) and MBT (based on memory-based learning). We investigate the effect of different sources of information on accuracy when comparing the two approaches under the same conditions. Results indicate that earlier observed differences in accuracy can be attributed largely to differences in the information sources used, rather than to algorithm bias.