Ensemble of Classifiers for Noise Detection in PoS Tagged Corpora

Authors:
Harald Berthelsen;Beáta Megyesi
Affiliations:
-;-
Venue:
TDS '00 Proceedings of the Third International Workshop on Text, Speech and Dialogue
Year:
2000

Citing 4
Cited 0

Transformation-based error-driven learning and natural language processing: a case study in part-of-speech tagging

Computational Linguistics
Induction of Decision Trees

Machine Learning
Classifier combination for improved lexical disambiguation

COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Identifying and eliminating mislabeled training instances

AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1

Quantified Score

Hi-index	0.01

Visualization

Abstract

In this paper we apply the ensemble approach to the identification of incorrectly annotated items (noise) in a training set. In a controlled experiment, memory-based, decision tree-based and transformation-based classifiers are used as a filter to detect and remove noise deliberately introduced into a manually tagged corpus. The results indicate that the method can be successfully applied to automatically detect errors in a corpus.