OE: WSD using optimal ensembling (OE) method

Authors:
Harri M. T. Saarikoski
Affiliations:
Helsinki University, Helsinki, Finland
Venue:
SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations
Year:
2007

Abstract

Optimal ensembling (OE) is a word sense disambiguation (WSD) method that uses word-specific training factors (the average amount of positive vs. negative training per sense, posex and negex) to predict the best system (classifier algorithm / applicable feature set) for a given target word. Our official entry (OE1) in Senseval-4 Task 17 (the coarse-grained English lexical sample task) contained many design flaws and thus failed to show the full potential of the method, finishing 4.9% behind the top system (a +0.5% gain over the best base system). A fixed system (OE2) finished only 3.4% behind (a +2.0% net gain). All our systems were 'closed', i.e. they used only the official training data (an average of 56 training examples per sense). We also show that the official evaluation measure tends to favor systems that do well on high-trained words.
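
The idea of choosing a system per target word from its training factors can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the author's implementation: posex is taken to be the mean number of training examples per sense, negex the mean number of examples belonging to the other senses, and the nearest-neighbour lookup in `predict_best_system` is a stand-in for whatever predictor the paper actually uses. The function names, system names and data are hypothetical.

```python
from collections import Counter
from math import dist


def training_factors(sense_labels):
    """Return (posex, negex) for one target word.

    Assumed definitions (not taken from the paper):
    posex = mean number of training examples per sense,
    negex = mean number of examples of all *other* senses, per sense.
    """
    counts = Counter(sense_labels)
    total = sum(counts.values())
    n_senses = len(counts)
    posex = total / n_senses
    negex = sum(total - c for c in counts.values()) / n_senses
    return posex, negex


def predict_best_system(factors, known_words):
    """Pick the system of the known word closest in (posex, negex) space.

    known_words: list of ((posex, negex), system_name) pairs, e.g. from
    held-out words where the best base system is already known.
    This nearest-neighbour rule is a hypothetical stand-in predictor.
    """
    nearest = min(known_words, key=lambda item: dist(item[0], factors))
    return nearest[1]


# Hypothetical target word with three senses and 12 training examples.
labels = ["s1"] * 6 + ["s2"] * 3 + ["s3"] * 3
factors = training_factors(labels)

# Hypothetical reference words with their best-performing systems.
known = [
    ((50.0, 100.0), "svm+topical"),
    ((5.0, 9.0), "naive_bayes+local"),
]
choice = predict_best_system(factors, known)
print(factors, choice)
```

In this toy setup the low-trained word is routed to the system that did best on a similarly low-trained reference word, which is the kind of word-specific selection the abstract describes.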