Optimal ensembling (OE) is a word sense disambiguation (WSD) method that uses word-specific training factors (average positive versus negative training examples per sense, posex and negex) to predict the best system (classifier algorithm and applicable feature set) for a given target word. Our official entry (OE1) in Senseval-4 Task 17 (the coarse-grained English lexical sample task) contained many design flaws and thus failed to show the full potential of the method, finishing 4.9% behind the top system (a +0.5% gain over the best base system). A fixed system (OE2) finished only 3.4% behind (a +2.0% net gain). All our systems were 'closed', i.e. they used only the official training data (an average of 56 training examples per sense). We also show that the official evaluation measure tends to favor systems that do well on words with many training examples ("high-trained" words).
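
The last point can be illustrated with a small sketch. It assumes, purely for illustration, that the official measure is an instance-weighted (micro-averaged) accuracy and that words with more training data also contribute more test instances; neither assumption is stated above. The word names, the counts, and the functions `micro_accuracy` and `macro_accuracy` are hypothetical and are not taken from the task data.

```python
def micro_accuracy(per_word):
    """Instance-weighted accuracy: every test instance counts equally."""
    correct = sum(c for c, n in per_word.values())
    total = sum(n for c, n in per_word.values())
    return correct / total


def macro_accuracy(per_word):
    """Word-weighted accuracy: every target word counts equally."""
    return sum(c / n for c, n in per_word.values()) / len(per_word)


# Hypothetical (correct, total) test counts per target word; not real results.
system_a = {"frequent_word": (90, 100), "rare_word": (5, 10)}
system_b = {"frequent_word": (80, 100), "rare_word": (9, 10)}

for name, system in (("A", system_a), ("B", system_b)):
    print(name, round(micro_accuracy(system), 3), round(macro_accuracy(system), 3))
```

Under these made-up counts, system A ranks first on the instance-weighted score because it does better on the frequent word, while system B ranks first once every word is weighted equally.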