Learning word sense disambiguation in biomedical text with difference between training and test distributions

Authors:
Jeong-Woo Son; Seong-Bae Park
Affiliations:
Department of Computer Engineering, Kyungpook National University, Daegu 702-701, Korea; Department of Computer Engineering, Kyungpook National University, Daegu 702-701, Korea
Venue:
International Journal of Data Mining and Bioinformatics
Year:
2012

Abstract

Word sense disambiguation (WSD) methods that apply machine learning to lexical features suffer from a mismatch between the distributions of the training and test documents, which arises from the diversity of the lexical space. To tackle this problem, this paper proposes Support Vector Machines with Example-wise Weights. In this method, the training distribution is matched to the test distribution by weighting each training example according to its similarity to the test data as a whole. The experimental results show that the distribution change between the training and test data is indeed recognised, and that the proposed method, which accounts for this change in its training phase, outperforms ordinary SVMs.
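
The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the optimisation procedure, so everything below is an assumption made for illustration: cosine similarity averaged over all test examples stands in for the similarity, a linear SVM trained by plain subgradient descent on a weighted hinge loss stands in for the learner, and the names example_weights, train_weighted_svm and predict, as well as the toy data, are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two dense feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def example_weights(train_x, test_x):
    """Weight each training example by its mean cosine similarity to all
    test examples (assumed measure), normalised so the weights average 1."""
    raw = [sum(cosine(x, t) for t in test_x) / len(test_x) for x in train_x]
    mean = sum(raw) / len(raw)
    return [r / mean if mean else 1.0 for r in raw]


def train_weighted_svm(xs, ys, weights, lam=0.01, eta=0.1, epochs=500):
    """Linear SVM by batch subgradient descent (assumed optimiser) on
    lam/2 * ||w||^2 + (1/n) * sum_i c_i * max(0, 1 - y_i * (w.x_i + b))."""
    n, dim = len(xs), len(xs[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for x, y, c in zip(xs, ys, weights):
            score = sum(wj * xj for wj, xj in zip(w, x)) + b
            if y * score < 1.0:
                # Margin violated: the hinge term contributes, scaled by
                # the example-wise weight c.
                for j in range(dim):
                    grad_w[j] -= c * y * x[j] / n
                grad_b -= c * y / n
        w = [wj - eta * gj for wj, gj in zip(w, grad_w)]
        b -= eta * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 for a single feature vector."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical toy data: two lexical features, two senses (+1 / -1).
train_x = [[2.0, 0.0], [1.5, 0.5], [0.0, 2.0], [0.5, 1.5]]
train_y = [1, 1, -1, -1]
# Unlabelled test documents whose distribution is skewed towards feature 0.
test_x = [[2.0, 0.2], [1.8, 0.0]]

weights = example_weights(train_x, test_x)
w, b = train_weighted_svm(train_x, train_y, weights)
```

Setting every weight to 1 reduces the objective to an ordinary soft-margin SVM, which is the baseline the abstract compares against. In this sketch the training examples that resemble the test documents incur a larger penalty when misclassified, so the decision boundary is fitted more carefully in the region the test data occupies.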