Investigating problems of semi-supervised learning for word sense disambiguation

Authors:
Anh-Cuong Le;Akira Shimazu;Le-Minh Nguyen
Affiliations:
School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa, Japan;School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa, Japan;School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa, Japan
Venue:
ICCPOL'06 Proceedings of the 21st international conference on Computer Processing of Oriental Languages: beyond the orient: the research challenges ahead
Year:
2006

Citing 9

Combining labeled and unlabeled data with co-training
COLT '98 Proceedings of the eleventh annual conference on Computational learning theory

Enhancing Supervised Learning with Unlabeled Data
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning

Unsupervised word sense disambiguation rivaling supervised methods
ACL '95 Proceedings of the 33rd annual meeting on Association for Computational Linguistics

Integrating multiple knowledge sources to disambiguate word sense: an exemplar-based approach
ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics

An empirical evaluation of knowledge sources and learning algorithms for word sense disambiguation
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10

Word sense disambiguation using label propagation based semi-supervised learning
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics

Semi-supervised training of a kernel PCA-based model for word sense disambiguation
COLING '04 Proceedings of the 20th international conference on Computational Linguistics

Word sense disambiguation with semi-supervised learning
AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 3

Combining classifiers based on OWA operators with an application to word sense disambiguation
RSFDGrC'05 Proceedings of the 10th international conference on Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing - Volume Part I

Cited 1

Classifier combination for contextual idiom detection without labelled data
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1

Abstract

Word Sense Disambiguation (WSD) is the problem of determining the correct sense of a polysemous word in a given context. In this paper, we investigate the use of unlabeled data for WSD within the framework of semi-supervised learning, in which the original labeled dataset is iteratively extended by exploiting unlabeled data. We address two problems that arise in this approach: selecting the subset of newly labeled data to add at each extension step, and generating the final classifier. By proposing solutions to these problems, we derive several variants of bootstrapping algorithms and apply them to word sense disambiguation. Experiments were conducted on the datasets of four words (interest, line, hard, and serve) and on the English lexical sample of Senseval-3.
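
To make the bootstrapping framework described in the abstract concrete, here is a minimal Python sketch of a generic self-training loop. It is not the authors' algorithm or any of their variants. The naive Bayes classifier, the fixed confidence threshold used to pick new labeled examples, the choice to retrain on the final extended set as the "final classifier", and all names (`train_nb`, `predict_proba`, `bootstrap`) are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """Train a multinomial naive Bayes model from (tokens, sense) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, sense in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(tokens)
        vocab.update(tokens)
    return sense_counts, word_counts, vocab


def predict_proba(model, tokens):
    """Return {sense: posterior probability} using add-one smoothing."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    v = len(vocab) + 1  # +1 reserves probability mass for unseen words
    log_scores = {}
    for sense, n in sense_counts.items():
        denom = sum(word_counts[sense].values()) + v
        score = math.log(n / total)
        for tok in tokens:
            score += math.log((word_counts[sense][tok] + 1) / denom)
        log_scores[sense] = score
    # Normalize in log space to avoid underflow on long contexts.
    top = max(log_scores.values())
    exp = {s: math.exp(x - top) for s, x in log_scores.items()}
    z = sum(exp.values())
    return {s: e / z for s, e in exp.items()}


def bootstrap(labeled, unlabeled, threshold=0.7, max_rounds=10):
    """Iteratively extend the labeled set with confidently labeled contexts.

    Assumption: an example is added when its top posterior >= threshold.
    Assumption: the final classifier is simply retrained on the extended set.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train_nb(labeled)
        confident, remaining = [], []
        for tokens in pool:
            probs = predict_proba(model, tokens)
            sense, p = max(probs.items(), key=lambda kv: kv[1])
            if p >= threshold:
                confident.append((tokens, sense))
            else:
                remaining.append(tokens)
        if not confident:  # nothing new to add, so stop extending
            break
        labeled.extend(confident)
        pool = remaining
    return train_nb(labeled), labeled, pool


# Toy contexts for the word "interest"; the sense labels are made up.
seed = [(["bank", "rate", "loan"], "money"),
        (["hobby", "music", "reading"], "attention")]
contexts = [["loan", "rate"], ["music", "hobby"], ["the"]]
final_model, extended, leftover = bootstrap(seed, contexts)
```

In this toy run the two informative contexts are absorbed into the labeled set, while the uninformative one stays in the unlabeled pool because its posterior never reaches the threshold. The selection rule and the way the final classifier is produced are the two points the paper identifies as problems, so they are the natural places to vary this sketch.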