An empirical evaluation of knowledge sources and learning algorithms for word sense disambiguation

Authors:
Yoong Keok Lee; Hwee Tou Ng
Affiliations:
National University of Singapore, Singapore; National University of Singapore, Singapore
Venue:
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Year:
2002

Abstract

In this paper, we evaluate a variety of knowledge sources and supervised learning algorithms for word sense disambiguation on SENSEVAL-2 and SENSEVAL-1 data. Our knowledge sources include the part-of-speech of neighboring words, single words in the surrounding context, local collocations, and syntactic relations. The learning algorithms evaluated include Support Vector Machines (SVM), Naive Bayes, AdaBoost, and decision tree algorithms. We present empirical results showing the relative contribution of the component knowledge sources and the different learning algorithms. In particular, using all of these knowledge sources and SVM (i.e., a single learning algorithm) achieves accuracy higher than the best official scores on both SENSEVAL-2 and SENSEVAL-1 test data.
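
As a rough illustration of three of the knowledge sources named in the abstract (part-of-speech of neighboring words, single words in the surrounding context, and local collocations), the following minimal Python sketch builds a sparse binary feature dictionary for one occurrence of an ambiguous word. The function name `extract_features`, the window size, the collocation offsets, and the feature-name strings are all assumptions made for this example and are not taken from the paper; syntactic relations are omitted because they would require a parser. The resulting dictionaries could then be passed to any supervised learner, such as an SVM.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag); tags are assumed to be given


def extract_features(tokens: List[Token], target: int, pos_window: int = 3) -> Dict[str, int]:
    """Build a sparse binary feature dict for the ambiguous word at index `target`.

    Window size and collocation offsets are illustrative assumptions only.
    """
    features: Dict[str, int] = {}

    # Knowledge source 1: part-of-speech of neighboring words (and of the target).
    for offset in range(-pos_window, pos_window + 1):
        i = target + offset
        tag = tokens[i][1] if 0 <= i < len(tokens) else "NONE"
        features[f"pos[{offset}]={tag}"] = 1

    # Knowledge source 2: single words in the surrounding context (bag of words).
    for i, (word, _) in enumerate(tokens):
        if i != target:
            features[f"word={word.lower()}"] = 1

    # Knowledge source 3: local collocations, i.e. ordered word sequences
    # at fixed offsets around the target (the target itself is skipped).
    for start, end in [(-1, -1), (1, 1), (-2, -1), (-1, 1), (1, 2)]:
        words = []
        for offset in range(start, end + 1):
            if offset == 0:
                continue
            i = target + offset
            words.append(tokens[i][0].lower() if 0 <= i < len(tokens) else "NONE")
        features[f"colloc[{start},{end}]={'_'.join(words)}"] = 1

    return features


# Hypothetical example sentence with hand-assigned tags; "bank" is the target word.
sentence = [("He", "PRP"), ("sat", "VBD"), ("on", "IN"), ("the", "DT"),
            ("bank", "NN"), ("of", "IN"), ("the", "DT"), ("river", "NN")]
features = extract_features(sentence, target=4)
```

Representing each knowledge source as its own group of named binary features makes it straightforward to switch one group on or off, which is the kind of comparison of component knowledge sources the abstract describes.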