Word-sense disambiguation using statistical models of Roget's categories trained on large corpora

Authors:
David Yarowsky
Affiliations:
AT&T Bell Laboratories, Murray Hill, NJ
Venue:
COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 2
Year:
1992

Abstract

This paper describes a program that disambiguates English word senses in unrestricted text using statistical models of the major Roget's Thesaurus categories. Roget's categories serve as approximations of conceptual classes. The categories listed for a word in Roget's index tend to correspond to sense distinctions; thus selecting the most likely category provides a useful level of sense disambiguation. The selection of categories is accomplished by identifying and weighting words that are indicative of each category when seen in context, using a Bayesian theoretical framework.

Other statistical approaches have required special corpora or hand-labeled training examples for much of the lexicon. Our use of class models overcomes this knowledge acquisition bottleneck, enabling training on unrestricted monolingual text without human intervention. Applied to the 10-million-word Grolier's Encyclopedia, the system correctly disambiguated 92% of the instances of 12 polysemous words that have been previously studied in the literature.
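
The category-selection step described in the abstract can be illustrated with a small sketch. This is not the paper's implementation. It assumes that each word's weight for a category is the log ratio of its smoothed frequency in that category's contexts to its overall frequency, that weights are summed over the context words, and that category priors are uniform. The function names, the additive smoothing, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def train_weights(contexts_by_category, smoothing=1.0):
    """Estimate a weight log(P(w | category) / P(w)) for every word.

    contexts_by_category maps a category label to the list of words seen
    in contexts of that category's members (an assumed input format).
    Additive smoothing is an illustrative choice, not taken from the paper.
    """
    cat_counts = {c: Counter(ws) for c, ws in contexts_by_category.items()}
    global_counts = Counter()
    for counts in cat_counts.values():
        global_counts.update(counts)

    vocab = set(global_counts)
    vocab_size = len(vocab)
    total = sum(global_counts.values())

    weights = {}
    for cat, counts in cat_counts.items():
        cat_total = sum(counts.values())
        weights[cat] = {}
        for w in vocab:
            p_w_cat = (counts[w] + smoothing) / (cat_total + smoothing * vocab_size)
            p_w = (global_counts[w] + smoothing) / (total + smoothing * vocab_size)
            weights[cat][w] = math.log(p_w_cat / p_w)
    return weights


def disambiguate(context_words, candidate_categories, weights):
    """Return the candidate category with the highest summed word weight.

    Words unseen in training contribute 0; category priors are assumed uniform.
    """
    def score(cat):
        return sum(weights[cat].get(w, 0.0) for w in context_words)

    return max(candidate_categories, key=score)


# Toy training contexts for two hypothetical categories of the word "crane".
toy_contexts = {
    "ANIMAL": ["crane", "bird", "nest", "wings", "feathers", "bird", "water"],
    "MACHINE": ["crane", "lift", "steel", "engine", "construction", "lift", "water"],
}
toy_weights = train_weights(toy_contexts)
bird_sense = disambiguate(["the", "bird", "built", "a", "nest"],
                          ["ANIMAL", "MACHINE"], toy_weights)
machine_sense = disambiguate(["steel", "lift"],
                             ["ANIMAL", "MACHINE"], toy_weights)
print(bird_sense, machine_sense)
```

In this sketch the candidate categories for an ambiguous word would come from its thesaurus index entry, so no hand-labeled examples of the word itself are needed; only the category-level word statistics are trained.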