Improvements in automatic thesaurus extraction

Authors:
James R. Curran;Marc Moens
Affiliations:
University of Edinburgh, Edinburgh, United Kingdom;University of Edinburgh, Edinburgh, United Kingdom
Venue:
ULA '02 Proceedings of the ACL-02 workshop on Unsupervised lexical acquisition - Volume 9
Year:
2002

Abstract

The use of semantic resources is common in modern NLP systems, but methods to extract lexical semantics have only recently begun to perform well enough for practical use. We evaluate existing and new similarity metrics for thesaurus extraction, and experiment with the trade-off between extraction performance and efficiency. We propose an approximation algorithm, based on canonical attributes and coarse- and fine-grained matching, that reduces the time complexity and execution time of thesaurus extraction with only a marginal performance penalty.
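The abstract does not spell out the approximation, so the following is only a minimal sketch of the general idea of coarse- and fine-grained matching, under stated assumptions. It assumes that a term's canonical attributes are its k highest-weighted context attributes, that the coarse step discards any candidate sharing no canonical attribute with the target, and that the fine step scores the survivors with a weighted Jaccard measure. The function names, the value of k, the choice of measure, and the toy vectors are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def jaccard(a, b):
    """Weighted Jaccard similarity between two sparse attribute vectors."""
    keys = set(a) | set(b)
    num = sum(min(a.get(key, 0), b.get(key, 0)) for key in keys)
    den = sum(max(a.get(key, 0), b.get(key, 0)) for key in keys)
    return num / den if den else 0.0


def canonical(vec, k):
    """Assumed definition: the k highest-weighted attributes of a term."""
    return {attr for attr, _ in Counter(vec).most_common(k)}


def nearest(target, vectors, k=2, top_n=3):
    """Rank candidate synonyms for `target` using a coarse filter, then a fine score."""
    target_vec = vectors[target]
    target_canon = canonical(target_vec, k)
    scored = []
    for term, vec in vectors.items():
        if term == target:
            continue
        # Coarse step (assumption): skip terms sharing no canonical attribute.
        if not (target_canon & canonical(vec, k)):
            continue
        # Fine step: full similarity computation only for surviving candidates.
        scored.append((jaccard(target_vec, vec), term))
    scored.sort(reverse=True)
    return [(term, score) for score, term in scored[:top_n]]


# Hypothetical toy data: term -> {context attribute: weight}.
VECTORS = {
    "car": {"drive": 5, "park": 3, "buy": 2},
    "truck": {"drive": 4, "park": 2, "load": 3},
    "idea": {"have": 6, "share": 3, "buy": 1},
}

if __name__ == "__main__":
    print(nearest("car", VECTORS))
```

In this toy example "idea" is rejected by the coarse step without a full comparison, which is where the time saving would come from. In a real system the canonical sets would be computed once per term rather than inside the loop.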