Automatic discovery of word semantic relations using paraphrase alignment and distributional lexical semantics analysis

Authors:
Gaël Dias;Rumen Moraliyski;João Cordeiro;Antoine Doucet;Helena Ahonen-Myka
Affiliations:
Centre for HLT and Bioinformatics, Department of Computer Science, University of Beira Interior, 6201-001-Covilhã, Portugal emails: ddg@di.ubi.pt, rumen@penhas.di.ubi.pt, jpaulo@di.ubi.pt;Centre for HLT and Bioinformatics, Department of Computer Science, University of Beira Interior, 6201-001-Covilhã, Portugal emails: ddg@di.ubi.pt, rumen@penhas.di.ubi.pt, jpaulo@di.ubi.pt;Centre for HLT and Bioinformatics, Department of Computer Science, University of Beira Interior, 6201-001-Covilhã, Portugal emails: ddg@di.ubi.pt, rumen@penhas.di.ubi.pt, jpaulo@di.ubi.pt;Campus Côte de Nacre, Boulevard du Maréchal Juin, University of Caen, BP 5186-14032-Caen Cedex, France email: doucet@info.unicaen.fr;Department of Computer Science, University of Helsinki, P.O. Box 68 (Gustaf Hällströmin katu 2b), FI-00014, Helsinki, Finland email: helena.ahonen-myka@cs.helsinki.fi
Venue:
Natural Language Engineering
Year:
2010

Abstract

Thesauri, which list the most salient semantic relations between words, have mostly been compiled manually. The inclusion of an entry therefore depends on the subjective decision of the lexicographer, and as a consequence these resources are usually incomplete. In this paper, we propose an unsupervised methodology to automatically discover pairs of semantically related words by highlighting their local environment and evaluating their semantic similarity in local and global semantic spaces. This proposal differs from all other research presented so far in that it tries to take the best of two different methodologies, i.e. semantic space models and information extraction models. In particular, it can be applied to extract close semantic relations, it limits the search space to a few highly probable options, and it is unsupervised.
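
The general idea of combining the two families of models can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes whitespace tokenisation, uses Python's `difflib.SequenceMatcher` as a stand-in for a proper paraphrase aligner, and scores candidates with a plain cosine over co-occurrence counts. The function names `candidate_pairs`, `context_vector` and `cosine`, and the `window` parameter, are hypothetical and introduced only for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt


def candidate_pairs(sent_a, sent_b):
    """Align two paraphrases and return the word pairs that fill the same slot.

    Assumption: a one-for-one substitution between otherwise matching
    stretches of text is treated as a candidate pair of related words.
    """
    a, b = sent_a.lower().split(), sent_b.lower().split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Keep only single-word replacements; longer edits are ignored here.
        if tag == "replace" and i2 - i1 == 1 and j2 - j1 == 1:
            pairs.append((a[i1], b[j1]))
    return pairs


def context_vector(word, corpus, window=2):
    """Count the words seen within `window` tokens of `word` in the corpus."""
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok == word:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                counts.update(
                    t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
                )
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0
```

In this sketch, the alignment step plays the role of limiting the search space to a few candidates, and the cosine score plays the role of the semantic space evaluation. Building the context vectors from the paraphrase sentences alone or from a larger corpus would loosely correspond to a local or a global space, respectively.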