Acquiring synonyms from monolingual comparable texts

Authors:
Mitsuo Shimohata;Eiichiro Sumita
Affiliations:
Oki Electric Industry Co., Ltd., Osaka City, Japan;ATR Spoken Language Translation Research Laboratories, Kyoto, Japan
Venue:
IJCNLP'05 Proceedings of the Second international joint conference on Natural Language Processing
Year:
2005

Abstract

This paper presents a method for acquiring synonyms from monolingual comparable text (MCT). MCT denotes a set of monolingual texts that have similar content and can be obtained automatically. Our acquisition method exploits a characteristic of MCT: the words it contains, and the relations among them, are confined to a limited set. As contextual information, the method uses the one word on each side of a target word. To improve acquisition precision, it applies a constraint called prevention of outside appearance. The method has two advantages: it requires only part-of-speech information, and it can acquire infrequent synonyms. We evaluated the method on two kinds of news article data: sentence-aligned parallel texts and document-aligned comparable texts. On the former, the method acquires synonym pairs with 70.0% precision. Re-evaluating the incorrect word pairs against their source texts indicates that the method captures the appropriate parts of the source texts with 89.5% precision. On the latter, acquisition precision reaches 76.0% in English and 76.3% in Japanese.
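
The idea of using one surrounding word on each side as context can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `candidate_pairs`, the boundary markers `<s>` and `</s>`, and the toy sentences are all assumptions made for illustration, and the sketch omits the part-of-speech information and the prevention of outside appearance that the abstract mentions. It only pairs words that occur between the same left and right neighbors in a small set of comparable sentences.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(sentences):
    """Return word pairs that occur between the same left and right neighbor.

    `sentences` is an iterable of token lists drawn from comparable texts.
    Hypothetical helper for illustration only; no POS filtering is applied.
    """
    # Map each (left neighbor, right neighbor) context to the words seen in it.
    contexts = defaultdict(set)
    for tokens in sentences:
        # Pad with assumed boundary markers so edge words also have two neighbors.
        padded = ["<s>"] + list(tokens) + ["</s>"]
        for i in range(1, len(padded) - 1):
            contexts[(padded[i - 1], padded[i + 1])].add(padded[i])

    # Any two distinct words sharing a context become a candidate pair.
    pairs = set()
    for words in contexts.values():
        for a, b in combinations(sorted(words), 2):
            pairs.add((a, b))
    return pairs


# Toy comparable sentences (invented for this example).
mct = [
    "the firm bought the plant".split(),
    "the firm purchased the plant".split(),
]
print(sorted(candidate_pairs(mct)))
```

On these two toy sentences the only shared one-word context with different fillers is `firm _ the`, so the sketch proposes `bought` and `purchased` as a candidate pair. On real data a one-word window of this kind would over-generate, which is presumably why the abstract describes an additional precision constraint.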