Extraction of multi-word expressions from small parallel corpora
Natural Language Engineering
We present a general methodology for extracting multi-word expressions (of various types), along with their translations, from small parallel corpora. We automatically align the parallel corpus and focus on misalignments; these typically indicate expressions in the source language that are translated to the target in a non-compositional way. We then use a large monolingual corpus to rank and filter the results. Evaluation of the quality of the extraction algorithm reveals significant improvements over naïve alignment-based methods. External evaluation shows an improvement in the performance of machine translation that uses the extracted dictionary.
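
The pipeline described above (align, collect misalignments, then rank and filter with monolingual statistics) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: a source token counts as "misaligned" when the aligner produced no link for it, candidates are maximal runs of such tokens, and ranking uses pointwise mutual information over adjacent word pairs as a stand-in for whatever score the authors actually use. The function names `misaligned_spans` and `rank_candidates` are hypothetical.

```python
import math
from collections import Counter


def misaligned_spans(source_tokens, alignment, min_len=2):
    """Collect maximal runs of source tokens that have no alignment link.

    `alignment` is a set of (source_index, target_index) pairs, as a word
    aligner might produce. Treating "unlinked" as "misaligned" is an
    assumption of this sketch, not the paper's definition.
    """
    aligned = {src for src, _ in alignment}
    spans, run = [], []
    for i, token in enumerate(source_tokens):
        if i in aligned:
            if len(run) >= min_len:
                spans.append(tuple(run))
            run = []
        else:
            run.append(token)
    if len(run) >= min_len:
        spans.append(tuple(run))
    return spans


def rank_candidates(candidates, monolingual_tokens, min_score=0.0):
    """Rank candidates by the weakest PMI among their adjacent word pairs.

    PMI is an assumed stand-in for the ranking score. Probabilities are
    approximated by dividing both unigram and bigram counts by the corpus
    length. Candidates scoring below `min_score` are filtered out.
    """
    unigrams = Counter(monolingual_tokens)
    bigrams = Counter(zip(monolingual_tokens, monolingual_tokens[1:]))
    n = len(monolingual_tokens)

    def pmi(x, y):
        joint = bigrams[(x, y)]
        if joint == 0:
            return float("-inf")  # pair never seen in the monolingual corpus
        return math.log2(joint * n / (unigrams[x] * unigrams[y]))

    scored = []
    for cand in candidates:
        if len(cand) < 2:
            continue  # a single token has no adjacent pair to score
        score = min(pmi(x, y) for x, y in zip(cand, cand[1:]))
        if score >= min_score:
            scored.append((cand, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy usage: tokens 1..3 of the source sentence received no alignment link.
source = ["he", "kicked", "the", "bucket", "yesterday"]
links = {(0, 0), (4, 2)}
corpus = "he kicked the bucket and she kicked the bucket too".split()

candidates = misaligned_spans(source, links)
ranked = rank_candidates(candidates + [("the", "zebra")], corpus)
```

Taking the minimum over adjacent pairs is one simple way to extend a bigram score to longer candidates; it penalizes a span if any of its internal word pairs is weakly associated. Other aggregation choices would be equally defensible in a sketch like this.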