Numerous cross-lingual applications, including state-of-the-art machine translation systems, require parallel texts aligned at the sentence level. However, collections of such texts are often polluted by pairs of texts that are comparable but not parallel. Bitext maps can help to discriminate between parallel and comparable texts. Bitext mapping algorithms use a larger set of document features than competing approaches to this task, resulting in higher accuracy. In addition, good bitext mapping algorithms are not limited to documents with structural mark-up such as web pages. The task of filtering non-parallel text pairs represents a new application of bitext mapping algorithms.