This paper presents a new approach to the bitext correspondence problem (BCP) for noisy bilingual corpora, based on image processing (IP) techniques. By using one of several ways of estimating the lexical translation probability (LTP) between pairs of source and target words, we can turn a bitext into a discrete gray-level image. We contend that the BCP, when seen in this light, bears a striking resemblance to the line detection problem in IP. Therefore, BCPs, including sentence and word alignment, can benefit from a wealth of effective, well-established IP techniques, including convolution-based filters, texture analysis, and the Hough transform. This paper describes a new program, PlotAlign, that produces a word-level bitext map for noisy or non-literal bitext, based on these techniques.
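To make the analogy concrete, the following is a minimal sketch of the two steps named above: rendering a bitext as a gray-level image from LTP values, then finding the dominant line with a basic Hough transform. It is an illustration under stated assumptions, not PlotAlign itself. The function names, the dictionary form of the LTP table, the toy tokens, and the 128 gray-level threshold are all hypothetical choices made for this example.

```python
import math

def bitext_to_image(src_words, tgt_words, ltp, levels=256):
    """Pixel (i, j) holds the quantized LTP of src_words[i] and tgt_words[j].

    `ltp` is an assumed lookup table {(source_word, target_word): probability};
    pairs missing from the table are treated as probability 0.
    """
    image = []
    for s in src_words:
        row = []
        for t in tgt_words:
            p = ltp.get((s, t), 0.0)
            row.append(min(levels - 1, int(p * levels)))
        image.append(row)
    return image

def hough_strongest_line(image, threshold=128, n_theta=180):
    """Basic Hough voting over lines rho = x*cos(theta) + y*sin(theta).

    x is the target position (column), y the source position (row).
    Returns (rho, theta_in_degrees, votes) for the best bin, or None
    if no pixel reaches the threshold.
    """
    votes = {}
    for y, row in enumerate(image):
        for x, gray in enumerate(row):
            if gray < threshold:
                continue  # ignore weak (unlikely) word pairs
            for k in range(n_theta):
                theta = math.pi * k / n_theta
                rho = round(x * math.cos(theta) + y * math.sin(theta))
                votes[(rho, k)] = votes.get((rho, k), 0) + 1
    if not votes:
        return None
    (rho, k), count = max(votes.items(), key=lambda item: item[1])
    return rho, 180.0 * k / n_theta, count

# Toy bitext: five word pairs aligned along the diagonal, plus one noisy pair.
src = ["s0", "s1", "s2", "s3", "s4"]
tgt = ["t0", "t1", "t2", "t3", "t4"]
ltp = {(f"s{n}", f"t{n}"): 0.9 for n in range(5)}
ltp[("s0", "t3")] = 0.6  # spurious high-probability pair (noise)

image = bitext_to_image(src, tgt, ltp)
rho, theta_deg, votes = hough_strongest_line(image)
print(rho, theta_deg, votes)
```

In this parameterization the main diagonal corresponds to rho = 0 and theta near 135 degrees, so the strongest bin collects one vote from each of the five diagonal pairs while the noisy pair falls on a different line. A real bitext image would need the filtering steps mentioned above before line detection, since true correspondences are rarely this clean.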