A statistical approach to machine translation
Computational Linguistics
Dynamic Programming
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
Two languages are more informative than one
ACL '91 Proceedings of the 29th annual meeting on Association for Computational Linguistics
Aligning sentences in parallel corpora
ACL '91 Proceedings of the 29th annual meeting on Association for Computational Linguistics
A program for aligning sentences in bilingual corpora
ACL '91 Proceedings of the 29th annual meeting on Association for Computational Linguistics
Word-sense disambiguation using statistical methods
ACL '91 Proceedings of the 29th annual meeting on Association for Computational Linguistics
COLING '90 Proceedings of the 13th conference on Computational linguistics - Volume 3
Translating collocations for bilingual lexicons: a statistical approach
Computational Linguistics
Bilingual Sentence Alignment: Balancing Robustness and Accuracy
Machine Translation
Bilingual Dictionary Based Sentence Alignment for Chinese English Bitext
ICMI '00 Proceedings of the Third International Conference on Advances in Multimodal Interfaces
Knowledge Extraction from Bilingual Corpora
Information Extraction: Towards Scalable, Adaptable Systems
A Multilingual Procedure for Dictionary-Based Sentence Alignment
AMTA '98 Proceedings of the Third Conference of the Association for Machine Translation in the Americas on Machine Translation and the Information Soup
A Statistical View on Bilingual Lexicon Extraction: From Parallel Corpora to Non-parallel Corpora
AMTA '98 Proceedings of the Third Conference of the Association for Machine Translation in the Americas on Machine Translation and the Information Soup
Adaptive Bilingual Sentence Alignment
AMTA '02 Proceedings of the 5th Conference of the Association for Machine Translation in the Americas on Machine Translation: From Research to Real Users
Fast and Accurate Sentence Alignment of Bilingual Corpora
AMTA '02 Proceedings of the 5th Conference of the Association for Machine Translation in the Americas on Machine Translation: From Research to Real Users
A class-based approach to word alignment
Computational Linguistics
Stochastic inversion transduction grammars and bilingual parsing of parallel corpora
Computational Linguistics
Automatic construction of parallel English-Chinese corpus for cross-language information retrieval
ANLC '00 Proceedings of the sixth conference on Applied natural language processing
High-performance bilingual text alignment using statistical and dictionary information
Natural Language Engineering
Semi-automatic acquisition of domain-specific translation lexicons
ANLC '97 Proceedings of the fifth conference on Applied natural language processing
EACL '95 Proceedings of the seventh conference on European chapter of the Association for Computational Linguistics
An alignment method for noisy parallel corpora based on image processing techniques
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
A portable algorithm for mapping bitext correspondence
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
An experiment in hybrid dictionary and statistical sentence alignment
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
A pattern matching method for finding noun and proper noun translations from noisy parallel corpora
ACL '95 Proceedings of the 33rd annual meeting on Association for Computational Linguistics
Aligning a parallel English-Chinese corpus statistically with lexical criteria
ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
High-performance bilingual text alignment using statistical and dictionary information
ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
Structural feature selection for English-Korean statistical machine translation
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 1
Bilingual text, matching using bilingual dictionary and statistics
COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 2
K-vec: a new approach for aligning parallel texts
COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 2
Building an MT dictionary from parallel texts based on linguistic and statistical information
COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1
A part-of-speech-based alignment algorithm
COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1
Extracting word correspondences from bilingual corpora based on word co-occurrences information
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
Alignment of shared forests for bilingual corpora
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
A robust cross-style bilingual sentences alignment model
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Multipath translation lexicon induction via bridge languages
NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
PENS: a machine-aided english writing system for Chinese users
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Translating collocations for use in bilingual lexicons
HLT '94 Proceedings of the workshop on Human Language Technology
DMMT '01 Proceedings of the workshop on Data-driven methods in machine translation - Volume 14
Constructing of a large-scale Chinese-English parallel corpus
COLING '02 Proceedings of the 3rd workshop on Asian language resources and international standardization - Volume 12
Efficient optimization for bilingual sentence alignment based on linear regression
HLT-NAACL-PARALLEL '03 Proceedings of the HLT-NAACL 2003 Workshop on Building and using parallel texts: data driven machine translation and beyond - Volume 3
Automatic extraction of bilingual word pairs using inductive chain learning in various languages
Information Processing and Management: an International Journal
A DOM tree alignment model for mining parallel data from the web
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Sentence alignment using P-NNT and GMM
Computer Speech and Language
Cross Sentence Alignment for Structurally Dissimilar Corpus Based on Singular Value Decomposition
ICIC '08 Proceedings of the 4th international conference on Intelligent Computing: Advanced Intelligent Computing Theories and Applications - with Aspects of Artificial Intelligence
Improved sentence alignment on parallel web pages using a stochastic tree alignment model
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
A hybrid approach to align sentences and words in English-Hindi parallel corpora
ParaText '05 Proceedings of the ACL Workshop on Building and Using Parallel Texts
Comparison, selection and use of sentence alignment algorithms for new language pairs
ParaText '05 Proceedings of the ACL Workshop on Building and Using Parallel Texts
Chinese-Uyghur sentence alignment: an approach based on anchor sentences
BUCC '09 Proceedings of the 2nd Workshop on Building and Using Comparable Corpora: from Parallel to Non-parallel Corpora
Local context selection for aligning sentences in parallel corpora
CONTEXT'07 Proceedings of the 6th international and interdisciplinary conference on Modeling and using context
Context-based sentence alignment in parallel corpora
CICLing'08 Proceedings of the 9th international conference on Computational linguistics and intelligent text processing
Improving corpus comparability for bilingual lexicon extraction from comparable corpora
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Improved unsupervised sentence alignment for symmetrical and asymmetrical parallel corpora
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Fast-Champollion: a fast and robust sentence alignment algorithm
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
An Expectation Maximization algorithm for textual unit alignment
BUCC '11 Proceedings of the 4th Workshop on Building and Using Comparable Corpora: Comparable Corpora and the Web
Building a web-based parallel corpus and filtering out machine-translated text
BUCC '11 Proceedings of the 4th Workshop on Building and Using Comparable Corpora: Comparable Corpora and the Web
Alignment of paragraphs in bilingual texts using bilingual dictionaries and dynamic programming
CIARP'06 Proceedings of the 11th Iberoamerican conference on Progress in Pattern Recognition, Image Analysis and Applications
A bilingual corpus of novels aligned at paragraph level
FinTAL'06 Proceedings of the 5th international conference on Advances in Natural Language Processing
Evaluation of alignment methods for HTML parallel text
FinTAL'06 Proceedings of the 5th international conference on Advances in Natural Language Processing
Maximum likelihood alignment of translation equivalents
FinTAL'06 Proceedings of the 5th international conference on Advances in Natural Language Processing
Bilingual sentence alignment based on punctuation statistics and lexicon
IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing
TSD'06 Proceedings of the 9th international conference on Text, Speech and Dialogue
Combining sentence length with location information to align monolingual parallel texts
AIRS'04 Proceedings of the 2004 international conference on Asian Information Retrieval Technology
Extracting parallel paragraphs and sentences from english-persian translated documents
AIRS'11 Proceedings of the 7th Asia conference on Information Retrieval Technology
In this paper, we describe a fast algorithm for aligning sentences with their translations in a bilingual corpus. Existing efficient algorithms ignore word identities and only consider sentence length (Brown et al., 1991b; Gale and Church, 1991). Our algorithm constructs a simple statistical word-to-word translation model on the fly during alignment. We find the alignment that maximizes the probability of generating the corpus with this translation model. We have achieved an error rate of approximately 0.4% on Canadian Hansard data, which is a significant improvement over previous results. The algorithm is language independent.
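To make the idea of "the alignment that maximizes the probability of generating the corpus" more concrete, the following is a minimal sketch, not the paper's implementation. It assumes a small hand-written word-to-word translation table (the paper builds its model on the fly during alignment, which is not reproduced here), a fixed set of bead shapes (1-1, 1-0, 0-1, 2-1, 1-2), and made-up constants `FLOOR` and `SKIP_PENALTY`. The names `bead_cost` and `align` are hypothetical. A dynamic program then searches for the bead sequence with the lowest total negative log-probability.

```python
import math

FLOOR = 1e-3         # assumed probability for word pairs missing from the table
SKIP_PENALTY = 8.0   # assumed cost per word of an unmatched (1-0 or 0-1) sentence
# Bead shapes: (source sentences consumed, target sentences consumed).
BEADS = [(1, 1), (1, 0), (0, 1), (2, 1), (1, 2)]


def bead_cost(src_words, tgt_words, table):
    """Negative log-probability of the target words given the source words."""
    if not src_words or not tgt_words:
        # One side is empty: charge a flat per-word penalty.
        return SKIP_PENALTY * (len(src_words) + len(tgt_words))
    cost = 0.0
    for t in tgt_words:
        # Average the word-to-word translation probability over all source words.
        p = sum(table.get((s, t), FLOOR) for s in src_words) / len(src_words)
        cost -= math.log(p)
    return cost


def align(src, tgt, table):
    """Return the minimum-cost beads as ((i0, i1), (j0, j1)) sentence index ranges."""
    n, m = len(src), len(tgt)
    best = [[math.inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if best[i][j] == math.inf:
                continue
            for di, dj in BEADS:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                s_words = [w for sent in src[i:ni] for w in sent]
                t_words = [w for sent in tgt[j:nj] for w in sent]
                c = best[i][j] + bead_cost(s_words, t_words, table)
                if c < best[ni][nj]:
                    best[ni][nj] = c
                    back[ni][nj] = (i, j)
    # Follow the back-pointers from the end of both texts to the start.
    beads = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append(((pi, i), (pj, j)))
        i, j = pi, pj
    return beads[::-1]


if __name__ == "__main__":
    # Toy data; the probabilities are invented for illustration only.
    table = {("the", "la"): 0.8, ("house", "maison"): 0.9, ("is", "est"): 0.9,
             ("red", "rouge"): 0.9, ("thank", "merci"): 0.9}
    src = [["the", "house", "is", "red"], ["thank", "you"]]
    tgt = [["la", "maison", "est", "rouge"], ["merci"]]
    print(align(src, tgt, table))
```

On this toy input the cheapest path should pair each source sentence with its own translation as two 1-1 beads. A length-only method would score the same beads from sentence lengths alone; the sketch differs only in that its cost function looks at word identities.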