EPIA '99 Proceedings of the 9th Portuguese Conference on Artificial Intelligence: Progress in Artificial Intelligence
Small but Efficient: The Misconception of High-Frequency Words in Scandinavian Translation
AMTA '00 Proceedings of the 4th Conference of the Association for Machine Translation in the Americas on Envisioning Machine Translation in the Information Future
Using confidence bands for parallel texts alignment
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Alignment-Based Tools for Translation Self-Learning
Proceedings of the 2005 conference on Artificial Intelligence in Education: Supporting Learning through Intelligent and Socially Informed Technology
Measuring spelling similarity for cognate identification
EPIA'11 Proceedings of the 15th Portugese conference on Progress in artificial intelligence
Hi-index | 0.00 |
This paper describes a language independent method for aligning parallel texts (texts that are translations of each other, or of a common source text), statistically supported. This new approach is inspired on previous work by Ribeiro et al (2000). The application of the second statistical filter, proposed by Ribeiro et al, based on Confidence Bands (CB), is substituted by the application of the Longest Sorted Sequence algorithm (LSSA). LSSA is described in this paper. As a result, 35% decrease in processing time and 18% increase in the number of aligned segments was obtained, for Portuguese-French alignments. Similar results were obtained regarding Portuguese-English alignments. Both methods are compared and evaluated, over a large parallel corpus made up of Portuguese, English and French parallel texts (approximately 250Mb of text per language).