Splitting an input sentence is a promising technique for improving the translation quality of corpus-based MT systems for speech translation. Many previous methods used N-gram clues to split sentences. In this paper, we supplement N-gram-based splitting with another clue: sentence similarity based on edit distance. Our splitting method generates candidate splittings from N-grams and selects the best one by measuring sentence similarity. We conducted experiments with two EBMT systems, one using phrases and the other using sentences as the translation unit. The translation results under various conditions were evaluated with objective measures and a subjective measure. The experimental results show that the proposed method is valuable for both systems.
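The selection step can be illustrated with a minimal Python sketch. It is not the paper's implementation: the N-gram stage is stubbed out as a list of proposed split positions (`split_points`), similarity is taken to be a length-normalized word-level edit distance against a small set of example sentences, and a splitting is scored by a length-weighted average over its pieces. All function names and the scoring formula are assumptions made for illustration.

```python
from itertools import combinations


def edit_distance(a, b):
    """Word-level Levenshtein distance between token lists a and b."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Assumed similarity: 1 minus the distance normalized by the longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def best_match(piece, examples):
    """Similarity of a piece to its closest example sentence."""
    return max(similarity(piece, ex) for ex in examples)


def candidates(tokens, split_points, max_splits=2):
    """Yield splittings built from subsets of the proposed split positions.

    split_points stands in for the N-gram stage; it must be sorted and
    contain positions strictly between 0 and len(tokens).
    """
    for k in range(max_splits + 1):
        for points in combinations(split_points, k):
            bounds = [0, *points, len(tokens)]
            yield [tokens[s:e] for s, e in zip(bounds, bounds[1:])]


def select_split(tokens, split_points, examples):
    """Pick the candidate whose pieces best match the example sentences."""
    def score(pieces):
        # Length-weighted average similarity (an assumed scoring rule).
        return sum(len(p) * best_match(p, examples) for p in pieces) / len(tokens)
    return max(candidates(tokens, split_points), key=score)


if __name__ == "__main__":
    tokens = "thank you very much where is the station".split()
    examples = ["thank you very much".split(), "where is the station".split()]
    print(select_split(tokens, [2, 4], examples))
```

The unsplit sentence is also a candidate (`k = 0`), so an input that already resembles an example sentence is left intact. The length weighting keeps very short pieces from dominating the score.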