We describe an approach to machine translation (MT) of transcribed speech, as found in closed captions. We discuss how a pre-processing pipeline handles the colloquial nature and format peculiarities of closed captions, preparing the input for effective processing by a core MT system. In particular, we describe components for proper name recognition and input segmentation, and we evaluate the contribution of these modules to system performance. The described methods have been implemented in an MT system that translates English closed captions into Spanish and Portuguese.