We address the problem of formatting the output of an automatic speech recognition (ASR) system for readability while preserving the word-level timing information of the transcript. Our system enriches the ASR transcript with punctuation, capitalization, and properly written dates, times, and other numeric entities; the same approach can be applied to other formatting tasks. The method combines hand-crafted grammars with a class-based language model trained on written text, and it relies on weighted finite-state transducers (WFSTs) to preserve the start and end times of each word.
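The timing-preservation idea can be illustrated with a small sketch. This is not the WFST-based system described above: it replaces the grammars and language model with a hypothetical lookup table (`REWRITES`) and a greedy longest-match scan, and the names `Token` and `format_with_timing` are assumptions introduced only for this example. It shows one plausible convention: when several spoken words collapse into one written token, that token inherits the start time of the first word and the end time of the last.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Token:
    text: str
    start: float  # start time in seconds
    end: float    # end time in seconds


# Hypothetical rewrite table mapping spoken word sequences to written forms.
# A real system would derive these mappings from grammars, not a fixed dict.
REWRITES: Dict[Tuple[str, ...], str] = {
    ("twenty", "five", "dollars"): "$25",
    ("three", "thirty", "p", "m"): "3:30 p.m.",
}
MAX_SPAN = max(len(key) for key in REWRITES)


def format_with_timing(tokens: List[Token]) -> List[Token]:
    """Greedy longest-match rewrite that keeps the time span of each output token."""
    out: List[Token] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(MAX_SPAN, len(tokens) - i), 0, -1):
            key = tuple(t.text for t in tokens[i:i + n])
            if key in REWRITES:
                # The written token covers the first word's start
                # through the last word's end.
                out.append(Token(REWRITES[key], tokens[i].start, tokens[i + n - 1].end))
                i += n
                break
        else:
            # No rule matched at this position: copy the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


if __name__ == "__main__":
    asr = [
        Token("it", 0.0, 0.2),
        Token("costs", 0.2, 0.6),
        Token("twenty", 0.6, 1.0),
        Token("five", 1.0, 1.3),
        Token("dollars", 1.3, 1.9),
    ]
    for tok in format_with_timing(asr):
        print(tok)
```

A transducer-based formulation would instead carry the alignment between input and output symbols through composition, which lets weighted alternatives compete rather than committing to the first match as this greedy scan does.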