Proper capitalization is a useful and often mandatory property of text. Many text processing techniques rely on it, and people read mixed-case text more easily. It is nevertheless absent from a number of text sources, including automatic speech recognition output and closed caption text, and restoring it can greatly enhance the value of these sources. We describe and evaluate a series of techniques for recovering proper capitalization. Our final system recovers more than 88% of the capitalized words with better than 90% accuracy.
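To make the task concrete, the following is a minimal sketch of one simple baseline for case recovery, not the system evaluated here. It assumes a lookup of the most frequent surface form of each word, learned from cased text, plus capitalization of sentence-initial tokens. The function names (`train_case_model`, `recover_case`, `score`) and the whitespace tokenization are hypothetical choices made for illustration. The scoring function reads the two reported figures as recall and precision over capitalized words, which is an interpretation rather than something the abstract states.

```python
from collections import Counter, defaultdict

PUNCT = ".,!?;:"  # trailing punctuation stripped before lookup (illustrative choice)


def train_case_model(cased_sentences):
    """Map each lowercased word to its most frequent cased form.

    Sentence-initial tokens are skipped because their capital letter
    says nothing about the word's usual case.
    """
    counts = defaultdict(Counter)
    for sentence in cased_sentences:
        for token in sentence.split()[1:]:
            core = token.rstrip(PUNCT)
            if core:
                counts[core.lower()][core] += 1
    return {word: forms.most_common(1)[0][0] for word, forms in counts.items()}


def recover_case(text, model):
    """Restore case token by token in single-case input text."""
    out = []
    sentence_start = True
    for token in text.split():
        core = token.rstrip(PUNCT)
        tail = token[len(core):]
        # Unknown words fall back to lowercase.
        cased = model.get(core.lower(), core.lower())
        if sentence_start:
            cased = cased[:1].upper() + cased[1:]
        out.append(cased + tail)
        # The next token starts a sentence if this one ends with . ! or ?
        sentence_start = tail.endswith((".", "!", "?"))
    return " ".join(out)


def score(predicted, reference):
    """Return (recall, precision) over capitalized tokens.

    Assumes both strings have the same whitespace tokenization.
    """
    pred, ref = predicted.split(), reference.split()

    def is_cap(token):
        return token[:1].isupper()

    correct = sum(1 for p, r in zip(pred, ref) if is_cap(r) and p == r)
    ref_caps = sum(1 for r in ref if is_cap(r))
    pred_caps = sum(1 for p in pred if is_cap(p))
    recall = correct / ref_caps if ref_caps else 0.0
    precision = correct / pred_caps if pred_caps else 0.0
    return recall, precision


if __name__ == "__main__":
    training = [
        "The meeting was held in Boston with John Smith",
        "We flew to Boston on Monday",
    ]
    model = train_case_model(training)
    restored = recover_case("john flew to boston on monday. the meeting was held", model)
    print(restored)
    print(score(restored, "John flew to Boston on Monday. The meeting was held"))
```

A lookup like this cannot handle words whose case depends on context (for example a surname that is also a common noun), which is the kind of ambiguity that more elaborate techniques would need to address.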