The DARPA Spoken Language System (SLS) community has long led the design, implementation, and global distribution of significant speech corpora that are widely used to advance speech recognition research. The Wall Street Journal (WSJ) CSR Corpus described here is the newest addition to this valuable set of resources. In contrast to previous corpora, the WSJ corpus will provide DARPA with its first general-purpose English, large-vocabulary, natural-language, high-perplexity corpus containing significant quantities of both speech data (400 hrs.) and text data (47M words), thereby providing a means to integrate speech recognition and natural language processing in application domains with high potential practical value. This paper presents the motivating goals, acoustic data design, text processing steps, lexicons, and testing paradigms incorporated into the multi-faceted WSJ CSR Corpus.