Analysis and processing of lecture audio data: preliminary investigations

Authors:
James Glass; Timothy J. Hazen; Lee Hetherington; Chao Wang
Affiliations:
MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA (all authors)
Venue:
SpeechIR '04 Proceedings of the Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval at HLT-NAACL 2004
Year:
2004

Abstract

This paper reports on our recent efforts to collect a corpus of spoken lecture material to support research on fast, accurate, and easy access to lecture content. Thus far, we have collected 270 hours of speech from a variety of undergraduate courses and seminars. We present an initial analysis of the spontaneous speech phenomena in these data and of vocabulary usage patterns across three courses. Finally, we examine the perplexities of language models trained on written and spoken materials, and describe an initial recognition experiment on one course.
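
The perplexity comparison mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' setup: it assumes a unigram model with add-one smoothing, and the toy vocabulary and sentences are invented purely for illustration. The function names `train_unigram` and `perplexity` are hypothetical. It only shows how the training source (written versus spoken text) changes the perplexity measured on spoken test material.

```python
import math
from collections import Counter

UNK = "<unk>"  # placeholder for out-of-vocabulary words (assumed not in vocab)

def train_unigram(tokens, vocab):
    """Return add-one smoothed unigram probabilities over vocab plus UNK."""
    counts = Counter(t if t in vocab else UNK for t in tokens)
    total = len(tokens) + len(vocab) + 1  # +1 accounts for the UNK entry
    probs = {w: (counts[w] + 1) / total for w in vocab}
    probs[UNK] = (counts[UNK] + 1) / total
    return probs

def perplexity(probs, tokens):
    """Perplexity = 2 ** (negative average log2 probability per token)."""
    log_sum = sum(math.log2(probs.get(t, probs[UNK])) for t in tokens)
    return 2 ** (-log_sum / len(tokens))

# Toy data, invented for illustration only.
vocab = {"the", "of", "a", "um", "uh", "so", "signal", "transform"}
written = "the transform of a signal".split()
spoken = "so um the transform uh the signal".split()
test = "um so the signal".split()

written_lm = train_unigram(written, vocab)
spoken_lm = train_unigram(spoken, vocab)

print("written-trained perplexity:", perplexity(written_lm, test))
print("spoken-trained perplexity: ", perplexity(spoken_lm, test))
```

In this toy case the model trained on spoken text assigns higher probability to fillers such as "um" and "so", so its perplexity on the spoken test sentence is lower. Real experiments would use higher-order n-gram models and far larger corpora.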