Spoken audio documents are becoming increasingly common on the World Wide Web, a trend that is likely to accelerate with the widespread deployment of broadband technologies. Unfortunately, speech documents are inherently hard to browse because of their transient nature. One approach to this problem is to label segments of a spoken document with keyphrases that summarise them. In this paper, we investigate an approach for automatically extracting keyphrases from spoken audio documents. We use a keyphrase extraction system (Extractor), originally developed for text, and apply it to errorful speech recognition transcripts, which may contain multiple hypotheses for each utterance. We show that keyphrase extraction is an "easier" task than full text transcription and that keyphrases can be extracted with reasonable precision from transcripts with Word Error Rates (WER) as high as 62%. This robustness to noise can be attributed to two facts: keyphrase words have a lower WER than non-keyphrase words, and they tend to have more redundancy in the audio. From this we conclude that keyphrase extraction is feasible for a wide range of spoken documents, including less-than-broadcast casual speech. We also show that including multiple utterance hypotheses does not improve the precision of the extracted keyphrases.
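The two measures named in the abstract, Word Error Rate and keyphrase precision, can be illustrated with a minimal sketch. It assumes the standard definition of WER (word-level edit distance divided by the number of reference words) and takes precision as the fraction of extracted keyphrases found in a reference set. The function names, the whitespace tokenisation, and the case-insensitive exact matching are assumptions made for illustration; they are not taken from the paper or from Extractor.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumes simple whitespace tokenisation (an illustrative choice).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def keyphrase_precision(extracted: list[str], reference: list[str]) -> float:
    """Fraction of extracted keyphrases that appear in the reference set.

    Assumes case-insensitive exact matching (an illustrative choice).
    """
    if not extracted:
        return 0.0
    gold = {phrase.lower() for phrase in reference}
    hits = sum(1 for phrase in extracted if phrase.lower() in gold)
    return hits / len(extracted)


if __name__ == "__main__":
    print(word_error_rate("a b c d", "a x c"))
    print(keyphrase_precision(["speech recognition", "keyphrase"],
                              ["keyphrase", "audio"]))
```

Because WER divides by the length of the reference transcript, it can exceed 100% when the hypothesis contains many inserted words.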