Summarizing multiple spoken documents: finding evidence from untranscribed audio

  • Authors:
  • Xiaodan Zhu; Gerald Penn; Frank Rudzicz

  • Affiliations:
  • University of Toronto, Toronto, ON, Canada (all authors)

  • Venue:
  • ACL '09: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, Volume 2
  • Year:
  • 2009

Abstract

This paper presents a model for summarizing multiple untranscribed spoken documents. Without assuming the availability of transcripts, the model modifies a recently proposed unsupervised algorithm to detect re-occurring acoustic patterns in speech and uses them to estimate similarities between utterances, which are in turn used to identify salient utterances and remove redundancies. This model is of interest because it is independent of spoken language transcription, an error-prone and resource-intensive process; it can integrate multiple sources of information on the same topic; and its use of acoustic patterns extends previous work on low-level prosodic feature detection. We compare the performance of this model with that achieved using manual and automatic transcripts, and find that the new approach is roughly equivalent to having access to ASR transcripts with word error rates in the 33--37% range, without actually performing ASR, while also better handling utterances containing out-of-vocabulary words.
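
The abstract does not spell out the selection procedure, but the pipeline it describes (pairwise utterance similarities feeding salience estimation and redundancy removal) matches a standard MMR-style extractive step. The sketch below is a hypothetical illustration under that assumption, not the authors' actual algorithm: it assumes a precomputed utterance-by-utterance similarity matrix (here filled with made-up values, whereas the paper derives similarities from detected acoustic patterns) and greedily selects utterances that are central yet not redundant with those already chosen.

```python
import numpy as np

def summarize(sim, k, lam=0.7):
    """Pick k salient, non-redundant utterance indices from an n x n
    similarity matrix `sim`, MMR-style.

    Salience is approximated by an utterance's average similarity to all
    others (centrality); redundancy is penalized against the utterances
    already selected. `lam` trades off salience vs. redundancy.
    """
    n = sim.shape[0]
    salience = sim.mean(axis=1)
    selected, remaining = [], set(range(n))

    while remaining and len(selected) < k:
        def mmr_score(i):
            redundancy = max((sim[i, j] for j in selected), default=0.0)
            return lam * salience[i] - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected

if __name__ == "__main__":
    # Toy 5-utterance similarity matrix with hypothetical values.
    sim = np.array([
        [1.0, 0.8, 0.1, 0.3, 0.2],
        [0.8, 1.0, 0.2, 0.3, 0.1],
        [0.1, 0.2, 1.0, 0.4, 0.5],
        [0.3, 0.3, 0.4, 1.0, 0.6],
        [0.2, 0.1, 0.5, 0.6, 1.0],
    ])
    print(summarize(sim, k=2))  # e.g. two mutually non-redundant utterances
```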