Imposing hierarchical browsing structures onto spoken documents

Authors:
Xiaodan Zhu; Colin Cherry; Gerald Penn
Affiliations:
National Research Council Canada; National Research Council Canada; University of Toronto
Venue:
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Year:
2010

Abstract

This paper studies the problem of imposing a known hierarchical structure onto an unstructured spoken document, with the aim of making archives of such documents easier to browse. We formulate our solutions within a dynamic-programming-based alignment framework and use minimum error-rate training to combine a number of global and hierarchical constraints. This pragmatic approach is computationally efficient. Results show that it outperforms a baseline that ignores the hierarchical and global features, and that the improvement is consistent across transcripts with different word error rates (WERs). Directly imposing such hierarchical structures onto raw speech, without using transcripts, yields competitive results.
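
To illustrate the general idea of a dynamic-programming alignment, the following is a minimal sketch and not the authors' model. It assumes a hypothetical similarity matrix `sim`, where `sim[k][i]` scores how well the k-th heading of the known structure matches the i-th utterance, and it places the headings on strictly increasing utterance positions so that the total similarity is maximal. The function name `align_headings`, the scoring scheme, and the strict-ordering constraint are assumptions made for illustration. The global and hierarchical constraints and the minimum error-rate training mentioned in the abstract are not modeled here.

```python
from typing import List


def align_headings(sim: List[List[float]]) -> List[int]:
    """Monotone alignment of K ordered headings to N utterances.

    sim[k][i] is an assumed similarity between heading k and utterance i.
    Returns one utterance index per heading, strictly increasing, chosen
    to maximize the sum of similarities.
    """
    K = len(sim)
    if K == 0:
        return []
    N = len(sim[0])
    if K > N:
        raise ValueError("need at least as many utterances as headings")

    NEG = float("-inf")
    # best[k][i]: best total score with heading k placed at utterance i
    best = [[NEG] * N for _ in range(K)]
    # back[k][i]: position of heading k-1 on that best path
    back = [[-1] * N for _ in range(K)]

    for i in range(N):
        best[0][i] = sim[0][i]

    for k in range(1, K):
        run_best, run_arg = NEG, -1  # best of best[k-1][j] over j < i
        for i in range(N):
            if i > 0 and best[k - 1][i - 1] > run_best:
                run_best, run_arg = best[k - 1][i - 1], i - 1
            if run_arg >= 0:
                best[k][i] = run_best + sim[k][i]
                back[k][i] = run_arg

    # Backtrack from the best final position of the last heading.
    end = max(range(N), key=lambda i: best[K - 1][i])
    path = [end]
    for k in range(K - 1, 0, -1):
        path.append(back[k][path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    # Toy scores: 3 headings, 4 utterances (made-up numbers).
    sim = [
        [0.9, 0.1, 0.0, 0.2],
        [0.8, 0.2, 0.7, 0.1],
        [0.1, 0.1, 0.3, 0.6],
    ]
    print(align_headings(sim))
```

In the toy example, the second heading scores highest on the first utterance taken alone, but the ordering constraint pushes it to a later position. The running maximum keeps the sketch at O(K·N) time.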