Extractive spoken document summarization for information retrieval

Authors:
Berlin Chen;Yi-Ting Chen
Affiliations:
Department of Computer Science and Information Engineering, National Taiwan Normal University, Taipei, Taiwan;Department of Computer Science and Information Engineering, National Taiwan Normal University, Taipei, Taiwan
Venue:
Pattern Recognition Letters
Year:
2008

Citing 23
Cited 4

A trainable document summarizer

SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval
Foundations of statistical natural language processing

Foundations of statistical natural language processing
Summarizing text documents: sentence selection and evaluation metrics

Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Unsupervised learning by probabilistic latent semantic analysis

Machine Learning
Generic text summarization using relevance measure and latent semantic analysis

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
The development of the HTK Broadcast News transcription system: an overview

Speech Communication - Special issue on automatic transcription of broadcast news data
Modern Information Retrieval

Modern Information Retrieval
Advances in Automatic Text Summarization

Advances in Automatic Text Summarization
Investigation of silicon auditory models and generalization of linear discriminant analysis for improved speech recognition

Investigation of silicon auditory models and generalization of linear discriminant analysis for improved speech recognition
Language Modeling for Information Retrieval

Language Modeling for Information Retrieval
Latent dirichlet allocation

The Journal of Machine Learning Research
A discriminative HMM/N-gram-based retrieval approach for mandarin spoken documents

ACM Transactions on Asian Language Information Processing (TALIP)
Meta-evaluation of summaries in a cross-lingual environment using content-based metrics

COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Automatic summarization of voicemail messages using lexical and prosodic features

ACM Transactions on Speech and Language Processing (TSLP)
Exploring the use of latent topical information for statistical Chinese spoken document retrieval

Pattern Recognition Letters
Extractive summarization using inter- and intra- event relevance

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
A bottom-up approach to sentence ordering for multi-document summarization

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Maximum likelihood discriminant feature spaces

ICASSP '00 Proceedings of the Acoustics, Speech, and Signal Processing, 2000. on IEEE International Conference - Volume 02
Extractive summarization based on event term clustering

ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
A skip-chain conditional random field for ranking meeting utterances by importance

EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Summarizing speech without text using hidden Markov models

NAACL-Short '06 Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers
Machine-made index for technical literature: an experiment

IBM Journal of Research and Development
Automatic text summarization based on word-clusters and ranking algorithms

ECIR'05 Proceedings of the 27th European conference on Advances in Information Retrieval Research

Minimum tag error for discriminative training of conditional random fields

Information Sciences: an International Journal
A Comparative Study of Probabilistic Ranking Models for Chinese Spoken Document Summarization

ACM Transactions on Asian Language Information Processing (TALIP)
Is the contextual information relevant in text clustering by compression?

Expert Systems with Applications: An International Journal
Unsupervised topic modeling approaches to decision summarization in spoken meetings

SIGDIAL '12 Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Quantified Score

Hi-index	0.10

Visualization

Abstract

The purpose of extractive summarization is to automatically select a number of indicative sentences, passages, or paragraphs from the original document according to a target summarization ratio and then sequence them to form a concise summary. In this paper, we proposed the use of probabilistic latent topical information for extractive summarization of spoken documents. Various kinds of modeling structures and learning approaches were extensively investigated. In addition, the summarization capabilities were verified by comparison with several conventional spoken document summarization models. The experiments were performed on the Chinese broadcast news collected in Taiwan. Noticeable performance gains were obtained. The proposed summarization technique has also been properly integrated into our prototype system for voice retrieval of Mandarin broadcast news via mobile devices.