Extractive speech summarization using evaluation metric-related training criteria

Authors:
Berlin Chen;Shih-Hsiang Lin;Yu-Mei Chang;Jia-Wen Liu
Affiliations:
Department of Computer Science & Information Engineering, National Taiwan Normal University, Taipei, Taiwan (all authors)
Venue:
Information Processing and Management: an International Journal
Year:
2013

Abstract

Extractive speech summarization automatically selects a number of indicative sentences, paragraphs, or audio segments from a spoken document according to a target summarization ratio and concatenates them to form a concise summary. Much work in this area has focused on machine-learning approaches that cast the selection of important sentences as a two-class classification problem, and these approaches have been applied with some success to a number of speech summarization tasks. However, because summary sentences are typically far fewer than non-summary sentences, the resulting imbalanced-data problem can yield a trained summarizer with unsatisfactory performance. Furthermore, training the summarizer to improve classification accuracy does not always improve performance on the summarization evaluation metric. In view of these two problems, this paper presents an empirical investigation of two schools of training criteria intended to alleviate their negative effects and to boost summarization performance. The first learns the summarizer from pair-wise ordering information, that is, from the relative importance of sentence pairs in a training document. The second trains the summarizer by directly maximizing the associated evaluation score, or by optimizing an objective linked to the final evaluation metric. Experimental results on a broadcast news summarization task suggest that these training criteria can give substantial improvements over a few existing summarization methods.
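To make the first training criterion more concrete, the following is a minimal sketch of pair-wise training for sentence selection. It is not the authors' implementation: the linear scorer, the hinge-loss update, the toy two-dimensional features, the graded importance labels, and all function names (`make_pairs`, `train_pairwise`, `summarize`) are assumptions chosen for illustration. The second criterion (directly optimizing the evaluation score) is not shown.

```python
from itertools import combinations


def make_pairs(features, importance):
    """Turn per-sentence importance labels into ordered training pairs.

    Each returned pair (hi, lo) states that the sentence with features
    `hi` should be ranked above the one with features `lo`. Ties are
    skipped because they carry no ordering information.
    """
    pairs = []
    for i, j in combinations(range(len(features)), 2):
        if importance[i] > importance[j]:
            pairs.append((features[i], features[j]))
        elif importance[j] > importance[i]:
            pairs.append((features[j], features[i]))
    return pairs


def score(w, x):
    """Linear sentence score: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_pairwise(pairs, dim, epochs=50, lr=0.1, margin=1.0):
    """Train a linear ranker with a pair-wise hinge loss (plain SGD)."""
    w = [0.0] * dim
    for _ in range(epochs):
        for hi, lo in pairs:
            # Update only when the pair is mis-ordered or inside the margin.
            if score(w, hi) - score(w, lo) < margin:
                w = [wi + lr * (h - l) for wi, h, l in zip(w, hi, lo)]
    return w


def summarize(w, features, ratio):
    """Select the top-scoring sentences for a given summarization ratio.

    Returns sentence indices in document order, so the selected
    sentences can be concatenated directly.
    """
    k = max(1, round(ratio * len(features)))
    ranked = sorted(range(len(features)),
                    key=lambda i: score(w, features[i]), reverse=True)
    return sorted(ranked[:k])


# Hypothetical toy document: five sentences, two made-up features each,
# with made-up graded importance labels (higher = more summary-worthy).
features = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.1, 0.9], [0.5, 0.5]]
importance = [3, 0, 2, 0, 1]

pairs = make_pairs(features, importance)
w = train_pairwise(pairs, dim=2)
summary = summarize(w, features, ratio=0.4)
print(summary)  # [0, 2]
```

Because each training example is a pair of sentences rather than a single labeled sentence, the scarcity of summary sentences does not translate directly into a skewed class distribution, which is one plausible reading of why a pair-wise criterion may ease the imbalanced-data problem.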