Summarizing text documents: sentence selection and evaluation metrics
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Generic text summarization using relevance measure and latent semantic analysis
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Modern Information Retrieval
Advances in Automatic Text Summarization
Advances in Automatic Text Summarization
Language Modeling for Information Retrieval
Language Modeling for Information Retrieval
A discriminative HMM/N-gram-based retrieval approach for mandarin spoken documents
ACM Transactions on Asian Language Information Processing (TALIP)
Automatic summarization of voicemail messages using lexical and prosodic features
ACM Transactions on Speech and Language Processing (TSLP)
A Comparative Study of Probabilistic Ranking Models for Chinese Spoken Document Summarization
ACM Transactions on Asian Language Information Processing (TALIP)
Extractive speech summarization using shallow rhetorical structure modeling
IEEE Transactions on Audio, Speech, and Language Processing
Extractive speech summarization using evaluation metric-related training criteria
Information Processing and Management: an International Journal
Hi-index | 0.00 |
The purpose of extractive summarization is to automatically select indicative sentences, passages, or paragraphs from an original document according to a certain target summarization ratio, and then sequence them to form a concise summary. In this paper, in contrast to conventional approaches, our objective is to deal with the extractive summarization problem under a probabilistic modeling framework. We investigate the use of the hidden Markov model (HMM) for spoken document summarization, in which each sentence of a spoken document is treated as an HMM for generating the document, and the sentences are ranked and selected according to their likelihoods. In addition, the relevance model (RM) of each sentence, estimated from a contemporary text collection, is integrated with the HMM model to improve the representation of the sentence model. The experiments were performed on Chinese broadcast news compiled in Taiwan. The proposed approach achieves noticeable performance gains over conventional summarization approaches.