Significant research effort has been devoted to speech summarization, including automatic approaches and evaluation metrics. However, fundamental questions remain open: what constitutes a good summary of speech data, and do humans agree with one another when creating summaries? This paper analyzes human-annotated extractive summaries of the ICSI meeting corpus, with the aim of examining their consistency and the factors that affect human agreement. In addition to Kappa statistics and ROUGE scores, we propose a sentence distance score and a divergence distance as quantitative measures. This study is expected to help better define the speech summarization problem.
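The Kappa statistic mentioned above (Cohen's kappa) corrects raw inter-annotator agreement for chance. A minimal sketch of how it can be computed for two annotators' binary in-summary labels follows; the annotation vectors here are illustrative, not taken from the corpus:

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa for two annotators labeling the same items."""
    assert len(a) == len(b)
    n = len(a)
    # observed agreement: fraction of items with identical labels
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # expected chance agreement from each annotator's label distribution
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[lab] * cb[lab] for lab in set(a) | set(b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# hypothetical example: two annotators marking sentences as
# in-summary (1) or not (0)
ann1 = [1, 0, 0, 1, 0, 1, 0, 0]
ann2 = [1, 0, 1, 1, 0, 0, 0, 0]
print(round(cohens_kappa(ann1, ann2), 3))  # prints 0.467
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which is why it is preferred over raw percent agreement for tasks like extractive summary annotation, where the "not in summary" label dominates.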