Topical segmentation: a study of human performance and a new measure of quality
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
We propose a new segmentation evaluation metric, called segmentation similarity (S), that quantifies the similarity between two segmentations as the proportion of boundaries that are not transformed when comparing them using edit distance. In effect, edit distance serves as a penalty function, with penalties scaled by segmentation size. Building on S, we propose several adapted inter-annotator agreement coefficients suitable for segmentation. We show that S is configurable enough to suit a wide variety of segmentation evaluations and is an improvement upon the state of the art. We also propose using these inter-annotator agreement coefficients to evaluate automatic segmenters in terms of human performance.
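The idea behind S can be illustrated with a small sketch. The code below is a simplified approximation, not the authors' full boundary edit distance algorithm: function names are hypothetical, near-miss matching is greedy, and both a transposition (a near-miss pair within `n` units) and an unmatched boundary are counted as a single edit, so a near miss costs half as much as a full miss pair.

```python
def boundaries(masses):
    """Convert segment masses, e.g. [2, 3, 4], to boundary positions {2, 5}."""
    positions, total = set(), 0
    for mass in masses[:-1]:
        total += mass
        positions.add(total)
    return positions

def segmentation_similarity(masses_a, masses_b, n=2):
    """Simplified S: 1 - (edit penalty / potential boundary positions).

    A near miss (boundaries within n units of each other) is paired as
    one transposition costing 1 edit; an unmatched boundary costs 1 edit.
    """
    a, b = boundaries(masses_a), boundaries(masses_b)
    only_a, only_b = a - b, b - a
    edits = 0
    for pos in sorted(only_a):
        near = [q for q in sorted(only_b) if abs(q - pos) <= n]
        if near:
            # Near miss: pair the two boundaries as one transposition.
            only_b.discard(near[0])
            edits += 1
        else:
            # Full miss: this boundary has no counterpart nearby.
            edits += 1
    edits += len(only_b)  # remaining unmatched boundaries in b
    potential = sum(masses_a) - 1  # potential boundary positions in the document
    return 1 - edits / potential
```

For example, comparing masses `[2, 3, 4]` against `[2, 2, 5]` yields one near miss (boundaries at positions 5 and 4), giving S = 1 - 1/8 = 0.875, whereas a full miss of that boundary pair would cost two edits.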