Statistical Models for Text Segmentation
Machine Learning - Special issue on natural language learning
A critique and improvement of an evaluation metric for text segmentation
Computational Linguistics
Topic segmentation: algorithms and applications
Topic segmentation: algorithms and applications
Lexical cohesion computed by thesaural relations as an indicator of the structure of text
Computational Linguistics
TextTiling: segmenting text into multi-paragraph subtopic passages
Computational Linguistics
Advances in domain independent linear text segmentation
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
Text segmentation based on similarity between words
ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Word sense disambiguation and text segmentation based on lexical cohesion
COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 2
A statistical model for domain-independent text segmentation
ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
SeLeCT: a lexical cohesion based news story segmentation system
AI Communications - STAIRS 2002
Story segmentation of brodcast news in English, Mandarin and Arabic
NAACL-Short '06 Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers
NewsViz: emotional visualization of news stories
CAAGET '10 Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text
Getting more from segmentation evaluation
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Expert Systems with Applications: An International Journal
Hi-index | 0.00 |
In this paper we describe a novel approach to lexical chain based segmentation of broadcast news stories. Our segmentation system SeLeCT is evaluated with respect to two other lexical cohesion based segmenters TextTiling and C99. Using the Pk and WindowDiff evaluation metrics we show that SeLeCT outperforms both systems on spoken news transcripts (CNN) while the C99 algorithm performs best on the written newswire collection (Reuters). We also examine the differences between spoken and written news styles and how these differences can affect segmentation accuracy.