Spoken and written news story segmentation using lexical chains
NAACLstudent '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology: Proceedings of the HLT-NAACL 2003 student research workshop - Volume 3
In this paper we compare the performance of three distinct approaches to lexical-cohesion-based text segmentation. Most work in this area has focused on discovering textual units that correspond to subtopics within a document. In contrast, our segmentation task requires the discovery of topical units of text, i.e., distinct news stories within broadcast news programmes. Our approach to news story segmentation (the SeLeCT system) is based on an analysis of lexical cohesive strength between textual units, using a linguistic technique called lexical chaining. We evaluate the performance of SeLeCT relative to two other cohesion-based segmenters: TextTiling and C99. Using WindowDiff, a recently introduced evaluation metric, we contrast the segmentation accuracy of each system on both "spoken" (CNN news transcripts) and "written" (Reuters newswire) news story test sets extracted from the TDT1 corpus.
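For readers unfamiliar with WindowDiff, the following is a minimal sketch of the metric as it is commonly defined: a window of k sentences slides over the text, and a window counts as an error whenever the reference and the hypothesised segmentation disagree on the number of boundaries inside it. The function name `window_diff`, the 0/1 gap encoding, and the default choice of k (half the mean reference segment length) are assumptions made for illustration; they are not taken from the SeLeCT evaluation itself.

```python
def window_diff(reference, hypothesis, k=None):
    """Illustrative WindowDiff score (0.0 = perfect, higher = worse).

    reference, hypothesis: lists of 0/1 with one entry per gap between
    adjacent sentences; 1 marks a story boundary in that gap.
    (This encoding is an assumption made for the sketch.)
    """
    if len(reference) != len(hypothesis):
        raise ValueError("segmentations must cover the same text")

    n_sentences = len(reference) + 1

    if k is None:
        # Common convention: half the mean reference segment length.
        n_segments = sum(reference) + 1
        k = max(1, round(n_sentences / n_segments / 2))

    n_windows = n_sentences - k
    if n_windows <= 0:
        raise ValueError("window size k must be smaller than the text")

    errors = 0
    for i in range(n_windows):
        # Count boundaries between sentence i and sentence i + k.
        ref_count = sum(reference[i:i + k])
        hyp_count = sum(hypothesis[i:i + k])
        if ref_count != hyp_count:
            errors += 1

    return errors / n_windows


# Nine sentences forming three stories of three sentences each.
reference = [0, 0, 1, 0, 0, 1, 0, 0]
near_miss = [0, 1, 0, 0, 0, 1, 0, 0]   # first boundary one gap early
no_breaks = [0, 0, 0, 0, 0, 0, 0, 0]   # segmenter found nothing

print(window_diff(reference, near_miss, k=2))
print(window_diff(reference, no_breaks, k=2))
```

With k = 2, the near miss is penalised in two of the seven windows, while missing both boundaries is penalised in four, so the metric gives partial credit to boundaries that are close to the reference.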