We consider the task of linear thematic segmentation of text documents using features based on word distributions in the text. A typical and often implicit assumption in previous studies is that a document has just one topic; as a result, many algorithms have been tested, with encouraging results, on artificial data sets generated by concatenating parts of different documents. We show that evaluation on synthetic data is potentially misleading and does not accurately reflect performance on real data. Moreover, we provide a critical review of existing evaluation metrics in the literature and propose an improved evaluation metric.
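
To make the evaluation setup concrete, the following minimal Python sketch illustrates the two ingredients discussed above: a synthetic test document built by concatenating parts of different documents, and an existing segmentation metric applied to it. It is an illustration under stated assumptions, not the procedure or the improved metric of this work. The function names `build_synthetic_document` and `window_diff` are hypothetical, the metric shown is WindowDiff (one existing metric from the literature), and the windowing convention used here is one of several in circulation.

```python
import random
from typing import List, Sequence, Tuple


def build_synthetic_document(
    sources: Sequence[Sequence[str]], n_segments: int, seg_len: int, seed: int = 0
) -> Tuple[List[str], List[int]]:
    """Concatenate the first `seg_len` sentences of distinct source documents.

    Assumes every source document is non-empty and n_segments <= len(sources).
    Returns the sentences and a gold boundary vector in which
    boundaries[i] == 1 marks a topic change between sentence i and i + 1.
    """
    rng = random.Random(seed)
    sentences: List[str] = []
    boundaries: List[int] = []
    for doc in rng.sample(list(sources), n_segments):
        chunk = list(doc[:seg_len])
        if not chunk:
            raise ValueError("source documents must be non-empty")
        sentences.extend(chunk)
        # No boundary inside the chunk, one boundary at its end.
        boundaries.extend([0] * (len(chunk) - 1) + [1])
    return sentences, boundaries[:-1]  # drop the boundary after the last sentence


def window_diff(reference: Sequence[int], hypothesis: Sequence[int], k: int) -> float:
    """WindowDiff over 0/1 boundary vectors (lower is better).

    Slides a window of k gaps over both vectors and counts the windows in
    which the number of boundaries differs, normalised by the window count.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("boundary vectors must have equal length")
    n_windows = len(reference) - k + 1
    if k < 1 or n_windows < 1:
        raise ValueError("k must satisfy 1 <= k <= len(reference)")
    errors = sum(
        1
        for i in range(n_windows)
        if sum(reference[i:i + k]) != sum(hypothesis[i:i + k])
    )
    return errors / n_windows
```

A common choice for `k` is half the average reference segment length. Because every segment in such a synthetic document comes from an unrelated source, its boundaries tend to be far sharper than the topic shifts inside a single real document, which is one reason scores on this kind of data can overstate performance.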