The problem of automatic text segmentation divides into two subproblems: thematic segmentation into fairly large, topically self-contained sections, and splitting into paragraphs, i.e., lower-level lexico-grammatical segmentation. In this paper we consider the latter problem. We propose a method for splitting text into paragraphs based on a measure of text cohesion. Specifically, we propose a quantitative evaluation of text cohesion based on a large linguistic resource, a collocation network. At each step, our algorithm compares word occurrences in the text against a large database of collocations and semantic links between words of the given natural language. The procedure consists of evaluating the cohesion function, smoothing and normalizing it, and comparing the result with a specially constructed threshold.
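The four stages named above (evaluate cohesion, smooth, normalize, compare with a threshold) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the cohesion function, the smoothing kernel, the normalization, or how the threshold is constructed. The sketch assumes a toy set of word links in place of the collocation database, counts linked or repeated word pairs across each sentence gap, uses a plain moving average, divides by the mean, and applies a constant threshold. All function names, parameters, and data are hypothetical.

```python
from typing import FrozenSet, List, Set

# Toy stand-in for the collocation network (assumption: unordered word pairs).
TOY_LINKS: Set[FrozenSet[str]] = {
    frozenset(("strong", "tea")),
    frozenset(("drink", "tea")),
    frozenset(("heavy", "rain")),
    frozenset(("rain", "fall")),
}

# Toy text: each sentence is a list of content words (assumption).
TOY_SENTENCES: List[List[str]] = [
    ["strong", "tea"],
    ["drink", "tea"],
    ["heavy", "rain"],
    ["rain", "fall"],
]


def cohesion(sentences: List[List[str]], links: Set[FrozenSet[str]]) -> List[float]:
    """Raw cohesion at each gap between adjacent sentences.

    Assumed measure: the number of word pairs (one word from each side of
    the gap) that are either identical or linked in the collocation set.
    """
    values = []
    for left, right in zip(sentences, sentences[1:]):
        count = sum(
            1 for a in left for b in right
            if a == b or frozenset((a, b)) in links
        )
        values.append(float(count))
    return values


def smooth(values: List[float], radius: int = 1) -> List[float]:
    """Moving average over a window of up to 2*radius + 1 gaps (assumed kernel)."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - radius)
        hi = min(len(values), i + radius + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def normalize(values: List[float]) -> List[float]:
    """Divide by the mean so the threshold is independent of text density."""
    mean = sum(values) / len(values) if values else 0.0
    if mean == 0.0:
        return [0.0] * len(values)
    return [v / mean for v in values]


def paragraph_starts(
    sentences: List[List[str]],
    links: Set[FrozenSet[str]],
    radius: int = 1,
    threshold: float = 0.5,
) -> List[int]:
    """Indices of sentences that open a new paragraph.

    A boundary is placed wherever the smoothed, normalized cohesion falls
    below a constant threshold (a placeholder for the paper's threshold).
    """
    curve = normalize(smooth(cohesion(sentences, links), radius))
    return [gap + 1 for gap, value in enumerate(curve) if value < threshold]


if __name__ == "__main__":
    # radius=0 disables smoothing, which suits a four-sentence toy text.
    print(paragraph_starts(TOY_SENTENCES, TOY_LINKS, radius=0))
```

On the toy data the raw cohesion is high within the tea sentences and within the rain sentences and zero between the two groups, so the only gap below the threshold is the one before the third sentence. With longer texts a nonzero smoothing radius suppresses isolated dips, at the cost of blurring boundaries that are only one gap wide.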