An unsupervised topic segmentation model incorporating word order
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Hi-index | 0.00 |
Text segmentation is important for many natural language processing tasks, such as passage retrieval and summarization. This paper uses suffix tree model for the text representation and introduces a new measure, subsequence-based coherence, to represent the coherence between sentences and utilize the word order information. This paper also introduces a text segmentation algorithm, subsequence-based maximum cut, and a passage labeling approach based on subsequences. The educational text segmentation results show that our method outperforms some of the existing methods, and the passage labeling result is approving.