A dynamic programming model for text segmentation based on min-max similarity

Authors:
Na Ye;Jingbo Zhu;Yan Zheng;Matthew Y. Ma;Huizhen Wang;Bin Zhang
Affiliations:
Institute of Computer Software and Theory, Northeastern University, Shenyang, China;Institute of Computer Software and Theory, Northeastern University, Shenyang, China;Institute of Computer Software and Theory, Northeastern University, Shenyang, China;IPVALUE Management Inc., Bridgewater, NJ;Institute of Computer Software and Theory, Northeastern University, Shenyang, China;Institute of Computer Applications, Northeastern University, Shenyang, China
Venue:
AIRS'08 Proceedings of the 4th Asia information retrieval conference on Information retrieval technology
Year:
2008

Abstract

Text segmentation has a wide range of applications, such as information retrieval, question answering, and text summarization. In recent years, the use of semantics has proven effective in improving text segmentation performance. In particular, for finding subtopic boundaries, previous efforts have focused on either maximizing the lexical similarity within a segment or minimizing the similarity between adjacent segments. However, no optimal solution has been attempted that simultaneously achieves maximum within-segment similarity and minimum between-segment similarity. This paper proposes a domain-independent model based on min-max similarity (MMS) to fill this gap. Dynamic programming is adopted to achieve global optimization of the segmentation criterion function. Comparative experiments on a real corpus show that the MMS model outperforms previous segmentation approaches.
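
The abstract does not state the MMS criterion function, so the following Python sketch only illustrates the general idea of optimizing a min-max style objective globally with dynamic programming. Everything in it is an assumption made for illustration: the input is a precomputed sentence-by-sentence similarity matrix, the score of a segmentation is taken to be the sum of mean within-segment similarities minus the sum of mean similarities between adjacent segments, and the function names (`mean_within`, `mean_between`, `segment`) are hypothetical. It should not be read as the authors' model.

```python
from itertools import combinations


def mean_within(sim, i, j):
    """Mean pairwise similarity among sentences i..j-1 (0.0 for one sentence)."""
    pairs = list(combinations(range(i, j), 2))
    if not pairs:
        return 0.0
    return sum(sim[a][b] for a, b in pairs) / len(pairs)


def mean_between(sim, h, i, j):
    """Mean similarity between sentences in [h, i) and sentences in [i, j)."""
    total = sum(sim[a][b] for a in range(h, i) for b in range(i, j))
    return total / ((i - h) * (j - i))


def segment(sim):
    """Return sorted boundary indices of the best-scoring segmentation.

    Assumed (illustrative) score: sum of within-segment means minus the
    sum of between-segment means over adjacent segment pairs.
    """
    n = len(sim)
    if n == 0:
        return []
    # best[(i, j)] = (score, start of previous segment) for the best
    # segmentation of sentences 0..j-1 whose last segment is [i, j).
    best = {}
    for j in range(1, n + 1):
        for i in range(j):
            w = mean_within(sim, i, j)
            if i == 0:
                best[(i, j)] = (w, None)  # a single segment covers 0..j-1
                continue
            # Try every possible previous segment [h, i).
            best[(i, j)] = max(
                (best[(h, i)][0] + w - mean_between(sim, h, i, j), h)
                for h in range(i)
            )
    # Choose the best final segment, then walk back through the stored links.
    start = max(range(n), key=lambda i: best[(i, n)][0])
    end = n
    boundaries = []
    while start > 0:
        boundaries.append(start)
        prev = best[(start, end)][1]
        start, end = prev, start
    return sorted(boundaries)
```

On a six-sentence toy matrix with two clearly separated topics of three sentences each (similarity 0.9 within a topic, 0.1 across topics), this sketch places a single boundary at index 3. The similarity helpers are recomputed naively for readability; prefix sums over the matrix would make each lookup constant time.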