This paper presents a new method for topic-based document segmentation, i.e., the identification of boundaries between parts of a document that address different topics. The method combines the Probabilistic Latent Semantic Analysis (PLSA) model with a procedure that selects segmentation points from the similarity values between pairs of adjacent blocks. PLSA gives a better representation of the sparse information in a text block, such as a sentence or a sequence of sentences. Segmentation performance improves further when different instantiations of the same model are combined, obtained either from different random initializations or from different numbers of latent classes. Results on commonly available data sets are significantly better than those of other state-of-the-art systems.
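The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the paper's actual procedure. It assumes that each text block has already been mapped to a vector of latent-class probabilities by a trained PLSA model (the training step is not shown), that adjacent blocks are compared with cosine similarity, that several model instantiations are combined by averaging their similarity curves, and that a boundary is placed at a local minimum lying below the mean minus half a standard deviation. The function names, the averaging rule, and the cutoff are all illustrative assumptions.

```python
import math


def cosine(p, q):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def adjacent_similarities(block_topics):
    """Similarity between each pair of adjacent blocks.

    block_topics: one latent-class probability vector per block
    (assumed to come from a PLSA model; hypothetical input).
    """
    return [
        cosine(block_topics[i], block_topics[i + 1])
        for i in range(len(block_topics) - 1)
    ]


def combine_models(models):
    """Average the similarity curves of several model instantiations.

    Averaging is an assumption; the models may differ in random
    initialization or in their number of latent classes.
    """
    curves = [adjacent_similarities(m) for m in models]
    return [sum(values) / len(values) for values in zip(*curves)]


def select_boundaries(sims, cutoff=None):
    """Return block indices that start a new segment.

    A gap is selected when its similarity is a local minimum and lies
    below the cutoff (default: mean - std / 2, an assumed heuristic).
    """
    if not sims:
        return []
    mean = sum(sims) / len(sims)
    std = math.sqrt(sum((s - mean) ** 2 for s in sims) / len(sims))
    if cutoff is None:
        cutoff = mean - std / 2
    boundaries = []
    for i, s in enumerate(sims):
        left = sims[i - 1] if i > 0 else float("inf")
        right = sims[i + 1] if i < len(sims) - 1 else float("inf")
        if s <= left and s <= right and s < cutoff:
            boundaries.append(i + 1)  # the gap lies before block i + 1
    return boundaries


# Toy data: six blocks, two latent classes, two model instantiations.
# The second model has its classes in the opposite order, as can happen
# with a different random initialization.
model_a = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15],
           [0.1, 0.9], [0.2, 0.8], [0.15, 0.85]]
model_b = [[0.2, 0.8], [0.1, 0.9], [0.15, 0.85],
           [0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]

sims = combine_models([model_a, model_b])
boundaries = select_boundaries(sims)
print(boundaries)
```

On this toy input the only selected gap is the one before the fourth block (index 3). Because the combination works on similarity curves rather than on the latent-class vectors themselves, models with different numbers of latent classes or permuted class labels can be mixed without any alignment step.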