Context and structure in automated full-text information access
Statistical Models for Text Segmentation
Machine Learning - Special issue on natural language learning
Probabilistic latent semantic indexing
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Domain-independent text segmentation using anisotropic diffusion and dynamic programming
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
A Dynamic Programming Algorithm for Linear Text Segmentation
Journal of Intelligent Information Systems
Augmented segmentation and visualization for presentation videos
Proceedings of the 13th annual ACM international conference on Multimedia
CUTS: CUrvature-based development pattern analysis and segmentation for blogs and other Text Streams
Proceedings of the seventeenth conference on Hypertext and hypermedia
Unsupervised learning of field segmentation models for information extraction
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Unsupervised topic modelling for multi-party spoken discourse
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Topic segmentation with shared topic detection and alignment of multiple documents
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Question-driven segmentation of lecture speech text: Towards intelligent e-learning systems
Journal of the American Society for Information Science and Technology
Modeling online reviews with multi-grain topic models
Proceedings of the 17th international conference on World Wide Web
Unsupervised methods of topical text segmentation for Polish
ACL '07 Proceedings of the Workshop on Balto-Slavonic Natural Language Processing: Information Extraction and Enabling Technologies
Word distributions for thematic segmentation in a support vector machine approach
CoNLL-X '06 Proceedings of the Tenth Conference on Computational Natural Language Learning
Hierarchical text segmentation from multi-scale lexical cohesion
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Locating case discussion segments in recorded medical team meetings
SSCS '09 Proceedings of the third workshop on Searching spontaneous conversational speech
Content modeling using latent permutations
Journal of Artificial Intelligence Research
Contextually-mediated semantic similarity graphs for topic segmentation
TextGraphs-5 Proceedings of the 2010 Workshop on Graph-based Methods for Natural Language Processing
Improved latent concept expansion using hierarchical markov random fields
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Multi-document topic segmentation
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
System for supporting web-based public debate using transcripts of face-to-face meeting
IEA/AIE'10 Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems - Volume Part III
Text segmentation: A topic modeling perspective
Information Processing and Management: an International Journal
Structural topic model for latent topical structure analysis
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
ACM Transactions on Speech and Language Processing (TSLP)
A multimodal discourse ontology for meeting understanding
MLMI'05 Proceedings of the Second international conference on Machine Learning for Multimodal Interaction
Linear text segmentation using affinity propagation
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Computational Intelligence
The nonverbal structure of patient case discussions in multidisciplinary medical team meetings
ACM Transactions on Information Systems (TOIS)
Modelling sequential text with an adaptive topic model
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Identifying event sequences using hidden Markov model
NLDB'07 Proceedings of the 12th international conference on Applications of Natural Language to Information Systems
Joint training of non-negative Tucker decomposition and discrete density hidden Markov models
Computer Speech and Language
An unsupervised topic segmentation model incorporating word order
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Commonsense-based topic modeling
Proceedings of the Second International Workshop on Issues of Sentiment Discovery and Opinion Mining
On handling textual errors in latent document modeling
Proceedings of the 22nd ACM international conference on information & knowledge management
Topic segmentation and labeling in asynchronous conversations
Journal of Artificial Intelligence Research
We present a novel probabilistic method for topic segmentation of unstructured text. One previous approach to this problem uses the hidden Markov model (HMM), a probabilistic model of sequence data [7]. The HMM treats a document as mutually independent sets of words generated by a latent topic variable in a time series. We extend this idea by embedding Hofmann's aspect model for text [5] into the segmenting HMM to form an aspect HMM (AHMM). In doing so, we provide an intuitive topical dependency between words and a cohesive segmentation model. We apply this method to segment unbroken streams of New York Times articles as well as noisy transcripts of radio programs on SpeechBot, an online audio archive indexed by an automatic speech recognition engine. We provide experimental comparisons that show that the AHMM outperforms the HMM for this task.
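To make the baseline idea concrete, the following is a minimal sketch of HMM-style topic segmentation, not the AHMM itself and not the authors' implementation. It assumes a text stream already split into fixed-size word windows, a small set of hypothetical per-topic unigram distributions (`topic_word_probs`), a single illustrative `stay_prob` transition parameter, and a `floor` probability for unseen words. None of these names or values come from the paper. Viterbi decoding assigns one latent topic per window, and a segment boundary is reported wherever the decoded topic changes.

```python
import math


def viterbi_segment(windows, topic_word_probs, stay_prob=0.9, floor=1e-6):
    """Decode the most likely topic per window; return (path, boundaries).

    windows: list of word lists (observations in time order).
    topic_word_probs: one dict per topic mapping word -> probability
        (hypothetical unigram models, assumed to be given).
    stay_prob: assumed probability of keeping the same topic between windows.
    floor: assumed probability for words a topic has never seen.
    """
    k = len(topic_word_probs)
    if k < 2:
        raise ValueError("need at least two topics")
    if not windows:
        return [], []

    log_stay = math.log(stay_prob)
    log_switch = math.log((1.0 - stay_prob) / (k - 1))

    def emission(window, topic):
        # Words in a window are treated as independent given the topic.
        probs = topic_word_probs[topic]
        return sum(math.log(probs.get(word, floor)) for word in window)

    def transition(prev, cur):
        return log_stay if prev == cur else log_switch

    # Uniform prior over the topic of the first window.
    scores = [math.log(1.0 / k) + emission(windows[0], z) for z in range(k)]
    backpointers = []

    for window in windows[1:]:
        new_scores, pointers = [], []
        for z in range(k):
            best_prev = max(range(k), key=lambda y: scores[y] + transition(y, z))
            pointers.append(best_prev)
            new_scores.append(
                scores[best_prev] + transition(best_prev, z) + emission(window, z)
            )
        scores = new_scores
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(k), key=lambda z: scores[z])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()

    # A boundary is the index of the first window of each new segment.
    boundaries = [i for i in range(1, len(path)) if path[i] != path[i - 1]]
    return path, boundaries


if __name__ == "__main__":
    # Toy, made-up topic models and text windows for illustration only.
    topics = [
        {"ball": 0.5, "team": 0.5},
        {"vote": 0.5, "law": 0.5},
    ]
    stream = [["ball", "team"], ["team", "ball"], ["vote", "law"], ["law", "vote"]]
    print(viterbi_segment(stream, topics))
```

In this sketch the topic distributions are fixed in advance. The abstract describes the AHMM as replacing this kind of per-topic word model with Hofmann's aspect model, which the sketch does not attempt to reproduce.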