Topic analysis using a finite mixture model

Authors:
Hang Li; Kenji Yamanishi
Affiliations:
NEC Corporation; NEC Corporation
Venue:
EMNLP '00 Proceedings of the 2000 Joint SIGDAT conference on Empirical methods in natural language processing and very large corpora: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 13
Year:
2000

Abstract

We address the problem of 'topic analysis': determining a text's topic structure, that is, which topics the text includes and how those topics change within it. We propose a novel approach to this problem, based on statistical modeling and learning. We represent topics by means of word clusters, and employ a finite mixture model to represent the word distribution within a text. Our experimental results indicate that our method significantly outperforms a method that combines existing techniques.
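
As a rough illustration of the finite mixture idea in the abstract, the sketch below models the word distribution of a text as P(w) = sum over topics k of P(k) * P(w | k) and estimates the topic weights P(k) with the standard EM update for mixture weights. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the word clusters are built or how parameters are estimated, so the per-topic distributions here are assumed to be given and fixed, and the function name `estimate_topic_weights` and the toy topics are hypothetical.

```python
from collections import Counter


def estimate_topic_weights(words, topic_word_probs, n_iter=50):
    """Estimate mixture weights P(k) by EM, holding P(w | k) fixed.

    words: list of tokens from one text (or one block of a text).
    topic_word_probs: dict mapping topic name -> dict of word -> P(w | k).
        These distributions are an assumption of this sketch; each one
        stands in for a word cluster that represents a topic.
    """
    topics = list(topic_word_probs)
    # Keep only words that at least one topic can generate.
    counts = Counter(
        w for w in words
        if any(topic_word_probs[k].get(w, 0.0) > 0.0 for k in topics)
    )
    if not counts:
        raise ValueError("no word in the text is covered by any topic")

    # Start from uniform weights.
    weights = {k: 1.0 / len(topics) for k in topics}
    for _ in range(n_iter):
        expected = {k: 0.0 for k in topics}
        for w, c in counts.items():
            # E-step: posterior probability of each topic given the word.
            joint = {k: weights[k] * topic_word_probs[k].get(w, 0.0)
                     for k in topics}
            z = sum(joint.values())
            if z == 0.0:
                continue  # word no longer explained by any weighted topic
            for k in topics:
                expected[k] += c * joint[k] / z
        # M-step: renormalize expected counts into new weights.
        total = sum(expected.values())
        weights = {k: expected[k] / total for k in topics}
    return weights


# Toy, made-up topics with disjoint vocabularies.
toy_topics = {
    "trade": {"trade": 0.5, "export": 0.3, "tariff": 0.2},
    "sports": {"game": 0.5, "team": 0.3, "score": 0.2},
}
toy_weights = estimate_topic_weights(
    ["trade", "export", "trade", "team"], toy_topics
)
```

Running the same estimate on successive blocks of a text and comparing the resulting weight vectors is one plausible way to look for the topic changes the abstract mentions; whether the paper does exactly this is not stated here.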