A Combined Latent Class and Trait Model for the Analysis and Visualization of Discrete Data

Authors:
Ata Kabán; Mark Girolami
Affiliations:
Univ. of Paisley, Paisley, Scotland; Univ. of Paisley, Paisley, Scotland
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
2001

Visualization

Abstract

We present a general framework for data analysis and visualization by means of topographic organization and clustering. Imposing distributional assumptions on the underlying latent factors makes the proposed model suitable for both visualization and clustering. The system noise is modeled in parametric form as a member of the exponential family of distributions, which allows us to handle different types of observables (continuous or discrete) in a unified framework. In this paper, we focus on discrete-case formulations which, in contrast to self-organizing methods for continuous data, imply variants of Bregman divergences as measures of dissimilarity between data and reference points and also define the matching nonlinear relation between latent and observable variables. The trait variant of the model can therefore be seen as a data-driven noisy nonlinear Independent Component Analysis, capable of revealing meaningful structure in multivariate observable data and visualizing it in two dimensions. The class variant of the model, which performs the clustering, carries out data-driven parametric mixture modeling. The combined (trait and class) model, along with the associated estimation procedures, allows us to interpret the visualization result in the sense of a topographic ordering. One important application of this work is the discovery of underlying semantic structure in text-based documents. Experimental results on various subsets of the 20 Newsgroups text corpus and on binary-coded digits data are given by way of demonstration.
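
As a rough illustration of the link the abstract draws between exponential-family noise models and Bregman divergences, the sketch below evaluates the generic Bregman divergence D(x, y) = phi(x) - phi(y) - <grad phi(y), x - y> for two generating functions. It is not taken from the paper: the function names and the example vectors are hypothetical, and it only shows the textbook fact that a Gaussian-type generator yields half the squared Euclidean distance, while a Bernoulli-type generator (negative binary entropy) yields the binary Kullback-Leibler divergence. The example values lie strictly between 0 and 1 so that no logarithm of zero occurs.

```python
import math


def bregman_divergence(phi, grad_phi, x, y):
    """Generic Bregman divergence: phi(x) - phi(y) - <grad phi(y), x - y>."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner


# Generator associated with Gaussian noise: phi(x) = 0.5 * ||x||^2.
def half_sq_norm(x):
    return 0.5 * sum(xi * xi for xi in x)


def half_sq_norm_grad(x):
    return list(x)


# Generator associated with Bernoulli noise: negative binary entropy,
# summed over dimensions. Inputs must lie strictly inside (0, 1).
def neg_bernoulli_entropy(p):
    return sum(pi * math.log(pi) + (1 - pi) * math.log(1 - pi) for pi in p)


def neg_bernoulli_entropy_grad(p):
    return [math.log(pi) - math.log(1 - pi) for pi in p]


def bernoulli_kl(p, q):
    """Binary KL divergence summed over dimensions, for comparison."""
    return sum(
        pi * math.log(pi / qi) + (1 - pi) * math.log((1 - pi) / (1 - qi))
        for pi, qi in zip(p, q)
    )


# Hypothetical data point x and reference point y (illustrative values only).
x = [0.9, 0.2, 0.6]
y = [0.7, 0.4, 0.5]

d_gauss = bregman_divergence(half_sq_norm, half_sq_norm_grad, x, y)
d_bern = bregman_divergence(neg_bernoulli_entropy, neg_bernoulli_entropy_grad, x, y)

print("Gaussian-type divergence:", d_gauss)
print("Bernoulli-type divergence:", d_bern)
print("Binary KL for comparison:", bernoulli_kl(x, y))
```

Swapping the generating function is the only change needed to move from a continuous to a discrete dissimilarity measure, which is one way to read the abstract's claim that both cases fit in a unified framework.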