We present a general framework for data analysis and visualization by means of topographic organization and clustering. Imposing distributional assumptions on the underlying latent factors makes the proposed model suitable for both visualization and clustering. The system noise is modeled in parametric form as a member of the exponential family of distributions, which allows us to handle different types of observables (continuous or discrete) in a unified framework. In this paper, we focus on discrete-case formulations which, in contrast to self-organizing methods for continuous data, imply variants of Bregman divergences as measures of dissimilarity between data and reference points, and also define the matching nonlinear relation between latent and observable variables. Therefore, the trait variant of the model can be seen as a data-driven noisy nonlinear Independent Component Analysis, which is capable of revealing meaningful structure in multivariate observable data and visualizing it in two dimensions. The class variant of the model, which performs the clustering, amounts to data-driven parametric mixture modeling. The combined (trait and class) model, along with the associated estimation procedures, allows us to interpret the visualization result in terms of a topographic ordering. One important application of this work is the discovery of underlying semantic structure in text-based documents. Experimental results on various subsets of the 20 Newsgroups text corpus and on binary-coded digit data are given by way of demonstration.
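To make the role of Bregman divergences concrete, the following minimal sketch assumes Bernoulli noise for binary observables, which is one member of the exponential family; the abstract itself does not fix a particular distribution. Under that assumption, the Bregman divergence generated by the negative Bernoulli entropy reduces, for binary data, to the negative Bernoulli log-likelihood (cross-entropy) between a data vector and a reference point. The function names, the sample vector `x`, and the two reference points are hypothetical and chosen only for illustration.

```python
import math

def _xlogx(t):
    # Convention: 0 * log 0 = 0, so binary inputs are handled safely.
    return t * math.log(t) if t > 0.0 else 0.0

def phi(p):
    """Convex generator: negative Bernoulli entropy, summed over dimensions."""
    return sum(_xlogx(pi) + _xlogx(1.0 - pi) for pi in p)

def grad_phi(p):
    """Gradient of phi: the logit of each coordinate (requires 0 < p_i < 1)."""
    return [math.log(pi / (1.0 - pi)) for pi in p]

def bregman(x, mu):
    """D_phi(x, mu) = phi(x) - phi(mu) - <grad phi(mu), x - mu>."""
    g = grad_phi(mu)
    inner = sum(gi * (xi - mi) for gi, xi, mi in zip(g, x, mu))
    return phi(x) - phi(mu) - inner

def bernoulli_nll(x, mu):
    """Negative Bernoulli log-likelihood of binary x under means mu."""
    return -sum(xi * math.log(mi) + (1 - xi) * math.log(1.0 - mi)
                for xi, mi in zip(x, mu))

def nearest_reference(x, references):
    """Index of the reference point with the smallest divergence from x."""
    return min(range(len(references)), key=lambda k: bregman(x, references[k]))

# Hypothetical binary data vector and two hypothetical reference points
# (Bernoulli means strictly inside (0, 1)).
x = [1, 0, 1, 1]
references = [[0.9, 0.1, 0.8, 0.7], [0.2, 0.8, 0.3, 0.1]]

winner = nearest_reference(x, references)
gap = abs(bregman(x, references[0]) - bernoulli_nll(x, references[0]))
```

Here `winner` selects the reference point closest to `x` in the divergence sense, playing the role that squared Euclidean distance plays in self-organizing methods for continuous data, and `gap` checks numerically that the divergence and the negative log-likelihood coincide for binary inputs.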