This paper presents a probabilistic mixture modeling framework for the hierarchic organisation of document collections. The probabilistic corpus model that emerges from the unsupervised hierarchic organisation of a document collection can be further exploited to create a kernel that boosts the performance of state-of-the-art support vector machine document classifiers. Classifier performance improves further when the kernel is derived from an appropriate hierarchic mixture model of the corpus rather than from a flat, non-hierarchic mixture model. This has important implications for document classification when a hierarchic ordering of topics exists. The approach can be viewed as an effective combination of unlabeled documents (those with no topic or class labels), labeled documents, and prior domain knowledge (the known hierarchic structure) to enhance document classification performance.
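To make the idea of a kernel derived from a mixture model concrete, the sketch below shows one simple possibility. It is an illustration under stated assumptions, not the kernel used in the paper, which the abstract does not specify. It assumes a flat mixture of unigrams with fixed, hand-picked parameters, and it measures document similarity as the overlap of topic posteriors. The names `topic_posterior`, `posterior_kernel`, and all toy parameters are hypothetical.

```python
import math

def topic_posterior(counts, mix_weights, topic_word_probs):
    """Return P(topic | document) under an assumed mixture-of-unigrams model.

    counts: dict mapping word -> count in the document.
    mix_weights: list of prior topic probabilities.
    topic_word_probs: one dict per topic, mapping word -> P(word | topic).
    """
    log_scores = []
    for weight, word_probs in zip(mix_weights, topic_word_probs):
        log_p = math.log(weight)
        for word, n in counts.items():
            log_p += n * math.log(word_probs[word])
        log_scores.append(log_p)
    # Normalise in log space to avoid underflow on long documents.
    top = max(log_scores)
    unnorm = [math.exp(s - top) for s in log_scores]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def posterior_kernel(doc_a, doc_b, mix_weights, topic_word_probs):
    """Inner product of the two topic-posterior vectors (an illustrative choice)."""
    post_a = topic_posterior(doc_a, mix_weights, topic_word_probs)
    post_b = topic_posterior(doc_b, mix_weights, topic_word_probs)
    return sum(a * b for a, b in zip(post_a, post_b))

# Toy, hand-picked parameters (hypothetical; a real model would be fitted by EM).
MIX = [0.5, 0.5]
TOPICS = [
    {"kernel": 0.4, "svm": 0.4, "goal": 0.1, "match": 0.1},  # "machine learning" topic
    {"kernel": 0.1, "svm": 0.1, "goal": 0.4, "match": 0.4},  # "sport" topic
]

ml_doc = {"kernel": 2, "svm": 1}
ml_doc_2 = {"svm": 2, "kernel": 1}
sport_doc = {"goal": 2, "match": 1}

same_topic = posterior_kernel(ml_doc, ml_doc_2, MIX, TOPICS)
cross_topic = posterior_kernel(ml_doc, sport_doc, MIX, TOPICS)
print(f"same-topic similarity:  {same_topic:.4f}")
print(f"cross-topic similarity: {cross_topic:.4f}")
```

Because the kernel is an inner product of explicit feature vectors, it is symmetric and positive semi-definite, so a matrix of its values could in principle be handed to any SVM implementation that accepts a precomputed kernel. A hierarchic variant would replace the flat list of topics with posteriors over nodes of a topic tree; that extension is not sketched here.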