A language modeling approach to information retrieval
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Probabilistic latent semantic indexing
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Exploiting generative models in discriminative classifiers
Proceedings of the 1998 conference on Advances in neural information processing systems II
Unsupervised learning by probabilistic latent semantic analysis
Machine Learning
Document language models, query models, and risk minimization for information retrieval
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
A Probabilistic Framework for the Hierarchic Organisation and Classification of Document Collections
Journal of Intelligent Information Systems
A Hierarchical Model for Clustering and Categorising Documents
Proceedings of the 24th BCS-IRSG European Colloquium on IR Research: Advances in Information Retrieval
UAI '01 Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence
The Journal of Machine Learning Research
Web usage mining based on probabilistic latent semantic analysis
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Probabilistic author-topic models for information discovery
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
PLSA-based image auto-annotation: constraining the latent space
Proceedings of the 12th annual ACM international conference on Multimedia
Modeling Scenes with Local Descriptors and Latent Aspects
ICCV '05 Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1 - Volume 01
Insights from Viewing Ranked Retrieval as Rank Aggregation
WIRI '05 Proceedings of the International Workshop on Challenges in Web Information Retrieval and Integration
The rate adapting Poisson model for information retrieval and object recognition
ICML '06 Proceedings of the 23rd international conference on Machine learning
A mixture model for contextual text mining
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Modeling Semantic Aspects for Cross-Media Image Indexing
IEEE Transactions on Pattern Analysis and Machine Intelligence
Bayesian Folding-In with Dirichlet Kernels for PLSI
ICDM '07 Proceedings of the 2007 Seventh IEEE International Conference on Data Mining
Statistical Language Models for Information Retrieval: A Critical Review
Foundations and Trends in Information Retrieval
Revisiting Fisher kernels for document similarities
ECML'06 Proceedings of the 17th European conference on Machine Learning
ECCV'06 Proceedings of the 9th European conference on Computer Vision - Volume Part IV
Investigating Topic Models' Capabilities in Expression Microarray Data Classification
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Textual Similarity with a Bag-of-Embedded-Words Model
Proceedings of the 2013 Conference on the Theory of Information Retrieval
The Probabilistic Latent Semantic Indexing (PLSI) model, introduced by T. Hofmann (1999), has found applications in numerous fields, notably document classification and information retrieval. In this context, the Fisher kernel was found to be an appropriate document similarity measure. However, the kernels published so far contain unjustified features, some of which hinder their performance. Furthermore, PLSI is not generative for unknown documents, a shortcoming usually remedied by "folding them in" to the PLSI parameter space. This paper addresses both points by (1) introducing a new, rigorous development of the Fisher kernel for PLSI, addressing the role of the Fisher Information Matrix, and uncovering its relation to the kernels proposed so far; and (2) proposing a novel and theoretically sound document similarity measure, which avoids the problem of "folding in" unknown documents. For both aspects, experimental results are provided on several information retrieval evaluation sets.
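
For readers unfamiliar with the "folding in" step mentioned above, the following is a minimal sketch of how it is commonly described for PLSI: the topic-word distributions P(w | z) learned on the training collection are held fixed, and only the topic mixture P(z | q) of the unseen document q is re-estimated by EM. The function name `fold_in`, its parameters, and the toy numbers are assumptions made for illustration; this is not the paper's code, nor the similarity measure the paper proposes.

```python
def fold_in(counts, p_w_given_z, n_iter=50):
    """Estimate P(z | q) for an unseen document q, keeping P(w | z) fixed.

    counts       -- dict mapping word index w to its count n(q, w) in q
    p_w_given_z  -- list of rows, one per topic z; row z holds P(w | z)
    n_iter       -- number of EM iterations (hypothetical default)
    """
    n_topics = len(p_w_given_z)
    p_z = [1.0 / n_topics] * n_topics  # uniform initial mixture
    for _ in range(n_iter):
        new_p_z = [0.0] * n_topics
        for w, n_w in counts.items():
            # E-step: posterior P(z | w, q) is proportional to P(w | z) P(z | q)
            joint = [p_w_given_z[z][w] * p_z[z] for z in range(n_topics)]
            norm = sum(joint)
            if norm == 0.0:
                continue  # word has zero probability under every topic
            # M-step accumulation: weight each posterior by the word count
            for z in range(n_topics):
                new_p_z[z] += n_w * joint[z] / norm
        total = sum(new_p_z)
        if total == 0.0:
            break  # nothing to update; keep the previous mixture
        p_z = [v / total for v in new_p_z]
    return p_z


# Toy example (assumed values): two topics over a two-word vocabulary.
topics = [[0.9, 0.1], [0.1, 0.9]]
mixture = fold_in({0: 10}, topics)  # document made only of word 0
```

In this toy setting the mixture concentrates on the first topic, since the document contains only the word that topic favours. The sketch also shows why folding in sits awkwardly with a generative reading of the model: the new document's parameters are fitted after the fact rather than drawn from the trained model.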