Revisiting fisher kernels for document similarities

Authors:
Martin Nyffenegger;Jean-Cédric Chappelier;Éric Gaussier
Affiliations:
Ecole Polytechnique Fédérale de Lausanne, Switzerland;Ecole Polytechnique Fédérale de Lausanne, Switzerland;Xerox Research Center Europe, Meylan, France
Venue:
ECML'06 Proceedings of the 17th European conference on Machine Learning
Year:
2006

Citing 8
Cited 2

Probabilistic latent semantic indexing

Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Exploiting generative models in discriminative classifiers

Proceedings of the 1998 conference on Advances in neural information processing systems II
Text Classification from Labeled and Unlabeled Documents using EM

Machine Learning - Special issue on information retrieval
Modern Information Retrieval

Modern Information Retrieval
Introduction to Modern Information Retrieval

Introduction to Modern Information Retrieval
A Probabilistic Framework for the Hierarchic Organisation and Classification of Document Collections

Journal of Intelligent Information Systems
Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval

ECML '98 Proceedings of the 10th European Conference on Machine Learning
Web usage mining based on probabilistic latent semantic analysis

Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining

An analysis of active learning strategies for sequence labeling tasks

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
PLSI: The True Fisher Kernel and beyond

ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper presents a new metric to compute similarities between textual documents, based on the Fisher information kernel as proposed by T. Hofmann. By considering a new point-of-view on the embedding vector space and proposing a more appropriate way of handling the Fisher information matrix, we derive a new form of the kernel that yields significant improvements on an information retrieval task. We apply our approach to two different models: Naive Bayes and PLSI.