Positive definite kernels on probability measures have recently been applied to classification problems involving text, images, and other types of structured data. Some of these kernels are related to classic information-theoretic quantities, such as (Shannon's) mutual information and the Jensen-Shannon (JS) divergence. Meanwhile, there have been recent advances in nonextensive generalizations of Shannon's information theory. This paper bridges these two trends by introducing nonextensive information-theoretic kernels on probability measures, based on new JS-type divergences. These new divergences result from extending the two building blocks of the classical JS divergence: convexity and Shannon's entropy. The notion of convexity is extended to the wider concept of q-convexity, for which we prove a Jensen q-inequality. Based on this inequality, we introduce Jensen-Tsallis (JT) q-differences, a nonextensive generalization of the JS divergence, and define a k-th order JT q-difference between stochastic processes. We then define a new family of nonextensive mutual information kernels, which allow weights to be assigned to their arguments, and which include the Boolean, JS, and linear kernels as particular cases. We also define nonextensive string kernels that generalize the p-spectrum kernel. We illustrate the performance of these kernels on text categorization tasks, in which documents are modeled both as bags of words and as sequences of characters.
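To make the central construction concrete, the following is a minimal sketch (not the paper's reference implementation) of the Tsallis q-entropy and a JT q-difference with equal weights (1/2, 1/2) on discrete distributions. It assumes the convention in which the weights are raised to the power q; at q = 1 the Tsallis entropy reduces to the Shannon entropy (in nats) and the JT q-difference reduces to the classical JS divergence. Function names are illustrative, not taken from the paper.

```python
import numpy as np

def tsallis_entropy(p, q):
    """Tsallis q-entropy S_q(p) = (1 - sum_i p_i^q) / (q - 1).

    As q -> 1 this recovers the Shannon entropy in nats.
    """
    p = np.asarray(p, dtype=float)
    if abs(q - 1.0) < 1e-12:
        nz = p[p > 0]  # 0 * log(0) is taken as 0
        return float(-np.sum(nz * np.log(nz)))
    return float((1.0 - np.sum(p ** q)) / (q - 1.0))

def jensen_tsallis_q_difference(p1, p2, q):
    """JT q-difference with equal weights:

        T_q(p1, p2) = S_q((p1 + p2) / 2) - (1/2)^q * (S_q(p1) + S_q(p2)).

    For q = 1 this is the Jensen-Shannon divergence. Note that for
    q != 1 it need not vanish on identical arguments, which is why it
    is called a q-"difference" rather than a divergence.
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    m = 0.5 * (p1 + p2)  # the (equal-weight) mixture distribution
    return tsallis_entropy(m, q) - (0.5 ** q) * (
        tsallis_entropy(p1, q) + tsallis_entropy(p2, q)
    )
```

For example, two disjoint point masses at q = 1 give the maximal JS value log 2, while identical distributions give 0; the weighted, asymmetric variants described in the abstract would replace the fixed 1/2 weights with a general weight vector.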