We propose a framework for analyzing and comparing distributions, which we use to construct statistical tests to determine whether two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS), and is called the maximum mean discrepancy (MMD). We present two distribution-free tests based on large deviation bounds for the MMD, and a third test based on the asymptotic distribution of this statistic. The MMD can be computed in quadratic time, although efficient linear-time approximations are available. Our statistic is an instance of an integral probability metric, and various classical metrics on distributions are obtained when alternative function classes are used in place of an RKHS. We apply our two-sample tests to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where they perform strongly. Excellent performance is also obtained when comparing distributions over graphs, for which these are the first such tests.
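To illustrate the quadratic-time statistic described above, here is a minimal sketch of the biased squared-MMD estimate between two samples, using a Gaussian (RBF) kernel. The kernel choice, the bandwidth parameter `gamma`, and the function names are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian kernel matrix: k(a, b) = exp(-gamma * ||a - b||^2)
    sq_dists = (np.sum(A**2, axis=1)[:, None]
                + np.sum(B**2, axis=1)[None, :]
                - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def mmd2_biased(X, Y, gamma=1.0):
    # Biased estimate of squared MMD: the mean kernel similarity within
    # each sample minus twice the mean similarity across samples.
    # Cost is quadratic in the sample sizes (three kernel matrices).
    k_xx = rbf_kernel(X, X, gamma).mean()
    k_yy = rbf_kernel(Y, Y, gamma).mean()
    k_xy = rbf_kernel(X, Y, gamma).mean()
    return k_xx + k_yy - 2.0 * k_xy
```

As a sanity check, the statistic is near zero when both samples come from the same distribution and grows as the distributions separate; a practical test would compare it against a threshold from a large-deviation bound or the statistic's asymptotic null distribution, as the abstract describes.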