Motivation: Many problems in data integration in bioinformatics can be posed as one common question: are two sets of observations generated by the same distribution? We propose a kernel-based statistical test for this problem, based on the fact that two distributions are different if and only if there exists at least one function having different expectations on the two distributions. Consequently, we use the maximum discrepancy between function means as the basis of a test statistic. The Maximum Mean Discrepancy (MMD) can take advantage of the kernel trick, which allows us to apply it not only to vectors but also to strings, sequences, graphs, and other common structured data types arising in molecular biology.

Results: We study the practical feasibility of an MMD-based test on three central data integration tasks: testing cross-platform comparability of microarray data, cancer diagnosis, and data-content based schema matching for two different protein function classification schemas. In all of these experiments, including high-dimensional ones, MMD is very accurate in finding samples that were generated from the same distribution, and outperforms its best competitors.

Conclusions: We have defined a novel statistical test of whether two samples are from the same distribution, compatible with both multivariate and structured data, that is fast, easy to implement, and works well, as confirmed by our experiments.

Availability:

Contact: kb@dbs.ifi.lmu.de
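
To make the test statistic concrete, here is a minimal sketch of an MMD-based two-sample test on vector data. It is not the authors' implementation. The Gaussian kernel, the bandwidth `sigma`, and the function names `gaussian_kernel`, `mmd2_biased`, and `mmd_permutation_test` are assumptions chosen for illustration. The significance threshold is obtained with a generic permutation test, swapped in here for simplicity; the abstract does not say how the test threshold is computed. The statistic is the standard biased empirical estimate of squared MMD: the mean of k(x, x') plus the mean of k(y, y') minus twice the mean of k(x, y).

```python
import numpy as np


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel matrix between rows of a and rows of b.

    The kernel choice and bandwidth are illustrative assumptions.
    """
    # Pairwise squared Euclidean distances via the expansion
    # |u - v|^2 = |u|^2 + |v|^2 - 2 u.v
    sq = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def mmd2_biased(x, y, sigma=1.0):
    """Biased empirical estimate of squared MMD between samples x and y."""
    kxx = gaussian_kernel(x, x, sigma)
    kyy = gaussian_kernel(y, y, sigma)
    kxy = gaussian_kernel(x, y, sigma)
    return kxx.mean() + kyy.mean() - 2.0 * kxy.mean()


def mmd_permutation_test(x, y, sigma=1.0, n_perm=200, seed=0):
    """Return (statistic, p-value) using a permutation null distribution.

    The permutation threshold is a generic substitute, not necessarily
    the thresholding procedure used in the paper.
    """
    rng = np.random.default_rng(seed)
    observed = mmd2_biased(x, y, sigma)
    pooled = np.vstack([x, y])
    m = len(x)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        # Re-split the pooled data at random and recompute the statistic.
        stat = mmd2_biased(pooled[idx[:m]], pooled[idx[m:]], sigma)
        if stat >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = rng.normal(0.0, 1.0, size=(40, 2))
    y_same = rng.normal(0.0, 1.0, size=(40, 2))
    y_shifted = rng.normal(3.0, 1.0, size=(40, 2))
    print("same distribution:   ", mmd_permutation_test(x, y_same))
    print("shifted distribution:", mmd_permutation_test(x, y_shifted))
```

A small p-value suggests that the two samples come from different distributions. For structured data such as strings or graphs, `gaussian_kernel` would be replaced by a kernel defined on that data type, while the rest of the sketch would stay the same.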