Automatic Choice of Control Measurements

Authors:
Gayle Leen;David R. Hardoon;Samuel Kaski
Affiliations:
Department of Information and Computer Science, Helsinki University of Technology, Finland FIN-02015;Dept. of Computer Science, University College London, London, U.K. WC1E 6BT;Department of Information and Computer Science, Helsinki University of Technology, Finland FIN-02015
Venue:
ACML '09 Proceedings of the 1st Asian Conference on Machine Learning: Advances in Machine Learning
Year:
2009

Abstract

In experimental design, a standard approach for distinguishing experimentally induced effects from unwanted effects is to design control measurements that differ from the experimental measurements only in the experimentally induced effects. However, in some cases it may be problematic to design and measure controls specifically for an experiment. In this paper, we investigate the possibility of learning to choose suitable controls from a database of potential controls, which differ in their degree of relevance to the experiment. This approach is especially relevant in bioinformatics, where experimental studies are predominantly small-scale while vast amounts of biological measurements are becoming increasingly available. We focus on finding controls for differential gene expression studies (case vs. control) of various cancers. In this situation, the ideal control would be a healthy sample from the same tissue (the same mixture of cells as the tumor tissue), measured under the same conditions except for cancer-specific effects. Such a sample is almost impossible to obtain in practice. We formulate the problem of learning to choose the control in a Gaussian process classification framework, as a novel paired multitask learning problem. The similarities between the underlying set of classifiers are learned from the set of control tissue gene expression profiles.
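
The idea of coupling several case-vs-control classifiers through similarities between control profiles can be illustrated with a small sketch. This is not the authors' model: it assumes a product kernel (task similarity times input similarity), computes the task similarity with an RBF kernel on hypothetical control profiles, and replaces Gaussian process classification with regularized least squares on ±1 labels. All names, sizes, and hyperparameters below are illustrative assumptions, and the data are synthetic.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def multitask_kernel(X1, t1, X2, t2, K_task, lengthscale=1.0):
    """Assumed product kernel: K_task[t, t'] * k(x, x')."""
    return K_task[np.ix_(t1, t2)] * rbf_kernel(X1, X2, lengthscale)

rng = np.random.default_rng(0)
n_tasks, n_genes, n_per_task = 3, 5, 20

# Hypothetical control profiles, one per candidate control tissue (task).
controls = rng.normal(size=(n_tasks, n_genes))
# Task similarity derived from the control profiles (an assumption).
K_task = rbf_kernel(controls, controls, lengthscale=3.0)

# Synthetic expression data: every task separates case (+1) from control (-1).
X = rng.normal(size=(n_tasks * n_per_task, n_genes))
t = np.repeat(np.arange(n_tasks), n_per_task)
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=X.shape[0]))

# Least-squares stand-in for GP classification: solve (K + noise * I) alpha = y.
K = multitask_kernel(X, t, X, t, K_task)
alpha = np.linalg.solve(K + 0.1 * np.eye(len(y)), y)

def predict(X_new, t_new):
    """Predicted class sign for samples X_new belonging to tasks t_new."""
    return np.sign(multitask_kernel(X_new, t_new, X, t, K_task) @ alpha)

train_acc = float(np.mean(predict(X, t) == y))
print("training accuracy:", train_acc)
```

In this sketch, tasks whose control profiles are similar share more information, because the corresponding entries of `K_task` are closer to 1. How the paper pairs cases with controls and learns the similarities is not specified in the abstract, so that part is left out.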