A statistical framework for genomic data fusion

Authors:
Gert R. G. Lanckriet;Tijl De Bie;Nello Cristianini;Michael I. Jordan;William Stafford Noble
Affiliations:
Department of Electrical Engineering and Computer Science;Department of Electrical Engineering, ESAT-SCD, Katholieke Universiteit Leuven 3001, Belgium;Department of Statistics, University of California, Davis 95618, USA;Division of Computer Science, Department of Statistics, University of California, Berkeley 94720, USA;Department of Genome Sciences, University of Washington, Seattle 98195, USA
Venue:
Bioinformatics
Year:
2004

Citing 0
Cited 66

Optimal kernel selection in Kernel Fisher discriminant analysis

ICML '06 Proceedings of the 23rd international conference on Machine learning
Efficient Margin Maximizing with Boosting

The Journal of Machine Learning Research
Machine learning methods for transcription data integration

IBM Journal of Research and Development - Systems biology
Brief communication: Integrating subcellular location for improving machine learning models of remote homology detection in eukaryotic organisms

Computational Biology and Chemistry
Large Scale Multiple Kernel Learning

The Journal of Machine Learning Research
More efficiency in multiple kernel learning

Proceedings of the 24th international conference on Machine learning
Discriminant kernel and regularization parameter learning via semidefinite programming

Proceedings of the 24th international conference on Machine learning
Multiclass multiple kernel learning

Proceedings of the 24th international conference on Machine learning
Learning the kernel matrix in discriminant analysis via quadratically constrained quadratic programming

Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Kernel-based learning for biomedical relation extraction

Journal of the American Society for Information Science and Technology
Localized multiple kernel learning

Proceedings of the 25th international conference on Machine learning
Multi-class Discriminant Kernel Learning via Convex Programming

The Journal of Machine Learning Research
Consistency of the Group Lasso and Multiple Kernel Learning

The Journal of Machine Learning Research
Heterogeneous data fusion for Alzheimer's disease study

Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Multi-stream Fusion for Speaker Classification

Speaker Classification I
An Automated Combination of Kernels for Predicting Protein Subcellular Localization

WABI '08 Proceedings of the 8th international workshop on Algorithms in Bioinformatics
Protein functional class prediction with a combined graph

Expert Systems with Applications: An International Journal
Gene Clustering via Integrated Markov Models Combining Individual and Pairwise Features

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Beyond clustering of array expressions

International Journal of Bioinformatics Research and Applications
A Study of Semi-supervised Generative Ensembles

MCS '09 Proceedings of the 8th International Workshop on Multiple Classifier Systems
Evolutionary Optimization of Kernel Weights Improves Protein Complex Comembership Prediction

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Margin and Radius Based Multiple Kernel Learning

ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
Managing Knowledge in Light of Its Evolution Process: An Empirical Study on Citation Network-Based Patent Classification

Journal of Management Information Systems
Robust label propagation on multiple networks

IEEE Transactions on Neural Networks
Learning the optimal neighborhood kernel for classification

IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Comparing early and late data fusion methods for gene function prediction

Proceedings of the 2009 conference on Neural Nets WIRN09: Proceedings of the 19th Italian Workshop on Neural Nets, Vietri sul Mare, Salerno, Italy, May 28-30 2009
Multiple Kernel Learning of Environmental Data. Case Study: Analysis and Mapping of Wind Fields

ICANN '09 Proceedings of the 19th International Conference on Artificial Neural Networks: Part II
Data-Fusion in Clustering Microarray Data: Balancing Discovery and Interpretability

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Letters: Integration of heterogeneous data sources for gene function prediction using decision templates and ensembles of learning machines

Neurocomputing
Cost-conscious multiple kernel learning

Pattern Recognition Letters
Using the Gene Ontology hierarchy when predicting gene function

UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Use of MKL as symbol classifier for Gujarati character recognition

DAS '10 Proceedings of the 9th IAPR International Workshop on Document Analysis Systems
Gene function prediction with gene interaction networks: a context graph kernel approach

IEEE Transactions on Information Technology in Biomedicine
A Novel Regularization Learning for Single-View Patterns: Multi-View Discriminative Regularization

Neural Processing Letters
Multi-network fusion for collective inference

Proceedings of the Eighth Workshop on Mining and Learning with Graphs
Multi-model classification method in heterogeneous image databases

Pattern Recognition
Regularizing multiple kernel learning using response surface methodology

Pattern Recognition
Robust prediction from multiple heterogeneous data sources with partial information

CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Data fusion and feature selection for Alzheimer's diagnosis

BI'10 Proceedings of the 2010 international conference on Brain informatics
Learning a combination of heterogeneous dissimilarities from incomplete knowledge

ICANN'10 Proceedings of the 20th international conference on Artificial neural networks: Part III
Correlation features and a linear transform specific reproducing kernel

TSD'10 Proceedings of the 13th international conference on Text, speech and dialogue
Multiple kernel learning for image indexing

Proceedings of the Seventh Indian Conference on Computer Vision, Graphics and Image Processing
Design of a multiple kernel learning algorithm for LS-SVM by convex programming

Neural Networks
Multiple Kernel Learning Algorithms

The Journal of Machine Learning Research
Learning bounds for support vector machines with learned kernels

COLT'06 Proceedings of the 19th annual conference on Learning Theory
Improved modeling of clinical data with kernel methods

Artificial Intelligence in Medicine
Prediction of protein complexes based on protein interaction data and functional annotation data using kernel methods

ICIC'06 Proceedings of the 2006 international conference on Computational Intelligence and Bioinformatics - Volume Part III
A Bayesian integration model for improved gene functional inference from heterogeneous data sources

Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine
Weighted kernel Fisher discriminant analysis for integrating heterogeneous data

Computational Statistics & Data Analysis
Learning interpretable SVMs for biological sequence classification

RECOMB'05 Proceedings of the 9th Annual international conference on Research in Computational Molecular Biology
Optimization with Sparsity-Inducing Penalties

Foundations and Trends® in Machine Learning
Predicting Protein Function by Multi-Label Correlated Semi-Supervised Learning

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Transductive multi-label ensemble classification for protein function prediction

Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Learning the coordinate gradients

Advances in Computational Mathematics
Video analysis based on Multi-Kernel Representation with automatic parameter choice

Neurocomputing
Synergistic effect of different levels of genomic data for cancer clinical outcome prediction

Journal of Biomedical Informatics
Separable approximate optimization of support vector machines for distributed sensing

ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Simultaneous learning of localized multiple kernels and classifier with weighted regularization

SSPR'12/SPR'12 Proceedings of the 2012 Joint IAPR international conference on Structural, Syntactic, and Statistical Pattern Recognition
Multi-source learning with block-wise missing data for Alzheimer's disease prediction

Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Multiple Kernel Learning with Fisher Kernels for High Frequency Currency Prediction

Computational Economics
A novel multiple Nyström-approximating kernel discriminant analysis

Neurocomputing
Texture classification using kernel-based techniques

IWANN'13 Proceedings of the 12th international conference on Artificial Neural Networks: advances in computational intelligence - Volume Part I
Protein function prediction by integrating multiple kernels

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Off-line hand written input based identity determination using multi kernel feature combination

Pattern Recognition Letters
Protein Function Prediction using Multi-label Ensemble Classification

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Analytic center cutting plane method for multiple kernel learning

Annals of Mathematics and Artificial Intelligence

Quantified Score

Hi-index	3.84

Abstract

Motivation: During the past decade, the new focus on genomics has highlighted a particular challenge: to integrate the different views of the genome that are provided by various types of experimental data.

Results: This paper describes a computational framework for integrating and drawing inferences from a collection of genome-wide measurements. Each dataset is represented via a kernel function, which defines generalized similarity relationships between pairs of entities, such as genes or proteins. The kernel representation is both flexible and efficient, and can be applied to many different types of data. Furthermore, kernel functions derived from different types of data can be combined in a straightforward fashion. Recent advances in the theory of kernel methods have provided efficient algorithms to perform such combinations in a way that minimizes a statistical loss function. These methods exploit semidefinite programming techniques to reduce the problem of finding optimizing kernel combinations to a convex optimization problem. Computational experiments performed using yeast genome-wide datasets, including amino acid sequences, hydropathy profiles, gene expression data and known protein-protein interactions, demonstrate the utility of this approach. A statistical learning algorithm trained from all of these data to recognize particular classes of proteins (membrane proteins and ribosomal proteins) performs significantly better than the same algorithm trained on any single type of data.

Availability: Supplementary data at http://noble.gs.washington.edu/proj/sdp-svm
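
The abstract notes that kernels derived from different data types can be combined in a straightforward fashion. The following is a minimal sketch of that idea only, not the paper's method: it forms a weighted sum of two Gram matrices with hand-picked nonnegative weights, whereas the paper learns the weights by semidefinite programming. The toy feature values, the weights 0.7 and 0.3, the `gamma` parameter and all function names are assumptions made for illustration.

```python
import math

def linear_kernel(X):
    """Gram matrix of inner products between the rows of X."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def rbf_kernel(X, gamma=0.5):
    """Gram matrix of Gaussian similarities exp(-gamma * ||xi - xj||^2)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj)))
             for xj in X] for xi in X]

def combine_kernels(kernels, weights):
    """Weighted sum of Gram matrices over the same entities.

    Nonnegative weights keep the sum positive semidefinite, so the
    result is itself a valid kernel matrix.
    """
    if len(kernels) != len(weights):
        raise ValueError("need one weight per kernel")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative")
    n = len(kernels[0])
    return [[sum(w * K[i][j] for K, w in zip(kernels, weights))
             for j in range(n)] for i in range(n)]

def quadratic_form(K, v):
    """Compute v^T K v, which is >= 0 for every v when K is PSD."""
    n = len(v)
    return sum(v[i] * K[i][j] * v[j] for i in range(n) for j in range(n))

# Two hypothetical "views" of the same three proteins (made-up numbers).
expression = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
hydropathy = [[0.2], [0.25], [0.9]]

K_expr = linear_kernel(expression)
K_hydro = rbf_kernel(hydropathy)

# Fixed weights chosen by hand for illustration (an assumption).
K = combine_kernels([K_expr, K_hydro], [0.7, 0.3])
```

The combined matrix `K` could then be passed to any kernel-based learner that accepts a precomputed Gram matrix. In this toy data, proteins 0 and 1 are similar in both views, so `K[0][1]` is larger than `K[0][2]`.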