Protein function prediction by integrating multiple kernels

Authors:
Guoxian Yu;Huzefa Rangwala;Carlotta Domeniconi;Guoji Zhang;Zili Zhang
Affiliations:
School of Computer Sci. and Eng., South China University of Technology, Guangzhou, China and School of Computer and Information Science, Southwest University, Chongqing, China;Department of Computer Science, George Mason University, VA;Department of Computer Science, George Mason University, VA;School of Sciences, South China University of Technology, Guangzhou, China;School of Computer and Information Science, Southwest University, Chongqing, China
Venue:
IJCAI'13: Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence
Year:
2013


Abstract

Determining protein function requires integrating information derived from several heterogeneous high-throughput experiments. To use the information spread across multiple sources jointly, these data sources are transformed into kernels. Several protein function prediction methods follow a two-phase approach: they first optimize the weights on individual kernels to produce a composite kernel, and then train a classifier on the composite kernel. Such methods yield an optimal composite kernel, but not necessarily an optimal classifier. Other methods optimize the loss of binary classifiers and learn the weights for the different kernels iteratively. Because a protein has multiple functions, and each function can be viewed as a label, these methods optimize the weights on the input kernels separately for each label, which is computationally expensive and ignores inter-label correlations. In this paper, we propose a method called Protein Function Prediction by Integrating Multiple Kernels (ProMK). Using a combined objective function, ProMK iterates between learning the kernel weights and reducing the empirical loss of a multi-label classifier, handling all labels simultaneously. ProMK can assign larger weights to smooth kernels and downgrade the weights on noisy kernels. We evaluate the ability of ProMK to predict protein function on several standard benchmarks. We show that our approach performs better than previously proposed protein function prediction approaches that integrate data from multiple networks, and better than multi-label multiple kernel learning methods.
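
The abstract does not state ProMK's objective function, so the following is only a rough sketch of the general idea it describes: alternating between fitting a multi-label classifier on a weighted composite kernel and re-weighting the kernels so that smoother ones count more. The objective, the function names (`normalized_laplacian`, `fit_weighted_kernels`), the parameters `lam` and `r`, and the toy data are all assumptions made for illustration. They are not taken from the paper.

```python
import numpy as np

def normalized_laplacian(K):
    """Return L = I - D^{-1/2} K D^{-1/2} for a symmetric, non-negative kernel K."""
    d = np.maximum(K.sum(axis=1), 1e-12)
    s = 1.0 / np.sqrt(d)
    return np.eye(K.shape[0]) - (s[:, None] * K) * s[None, :]

def fit_weighted_kernels(kernels, Y, labeled, lam=10.0, r=2.0, n_iter=10):
    """Alternate between classifier scores F and kernel weights alpha.

    Illustrative objective (an assumption, NOT the paper's formulation):
        sum_m alpha_m^r * tr(F^T L_m F) + lam * ||U (F - Y)||_F^2
        subject to alpha >= 0 and sum(alpha) = 1,
    where U is a diagonal 0/1 matrix marking the labeled proteins.
    """
    n = Y.shape[0]
    laplacians = [normalized_laplacian(K) for K in kernels]
    U = np.diag(labeled.astype(float))
    alpha = np.full(len(kernels), 1.0 / len(kernels))
    for _ in range(n_iter):
        # Step 1: alpha fixed. Setting the gradient to zero gives
        # (L + lam U) F = lam U Y, solved for all labels at once.
        L = sum(a ** r * Lm for a, Lm in zip(alpha, laplacians))
        F = np.linalg.solve(L + lam * U + 1e-8 * np.eye(n), lam * U @ Y)
        # Step 2: F fixed. The constrained minimizer is
        # alpha_m proportional to (1 / s_m)^(1 / (r - 1)), with
        # s_m = tr(F^T L_m F), so smoother kernels get larger weights.
        s = np.array([max(np.trace(F.T @ Lm @ F), 1e-12) for Lm in laplacians])
        alpha = (1.0 / s) ** (1.0 / (r - 1.0))
        alpha /= alpha.sum()
    return F, alpha

# Toy data (hypothetical): 40 "proteins" in two groups, one label per group.
rng = np.random.default_rng(0)
n = 40
cluster = np.repeat([0, 1], n // 2)
labeled = np.zeros(n, dtype=bool)
labeled[:10] = True
labeled[20:30] = True
Y = np.eye(2)[cluster] * labeled[:, None]  # labels of unlabeled rows are hidden

# One kernel that agrees with the labels and one that is pure noise.
clean = np.where(cluster[:, None] == cluster[None, :], 1.0, 0.05)
noise = rng.random((n, n))
noisy = (noise + noise.T) / 2.0

F, alpha = fit_weighted_kernels([clean, noisy], Y, labeled)
print("kernel weights (clean, noisy):", alpha)
```

In this toy setting the kernel that agrees with the labels should end up with the larger weight, which mirrors the abstract's claim about smooth versus noisy kernels. The sketch solves for all label columns of `F` in a single linear system, which is one simple way to treat the labels jointly rather than one at a time.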