Novel biological network features discovery for in silico identification of drug targets

Authors:
Jintao Zhang; Jun Huan
Affiliations:
University of Kansas, Lawrence, KS, USA; University of Kansas, Lawrence, KS, USA
Venue:
Proceedings of the 1st ACM International Health Informatics Symposium
Year:
2010

Abstract

In silico identification of potential drug targets is a crucial task in drug discovery. Traditional approaches use only protein sequence or structural information to predict drug targets and have achieved limited success. Because cellular proteins function within interaction networks, interacting with other cellular macromolecules, analyzing the topological features of proteins in such networks can reveal important insights into their potential druggability. In this paper, we first introduce ten novel topological features extracted from the human protein-protein interaction network. In designing these features, we place special emphasis on the roles of three disease-related groups of proteins: known drug targets, disease genes, and essential genes. Based on these network features, we build models that predict drug targets with up to 80% classification accuracy using support vector machines, L1-regularized logistic regression, and k-nearest neighbors, and we analyze the relevance of each feature to protein druggability. Moreover, we combine our network features with a set of protein sequence features and achieve more robust experimental performance. By integrating both network and sequence features, our method can also be used to prioritize multiple candidate proteins according to their predicted druggability.
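
The general idea of network-derived features that emphasize known drug targets, disease genes, and essential genes can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the ten features, so the four features below (degree and three neighbor fractions), the function names, and the toy protein identifiers are all assumptions made for illustration. A plain k-nearest-neighbors vote stands in for the classifiers named in the abstract.

```python
import math
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            adj[a].add(b)
            adj[b].add(a)
    return adj


def network_features(protein, adj, drug_targets, disease_genes, essential_genes):
    """Return a hypothetical 4-element feature vector for one protein.

    Illustrative features (assumed, not the paper's ten):
      [0] degree in the interaction network
      [1] fraction of neighbors that are known drug targets
      [2] fraction of neighbors that are disease genes
      [3] fraction of neighbors that are essential genes
    Only neighbors are counted, so the protein's own label is not used.
    """
    neighbors = adj.get(protein, set())
    degree = len(neighbors)
    if degree == 0:
        return [0.0, 0.0, 0.0, 0.0]
    return [
        float(degree),
        len(neighbors & drug_targets) / degree,
        len(neighbors & disease_genes) / degree,
        len(neighbors & essential_genes) / degree,
    ]


def knn_score(query, labeled, k=3):
    """Fraction of the k nearest labeled vectors (Euclidean) labeled 1.

    `labeled` is a list of (feature_vector, label) pairs with label 0 or 1.
    """
    ranked = sorted(labeled, key=lambda item: math.dist(query, item[0]))
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)


def prioritize(candidates, labeled, k=3):
    """Rank candidate proteins by descending k-NN score.

    `candidates` maps a protein name to its feature vector.
    """
    scored = [(name, knn_score(vec, labeled, k)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In practice the feature scales would need normalizing before a distance-based classifier is applied, since raw degree can dominate the fractional features. Sorting candidates by score mirrors the prioritization use described in the abstract, with the k-NN vote fraction serving as a crude stand-in for predicted druggability.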