2005 Special Issue: Graph kernels for chemical informatics

Authors:
Liva Ralaivola;Sanjay J. Swamidass;Hiroto Saigo;Pierre Baldi
Affiliations:
School of Information and Computer Sciences, University of California, Irvine, CA 92697-3425, USA and Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697-3425, US ...;School of Information and Computer Sciences, University of California, Irvine, CA 92697-3425, USA and Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697-3425, US ...;School of Information and Computer Sciences, University of California, Irvine, CA 92697-3425, USA and Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697-3425, US ...;School of Information and Computer Sciences, University of California, Irvine, CA 92697-3425, USA and Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697-3425, US ...
Venue:
Neural Networks - Special issue on neural networks and kernel methods for structured domains
Year:
2005

Abstract

Increased availability of large repositories of chemical compounds is creating new challenges and opportunities for the application of machine learning methods to problems in computational chemistry and chemical informatics. Because chemical compounds are often represented by the graph of their covalent bonds, machine learning methods in this domain must be capable of processing graphical structures of variable size. Here, we first briefly review the literature on graph kernels and then introduce three new kernels (Tanimoto, MinMax, Hybrid) based on the idea of molecular fingerprints and counting labeled paths of depth up to d using depth-first search from each possible vertex. The kernels are applied to three classification problems to predict mutagenicity, toxicity, and anti-cancer activity on three publicly available data sets. The kernels achieve performances at least comparable, and most often superior, to those previously reported in the literature, reaching accuracies of 91.5% on the Mutag dataset, 65-67% on the PTC (Predictive Toxicology Challenge) dataset, and 72% on the NCI (National Cancer Institute) dataset. Properties and tradeoffs of these kernels, as well as other proposed kernels that leverage 1D or 3D representations of molecules, are briefly discussed.
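
The path-counting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the graph encoding, the function names (`labeled_paths`, `tanimoto_kernel`, `minmax_kernel`), and the choice to enumerate only simple paths (no atom visited twice) are assumptions made for illustration, since the abstract does not specify how cycles are handled. The Tanimoto kernel is shown in its usual binary-fingerprint form (shared paths over all distinct paths) and the MinMax kernel in its usual count-based form (sum of minima over sum of maxima). The Hybrid kernel is omitted because the abstract does not define it.

```python
from collections import Counter


def labeled_paths(atoms, bonds, max_depth):
    """Count labeled paths of at most max_depth bonds, via DFS from every atom.

    atoms: dict mapping vertex id -> atom label, e.g. {0: "C", 1: "O"}
    bonds: list of (u, v, bond_label) tuples, e.g. [(0, 1, "-")]
    Assumption: only simple paths are enumerated (no atom is revisited).
    """
    neighbors = {v: [] for v in atoms}
    for u, v, bond in bonds:
        neighbors[u].append((v, bond))
        neighbors[v].append((u, bond))

    counts = Counter()

    def dfs(vertex, visited, label, depth):
        counts[label] += 1  # every prefix of a path is itself a path
        if depth == max_depth:
            return
        for nxt, bond in neighbors[vertex]:
            if nxt not in visited:
                dfs(nxt, visited | {nxt}, label + bond + atoms[nxt], depth + 1)

    for start in atoms:
        dfs(start, {start}, atoms[start], 0)
    return counts


def tanimoto_kernel(paths_a, paths_b):
    """Binary fingerprints: shared paths divided by all distinct paths."""
    a, b = set(paths_a), set(paths_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def minmax_kernel(paths_a, paths_b):
    """Path counts: sum of per-path minima divided by sum of per-path maxima."""
    keys = set(paths_a) | set(paths_b)
    num = sum(min(paths_a[k], paths_b[k]) for k in keys)
    den = sum(max(paths_a[k], paths_b[k]) for k in keys)
    return num / den if den else 0.0


# Toy heavy-atom graphs (hydrogens omitted): ethanol C-C-O and methanol C-O.
ethanol = labeled_paths({0: "C", 1: "C", 2: "O"}, [(0, 1, "-"), (1, 2, "-")], 2)
methanol = labeled_paths({0: "C", 1: "O"}, [(0, 1, "-")], 2)

print(tanimoto_kernel(ethanol, methanol))  # 4 shared paths out of 7 distinct
print(minmax_kernel(ethanol, methanol))    # minima sum to 4, maxima sum to 9
```

In this toy case the two molecules share the paths `C`, `O`, `C-O`, and `O-C`. The MinMax value is lower than the Tanimoto value because it also penalizes the difference in how often `C` occurs, which the binary fingerprint ignores.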