ChemDB: a public database of small molecules and related chemoinformatics resources

Authors:
Jonathan Chen;S. Joshua Swamidass;Yimeng Dou;Jocelyne Bruand;Pierre Baldi
Affiliations:
Institute for Genomics and Bioinformatics, School of Information and Computer Sciences, University of California Irvine, CA, USA;Institute for Genomics and Bioinformatics, School of Information and Computer Sciences, University of California Irvine, CA, USA;Institute for Genomics and Bioinformatics, School of Information and Computer Sciences, University of California Irvine, CA, USA;Institute for Genomics and Bioinformatics, School of Information and Computer Sciences, University of California Irvine, CA, USA;Institute for Genomics and Bioinformatics, School of Information and Computer Sciences, University of California Irvine, CA, USA
Venue:
Bioinformatics
Year:
2005

Citing 0
Cited 14

Partial least squares regression for graph mining
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining

Reducing the dimensionality of dissimilarity space embedding graph kernels
Engineering Applications of Artificial Intelligence

Sourcerer: mining and searching internet-scale software repositories
Data Mining and Knowledge Discovery

Graph kernels based on tree patterns for molecules
Machine Learning

L2 norm regularized feature kernel regression for graph data
Proceedings of the 18th ACM conference on Information and knowledge management

Approximation of graph kernel similarities for chemical graphs by kernel principal component analysis
EvoBIO'11 Proceedings of the 9th European conference on Evolutionary computation, machine learning and data mining in bioinformatics

Mining frequent closed graphs on evolving data streams
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining

Parallel structural graph clustering
ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part III

Application of chemoinformatics to the structural elucidation of natural compounds
IDEAL'06 Proceedings of the 7th international conference on Intelligent Data Engineering and Automated Learning

A structural cluster kernel for learning on graphs
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining

A tree-structured covalent-bond-driven molecular memetic algorithm for optimization of ring-deficient molecules
Computers & Mathematics with Applications

HmSearch: an efficient hamming distance query processing algorithm
Proceedings of the 25th International Conference on Scientific and Statistical Database Management

Succinct interval-splitting tree for scalable similarity search of compound-protein pairs with property constraints
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining

Subtree selection in kernels for graph classification
International Journal of Data Mining and Bioinformatics

Quantified Score

Hi-index	3.84

Abstract

Motivation: The development of chemoinformatics has been hampered by the lack of large, publicly available, comprehensive repositories of molecules, in particular of small molecules. Small molecules play a fundamental role in organic chemistry and biology. They can be used as combinatorial building blocks for chemical synthesis, as molecular probes in chemical genomics and systems biology, and for the screening and discovery of new drugs and other useful compounds.

Results: We describe ChemDB, a public database of small molecules available on the Web. ChemDB is built from the digital catalogs of over a hundred vendors and from other public sources. It is annotated with information derived from these sources as well as from computational methods, such as predicted solubility and three-dimensional structure. It supports multiple molecular formats and is periodically updated, automatically whenever possible. The current version of the database contains approximately 4.1 million commercially available compounds, or 8.2 million when isomers are counted. The database includes a user-friendly graphical interface, chemical reaction capabilities, and unique search capabilities.

Availability: Database and datasets are available at http://cdb.ics.uci.edu

Contact: pfbaldi@ics.uci.edu

Supplementary information: Supplementary materials are available at http://cdb.ics.uci.edu