Browsing large scale cheminformatics data with dimension reduction

  • Authors:
  • Jong Youl Choi;Seung-Hee Bae;Judy Qiu;Geoffrey Fox;Bin Chen;David Wild

  • Affiliations:
  • Indiana University, Bloomington, IN;Indiana University, Bloomington, IN;Indiana University, Bloomington, IN;Indiana University, Bloomington, IN;Indiana University, Bloomington, IN;Indiana University, Bloomington, IN

  • Venue:
  • Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
  • Year:
  • 2010

Abstract

Tools for visualizing large-scale, high-dimensional data are highly valuable for scientific discovery in many fields. We present PubChemBrowse, a customized visualization tool for cheminformatics research. It provides a novel 3D data point browser that displays complex properties of massive data sets on commodity clients. As in GIS browsers for Earth and environmental data, chemical compounds with similar properties appear near one another in the browser. PubChemBrowse is built around in-house, high-performance parallel MDS (Multi-Dimensional Scaling) and GTM (Generative Topographic Mapping) services and supports fast interaction with an external property database. These properties can be overlaid on the 3D-mapped compound space or queried for individual points. We prototype its use with the Chem2Bio2RDF system, using the SPARQL query language to access over 20 publicly accessible bioinformatics databases. We describe the design and implementation of the integrated PubChemBrowse application and outline its use in drug discovery. The same core technologies can be used to develop similar high-dimensional browsers in other scientific areas.
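
To make the dimension-reduction step concrete, the following is a minimal, serial sketch of MDS by stress majorization (SMACOF with unit weights) that maps items to 3D so that items with similar descriptor vectors land close together. It is an illustration only, not the parallel MDS service the abstract refers to; the function names (`pairwise_distances`, `stress`, `smacof_step`, `mds_3d`) and the toy descriptor vectors are assumptions introduced here.

```python
import math
import random


def pairwise_distances(points):
    """Euclidean distance matrix for a list of equal-length coordinate lists."""
    return [[math.dist(p, q) for q in points] for p in points]


def stress(delta, coords):
    """Raw stress: sum of squared gaps between target and mapped distances."""
    d = pairwise_distances(coords)
    n = len(coords)
    return sum((delta[i][j] - d[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))


def smacof_step(delta, coords):
    """One Guttman transform with unit weights: X_new = (1/n) * B(X) * X."""
    n, dim = len(coords), len(coords[0])
    d = pairwise_distances(coords)
    new_coords = []
    for i in range(n):
        row = [0.0] * dim
        diag = 0.0
        for j in range(n):
            if i == j or d[i][j] == 0.0:
                continue  # coincident points contribute nothing to B
            ratio = delta[i][j] / d[i][j]  # equals -B[i][j]
            diag += ratio                  # B[i][i] is the sum of these ratios
            for k in range(dim):
                row[k] -= ratio * coords[j][k]
        new_coords.append([(row[k] + diag * coords[i][k]) / n
                           for k in range(dim)])
    return new_coords


def mds_3d(delta, iterations=100, seed=0):
    """Map items with target dissimilarities `delta` to 3D coordinates."""
    rng = random.Random(seed)
    coords = [[rng.uniform(-1, 1) for _ in range(3)] for _ in delta]
    for _ in range(iterations):
        coords = smacof_step(delta, coords)
    return coords


if __name__ == "__main__":
    # Hypothetical stand-in for compound descriptors: 8 random 6-D vectors.
    rng = random.Random(1)
    vectors = [[rng.random() for _ in range(6)] for _ in range(8)]
    delta = pairwise_distances(vectors)
    coords = mds_3d(delta)
    print("final stress:", stress(delta, coords))
```

Each iteration costs O(n²) in the number of points, which is the reason a parallel implementation matters at the data sizes the abstract describes.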