Data access performance in a large and dynamic pharmaceutical drug candidate database

  • Authors:
  • Zina Ben-Miled;Yang Liu;Michael Bem;Robert Jones;Robert Oppelt;Samuel Milosevich;Dave Powers;Omran Bukhres

  • Affiliations:
  • ECE Department, Indiana University Purdue University Indianapolis, IN;CS Department, Indiana University Purdue University Indianapolis, IN;Eli Lilly & Company, Indianapolis, IN;Eli Lilly & Company, Indianapolis, IN;Eli Lilly & Company, Indianapolis, IN;Eli Lilly & Company, Indianapolis, IN;Eli Lilly & Company, Indianapolis, IN;CS Department, Indiana University Purdue University Indianapolis, IN

  • Venue:
  • Proceedings of the 2000 ACM/IEEE conference on Supercomputing
  • Year:
  • 2000

Abstract

The amount of data generated through chemical and biological experimentation has exploded in recent years. This rapid growth has given rise to a set of cheminformatics and bioinformatics applications that manipulate dynamic, heterogeneous, and massive data. One such application in the pharmaceutical industry is the computational process involved in the early discovery of lead drug candidates for a given target disease. This process makes repeated sequential and random accesses to a drug candidate database. An experimental study based on this application shows that, for optimal performance, the degree of parallelism exploited in the application should be adjusted according to the size of the drug candidate database instance and the size of the machine. In addition, different degrees of parallelism should be used depending on whether access to the drug candidate database is random or sequential.
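
The kind of experiment the abstract describes, varying the degree of parallelism for sequential and random access, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' setup: the "database" is an in-memory list, the worker counts are arbitrary, and the function names (`make_access_order`, `split`, `timed_scan`, `best_degree`) are hypothetical. Because CPython threads share one interpreter lock, the timings show the shape of such a measurement rather than real speedups.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def make_access_order(n_records, pattern, seed=0):
    """Return record indices in sequential or (seeded) random order."""
    order = list(range(n_records))
    if pattern == "random":
        random.Random(seed).shuffle(order)
    elif pattern != "sequential":
        raise ValueError(f"unknown pattern: {pattern}")
    return order


def split(order, workers):
    """Split the access order into `workers` nearly equal contiguous chunks."""
    base, extra = divmod(len(order), workers)
    chunks, start = [], 0
    for w in range(workers):
        size = base + (1 if w < extra else 0)
        chunks.append(order[start:start + size])
        start += size
    return chunks


def timed_scan(db, pattern, workers):
    """Read every record once with `workers` threads; return (checksum, seconds)."""
    chunks = split(make_access_order(len(db), pattern), workers)
    t0 = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda c: sum(db[i] for i in c), chunks))
    return sum(partials), time.perf_counter() - t0


def best_degree(db, pattern, candidates=(1, 2, 4, 8)):
    """Time each candidate worker count and return the fastest with all timings."""
    timings = {w: timed_scan(db, pattern, w)[1] for w in candidates}
    return min(timings, key=timings.get), timings


if __name__ == "__main__":
    # Hypothetical stand-in for a drug candidate database instance.
    db = list(range(200_000))
    for pattern in ("sequential", "random"):
        workers, timings = best_degree(db, pattern)
        print(pattern, "fastest worker count in this run:", workers)
```

Running `best_degree` separately for each access pattern and for several database sizes mirrors the structure of the study: the preferred worker count is measured per configuration instead of being fixed in advance.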