Data access performance in a large and dynamic pharmaceutical drug candidate database

Authors:
Zina Ben-Miled;Yang Liu;Michael Bem;Robert Jones;Robert Oppelt;Samuel Milosevich;Dave Powers;Omran Bukhres
Affiliations:
ECE Department, Indiana University Purdue University Indianapolis, IN;CS Department, Indiana University, Purdue University Indianapolis, IN;Eli Lilly & Company, Indianapolis, IN;Eli Lilly & Company, Indianapolis, IN;Eli Lilly & Company, Indianapolis, IN;Eli Lilly & Company, Indianapolis, IN;Eli Lilly & Company, Indianapolis, IN;CS Department, Indiana University, Purdue University Indianapolis, IN
Venue:
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Year:
2000

Abstract

Recent years have seen an explosion in the amount of data generated through chemical and biological experimentation. This rapid proliferation has led to a set of cheminformatics and bioinformatics applications that manipulate dynamic, heterogeneous, and massive data. One example in the pharmaceutical industry is the computational process involved in the early discovery of lead drug candidates for a given target disease, which includes repeated sequential and random accesses to a drug candidate database. An experimental study of this application shows that, for optimal performance, the degree of parallelism exploited in the application should be adjusted according to the drug candidate database instance size and the machine size. Additionally, different degrees of parallelism should be used depending on whether access to the drug candidate database is random or sequential.