Scheduling functional regression tests for IBM DB2 products

  • Authors:
  • Edward Xia, Igor Jurisica, Julie Waterhouse, Valerie Sloan

  • Affiliations:
  • Edward Xia and Igor Jurisica: Department of Computer Science, University of Toronto, Toronto, ON, Canada
  • Julie Waterhouse and Valerie Sloan: IBM Toronto Software Lab, IBM Canada Ltd., Markham, ON, Canada

  • Venue:
  • CASCON '05: Proceedings of the 2005 Conference of the Centre for Advanced Studies on Collaborative Research
  • Year:
  • 2005

Abstract

Functional Regression Testing (FRT) is performed to ensure that a new version of a product functions properly as designed. In a corporate environment, the large number of test jobs and the complexity of scheduling those jobs on different platforms make the performance of this testing an important issue. A grid provides an infrastructure for applications to use shared, heterogeneous resources. Such an infrastructure may be used to solve large-scale testing problems or to improve application performance. FRT is a good candidate application for running on a grid because each test job can run separately, in parallel. However, experience indicates that such applications may suffer performance problems without a proper cost-based grid scheduling strategy.

The Database Technology (DBT) Regression Test Team at IBM conducts the FRT for IBM® DB2® Universal Database™ (DB2 UDB) products. As a case study, we examined the current test scheduling approach for the DB2 products. We found that the performance of the test scheduler suffers because it does not incorporate cost-dependent selection of jobs and slaves (testing IDs). Therefore, we have replaced the DB2 test scheduler with one that estimates jobs' run times and then chooses slaves using those estimates. Although knowing a job's actual run time in advance is difficult, we can use case-based reasoning to estimate it from past experience. We create a case base to store historical data, and design an algorithm that estimates a new job's run time by identifying similar cases that have executed in the past. The performance evaluation of our new scheduler shows a significant benefit over the original scheduler. In this paper, we also examine how machine specifications, such as the number of slaves running on a machine and the machine speed, affect application performance and run-time estimation accuracy.
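
The abstract describes the mechanism only at a high level: store past test-job executions in a case base, estimate a new job's run time from similar past cases, and use that estimate to pick a slave at low cost. The Python sketch below illustrates one way such a scheme could fit together. The Case attributes, the attribute-matching similarity rule, and the earliest-finish-time slave selection are assumptions made for illustration; the paper's actual case representation and scheduling heuristic are not specified in the abstract.

    from dataclasses import dataclass
    from typing import Dict, List

    @dataclass
    class Case:
        """One historical test-job execution in the case base (attributes assumed)."""
        product: str
        platform: str
        suite: str
        run_time: float  # observed run time, in seconds

    class CaseBase:
        """Estimates a new job's run time from the most similar past cases."""
        def __init__(self) -> None:
            self.cases: List[Case] = []

        def add(self, case: Case) -> None:
            self.cases.append(case)

        def estimate(self, product: str, platform: str, suite: str,
                     default: float = 600.0) -> float:
            # Similarity = number of matching attributes; a crude stand-in for
            # whatever retrieval metric a real case base would use.
            def sim(c: Case) -> int:
                return (c.product == product) + (c.platform == platform) + (c.suite == suite)
            best = max((sim(c) for c in self.cases), default=0)
            if best == 0:
                return default  # no relevant experience yet: fall back to a default estimate
            matches = [c for c in self.cases if sim(c) == best]
            return sum(c.run_time for c in matches) / len(matches)

    def pick_slave(slaves: List[Dict], estimated_run_time: float) -> Dict:
        """Cost-based selection: assign the job to the slave expected to finish it earliest."""
        return min(slaves, key=lambda s: s["pending_work"] + estimated_run_time / s["speed"])

    # Example: estimate a run time from two past cases, then choose between two slaves.
    cb = CaseBase()
    cb.add(Case("DB2 UDB", "AIX", "sql_regression", 420.0))
    cb.add(Case("DB2 UDB", "Linux", "sql_regression", 380.0))
    est = cb.estimate("DB2 UDB", "Linux", "sql_regression")
    slaves = [{"id": "slaveA", "speed": 1.0, "pending_work": 900.0},
              {"id": "slaveB", "speed": 1.5, "pending_work": 1200.0}]
    print(pick_slave(slaves, est)["id"])

The point of the sketch is the division of labour the abstract implies: run-time estimation comes from historical cases rather than from the job itself, and slave selection becomes a cost comparison once an estimate is available.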