Developers of highly configurable, performance-intensive software systems often use in-house, performance-oriented "regression testing" to ensure that their modifications do not adversely affect their software's performance across its large configuration space. Unfortunately, time and resource constraints often limit in-house testing to a relatively small number of configurations, followed by unreliable extrapolation from those results to the entire configuration space. As a result, many performance bottlenecks escape detection until systems are fielded.

In earlier work, we improved this situation by developing an initial quality assurance process called "main effects screening." This process 1) executes formally designed experiments to identify an appropriate subset of configurations on which to base the performance-oriented regression testing, 2) executes benchmarks on this subset whenever the software changes, and 3) provides tool support for executing these actions on in-house and in-the-field computing resources. Our initial process had several limitations, however: it was configured manually, which was tedious and error-prone, and its accuracy relied on strong, untested assumptions, which made its use unacceptably risky in practice.

This paper presents a new quality assurance process called "reliable effects screening" that provides three significant improvements over our earlier work. First, it allows developers to economically verify key assumptions during process execution. Second, it integrates several model-driven engineering tools to make process configuration and execution much easier and less error-prone. Third, we evaluate the process via several feasibility studies of three large, widely used, performance-intensive software frameworks. Our results indicate that reliable effects screening can detect performance degradation in large-scale systems more reliably and with significantly fewer resources than conventional techniques.
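To make the screening idea concrete, the following is a minimal sketch of main effects screening in Python, assuming binary (on/off) configuration options. The option names, the injected performance effect, and the benchmark() stub are hypothetical placeholders, not the paper's actual options, benchmarks, or tool support; the sketch only illustrates how an orthogonal two-level design estimates each option's main effect from far fewer runs than exhaustive testing.

```python
"""Sketch: main effects screening over binary configuration options.

Builds a two-level orthogonal screening design (columns of a Hadamard
matrix), benchmarks each design run, and estimates each option's main
effect on performance. Options with large effects form the small subset
of configurations monitored during performance regression testing.
"""
import random

# Hypothetical on/off configuration options (+1 = enabled, -1 = disabled).
OPTIONS = ["opt_A", "opt_B", "opt_C", "opt_D", "opt_E", "opt_F", "opt_G"]

def hadamard(n):
    """Sylvester construction: Hadamard matrix of order n (a power of 2)."""
    if n == 1:
        return [[1]]
    h = hadamard(n // 2)
    return ([row + row for row in h] +
            [row + [-x for x in row] for row in h])

def screening_design(num_factors):
    """Two-level orthogonal design: columns 1..num_factors of a Hadamard
    matrix, giving roughly num_factors+1 runs instead of 2**num_factors."""
    n = 1
    while n < num_factors + 1:
        n *= 2
    h = hadamard(n)
    return [[h[run][col] for col in range(1, num_factors + 1)]
            for run in range(n)]

def benchmark(config):
    """Stand-in for a real performance benchmark (latency in ms).
    Here opt_C carries a large injected effect; the rest is noise."""
    latency = 100.0 + random.gauss(0, 1)
    if config["opt_C"] == 1:
        latency += 25.0  # hypothetical performance-sensitive option
    return latency

design = screening_design(len(OPTIONS))
results = []
for row in design:
    config = dict(zip(OPTIONS, row))
    results.append((row, benchmark(config)))

# Main effect of each option: mean response at +1 minus mean at -1.
for i, name in enumerate(OPTIONS):
    hi = [y for row, y in results if row[i] == 1]
    lo = [y for row, y in results if row[i] == -1]
    effect = sum(hi) / len(hi) - sum(lo) / len(lo)
    print(f"{name}: main effect = {effect:+.1f} ms")
```

In this toy setting, 8 designed runs replace the 2^7 = 128 exhaustive configurations, and opt_C stands out with a main effect near +25 ms. Such a design rests on the assumption that higher-order interactions among options are negligible; the "reliable effects screening" process described above is distinguished precisely by economically checking assumptions of this kind during execution rather than taking them on faith.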