Pooled ANOVA

Authors:
Michael Last;Gheorghe Luta;Alex Orso;Adam Porter;Stan Young
Affiliations:
National Institute of Statistical Sciences, PO Box 14006, Research Triangle Park, NC, 27709, United States;National Institute of Statistical Sciences, PO Box 14006, Research Triangle Park, NC, 27709, United States;Georgia Institute of Technology, Atlanta, GA, United States;University of Maryland, College Park, MD, United States;National Institute of Statistical Sciences, PO Box 14006, Research Triangle Park, NC, 27709, United States
Venue:
Computational Statistics & Data Analysis
Year:
2008

Abstract

We introduce Pooled ANOVA, a greedy algorithm that sequentially selects the rare important factors from a large set of factors. Problems such as computer simulation and software performance tuning involve many factors, few of which have an important effect on the outcome or performance measure. We pool multiple factors together and test the pool for significance. If the pool has a significant effect, we retain its factors for deconfounding. If not, we either declare that none of its factors is important or retain them for follow-up decoding, depending on our assumptions and the stage of testing. The sparser the important factors, the greater the savings. Pooled ANOVA requires fewer assumptions than similar methods such as Sequential Bifurcation; for example, it does not require all important effects to have the same sign. We demonstrate savings of 25%-35% compared with a conventional ANOVA, as well as the ability to work in a setting where Sequential Bifurcation fails.
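
The pool-then-follow-up idea in the abstract can be illustrated with a minimal two-stage screening sketch. This is not the authors' algorithm: the abstract does not specify the experimental design or the ANOVA test, so the sketch substitutes a simple Welch-style t statistic on replicated runs. The simulator, the function names (`simulate`, `pool_is_significant`, `two_stage_screen`), the noise level, the pool size, and the cutoff `t_crit` are all assumptions made for illustration. Because it toggles every pooled factor in the same direction, opposite-sign effects inside one pool could cancel here; the example places its two important factors in different pools to sidestep that.

```python
import random
import statistics


def simulate(levels, effects, rng, noise_sd=0.5):
    """Hypothetical black-box experiment: linear response plus Gaussian noise."""
    signal = sum(e * x for e, x in zip(effects, levels))
    return signal + rng.gauss(0.0, noise_sd)


def pool_is_significant(pool, effects, rng, reps=10, t_crit=6.0):
    """Toggle all factors in `pool` together and compare high vs. low runs.

    Uses a Welch-style t statistic as a stand-in for the paper's ANOVA test.
    The cutoff t_crit is an arbitrary, deliberately conservative assumption.
    """
    n = len(effects)

    def runs(level):
        # Factors outside the pool are held at a neutral baseline of 0.
        levels = [level if i in pool else 0 for i in range(n)]
        return [simulate(levels, effects, rng) for _ in range(reps)]

    hi, lo = runs(+1), runs(-1)
    se = ((statistics.variance(hi) + statistics.variance(lo)) / reps) ** 0.5
    return abs(statistics.mean(hi) - statistics.mean(lo)) / se > t_crit


def two_stage_screen(effects, pool_size, rng):
    """Stage 1: test each pool. Stage 2: test factors of significant pools."""
    n = len(effects)
    pools = [list(range(s, min(s + pool_size, n))) for s in range(0, n, pool_size)]
    tests, selected = 0, []
    for pool in pools:
        tests += 1
        if not pool_is_significant(pool, effects, rng):
            continue  # declare every factor in this pool unimportant
        for factor in pool:
            tests += 1
            if pool_is_significant([factor], effects, rng):
                selected.append(factor)
    return selected, tests


# 20 factors, only two of which matter (and with opposite signs).
effects = [0.0] * 20
effects[2], effects[13] = 3.0, -2.5
selected, tests = two_stage_screen(effects, pool_size=5, rng=random.Random(0))
print(selected, tests)
```

In this toy setup, four pool tests plus five follow-up tests in each of the two flagged pools give 14 significance tests, against 20 when every factor is tested individually. The count illustrates why sparsity drives the savings; it is not a reproduction of the paper's reported 25%-35% figure.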