Bias-variance tradeoffs in program analysis

Authors:
Rahul Sharma;Aditya V. Nori;Alex Aiken
Affiliations:
Stanford University, Stanford, CA, USA;Microsoft Research, Bangalore, India;Stanford University, Stanford, CA, USA
Venue:
Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages
Year:
2014

Abstract

Increasing the precision of a program analysis often leads to worse results. Our thesis is that this phenomenon results from fundamental limits on the ability to use precise abstract domains as the basis for inferring strong invariants of programs. We show that bias-variance tradeoffs, an idea from learning theory, explain why more precise abstractions do not necessarily lead to better results, and that they also provide practical techniques for coping with such limitations. Learning theory captures precision using a combinatorial quantity called the VC dimension. We compute the VC dimension for different abstractions and report on its usefulness as a precision metric for program analyses. We evaluate cross validation, a technique for addressing bias-variance tradeoffs, on an industrial-strength program verification tool called YOGI. The tool produced using cross validation has significantly better running time, finds new defects, and has fewer time-outs than the current production version. Finally, we make some recommendations for tackling bias-variance tradeoffs in program analysis.
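
To make the notion of VC dimension concrete, here is a minimal sketch, not taken from the paper, that brute-forces the VC dimension of a very simple abstraction: closed intervals over a single integer variable. A set of points is "shattered" if, for every subset, some interval contains exactly that subset. The function names (`interval_can_label`, `is_shattered`, `vc_dimension_on`) and the choice of intervals as the example are assumptions made for illustration only; the abstractions the authors actually analyze are not described in this abstract.

```python
from itertools import combinations

def interval_can_label(points, positives):
    """Return True if some closed interval contains exactly `positives`
    among `points` (points are assumed to be distinct numbers)."""
    if not positives:
        return True  # an empty interval excludes every point
    lo, hi = min(positives), max(positives)
    # The tightest candidate interval is [lo, hi]; it works only if no
    # point outside `positives` falls inside it.
    return all((lo <= p <= hi) == (p in positives) for p in points)

def is_shattered(points):
    """Return True if intervals realize every subset of `points`."""
    pts = list(points)
    return all(
        interval_can_label(pts, set(subset))
        for r in range(len(pts) + 1)
        for subset in combinations(pts, r)
    )

def vc_dimension_on(candidates, max_size):
    """Largest d <= max_size such that some d-subset of `candidates`
    is shattered by intervals (hypothetical helper for this sketch)."""
    best = 0
    for d in range(1, max_size + 1):
        if any(is_shattered(s) for s in combinations(candidates, d)):
            best = d
    return best

if __name__ == "__main__":
    print(is_shattered([1, 5]))        # two points: every subset is realizable
    print(is_shattered([1, 3, 5]))     # {1, 5} cannot be picked without 3
    print(vc_dimension_on(range(6), 4))
```

Any two distinct points can be shattered, but no three can, because an interval that contains the outer two points must also contain the middle one. Intervals over one variable therefore have VC dimension 2, which is the sense in which a richer abstraction with a higher VC dimension counts as more precise.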