On estimating the size and confidence of a statistical audit

  • Authors:
  • Javed A. Aslam; Raluca A. Popa; Ronald L. Rivest

  • Affiliations:
  • College of Computer and Information Science, Northeastern University, Boston, MA; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA

  • Venue:
  • EVT'07 Proceedings of the USENIX Workshop on Accurate Electronic Voting Technology
  • Year:
  • 2007

Abstract

We consider the problem of statistical sampling for auditing elections, and we develop a remarkably simple and easily calculated upper bound for the sample size necessary for determining with probability at least c if a given set of n objects contains fewer than b "bad" objects. While the size of the optimal sample drawn without replacement can be determined with a computer program, our goal is to derive a highly accurate and simple formula that can be used by election officials equipped with only a hand-held calculator. We actually develop several formulae, but the one we recommend for use in practice is:

$$U_3(n, b, c) = \left\lceil \left(n - \frac{b - 1}{2}\right) \cdot \left(1 - (1 - c)^{1/b}\right) \right\rceil = \left\lceil \left(n - \frac{b - 1}{2}\right) \cdot \left(1 - \exp\left(\frac{\ln(1 - c)}{b}\right)\right) \right\rceil$$

As a practical matter, this formula is essentially exact: we prove that it is never too small, and empirical testing for many representative values of n ≤ 10,000, b ≤ n/2, and c ≤ 0.99 never finds it more than one too large. Theoretically, we show that for all n and b this formula never exceeds the optimal sample size by more than 3 for c ≤ 0.9975, and by more than (-ln(1 - c))/2 for general c.
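
The recommended formula can be sketched in a few lines of Python and compared with a brute-force search. This is a minimal illustration, not the authors' program. The function names `u3` and `optimal_sample_size` are hypothetical. The sketch also assumes that the "optimal sample size" is the smallest sample, drawn without replacement, whose chance of missing all b bad objects is at most 1 - c; the abstract does not spell this definition out. Both functions use ordinary floating-point arithmetic, so inputs that land exactly on a rounding boundary could be off by one.

```python
import math


def u3(n, b, c):
    """Closed-form bound from the abstract: ceil((n - (b-1)/2) * (1 - (1-c)^(1/b)))."""
    return math.ceil((n - (b - 1) / 2) * (1 - (1 - c) ** (1 / b)))


def optimal_sample_size(n, b, c):
    """Brute-force search (assumed definition, see the text above).

    Returns the smallest u such that a sample of u objects drawn without
    replacement from n objects, b of which are bad, contains no bad object
    with probability at most 1 - c.
    """
    miss = 1.0  # probability that a sample of size u misses every bad object
    for u in range(n - b + 1):
        if miss <= 1 - c:
            return u
        # Extend the sample by one object: it must come from the good ones.
        miss *= (n - b - u) / (n - u)
    # Any sample larger than n - b must contain at least one bad object.
    return n - b + 1


if __name__ == "__main__":
    # 100 objects, at least 5 bad, 95% confidence.
    print(u3(100, 5, 0.95), optimal_sample_size(100, 5, 0.95))  # prints: 45 45
```

In this example the formula and the exhaustive search agree, which is consistent with the abstract's claim that the bound is never too small and rarely more than one too large.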