Counting subsets of contingency tables

  • Authors:
  • George S. Fishman

  • Affiliations:
  • Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, USA 27599

  • Venue:
  • Computational Statistics
  • Year:
  • 2014

Abstract

We describe multistage Markov chain Monte Carlo (MSMCMC) procedures which, in addition to estimating the total number of contingency tables with given positive row and column sums, estimate the number, $Q$, and the proportion, $P$, of those tables that satisfy an additional, possibly nonlinear, constraint. Three options, A, B, and C, are studied. Options A and B exploit locally optimal statistical properties, whereas judicious assignment of a particular parameter of Option C allows estimation with approximately minimal standard error. Ten examples of varying dimensions and total entries illustrate and compare the procedures. In these examples, $Q$ and $P$ denote the number and proportion of tables whose chi-squared statistic is less than a given value. For both small- and large-dimensional tables, the comparisons favor Options A and B for moderate $P$ and Option C for small $P$. Additional comparison with sequential importance sampling estimates favors sequential importance sampling for small-dimensional tables and moderate $P$, but favors Option C for large-dimensional tables for both small and moderate $P$. The proposed options extend an earlier MSMCMC technique for estimating the total count and, in principle, can be further extended to incorporate additional constraints.
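To make the quantities $Q$ and $P$ concrete, the following minimal sketch computes them exactly by brute-force enumeration for a very small table. It is not the MSMCMC procedure of the paper, nor any of Options A, B, or C; exhaustive enumeration is only feasible for tiny margins, which is precisely why estimation methods are needed. The function names (`tables_with_margins`, `chi_squared`, `count_subset`) and the use of a strict inequality for the threshold are assumptions made for illustration.

```python
def tables_with_margins(rows, cols):
    """Yield every non-negative integer table with the given row and column sums."""
    if sum(rows) != sum(cols):
        return  # no table can satisfy inconsistent margins

    def fill_row(total, caps):
        # All ways to split `total` over the cells of one row,
        # with cell j bounded above by the remaining column sum caps[j].
        if len(caps) == 1:
            if total <= caps[0]:
                yield (total,)
            return
        for x in range(min(total, caps[0]) + 1):
            for rest in fill_row(total - x, caps[1:]):
                yield (x,) + rest

    def rec(i, remaining):
        if i == len(rows) - 1:
            # The last row is forced: it must use up the remaining column sums.
            yield (tuple(remaining),)
            return
        for row in fill_row(rows[i], remaining):
            left = tuple(c - x for c, x in zip(remaining, row))
            for rest in rec(i + 1, left):
                yield (row,) + rest

    yield from rec(0, tuple(cols))


def chi_squared(table, rows, cols):
    """Pearson chi-squared statistic against the independence expectation."""
    n = sum(rows)
    stat = 0.0
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            expected = r * c / n  # positive because all margins are positive
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


def count_subset(rows, cols, threshold):
    """Return (total, Q, P): all tables, those with statistic < threshold, and the ratio."""
    total = 0
    q = 0
    for table in tables_with_margins(rows, cols):
        total += 1
        if chi_squared(table, rows, cols) < threshold:
            q += 1
    p = q / total if total else 0.0
    return total, q, p
```

For example, with row sums (2, 2), column sums (2, 2), and a threshold of 1, `count_subset((2, 2), (2, 2), 1.0)` finds 3 tables in total, of which only the table with every entry equal to 1 has a chi-squared statistic below the threshold, so $Q = 1$ and $P = 1/3$.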