Computing best-possible bounds for the distribution of a sum of several variables is NP-hard

  • Authors:
  • Vladik Kreinovich; Scott Ferson

  • Affiliations:
  • Department of Computer Science, University of Texas at El Paso, El Paso, TX 79968, USA; Applied Biomathematics, 100 North Country Road, Setauket, NY 11733, USA

  • Venue:
  • International Journal of Approximate Reasoning
  • Year:
  • 2006

Abstract

In many real-life situations, we know the probability distributions of two random variables x_1 and x_2, but we have no information about the correlation between x_1 and x_2; what are the possible probability distributions for the sum x_1 + x_2? This question was originally raised by A.N. Kolmogorov. Algorithms exist that provide best-possible bounds for the distribution of x_1 + x_2; these algorithms have been implemented as part of efficient software for handling probabilistic uncertainty. A natural question is: what if we have several (n > 2) variables with known distributions, we have no information about their correlation, and we are interested in the possible probability distributions for the sum y = x_1 + ... + x_n? Known formulas for the case n = 2 can be (and have been) extended to this case. However, as we prove in this paper, not only are these formulas no longer best-possible, but in general, computing the best-possible bounds for arbitrary n is an NP-hard (computationally intractable) problem.
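To make the n = 2 case concrete, the following is a minimal sketch of the classical dependency bounds usually attributed to Makarov and to Frank, Nelsen, and Schweizer. That these are the formulas the abstract has in mind is an assumption; the abstract itself does not state them. For continuous marginal CDFs F_1 and F_2, the bounds on P(x_1 + x_2 <= z) are sup over x of max(F_1(x) + F_2(z - x) - 1, 0) from below and inf over x of min(F_1(x) + F_2(z - x), 1) from above. The names `sum_cdf_bounds` and `uniform_cdf`, the search interval, and the grid size are all hypothetical choices made for illustration, and the grid search only approximates the sup and inf.

```python
from typing import Callable, Tuple

Cdf = Callable[[float], float]


def uniform_cdf(x: float) -> float:
    """CDF of the uniform distribution on [0, 1] (illustrative marginal)."""
    return min(1.0, max(0.0, x))


def sum_cdf_bounds(
    F1: Cdf, F2: Cdf, z: float, lo: float, hi: float, steps: int = 3000
) -> Tuple[float, float]:
    """Approximate bounds on P(x_1 + x_2 <= z) with unknown dependence.

    Scans x over an evenly spaced grid on [lo, hi]. Because only grid
    points are examined, the result approximates the true sup and inf;
    it is not a rigorous enclosure.
    """
    lower, upper = 0.0, 1.0
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        s = F1(x) + F2(z - x)
        lower = max(lower, s - 1.0)  # sup of max(F1 + F2 - 1, 0)
        upper = min(upper, s)        # inf of min(F1 + F2, 1)
    return lower, upper


if __name__ == "__main__":
    for z in (0.5, 1.0, 1.5):
        print(z, sum_cdf_bounds(uniform_cdf, uniform_cdf, z, -1.0, 2.0))
```

For two uniform [0, 1] marginals, this gives bounds of approximately [0, 0.5] at z = 0.5, [0, 1] at z = 1, and [0.5, 1] at z = 1.5. The loop costs time linear in the grid size for each z, which is the kind of efficiency available for two variables; according to the abstract, no comparably simple formula remains best-possible once n > 2.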