Information theoretical analysis of multivariate correlation

  • Authors:
  • Satosi Watanabe

  • Venue:
  • IBM Journal of Research and Development
  • Year:
  • 1960

Abstract

A set λ of stochastic variables, $y_1, y_2, \ldots, y_n$, is grouped into subsets, $\mu_1, \mu_2, \ldots, \mu_k$. The correlation existing in λ with respect to the µ's is adequately expressed by $C = \sum_{i=1}^{k} S(\mu_i) - S(\lambda) \ge 0$, where $S(\nu)$ is the entropy function defined with reference to the variables y in subset ν. For a given λ, C becomes maximum when each $\mu_i$ consists of only one variable ($k = n$). The value C is then called the total correlation in λ, $C_{\mathrm{tot}}(\lambda)$. The present paper gives various theorems, according to which $C_{\mathrm{tot}}(\lambda)$ can be decomposed in terms of the partial correlations existing in subsets of λ, and of quantities derivable therefrom. The information-theoretical meaning of each decomposition is carefully explained. As illustrations, two problems are discussed at the end of the paper: (1) redundancy in geometrical figures in pattern recognition, and (2) the randomization effect of shuffling cards marked "zero" or "one."
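
The definition of C in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names (`entropy`, `marginal`, `correlation`, `total_correlation`), the use of Shannon entropy in bits, and the three-variable example distribution are all assumptions chosen for illustration.

```python
import math


def entropy(dist):
    """Shannon entropy (in bits) of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def marginal(joint, indices):
    """Marginalize a joint distribution onto the variables at `indices`.

    `joint` maps tuples (one entry per variable) to probabilities.
    """
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in indices)
        out[key] = out.get(key, 0.0) + p
    return out


def correlation(joint, partition):
    """C = sum_i S(mu_i) - S(lambda) for a partition of the variable indices."""
    return sum(entropy(marginal(joint, block)) for block in partition) - entropy(joint)


def total_correlation(joint):
    """C_tot: the case where every subset holds exactly one variable (k = n)."""
    n = len(next(iter(joint)))
    return correlation(joint, [(i,) for i in range(n)])


# Hypothetical example: y1 is a fair bit, y2 copies y1, y3 is an independent fair bit.
example_joint = {
    (0, 0, 0): 0.25,
    (0, 0, 1): 0.25,
    (1, 1, 0): 0.25,
    (1, 1, 1): 0.25,
}

c_tot = total_correlation(example_joint)                   # singletons: {y1}, {y2}, {y3}
c_grouped = correlation(example_joint, [(0, 1), (2,)])     # subsets: {y1, y2}, {y3}
```

In this example the total correlation is 1 bit, because y2 duplicates y1. Grouping y1 and y2 into one subset gives C = 0, since the only dependence then lies inside a subset. This agrees with the statement that C is largest when each subset holds a single variable.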