Contingency matrix theory: Statistical dependence in a contingency table

  • Authors:
  • Shusaku Tsumoto

  • Affiliations:
  • Department of Medical Informatics, Faculty of Medicine, Shimane University, 89-1 Enya-cho, Izumo, Shimane 693-8501, Japan

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2009

Quantified Score

Hi-index 0.07

Abstract

Chance discovery aims at understanding the meaning of functional dependency from the viewpoint of unexpected relations. One of its most important observations is that such a chance is hidden under a huge number of co-occurrences extracted from given data. Conventional data-mining methods, by contrast, depend strongly on frequencies and statistics rather than on interestingness or unexpectedness. This paper discusses some limitations of the notion of statistical dependence, focusing on the formal characteristics of Simpson's paradox from the viewpoint of linear algebra. Theoretical results show that Simpson's paradox can be observed when a given contingency table, viewed as a matrix, is not regular; in other words, when the rank of the contingency matrix is not full. Thus, data-ordered evidence has limitations, which should be compensated for by human-oriented reasoning.
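The rank condition mentioned in the abstract can be made concrete with a small numerical sketch. The code below is illustrative only and does not reproduce the paper's theorems: the function names `contingency_rank` and `is_regular` and the example tables are assumptions introduced here, and NumPy is assumed to be available. It relies on one standard fact: when two attributes are statistically independent, every cell is the product of its row and column marginals divided by the total, so the table is an outer product and has rank 1.

```python
import numpy as np


def contingency_rank(table):
    """Numerical rank of a contingency table viewed as a matrix (hypothetical helper)."""
    return int(np.linalg.matrix_rank(np.asarray(table, dtype=float)))


def is_regular(table):
    """True if the table is square and has full rank, i.e. is an invertible matrix."""
    m = np.asarray(table, dtype=float)
    return m.shape[0] == m.shape[1] and contingency_rank(m) == m.shape[0]


# Made-up counts, used only for illustration.
# Rows are proportional (second row = 3 x first row): independent attributes, rank 1.
independent = [[10, 20], [30, 60]]

# Rows are not proportional: dependent attributes, full rank 2.
dependent = [[10, 20], [30, 10]]

# 3 x 3 table whose third row is the sum of the first two: rank 2, so not full rank.
degenerate = [[1, 2, 3], [4, 5, 6], [5, 7, 9]]

for name, table in [("independent", independent),
                    ("dependent", dependent),
                    ("degenerate", degenerate)]:
    print(name, "rank =", contingency_rank(table), "regular =", is_regular(table))
```

In this sketch the `degenerate` table is the kind of non-regular contingency matrix the abstract refers to: it is neither fully independent (rank 1) nor of full rank. Whether a particular table of this kind actually exhibits Simpson's paradox depends on the conditions derived in the paper, which the code does not check.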