Contingency Tables as the Foundation for Concepts, Concept Hierarchies, and Rules: The 49er System Approach

  • Authors:
  • Jan M. Żytkow; Robert Zembowicz

  • Affiliations:
  • (also Instytut Podstaw Informatyki, PAN, Ordona 21, 01-237 Warszawa, Poland. zytkow@uncc.edu, robert@cs.twsu.edu) Department of Computer Science, University of North Carolina at Charlotte, Charlot ...; Department of Computer Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA

  • Venue:
  • Fundamenta Informaticae
  • Year:
  • 1997

Abstract

We analyze relationships between different forms of knowledge that can be discovered in the same data matrix (relational table): contingency tables, equations, concept definitions, concept hierarchies, and rules. We argue that contingency tables are the basic form of knowledge because the other forms can be derived from their various special cases. We analyze the relationship between contingency tables and rules and present the advantages of knowledge expressed in contingency tables. We show that special cases of contingency tables lead to concepts with empirical contents. In our view, concepts should be accepted as a by-product of knowledge discovery, as instruments justified by the knowledge they express. The same applies to taxonomies (concept hierarchies): they should be created in the right circumstances, to express specific empirical knowledge. We discuss several types of knowledge that are not conducive to taxonomy formation. Then we demonstrate how concepts generated from contingency tables that approximate logical equivalence can be combined to construct concept hierarchies: (1) each of those regularities leads to a hierarchy element (mini-hierarchy), (2) the elements are merged to increase their empirical contents, and (3) they are combined into a multi-level hierarchy. This method has been implemented as a part of the database discovery system 49er. We illustrate our algorithm by an application to the soybean database, and we show how our results go beyond those obtained by the COBWEB approach.
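
To make the idea of a contingency table that approximates logical equivalence concrete, the following is a minimal Python sketch. It is not the 49er implementation: the function names, the attribute names `spots` and `wilted`, and the toy records are all hypothetical and are not taken from the soybean database. The sketch counts co-occurrences of two two-valued attributes and reports the fraction of records that fall off the dominant diagonal of the resulting 2x2 table; a small fraction suggests the two attributes are close to equivalent.

```python
from collections import Counter


def contingency_table(records, attr_a, attr_b):
    """Count how often each pair of values (a, b) occurs in the records."""
    return Counter((rec[attr_a], rec[attr_b]) for rec in records)


def equivalence_exceptions(table, a_values, b_values):
    """Fraction of records off the dominant diagonal of a 2x2 table.

    A value near 0 means the table is close to a logical equivalence
    (either a1 <-> b1 or a1 <-> b2). This measure is an assumption made
    for illustration, not the test used by 49er.
    """
    (a1, a2), (b1, b2) = a_values, b_values
    total = sum(table.values())
    main = table[(a1, b1)] + table[(a2, b2)]
    anti = table[(a1, b2)] + table[(a2, b1)]
    return min(main, anti) / total


# Hypothetical toy data: 40 records with two yes/no attributes.
records = (
    [{"spots": "yes", "wilted": "yes"}] * 18
    + [{"spots": "no", "wilted": "no"}] * 20
    + [{"spots": "yes", "wilted": "no"}] * 1
    + [{"spots": "no", "wilted": "yes"}] * 1
)

table = contingency_table(records, "spots", "wilted")
exception_rate = equivalence_exceptions(table, ("yes", "no"), ("yes", "no"))
print(dict(table))
print(exception_rate)
```

In this toy table, 38 of the 40 records lie on one diagonal, so the two attributes could plausibly seed a two-class mini-hierarchy of the kind described in step (1); how such elements are merged and stacked in steps (2) and (3) is not specified in the abstract and is not sketched here.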