An Index for the Data Size to Extract Decomposable Structures in LAD

Authors:
Hirotaka Ono;Mutsunori Yagiura;Toshihide Ibaraki
Venue:
ISAAC '01 Proceedings of the 12th International Symposium on Algorithms and Computation
Year:
2001


Abstract

Logical analysis of data (LAD) is a methodology for extracting knowledge, in the form of a Boolean function f, from a given pair of data sets (T, F) on an attribute set S of size n, in which T (resp., F) ⊆ {0, 1}^n denotes a set of positive (resp., negative) examples for the phenomenon under consideration. In this paper, we consider the case in which the extracted knowledge has a decomposable structure; i.e., f can be written in the form f(x) = g(x[S_0], h(x[S_1])) for some S_0, S_1 ⊆ S and Boolean functions g and h, where x[I] denotes the projection of vector x on I. To detect meaningful decomposable structures, the sizes |T| and |F| are expected to have to be sufficiently large. We provide an index for this indispensable number of examples, based on probabilistic analysis. Using p = |T|/(|T| + |F|) and q = |F|/(|T| + |F|), we claim that there exist many deceptive decomposable structures of (T, F) if |T| + |F| ≤ √(2^(n-1)/(pq)). Computational results on synthetically generated data sets show that this index gives a good lower bound on the indispensable data size.
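
The index stated in the abstract can be evaluated directly from n, |T| and |F|. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names `data_size_index`, `below_index` and `project` are hypothetical, examples are assumed to be 0/1 tuples, and the bound is read as √(2^(n-1)/(pq)).

```python
import math


def data_size_index(n, num_pos, num_neg):
    """Return sqrt(2^(n-1) / (p*q)) for n attributes, |T| = num_pos, |F| = num_neg.

    Hypothetical helper: p and q are the fractions of positive and
    negative examples, as defined in the abstract.
    """
    if num_pos <= 0 or num_neg <= 0:
        raise ValueError("both T and F must be non-empty")
    total = num_pos + num_neg
    p = num_pos / total
    q = num_neg / total
    return math.sqrt(2 ** (n - 1) / (p * q))


def below_index(n, num_pos, num_neg):
    """True if |T| + |F| does not exceed the index, i.e. the regime in which
    the abstract claims many deceptive decomposable structures exist."""
    return num_pos + num_neg <= data_size_index(n, num_pos, num_neg)


def project(x, index_set):
    """Projection x[I]: keep the coordinates of x whose positions are in I,
    in increasing order of position (0-based positions are an assumption)."""
    return tuple(x[i] for i in sorted(index_set))


if __name__ == "__main__":
    # n = 11 attributes with balanced data: p = q = 0.5, so the index is
    # sqrt(2^10 / 0.25) = sqrt(4096) = 64.
    print(data_size_index(11, 50, 50))
    print(below_index(11, 20, 20))   # 40 examples, below the index
    print(below_index(11, 50, 50))   # 100 examples, above the index
    print(project((1, 0, 1, 1), {0, 2}))
```

For balanced data (p = q = 1/2) the index simplifies to √(2^(n+1)), so it grows by a factor of √2 with every additional attribute.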