Logical analysis of data (LAD) is one of the methodologies for extracting knowledge, in the form of a Boolean function f, from a given pair of data sets (T, F) on an attribute set S of size n, where T (resp., F) ⊆ {0, 1}^n denotes a set of positive (resp., negative) examples of the phenomenon under consideration. In this paper, we consider the case in which the extracted knowledge has a decomposable structure; i.e., f can be written in the form f(x) = g(x[S0], h(x[S1])) for some S0, S1 ⊆ S and Boolean functions g and h, where x[I] denotes the projection of vector x onto I. To detect meaningful decomposable structures, the sizes |T| and |F| are expected to need to be sufficiently large. We provide an index for this indispensable number of examples, based on probabilistic analysis. Using p = |T|/(|T| + |F|) and q = |F|/(|T| + |F|), we claim that there exist many deceptive decomposable structures of (T, F) if |T| + |F| ≤ √(2^(n-1)/(pq)). Computational results on synthetically generated data sets show that this index gives a good lower bound on the indispensable data size.
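
As a rough illustration of how the index could be evaluated for a concrete data set, the sketch below computes the threshold from n, |T| and |F|. It assumes the bound is read as √(2^(n-1)/(pq)), with the square root covering the whole fraction; the function names `data_size_index` and `may_have_deceptive_structures` are hypothetical and do not come from the paper.

```python
import math


def data_size_index(n, num_pos, num_neg):
    """Return sqrt(2^(n-1) / (p*q)) for n attributes, |T| = num_pos, |F| = num_neg.

    Assumption: the square root is taken over the whole fraction 2^(n-1)/(pq).
    """
    total = num_pos + num_neg
    p = num_pos / total  # fraction of positive examples
    q = num_neg / total  # fraction of negative examples
    return math.sqrt(2 ** (n - 1) / (p * q))


def may_have_deceptive_structures(n, num_pos, num_neg):
    """True if |T| + |F| is at or below the index, i.e. the data set is
    small enough that many deceptive decomposable structures are claimed to exist."""
    return num_pos + num_neg <= data_size_index(n, num_pos, num_neg)


if __name__ == "__main__":
    # Hypothetical data set sizes, chosen only for illustration.
    print(data_size_index(11, 50, 50))
    print(may_have_deceptive_structures(11, 30, 30))
    print(may_have_deceptive_structures(11, 50, 50))
```

Because the index grows like 2^((n-1)/2), each additional attribute multiplies the required number of examples by about √2, and an unbalanced data set (small pq) raises it further.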