Random databases with correlated data

  • Authors: Gyula O. H. Katona
  • Affiliations: Rényi Institute, Budapest, Hungary
  • Venue: Conceptual Modelling and Its Theoretical Foundations
  • Year: 2012

Abstract

A model of random databases is given that allows arbitrary correlations among the data of one individual. These correlations are described by a joint distribution function. The individuals are chosen independently, and their number $m$ is assumed to be (approximately) known. The probability that a given functional dependency $A\rightarrow b$ holds (where $A$ is a set of attributes and $b$ is an attribute) is determined in a limiting sense. This probability is small if $m$ is much larger than $2^{H_2(A\rightarrow b)/2}$ and large if $m$ is much smaller than $2^{H_2(A\rightarrow b)/2}$, where $H_2(A\rightarrow b)$ is an entropy-like functional of the probability distribution of the data.
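
The threshold behaviour can be illustrated with a small simulation. The following Python sketch is not taken from the paper. The abstract does not define $H_2(A\rightarrow b)$, so the code assumes a collision-type reading: $H_2(A\rightarrow b) = -\log_2 q$, where $q$ is the probability that two independently drawn rows agree on $A$ but differ on $b$. All function names and the toy joint distribution are hypothetical, and the result in the paper is asymptotic, so a small example like this one is only indicative.

```python
import math
import random
from collections import Counter


def fd_holds(rows, a_idx, b_idx):
    """Return True if any two rows that agree on the attributes a_idx also agree on b_idx."""
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in a_idx)
        # Remember the first b-value seen for this A-value; any later mismatch violates A -> b.
        if seen.setdefault(key, row[b_idx]) != row[b_idx]:
            return False
    return True


def violation_probability(joint, a_idx, b_idx):
    """Probability q that two independent rows agree on A but differ on b.

    ASSUMPTION: -log2(q) is used here as a stand-in for H_2(A -> b);
    the abstract itself does not give the definition.
    """
    p_a = Counter()
    p_ab = Counter()
    for row, p in joint.items():
        key = tuple(row[i] for i in a_idx)
        p_a[key] += p
        p_ab[(key, row[b_idx])] += p
    agree_on_a = sum(p * p for p in p_a.values())
    agree_on_a_and_b = sum(p * p for p in p_ab.values())
    return agree_on_a - agree_on_a_and_b


def estimate_fd_probability(joint, a_idx, b_idx, m, trials=500, seed=0):
    """Monte Carlo estimate of P(A -> b holds) for m independently chosen rows."""
    rng = random.Random(seed)
    rows, weights = zip(*joint.items())
    hits = 0
    for _ in range(trials):
        sample = rng.choices(rows, weights=weights, k=m)
        hits += fd_holds(sample, a_idx, b_idx)
    return hits / trials


# Hypothetical joint distribution of one individual's data (a, b):
# a is uniform on 0..63, and b equals a % 2 with probability 0.9, otherwise the opposite bit.
N = 64
joint = {}
for a in range(N):
    joint[(a, a % 2)] = 0.9 / N
    joint[(a, 1 - a % 2)] = 0.1 / N

q = violation_probability(joint, a_idx=[0], b_idx=1)
h2 = -math.log2(q)            # assumed form of H_2(A -> b)
threshold = 2 ** (h2 / 2)     # equals q ** -0.5

print(f"assumed H_2 = {h2:.3f}, threshold m = {threshold:.1f}")
for m in (4, 150):
    print(m, estimate_fd_probability(joint, [0], 1, m))
```

Under this assumption the expected number of violating row pairs is roughly $m^2 q/2$, which is why the transition sits near $m \approx q^{-1/2} = 2^{H_2(A\rightarrow b)/2}$: well below that size the dependency almost always holds in the sample, and well above it the dependency almost never does.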