A Frequency-based Approach for Mining Coverage Statistics in Data Integration

  • Authors:
  • Zaiqing Nie;Subbarao Kambhampati

  • Affiliations:
  • -;-

  • Venue:
  • ICDE '04 Proceedings of the 20th International Conference on Data Engineering
  • Year:
  • 2004

Abstract

Query optimization in data integration requires source coverage and overlap statistics. Gathering and storing the required statistics presents many challenges, not the least of which is controlling the amount of statistics learned. In this paper we introduce StatMiner, a novel statistics mining approach which automatically generates attribute value hierarchies, efficiently discovers frequently accessed query classes based on the learned attribute value hierarchies, and learns statistics only with respect to these classes. We describe the details of our method, and present experimental results demonstrating the efficiency and effectiveness of our approach. Our experiments are done in the context of BibFinder, a publicly fielded bibliography mediator.