Myriad: scalable and expressive data generation

  • Authors: Alexander Alexandrov; Kostas Tzoumas; Volker Markl

  • Affiliations: Technische Universität Berlin, Germany (all authors)

  • Venue: Proceedings of the VLDB Endowment
  • Year: 2012

Abstract

The current research focus on Big Data systems calls for a rethinking of data generation methods. The traditional sequential approach to data generation is poorly suited to large-scale systems, since generating a terabyte of data may take days or even weeks, depending on the number of constraints imposed on the generated model. We demonstrate Myriad, a new data generation toolkit for specifying semantically rich data generator programs that scale out linearly in a shared-nothing environment. Data generation programs built on top of Myriad implement an efficient parallel execution strategy, enabled by the extensive use of pseudo-random number generators with random access support.
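
The role of random access in the parallel strategy can be illustrated with a minimal sketch. It is not Myriad's actual generator or API: the hash-based stand-in below (built on the SplitMix64 mixing function) and the names `random_at` and `generate_partition` are assumptions made for illustration. The point it shows is that when the value at any position can be computed directly, each worker in a shared-nothing setup can produce its own slice of the data without coordinating with the others, and the result matches a sequential run.

```python
MASK64 = (1 << 64) - 1


def mix64(x: int) -> int:
    """SplitMix64 mixing function: scrambles a 64-bit integer."""
    x = (x + 0x9E3779B97F4A7C15) & MASK64
    x = ((x ^ (x >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    x = ((x ^ (x >> 27)) * 0x94D049BB133111EB) & MASK64
    return x ^ (x >> 31)


def random_at(seed: int, position: int) -> float:
    """Return the pseudo-random value in [0, 1) at `position`.

    Hypothetical stand-in for a PRNG with random access: the value depends
    only on (seed, position), so no earlier values need to be generated.
    """
    bits = mix64((seed & MASK64) ^ mix64(position & MASK64))
    return (bits >> 11) / float(1 << 53)  # keep the top 53 bits


def generate_partition(seed: int, total_rows: int, worker: int, num_workers: int):
    """Generate the contiguous slice of rows assigned to one worker."""
    start = worker * total_rows // num_workers
    end = (worker + 1) * total_rows // num_workers
    return [(row_id, random_at(seed, row_id)) for row_id in range(start, end)]


if __name__ == "__main__":
    # Four independent workers together reproduce the sequential output.
    sequential = generate_partition(seed=42, total_rows=1000, worker=0, num_workers=1)
    parallel = [
        row
        for w in range(4)
        for row in generate_partition(seed=42, total_rows=1000, worker=w, num_workers=4)
    ]
    print(sequential == parallel)
```

With a conventional sequential generator, a worker starting at row 750 would first have to advance through the preceding 750 values or share state with other workers; computing the value from the position avoids both.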