Generation of synthetic XML for evaluation of hybrid XML systems

Authors:
David Hall; Lena Strömbäck
Affiliations:
Linköpings universitet, Linköping, Sweden; Linköpings universitet, Linköping, Sweden
Venue:
DASFAA'10 Proceedings of the 15th international conference on Database systems for advanced applications
Year:
2010

Abstract

Hybrid XML storage offers a large number of alternative shredding choices. To determine optimal shredding strategies automatically, it is crucial to understand how the structure of an XML data set affects performance. Since the structure can take many forms and the number of possible mappings is huge, it is important to gain insight into the relation between structure and performance for formats that are actually used. By taking real-world data sets and modifying their structure in steps, one can see how performance and other measurable properties change. We describe how a data generator can be used to produce a synthetic data set based on an existing data set, using four different models. We compare the performance on the original data set with the performance on the synthetic data sets generated with the different models.