Improving query performance on XML documents: a workload-driven design approach

  • Authors: Rebeca Schroeder; Ronaldo dos Santos Mello
  • Affiliations: Federal University of Santa Catarina, Florianópolis, Brazil (both authors)
  • Venue: Proceedings of the Eighth ACM Symposium on Document Engineering
  • Year: 2008

Abstract

As XML has emerged as a data representation format and large quantities of data are now stored in it, XML document design has become an important issue in several application contexts. Methodologies based on conceptual modeling are being applied to the design of XML documents. However, converting a conceptual schema to an XML schema is a complex process. In many cases, conceptual relationships cannot be represented hierarchically and must instead be represented by reference relationships in the XML schema. The problem is that reference relationships produce a disconnected XML structure and, consequently, add overhead to query processing on XML documents. This paper presents a design approach for generating XML schemas from conceptual schemas that takes into account the expected workload of the XML applications. The query workload is used to produce XML schemas that minimize the impact of reference relationships on query performance. We evaluate our approach through a case study in which a set of XML documents is redesigned using our methodology. The results demonstrate that query performance improves, measured as the number of accesses generated by the queries, on the XML documents designed by our approach.
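
To make the cost of reference relationships concrete, the following is a minimal sketch and not the authors' method or measurement. It stores the same hypothetical customer/order relationship in two layouts, one nested and one linked by an id attribute, and counts how many elements a naive query examines in each. The data, the function names `orders_nested` and `orders_referenced`, and the "elements examined" counter are all assumptions made for illustration; the paper's own notion of "number of accesses" may be defined differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: the relationship is expressed by nesting.
NESTED = """
<customers>
  <customer name="Ana">
    <order id="o1"/>
    <order id="o2"/>
  </customer>
  <customer name="Luis">
    <order id="o3"/>
  </customer>
</customers>
"""

# Hypothetical data: the same relationship expressed by a reference attribute.
REFERENCED = """
<db>
  <customer id="c1" name="Ana"/>
  <customer id="c2" name="Luis"/>
  <order id="o1" customer="c1"/>
  <order id="o2" customer="c1"/>
  <order id="o3" customer="c2"/>
</db>
"""


def orders_nested(root, name):
    """Return (order ids, elements examined) for the nested layout."""
    examined = 0
    result = []
    for customer in root:
        examined += 1
        if customer.get("name") == name:
            # The orders are children, so only this customer's orders are read.
            for order in customer:
                examined += 1
                result.append(order.get("id"))
            break
    return result, examined


def orders_referenced(root, name):
    """Return (order ids, elements examined) for the reference layout."""
    examined = 0
    result = []
    customer_id = None
    for customer in root.findall("customer"):
        examined += 1
        if customer.get("name") == name:
            customer_id = customer.get("id")
            break
    # Without an index, every order must be checked against the reference.
    for order in root.findall("order"):
        examined += 1
        if order.get("customer") == customer_id:
            result.append(order.get("id"))
    return result, examined


nested_result = orders_nested(ET.fromstring(NESTED), "Luis")
referenced_result = orders_referenced(ET.fromstring(REFERENCED), "Luis")
print(nested_result)      # (['o3'], 3)
print(referenced_result)  # (['o3'], 5)
```

In this toy case the nested layout answers the query after examining 3 elements, while the reference layout examines 5, because resolving the reference requires scanning all orders. Which relationships are worth nesting depends on which queries dominate, which is the intuition behind using the expected workload to guide schema design.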