Distributed XML Processing

Authors:
M. Tamer Özsu
Affiliations:
Cheriton School of Computer Science, University of Waterloo
Venue:
APWeb/WAIM '09 Proceedings of the Joint International Conferences on Advances in Data and Web Management
Year:
2009

Citing 0
Cited 1

XML structural similarity search using MapReduce

WAIM'10 Proceedings of the 11th international conference on Web-age information management

Abstract

XML is commonly used to store data and to exchange it between a variety of systems. While centralized querying of XML data is increasingly well understood, the same is not true when the data is spread across multiple nodes in a distributed system. Since XML data collections are growing in size, along with the heavy workloads that must be evaluated over them, scaling a centralized solution is becoming increasingly difficult. A common method for addressing this issue is to distribute the data and parallelize query execution. This is well understood in relational databases, but the issues are more complicated for XML data because of the complexity of the data representation and the flexibility of the schema definition. In this talk, I will introduce our new project to systematically study distributed XML processing issues. The talk will focus on data fragmentation and localization issues. This is joint work with Patrick Kling.