A Model for Distributing and Querying a Data Warehouse on a Computing Grid

Authors:
Pascal Wehrle;Maryvonne Miquel;Anne Tchounikine
Affiliations:
LIRIS - INSA de Lyon;LIRIS - INSA de Lyon;LIRIS - INSA de Lyon
Venue:
ICPADS '05 Proceedings of the 11th International Conference on Parallel and Distributed Systems - Volume 01
Year:
2005

Citing 0
Cited 5

Experimenting the Query Performance of a Grid-Based Sensor Network Data Warehouse
Globe '08 Proceedings of the 1st international conference on Data Management in Grid and Peer-to-Peer Systems

Data mining-based fragmentation of XML data warehouses
Proceedings of the ACM 11th international workshop on Data warehousing and OLAP

Data Transformation Services over Grids with Real-Time Bound Constraints
OTM '08 Proceedings of the OTM 2008 Confederated International Conferences, CoopIS, DOA, GADA, IS, and ODBASE 2008. Part I on On the Move to Meaningful Internet Systems:

Enhancing XML data warehouse query performance by fragmentation
Proceedings of the 2009 ACM symposium on Applied Computing

Fragmenting very large XML data warehouses via K-means clustering algorithm
International Journal of Business Intelligence and Data Mining

Abstract

Data warehouses store large volumes of data according to a multidimensional model, with dimensions representing different axes of analysis. OLAP (Online Analytical Processing) systems provide the ability to interactively explore the data warehouse. Rising volumes and complexity of data favor the use of more powerful distributed computing architectures. Computing grids in particular are built for decentralized management of heterogeneous distributed resources. Their lack of centralized control, however, conflicts with classic centralized data warehouse models. To take advantage of a computing grid infrastructure to operate a data warehouse, several problems need to be solved. First, the data warehouse must be uniquely identified and judiciously partitioned to allow efficient distribution, querying, and exchange among the nodes of the grid. We propose a data model based on "chunks" as atomic entities of warehouse data that can be uniquely identified. We then build contiguous blocks of these chunks to obtain suitable fragments of the data warehouse. The fragments stored on each grid node must be indexed in a uniform way to effectively interact with existing grid services. Our indexing structure consists of a lattice structure mapping queries to warehouse fragments and a specialized spatial index structure formed by X-trees, providing the information necessary for optimized query evaluation plans.
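
The chunk and block idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how chunks are identified, so the sketch assumes each chunk is named by one (dimension, hierarchy level, member index) triple per dimension, and that a contiguous block is one inclusive index range per dimension. The names `ChunkId`, `Block`, `chunk_ids`, and `covers`, and the example dimensions, are hypothetical.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterator, Tuple

# Hypothetical chunk identifier (an assumption, not taken from the paper):
# one (dimension, hierarchy level, member index) triple per dimension.
ChunkId = Tuple[Tuple[str, str, int], ...]


@dataclass(frozen=True)
class Block:
    """A contiguous block of chunks: one inclusive index range per dimension."""
    levels: Tuple[Tuple[str, str], ...]  # (dimension, hierarchy level), fixed order
    ranges: Tuple[Tuple[int, int], ...]  # inclusive (low, high) range per dimension

    def chunk_ids(self) -> Iterator[ChunkId]:
        """Enumerate the identifier of every chunk inside the block."""
        axes = [range(lo, hi + 1) for lo, hi in self.ranges]
        for coords in product(*axes):
            yield tuple(
                (dim, lvl, idx) for (dim, lvl), idx in zip(self.levels, coords)
            )

    def covers(self, other: "Block") -> bool:
        """True if `other` lies entirely inside this block at the same levels."""
        return self.levels == other.levels and all(
            lo <= olo and ohi <= hi
            for (lo, hi), (olo, ohi) in zip(self.ranges, other.ranges)
        )


# A fragment held by some grid node, and a query over a sub-range of it.
levels = (("time", "month"), ("region", "city"))
fragment = Block(levels=levels, ranges=((0, 2), (10, 11)))
query = Block(levels=levels, ranges=((1, 2), (10, 10)))

ids = list(fragment.chunk_ids())
can_answer_locally = fragment.covers(query)
```

Under these assumptions, a node can decide whether a stored fragment fully answers a query with a per-dimension range comparison, without enumerating individual chunks. The lattice and X-tree index structures named in the abstract are not modeled here.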