Improving the performance and functionality of Mondrian open-source OLAP systems

  • Authors:
  • Pablo Sendín-Raña; Francisco J. González-Castaño; Enrique Pérez-Barros; Pedro S. Rodríguez-Hernández; Felipe Gil-Castiñeira; José M. Pousada-Carballo

  • Affiliations:
  • Departamento de Ingeniería Telemática, Universidad de Vigo, Campus, 36310 Vigo, Spain; Departamento de Ingeniería Telemática, Universidad de Vigo, Campus, 36310 Vigo, Spain; SAEC-DATA SA, C. Gran Vía 94-1, 36203 Vigo, Spain; Departamento de Ingeniería Telemática, Universidad de Vigo, Campus, 36310 Vigo, Spain; Departamento de Ingeniería Telemática, Universidad de Vigo, Campus, 36310 Vigo, Spain; Departamento de Ingeniería Telemática, Universidad de Vigo, Campus, 36310 Vigo, Spain

  • Venue:
  • Software—Practice & Experience
  • Year:
  • 2009

Abstract

For a long time, the design of relational databases has focused on the optimization of atomic transactions (insert, select, update or delete). Currently, relational databases store the tactical information of data warehouses, mainly for select-like operations. However, the database paradigm has evolved, and nowadays on-line analytical processing (OLAP) systems handle strategic information for further analysis. These systems enable fast, interactive and consistent information analysis of data warehouses, including shared calculations and allocations. OLAP and data warehouses jointly allow multidimensional data views, turning raw data into knowledge. OLAP allows ‘slice and dice’ navigation and a top-down perspective of data hierarchies. In this paper, we describe our experience in the migration from a large relational database management system to an OLAP system on top of a relational layer (the data warehouse), and the resulting contributions in open-source ROLAP optimization. Existing open-source ROLAP technologies rely on summarized tables with materialized aggregate views to improve system performance (in terms of response time). The design and maintenance of those tables are cumbersome. Instead, we intensively exploit cache memory, where key data reside, yielding low response times. A cold-start process loads summarized data from the relational database into cache memory, which reduces the response time of subsequent queries. We ensure concurrent access to the summarized data, as well as consistency when data in the relational database are updated. We also improve the OLAP functionality by providing new features for automating the creation of calculated members. This makes it possible to define new measures on the fly using virtual dimensions, without re-designing the multidimensional cube. For service provision, we have chosen the de facto standard XML for Analysis (XML-A). Copyright © 2008 John Wiley & Sons, Ltd.
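
The cache-based approach outlined in the abstract (a cold start that loads summarized data into memory, concurrent access, and consistency on updates) can be illustrated with a minimal sketch. This is not the authors' implementation and does not use Mondrian's API. The `AggregateCache` class, the `sales` table, and its `region` and `amount` columns are hypothetical names chosen for illustration, and an in-memory SQLite database stands in for the relational layer.

```python
import sqlite3
import threading


class AggregateCache:
    """Hypothetical in-memory cache of summarized values, keyed by region."""

    def __init__(self, conn):
        self._conn = conn
        # A single lock serializes readers and writers of the cached totals.
        self._lock = threading.Lock()
        self._totals = {}

    def cold_start(self):
        """Load the summarized data from the relational layer into memory."""
        rows = self._conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region"
        ).fetchall()
        with self._lock:
            self._totals = dict(rows)

    def total(self, region):
        """Answer a query from memory, without touching the database."""
        with self._lock:
            return self._totals.get(region, 0)

    def record_sale(self, region, amount):
        """Insert a row and update the cached aggregate under the same lock."""
        with self._lock:
            self._conn.execute(
                "INSERT INTO sales VALUES (?, ?)", (region, amount)
            )
            self._conn.commit()
            self._totals[region] = self._totals.get(region, 0) + amount


# Assumed toy data: a stand-in for the data warehouse fact table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10), ("north", 5), ("south", 7)],
)
conn.commit()

cache = AggregateCache(conn)
cache.cold_start()
north_before = cache.total("north")
cache.record_sale("south", 3)
south_after = cache.total("south")
print(north_before, south_after)
```

In this sketch the aggregate is updated incrementally when a row is inserted. A real system might instead invalidate the affected cache entries and recompute them, and the abstract does not say which strategy the authors use.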