Query processing on cubes mapped from ontologies to dimension hierarchies

  • Authors: Carlos Garcia-Alvarado; Carlos Ordonez

  • Affiliations: University of Houston / EMC Greenplum, Houston, TX, USA; University of Houston, Houston, TX, USA

  • Venue: Proceedings of the Fifteenth International Workshop on Data Warehousing and OLAP
  • Year: 2012

Abstract

Text columns commonly extend the core information stored as atomic values in a relational database, creating a need to explore and summarize text data. OLAP cubes are well suited to such tasks. However, cubes have been overlooked as a mechanism not only for capturing text summarizations, but also for representing and exploring the hierarchical structure of an ontology. In this paper, we focus on exploiting cubes to compute multidimensional aggregations on classified documents stored in a DBMS (keyword frequency, document count, document class frequency, and so on). We propose CUBO (CUBed Ontologies), a novel algorithm that efficiently manipulates the hierarchy behind an ontology. Our algorithm is optimized to compute the desired summarizations without searching all possible dimension combinations, by exploiting the sparseness of the document classification frequency matrix. Experiments on large text data sets show that CUBO can explore more dimension combinations faster than a standard cube algorithm, especially when the cube has a large number of dimensions. CUBO was developed entirely inside a DBMS, using SQL queries and extensibility features.
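
To make the idea of aggregating along an ontology hierarchy while exploiting sparseness more concrete, the following is a minimal Python sketch. It is not the CUBO algorithm and does not reflect the authors' SQL implementation; the ontology, the document classifications, and all function names are hypothetical and chosen only for illustration. It counts distinct documents per concept, rolled up to every ancestor, by visiting only the (document, concept) pairs that actually occur.

```python
from collections import defaultdict

# Hypothetical ontology: child concept -> parent concept (None marks the root).
PARENT = {
    "databases": "computing",
    "olap": "databases",
    "text_mining": "computing",
    "computing": None,
}

# Hypothetical sparse classification: only (document, concept) pairs that exist.
DOC_CLASSES = [
    (1, "olap"),
    (2, "olap"),
    (2, "text_mining"),
    (3, "databases"),
]


def ancestors_and_self(concept, parent):
    """Yield the concept followed by each of its ancestors up to the root."""
    while concept is not None:
        yield concept
        concept = parent[concept]


def rollup_document_counts(doc_classes, parent):
    """Count distinct documents per concept, rolled up the hierarchy.

    Only observed (document, concept) pairs are visited, so empty cells
    of the classification matrix are never touched.
    """
    docs_per_concept = defaultdict(set)
    for doc_id, concept in doc_classes:
        for node in ancestors_and_self(concept, parent):
            docs_per_concept[node].add(doc_id)
    return {concept: len(docs) for concept, docs in docs_per_concept.items()}


counts = rollup_document_counts(DOC_CLASSES, PARENT)
```

Sets are used so that a document classified under several descendants of the same concept is counted once at that concept. The work done is proportional to the number of observed pairs times the hierarchy depth, rather than to the full concept-by-document matrix.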