Approximate Aggregations in Trajectory Data Warehouses

  • Authors:
  • F. Braz; S. Orlando; R. Orsini; A. Raffaeta; A. Roncato; C. Silvestri

  • Affiliations:
  • Dipartimento di Informatica, Università Ca' Foscari di Venezia, Italy. fbraz@dsi.unive.it; Dipartimento di Informatica, Università Ca' Foscari di Venezia, Italy. orlando@dsi.unive.it; Dipartimento di Informatica, Università Ca' Foscari di Venezia, Italy. orsini@dsi.unive.it; Dipartimento di Informatica, Università Ca' Foscari di Venezia, Italy. raffaeta@dsi.unive.it; Dipartimento di Informatica, Università Ca' Foscari di Venezia, Italy. roncato@dsi.unive.it; Dipartimento di Informatica, Università Ca' Foscari di Venezia, Italy. silvestri@dsi.unive.it

  • Venue:
  • ICDEW '07 Proceedings of the 2007 IEEE 23rd International Conference on Data Engineering Workshop
  • Year:
  • 2007

Abstract

In this paper we discuss how data warehousing technology can be used to store aggregate information about trajectories and to perform OLAP operations over them. To this end, we define a data cube with spatial and temporal dimensions, discretized according to a regular grid. We investigate in depth some issues related to the computation of a holistic aggregate function, namely presence, which returns the number of distinct trajectories occurring in a given spatio-temporal area. In particular, we introduce a novel way to compute an approximate, but nevertheless very accurate, presence aggregate function, which uses only a bounded number of measures stored in the base cells of our cuboid. We also concentrate on the loading phase of our data warehouse, which has to deal with an unbounded stream of trajectory observations. We suggest how the complexity of this phase can be reduced, and we analyse the errors that this procedure induces at the level of the sub-aggregates stored in the base cells. These errors and the accuracy of our approximate aggregate functions are carefully evaluated by means of tests performed on synthetic trajectory datasets.
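
To illustrate why presence is holistic, the following minimal Python sketch compares the exact distinct count over a region with the naive roll-up that sums per-cell counts. It is not the method proposed in the paper: the observations, the cell size `CELL`, and the helper names `cell_of`, `base_cells`, `exact_presence` and `naive_presence` are all assumptions made for this example, and the temporal dimension is collapsed into a single interval for brevity.

```python
from collections import defaultdict

# Hypothetical observations (trajectory_id, x, y, t); values are made up.
observations = [
    ("a", 0.5, 0.5, 0), ("a", 1.5, 0.5, 1),  # "a" moves from cell (0, 0) to (1, 0)
    ("b", 0.2, 0.8, 0), ("b", 0.4, 0.6, 1),  # "b" stays inside cell (0, 0)
    ("c", 1.7, 0.3, 0),                      # "c" is seen only in cell (1, 0)
]

CELL = 1.0  # assumed side length of a regular grid cell


def cell_of(x, y):
    """Map a point to the index of the grid cell that contains it."""
    return (int(x // CELL), int(y // CELL))


def base_cells(obs):
    """Group trajectory ids by grid cell (time is ignored in this sketch)."""
    cells = defaultdict(set)
    for tid, x, y, _t in obs:
        cells[cell_of(x, y)].add(tid)
    return cells


def exact_presence(cells, region):
    """Exact presence: needs the full id sets, which grow without bound."""
    ids = set()
    for c in region:
        ids |= cells.get(c, set())
    return len(ids)


def naive_presence(cells, region):
    """Sum of per-cell counts: counts a trajectory once per cell it visits."""
    return sum(len(cells.get(c, set())) for c in region)


cells = base_cells(observations)
region = [(0, 0), (1, 0)]
print(exact_presence(cells, region))  # 3 distinct trajectories
print(naive_presence(cells, region))  # 4, because "a" is counted in both cells
```

The gap between the two results is the double counting that an approximate presence function has to correct while storing only a bounded number of measures per base cell, rather than the full sets of trajectory identifiers.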