Spatiotemporal summarization of traffic data streams

  • Authors:
  • Bei Pan;Ugur Demiryurek;Farnoush Banaei-Kashani;Cyrus Shahabi

  • Affiliations:
  • University of Southern California, Los Angeles, CA;University of Southern California, Los Angeles, CA;University of Southern California, Los Angeles, CA;University of Southern California, Los Angeles, CA

  • Venue:
  • Proceedings of the ACM SIGSPATIAL International Workshop on GeoStreaming
  • Year:
  • 2010

Abstract

With resource-efficient summarization and accurate reconstruction of historical traffic sensor data, one can effectively manage and optimize transportation systems (e.g., road networks) to become smarter (better mobility, less congestion, less travel time, and less travel cost) and greener (less waste of fuel and less greenhouse gas production). The existing data summarization (and archival) techniques are generic and are not designed to leverage the unique characteristics of traffic data for effective data reduction. In this paper, we propose and explore a family of data summaries that take advantage of the high temporal and spatial redundancy/correlation among sensor readings from individual sensors and sensor groups, respectively, for effective data reduction. In particular, with these summaries we derive and maintain a "signature" as well as a series of "outliers" for the readings received from each individual sensor or group of co-located sensors. While signatures capture the typical readings that estimate the actual readings with bounded error, the outliers represent the actual readings where the error bound is violated. With the combination of signatures and outliers, our proposed data summaries can effectively represent the actual data with a much smaller storage footprint, while allowing for efficient querying of the sensor data with bounded error. Our experiments with a real traffic sensor dataset show that our proposed data summaries use only 23% of the storage space otherwise required for storing the actual data, while allowing for highly accurate query results with guaranteed precision.
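
The signature-plus-outliers idea for a single sensor can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not say how signatures are computed, so the sketch assumes a per-slot median over a repeating cycle (for example, one slot per time of day). The names `summarize`, `reconstruct`, `period`, and `epsilon` are hypothetical and chosen only for illustration.

```python
from statistics import median


def summarize(readings, period, epsilon):
    """Build a (signature, outliers) summary for one sensor.

    readings: values sampled at a fixed interval, with len(readings) >= period
    period:   number of samples per repeating cycle (assumed, e.g. one day)
    epsilon:  maximum tolerated absolute error for a signature estimate
    """
    # Signature: one typical value per slot of the cycle.
    # Using the median is an assumption of this sketch, not the paper's method.
    signature = [median(readings[slot::period]) for slot in range(period)]

    # Outliers: actual readings that the signature misses by more than epsilon.
    outliers = {
        i: value
        for i, value in enumerate(readings)
        if abs(value - signature[i % period]) > epsilon
    }
    return signature, outliers


def reconstruct(signature, outliers, n):
    """Rebuild n readings: stored outliers are exact, the rest use the signature."""
    period = len(signature)
    return [outliers.get(i, signature[i % period]) for i in range(n)]


# Toy data: a two-slot cycle (fast, slow) with one anomalous slow reading.
readings = [60, 30, 62, 31, 58, 10]
epsilon = 5
signature, outliers = summarize(readings, period=2, epsilon=epsilon)
approx = reconstruct(signature, outliers, len(readings))
```

In this toy run the summary stores two signature values and one outlier in place of six readings, and every reconstructed value is within `epsilon` of the original. The savings grow with the number of cycles as long as most readings stay close to the signature.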