Time series forecasting is challenging because sophisticated forecast models are computationally expensive to build. Recent research has addressed the integration of forecasting inside a DBMS. One main benefit is that models can be created once and then used repeatedly to answer forecast queries. Forecast queries are often submitted on higher aggregation levels, e.g., forecasts of sales over all locations. To answer such a forecast query, we have two possibilities. First, we can aggregate all base time series (sales in Austria, sales in Belgium, ...) and create only one model for the aggregate time series. Second, we can create models for all base time series and aggregate the base forecast values. The second possibility might lead to higher accuracy, but it is usually too expensive due to the large number of base time series. However, we do not actually need all base models to achieve high accuracy; a sample of base models is enough. With this approach, we still achieve better accuracy than with a single aggregate model, very similar to using all base models, but fewer models need to be created and maintained in the database. We further improve this approach when new actual values of the base time series arrive at different points in time: with each new actual value, we can refine the aggregate forecast and eventually converge towards the true value. Our experimental evaluation on several real-world data sets shows the high accuracy of our approaches and a fast convergence towards the optimal value with increasing sample size and an increasing number of actual values, respectively.
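The two ideas in the abstract, estimating the aggregate forecast from a sample of base models and refining it as actual values arrive, can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation; the function names, the uniform-sampling scale factor, and the toy data are all assumptions.

```python
def sample_based_aggregate_forecast(base_forecasts, sample_idx, n_total):
    """Estimate the aggregate forecast from a sample of base models.

    base_forecasts -- per-series forecast values (only sampled entries used)
    sample_idx     -- indices of the k sampled base time series
    n_total        -- total number n of base time series

    Scales the sample sum by n/k, assuming a representative uniform
    sample (a Horvitz-Thompson-style estimator; hypothetical choice).
    """
    k = len(sample_idx)
    sample_sum = sum(base_forecasts[i] for i in sample_idx)
    return sample_sum * n_total / k


def refine_with_actuals(base_forecasts, actuals):
    """Refine the aggregate forecast as actual values arrive.

    actuals -- dict mapping series index -> observed actual value;
               series without an actual keep their forecast value.

    Once actuals are known for all base series, the result equals the
    true aggregate, illustrating the convergence claimed above.
    """
    return sum(actuals.get(i, f) for i, f in enumerate(base_forecasts))


# Hypothetical example: four base series, two of them sampled.
forecasts = [10.0, 12.0, 8.0, 11.0]
est = sample_based_aggregate_forecast(forecasts, [0, 2], 4)   # (10+8)*4/2 = 36.0
refined = refine_with_actuals(forecasts, {1: 13.0})           # 10+13+8+11 = 42.0
```

As more actual values populate the `actuals` dict, fewer terms of the sum come from forecasts, so the refined aggregate converges to the true actual aggregate, mirroring the convergence behavior described in the evaluation.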