In this paper, we investigate what the Hadoop ecosystem offers for searching time series at an electric power company, using Top-K or range queries based on a similarity measure. Much prior work has sped up time series search in large datasets, mainly by designing efficient indexing techniques preceded by reduction techniques. We do not follow these approaches; instead, we rely on the brute force of distributed computation in the Hadoop environment. We propose an implementation of time series search functions in Hadoop and describe experiments on a large database of electric power consumption curves (35 million customers observed over one month at a 30-minute sampling interval). We also show that this architecture easily supports computing several distances for the same query with only a small response-time overhead, which is very useful in practice when the end user is unsure which distance to use.
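The brute-force idea described above can be illustrated with a minimal sketch. It is not the authors' Hadoop implementation: plain Python functions stand in for the map and reduce phases, the function names (`map_partition`, `reduce_top_k`, `top_k_search`) are hypothetical, and the two distances (Euclidean and Manhattan) are assumptions chosen only for illustration, since the abstract does not name the distances used. Each partition is scanned once, every distance is evaluated during that single scan, and only the local k best candidates per distance are passed on to be merged.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Assumed set of distances; any function (query, series) -> float could be added.
DISTANCES = {"euclidean": euclidean, "manhattan": manhattan}


def map_partition(partition, query, k):
    """Map phase stand-in: scan one partition once, scoring every distance."""
    scored = {name: [] for name in DISTANCES}
    for series_id, values in partition:
        for name, dist in DISTANCES.items():
            scored[name].append((dist(query, values), series_id))
    # Keep only the k closest candidates per distance for this partition.
    return {name: heapq.nsmallest(k, pairs) for name, pairs in scored.items()}


def reduce_top_k(partials, k):
    """Reduce phase stand-in: merge local candidates into a global top-k."""
    merged = {name: [] for name in DISTANCES}
    for partial in partials:
        for name, pairs in partial.items():
            merged[name].extend(pairs)
    return {name: heapq.nsmallest(k, pairs) for name, pairs in merged.items()}


def top_k_search(partitions, query, k):
    """Run the map stand-in on every partition, then merge the results."""
    return reduce_top_k([map_partition(p, query, k) for p in partitions], k)


if __name__ == "__main__":
    # Toy data: (series_id, values) pairs split across two partitions.
    partitions = [
        [("a", [0, 0]), ("e", [0, 3])],
        [("f", [2, 2]), ("d", [10, 10])],
    ]
    for name, hits in top_k_search(partitions, [0, 0], k=2).items():
        print(name, [series_id for _, series_id in hits])
```

Because each series is read once and all distances are computed in that same pass, adding a distance costs extra arithmetic but no extra data scan, which is consistent with the small overhead the abstract reports. A range query would follow the same pattern, with the per-partition step keeping every series whose distance falls below a threshold instead of the k smallest.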