Searching time series with Hadoop in an electric power company
Proceedings of the 2nd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications
This paper proposes an efficient parallel algorithm for searching large time series databases. Existing parallel algorithms for this task generally rely on multidimensional tree structures and are therefore limited by the performance of those trees. A number of serial algorithms have also been proposed over the past decade. Most of them apply a transformation to reduce dimensionality and then build an index to facilitate the search, which again results in performance degradation. This work develops a parallel algorithm to process range queries and k-nearest neighbor queries in parallel time series databases, assuming a shared-nothing multiprocessor architecture. Both analytical and experimental results show that the new approach achieves near-linear scaleup and linear speedup with little more effort than a non-indexed sequential scan, and is thus an alternative to index-based approaches.
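The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: a non-indexed scan over disjoint partitions, one per shared-nothing node, with a coordinator merging the partial answers. Euclidean distance, round-robin partitioning, and every name here (`partition`, `local_range`, `local_knn`, `range_query`, `knn_query`) are assumptions made for illustration, not the authors' method. The nodes are simulated by a plain loop; a real deployment would run each local scan on its own processor.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def partition(series, n_nodes):
    """Round-robin split: each simulated node owns a disjoint share of the data."""
    return [series[i::n_nodes] for i in range(n_nodes)]


def local_range(part, query, eps):
    """Sequential scan of one partition: ids of series within eps of the query."""
    return [sid for sid, s in part if euclidean(s, query) <= eps]


def local_knn(part, query, k):
    """Sequential scan of one partition: its k best (distance, id) pairs."""
    return heapq.nsmallest(k, ((euclidean(s, query), sid) for sid, s in part))


def range_query(parts, query, eps):
    """Coordinator: the union of the local range answers is the global answer."""
    return sorted(sid for part in parts for sid in local_range(part, query, eps))


def knn_query(parts, query, k):
    """Coordinator: the global top-k is contained in the union of local top-k lists."""
    candidates = [c for part in parts for c in local_knn(part, query, k)]
    return [sid for _, sid in heapq.nsmallest(k, candidates)]


# Toy database of (id, series) pairs, hypothetical data for illustration only.
database = [
    ("a", [0.0, 0.0, 0.0]),
    ("b", [1.0, 0.0, 0.0]),
    ("c", [0.0, 2.0, 0.0]),
    ("d", [3.0, 3.0, 3.0]),
    ("e", [0.0, 0.0, 0.5]),
]
parts = partition(database, 2)
in_range = range_query(parts, [0.0, 0.0, 0.0], 1.0)
nearest = knn_query(parts, [0.0, 0.0, 0.0], 2)
```

Because the partitions are disjoint and each node scans only its own share, the per-node work shrinks roughly in proportion to the number of nodes, which is the intuition behind the speedup claim; the merge step handles at most k candidates per node for a k-nearest neighbor query.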