Proceedings of the VLDB Endowment
Sorting hierarchical data in external memory is necessary for a wide variety of applications, including archiving scientific data and processing large XML datasets. The topic, however, has received little attention from the research community so far. In this paper we focus on sorting arbitrary hierarchical data that far exceed the size of physical memory. We propose HErMeS, an algorithm that generalizes the most widely used techniques for sorting flat data in external memory. HErMeS exploits the hierarchical structure to minimize the number of disk accesses and to optimize the use of available memory. We derive theoretical bounds for the algorithm with respect to the structure of the hierarchical dataset. We then show how the algorithm can be used to support efficient archiving. We have conducted an experimental study using several workloads, comparing HErMeS to state-of-the-art approaches. Our results show that our algorithm (a) meets its theoretical expectations, (b) allows for scalable database archiving, and (c) outperforms the competition by a significant factor. We believe these results show our technique to be a viable and scalable solution to the problem of sorting hierarchical data in external memory.
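To make the problem statement concrete, the following is a minimal in-memory sketch of what a sorted hierarchy looks like, under the assumption that "sorted" means the children of every node are ordered by key. It is not HErMeS: it ignores external memory entirely, and the `Node` class, the `sort_tree` and `to_tuple` helpers, and the sample keys are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical tree node: a sort key plus an ordered list of children."""
    key: str
    children: List["Node"] = field(default_factory=list)


def sort_tree(node: Node) -> Node:
    """Return a copy of the tree in which every sibling list is ordered by key."""
    # Sort each subtree first, then order the siblings at this level.
    sorted_children = [sort_tree(child) for child in node.children]
    sorted_children.sort(key=lambda n: n.key)
    return Node(node.key, sorted_children)


def to_tuple(node: Node):
    """Flatten a tree into nested tuples so that it is easy to inspect."""
    return (node.key, [to_tuple(child) for child in node.children])


# Illustrative input: siblings are out of order at both levels.
root = Node("db", [
    Node("protein", [Node("seq"), Node("name")]),
    Node("gene", [Node("locus"), Node("alias")]),
])
sorted_root = sort_tree(root)
```

In this sketch the whole tree must fit in memory, which is exactly the assumption that fails for the datasets the abstract targets. Once every sibling list follows the same order, two versions of a hierarchy can be compared in a single synchronized pass, which is one reason sorted input is useful for archiving.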