The large volumes of data generated by scientific applications pose challenges for storage and efficient query processing. Many queries against scientific data are analytical in nature and require super-linear computation time when straightforward methods are used. The spatial distance histogram (SDH) is one of the basic queries for analyzing molecular simulation (MS) data, and it takes quadratic time to compute using a brute-force approach. Often, an SDH query is executed continuously to analyze the simulation system over a period of time, which further increases the total time required to compute the SDH. In this paper, we propose an approximate algorithm to compute the SDH efficiently over consecutive time periods. In our approach, data is organized into a Quad-tree-based data structure. The spatial locality of the particles (at a given time) in each node of the tree is captured to determine the particle distribution. Similarly, the temporal locality of particles (between consecutive time periods) in each node is also captured. The spatial distribution and temporal locality are then used to compute the approximate SDH at every time instant. Performance is further improved by storing and updating the spatial distribution information over time. The efficiency and accuracy of the proposed algorithm are supported by mathematical analysis and by the results of extensive experiments using biological data generated from real MS studies.
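To make the quadratic baseline concrete, the following is a minimal sketch of a brute-force SDH computation, not the approximate algorithm proposed in the paper. The function name `brute_force_sdh`, the parameters `bucket_width` and `num_buckets`, the 2D sample points, and the choice to fold overly large distances into the last bucket are all assumptions made for illustration.

```python
import math
from itertools import combinations


def brute_force_sdh(points, bucket_width, num_buckets):
    """Count all pairwise distances into equal-width buckets.

    Visits every unordered pair once, so the cost is O(N^2) in the
    number of particles. Names and bucket layout are illustrative.
    """
    histogram = [0] * num_buckets
    for p, q in combinations(points, 2):
        d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
        # Assumption: distances past the last boundary go into the last bucket.
        index = min(int(d // bucket_width), num_buckets - 1)
        histogram[index] += 1
    return histogram


# Hypothetical 2D particle coordinates for a single time instant.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
hist = brute_force_sdh(points, bucket_width=2.0, num_buckets=3)
print(hist)  # [3, 0, 3]
```

With N particles the loop touches N(N-1)/2 pairs (6 pairs for the 4 points here), and rerunning it at every time instant of a simulation multiplies that cost by the number of instants, which is the overhead the abstract describes.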