High-throughput query scheduling with spatial clustering based on distributed exponential moving average

Authors:
Beomseok Nam;Deukyeon Hwang;Jinwoong Kim;Minho Shin
Affiliations:
Electrical and Computer Engineering, Ulsan National Inst. of Science and Technology, Ulsan, Korea 689-798;Electrical and Computer Engineering, Ulsan National Inst. of Science and Technology, Ulsan, Korea 689-798;Electrical and Computer Engineering, Ulsan National Inst. of Science and Technology, Ulsan, Korea 689-798;Dept. of Computer Engineering, Myongji University, Yongin, Korea
Venue:
Distributed and Parallel Databases
Year:
2012

Abstract

In distributed scientific query processing systems, leveraging distributed cached data is becoming more important. In such systems, a front-end query scheduler distributes queries among many application servers rather than processing queries in a few high-performance workstations. Although many query scheduling policies exist, such as round-robin and load-monitoring, they are not sophisticated enough to both exploit cached results and balance the workload. Efforts have been made to improve query processing performance using statistical methods such as the exponential moving average. However, existing methods have limitations for certain query patterns, namely queries with hotspots and dynamic query distributions. In this paper, we propose novel query scheduling policies that take into account both the contents of the distributed caching infrastructure and the load balance among the servers. Our experiments show that the proposed query scheduling policies outperform existing policies by producing better query plans in terms of load balance and cache-hit ratio.
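To make the exponential-moving-average idea mentioned in the abstract concrete, the following is a minimal sketch and not the policies proposed in the paper. It assumes each query can be summarized by a point (for example, the center of its spatial range), that the scheduler keeps one EMA point per server, and that a query is sent to the server whose EMA point is nearest. The class name `EmaScheduler`, the smoothing factor `alpha`, and the nearest-point rule are all illustrative assumptions.

```python
import math


class EmaScheduler:
    """Toy front-end scheduler keeping one EMA point per server.

    Hypothetical illustration only: names and the nearest-point
    assignment rule are assumptions, not taken from the paper.
    """

    def __init__(self, initial_centers, alpha=0.3):
        # One EMA point per application server.
        self.centers = [tuple(c) for c in initial_centers]
        # Smoothing factor: weight given to the newest query.
        self.alpha = alpha
        # Number of queries sent to each server, a crude load indicator.
        self.assigned = [0] * len(self.centers)

    def schedule(self, query_center):
        """Return the index of the server chosen for this query."""
        # Choose the server whose EMA point is closest to the query,
        # on the assumption that its cache holds nearby results.
        best = min(
            range(len(self.centers)),
            key=lambda i: math.dist(self.centers[i], query_center),
        )
        # Standard EMA update: new = alpha * sample + (1 - alpha) * old.
        a = self.alpha
        self.centers[best] = tuple(
            a * q + (1 - a) * c
            for q, c in zip(query_center, self.centers[best])
        )
        self.assigned[best] += 1
        return best
```

In this toy version, a stream of queries concentrated on one region would be routed mostly to a single server, since nothing in the rule accounts for load. That is one way to picture the hotspot limitation the abstract attributes to existing methods.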