Dynamic Data Migration Policies for Query-Intensive Distributed Data Environments

  • Authors:
  • Tengjiao Wang;Bishan Yang;Allen Huang;Qi Zhang;Jun Gao;Dongqing Yang;Shiwei Tang;Jinzhong Niu

  • Affiliations:
  • Key Laboratory of High Confidence Software Technologies, Peking University, Ministry of Education, China School of Electronics Engineering and Computer Science, Peking University, Beijing, China 10 ...;Key Laboratory of High Confidence Software Technologies, Peking University, Ministry of Education, China School of Electronics Engineering and Computer Science, Peking University, Beijing, China 10 ...;Microsoft SQL China R&D Center;Key Laboratory of High Confidence Software Technologies, Peking University, Ministry of Education, China School of Electronics Engineering and Computer Science, Peking University, Beijing, China 10 ...;Key Laboratory of High Confidence Software Technologies, Peking University, Ministry of Education, China School of Electronics Engineering and Computer Science, Peking University, Beijing, China 10 ...;Key Laboratory of High Confidence Software Technologies, Peking University, Ministry of Education, China School of Electronics Engineering and Computer Science, Peking University, Beijing, China 10 ...;Key Laboratory of High Confidence Software Technologies, Peking University, Ministry of Education, China School of Electronics Engineering and Computer Science, Peking University, Beijing, China 10 ...;Computer Science, Graduate School and University Center, City University of New York, New York, USA NY 10016

  • Venue:
  • APWeb/WAIM '09 Proceedings of the Joint International Conferences on Advances in Data and Web Management
  • Year:
  • 2009

Abstract

Modern large distributed applications, such as telecommunication and banking services, must respond instantly to a huge number of queries within a short period of time. Their data-intensive, query-intensive nature makes it necessary to build these applications in a distributed data environment in which a number of data servers share the service load. How data is distributed among the servers has a crucial impact on system response time. This paper introduces two policies that dynamically migrate data in such an environment as the pattern of queries on the data changes, thereby balancing the query load. The first policy relies on a central controller that periodically collects query load information from all data servers and regulates data migration across the whole system. The second policy lets each individual server dynamically select a partner, migrate data to or from it, and balance the query load between the two. Experimental results show that both policies significantly improve system performance in terms of average query response time and fairness, and that the communication overhead incurred is marginal.
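
The abstract does not give the migration algorithms themselves, so the following is only a minimal sketch of the two ideas under stated assumptions, not the authors' method. It assumes each server is modeled as a mapping from a data item to its observed query count, that migration is a greedy move of hot items from the heavier server to the lighter one, and that the decentralized policy picks its partner at random. The names `balance_pair`, `central_round`, and `pairwise_round` are hypothetical.

```python
import random
from typing import Dict, List, Tuple

# Assumed model: server name -> {data item id -> observed query count}.
Servers = Dict[str, Dict[str, int]]


def load(items: Dict[str, int]) -> int:
    """Query load of one server: the sum of its items' query counts."""
    return sum(items.values())


def balance_pair(src: Dict[str, int], dst: Dict[str, int]) -> List[str]:
    """Greedily migrate items from the heavier server (src) to the lighter (dst).

    An item with query count q is moved only if that strictly narrows the
    load gap, i.e. 0 < q < gap, since the gap becomes gap - 2q after a move.
    """
    moved = []
    # sorted() returns a new list, so mutating src inside the loop is safe.
    for item, q in sorted(src.items(), key=lambda kv: -kv[1]):
        gap = load(src) - load(dst)
        if 0 < q < gap:
            dst[item] = src.pop(item)
            moved.append(item)
    return moved


def central_round(servers: Servers) -> List[str]:
    """Central-controller sketch: gather all loads, then migrate data
    from the most loaded server to the least loaded one."""
    loads = {name: load(items) for name, items in servers.items()}
    hot = max(loads, key=loads.get)
    cold = min(loads, key=loads.get)
    if hot == cold:
        return []
    return balance_pair(servers[hot], servers[cold])


def pairwise_round(servers: Servers, me: str,
                   rng: random.Random) -> Tuple[str, List[str]]:
    """Decentralized sketch: server `me` picks a partner (at random here,
    which is an assumption) and the two balance load between themselves."""
    partner = rng.choice([name for name in servers if name != me])
    a, b = servers[me], servers[partner]
    if load(a) < load(b):
        a, b = b, a  # always migrate from the heavier to the lighter side
    return partner, balance_pair(a, b)


servers: Servers = {
    "s1": {"a": 50, "b": 30, "c": 20},
    "s2": {"d": 10},
    "s3": {"e": 40},
}
moved_central = central_round(servers)
partner, moved_pairwise = pairwise_round(servers, "s2", random.Random(0))
```

In this sketch the central round needs the load of every server, whereas the pairwise round only needs the loads of two servers, which illustrates the trade-off between global information and communication cost that the abstract alludes to.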