Design and evaluation of data allocation algorithms for distributed multimedia database systems

  • Authors:
  • Yu-Kwong Kwok;K. Karlapalem;I. Ahmad;Ng Moon Pun

  • Affiliations:
  • Dept. of Comput. Sci., Hong Kong Univ. of Sci. & Technol., Kowloon;-;-;-

  • Venue:
  • IEEE Journal on Selected Areas in Communications
  • Year:
  • 2006

Abstract

A major cost in retrieving multimedia data from multiple sites is the cost incurred in transferring multimedia data objects (MDOs) from different sites to the site where the query is initiated. The objective of a data allocation algorithm is to place the MDOs at different sites so as to minimize the total data transfer cost incurred in executing a given set of queries. The optimal allocation of MDOs depends on the query execution strategy employed by a distributed multimedia system, while the query execution strategy in turn optimizes a query based on this allocation. To break this circular dependency, we fix the query execution strategy and develop a site-independent MDO dependency graph representation to model the dependencies among the MDOs accessed by a query. An allocation scheme is then generated from the MDO dependency graphs, the set of multimedia database sites, the data transfer costs between the sites, the limit on the number of MDOs that can be allocated at a site, and the query execution frequencies from the sites. We formulate the data allocation problem as an optimization problem and solve it with a number of techniques that broadly belong to three classes: max-flow min-cut, state-space search, and graph partitioning heuristics. The max-flow min-cut technique formulates the data allocation problem as a network-flow problem and uses a hill-climbing approach to try to find the optimal solution. The state-space search approach solves the problem using a best-first search algorithm. The graph partitioning approach uses two clustering heuristics, agglomerative clustering and divisive clustering. We evaluate and compare these approaches and assess their cost-performance trade-offs. All algorithms are also compared with optimal solutions obtained through exhaustive search. We also draw conclusions on the suitability of these approaches for different scenarios.
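
To make the optimization problem concrete, the following is a minimal sketch of an exhaustive-search baseline under a deliberately simplified cost model. It is not the paper's formulation: it ignores the MDO dependency graphs and assumes that a query issued at a site fetches every MDO it accesses directly from the site holding that MDO. The names `transfer_cost`, `exhaustive_allocation`, `queries`, `freq`, `cost`, and `limit` are hypothetical and chosen only for illustration.

```python
from itertools import product


def transfer_cost(alloc, queries, freq, cost):
    """Total transfer cost of an allocation (simplified, assumed model).

    alloc[m]    -> site holding MDO m
    queries[q]  -> list of MDOs accessed by query q
    freq[s][q]  -> how often query q is issued from site s
    cost[a][b]  -> cost of moving one MDO from site a to site b
    """
    total = 0
    for site, per_query in enumerate(freq):
        for q, f in enumerate(per_query):
            for mdo in queries[q]:
                total += f * cost[alloc[mdo]][site]
    return total


def exhaustive_allocation(n_mdos, n_sites, limit, queries, freq, cost):
    """Try every assignment of MDOs to sites that respects the per-site limit."""
    best_alloc, best_cost = None, float("inf")
    for alloc in product(range(n_sites), repeat=n_mdos):
        # Skip allocations that place more than `limit` MDOs at any site.
        if any(alloc.count(s) > limit for s in range(n_sites)):
            continue
        c = transfer_cost(alloc, queries, freq, cost)
        if c < best_cost:
            best_alloc, best_cost = alloc, c
    return best_alloc, best_cost


# Hypothetical instance: 3 MDOs, 2 sites, at most 2 MDOs per site.
queries = [[0, 1], [2]]          # query 0 reads MDOs 0 and 1; query 1 reads MDO 2
freq = [[5, 0], [0, 3]]          # site 0 issues query 0; site 1 issues query 1
cost = [[0, 4], [4, 0]]          # symmetric inter-site transfer cost
best = exhaustive_allocation(3, 2, 2, queries, freq, cost)
```

The search enumerates all assignments of MDOs to sites, so its running time grows exponentially with the number of MDOs. That is why exhaustive search is practical only as a reference point for small instances, and why heuristic approaches are of interest for larger ones.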