Skyline and mapping aware join query evaluation

  • Authors:
  • Venkatesh Raghavan;Elke A. Rundensteiner;Shweta Srivastava

  • Affiliations:
  • Greenplum, 1900 South Norfolk Street, San Mateo, CA, United States and Department of Computer Science, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA, United States;Department of Computer Science, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA, United States;Department of Computer Science, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA, United States

  • Venue:
  • Information Systems
  • Year:
  • 2011

Abstract

Growing interest in multi-criteria decision support applications has resulted in a flurry of efficient skyline algorithms. In practice, real-world decision support applications require access to data from disparate sources. Existing techniques define the skyline operation over a single set and therefore treat the skyline as an "add-on" on top of a traditional Select-Project-Join query plan. In many real-world applications, the skyline dimensions can be anti-correlated, such as the attribute pair {price, mileage} for cars and {price, distance} for hotels. Anti-correlated data are particularly challenging for skyline evaluation and have therefore commonly been ignored by existing techniques. In this work, we propose a robust execution framework called SKIN to evaluate skylines over joins. The salient features of SKIN are: (a) it is effective in reducing the two primary costs, namely the cost of generating the join results and the cost of dominance comparisons to compute the final skyline of the join results, and (b) it is shown to be robust for both skyline-friendly (independent and correlated) and skyline-unfriendly (anti-correlated) data distributions. SKIN exploits skyline knowledge both locally, within individual data sources, and across disparate sources to significantly reduce the above-mentioned costs incurred during the evaluation of skyline over join. Our experimental study demonstrates the superiority of our proposed approach over state-of-the-art techniques across a wide variety of data distributions.
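
To make the two costs named in the abstract concrete, here is a minimal Python sketch of the baseline "add-on" plan that the abstract contrasts SKIN with: materialize every join result first, then run dominance comparisons over the full result. It is not the SKIN algorithm. The relation layout, the additive mapping of attributes, the function names, and the sample data are all assumptions made for illustration; smaller values are assumed to be better in every dimension.

```python
from typing import List, Tuple

Row = Tuple[str, int, int]   # (join_key, price, distance): hypothetical layout
Point = Tuple[int, int]      # combined (price, distance) of one join result


def dominates(a: Point, b: Point) -> bool:
    """a dominates b if a is no worse in every dimension and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Quadratic skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def join_then_skyline(left: List[Row], right: List[Row]) -> List[Point]:
    """Baseline plan: full equi-join first, skyline as a final add-on step."""
    joined = [
        (lp + rp, ld + rd)   # assumed mapping: sum the attributes of both sides
        for lkey, lp, ld in left
        for rkey, rp, rd in right
        if lkey == rkey
    ]
    return skyline(joined)


# Hypothetical sample data: flights and hotels joined on destination city.
flights = [("BOS", 300, 5), ("BOS", 250, 7), ("NYC", 100, 2)]
hotels = [("BOS", 100, 3), ("BOS", 160, 1), ("NYC", 400, 9)]
result = join_then_skyline(flights, hotels)
```

In this sketch every matching pair is generated and every join result is compared against all others, which is exactly the pair of costs (join generation and dominance comparisons) that the abstract says SKIN aims to reduce.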