Efficiently evaluating skyline queries on RDF databases

  • Authors:
  • Ling Chen; Sidan Gao; Kemafor Anyanwu

  • Affiliations:
  • Semantic Computing Research Lab, Department of Computer Science, North Carolina State University (all authors)

  • Venue:
  • ESWC'11: Proceedings of the 8th Extended Semantic Web Conference on The Semantic Web: Research and Applications - Volume Part II
  • Year:
  • 2011

Abstract

Skyline queries are a class of preference queries that compute the Pareto-optimal tuples from a set of tuples and are valuable in multi-criteria decision-making scenarios. While this problem has received significant attention in the context of a single relational table, skyline queries over joins of multiple tables, which are typical of storage models for RDF data, have received much less attention. A naïve approach such as a join-first-skyline-later strategy splits the join and skyline computation phases, which limits opportunities for optimization. Other existing techniques for multi-relational skyline queries assume storage and indexing techniques that are not typically used with RDF and would therefore require a preprocessing step for data transformation. In this paper, we present an approach for optimizing skyline queries over RDF data stored using a vertically partitioned schema model. It is based on the concept of a "Header Point", which maintains a concise summary of the already visited regions of the data space. This summary allows some fraction of non-skyline tuples to be pruned before they advance to the skyline processing phase, thus reducing the overall cost of the expensive dominance checks required in that phase. We further present more aggressive pruning rules that compute near-complete skylines in significantly less time than the complete algorithm. A comprehensive performance evaluation of the different algorithms is presented using datasets with different data distributions generated by a benchmark data generator.
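
The sketch below illustrates, in Python, the two notions the abstract builds on: Pareto dominance checks for skyline computation and a summary-point pre-filter that discards dominated tuples before the skyline phase. It is a minimal illustration under the assumption that smaller values are preferred in every dimension; the `dominates`, `skyline`, `prefilter`, and `header_point` names are hypothetical and do not reproduce the paper's actual algorithm or data structures.

```python
# Minimal sketch: skyline (Pareto-optimal) computation plus a summary-point
# pre-filter loosely inspired by the "Header Point" idea in the abstract.
# Assumption: smaller values are preferred on every dimension.
from typing import List, Tuple

def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """a dominates b if a is no worse in every dimension and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(tuples: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Block-nested-loop style skyline: keep tuples not dominated by any other."""
    result: List[Tuple[float, ...]] = []
    for t in tuples:
        if any(dominates(s, t) for s in result):
            continue                                          # t is dominated, skip it
        result = [s for s in result if not dominates(t, s)]   # drop tuples t dominates
        result.append(t)
    return result

def prefilter(tuples, header_point):
    """Hypothetical pruning step: remove tuples dominated by a summary point
    before paying for the quadratic dominance checks of the skyline phase."""
    return [t for t in tuples if not dominates(header_point, t)]

if __name__ == "__main__":
    data = [(1, 9), (3, 3), (5, 5), (4, 2), (9, 1), (6, 6)]
    header_point = (5, 5)                  # assumed summary of visited regions
    candidates = prefilter(data, header_point)   # (6, 6) is pruned here
    print(skyline(candidates))             # -> [(1, 9), (3, 3), (4, 2), (9, 1)]
```

Pruning only tuples that are dominated by a point derived from already visited data is safe, because such tuples can never appear in the final skyline; this is why a pre-filter of this kind reduces the number of dominance checks without changing the (complete) result.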