Algorithms and analyses for maximal vector computation

Authors: Parke Godfrey; Ryan Shipley; Jarek Gryz
Affiliations: York University, Canada; The College of William and Mary, USA; York University, Canada
Venue: The VLDB Journal — The International Journal on Very Large Data Bases
Year: 2007

Abstract

The maximal vector problem is to identify the maximals over a collection of vectors. This arises in many contexts and, as such, has been well studied. The problem recently gained renewed attention with skyline queries for relational databases and with work to develop skyline algorithms that are external and relationally well behaved. While many algorithms have been proposed, how they perform has been unclear. We study the performance of, and design choices behind, these algorithms. We prove runtime bounds based on the number of vectors N and the dimensionality K. Early algorithms based on divide and conquer established seemingly good average and worst-case asymptotic runtimes. In fact, the problem can be solved in $\mathcal{O}(KN)$ average-case time (holding K fixed). We prove, however, that the performance is quite bad with respect to K. We demonstrate that the more recent skyline algorithms are better behaved, and can also achieve $\mathcal{O}(KN)$ average-case time. While K matters for these, in practice, its effect vanishes asymptotically. We introduce a new external algorithm, LESS, that is more efficient and better behaved. We evaluate LESS's effectiveness and improvement over the field, and prove that its average-case running time is $\mathcal{O}(KN)$.
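
To make the problem statement concrete, the following is a minimal in-memory sketch of maximal vector computation. It is not LESS and not any algorithm from the paper; it is a plain single-scan filter that keeps a window of candidate maximals. It assumes that larger values are preferred in every dimension and that one vector dominates another when it is at least as large everywhere and strictly larger somewhere. The names `dominates` and `maximal_vectors` are hypothetical, chosen for illustration only.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is >= b in every dimension and > b in at least one.

    Assumption: larger is better in all K dimensions.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def maximal_vectors(vectors: Sequence[Vector]) -> List[Vector]:
    """Return the vectors that no other vector dominates.

    The window always holds the maximals of the prefix scanned so far.
    """
    window: List[Vector] = []
    for v in vectors:
        # Skip v if some current candidate already dominates it.
        if any(dominates(w, v) for w in window):
            continue
        # Otherwise v is a candidate; evict candidates that v dominates.
        window = [w for w in window if not dominates(v, w)]
        window.append(v)
    return window


if __name__ == "__main__":
    points = [(1, 5), (3, 3), (2, 2), (5, 1), (4, 1)]
    print(maximal_vectors(points))
```

Each dominance test costs up to K comparisons, and in the worst case (for example, when every vector is maximal) this scan compares every pair of vectors, so it is quadratic in N. It only illustrates the definition of a maximal and does not reflect the average-case bounds stated in the abstract.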