Database support for matching: limitations and opportunities

Authors:
Ameet Kini;Srinath Shankar;Jeffrey F. Naughton;David J. DeWitt
Affiliations:
University of Wisconsin - Madison, Madison, WI;University of Wisconsin - Madison, Madison, WI;University of Wisconsin - Madison, Madison, WI;University of Wisconsin - Madison, Madison, WI
Venue:
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Year:
2006

Abstract

We define a match join of R and S with predicate θ to be a subset of the θ-join of R and S such that each tuple of R and each tuple of S contributes to at most one result tuple. Match joins and their generalizations belong to a broad class of matching problems that have attracted a great deal of attention in disciplines including operations research and theoretical computer science. Instances of these problems arise in practice in resource allocation scenarios. To the best of our knowledge, no one uses an RDBMS as a tool to help solve these problems; our goal in this paper is to explore whether this needs to be the case. We show that the simple approach of computing the full θ-join and then applying standard graph-matching algorithms to the result is ineffective for all but the smallest problem instances. By contrast, a closer study shows that the DBMS primitives of grouping, sorting, and joining can be exploited to yield efficient match join operations. This suggests that RDBMSs can play a role in matching-related problems beyond merely serving as expensive file systems that export data sets to external user programs.
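
To make the definition concrete, the following minimal Python sketch contrasts a full θ-join with one valid match join on a toy resource allocation instance. It is an illustration only and is not the algorithm developed in the paper: the function names (theta_join, greedy_match_join, is_match_join), the jobs and machines relations, and the fits predicate are all hypothetical. The greedy pass yields a maximal match join, which need not be a maximum one.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")
Pair = Tuple[int, int]  # (index into r, index into s)


def theta_join(r: Sequence[A], s: Sequence[B],
               theta: Callable[[A, B], bool]) -> List[Pair]:
    """Full theta-join, returned as index pairs: every (i, j) with theta true."""
    return [(i, j) for i, a in enumerate(r) for j, b in enumerate(s) if theta(a, b)]


def greedy_match_join(r: Sequence[A], s: Sequence[B],
                      theta: Callable[[A, B], bool]) -> List[Pair]:
    """One valid match join: each tuple of r and of s is used at most once.

    Greedy first-fit, so the result is maximal (no pair can be added)
    but not necessarily maximum (largest possible).
    """
    used_s = set()
    result: List[Pair] = []
    for i, a in enumerate(r):
        for j, b in enumerate(s):
            if j not in used_s and theta(a, b):
                used_s.add(j)
                result.append((i, j))
                break  # tuple i of r is now matched; move to the next one
    return result


def is_match_join(pairs: Sequence[Pair], r: Sequence[A], s: Sequence[B],
                  theta: Callable[[A, B], bool]) -> bool:
    """Check the definition: subset of the theta-join, no tuple reused."""
    left = [i for i, _ in pairs]
    right = [j for _, j in pairs]
    return (
        len(set(left)) == len(left)
        and len(set(right)) == len(right)
        and all(theta(r[i], s[j]) for i, j in pairs)
    )


# Hypothetical toy data: (name, memory required) and (name, memory available).
jobs = [("j1", 4), ("j2", 8)]
machines = [("m1", 8), ("m2", 4)]


def fits(job: Tuple[str, int], machine: Tuple[str, int]) -> bool:
    """Assumed predicate: a job fits a machine with at least as much memory."""
    return job[1] <= machine[1]


full = theta_join(jobs, machines, fits)
matched = greedy_match_join(jobs, machines, fits)
print(len(full), matched, is_match_join(matched, jobs, machines, fits))
```

On this instance the θ-join contains three pairs, while the greedy pass assigns j1 to m1 and leaves j2 unmatched, even though a match join of size two exists (j1 with m2, j2 with m1). That gap between a maximal and a maximum matching is why the order in which tuples are considered matters.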