Exact and approximate algorithms for the most connected vertex problem

Authors:
Cheng Sheng;Yufei Tao;Jianzhong Li
Affiliations:
Chinese University of Hong Kong, Sha Tin, Hong Kong;Chinese University of Hong Kong and Korea Advanced Institute of Science and Technology, Korea;Harbin Institute of Technology, China
Venue:
ACM Transactions on Database Systems (TODS)
Year:
2012



Abstract

An (edge) hidden graph is a graph whose edges are not explicitly given. Detecting the presence of an edge requires an expensive edge probing query. We consider the k Most Connected Vertex (k-MCV) problem on hidden bipartite graphs. Given a bipartite graph G with independent vertex sets B and W, the goal is to find the k vertices in B with the largest degrees using the minimum number of queries. This problem can be regarded as a top-k extension of semi-join, and is encountered in several applications in practice. If B and W have n and m vertices, respectively, the number of queries needed to solve the problem is nm in the worst case. This, however, is a pessimistic estimate of how many queries are necessary on practical data. In fact, on some inputs, the problem may be settled with only km + n queries, which is significantly lower than nm for k ≪ n. The huge difference between km + n and nm makes it interesting to design an adaptive algorithm that is guaranteed to achieve the best possible performance on every input G. For k ≤ n/2, we give an algorithm that is instance optimal among a broad class of solutions. This means that, for any G, our algorithm can perform more queries than the optimal solution (which is unknown) by only a constant factor, which can be shown to be at most 2. As a second step, we study an ϵ-approximate version of the k-MCV problem, where ϵ is a parameter satisfying 0 < ϵ < 1. The goal is to return k black vertices (vertices in B) b1, …, bk such that the degree of bi (i ≤ k) can be smaller than ti by a factor of at most ϵ, where t1, …, tk (in nonascending order) are the degrees of the k most connected black vertices. We give an efficient randomized algorithm that successfully finds the correct answer with high probability. In particular, for a fixed ϵ and a fixed success probability, our algorithm performs o(nm) queries in expectation for tk = ω(log n). In other words, whenever tk is greater than log n by more than a constant factor, our algorithm beats the Ω(nm) lower bound for solving the k-MCV problem exactly. All the proposed algorithms, despite the complication of their underlying theory, are simple enough for easy implementation in practice. Extensive experiments have confirmed that their performance in reality agrees with our theoretical findings very well.