Density index and proximity search in large graphs

Authors:
Nan Li;Xifeng Yan;Zhen Wen;Arijit Khan
Affiliations:
UC Santa Barbara, Santa Barbara, CA, USA;UC Santa Barbara, Santa Barbara, CA, USA;IBM Research, Hawthorne, NY, USA;UC Santa Barbara, Santa Barbara, CA, USA
Venue:
Proceedings of the 21st ACM international conference on Information and knowledge management
Year:
2012


Abstract

Given a large real-world graph whose vertices carry labels, how can we quickly find interesting vertex sets for a given query? In this paper, we study label-based proximity search in large graphs, which finds the top-k query-covering vertex sets with the smallest diameters; each set must cover all the labels in the query. Existing greedy algorithms return only approximate answers and do not scale well to large graphs. We propose a novel framework, called gDensity, which uses a density index and likelihood ranking to find vertex sets efficiently and accurately. Promising vertices are ordered and examined according to their likelihood of producing answers, and density indexing greatly facilitates the likelihood calculation. Techniques such as progressive search and partial indexing are further proposed. Experiments on real-world graphs show the efficiency and scalability of gDensity.
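To make the problem statement concrete, the following is a minimal brute-force sketch of label-based proximity search on a toy graph. It is not gDensity and uses neither a density index nor likelihood ranking; it only enumerates every query-covering vertex set and keeps the k with the smallest diameter. The sketch assumes an unweighted, undirected graph and takes the diameter of a vertex set to be the largest shortest-path (hop) distance between any two of its vertices, a definition the abstract does not spell out. The function names, the adjacency-list layout, and the toy labels are all hypothetical.

```python
from collections import deque
from itertools import product


def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def top_k_covering_sets(adj, labels, query, k):
    """Return up to k (diameter, vertex_list) pairs, smallest diameter first.

    Exhaustive enumeration: picks one vertex per query label, so the cost
    grows with the product of the per-label candidate counts.
    """
    # All-pairs hop distances via one BFS per vertex (fine for a toy graph).
    dist = {u: bfs_distances(adj, u) for u in adj}
    # For each query label, the vertices that carry it.
    candidates = [[v for v in adj if lab in labels[v]] for lab in query]

    seen = set()
    results = []
    for combo in product(*candidates):
        vertex_set = frozenset(combo)  # one vertex may cover several labels
        if vertex_set in seen:
            continue
        seen.add(vertex_set)
        members = sorted(vertex_set)
        try:
            diameter = max(
                (dist[a][b] for a in members for b in members), default=0
            )
        except KeyError:
            continue  # some pair is disconnected; skip this set
        results.append((diameter, members))

    results.sort()
    return results[:k]


# Toy path graph a - b - c - d - e with hypothetical labels.
EXAMPLE_ADJ = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
EXAMPLE_LABELS = {
    "a": {"db"},
    "b": {"ml"},
    "c": {"db"},
    "d": {"viz"},
    "e": {"ml"},
}

if __name__ == "__main__":
    query = ["db", "ml", "viz"]
    for diameter, members in top_k_covering_sets(
        EXAMPLE_ADJ, EXAMPLE_LABELS, query, k=2
    ):
        print(diameter, members)
```

On this toy graph the two best sets for the query `db`, `ml`, `viz` are `{b, c, d}` and `{c, d, e}`, both with diameter 2. The enumeration cost is what makes the exhaustive approach impractical on large graphs, which is the gap that indexing and ranking are meant to close.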