Efficient processing of group-oriented connection queries in a large graph

Authors:
James Cheng;Yiping Ke;Wilfred Ng
Affiliations:
Nanyang Technological University, Singapore, Singapore, Singapore;Chinese University of Hong Kong, Hong Kong, Hong Kong;Hong Kong University of Science and Technology, Hong Kong, Hong Kong
Venue:
Proceedings of the 18th ACM conference on Information and knowledge management
Year:
2009

Citing 7
Cited 2

A faster approximation algorithm for the Steiner problem in graphs

Information Processing Letters
Fast discovery of connection subgraphs

Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Measuring and extracting proximity in networks

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Center-piece subgraphs: problem definition and fast solutions

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Fast Random Walk with Restart and Its Applications

ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Context-Aware Object Connection Discovery in Large Graphs

ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
The web as a graph: measurements, models, and methods

COCOON'99 Proceedings of the 5th annual international conference on Computing and combinatorics

Structure and attribute index for approximate graph matching in large graphs

Information Systems
REX: explaining relationships between entity pairs

Proceedings of the VLDB Endowment

Quantified Score

Hi-index	0.00

Visualization

Abstract

We study query processing in large graphs that are fundamental data model underpinning various social networks and Web structures. Given a set of query nodes, we aim to find the groups which the query nodes belong to, as well as the best connection among the groups. Such a query is useful to many applications but the query processing is extremely costly. We define a new notion of Correlation Group (CG), which is a set of nodes that are strongly correlated in a large graph G. We then extract the subgraph from G that gives the best connection for the nodes in a CG. To facilitate query processing, we develop an efficient index built upon the CGs. Our experiments show that the CGs are meaningful as groups and importantly, the meaningfulness of the query results are justifiable. We also demonstrate the high efficiency of CG computation, index construction and query processing.