Fast best-effort pattern matching in large attributed graphs

Authors:
Hanghang Tong;Christos Faloutsos;Brian Gallagher;Tina Eliassi-Rad
Affiliations:
Carnegie Mellon University;Carnegie Mellon University;Lawrence Livermore National Laboratory;Lawrence Livermore National Laboratory
Venue:
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2007

Abstract

We focus on large graphs where nodes have attributes, such as a social network where the nodes are labelled with each person's job title. In such a setting, we want to find subgraphs that match a user query pattern. For example, a "star" query would be, "find a CEO who has strong interactions with a Manager, a Lawyer, and an Accountant, or another structure as close to that as possible". Similarly, a "loop" query could help spot a money laundering ring. Traditional SQL-based methods, as well as more recent graph indexing methods, will return no answer when an exact match does not exist. The first main feature of our method is that it handles this case: it can find exact as well as near matches, and it presents them to the user in our proposed "goodness" order. For example, our method tolerates indirect paths between, say, the "CEO" and the "Accountant" of the above sample query, when direct paths don't exist. Its second feature is scalability. In general, if the query has $n_q$ nodes and the data graph has $n$ nodes, the problem needs polynomial time complexity $O(n^{n_q})$, which is prohibitive. Our G-Ray ("Graph X-Ray") method finds high-quality subgraphs in time linear in the size of the data graph. Experimental results on the DBLP author-publication graph (with 356K nodes and 1.9M edges) illustrate both the effectiveness and scalability of our approach. The results agree with our intuition, and the speed is excellent: it takes 4 seconds on average for a 4-node query on the DBLP graph.
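
To make the idea of best-effort matching concrete, the following is a minimal sketch of a "star" query on a toy attributed graph. It is not the G-Ray algorithm, whose details are not given in this abstract. As a stand-in for the paper's "goodness" order, it ranks candidates by total hop distance from the center to the nearest node carrying each requested attribute, so an exact match (all leaves adjacent) scores lowest and indirect paths are tolerated at a higher cost. The function names, the toy graph, and the hop-count score are all assumptions made for illustration.

```python
from collections import deque


def build_adjacency(nodes, edges):
    """Undirected adjacency sets for the given nodes and edge list."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def bfs_distances(adj, source):
    """Hop distance from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def best_effort_star(adj, attr, center_attr, leaf_attrs, top_k=3):
    """Rank star-query candidates by total hop distance (lower is better).

    Hypothetical scoring, not the paper's: a cost equal to len(leaf_attrs)
    means every leaf is directly connected to the center (an exact match).
    Leaves are picked greedily, nearest first, which is a simplification.
    """
    results = []
    for center in adj:
        if attr[center] != center_attr:
            continue
        dist = bfs_distances(adj, center)
        used, leaves, cost = {center}, [], 0
        for wanted in leaf_attrs:
            candidates = [(d, v) for v, d in dist.items()
                          if attr[v] == wanted and v not in used]
            if not candidates:
                break  # this center cannot reach a node with the attribute
            d, v = min(candidates)
            used.add(v)
            leaves.append((wanted, v))
            cost += d
        else:
            results.append((cost, center, leaves))
    results.sort(key=lambda r: (r[0], r[1]))
    return results[:top_k]


# Toy data (invented): ceo2 has an exact star; ceo1 reaches its accountant
# only indirectly, through mgr1.
attr = {
    "ceo1": "CEO", "mgr1": "Manager", "law1": "Lawyer", "acc1": "Accountant",
    "ceo2": "CEO", "mgr2": "Manager", "law2": "Lawyer", "acc2": "Accountant",
}
edges = [
    ("ceo1", "mgr1"), ("ceo1", "law1"), ("mgr1", "acc1"),
    ("ceo2", "mgr2"), ("ceo2", "law2"), ("ceo2", "acc2"),
    ("mgr1", "mgr2"),
]
adj = build_adjacency(attr, edges)
ranked = best_effort_star(adj, attr, "CEO", ["Manager", "Lawyer", "Accountant"])
for cost, center, leaves in ranked:
    print(cost, center, leaves)
```

In this toy graph the exact match around ceo2 is ranked first with cost 3, and the near match around ceo1 follows with cost 4 because its accountant is two hops away. The sketch runs one breadth-first search per candidate center, so it does not reproduce the linear-time behavior claimed for G-Ray.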