DSI: a method for indexing large graphs using distance set

Authors:
Yubo Kou;Yukun Li;Xiaofeng Meng
Affiliations:
Renmin University of China, Beijing, China;Renmin University of China, Beijing, China;Renmin University of China, Beijing, China
Venue:
WAIM'10 Proceedings of the 11th international conference on Web-age information management
Year:
2010

Citing 18
Cited 0

Algorithmics and applications of tree and graph searching. Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems.
The complexity of theorem-proving procedures. STOC '71 Proceedings of the third annual ACM symposium on Theory of computing.
Graph indexing: a frequent structure-based approach. SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data.
Substructure similarity search in graph databases. Proceedings of the 2005 ACM SIGMOD international conference on Management of data.
Finding Frequent Patterns in a Large Sparse Graph*. Data Mining and Knowledge Discovery.
Closure-Tree: An Index Structure for Graph Queries. ICDE '06 Proceedings of the 22nd International Conference on Data Engineering.
Fg-index: towards verification-free query processing on graph databases. Proceedings of the 2007 ACM SIGMOD international conference on Management of data.
Fast best-effort pattern matching in large attributed graphs. Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining.
Top-k subgraph matching query in a large graph. Proceedings of the ACM first Ph.D. workshop in CIKM.
Towards graph containment search and indexing. VLDB '07 Proceedings of the 33rd international conference on Very large data bases.
Graph indexing: tree + delta. VLDB '07 Proceedings of the 33rd international conference on Very large data bases.
Subgraph Support in a Single Large Graph. ICDMW '07 Proceedings of the Seventh IEEE International Conference on Data Mining Workshops.
A novel spectral coding in a large graph database. EDBT '08 Proceedings of the 11th international conference on Extending database technology: Advances in database technology.
Graphs-at-a-time: query language and access methods for graph databases. Proceedings of the 2008 ACM SIGMOD international conference on Management of data.
GADDI: distance index based subgraph matching in biological networks. Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology.
TALE: A Tool for Approximate Large Graph Matching. ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering.
Continuous Subgraph Pattern Search over Graph Streams. ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering.
Efficient and simple generation of random simple connected graphs with prescribed degree sequence. COCOON'05 Proceedings of the 11th annual international conference on Computing and Combinatorics.

Abstract

In recent years we have witnessed a great increase in modeling data as large graphs in multiple domains, such as XML, the semantic web, and social networks. In these settings, researchers are interested in queries of the following form: given a large graph G and a query Q, report all the matches of Q in G. Since subgraph isomorphism checking is NP-complete [1], it is infeasible to scan the whole large graph for answers, especially when the query is also large. Hence, the "filter-verification" approach is widely adopted. In this approach, researchers first index the neighborhood of each vertex in the large graph, then filter vertices, and finally perform subgraph matching algorithms. Previous techniques mainly focus on efficient matching algorithms, paying little attention to indexing techniques. However, appropriate indexing techniques could improve query response time by generating fewer candidates. In this paper we investigate indexing techniques on large graphs and propose an index structure, DSI (Distance Set Index), to capture the neighborhood of each vertex. With our distance set index, more vertices can be pruned, resulting in a much smaller search space. A subgraph matching algorithm is then performed in this search space. We have applied our index structure to real and synthetic datasets. Extensive experiments demonstrate the efficiency and effectiveness of our indexing technique.
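
The filtering step of a filter-verification pipeline can be illustrated with a small Python sketch. The abstract does not define the distance set itself, so the signature used here is an assumption made for illustration and is not the paper's DSI: for each vertex, it records how many vertices of each label lie within distance 1, 2, ..., max_dist. A data vertex is kept as a candidate for a query vertex only if it has the same label and at least as many vertices of every label within every distance. This condition is safe for subgraph matching because an embedding maps query paths to data paths, so distances in G can only be shorter than or equal to those in Q. All function names and the parameter max_dist are hypothetical, and the verification step (the subgraph matching itself) is left out.

```python
from collections import Counter, deque


def cumulative_label_counts(adj, labels, source, max_dist):
    """Label counts of vertices within distance 1..max_dist of source.

    Returns a list whose entry d-1 is a Counter over the labels of all
    vertices at distance <= d from source (source itself excluded).
    """
    dist = {source: 0}
    queue = deque([source])
    per_level = [Counter() for _ in range(max_dist + 1)]
    while queue:  # breadth-first search, truncated at max_dist
        v = queue.popleft()
        if dist[v] == max_dist:
            continue
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                per_level[dist[w]][labels[w]] += 1
                queue.append(w)
    cumulative, running = [], Counter()
    for d in range(1, max_dist + 1):
        running = running + per_level[d]  # new Counter on each step
        cumulative.append(running)
    return cumulative


def may_match(query_sig, data_sig):
    """True if the data vertex covers the query vertex at every distance."""
    return all(
        data_counts[label] >= need
        for query_counts, data_counts in zip(query_sig, data_sig)
        for label, need in query_counts.items()
    )


def filter_candidates(g_adj, g_labels, q_adj, q_labels, max_dist=2):
    """Filtering step only: candidate data vertices per query vertex."""
    # Assumed index: one signature per data vertex, built once up front.
    index = {
        v: cumulative_label_counts(g_adj, g_labels, v, max_dist) for v in g_adj
    }
    candidates = {}
    for u in q_adj:
        sig = cumulative_label_counts(q_adj, q_labels, u, max_dist)
        candidates[u] = {
            v
            for v in g_adj
            if g_labels[v] == q_labels[u] and may_match(sig, index[v])
        }
    return candidates
```

Graphs are passed as undirected adjacency dictionaries (each edge listed in both directions) plus a vertex-to-label dictionary. A larger max_dist prunes more vertices at the cost of a larger index, which is the trade-off the abstract alludes to when it links better indexing to fewer candidates.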