Efficient Graph Similarity Joins with Edit Distance Constraints

Authors:
Xiang Zhao;Chuan Xiao;Xuemin Lin;Wei Wang
Venue:
ICDE '12 Proceedings of the 2012 IEEE 28th International Conference on Data Engineering
Year:
2012

Abstract

Graphs are widely used to model complicated data semantics in many applications, including bioinformatics, chemistry, social networks, and pattern recognition. A recent trend is to tolerate noise arising from various sources, such as erroneous data entry, and to find similarity matches rather than exact ones. In this paper, we study the graph similarity join problem, which returns all pairs of graphs whose edit distance is no larger than a given threshold. Inspired by the q-gram idea for string similarity problems, our solution extracts paths from graphs as features for indexing. We establish a lower bound on the number of common features that two graphs must share to satisfy the threshold, and use it to generate candidates. We propose an efficient algorithm that exploits both matching and mismatching features to improve the filtering and verification of candidates. Extensive experiments on publicly available datasets demonstrate that the proposed algorithm significantly outperforms existing approaches.
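The path-feature filtering idea described in the abstract can be illustrated with a small sketch. The abstract does not give the exact definitions, so everything below is an assumption made for illustration: a feature is taken to be the label sequence of a simple path with q edges in an undirected labeled graph, and the lower bound is taken to be max(|Q_r| - tau * D_r, |Q_s| - tau * D_s), where D is the largest number of features passing through any single vertex. The function names `path_qgrams` and `passes_count_filter` and the graph encoding are hypothetical and are not taken from the paper.

```python
from collections import Counter


def path_qgrams(labels, edges, q):
    """Return (multiset of path features, max features through one vertex).

    labels: dict vertex id -> vertex label (strings)
    edges:  dict (u, v) -> edge label, one entry per undirected edge
    A feature is the label sequence of a simple path with q edges
    (an illustrative definition, assumed here).
    """
    adj = {v: [] for v in labels}
    for (u, v), edge_label in edges.items():
        adj[u].append((v, edge_label))
        adj[v].append((u, edge_label))

    grams = Counter()    # multiset of canonical label sequences
    through = Counter()  # number of features containing each vertex

    def extend(path, seq):
        if len(path) == q + 1:
            # Every undirected path is reached from both ends; count it once.
            if path[0] <= path[-1]:
                grams[min(tuple(seq), tuple(reversed(seq)))] += 1
                through.update(path)
            return
        for nxt, edge_label in adj[path[-1]]:
            if nxt not in path:  # keep the path simple
                extend(path + [nxt], seq + [edge_label, labels[nxt]])

    for v in labels:
        extend([v], [labels[v]])
    return grams, max(through.values(), default=0)


def passes_count_filter(r, s, q, tau):
    """Keep the pair (r, s) as a candidate only if the two graphs share
    at least the assumed lower bound of path features."""
    grams_r, d_r = path_qgrams(*r, q)
    grams_s, d_s = path_qgrams(*s, q)
    # Assumed bound: one edit operation affects at most D features.
    lower_bound = max(sum(grams_r.values()) - tau * d_r,
                      sum(grams_s.values()) - tau * d_s)
    common = sum((grams_r & grams_s).values())  # multiset intersection
    return common >= lower_bound


# Two toy three-vertex chains that differ in one vertex label.
r = ({1: "C", 2: "C", 3: "O"}, {(1, 2): "-", (2, 3): "-"})
s = ({1: "C", 2: "C", 3: "N"}, {(1, 2): "-", (2, 3): "-"})
candidate_tau1 = passes_count_filter(r, s, q=1, tau=1)
candidate_tau0 = passes_count_filter(r, s, q=1, tau=0)
```

With q = 1 the two toy graphs share one of their two features. Under the assumed bound the pair survives as a candidate for tau = 1 and is pruned for tau = 0. A surviving pair is only a candidate: it would still need verification by an actual edit distance computation, which this sketch does not attempt.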