Subgraph similarity search over certain (deterministic) graphs has been studied extensively because of its wide application in fields such as bioinformatics, social network analysis, and Resource Description Framework (RDF) data management. All of these works assume that the underlying data are certain. In practice, however, graphs are often noisy and uncertain because of errors in data extraction, inconsistencies in data integration, and privacy-preserving purposes. In this paper, we therefore study subgraph similarity search on large probabilistic graph databases. Unlike previous works, which assume that the edges of an uncertain graph are independent of each other, we study uncertain graphs in which edge occurrences are correlated. We formally prove that subgraph similarity search over probabilistic graphs is #P-complete, so we employ a filter-and-verify framework to speed up the search. In the filtering phase, we develop tight lower and upper bounds on the subgraph similarity probability based on a probabilistic matrix index (PMI). The PMI is composed of discriminative subgraph features, each associated with tight lower and upper bounds on its subgraph isomorphism probability. Using the PMI, we can filter out a large number of probabilistic graphs and maximize the pruning capability. In the verification phase, we develop an efficient sampling algorithm to validate the remaining candidates. Extensive experiments verify the efficiency of the proposed solutions.
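The filter-and-verify idea can be illustrated with a minimal Python sketch. It is not the PMI index or the sampling algorithm of the paper, and it rests on several simplifying assumptions: edges are treated as independent (the paper studies correlated edges), the query must be matched exactly rather than within a similarity threshold, graphs are undirected and small enough for brute-force matching, and the two bounds are generic stand-ins (a Markov bound and a single-embedding bound). All function names and the example data are hypothetical.

```python
import itertools
import random


def embeddings(query_edges, nodes, edges):
    """Yield node mappings that embed the query into an undirected edge set (brute force)."""
    q_nodes = sorted({v for e in query_edges for v in e})
    edge_set = {frozenset(e) for e in edges}
    for image in itertools.permutations(nodes, len(q_nodes)):
        m = dict(zip(q_nodes, image))
        if all(frozenset((m[u], m[v])) in edge_set for u, v in query_edges):
            yield m


def contains(query_edges, nodes, edges):
    """True if the query is a (non-induced) subgraph of the given edge set."""
    return next(embeddings(query_edges, nodes, edges), None) is not None


def bounds(query_edges, nodes, prob):
    """Return (lower, upper) bounds on Pr[a possible world contains the query].

    prob maps each edge (u, v) to its existence probability.
    ASSUMPTION: edges are independent; these are not the paper's PMI bounds.
    """
    lookup = {frozenset(e): p for e, p in prob.items()}
    lower, found = 0.0, False
    for m in embeddings(query_edges, nodes, prob.keys()):
        found = True
        p = 1.0
        for u, v in query_edges:
            p *= lookup[frozenset((m[u], m[v]))]
        lower = max(lower, p)  # one fully present embedding is enough for a match
    if not found:
        return 0.0, 0.0  # no embedding even when every edge is present
    # Markov bound: a match needs at least len(query_edges) present edges.
    upper = min(1.0, sum(prob.values()) / len(query_edges))
    return lower, upper


def sample_probability(query_edges, nodes, prob, n_samples, rng):
    """Monte Carlo estimate: sample possible worlds and count those containing the query."""
    hits = 0
    for _ in range(n_samples):
        world = [e for e, p in prob.items() if rng.random() < p]
        hits += contains(query_edges, nodes, world)
    return hits / n_samples


def search(query_edges, database, threshold, n_samples=2000, seed=0):
    """Filter with the bounds, then verify the remaining candidates by sampling."""
    rng = random.Random(seed)
    answers = []
    for name, (nodes, prob) in database.items():
        lower, upper = bounds(query_edges, nodes, prob)
        if upper < threshold:
            continue  # pruned: cannot reach the threshold
        if lower >= threshold:
            answers.append(name)  # accepted without verification
        elif sample_probability(query_edges, nodes, prob, n_samples, rng) >= threshold:
            answers.append(name)
    return answers


# Hypothetical example: a triangle query over four small probabilistic graphs.
TRIANGLE = [("x", "y"), ("y", "z"), ("x", "z")]
K4_EDGES = list(itertools.combinations("abcd", 2))
DATABASE = {
    "g1": (["a", "b", "c"], {("a", "b"): 0.9, ("b", "c"): 0.9, ("a", "c"): 0.9}),
    "g2": (["a", "b", "c"], {("a", "b"): 0.1, ("b", "c"): 0.1, ("a", "c"): 0.1}),
    "g3": (["a", "b", "c"], {("a", "b"): 0.9, ("b", "c"): 0.9}),
    "g4": (["a", "b", "c", "d"], {e: 0.7 for e in K4_EDGES}),
}
result = search(TRIANGLE, DATABASE, threshold=0.5)
```

With a threshold of 0.5, g2 and g3 are pruned by the upper bound, g1 is accepted by the lower bound, and only g4 reaches the sampling step. The design point is the same as in the abstract: cheap bounds settle most graphs, so the costly estimate runs only on the few candidates whose bounds straddle the threshold.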