SAGA: a subgraph matching tool for biological graphs

Authors:
Yuanyuan Tian;Richard C. McEachin;Carlos Santos;David J. States;Jignesh M. Patel
Affiliations:
Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA;National Center for Integrative Biomedical Informatics, University of Michigan, Ann Arbor, MI 48109, USA;Department of Human Genetics and Bioinformatics Program, University of Michigan, Ann Arbor, MI 48109, USA;Department of Human Genetics and Bioinformatics Program, University of Michigan, Ann Arbor, MI 48109, USA;Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
Venue:
Bioinformatics
Year:
2007

Citing 0
Cited 30

Towards graph containment search and indexing

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Graphs-at-a-time: query language and access methods for graph databases

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Periscope/GQ: a graph querying toolkit

Proceedings of the VLDB Endowment
G-hash: towards fast kernel-based similarity search in large graph databases

Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Comparing stars: on approximating graph edit distance

Proceedings of the VLDB Endowment
Distance-join: pattern match query in a large graph database

Proceedings of the VLDB Endowment
DOGMA: A Disk-Oriented Graph Matching Algorithm for RDF Databases

ISWC '09 Proceedings of the 8th International Semantic Web Conference
Sentence generation for artificial brains: A glocal similarity-matching approach

Neurocomputing
On graph query optimization in large networks

Proceedings of the VLDB Endowment
A tool for fast indexing and querying of graphs

Proceedings of the 20th international conference companion on World wide web
Computing subgraph isomorphic queries using structural unification and minimum graph structures

Proceedings of the 2011 ACM Symposium on Applied Computing
Structure and attribute index for approximate graph matching in large graphs

Information Systems
Neighborhood based fast graph search in large networks

Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Asymmetric Comparison and Querying of Biological Networks

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
An edge-based framework for fast subgraph matching in a large graph

DASFAA'11 Proceedings of the 16th international conference on Database systems for advanced applications - Volume Part I
Subgraph search over massive disk resident graphs

SSDBM'11 Proceedings of the 23rd international conference on Scientific and statistical database management
From graphs to events: a subgraph matching approach for information extraction from biomedical text

BioNLP Shared Task '11 Proceedings of the BioNLP Shared Task 2011 Workshop
Answering pattern match queries in large graph databases via graph embedding

The VLDB Journal — The International Journal on Very Large Data Bases
Approximate matching over biological RDF graphs

Proceedings of the 27th Annual ACM Symposium on Applied Computing
WSM: a novel algorithm for subgraph matching in large weighted graphs

Journal of Intelligent Information Systems
Review of bisonet abstraction techniques

Bisociative Knowledge Discovery
Faster subgraph isomorphism detection by well-founded total order indexing

Pattern Recognition Letters
QSEA for fuzzy subgraph querying of KEGG pathways

Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine
Top-k Similar Graph Matching Using TraM in Biological Networks

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
NeMa: fast graph search with label similarity

Proceedings of the VLDB Endowment
Using substructure mining to identify misbehavior in network provenance graphs

First International Workshop on Graph Data Management Experiences and Systems
Graph similarity search with edit distance constraint in large graph databases

Proceedings of the 22nd ACM international conference on information & knowledge management
Facilitating representation and retrieval of structured cases: Principles and toolkit

Information Systems
SQBC: An efficient subgraph matching method over large and dense graphs

Information Sciences: an International Journal
Querying KEGG pathways in logic

International Journal of Data Mining and Bioinformatics

Quantified Score

Hi-index	3.84

Abstract

Motivation: With the rapid increase in the availability of biological graph datasets, there is a growing need for effective and efficient graph querying methods. Due to the noisy and incomplete characteristics of these datasets, exact graph matching methods have limited use and approximate graph matching methods are required. Unfortunately, existing graph matching methods are too restrictive as they only allow exact or near exact graph matching. This paper presents a novel approximate graph matching technique called SAGA. This technique employs a flexible model for computing graph similarity, which allows for node gaps, node mismatches and graph structural differences. SAGA employs an indexing technique that allows it to efficiently evaluate queries even against large graph datasets.

Results: SAGA has been used to query biological pathways and literature datasets, which has revealed interesting similarities between distinct pathways that cannot be found by existing methods. These matches associate seemingly unrelated biological processes, connect studies in different sub-areas of biomedical research and thus pose hypotheses for new discoveries. SAGA is also orders of magnitude faster than existing methods.

Availability: SAGA can be accessed freely via the web at http://www.eecs.umich.edu/saga. Binaries are also freely available at this website.

Contact: jignesh@eecs.umich.edu

Supplementary material: Supplementary material is available at http://www.eecs.umich.edu/periscope/publ/saga-suppl.pdf.
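
The abstract names three ingredients of the similarity model: node gaps, node mismatches and graph structural differences. The Python sketch below illustrates how such a score could be combined for one candidate match. It is not SAGA's implementation: the function names, the weights, the 0/1 label mismatch penalty, the use of hop distances for the structural term and the `unreachable` cap are all assumptions made for illustration only.

```python
from collections import deque
from itertools import combinations


def hop_distances(adj, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def match_distance(query_adj, query_labels, target_adj, target_labels, mapping,
                   w_struct=1.0, w_mismatch=1.0, w_gap=1.0, unreachable=10):
    """Toy distance for one mapping (query node -> target node); lower is better.

    All weights and the unreachable cap are hypothetical parameters.
    """
    # Node gaps: query nodes with no counterpart in the match.
    gaps = sum(1 for q in query_adj if q not in mapping)
    # Node mismatches: mapped pairs whose labels differ (0/1 penalty, an assumption).
    mismatches = sum(1 for q, t in mapping.items()
                     if query_labels[q] != target_labels[t])
    # Structural difference: how much pairwise hop distances change under the mapping.
    q_dist = {q: hop_distances(query_adj, q) for q in mapping}
    t_dist = {t: hop_distances(target_adj, t) for t in mapping.values()}
    struct = 0
    for a, b in combinations(mapping, 2):
        dq = q_dist[a].get(b, unreachable)
        dt = t_dist[mapping[a]].get(mapping[b], unreachable)
        struct += abs(dq - dt)
    return w_struct * struct + w_mismatch * mismatches + w_gap * gaps


# Query: a three-node path A - B - C (undirected adjacency lists).
query_adj = {"q1": ["q2"], "q2": ["q1", "q3"], "q3": ["q2"]}
query_labels = {"q1": "A", "q2": "B", "q3": "C"}

# Target: the same path with an extra node X inserted between A and B.
target_adj = {"t1": ["t2"], "t2": ["t1", "t3"], "t3": ["t2", "t4"], "t4": ["t3"]}
target_labels = {"t1": "A", "t2": "X", "t3": "B", "t4": "C"}

# All labels agree, but the inserted node stretches two of the pairwise distances.
stretched = match_distance(query_adj, query_labels, target_adj, target_labels,
                           {"q1": "t1", "q2": "t3", "q3": "t4"})
print(stretched)  # 2.0

# Structure is preserved, but one label differs and one query node is left unmapped.
partial = match_distance(query_adj, query_labels, target_adj, target_labels,
                         {"q1": "t1", "q2": "t2"})
print(partial)  # 2.0
```

With equal weights the two candidate matches tie, which shows why the weighting matters: raising `w_gap` or `w_mismatch` in this sketch would favour the structurally stretched match, while raising `w_struct` would favour the partial one.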