Finding recurring residue packing patterns, or spatial motifs, that characterize protein structural families is an important problem in bioinformatics. We apply a novel frequent subgraph mining algorithm to three graph representations of protein three-dimensional (3D) structure. In each protein graph, a vertex represents an amino acid residue. Vertices are connected by edges using one of three approaches: first, a simple distance threshold between contact residues; second, the Delaunay tessellation from computational geometry; and third, the recently developed almost-Delaunay tessellation.

Applying a frequent subgraph mining algorithm to a set of graphs representing a protein family from the Structural Classification of Proteins (SCOP) database, we typically identify several hundred common subgraphs, each corresponding to a packing motif found in the majority of proteins in the family. We also use the counts of motifs extracted from proteins in two different SCOP families as input variables in a binary classification experiment. The resulting models predict protein family membership with accuracy exceeding 90 percent. Our results indicate that graphs based on the almost-Delaunay and Delaunay tessellations are sparser than the contact distance graphs, yet they remain robust and efficient for mining protein spatial motifs.
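As an illustration of the first of the three graph representations, the following minimal Python sketch builds a contact distance graph from residue coordinates. It is not the authors' implementation: the function name `contact_graph`, the one-point-per-residue input, the 6.0 angstrom threshold, and the toy coordinates are all assumptions made for this example. The Delaunay and almost-Delaunay variants would keep the same vertices and change only the rule that decides which pairs become edges.

```python
from itertools import combinations
from math import dist


def contact_graph(residues, threshold):
    """Build a contact distance graph (hypothetical helper).

    residues:  list of (label, (x, y, z)) tuples, one point per residue.
    threshold: maximum distance at which two residues are joined by an edge.
    Returns (labels, edges), where edges is a set of index pairs (i, j), i < j.
    """
    labels = [label for label, _ in residues]
    edges = set()
    # Compare every unordered pair of residues once.
    for (i, (_, p)), (j, (_, q)) in combinations(enumerate(residues), 2):
        if dist(p, q) <= threshold:
            edges.add((i, j))
    return labels, edges


# Toy coordinates, made up for illustration only.
toy = [
    ("ALA", (0.0, 0.0, 0.0)),
    ("GLY", (3.0, 0.0, 0.0)),
    ("LEU", (0.0, 4.0, 0.0)),
    ("SER", (20.0, 20.0, 20.0)),
]
labels, edges = contact_graph(toy, threshold=6.0)
print(sorted(edges))
```

In this toy input the first three residues lie within the threshold of one another and form a triangle, while the fourth is isolated. The labeled graphs produced this way, one per protein, are the kind of input a frequent subgraph miner would take.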