This paper presents a method for finding patterns in 3D graphs. Each node in a graph is an undecomposable (atomic) unit and has a label; edges are links between the atomic units. Patterns are rigid substructures that may occur in a graph after allowing for an arbitrary number of whole-structure rotations and translations as well as a small, user-specified number of edit operations on the pattern or on the graph. (When a pattern appears in a graph only after the graph has been modified, we call that appearance an "approximate occurrence.") The edit operations include relabeling a node, deleting a node, and inserting a node. The proposed method is based on the geometric hashing technique, which hashes node-triplets of the graphs into a 3D table and compresses the label-triplets in the table. To demonstrate the utility of our algorithms, we discuss two applications in scientific data mining. First, we apply the method to locating frequently occurring motifs in two families of proteins, pertaining to RNA-directed DNA polymerase and thymidylate synthase, and use the motifs to classify the proteins. Second, we apply the method to clustering chemical compounds pertaining to aromatics, bicyclic alkanes, and photosynthesis. Experimental results indicate good performance of our algorithms and high recall and precision rates for both classification and clustering.
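To make the hashing step concrete, the following is a minimal sketch of how node-triplets might be hashed into a 3D table and looked up again. It is an illustration under stated assumptions, not the paper's algorithm: the key used here (the three side lengths of a triplet, sorted and quantized) is one simple choice that is invariant under rotation and translation, and the names `triplet_key`, `build_table`, `vote`, and `bin_width` are hypothetical. The sketch handles exact occurrences only; it does not model the edit operations or the compression of label-triplets described above.

```python
from collections import defaultdict
from itertools import combinations
import math


def triplet_key(p, q, r, bin_width=0.5):
    """Return a 3D table index for a node-triplet.

    Assumption: the index is the three side lengths of the triangle
    (p, q, r), sorted and quantized. Distances do not change under
    rotation or translation, so neither does the key.
    """
    sides = sorted((math.dist(p, q), math.dist(q, r), math.dist(p, r)))
    return tuple(int(s // bin_width) for s in sides)


def build_table(graphs, bin_width=0.5):
    """Hash every node-triplet of every graph into the table.

    `graphs` maps a graph id to a list of (label, (x, y, z)) nodes.
    Each table cell stores (graph id, sorted label-triplet, node indices).
    """
    table = defaultdict(list)
    for gid, nodes in graphs.items():
        for (i, a), (j, b), (k, c) in combinations(enumerate(nodes), 3):
            key = triplet_key(a[1], b[1], c[1], bin_width)
            labels = tuple(sorted((a[0], b[0], c[0])))
            table[key].append((gid, labels, (i, j, k)))
    return table


def vote(table, pattern, bin_width=0.5):
    """Count, per graph, the pattern triplets that hit a matching cell."""
    votes = defaultdict(int)
    for a, b, c in combinations(pattern, 3):
        key = triplet_key(a[1], b[1], c[1], bin_width)
        labels = tuple(sorted((a[0], b[0], c[0])))
        for gid, stored_labels, _ in table.get(key, ()):
            if stored_labels == labels:
                votes[gid] += 1
    return dict(votes)


# Usage: g1 contains the pattern rotated 90 degrees about the z-axis and
# translated by (5, 5, 5), plus one extra node; g2 has a different shape.
pattern = [("C", (0.0, 0.0, 0.0)), ("N", (1.2, 0.0, 0.0)), ("O", (0.0, 2.3, 0.0))]
graphs = {
    "g1": [("C", (5.0, 5.0, 5.0)), ("N", (5.0, 6.2, 5.0)),
           ("O", (2.7, 5.0, 5.0)), ("H", (9.0, 9.0, 9.0))],
    "g2": [("C", (0.0, 0.0, 0.0)), ("N", (3.0, 0.0, 0.0)), ("O", (0.0, 4.0, 0.0))],
}
table = build_table(graphs)
result = vote(table, pattern)
print(result)
```

In this sketch only `g1` receives a vote, because its C-N-O triplet has the same side lengths and labels as the pattern. Two simplifications are worth noting: floor-based quantization can split nearly equal distances across neighbouring bins, and sorting the labels independently of the side lengths discards which label sits at which corner, so a production design would need tolerance at bin boundaries and a verification step after voting.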