Geometric suffix tree: Indexing protein 3-D structures
Journal of the ACM (JACM)
Protein structure analysis is one of the most important research issues in the post-genomic era, and faster and more accurate query data structures for protein 3-D structures are highly desired. This paper proposes a new data structure for indexing protein 3-D structures. For strings, there are many efficient indexing structures such as suffix trees, but designing comparably sophisticated data structures for 3-D structures like proteins has been considered very difficult. Our index structure is based on suffix trees and is called the geometric suffix tree. By using the geometric suffix tree for a set of protein structures, we can search for all of their substructures whose RMSDs (root mean square deviations) or URMSDs (unit-vector root mean square deviations) to a given query 3-D structure are not larger than a given bound. Though there are O(N^2) substructures, our data structure requires only O(N) space, where N is the sum of the lengths of the proteins in the set. We propose an O(N^2) construction algorithm for it, while a naive algorithm would require O(N^3) time. Moreover, we propose an efficient search algorithm. We also present computational experiments that demonstrate the practicality of our data structure. The experiments show that, in practice, the construction time of the geometric suffix tree is almost linear in the size of the database when applied to a protein structure database.
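To make the query in the abstract concrete, the following is a minimal sketch of a brute-force baseline, not the geometric suffix tree itself: it slides a window over one chain of 3-D coordinates and reports every start position whose substructure has an RMSD to the query no larger than a bound. It assumes the standard definition of RMSD as the minimum over rotations and translations, computed here with Horn's quaternion formulation and a shifted power iteration. The function names (`centered`, `rmsd`, `naive_search`) and the iteration count are illustrative assumptions, not taken from the paper.

```python
import math


def centered(points):
    """Translate points so that their centroid is at the origin."""
    n = len(points)
    c = [sum(p[k] for p in points) / n for k in range(3)]
    return [(p[0] - c[0], p[1] - c[1], p[2] - c[2]) for p in points]


def rmsd(a, b, iterations=500):
    """RMSD between two equal-length 3-D point lists, minimized over
    rotations and translations (Horn's quaternion formulation)."""
    if len(a) != len(b) or not a:
        raise ValueError("point lists must be non-empty and of equal length")
    a, b = centered(a), centered(b)
    n = len(a)
    ga = sum(x * x + y * y + z * z for x, y, z in a)
    gb = sum(x * x + y * y + z * z for x, y, z in b)
    # Cross-covariance: s[j][k] = sum_i a_i[j] * b_i[k]
    s = [[sum(p[j] * q[k] for p, q in zip(a, b)) for k in range(3)]
         for j in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    # Horn's symmetric 4x4 matrix; its largest eigenvalue equals the
    # maximum of sum_i (R a_i) . b_i over all rotations R.
    nmat = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # All eigenvalues lie in [-shift, shift] (Cauchy-Schwarz), so adding
    # shift * I makes the matrix positive semidefinite for power iteration.
    shift = math.sqrt(ga * gb)
    if shift == 0.0:
        return math.sqrt((ga + gb) / n)
    m = [[nmat[r][c] + (shift if r == c else 0.0) for c in range(4)]
         for r in range(4)]
    # Assumption: the start vector is not orthogonal to the top eigenvector.
    v = [1.0, 0.9, 0.8, 0.7]
    for _ in range(iterations):
        w = [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient of the unshifted matrix estimates its top eigenvalue.
    nv = [sum(nmat[r][c] * v[c] for c in range(4)) for r in range(4)]
    lam = sum(v[r] * nv[r] for r in range(4)) / sum(x * x for x in v)
    return math.sqrt(max(0.0, (ga + gb - 2.0 * lam) / n))


def naive_search(chain, query, bound):
    """Start indices of all windows of `chain` with RMSD to `query` <= bound."""
    m = len(query)
    return [i for i in range(len(chain) - m + 1)
            if rmsd(chain[i:i + m], query) <= bound]


query = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
# The query rotated 90 degrees about the z-axis and shifted by (5, -2, 1).
moved = [(-y + 5, x - 2, z + 1) for x, y, z in query]
chain = [(9, 9, 9), (7, 3, 1)] + moved + [(0, 5, 2)]
hits = naive_search(chain, query, bound=0.1)
```

Each query against this baseline superimposes every window of the chain, so its cost grows with the database size times the query length; avoiding that per-window work is the motivation for an index. The power iteration is chosen only to keep the sketch dependency-free; an SVD-based superposition would be the more common choice where a linear algebra library is available.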