Given the rapidly increasing quantity of available genomic and proteomic data, efficient and reliable analysis of protein 3D structures has become a major challenge in the post-genomic era. In this work, we introduce the sorted protein shape context and its encoding into a protein shape string as an effective descriptor for protein 3D structures. Based on this encoding, we present a method for predicting the functional family of a given protein 3D structure. Applying the proposed method to a dataset of known protein families from Pfam resulted in an average Type I error rate of 10% and an average Type II error rate of 0.1%.
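As an illustration of how averaged Type I and Type II error rates of this kind can be computed, the following is a minimal sketch, not the authors' evaluation code. It assumes the standard convention, which the abstract does not spell out: for each family treated as the positive class, the Type I rate is the fraction of non-members wrongly assigned to the family and the Type II rate is the fraction of members that were missed. The function names and the one-versus-rest averaging are assumptions made for this example.

```python
def error_rates(true_labels, predicted_labels, family):
    """Return (type_i, type_ii) for one family treated as the positive class.

    Assumed convention (not confirmed by the abstract):
    type_i  = false positives / number of non-members
    type_ii = false negatives / number of members
    """
    fp = fn = positives = negatives = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == family:
            positives += 1
            if pred != family:
                fn += 1  # a member of the family was missed
        else:
            negatives += 1
            if pred == family:
                fp += 1  # a non-member was assigned to the family
    type_i = fp / negatives if negatives else 0.0
    type_ii = fn / positives if positives else 0.0
    return type_i, type_ii


def average_error_rates(true_labels, predicted_labels):
    """Average the per-family rates over all families present in true_labels."""
    families = sorted(set(true_labels))
    rates = [error_rates(true_labels, predicted_labels, f) for f in families]
    n = len(rates)
    return (sum(r[0] for r in rates) / n, sum(r[1] for r in rates) / n)


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the paper.
    truth = ["A", "A", "B", "B"]
    predicted = ["A", "B", "B", "B"]
    print(average_error_rates(truth, predicted))
```

Averaging per family gives each family equal weight regardless of its size; pooling all predictions before dividing would instead weight large families more heavily.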