Remote homology detection is a key element of protein structure and function analysis in computational and experimental biology. This paper presents a simple representation of protein sequences that uses the evolutionary information in profiles for efficient remote homology detection. Frequency profiles are calculated directly from the multiple sequence alignments output by PSI-BLAST and converted into binary profiles using a probability threshold. These binary profiles serve as a new building block for protein sequences. Each protein sequence is mapped to a high-dimensional vector whose components are the numbers of occurrences of each binary profile. The resulting vectors are used to train support vector machine classifiers, which are then used to classify the test protein sequences. The method is further improved by applying an efficient feature extraction algorithm from natural language processing, namely the latent semantic analysis model. Testing on the SCOP 1.53 database shows that the method based on binary profiles outperforms those based on many other basic building blocks, including N-grams, patterns and motifs. The ROC50 score is 0.698, which is higher than that of the other methods by nearly 10 percent.
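The profile-to-vector step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy frequency profile, and the threshold value of 0.2 are all assumptions made for illustration, since the abstract does not state the threshold used. Each position of a frequency profile (20 amino acid frequencies) becomes a binary profile by marking the amino acids whose frequency exceeds the threshold, and the sequence is then represented by how often each distinct binary profile occurs.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # fixed column order for every profile row


def make_row(freqs):
    """Build one frequency-profile row from a {amino_acid: frequency} dict (toy helper)."""
    return [freqs.get(a, 0.0) for a in AMINO_ACIDS]


def binarize_profile(freq_profile, threshold=0.2):
    """Convert each row of frequencies into a binary profile (tuple of 0/1).

    The threshold value is a hypothetical placeholder, not taken from the paper.
    """
    return [tuple(1 if f > threshold else 0 for f in row) for row in freq_profile]


def count_binary_profiles(binary_rows):
    """Count how many times each distinct binary profile occurs in a sequence."""
    return Counter(binary_rows)


def to_vector(counts, vocabulary):
    """Map occurrence counts onto a fixed vocabulary of binary profiles."""
    return [counts.get(bp, 0) for bp in vocabulary]


def label(binary_row):
    """Readable name for a binary profile: the amino acids that are switched on."""
    return "".join(a for a, bit in zip(AMINO_ACIDS, binary_row) if bit)


# Toy three-position profile; real rows would come from a PSI-BLAST alignment.
profile = [
    make_row({"A": 0.7, "G": 0.3}),
    make_row({"A": 0.9, "G": 0.1}),
    make_row({"A": 0.6, "G": 0.4}),
]
binary_rows = binarize_profile(profile, threshold=0.2)
counts = count_binary_profiles(binary_rows)
vocabulary = sorted(counts)  # in practice, built once over the whole training set
vector = to_vector(counts, vocabulary)
```

In a full pipeline the vocabulary would be collected over all training sequences so that every sequence maps to a vector of the same length. Those vectors would then be passed to a classifier such as a support vector machine, optionally after a dimensionality reduction step such as latent semantic analysis.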