Nuclear localization signal prediction based on sequential pattern mining
Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine
Motivation: Nucleo-cytoplasmic trafficking of proteins is a core regulatory process that sustains the integrity of the nuclear space of eukaryotic cells via an interplay between numerous factors. Despite progress on experimentally characterizing a number of nuclear localization signals, their presence alone remains an unreliable indicator of actual translocation.
Results: This article introduces a probabilistic model that explicitly recognizes a variety of nuclear localization signals, and integrates relevant amino acid sequence and interaction data for any candidate nuclear protein. In particular, we develop and incorporate scoring functions based on distinct classes of classical nuclear localization signals. Our empirical results show that the model accurately predicts whether a protein is imported into the nucleus, surpassing the classification accuracy of similar predictors when evaluated on the mouse and yeast proteomes (area under the receiver operating characteristic curve of 0.84 and 0.80, respectively). The model also predicts the sequence position of a nuclear localization signal and whether it interacts with importin-α.
Availability: http://pprowler.itee.uq.edu.au/NucImport
Contact: m.boden@uq.edu.au
Supplementary information: Supplementary data are available at Bioinformatics online.