High-Order Pattern Discovery from Discrete-Valued Data
IEEE Transactions on Knowledge and Data Engineering
Granular computing: an emerging paradigm
Information Sciences—Informatics and Computer Science: An International Journal
Multipattern consensus regions in multiple aligned protein sequences and their segmentation
EURASIP Journal on Bioinformatics and Systems Biology
InfoBarcoding: Selection of non-contiguous sites in molecular biomarker
ICCABS '11 Proceedings of the 2011 IEEE 1st International Conference on Computational Advances in Bio and Medical Sciences
The relationship among biomolecular sequence, molecular structure, and biological function is of great importance in nanostructure analysis, for example in drug discovery. Previous studies of multiple sequence alignments of biomolecules have shown that associated sites are indicative of the structural and functional characteristics of biomolecules, comparably to methods such as consensus sequence analysis. In this paper, a new method for detecting associated sites in aligned sequence ensembles is proposed. It applies two-dimensional contingency table analysis over multiple sub-tables (or levels). The idea is to incorporate granular computing, a concept that represents information at different levels of granularity. The analysis has two phases. In the first phase, the molecular sites in the p53 protein multiple sequence alignment are labeled according to the detected associated patterns; each site is assigned one of three types based on its characteristics: 1) conserved sites, 2) associated sites, and 3) hypervariate sites. In the second phase, the significance of the extracted site patterns is evaluated with respect to targeted structural and functional characteristics of the p53 protein. The results indicate that the extracted site patterns are significantly associated with some of the known functionalities of p53, a cancer suppressor. Furthermore, when these sites are aligned, on the basis of the common domains, with p63 and p73 (homologs of p53 that lack the same cancer-suppressing property), they significantly discriminate between the human sequences of the p53 family. The study therefore confirms the importance of the detected sites, which could indicate the differences in cancer-suppressing property within the family.
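The site-labeling idea in the first phase can be illustrated with a minimal sketch. It is not the paper's method: it runs a single level of two-dimensional contingency table analysis rather than the multi-level (granular) sub-table analysis described above, and it uses Cramér's V as a swapped-in association measure. The function names, the `conserved_frac` and `v_threshold` parameters, and the toy alignment are all hypothetical choices made for illustration.

```python
from collections import Counter
from math import sqrt


def chi_square(col_a, col_b):
    """Pearson chi-square statistic of the 2-D contingency table of two columns."""
    n = len(col_a)
    joint = Counter(zip(col_a, col_b))   # observed counts per residue pair
    row = Counter(col_a)                 # marginal counts for site A
    col = Counter(col_b)                 # marginal counts for site B
    stat = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            observed = joint.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def cramers_v(col_a, col_b):
    """Chi-square rescaled to [0, 1]; an assumed stand-in for a significance test."""
    n = len(col_a)
    k = min(len(set(col_a)), len(set(col_b)))
    if k < 2:
        return 0.0  # a constant column cannot be associated with anything
    return sqrt(chi_square(col_a, col_b) / (n * (k - 1)))


def label_sites(alignment, conserved_frac=0.9, v_threshold=0.8):
    """Label each alignment column as conserved, associated, or hypervariate.

    Both thresholds are hypothetical defaults, not values from the paper.
    """
    columns = list(zip(*alignment))
    n_seq = len(alignment)
    labels = [None] * len(columns)
    variable = []
    for i, column in enumerate(columns):
        top_count = Counter(column).most_common(1)[0][1]
        if top_count / n_seq >= conserved_frac:
            labels[i] = "conserved"
        else:
            variable.append(i)
    for i in variable:
        # A variable site is "associated" if it co-varies with any other variable site.
        associated = any(
            cramers_v(columns[i], columns[j]) >= v_threshold
            for j in variable
            if j != i
        )
        labels[i] = "associated" if associated else "hypervariate"
    return labels


# Toy alignment (made up): site 0 is constant, sites 1 and 2 co-vary, site 3 varies independently.
toy_alignment = ["AKLD", "AKLE", "ARMD", "ARME"]
print(label_sites(toy_alignment))
```

On a real alignment, a proper significance test with a multiple-comparison correction would likely be needed in place of the fixed `v_threshold`, since small samples can produce high association scores by chance.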