Statistical feature selection from chaos game representation for promoter recognition
ICCS'06 Proceedings of the 6th international conference on Computational Science - Volume Part II
Bioinformatics is currently a very attractive field. Many fascinating biological problems remain unsolved, even though a great number of diverse genomes have been sequenced with the arrival of the post-genome era. Currently available programs are far from powerful enough to recognize regulatory signals completely. Researchers have looked for various types of patterns around the transcription start site (TSS) and tried to translate them into classification rules; however, these rules have not always been good solutions. In this paper, we propose a new hybrid learning system to recognize regulatory elements (i.e., promoters) in deoxyribonucleic acid (DNA) sequences. The proposed hybrid system calculates the distributions of oligonucleotide statistics as positional weight matrices, which help to discriminate promoters from non-promoters. This study can help to locate the expressed regions of DNA and to predict and understand the properties, structures, and functions of the proteins synthesized from the coding regions of DNA. The system was evaluated on benchmark datasets using the leave-one-out method. The experimental results demonstrate that the proposed system achieves higher accuracy than other methods.
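As an illustration of the general idea of positional weight matrices and leave-one-out evaluation, the following is a minimal sketch and not the authors' hybrid system. It assumes single-nucleotide (rather than oligonucleotide) statistics, equal-length aligned sequences, add-one smoothing, and a simple log-odds decision rule. The function names and the toy sequences are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

def build_pwm(seqs, pseudocount=1.0):
    """Per-position nucleotide probabilities with additive smoothing (assumed)."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    total = len(seqs) + pseudocount * len(ALPHABET)
    pwm = []
    for pos in range(length):
        counts = Counter(s[pos] for s in seqs)
        pwm.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return pwm

def log_odds_score(seq, pwm_pos, pwm_neg):
    """Sum over positions of log(P(base | promoter) / P(base | non-promoter))."""
    return sum(math.log(pwm_pos[i][b] / pwm_neg[i][b]) for i, b in enumerate(seq))

def is_promoter(seq, pwm_pos, pwm_neg):
    """Hypothetical decision rule: a positive log-odds score means promoter."""
    return log_odds_score(seq, pwm_pos, pwm_neg) > 0

def leave_one_out_accuracy(positives, negatives):
    """Hold out each sequence in turn, rebuild both matrices, classify it."""
    correct = 0
    for i, seq in enumerate(positives):
        rest = positives[:i] + positives[i + 1:]
        correct += is_promoter(seq, build_pwm(rest), build_pwm(negatives))
    for i, seq in enumerate(negatives):
        rest = negatives[:i] + negatives[i + 1:]
        correct += not is_promoter(seq, build_pwm(positives), build_pwm(rest))
    return correct / (len(positives) + len(negatives))

# Made-up toy sequences, not taken from any benchmark dataset.
toy_promoters = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
toy_non_promoters = ["GCGCGC", "CCGGCG", "GGCCGC", "GCCGGC"]

pwm_pos = build_pwm(toy_promoters)
pwm_neg = build_pwm(toy_non_promoters)
loo_accuracy = leave_one_out_accuracy(toy_promoters, toy_non_promoters)
```

Calling `is_promoter("TATAAT", pwm_pos, pwm_neg)` scores a candidate window against both matrices. In a realistic setting the columns would index oligonucleotides at each offset from the TSS rather than single bases, and the decision threshold would be tuned rather than fixed at zero.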