Prediction of methylation status on DNA sequences and identification of its important DNA sequence features

  • Authors:
  • Yoichi Yamada;Kenji Satou

  • Affiliations:
  • Graduate School of Natural Science and Technology, Kanazawa University, Kanazawa, Japan;Graduate School of Natural Science and Technology, Kanazawa University, Kanazawa, Japan

  • Venue:
  • BEBI'08 Proceedings of the 1st WSEAS international conference on Biomedical electronics and biomedical informatics
  • Year:
  • 2008

Abstract

In mammalian genomes, the cytosines of most CpG dinucleotides, except those in gene promoters, are subject to modification by a methyl group (methylation). Methylation regulates a number of mammalian genes in a development-specific or tissue-specific manner. Mammalian DNA methylation contributes to the regulation of gene expression, repression of parasitic sequences, X-chromosome inactivation in females, and genomic imprinting, among other processes. Aberrant methylation results in cancer and some genetic diseases in humans. The methylation status of the human genome therefore needs to be determined comprehensively in each cell type. However, because comprehensive methylation analyses require a great deal of time and labor, the methylation status of only a fraction of genomic regions has been determined in mammals. For this reason, it is important to train machine learning models on known methylation data and to predict the methylation status of other genomic regions. Moreover, the sequence differences between unmethylated and methylated DNA regions remain unclear and should also be determined. We therefore trained a support vector machine on our previously reported methylation data and predicted the methylation status of DNA sequences from their sequence features. Furthermore, we used a random forest to explore the sequence features that differ between unmethylated and methylated DNA sequences.
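
The abstract does not say which DNA sequence features were used. As an illustration only, the sketch below assumes three commonly used ones: GC content, the CpG observed/expected ratio, and dinucleotide frequencies. The function names `kmer_frequencies` and `sequence_features` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=2):
    """Return relative frequencies of all 4**k DNA k-mers in seq."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only k-mers made of A, C, G, T are counted; windows containing
    # other symbols (for example N) are ignored.
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def sequence_features(seq):
    """Build an illustrative feature dictionary for one DNA sequence.

    Assumed features (not confirmed by the source): GC content,
    CpG observed/expected ratio, and dinucleotide frequencies.
    """
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")
    features = {
        "gc_content": (c + g) / n if n else 0.0,
        # Observed/expected CpG = (CpG count * length) / (C count * G count)
        "cpg_obs_exp": (cpg * n) / (c * g) if c and g else 0.0,
    }
    features.update(kmer_frequencies(seq, k=2))
    return features
```

Feature dictionaries like these could be converted to numeric vectors and passed to a support vector machine for classification, and to a random forest whose feature importance scores suggest which sequence features separate unmethylated from methylated regions. The choice of classifier library is left open here, since the source does not name one.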