Prediction of mRNA polyadenylation sites by support vector machine

  • Authors:
  • Yiming Cheng;Robert M. Miura;Bin Tian

  • Affiliations:
  • Department of Mathematical Sciences, New Jersey Institute of Technology, Newark, NJ 07102, USA;Department of Mathematical Sciences, New Jersey Institute of Technology, Newark, NJ 07102, USA;Department of Biochemistry and Molecular Biology, New Jersey Medical School, University of Medicine and Dentistry of New Jersey, Newark, NJ 07101, USA

  • Venue:
  • Bioinformatics
  • Year:
  • 2006

Abstract

mRNA polyadenylation is responsible for the 3' end formation of most mRNAs in eukaryotic cells and is linked to termination of transcription. Prediction of mRNA polyadenylation sites [poly(A) sites] can help identify genes, define gene boundaries, and elucidate regulatory mechanisms. Current methods for poly(A) site prediction achieve moderate sensitivity and specificity. Here, we present a method that uses a support vector machine (SVM) for poly(A) site prediction. Using 15 cis-regulatory elements that are over-represented in various regions surrounding poly(A) sites, this method achieves higher sensitivity and similar specificity when compared with polyadq, a common tool for poly(A) site prediction. In addition, we found that while the polyadenylation signal AAUAAA and U-rich elements are primary determinants for poly(A) site prediction, other elements contribute to both the sensitivity and the specificity of the prediction, indicating a combinatorial mechanism involving multiple elements in the choice of poly(A) sites in human cells.

Contact: btian@umdnj.edu
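
To illustrate how occurrences of cis-elements around a candidate poly(A) site could be turned into numeric input for an SVM, a minimal Python sketch follows. It is not the authors' implementation. The two motifs, the two 40-nt windows, and the names `MOTIFS`, `WINDOWS`, `count_overlapping`, and `site_features` are hypothetical choices made for illustration; they are not the 15 elements or the regions used in this work.

```python
from typing import Dict, List, Tuple

# Hypothetical motifs and windows, for illustration only.
# They are NOT the 15 cis-elements or the regions used in the paper.
MOTIFS: Dict[str, str] = {"PAS": "AAUAAA", "U_rich": "UUUU"}
WINDOWS: Dict[str, Tuple[int, int]] = {
    "upstream": (-40, 0),    # 40 nt immediately before the candidate site
    "downstream": (0, 40),   # 40 nt starting at the candidate site
}


def count_overlapping(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlapping matches."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def site_features(seq: str, site: int) -> List[int]:
    """Return one motif count per (window, motif) pair around `site`.

    `seq` may be DNA or RNA; T is converted to U before matching.
    `site` is a 0-based index of the candidate cleavage position.
    """
    rna = seq.upper().replace("T", "U")
    features: List[int] = []
    for lo, hi in WINDOWS.values():
        # Clamp the window to the sequence boundaries.
        a = max(0, site + lo)
        b = max(0, min(len(rna), site + hi))
        window = rna[a:b]
        for motif in MOTIFS.values():
            features.append(count_overlapping(window, motif))
    return features
```

For example, with the sequence `"GGAATAAA" + "C" * 15 + "GTTTTTGG"` and `site = 23`, the function returns `[1, 0, 0, 2]`: one AAUAAA upstream and two overlapping UUUU matches downstream. Feature vectors of this kind, computed for known sites and for negative control positions, could then be passed to any standard SVM implementation for training.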