TrueSight: self-training algorithm for splice junction detection using RNA-seq

  • Authors:
  • Yang Li; Hong-Mei Li; Paul Burns; Mark Borodovsky; Gene E. Robinson; Jian Ma

  • Affiliations:
  • Department of Bioengineering, University of Illinois, Urbana-Champaign, USA and Institute for Genomic Biology, University of Illinois, Urbana-Champaign; Institute for Genomic Biology, University of Illinois, Urbana-Champaign, USA and Department of Entomology, University of Illinois, Urbana-Champaign; Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology; Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, USA and School of Computational Science & Engineering, Georgia Institute of Technology; Institute for Genomic Biology, University of Illinois, Urbana-Champaign, USA and Department of Entomology, University of Illinois, Urbana-Champaign; Department of Bioengineering, University of Illinois, Urbana-Champaign, USA and Institute for Genomic Biology, University of Illinois, Urbana-Champaign

  • Venue:
  • RECOMB'12: Proceedings of the 16th Annual International Conference on Research in Computational Molecular Biology
  • Year:
  • 2012

Abstract

RNA-seq has proven to be a powerful technique for transcriptome profiling based on next-generation sequencing (NGS) technologies. However, because NGS reads are short, it is extremely challenging to map RNA-seq reads accurately to splice junctions, a step that is critically important for the analysis of alternative splicing and isoform construction. Several tools have been developed to find splice junctions de novo from RNA-seq data, without the aid of gene annotations [1-3]. However, the sensitivity and specificity of these tools still need to be improved. In this paper, we describe a novel method, called TrueSight, that combines information from (i) RNA-seq read mapping quality and (ii) coding potential from the reference genome sequences into a unified model that uses semi-supervised learning to precisely identify splice junctions.
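
To make the idea of self-training concrete, the following is a minimal, generic sketch and not the TrueSight implementation, whose model details are not given in this abstract. It assumes each candidate junction is summarized by two hypothetical features scaled to [0, 1] (a mapping-quality score and a coding-potential score), uses plain logistic regression as a stand-in classifier, and runs on synthetic data. A small set of trusted labels seeds the model, and confidently scored unlabeled candidates are added to the training set in each round.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, x):
    # w = [bias, weight_1, weight_2, ...]; returns P(true junction | x).
    return sigmoid(w[0] + sum(wj * v for wj, v in zip(w[1:], x)))

def fit_logistic(X, y, epochs=500, lr=0.5):
    # Logistic regression fitted by batch gradient descent.
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def fit_on_labeled(X, labels):
    pairs = [(x, l) for x, l in zip(X, labels) if l is not None]
    return fit_logistic([x for x, _ in pairs], [l for _, l in pairs])

def self_train(X, labels, hi=0.95, lo=0.05, max_rounds=10):
    # labels: 1 = trusted junction, 0 = trusted false junction, None = unlabeled.
    # The thresholds hi/lo and max_rounds are illustrative assumptions.
    labels = list(labels)
    for _ in range(max_rounds):
        w = fit_on_labeled(X, labels)
        changed = False
        for i, x in enumerate(X):
            if labels[i] is None:
                p = predict(w, x)
                if p >= hi or p <= lo:
                    # Adopt only confident predictions as pseudo-labels.
                    labels[i] = 1 if p >= hi else 0
                    changed = True
        if not changed:
            break
    else:
        # Round limit reached: refit so the model reflects the latest pseudo-labels.
        w = fit_on_labeled(X, labels)
    return w, labels

# Synthetic candidates: (mapping_score, coding_score), both hypothetical features.
random.seed(0)
true_like = [[random.uniform(0.6, 1.0), random.uniform(0.6, 1.0)] for _ in range(30)]
false_like = [[random.uniform(0.0, 0.4), random.uniform(0.0, 0.4)] for _ in range(30)]
X = true_like + false_like
# Only the first five candidates of each group start with a trusted label.
seed_labels = [1] * 5 + [None] * 25 + [0] * 5 + [None] * 25

w, final_labels = self_train(X, seed_labels)
```

After training, `predict(w, [mapping_score, coding_score])` scores a new candidate. Candidates the model never becomes confident about keep the label `None`, so a caller can still apply its own cutoff to them.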