Motivation: A new protocol for sequencing the messenger RNA in a cell, known as RNA-Seq, generates millions of short sequence fragments in a single run. These fragments, or ‘reads’, can be used to measure levels of gene expression and to identify novel splice variants of genes. However, current software for aligning RNA-Seq data to a genome relies on known splice junctions and cannot identify novel ones. TopHat is an efficient read-mapping algorithm designed to align reads from an RNA-Seq experiment to a reference genome without relying on known splice sites.

Results: We mapped the RNA-Seq reads from a recent mammalian RNA-Seq experiment and recovered more than 72% of the splice junctions reported by the annotation-based software from that study, along with nearly 20 000 previously unreported junctions. The TopHat pipeline is much faster than previous systems, mapping nearly 2.2 million reads per CPU hour, which is sufficient to process an entire RNA-Seq experiment in less than a day on a standard desktop computer. We describe several challenges unique to ab initio splice site discovery from RNA-Seq reads that will require further algorithm development.

Availability: TopHat is free, open-source software available from http://tophat.cbcb.umd.edu

Contact: cole@cs.umd.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
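
The throughput figure quoted in the Results (nearly 2.2 million reads per CPU hour, an entire experiment in under a day) can be sanity-checked with simple arithmetic. The Python sketch below is only illustrative: it assumes a constant mapping rate on a single CPU, and the function names and the 40-million-read experiment size are hypothetical values chosen for the example, not figures from the paper.

```python
# Back-of-the-envelope check of the quoted mapping throughput.
# Assumption: the rate is constant and a single CPU is used.

READS_PER_CPU_HOUR = 2_200_000  # "nearly 2.2 million reads per CPU hour" (from the abstract)


def cpu_hours(total_reads: int, reads_per_cpu_hour: int = READS_PER_CPU_HOUR) -> float:
    """Estimated single-CPU hours needed to map `total_reads` reads."""
    if total_reads < 0:
        raise ValueError("total_reads must be non-negative")
    return total_reads / reads_per_cpu_hour


def reads_in_budget(hours: float, reads_per_cpu_hour: int = READS_PER_CPU_HOUR) -> int:
    """Number of reads that can be mapped within `hours` CPU hours."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return int(hours * reads_per_cpu_hour)


if __name__ == "__main__":
    # Hypothetical experiment size, used only to illustrate the calculation.
    example_reads = 40_000_000
    print(f"{example_reads:,} reads need about {cpu_hours(example_reads):.1f} CPU hours")
    print(f"One day of CPU time covers about {reads_in_budget(24):,} reads")
```

Under these assumptions, one CPU-day corresponds to roughly 52.8 million reads, which is consistent with the claim that an experiment producing millions of reads fits within a day on a desktop machine.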