IsoLasso: a LASSO regression approach to RNA-Seq based transcriptome assembly

Authors:
Wei Li;Jianxing Feng;Tao Jiang
Affiliations:
Department of Computer Science and Engineering, University of California, Riverside, CA;College of Life Science and Biotechnology, Tongji University, Shanghai, China;Department of Computer Science and Engineering, University of California, Riverside, CA and School of Information Science and Technology, Tsinghua University, Beijing, China
Venue:
RECOMB'11 Proceedings of the 15th Annual international conference on Research in computational molecular biology
Year:
2011

Abstract

Second-generation sequencing technology has revolutionized many biology-related research fields and poses various computational biology challenges. One of them is transcriptome assembly based on RNA-Seq data, which aims at reconstructing all full-length mRNA transcripts simultaneously from millions of short reads. In this paper, we consider three objectives in transcriptome assembly: the maximization of prediction accuracy, the minimization of interpretation, and the maximization of completeness. The first objective, the maximization of prediction accuracy, requires that the estimated expression levels based on the assembled transcripts be as close as possible to the observed ones for every expressed region of the genome. The second objective, the minimization of interpretation, follows the parsimony principle and seeks as few transcripts in the prediction as possible. The third objective, the maximization of completeness, requires that the maximum number of mapped reads (or "expressed segments" in gene models) be explained by (i.e., contained in) the predicted transcripts in the solution. Based on these three objectives, we present IsoLasso, a new RNA-Seq based transcriptome assembly tool. IsoLasso is based on the well-known LASSO algorithm, a multivariate regression method designed to seek a balance between the maximization of prediction accuracy and the minimization of interpretation. By including some additional constraints in the quadratic program involved in LASSO, IsoLasso is able to make the set of assembled transcripts as complete as possible. Experiments on simulated and real RNA-Seq datasets show that IsoLasso simultaneously achieves higher sensitivity and precision than state-of-the-art transcript assembly tools.
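
To make the trade-off between prediction accuracy and parsimony concrete, here is a minimal Python sketch of a LASSO-style fit with nonnegative expression levels. It is an illustration under stated assumptions, not the IsoLasso implementation: the toy segment-by-isoform matrix `A`, the observed segment expression vector `y`, the penalty weight `lam`, and the function names `nonneg_lasso` and `uncovered_segments` are all hypothetical. The completeness objective is only checked after the fit here, whereas the abstract describes it as additional constraints inside the quadratic program.

```python
def nonneg_lasso(A, y, lam, n_iter=500):
    """Coordinate descent for: min_x 0.5*||y - A x||^2 + lam*sum(x), x >= 0.

    A[i][j] is 1 if hypothetical candidate isoform j contains segment i.
    """
    n_rows, n_cols = len(A), len(A[0])
    x = [0.0] * n_cols
    residual = [float(v) for v in y]  # equals y - A x while x is all zeros
    col_sq = [sum(A[i][j] ** 2 for i in range(n_rows)) for j in range(n_cols)]
    for _ in range(n_iter):
        for j in range(n_cols):
            if col_sq[j] == 0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = sum(A[i][j] * (residual[i] + A[i][j] * x[j])
                      for i in range(n_rows))
            # The L1 penalty shifts the update by lam; clip at zero because
            # expression levels cannot be negative.
            new_xj = max(0.0, (rho - lam) / col_sq[j])
            delta = new_xj - x[j]
            if delta != 0.0:
                for i in range(n_rows):
                    residual[i] -= A[i][j] * delta
                x[j] = new_xj
    return x


def uncovered_segments(A, y, x, tol=1e-9):
    """Expressed segments that no selected isoform contains (post hoc check)."""
    return [i for i in range(len(A))
            if y[i] > 0 and not any(A[i][j] and x[j] > tol
                                    for j in range(len(x)))]


# Toy data (assumed): 3 segments (rows) and 3 candidate isoforms (columns).
A = [[1, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]
y = [3.0, 2.0, 3.0]  # generated from isoform expression levels 2, 1, 0

x = nonneg_lasso(A, y, lam=0.01)
print("estimated expression:", [round(v, 3) for v in x])
print("uncovered segments:", uncovered_segments(A, y, x))
```

A larger `lam` favors fewer isoforms at the cost of a worse fit; in this toy setup a sufficiently large value drives every estimate to zero, which leaves all expressed segments unexplained and shows why a completeness requirement is needed alongside the penalty.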