Distribution associated with stochastic processes of gene expression in a single eukaryotic cell

  • Authors: Vladimir A. Kuznetsov

  • Affiliations: Laboratory of Integrative and Medical Biophysics, NICHD, NIH, Bethesda, MD

  • Venue: EURASIP Journal on Applied Signal Processing
  • Year: 2001

Abstract

The ability to simultaneously measure mRNA abundance for a large number of genes has revolutionized biological research by allowing statistical analysis of global gene-expression data. Large-scale gene-expression data sets have been analyzed in order to identify the probability distributions of gene-expression levels (or transcript copy numbers) in eukaryotic cells. Determining such functions may provide a theoretical basis for accurately counting all expressed genes in a given cell and for understanding gene-expression control. Using gene-expression libraries derived from yeast cells and from different human cell tissues, we found that all observed gene-expression level data appear to follow a Pareto-like skewed frequency distribution whose parameters depend on the size of the library. We derived a skewed probability function, called the binomial differential distribution, that accounts for the many rarely transcribed genes in a single cell. We also developed a novel method for estimating and removing major experimental errors and redundancies from Serial Analysis of Gene Expression (SAGE) data sets. We successfully applied this method to the yeast transcriptome. A "basal" random transcription mechanism for all protein-coding genes in every eukaryotic cell type is predicted.
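
To make the phrase "Pareto-like skewed frequency distribution" concrete, the following minimal sketch tabulates how many distinct tags occur at each expression level and estimates the slope of that relationship on a log-log scale. It is an illustration under stated assumptions, not the paper's method: the abstract does not give the form of the binomial differential distribution, so a plain power-law fit is swapped in here. The function names and the toy library are hypothetical, and the toy counts are synthetic, not data from the paper.

```python
import math
from collections import Counter


def frequency_of_levels(tag_counts):
    """Map each expression level m to the number of distinct tags seen m times."""
    return Counter(tag_counts.values())


def loglog_slope(freq):
    """Least-squares slope of log(n(m)) against log(m).

    For a pure power law n(m) ~ m**(-k), the slope is -k.
    """
    xs = [math.log(m) for m in freq]
    ys = [math.log(n) for n in freq.values()]
    if len(xs) < 2:
        raise ValueError("need at least two distinct expression levels")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical toy library: synthetic counts built so that n(m) = 64 / m**2.
toy_counts = {}
for level, n_tags in [(1, 64), (2, 16), (4, 4), (8, 1)]:
    for i in range(n_tags):
        toy_counts[f"tag_{level}_{i}"] = level

freq = frequency_of_levels(toy_counts)
slope = loglog_slope(freq)
print(dict(freq))
print(round(slope, 6))
```

In this toy library most tags are seen only once and very few are seen many times, which is the skew the abstract describes. On real SAGE libraries a simple log-log regression is only a rough diagnostic, since the abstract reports that the fitted parameters change with library size.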