Tests for Gene Clusters Satisfying the Generalized Adjacency Criterion

  • Authors:
  • Ximing Xu; David Sankoff

  • Affiliations:
  • Department of Mathematics and Statistics, University of Ottawa, Ottawa, Canada K1N 6N5; Department of Mathematics and Statistics, University of Ottawa, Ottawa, Canada K1N 6N5

  • Venue:
  • BSB '08 Proceedings of the 3rd Brazilian symposium on Bioinformatics: Advances in Bioinformatics and Computational Biology
  • Year:
  • 2008

Abstract

We study a parametrized definition of gene clusters that permits control over the trade-off between increasing gene content and conserving gene order within a cluster. This is based on the notion of generalized adjacency, which is the property shared by any two genes no farther apart, in the linear order of a chromosome, than a fixed threshold parameter θ. A cluster in two or more genomes is then a maximal set of markers such that, in each genome, these markers form a connected chain of generalized adjacencies. Since even pairs of randomly constructed genomes may have many generalized adjacency clusters in common, we study the statistical properties of generalized adjacency clusters under the null hypothesis that the markers are ordered completely randomly on the genomes. We derive expressions for the exact values of the expected number of clusters of a given size, for large and small values of the parameter. We discover through simulations that the trend from small to large clusters as a function of the parameter θ exhibits a "cut-off" phenomenon at or near $\sqrt{\theta}$ as genome size increases.
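
The cluster definition in the abstract can be made concrete with a short sketch. It rests on one reading of that definition, which is an assumption and not a statement of the authors' implementation: two markers are joined by an edge when their positions differ by at most θ in every genome, and clusters are the connected components of the resulting graph. Singleton components are reported as well. The function names `ga_clusters` and `random_cluster_sizes` are hypothetical, and the second function only illustrates the null hypothesis of completely random marker order.

```python
import random


def ga_clusters(genomes, theta):
    """Return generalized adjacency clusters shared by all genomes.

    Assumed reading of the definition: markers a and b are linked when
    |pos(a) - pos(b)| <= theta in every genome; clusters are the connected
    components of that graph. Each genome is a list of the same markers.
    """
    first = genomes[0]
    n = len(first)
    # Position of every marker in each of the other genomes.
    positions = [{g: i for i, g in enumerate(genome)} for genome in genomes[1:]]
    parent = {g: g for g in first}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Only pairs within theta in the first genome can be edges at all.
    for i in range(n):
        for j in range(i + 1, min(i + theta, n - 1) + 1):
            a, b = first[i], first[j]
            if all(abs(p[a] - p[b]) <= theta for p in positions):
                parent[find(a)] = find(b)

    components = {}
    for g in first:
        components.setdefault(find(g), []).append(g)
    return sorted((sorted(c) for c in components.values()), key=lambda c: c[0])


def random_cluster_sizes(n, theta, rng):
    """Cluster sizes for one identity genome and one random permutation."""
    reference = list(range(n))
    shuffled = reference[:]
    rng.shuffle(shuffled)
    return sorted(len(c) for c in ga_clusters([reference, shuffled], theta))


S = [1, 2, 3, 4, 5, 6]
T = [2, 1, 3, 6, 5, 4]
print(ga_clusters([S, T], 1))  # [[1, 2], [3], [4, 5, 6]]
print(ga_clusters([S, T], 2))  # [[1, 2, 3, 4, 5, 6]]
print(random_cluster_sizes(50, 3, random.Random(0)))
```

In the small example, raising θ from 1 to 2 merges the three components into a single cluster, which illustrates the trade-off the parameter controls: larger θ admits more gene content at the cost of looser order conservation. Repeating `random_cluster_sizes` over many permutations would give an empirical size distribution under the null hypothesis.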