Inferring orthologous and paralogous genes is an important problem in whole-genome comparisons, for both functional and evolutionary studies. In this paper, we introduce a new approach for inferring candidate pairs of orthologous genes between genomes, also called positional homologs, based on the conservation of the genomic context. We represent genomes by their gene order (i.e., as sequences of signed integers) and use common intervals of these sequences as the anchors of the final gene matching. We show that the natural combinatorial problem of computing a maximal cover of the two genomes using the minimum number of common intervals is NP-complete, and we give a simple heuristic for this problem. We illustrate the effectiveness of this first approach, based on common intervals of sequences, on two datasets: eight γ-proteobacterial genomes, and the human and mouse whole genomes.
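To make the notion of a common interval of two sequences concrete, here is a minimal brute-force sketch in Python. It assumes the usual definition: a set of gene families is a common interval if each genome has a contiguous segment whose gene content, ignoring signs, is exactly that set. The function names (`interval_sets`, `common_intervals`) and the `min_size` parameter are hypothetical and chosen only for illustration. This is not the algorithm or the cover heuristic of the paper; it only enumerates candidate anchors on small inputs.

```python
def interval_sets(genome):
    """Map each gene-content set to the (start, end) segments that realize it.

    `genome` is a list of signed integers; the sign (strand) is ignored,
    so only the gene family abs(g) contributes to the content.
    """
    found = {}
    for i in range(len(genome)):
        content = set()
        for j in range(i, len(genome)):
            content.add(abs(genome[j]))
            # Record that genome[i..j] (inclusive) has this gene content.
            found.setdefault(frozenset(content), []).append((i, j))
    return found


def common_intervals(g1, g2, min_size=2):
    """Return gene-content sets that occur as a segment in both genomes.

    The result maps each common set to a pair of position lists,
    one for g1 and one for g2. Sets smaller than `min_size` are skipped.
    """
    s1 = interval_sets(g1)
    s2 = interval_sets(g2)
    return {
        content: (s1[content], s2[content])
        for content in s1.keys() & s2.keys()
        if len(content) >= min_size
    }


# Toy genomes: gene family 2 is duplicated in g1, family 5 is absent from g1.
g1 = [1, 2, -3, 4, 2]
g2 = [3, -2, 1, 5, 4]

for content, (pos1, pos2) in common_intervals(g1, g2).items():
    print(sorted(content), pos1, pos2)
```

On these toy genomes the common intervals of size at least two are {1, 2}, {2, 3} and {1, 2, 3}. The enumeration visits a quadratic number of segments per genome, so it is only suitable for illustration, not for whole-genome data.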