Gapped permutation patterns for comparative genomics

Authors:
Laxmi Parida
Affiliations:
Computational Biology Center, IBM T. J. Watson Research Center, Yorktown Heights, New York
Venue:
WABI'06 Proceedings of the 6th international conference on Algorithms in Bioinformatics
Year:
2006

Citing 7

Data Structures and Algorithms
Data Structures and Algorithms

The Algorithmic of Gene Teams
WABI '02 Proceedings of the Second International Workshop on Algorithms in Bioinformatics

Finding All Common Intervals of k Permutations
CPM '01 Proceedings of the 12th Annual Symposium on Combinatorial Pattern Matching

Identifying conserved gene clusters in the presence of orthologous groups
RECOMB '04 Proceedings of the eighth annual international conference on Research in computational molecular biology

Testing for the consecutive ones property, interval graphs, and graph planarity using PQ-tree algorithms
Journal of Computer and System Sciences

Computing common intervals of K permutations, with applications to modular decomposition of graphs
ESA'05 Proceedings of the 13th annual European conference on Algorithms

Using PQ trees for comparative genomics
CPM'05 Proceedings of the 16th annual conference on Combinatorial Pattern Matching

Cited 2

On table arrangements, scrabble freaks, and jumbled pattern matching
FUN'10 Proceedings of the 5th international conference on Fun with algorithms

Near linear time construction of an approximate index for all maximum consecutive sub-sums of a sequence
CPM'12 Proceedings of the 23rd Annual conference on Combinatorial Pattern Matching

Abstract

The list of species whose complete DNA sequences have been read is growing steadily, and it is believed that comparative genomics is in its early days [12]. Permutation patterns (groups of genes in some “close” proximity) on gene sequences of genomes across species are being studied under different models to cope with this explosion of data. The challenge is to analyze the genomes intelligently and efficiently in the context of other genomes. In this paper we present a generalized model that uses three notions: gapped permutation patterns (with gap $g$); genome clusters, via a quorum parameter $K > 1$; and possible multiplicity in the patterns. The task is to automatically discover all permutation patterns (with possible multiplicity) that occur with gap $g$ in at least $K$ of the given $m$ genomes. We present an $\mathcal{O}(\log m N_I +|\Sigma|\log |\Sigma|N_O)$ time algorithm, where $m$ is the number of sequences, each defined on $\Sigma$, $N_I$ is the size of the input, and $N_O$ is the size of the maximal gene clusters that appear in at least $K$ of the $m$ genomes.
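
To make the quorum and gap parameters concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm of the paper: it only checks whether one given pattern is supported, rather than discovering all patterns, and it ignores multiplicity by treating a pattern as a set of genes. The occurrence rule is an assumption made for illustration: a pattern occurs in a genome with gap g if all of its genes appear in one run in which consecutive pattern genes are separated by at most g other genes. The function names `occurs_with_gap` and `supported` are hypothetical.

```python
def occurs_with_gap(genome, pattern, g):
    """Assumed occurrence rule (not taken from the paper): every gene of
    `pattern` appears in a single run of pattern genes in which consecutive
    pattern genes are separated by at most `g` non-pattern genes."""
    pattern = set(pattern)
    seen = set()   # pattern genes collected in the current run
    last = None    # position of the previous pattern gene
    for pos, gene in enumerate(genome):
        if gene not in pattern:
            continue
        if last is not None and pos - last - 1 > g:
            seen = set()  # gap too large: start a new run here
        seen.add(gene)
        last = pos
        if seen == pattern:
            return True
    return False


def supported(genomes, pattern, g, K):
    """True if `pattern` occurs with gap `g` in at least `K` of the genomes."""
    count = sum(occurs_with_gap(genome, pattern, g) for genome in genomes)
    return count >= K


# Three toy genomes (m = 3); 'x' stands for an unrelated gene.
genomes = [list("abxcd"), list("cxab"), list("axxbc")]
print(supported(genomes, {"a", "b", "c"}, g=1, K=2))
print(supported(genomes, {"a", "b", "c"}, g=1, K=3))
```

In this toy input the third genome separates `a` from `b` by two unrelated genes, so the pattern meets a quorum of K = 2 at g = 1 but needs g = 2 to meet K = 3.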