Permutation Editing and Matching via Embeddings

Authors:
Graham Cormode; S. Muthukrishnan; Süleyman Cenk Sahinalp
Venue:
ICALP '01: Proceedings of the 28th International Colloquium on Automata, Languages and Programming
Year:
2001

Abstract

If the genetic maps of two species are modelled as permutations of (homologous) genes, the number of chromosomal rearrangements (deletions, block moves, inversions, etc.) needed to transform one such permutation into another can be used as a measure of their evolutionary distance. Motivated by such scenarios, we study the problems of computing distances between permutations, matching permutations in sequences, and finding the most similar permutation in a collection ("nearest neighbor"). We adopt a general approach: embed the relevant permutation distances into well-known vector spaces in an approximately distance-preserving manner, and solve the resulting problems in those spaces. Our results are as follows:

- We present the first known approximately distance-preserving embeddings of these permutation distances into well-known spaces.
- Using these embeddings, we obtain several results, including the first known efficient solution for approximately solving nearest neighbor problems with permutations and the first known algorithms for finding permutation distances in the "data stream" model.
- We consider a novel class of problems called permutation matching problems, which are similar to string matching problems except that the pattern is a permutation (rather than a string), and present linear or near-linear time algorithms for approximately solving permutation matching problems; in contrast, the corresponding string problems take significantly longer.
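
The abstract does not spell out the embeddings themselves. As a rough illustration of the general idea only, the sketch below uses a standard adjacency (breakpoint) construction, which is an assumption here and not taken from the abstract: a permutation of 1..n is mapped to the set of unordered adjacent pairs it contains, which can be read as a 0/1 vector indexed by pairs. The Hamming distance between two such vectors then relates to the inversion (reversal) distance, because a single reversal changes at most two adjacencies. The function names `adjacency_set` and `adjacency_hamming` are hypothetical.

```python
def adjacency_set(perm):
    """Map a permutation of 1..n to its set of unordered adjacent pairs.

    Sentinels 0 and n+1 are added at the ends (an illustrative convention),
    so every permutation of length n yields exactly n+1 pairs.
    """
    n = len(perm)
    extended = [0, *perm, n + 1]
    return {frozenset(pair) for pair in zip(extended, extended[1:])}


def adjacency_hamming(p, q):
    """Hamming distance between the 0/1 adjacency vectors of p and q.

    Computed as the size of the symmetric difference of the pair sets.
    """
    return len(adjacency_set(p) ^ adjacency_set(q))


if __name__ == "__main__":
    identity = [1, 2, 3, 4, 5]
    reversed_block = [1, 4, 3, 2, 5]  # identity with the block 2..4 reversed
    print(adjacency_hamming(identity, identity))
    print(adjacency_hamming(identity, reversed_block))
```

Because one reversal breaks at most two adjacencies and creates at most two, the Hamming distance computed this way is at most four times the number of reversals separating the two permutations. This gives one direction of an approximately distance-preserving bound; whether this matches the constructions used in the paper is not stated in the abstract.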