New algorithms for binary jumbled pattern matching

Authors:
Emanuele Giaquinta;Szymon Grabowski
Affiliations:
Department of Computer Science, University of Helsinki, Finland;Institute of Applied Computer Science, Lodz University of Technology, Al. Politechniki 11, 90-924 Łódź, Poland
Venue:
Information Processing Letters
Year:
2013

Citing 8
Cited 1

Data parallel algorithms
Communications of the ACM - Special issue on parallelism

Sorting in linear time?
STOC '95 Proceedings of the twenty-seventh annual ACM symposium on Theory of computing

Dynamic ordered sets with exponential search trees
Journal of the ACM (JACM)

Scaled and permuted string matching
Information Processing Letters

Indexing permutations for binary strings
Information Processing Letters

On table arrangements, scrabble freaks, and jumbled pattern matching
FUN'10 Proceedings of the 5th international conference on Fun with algorithms

Sub-quadratic time and linear space data structures for permutation matching in binary strings
Journal of Discrete Algorithms

Near linear time construction of an approximate index for all maximum consecutive sub-sums of a sequence
CPM'12 Proceedings of the 23rd Annual conference on Combinatorial Pattern Matching

Algorithms for computing Abelian periods of words
Discrete Applied Mathematics

Abstract

Given a pattern P and a text T, both strings over a binary alphabet, the binary jumbled string matching problem asks whether any permutation of P occurs in T. The indexed version of this problem, i.e., preprocessing a string to efficiently answer such permutation queries, is hard and has been studied in the last few years. Currently the best bounds for constructing the index are O(n^2 / log^2 n) (with O(n) space and O(1) query time) (Moosa and Rahman (2012) [1]) and O(r^2 log r) (with O(|L|) space and O(log |L|) query time) (Badkobeh et al. (2012) [2]), where n is the length of T, r is the length of the run-length encoding of T, and |L| = O(n) is the size of the index. In this paper we present new results for this problem. Our first result is an alternative construction of the index by Badkobeh et al. (2012) [2] that obtains a trade-off between the space and the time complexity. It takes O(r^2 log k + n/k) time to build the index, has O(log k) query time, and uses O(n/k + |L|) space, where k is a parameter. The second result is an O(n^2 log^2 w / w) algorithm (with O(n) space and O(1) query time), based on word-level parallelism, where w is the word size in bits.
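
To make the indexed problem concrete, the following is a minimal sketch of a naive O(n^2)-time baseline, not either of the algorithms presented in the paper. It relies on a standard property of binary strings that the abstract does not state: for a fixed window length m, the numbers of 1s over all length-m substrings of T form a contiguous range, so storing the minimum and maximum for each m (O(n) space) is enough to answer a query in O(1) time. The function names `build_index` and `occurs_jumbled` are hypothetical and chosen only for illustration.

```python
def build_index(text):
    """Naive O(n^2) construction (illustrative baseline only).

    For every window length m, record the minimum and maximum number
    of 1s found in any length-m substring of the binary string `text`.
    """
    n = len(text)
    # prefix[i] = number of 1s in text[:i]
    prefix = [0] * (n + 1)
    for i, ch in enumerate(text):
        prefix[i + 1] = prefix[i] + (ch == "1")

    lo = [0] * (n + 1)
    hi = [0] * (n + 1)
    for m in range(1, n + 1):
        counts = [prefix[i + m] - prefix[i] for i in range(n - m + 1)]
        lo[m], hi[m] = min(counts), max(counts)
    return lo, hi


def occurs_jumbled(index, pattern):
    """O(1) table lookup (after counting the 1s in `pattern`).

    Returns True if some permutation of `pattern` occurs in the indexed text.
    """
    lo, hi = index
    m = len(pattern)
    if m == 0:
        return True  # the empty pattern trivially occurs
    if m >= len(lo):
        return False  # pattern longer than the text
    ones = pattern.count("1")
    # Valid because the achievable counts of 1s form a contiguous range.
    return lo[m] <= ones <= hi[m]


index = build_index("0110100")
print(occurs_jumbled(index, "110"))  # a window with two 1s exists
print(occurs_jumbled(index, "111"))  # no length-3 window has three 1s
```

The results summarized in the abstract concern building such a table faster than this quadratic baseline, or trading index space against query time.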