The space complexity of approximating the frequency moments
STOC '96 Proceedings of the twenty-eighth annual ACM symposium on Theory of computing
Approximate string matching: a simpler faster algorithm
Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms
Faster algorithms for string matching with k mismatches
SODA '00 Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms
On the Complexity of Determining the Period of a String
CPM '00 Proceedings of the 11th Annual Symposium on Combinatorial Pattern Matching
Finding Repeated Elements
Finding frequent items in data streams
Theoretical Computer Science - Special issue on automata, languages and programming
Efficient randomized pattern-matching algorithms
IBM Journal of Research and Development - Mathematics and computing
Optimal approximations of the frequency moments of data streams
Proceedings of the thirty-seventh annual ACM symposium on Theory of computing
Space efficient mining of multigraph streams
Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Simpler algorithm for estimating frequency moments of data streams
SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithms
Stable distributions, pseudorandom generators, embeddings, and data stream computation
Journal of the ACM (JACM)
STAGGER: Periodicity Mining of Data Streams Using Expanding Sliding Windows
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Space-optimal heavy hitters with strong error bounds
Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
The Data Stream Space Complexity of Cascaded Norms
FOCS '09 Proceedings of the 2009 50th Annual IEEE Symposium on Foundations of Computer Science
Exact and Approximate Pattern Matching in the Streaming Model
FOCS '09 Proceedings of the 2009 50th Annual IEEE Symposium on Foundations of Computer Science
An optimal algorithm for the distinct elements problem
Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
1-pass relative-error Lp-sampling with applications
SODA '10 Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms
APPROX'05/RANDOM'05 Proceedings of the 8th international workshop on Approximation, Randomization and Combinatorial Optimization Problems, and Proceedings of the 9th international conference on Randomization and Computation: algorithms and techniques
Practical algorithms for tracking database join sizes
FSTTCS '05 Proceedings of the 25th international conference on Foundations of Software Technology and Theoretical Computer Science
Improved sketching of Hamming distance with error correcting
CPM'07 Proceedings of the 18th annual conference on Combinatorial Pattern Matching
Periodicity and cyclic shifts via linear sketches
APPROX'11/RANDOM'11 Proceedings of the 14th international workshop and 15th international conference on Approximation, randomization, and combinatorial optimization: algorithms and techniques
Pattern matching in multiple streams
CPM'12 Proceedings of the 23rd Annual conference on Combinatorial Pattern Matching
Simple real-time constant-space string matching
Theoretical Computer Science
In this work we study sublinear-space algorithms for detecting periodicity over data streams. A sequence of length $n$ is said to be periodic if it consists of repetitions of a block of length $p$ for some $p \le n/2$. In the first part of this paper, we give a 1-pass randomized streaming algorithm that uses $O(\log^2 n)$ space and reports the shortest period if the given stream is periodic. At the heart of this result is a 1-pass, $O(\log n \log m)$-space streaming pattern matching algorithm. This algorithm uses ideas similar to those of Porat and Porat's algorithm (FOCS 2009), but it is simpler and does not need an offline pre-processing stage. In the second part, we study distance to $p$-periodicity under the Hamming metric, where we estimate the minimum number of character substitutions needed to make a given sequence $p$-periodic. In streaming terminology, this problem can be described as computing the cascaded aggregate $L_1 \times F_1^{\mathrm{res}(1)}$ over a matrix $A_{p \times \lfloor n/p \rfloor}$ given in column ordering. For this problem, we present a randomized streaming algorithm with approximation factor $2 + \varepsilon$ that takes $\tilde{O}(1/\varepsilon^2)$ space. We also show a $(1 + \varepsilon)$-approximation randomized streaming algorithm that uses $\tilde{O}(\varepsilon^{-5.5} p^{1/2})$ space.
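The two quantities defined in the abstract can be made concrete with a small offline reference in Python. This is only a sketch of the definitions, not the streaming algorithms of the paper: it stores the whole sequence and runs in linear space. The function names `shortest_period` and `distance_to_p_periodicity` are hypothetical, and the code assumes that the final repetition of the block may be truncated, that is, a sequence `s` has period `p` when `s[i] == s[i + p]` for every valid `i`.

```python
from collections import Counter


def shortest_period(s):
    """Return the smallest p <= len(s) // 2 such that s has period p, else None.

    Assumption: the last copy of the block may be cut short, so the test is
    s[i] == s[i + p] for all 0 <= i < len(s) - p.
    """
    n = len(s)
    for p in range(1, n // 2 + 1):
        if all(s[i] == s[i + p] for i in range(n - p)):
            return p
    return None


def distance_to_p_periodicity(s, p):
    """Minimum number of substitutions that make s p-periodic (Hamming metric).

    Positions with the same index modulo p must end up equal, so each residue
    class is handled independently: keep its most frequent symbol and change
    the rest. Summing over classes gives the total distance.
    """
    total = 0
    for r in range(p):
        row = s[r::p]  # symbols at positions r, r + p, r + 2p, ...
        if row:
            total += len(row) - max(Counter(row).values())
    return total


if __name__ == "__main__":
    print(shortest_period("abcabcabc"))               # 3
    print(shortest_period("abcd"))                    # None
    print(distance_to_p_periodicity("abcabdabc", 3))  # 1
```

Each residue class `s[r::p]` plays the role of one row of the matrix $A$: the number of symbols left after removing the most frequent one is the per-row residual term, and the sum over rows is the cascaded aggregate that the streaming algorithms approximate in small space.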