Longest Common Consecutive Substring in Two Random Strings

Authors:
Sarmad Abbasi
Affiliations:
-
Venue:
Longest Common Consecutive Substring in Two Random Strings
Year:
1997

Citing 0
Cited 1

Fast and Sensitive Probe Selection for DNA Chips Using Jumps in Matching Statistics

CSB '03 Proceedings of the IEEE Computer Society Conference on Bioinformatics


Abstract

Let $\Sigma$ be a finite alphabet with $C$ letters. For any two strings $x$ and $y$ of length $n$, let $S(x,y)$ denote the length of the longest common consecutive substring of $x$ and $y$; that is, $S(x,y)$ is the largest $k$ such that $$ x_i \cdots x_{i+k-1} = y_j \cdots y_{j+k-1}$$ for some $i$ and $j$. We show that for $x$ and $y$ chosen uniformly at random from all strings of length $n$ over $\Sigma$, $S(x,y)$ is highly concentrated around $2 \log_C n$. More precisely, for any $a \geq 1$ $$ \Pr [ |S(x,y) - 2 \log_C n |
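
To make the definition of $S(x,y)$ concrete, the following is a minimal sketch that computes it with a standard quadratic-time dynamic program and samples it on uniformly random strings for comparison with $2 \log_C n$. The function names (`longest_common_substring_length`, `random_string`, `simulate`), the fixed seed, and the independent sampling of the two strings are assumptions made for illustration. They are not taken from the paper, and the code says nothing about how the concentration result is proved.

```python
import math
import random


def longest_common_substring_length(x, y):
    """Return S(x, y): the length of the longest contiguous block shared by x and y."""
    best = 0
    # prev[j] holds the length of the longest common suffix of x[:i-1] and y[:j].
    prev = [0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        curr = [0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                # Extend the common block that ended one position earlier in both strings.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def random_string(n, alphabet, rng):
    """Draw a string of length n uniformly at random over the given alphabet."""
    return "".join(rng.choice(alphabet) for _ in range(n))


def simulate(n, alphabet, trials, seed=0):
    """Sample S(x, y) for independent uniform x, y (an assumption of this sketch).

    Returns the sampled values and the reference value 2 * log_C(n),
    where C is the alphabet size.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        x = random_string(n, alphabet, rng)
        y = random_string(n, alphabet, rng)
        values.append(longest_common_substring_length(x, y))
    return values, 2 * math.log(n, len(alphabet))
```

For example, `simulate(1000, "ACGT", 20)` returns twenty sampled values of $S(x,y)$ for $C = 4$ and $n = 1000$, together with the reference value $2 \log_4 1000$, so the spread of the samples around that value can be inspected directly.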