1
$\begingroup$

Let $x=x_1x_2\dotsm x_n$ be a binary string. A $k$-substring of $x$ is a period of continuous bits, i.e. $x_ix_{i+1}\dotsm x_{i+k-1}$. I want to count the number of all the binary strings whose any two arbitrary over-lapping $k$-substrings is never Reverse-Complement 【Reverse-Complement means reverse one of substring and add to the other (the bit-sum modulo 2), a $k$-length string with all bits equal to 1 is obtained】.

Is there a theory about counting the number of such binary strings?

$\endgroup$
15
  • $\begingroup$ Is the sum modulo $2$? i.e., you want an odd number of $1$'s in the $k-$substrings? $\endgroup$ Commented Apr 14, 2023 at 11:46
  • 1
    $\begingroup$ @PeterTaylor So the condition is that $w_1\ne\bar w_2$ for any overlapping $k$-substrings $w_1$, $w_2$, right? $\endgroup$ Commented Apr 14, 2023 at 12:56
  • 1
    $\begingroup$ Is $k$ given and fixed? $\endgroup$ Commented Apr 14, 2023 at 13:04
  • 2
    $\begingroup$ If $k$ is fixed as $2\ell-1$ or if $k$ is fixed as $2\ell$ with $\ell\in \{2,3,\dots\}$, it amounts to count binary strings that avoid $u\overline{u}$ as a substring for any binary string $u$ of length $\ell$, where $\overline{u}$ is the reverse complement of $u$ defined in @PeterTaylor 's comment. $\endgroup$ Commented Apr 14, 2023 at 13:32
  • 2
    $\begingroup$ Then the problem amounts to constructing the De Bruijn graph for given $k$, removing edges corresponding to the forbidden substrings, and then counting walks in this graph via the transfer-matrix method. $\endgroup$ Commented Apr 14, 2023 at 15:13

1 Answer 1

2
$\begingroup$

The even length substrings which equal their own reverse complement are precisely the even length palindromes when you reverse every other letter in $x$. Therefore you are looking for the number of strings of length $n$ that do not contain any palindromes of length $2\ell := 2\lceil k / 2 \rceil$. Let's call this number $f_\ell(n)$.

Counting the number of words of length $n$ avoiding length $k$ palindromes sounds like a very natural problem in combinatorics on words, but I couldn't find anything after a quick google search, so I'm not sure there exists a nice exact formula. However you can find some rough bounds on its behavior for variable $\ell$ and $n$ using counting and probabilistic arguments.

For example, there is an upper bound of $f_\ell(n) \leq (2^{2\ell} - 2^\ell)^{\lceil n / 2\ell\rceil} \approx e^{-(2^{-\ell-1}/\ell) \cdot n}\cdot 2^n$, because there are $2^{2\ell} - 2^\ell$ possibilities for each segment of length $2\ell$.

For a lower bound, note that cyclic strings of length $2^{\ell-1}$ are expected to have half a palindrome. Thus there are at least $$\frac{2^{2^{\ell-1}}}{2\cdot2^{4\ell}}$$ different strings of length $2^{\ell-1}$ same length $2\ell$ prefix and suffix, which we can concatenate arbitrarily without introducing palindromes. This gives a lower bound of $$f_\ell(n) \geq \left(\frac{2^{2^{\ell-1}}}{2\cdot2^{4\ell}}\right)^{\lfloor n / 2^{\ell-1}\rfloor } \approx 2^{-(4\ell+1)2^{-\ell+1}\cdot n}\cdot 2^n.$$ Therefore $f_\ell(n) = 2^{n(1-2^{-\ell + o(\ell)})+O(2^\ell)}$.

$\endgroup$
1
  • $\begingroup$ $f_2(n)=2a(n+2)$ for $n\ge 3$ where $a(n)$ is OEIS's A000930 Narayana's cows sequence. $\endgroup$ Commented Apr 15, 2023 at 10:01

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.