AENC/resampling_chain Changeset - b27a94c0f7d3 · Centrum Wiskunde & Informatica (CWI)

@@ -244,198 +244,30 @@

    We observe that for the initial $\rho$ we have that $\text{rank}(\rho)=n$, therefore $\text{rank}(\rho*(M_{(n)}^k))\geq n+k$, and so $\rho*(M_{(n)}^k)*\mathbbm{1}$ is obviously divisible by $p^{k}$. This implies that $a_k^{(n)}$ can be calculated by only looking at $\rho*(M_{(n)}^1)*\mathbbm{1}, \ldots, \rho*(M_{(n)}^k)*\mathbbm{1}$.

\newpage

\section{Quasiprobability method}

Let us first introduce notation for paths of the Markov Chain

\section{Proving the strong cancellation claim}

It is useful to introduce some new notation. We will consider variations of the Markov Chains:

\begin{itemize}

    \item $\P^{(n)}$ refers to the original process on the length-$n$ cycle.

    \item $\P^{[a,b]}$ or $\P^{[n]}$ refers to a similar Markov Chain but on a finite chain ($[a,b]$ or $[1,n]$).

\end{itemize}

The process on the finite chain has the following modification at the boundary: if a boundary site is resampled, it can only resample its single neighbour so it draws only two new bits.

We use the notation $\E^{(n)}$,$\E^{[a,b]}$ and $\E^{[n]}$ similarly for denoting expectations.

\begin{definition}[Paths]

    We define a \emph{path} of the Markov Chain as a sequence of states and resampling choices $\xi=((b_0,r_0),(b_1,r_1),...,(b_k,r_k)) \in (\{0,1\}^n\times[n])^k$ indicating that at time $t$ Markov Chain was in state $b_t\in\{0,1\}^n$ and then resampled site $r_t$. We denote by $|\xi|$ the length $k$ of such a path, i.e. the number of resamples that happened, and by $\mathbb{P}[\xi]$ the probability associated to this path.

    We denote by $\paths{b}$ the set of all valid paths $\xi$ that start in state $b$ and end in state $\mathbf{1} := 1^n$.

	We define a \emph{path} of the Markov Chain as a sequence of states and resampling choices $\xi=((b_0,r_0),(b_1,r_1),...,(b_k,r_k)) \in (\{0,1\}^n\times[n])^k$ indicating that at time $t$ Markov Chain was in state $b_t\in\{0,1\}^n$ and then resampled site $r_t$. We denote by $\mathbb{P}[\xi]$ the probability that the process followed this path.

	We denote by $\paths{b}$ the set of all valid paths $\xi$ that start in state $b$ and end in state $\mathbf{1} := 1^n$.

\end{definition}

We can write the expected number of resamplings per site $R^{(n)}(p)$ as

\begin{align}

    R^{(n)}(p) &= \frac{1}{n}\sum_{b\in\{0,1\}^{n}} \rho_b \; R_b(p) \label{eq:originalsum} ,

R^{(n)}(p) &= \frac{1}{n}\sum_{b\in\{0,1\}^{n}} \rho_b \; R_b(p) \label{eq:originalsum} ,

\end{align}

where $R_b(p)$ is the expected number of resamplings when starting from configuration $b$

\begin{align*}

	R_b(p) &= \sum_{\xi \in \paths{b}} \mathbb{P}[\xi] \cdot |\xi| .

\end{align*}

We consider $R^{(n)}(p)$ as a power series in $p$ and show that many terms in (\ref{eq:originalsum}) cancel out if we only consider the series up to some finite order $p^k$. The main idea is that if a path samples a $0$ then $\mathbb{P}[\xi]$ gains a factor $p$ so paths that contribute to $p^k$ can't be arbitrarily long.\\

To see this, we split the sum in (\ref{eq:originalsum}) into parts that will later cancel out. The initial probabilities $\rho_b$ contain a factor $p$ for every $0$ and a factor $(1-p)$ for every $1$. When expanding this product of $p$s and $(1-p)$s, we see that the $1$s contribute a factor $1$ and a factor $(-p)$ and the $0$s only give a factor $p$. We want to expand this product explicitly and therefore we no longer consider bitstrings $b\in\{0,1\}^n$ but bitstrings $b\in\{0,1,1'\}^n$. We view this as follows: every site can have one of $\{0,1,1'\}$ with `probabilities' $p$, $1$ and $-p$ respectively. A configuration $b=101'1'101'$ now has probability $\rho_{b} = 1\cdot p\cdot(-p)\cdot(-p)\cdot 1\cdot p\cdot(-p) = -p^5$ in the starting state $\rho$. It should not be hard to see that we have

\begin{align*}

    R^{(n)}(p) &= \frac{1}{n}\sum_{b\in\{0,1,1'\}^{n}} \rho_{b} \; R_{\bar{b}}(p) ,

\end{align*}

where $\bar{b}$ is the bitstring obtained by changing every $1'$ in it back to a $1$. It is simply the same sum as (\ref{eq:originalsum}) but now every factor $(1-p)$ is explicitly split into $1$ and $(-p)$.

Some terminology: for any configuration we call a $0$ a \emph{particle} (probability $p$) and a $1'$ an \emph{antiparticle} (probability $-p$). We use the word \emph{slot} for a position that is occupied by either a paritcle or antiparticle ($0$ or $1'$). In the initial state, the probability of a configuration is given by $\pm p^{\mathrm{\#slots}}$ where the $\pm$ sign depends on the parity of the number of antiparticles.

We can further rewrite the sum over $b\in\{0,1,1'\}^n$ as a sum over all slot configurations $C\subseteq[n]$ and over all possible fillings of these slots.

\begin{align*}

	R^{(n)}(p) &= \frac{1}{n} \sum_{C\subseteq[n]} \sum_{f\in\{0,1'\}^{|C|}} \rho_{C(f)} R_{C(f)} ,

\end{align*}

where $C(f)\in\{0,1,1'\}^n$ denotes a configuration with slots on the sites $C$ filled with (anti)particles described by $f$. The non-slot positions are filled with $1$s.

\begin{definition}[Diameter and gaps] \label{def:diameter} \label{def:gaps}

    For a subset $C\subseteq[n]$, we define the \emph{diameter} $\diam{C}$ to be the minimum size of an integer interval $I$ containing $C$. Here we consider both $C$ and the interval modulo $n$. In other words $\diam{C} = \min\{ j \vert \exists i : C\subseteq [i,i+j-1] \}$. We define the \emph{gaps} of $C$, as $I\setminus C$ and denote this by $\gaps{C}$. Note that $\diam{C} = |C| + |\gaps{C}|$.  Define $\maxgap{C}$ as the size of the largest connected component of $\gaps{C}$. Figure \ref{fig:diametergap} illustrates these concepts with a picture.

\end{definition}

\begin{figure}

	\begin{center}

    	\includegraphics{diagram_gap.pdf}

    \end{center}

    \caption{\label{fig:diametergap} Illustration of Definition \ref{def:diameter}. A set $C=\{1,2,4,7,9\}\subseteq[n]$ consisting of 5 positions is shown by the red dots. The smallest interval containing $C$ is $[1,9]$, so the diameter is $\diam{C}=9$. The blue squares denote the set $\gaps{C} = \{3,5,6,8\}$. The dotted line at the top depicts the rest of the cycle which may be much larger. The largest gap of $C$ is $\maxgap{C}=2$ which is the largest connected component of $\gaps{C}$.}

\end{figure}

\begin{claim}[Strong cancellation claim] \label{claim:strongcancel}

	The lowest order term in

    \begin{align*}

        \sum_{f\in\{0,1'\}^{|C|}} \rho_{C(f)} R_{C(f)} ,

    \end{align*}

	is $p^{\diam{C}}$ when $n$ is large enough. All lower order terms cancel out.

\end{claim}

Example: for $C_0=\{1,2,4,7,9\}$ (the configuration shown in Figure \ref{fig:diametergap}) we computed the quantity up to order $p^{20}$ in an infinite system:

\begin{align*}

	\sum_{f\in\{0,1'\}^{|C_0|}} \rho_{C_0(f)} R_{C_0(f)} &= 0.0240278 p^{9} + 0.235129 p^{10} + 1.24067 p^{11} + 4.71825 p^{12} \\

    &\quad + 14.5555 p^{13} + 38.8307 p^{14} + 93.2179 p^{15} + 206.837 p^{16}\\

    &\quad + 432.302 p^{17} + 862.926 p^{18} + 1662.05 p^{19} + 3112.9 p^{20} + \mathcal{O}(p^{21})

R_b(p) &= \sum_{\xi \in \paths{b}} \mathbb{P}[\xi] \cdot |\xi| .

\end{align*}

and indeed the lowest order is $\diam{C}=9$.

A weaker version of the claim is that if $C$ contains a gap of size $k$, then the sum is zero up to and including order $p^{|C|+k-1}$.

\begin{claim}[Weak cancellation claim] \label{claim:weakcancel}

	For $C\subseteq[n]$ a configuration of slot positions, the lowest order term in

    \begin{align*}

        \sum_{f\in\{0,1'\}^{|C|}} \rho_{C(f)} R_{C(f)} ,

    \end{align*}

    is at least $p^{|C|+\maxgap{C}}$ when $n$ is large enough. All lower order terms cancel out.

\end{claim}

This weaker version would imply \ref{it:const} but for $\mathcal{O}(k^2)$ as opposed to $k+1$.

\newpage

The reason that claim \ref{claim:strongcancel} would prove \ref{it:const} is the following: to know the value of $a_k^{(n)}$, for any $n\geq k+1$ it is enough to look at configurations $C$ with diameter at most $k$, since larger configurations do not contribute to $a_k^{(n)}$.

For a starting state $b\in\{0,1\}^n$ that \emph{does} give a nonzero contribution, you can take that same starting configuration and translate it to get $n$ other configurations that give the same contribution. (An exception is a starting state like $1010101010...$ which you can only translate twice, but we only have to consider configurations with small diameter, in which case you can make exactly $n$ translations.)

Therefore the coefficient in the expected number of resamplings is a multiple of $n$ which Andr\'as already divided out in the definition of $R^{(n)}(p)$. To show \ref{it:const} we argue that this is the \emph{only} dependency on $n$. This is because there are only finitely many (depending on $k$ but not on $n$) configurations where the $k$ slots are nearby regardless of the value of $n$. So there are only finitely many nonzero contributions after translation symmetry was taken out. For example, when considering all starting configurations with 5 slots one might think there are $\binom{n}{5}$ configurations to consider which would be a dependency on $n$ (more than only the translation symmetry). But since most of these configurations have a diameter larger than $k$, they do not contribute to $a_k$. Only finitely many do and that does not depend on $n$.

Section \ref{sec:computerb} shows how to compute $R_b$ (this is not relevant for showing the claim) and the section after that shows how to prove the weaker claim.

\newpage

\subsection{Computation of $R_b$} \label{sec:computerb}

By $R_{101}$ we denote $R_b(p)$ for a $b$ that consists of only $1$s except for a single zero. We compute $R_{101}$ up to second order in $p$. This requires the following transitions.

\begin{align*}

    \framebox{$1 0 1$} &\to \framebox{$1 1 1$} & (1-p)^3 = 1-3p+3p^2-p^3\\

    \hline

    \framebox{$1 0 1$} &\to

        \begin{cases}

            \framebox{$0 1 1$}\\

            \framebox{$1 0 1$}\\

            \framebox{$1 1 0$}

        \end{cases}

        & 3p(1-p)^2 = 3p-6p^2+3p^3\\

    \hline

    \framebox{$1 0 1$} &\to \framebox{$0 1 0$} & p^2(1-p) = p^2-p^3\\

    \framebox{$1 0 1$} &\to

        \begin{cases}

            \framebox{$1 0 0$}\\

            \framebox{$0 0 1$}

        \end{cases}

        & 2p^2(1-p) = 2p^2 - 2p^3\\

    \hline

    \framebox{$1 0 1$} &\to \framebox{$0 0 0$} & p^3

\end{align*}

With this we can write a recursive formula for the expected number of resamples from $101$:

\begin{align*}

    R_{101} &= (1-3p+3p^2 - p^3)(1) + (3p -6p^2 +3p^3) (1+R_{101}) \\

            &\quad + (p^2 - p^3) (1+R_{10101}) + (2p^2-2p^3) (1+R_{1001}) + p^3(1+R_{10001}) \\

			&= 1 + 3 p + 7 p^2 + 14.6667 p^3 + 29 p^4 + 55.2222 p^5 + 102.444 p^6 + 186.36 p^7 \\

            &\quad + 333.906 p^8 + 590.997 p^9 + 1035.58 p^{10} + 1799.39 p^{11} + 3104.2 p^{12} \\

            &\quad+ 5322.18 p^{13} + 9075.83 p^{14} + 15403.6 p^{15} + 26033.4 p^{16} + 43833.5 p^{17} \\

            &\quad+ 73555.2 p^{18} + 123053 p^{19} + 205290 p^{20} + 341620 p^{21} + 567161 p^{22} \\

            &\quad+ 939693 p^{23} + 1.5537\cdot10^{6} p^{24} + 2.56158\cdot10^{6} p^{25} + \mathcal{O}(p^{26})

\end{align*}

where the recursion steps were done with a computer for an infinite line (or a cirlce where $n$ is assumed to be much larger than the largest power of $p$ considered).

Note: in the first line at the second term it uses that with probability $(3p-6p^2 + 3p^3)$ the state goes to $\framebox{$101$}$ and then the expected number of resamplings is $1+R_{101}$. Note that the actual term in the recursive formula should be

$$(3p-6p^2+3p^3)\cdot\left( \sum_{\xi\in\paths{101}} \mathbb{P}[\xi] \cdot \left( 1 + |\xi|\right) \right) = (3p-6p^2+3p^3)\left( p_\mathrm{tot} + R_{101} \right)$$

where $p_\mathrm{tot} := \sum_{\xi\in\paths{b}} \mathbb{P}[\xi]$. However, since the state space is finite (for finite $n$) and there is always a non-vanishing probability to go to $\mathbf{1}$, we know that $p_\mathrm{tot}=1$, i.e. the process terminates almost surely.

\newpage

\subsection{Weak cancellation proof}

Here we prove claim \ref{claim:weakcancel}, the weaker version of the claim. We require the following definition

\begin{definition}[Path independence] \label{def:independence}

	We say two paths $\xi_i\in\paths{b_i}$ ($i=1,2$) of the Markov Chain are \emph{independent} if $\xi_1$ never resamples a site that was ever zero in $\xi_2$ and the other way around. It is allowed that $\xi_1$ resamples a $1$ to a $1$ that was also resampled from $1$ to $1$ by $\xi_2$ and vice versa. If the paths are not independent then we call the paths \emph{dependent}.

\end{definition}

\begin{definition}[Path independence - alternative] \label{def:independence2}

    Equivalently, on the infinite line $\xi_1$ and $\xi_2$ are independent if there is a site `inbetween' them that was never zero in $\xi_1$ and never zero in $\xi_2$. On the cycle $\xi_1$ and $\xi_2$ are independent if there are \emph{two} sites inbetween them that are never zero.

\end{definition}

\begin{claim}[Sum of expectation values] \label{claim:expectationsum}

When $b=b_1\land b_2\in\{0,1\}^n$ is a state with two groups ($b_1\lor b_2 = 1^n$) of zeroes with $k$ $1$s inbetween the groups, then we have $R_b(p) = R_{b_1}(p) + R_{b_2}(p) + \bigO{p^{k}}$ where $b_1$ and $b_2$ are the configurations where only one of the groups is present and the other group has been replaced by $1$s. To be precise, the sums agree up to and including order $p^{k-1}$.

\end{claim}

\textbf{Example}: For $b_1 = 0111111$ and $b_2 = 1111010$ we have $b=0111010$ and $k=3$. The claim says that the expected time to reach $\mathbf{1}$ from $b$ is the time to make the first group $1$ plus the time to make the second group $1$, as if they are independent. Simulation shows that

\begin{align*}

    R_{b_1} &= 1 + 3p + 7p^2 + 14.67p^3 + 29p^4 + \mathcal{O}(p^5)\\

    R_{b_2} &= 2 + 5p + 10.67p^2 + 21.11p^3+40.26p^4 + \mathcal{O}(p^5)\\

    R_{b} &= 3 + 8p + 17.67p^2 + 34.78p^3+65.27p^4 + \mathcal{O}(p^5)\\

    R_{b_1} + R_{b_2} &= 3 + 8p + 17.67p^2+35.78p^3 + 69.26p^4 +\mathcal{O}(p^5)

\end{align*}

and indeed the sums agree up to order $p^{k-1}=p^2$. When going up to order $p^{k}$ or higher, there will be terms where the groups interfere so they are no longer independent.

\begin{proof}

    Consider a path $\xi_1\in\paths{b_1}$ and a path $\xi_2\in\paths{b_2}$ such that $\xi_1$ and $\xi_2$ are independent (Definition \ref{def:independence} or \ref{def:independence2}). The paths $\xi_1,\xi_2$ induce $\binom{|\xi_1|+|\xi_2|}{|\xi_1|}$ different paths of total length $|\xi_1|+|\xi_2|$ in $\paths{b_1\land b_2}$. In the sums $R_{b_1}$ and $R_{b_2}$, the contribution of these paths are $\mathbb{P}[\xi_1]\cdot |\xi_1|$ and $\mathbb{P}[\xi_2]\cdot |\xi_2|$. The next diagram shows how these $\binom{|\xi_1|+|\xi_2|}{|\xi_1|}$ paths contribute to $R_{b_1\land b_2}$. Point $(i,j)$ in the grid indicates that $i$ steps of $\xi_1$ have been done and $j$ steps of $\xi_2$ have been done. At every point (except the top and right edges of the grid) one has to choose between doing a step of $\xi_1$ or a step of $\xi_2$. The number of zeroes in the current state determine the probabilities with which this happens (beside the probabilities associated to the two original paths already). The grid below shows that at a certain point one can choose to do a step of $\xi_1$ with probability $p_i$ or a step of $\xi_2$ with probability $1-p_i$. These $p_i$ could in principle be different at every point in this grid. The weight of such a new path $\xi\in\paths{b_1\land b_2}$ is $p_\mathrm{grid}\cdot\mathbb{P}[\xi_1]\cdot\mathbb{P}[\xi_2]$ where $p_\mathrm{grid}$ is the weight of the path in the diagram. By induction one can show that the sum over the $\binom{|\xi_1|+|\xi_2|}{|\xi_1|}$ different terms $p_\mathrm{grid}$ is $1$.

\begin{center}

\includegraphics{diagram_paths.pdf}

\end{center}

 Hence the contribution of all $\binom{|\xi_1|+|\xi_2|}{|\xi_1|}$ paths together to $R_{b_1\land b_2}$ is given by

\[

\mathbb{P}[\xi_1]\cdot\mathbb{P}[\xi_2]\cdot(|\xi_1|+|\xi_2|) = \mathbb{P}[\xi_2]\cdot\mathbb{P}[\xi_1]\cdot|\xi_1| \;\; + \;\; \mathbb{P}[\xi_1]\cdot\mathbb{P}[\xi_2]\cdot|\xi_2|.

\]

Ideally we would now like to sum this expression over all possible paths $\xi_1,\xi_2$ and use $p_\mathrm{tot}:=\sum_{\xi\in\paths{b_i}} \mathbb{P}[\xi] = 1$ (which also holds up to arbitrary order in $p$). The above expression would then become $R_{b_1} + R_{b_2}$. However, not all paths in the sum would satisfy the independence condition so it seems we can't do this. We now argue that it works up to order $p^{k-1}$.

For all $\xi\in\paths{b_1\land b_2}$ we have that \emph{either} $\xi$ splits into two independent paths $\xi_1,\xi_2$ as above, \emph{or} it does not. In the latter case, when $\xi$ can not be split like that, we know $\mathbb{P}[\xi]$ contains a power $p^k$ or higher because there is a gap of size $k$  and the paths must have moved at least $k$ times `towards each other' (for example one path moves $m$ times to the right and the other path moves $k-m$ times to the left). So the total weight of such a combined path is at least order $p^k$. Therefore we have

\[

	R_{b_1\land b_2} = \sum_{\mathclap{\substack{\xi_{1,2}\in\paths{b_{1,2}}\\ \mathrm{independent}}}} \mathbb{P}[\xi_2]\mathbb{P}[\xi_1]|\xi_1| + \sum_{\mathclap{\substack{\xi_{1,2}\in\paths{b_{1,2}}\\ \mathrm{independent}}}} \mathbb{P}[\xi_1]\mathbb{P}[\xi_2]|\xi_2| + \sum_{\mathclap{\xi\;\mathrm{dependent}}} \mathbb{P}[\xi]|\xi|.

\]

where last sum only contains only terms of order $p^{k}$ or higher. Now for the first sum, note that

\[

	\sum_{\mathclap{\substack{\xi_{1,2}\in\paths{b_{1,2}}\\ \mathrm{independent}}}} \mathbb{P}[\xi_2]\mathbb{P}[\xi_1]|\xi_1|

    = \sum_{\xi_1\in\paths{b_1}} \sum_{\substack{\xi_2\in\paths{b_2}\\ \text{independent of }\xi_1}} \mathbb{P}[\xi_2]\mathbb{P}[\xi_1]|\xi_1|

\]

where the sum over independent paths could be empty for certain $\xi_1$. Now we replace this last sum by a sum over \emph{all} paths $\xi_2\in\paths{b_2}$. This will change the sum but only for terms where $\xi_1,\xi_2$ are dependent. For those terms we already know that $\mathbb{P}[\xi_1]\mathbb{P}[\xi_2]$ contains a factor $p^k$ and hence we have

\begin{align*}

    \sum_{\mathclap{\substack{\xi_{1,2}\in\paths{b_{1,2}}\\ \mathrm{independent}}}} \mathbb{P}[\xi_2]\mathbb{P}[\xi_1]|\xi_1|

    &= \sum_{\xi_1\in\paths{b_1}} \sum_{\xi_2\in\paths{b_2}} \mathbb{P}[\xi_2]\mathbb{P}[\xi_1]|\xi_1| + \mathcal{O}(p^k) \\

    &= \sum_{\xi_1\in\paths{b_1}} \mathbb{P}[\xi_1]|\xi_1| + \mathcal{O}(p^k) \\

    &= R_{b_1} + \mathcal{O}(p^k)

\end{align*}

we can do the same with the second term and this proves the claim.

\end{proof}

~\\

\textbf{Proof of claim \ref{claim:weakcancel}}: We can assume $C$ consists of a group on the left with $l$ slots and a group on the right with $r$ slots (so $r+l=|C|$), with a gap of size $k=\mathrm{gap}(C)$ between these groups. Then on the left we have strings in $\{0,1'\}^l$ as possibilities and on the right we have strings in $\{0,1'\}^r$. The combined configuration can be described by strings $f=(a,b)\in\{0,1'\}^{l+r}$. The initial probability of such a state $C(a,b)$ is $\rho_{C(a,b)} = (-1)^{|a|+|b|} p^{r+l}$ and by claim \ref{claim:expectationsum} we know $R_{C(a,b)} = R_{C(a)} + R_{C(b)} + \mathcal{O}(p^k)$ where $C(a)$ indicates that only the left slots have been filled by $a$ and the other slots are filled with $1$s. The total contribution of these configurations is therefore

\begin{align*}

    \sum_{f\in\{0,1'\}^{|C|}} \rho_{C(f)} R_{C(f)}

    &= \sum_{a\in\{0,1'\}^l} \sum_{b\in\{0,1'\}^r} (-1)^{|a|+|b|}p^{r+l} \left( R_{C(a)} + R_{C(b)} + \mathcal{O}(p^k) \right) \\

    &=\;\;\; p^{r+l}\sum_{a\in\{0,1'\}^l} (-1)^{|a|} R_{C(a)} \sum_{b\in\{0,1'\}^r} (-1)^{|b|} \\

    &\quad + p^{r+l}\sum_{b\in\{0,1'\}^r} (-1)^{|b|} R_{C(b)} \sum_{a\in\{0,1'\}^l} (-1)^{|a|}

        + \mathcal{O}(p^{r+l+k})\\

    &= 0 + \mathcal{O}(p^{|C|+k})

\end{align*}

where we used the identity $\sum_{a\in\{0,1\}^l} (-1)^{|a|} = 0$.

\newpage

\section{Proving the strong cancellation claim}

It is useful to introduce some new notation. We will consider variations of the Markov Chains:

\begin{itemize}

    \item $\P^{(n)}$ refers to the original process on the length-$n$ cycle.

    \item $\P^{[a,b]}$ or $\P^{[n]}$ refers to a similar Markov Chain but on a finite chain ($[a,b]$ or $[1,n]$).

\end{itemize}

The process on the finite chain has the following modification at the boundary: if a boundary site is resampled, it can not resample one of its neighbors so it ignores it and only draws two new bits.

We consider $R^{(n)}(p)$ as a power series in $p$ and our main aim in this section is to show that $R^{(n)}(p)$ and $R^{(n+k)}(p)$ are the same up to order $n-1$.

%Note that an \emph{event} is a subset of all possible paths of the Markov Chain.

\begin{definition}[Events conditioned on starting state] \label{def:conditionedevents}

@@ -576,11 +408,6 @@ The following lemma considers two vertices $v,w$ that are never ``crossed'' so t

    %Dividing by $\P^{(n)}_b(\mathrm{NZ}_{(v,w)},A_1,A_2)$ and using the first equality gives the desired result.

\end{proof}

\begin{definition}[Starting state dependent probability distribution.]

	Let $I\subset\mathbb{Z}$ be a finite set of vertices.

    Let $b_I$ be the state where everything is $1$, apart from the vertices corresponding to $I$, which are set $0$. Define $\P^{(n)}_I(A)=\P^{(n)}_{b_I}(A)$ which is defined in Definition \ref{def:conditionedevents}.

\end{definition}

\begin{lemma}[Conditional independence 2] \label{lemma:eventindependenceNew}

	Let $v,w \in [n]$, and let $A$ be any event that depends only on the sites $[v,w]$ (meaning the initialization and resamples) and similarly $B$ an event that depends only on the sites $[w,v]$. (For example $\mathrm{Z}^{(s)}$ or ``$s$ has been resampled at least $k$ times'' for an $s$ on the correct interval). Then we have

	\begin{align*}

@@ -644,13 +471,18 @@ The following lemma considers two vertices $v,w$ that are never ``crossed'' so t

    The second equality follows in a similar way.

\end{proof}

    Some notation: let $P$ be an interval $[a,b]$. We say $P$ is a \emph{patch} when the $\Z{i}$ event holds for all $i \in [a,b]$ and $\NZ{a-1}$ and $\NZ{b+1}$ holds. We denote this event by $P\in\mathcal{P}$, so

	\begin{align*}

	P\in\mathcal{P} \equiv \NZ{a-1} \cap \Z{a} \cap \Z{a+1} \cap \cdots \cap \Z{b-1} \cap \Z{b} \cap \NZ{b+1} .

	\end{align*}

	Note that we have the following partition of the event $\Z{v}$ for any vertex $v\in[n]$:

	\begin{definition}[Connected patches]

		Let $P$ be an interval $[a,b]$. We say that $P$ is a patch of a particular run of the process if $P$ is a maximal connected component of the vertices that have ever become $0$ before termination. We denote the set of patches of a run by $\mathcal{P}$. For a patch $P$ let $P\in \mathcal{P}$ denote the event that one of the patches is equal to $P$.

		In other words

		\begin{align*}

		P\in\mathcal{P} := \NZ{a-1} \cap \Z{a} \cap \Z{a+1} \cap \cdots \cap \Z{b-1} \cap \Z{b} \cap \NZ{b+1} .

		\end{align*}

		(In the extreme case when $P$ covers the whole cycle $[n]$, then instead $P\in\mathcal{P}:= \bigcap_{v\in[n]}\Z{v}$.)

	\end{definition}

	We are often going to use the observation that we can partition the event $\Z{v}$ using patches:

	\begin{align*}

	\Z{v} = \dot\bigcup_{P : v\in P} (P\in\mathcal{P})

	\Z{v} = \dot\bigcup_{P\text{ patch } : v\in P} (P\in\mathcal{P})

	\end{align*}

The intuition of the following lemma is that the far right can only affect the zero vertex if there is an interaction chain forming, which means that every vertex should get resampled to $0$ at least once.

@@ -663,7 +495,7 @@ The intuition of the following lemma is that the far right can only affect the z

	Induction step: suppose we proved the claim for $n-1$, then

	\begin{align*}

	\P^{[n+1]}(\Z{1})

	&=\sum_{k=1}^{n+1}\P^{[n+1]}([k]\in\mathcal{P}) \tag{the events are a partition}\\

	&=\sum_{k=1}^{n+1}\P^{[n+1]}([k]\in\mathcal{P}) \tag{the events form a partition}\\

	&=\sum_{k=1}^{n-1}\P^{[n+1]}([k]\in\mathcal{P}) + \bigO{p^{n}}\tag*{$\left(\P^{[n+1]}([k]\in\mathcal{P})=O(p^{k})\right)$}\\

	&=\sum_{k=1}^{n-1}\P^{[k+1]}_{b_{k+1}=1}([k]\in\mathcal{P})\cdot \P^{[n-k+1]}(\NZ{1})+ \bigO{p^{n}} \tag{by Claim~\ref{lemma:eventindependenceNew}}\\

	&=\sum_{k=1}^{n-1}\P^{[k+1]}_{b_{k+1}=1}([k]\in\mathcal{P})\cdot \left(\P^{[n-k]}(\NZ{1})+\bigO{p^{n-k}}\right)+ \bigO{p^{n}} \tag{by induction} \\

@@ -729,11 +561,10 @@ The intuition of the following lemma is that the far right can only affect the z

	Again the intuition of the final theorem is simmilar to the previous lemmas. A site can only realise the length of the cycle after an interaction chain was formed around the cycle, implying that every vertex was resampled to $0$ at least once.

	\begin{theorem}

		$R^{(n)}-R^{(m)}=\bigO{p^{\min(n,m)}}$.

	\begin{theorem} $R^{(n)}=\E^{[-m,m]}(\Res{0})+\bigO{p^{n}}$ for all $m\geq n \geq 3$, thus

		$R^{(n)}-R^{(m)}=\bigO{p^{n}}$.

	\end{theorem}

	\begin{proof}

		Let $N\geq \max(2n,2m)$, then

	\begin{proof} In the proof we identify the sites of the $n$-cycle with the$\mod n$ remainder classes.

		\vskip-3mm

		\begin{align*}

			R^{(n)}

@@ -743,12 +574,12 @@ The intuition of the following lemma is that the far right can only affect the z

			&= \sum_{k=1}^{\infty}\sum_{\underset{v+w\leq n}{v,w\in [n]}}\P^{(n)}(\Res{0}\!\geq\! k\,\&\, P_{v,w}\!\in\!\mathcal{P}) +\bigO{p^{n}}\\[-1mm]

			&= \sum_{k=1}^{\infty}\smash{\sum_{\underset{v+w\leq n}{v,w\in [n]}}}\P^{[-v,w]}_{b_{-v}=b_{w}=1}(\Res{0}\!\geq\! k\,\&\, P_{v,w}\!\in\!\mathcal{P}) \P^{[w,n-v]}(\NZ{w,n-v}) +\bigO{p^{n}} \tag{by Lemma~\ref{lemma:eventindependenceNew}}\\

			&= \sum_{k=1}^{\infty}\smash{\sum_{\underset{v+w\leq n}{v,w\in [n]}}}\P^{[-v,w]}_{b_{-v}=b_{w}=1}(\Res{0}\!\geq\! k\,\&\, P_{v,w}\!\in\!\mathcal{P})  \left(\left(\P^{[w,n-v]}(\NZ{w})\right)^{\!\!2}\!+\!\bigO{p^{n-v-w+1}}\right) +\bigO{p^{n}} \tag{by Lemma~\ref{lemma:independenetSidesNew}}\\

			&= \sum_{k=1}^{\infty}\smash{\sum_{\underset{v+w\leq n}{v,w\in [n]}}}\P^{[-v,w]}_{b_{-v}=b_{w}=1}(\Res{0}\!\geq\! k\,\&\, P_{v,w}\!\in\!\mathcal{P})  \left(\P^{[-N,-v]}(\NZ{-v})\P^{[w,N]}(\NZ{w})\!+\!\bigO{p^{n-v-w+1}}\right) +\bigO{p^{n}} \tag{by Lemma~\ref{lemma:independenetSidesNew}}\\

			&= \sum_{k=1}^{\infty}\smash{\sum_{\underset{v+w\leq n}{v,w\in [n]}}}\P^{[-v,w]}_{b_{-v}=b_{w}=1}(\Res{0}\!\geq\! k\,\&\, P_{v,w}\!\in\!\mathcal{P}) \P^{[-N,-v]}(\NZ{-v})\P^{[w,N]}(\NZ{w}) +\bigO{p^{n}} \tag{$|P_{v,w}|=v+w-1$}\\

			&= \sum_{k=1}^{\infty}\sum_{\underset{v+w\leq n}{v,w\in [n]}}\P^{[-N,N]}(\Res{0}\!\geq\! k\,\&\, P_{v,w}\!\in\!\mathcal{P}) +\bigO{p^{n}} \tag{by Lemma~\ref{lemma:eventindependenceNew}}\\[-1mm]

			&= \sum_{k=1}^{\infty}\sum_{\underset{|P|<n}{P\text{ patch}:0\in P}}\P^{[-N,N]}(\Res{0}\!\geq\! k\,\&\, P\in\mathcal{P}) +\bigO{p^{n}} \\[-1mm]

			&= \sum_{k=1}^{\infty}\sum_{P\text{ patch}:0\in P}\P^{[-N,N]}(\Res{0}\!\geq\! k\,\&\, P\in\mathcal{P}) +\bigO{p^{n}} \\

			&= \E^{[-N,N]}(\Res{0})+\bigO{p^{n}}.\\[-3mm]

			&= \sum_{k=1}^{\infty}\smash{\sum_{\underset{v+w\leq n}{v,w\in [n]}}}\P^{[-v,w]}_{b_{-v}=b_{w}=1}(\Res{0}\!\geq\! k\,\&\, P_{v,w}\!\in\!\mathcal{P})  \left(\P^{[-m,-v]}(\NZ{-v})\P^{[w,m]}(\NZ{w})\!+\!\bigO{p^{n-v-w+1}}\right) +\bigO{p^{n}} \tag{by Lemma~\ref{lemma:independenetSidesNew}}\\

			&= \sum_{k=1}^{\infty}\smash{\sum_{\underset{v+w\leq n}{v,w\in [n]}}}\P^{[-v,w]}_{b_{-v}=b_{w}=1}(\Res{0}\!\geq\! k\,\&\, P_{v,w}\!\in\!\mathcal{P}) \P^{[-m,-v]}(\NZ{-v})\P^{[w,m]}(\NZ{w}) +\bigO{p^{n}} \tag{$|P_{v,w}|=v+w-1$}\\

			&= \sum_{k=1}^{\infty}\sum_{\underset{v+w\leq n}{v,w\in [n]}}\P^{[-m,m]}(\Res{0}\!\geq\! k\,\&\, P_{v,w}\!\in\!\mathcal{P}) +\bigO{p^{n}} \tag{by Lemma~\ref{lemma:eventindependenceNew}}\\[-1mm]

			&= \sum_{k=1}^{\infty}\sum_{\underset{|P|<n}{P\text{ patch}:0\in P}}\P^{[-m,m]}(\Res{0}\!\geq\! k\,\&\, P\in\mathcal{P}) +\bigO{p^{n}} \\[-1mm]

			&= \sum_{k=1}^{\infty}\sum_{P\text{ patch}:0\in P}\P^{[-m,m]}(\Res{0}\!\geq\! k\,\&\, P\in\mathcal{P}) +\bigO{p^{n}} \\

			&= \E^{[-m,m]}(\Res{0})+\bigO{p^{n}}.\\[-3mm]

		\end{align*}

		\noindent Repeating the same argument with $m$ and comparing the results completes the proof.

	\end{proof}

@@ -772,7 +603,204 @@ The intuition of the following lemma is that the far right can only affect the z

		\end{align*}

\end{comment}

Old:

Questions:

\begin{itemize}

	\item Can we generalise the proof to other translationally invariant spaces, like the torus?

	\item In view of this proof, can we better characterise $a_k^{(k+1)}$?

	\item Why did Mario's and Tom's simulation show that for fixed $C$ the contribution coefficients have constant sign? Is it relevant for proving \ref{it:pos}-\ref{it:geq}?

\end{itemize}

	%I think the same arguments would translate to the torus and other translationally invariant spaces, so we could go higher dimensional as Mario suggested. Then I think one would need to replace $|S_{><}|$ by the minimal number $k$ such that there is a $C$ set for which $S\cup C$ is connected. I am not entirely sure how to generalise Lemma~\ref{lemma:probIndep} though, which has key importance in the present proof.

\newpage

\section{Quasiprobability method}

Let us first introduce notation for paths of the Markov Chain

\begin{definition}[Paths]

	We define a \emph{path} of the Markov Chain as a sequence of states and resampling choices $\xi=((b_0,r_0),(b_1,r_1),...,(b_k,r_k)) \in (\{0,1\}^n\times[n])^k$ indicating that at time $t$ Markov Chain was in state $b_t\in\{0,1\}^n$ and then resampled site $r_t$. We denote by $|\xi|$ the length $k$ of such a path, i.e. the number of resamples that happened, and by $\mathbb{P}[\xi]$ the probability associated to this path.

	We denote by $\paths{b}$ the set of all valid paths $\xi$ that start in state $b$ and end in state $\mathbf{1} := 1^n$.

\end{definition}

We can write the expected number of resamplings per site $R^{(n)}(p)$ as

\begin{align}

R^{(n)}(p) &= \frac{1}{n}\sum_{b\in\{0,1\}^{n}} \rho_b \; R_b(p) \label{eq:originalsum} ,

\end{align}

where $R_b(p)$ is the expected number of resamplings when starting from configuration $b$

\begin{align*}

R_b(p) &= \sum_{\xi \in \paths{b}} \mathbb{P}[\xi] \cdot |\xi| .

\end{align*}

We consider $R^{(n)}(p)$ as a power series in $p$ and show that many terms in (\ref{eq:originalsum}) cancel out if we only consider the series up to some finite order $p^k$. The main idea is that if a path samples a $0$ then $\mathbb{P}[\xi]$ gains a factor $p$ so paths that contribute to $p^k$ can't be arbitrarily long.\\

To see this, we split the sum in (\ref{eq:originalsum}) into parts that will later cancel out. The initial probabilities $\rho_b$ contain a factor $p$ for every $0$ and a factor $(1-p)$ for every $1$. When expanding this product of $p$s and $(1-p)$s, we see that the $1$s contribute a factor $1$ and a factor $(-p)$ and the $0$s only give a factor $p$. We want to expand this product explicitly and therefore we no longer consider bitstrings $b\in\{0,1\}^n$ but bitstrings $b\in\{0,1,1'\}^n$. We view this as follows: every site can have one of $\{0,1,1'\}$ with `probabilities' $p$, $1$ and $-p$ respectively. A configuration $b=101'1'101'$ now has probability $\rho_{b} = 1\cdot p\cdot(-p)\cdot(-p)\cdot 1\cdot p\cdot(-p) = -p^5$ in the starting state $\rho$. It should not be hard to see that we have

\begin{align*}

R^{(n)}(p) &= \frac{1}{n}\sum_{b\in\{0,1,1'\}^{n}} \rho_{b} \; R_{\bar{b}}(p) ,

\end{align*}

where $\bar{b}$ is the bitstring obtained by changing every $1'$ in it back to a $1$. It is simply the same sum as (\ref{eq:originalsum}) but now every factor $(1-p)$ is explicitly split into $1$ and $(-p)$.

Some terminology: for any configuration we call a $0$ a \emph{particle} (probability $p$) and a $1'$ an \emph{antiparticle} (probability $-p$). We use the word \emph{slot} for a position that is occupied by either a paritcle or antiparticle ($0$ or $1'$). In the initial state, the probability of a configuration is given by $\pm p^{\mathrm{\#slots}}$ where the $\pm$ sign depends on the parity of the number of antiparticles.

We can further rewrite the sum over $b\in\{0,1,1'\}^n$ as a sum over all slot configurations $C\subseteq[n]$ and over all possible fillings of these slots.

\begin{align*}

R^{(n)}(p) &= \frac{1}{n} \sum_{C\subseteq[n]} \sum_{f\in\{0,1'\}^{|C|}} \rho_{C(f)} R_{C(f)} ,

\end{align*}

where $C(f)\in\{0,1,1'\}^n$ denotes a configuration with slots on the sites $C$ filled with (anti)particles described by $f$. The non-slot positions are filled with $1$s.

\begin{definition}[Diameter and gaps] \label{def:diameter} \label{def:gaps}

	For a subset $C\subseteq[n]$, we define the \emph{diameter} $\diam{C}$ to be the minimum size of an integer interval $I$ containing $C$. Here we consider both $C$ and the interval modulo $n$. In other words $\diam{C} = \min\{ j \vert \exists i : C\subseteq [i,i+j-1] \}$. We define the \emph{gaps} of $C$, as $I\setminus C$ and denote this by $\gaps{C}$. Note that $\diam{C} = |C| + |\gaps{C}|$.  Define $\maxgap{C}$ as the size of the largest connected component of $\gaps{C}$. Figure \ref{fig:diametergap} illustrates these concepts with a picture.

\end{definition}

\begin{figure}

	\begin{center}

		\includegraphics{diagram_gap.pdf}

	\end{center}

	\caption{\label{fig:diametergap} Illustration of Definition \ref{def:diameter}. A set $C=\{1,2,4,7,9\}\subseteq[n]$ consisting of 5 positions is shown by the red dots. The smallest interval containing $C$ is $[1,9]$, so the diameter is $\diam{C}=9$. The blue squares denote the set $\gaps{C} = \{3,5,6,8\}$. The dotted line at the top depicts the rest of the cycle which may be much larger. The largest gap of $C$ is $\maxgap{C}=2$ which is the largest connected component of $\gaps{C}$.}

\end{figure}

\begin{claim}[Strong cancellation claim] \label{claim:strongcancel}

	The lowest order term in

	\begin{align*}

	\sum_{f\in\{0,1'\}^{|C|}} \rho_{C(f)} R_{C(f)} ,

	\end{align*}

	is $p^{\diam{C}}$ when $n$ is large enough. All lower order terms cancel out.

\end{claim}

Example: for $C_0=\{1,2,4,7,9\}$ (the configuration shown in Figure \ref{fig:diametergap}) we computed the quantity up to order $p^{20}$ in an infinite system:

\begin{align*}

\sum_{f\in\{0,1'\}^{|C_0|}} \rho_{C_0(f)} R_{C_0(f)} &= 0.0240278 p^{9} + 0.235129 p^{10} + 1.24067 p^{11} + 4.71825 p^{12} \\

&\quad + 14.5555 p^{13} + 38.8307 p^{14} + 93.2179 p^{15} + 206.837 p^{16}\\

&\quad + 432.302 p^{17} + 862.926 p^{18} + 1662.05 p^{19} + 3112.9 p^{20} + \mathcal{O}(p^{21})

\end{align*}

and indeed the lowest order is $\diam{C}=9$.

A weaker version of the claim is that if $C$ contains a gap of size $k$, then the sum is zero up to and including order $p^{|C|+k-1}$.

\begin{claim}[Weak cancellation claim] \label{claim:weakcancel}

	For $C\subseteq[n]$ a configuration of slot positions, the lowest order term in

	\begin{align*}

	\sum_{f\in\{0,1'\}^{|C|}} \rho_{C(f)} R_{C(f)} ,

	\end{align*}

	is at least $p^{|C|+\maxgap{C}}$ when $n$ is large enough. All lower order terms cancel out.

\end{claim}

This weaker version would imply \ref{it:const} but for $\mathcal{O}(k^2)$ as opposed to $k+1$.

\newpage

The reason that claim \ref{claim:strongcancel} would prove \ref{it:const} is the following: to know the value of $a_k^{(n)}$, for any $n\geq k+1$ it is enough to look at configurations $C$ with diameter at most $k$, since larger configurations do not contribute to $a_k^{(n)}$.

For a starting state $b\in\{0,1\}^n$ that \emph{does} give a nonzero contribution, you can take that same starting configuration and translate it to get $n$ other configurations that give the same contribution. (An exception is a starting state like $1010101010...$ which you can only translate twice, but we only have to consider configurations with small diameter, in which case you can make exactly $n$ translations.)

Therefore the coefficient in the expected number of resamplings is a multiple of $n$ which Andr\'as already divided out in the definition of $R^{(n)}(p)$. To show \ref{it:const} we argue that this is the \emph{only} dependency on $n$. This is because there are only finitely many (depending on $k$ but not on $n$) configurations where the $k$ slots are nearby regardless of the value of $n$. So there are only finitely many nonzero contributions after translation symmetry was taken out. For example, when considering all starting configurations with 5 slots one might think there are $\binom{n}{5}$ configurations to consider which would be a dependency on $n$ (more than only the translation symmetry). But since most of these configurations have a diameter larger than $k$, they do not contribute to $a_k$. Only finitely many do and that does not depend on $n$.

Section \ref{sec:computerb} shows how to compute $R_b$ (this is not relevant for showing the claim) and the section after that shows how to prove the weaker claim.

\newpage

\subsection{Computation of $R_b$} \label{sec:computerb}

By $R_{101}$ we denote $R_b(p)$ for a $b$ that consists of only $1$s except for a single zero. We compute $R_{101}$ up to second order in $p$. This requires the following transitions.

\begin{align*}

\framebox{$1 0 1$} &\to \framebox{$1 1 1$} & (1-p)^3 = 1-3p+3p^2-p^3\\

\hline

\framebox{$1 0 1$} &\to

\begin{cases}

\framebox{$0 1 1$}\\

\framebox{$1 0 1$}\\

\framebox{$1 1 0$}

\end{cases}

& 3p(1-p)^2 = 3p-6p^2+3p^3\\

\hline

\framebox{$1 0 1$} &\to \framebox{$0 1 0$} & p^2(1-p) = p^2-p^3\\

\framebox{$1 0 1$} &\to

\begin{cases}

\framebox{$1 0 0$}\\

\framebox{$0 0 1$}

\end{cases}

& 2p^2(1-p) = 2p^2 - 2p^3\\

\hline

\framebox{$1 0 1$} &\to \framebox{$0 0 0$} & p^3

\end{align*}

With this we can write a recursive formula for the expected number of resamples from $101$:

\begin{align*}

R_{101} &= (1-3p+3p^2 - p^3)(1) + (3p -6p^2 +3p^3) (1+R_{101}) \\

&\quad + (p^2 - p^3) (1+R_{10101}) + (2p^2-2p^3) (1+R_{1001}) + p^3(1+R_{10001}) \\

&= 1 + 3 p + 7 p^2 + 14.6667 p^3 + 29 p^4 + 55.2222 p^5 + 102.444 p^6 + 186.36 p^7 \\

&\quad + 333.906 p^8 + 590.997 p^9 + 1035.58 p^{10} + 1799.39 p^{11} + 3104.2 p^{12} \\

&\quad+ 5322.18 p^{13} + 9075.83 p^{14} + 15403.6 p^{15} + 26033.4 p^{16} + 43833.5 p^{17} \\

&\quad+ 73555.2 p^{18} + 123053 p^{19} + 205290 p^{20} + 341620 p^{21} + 567161 p^{22} \\

&\quad+ 939693 p^{23} + 1.5537\cdot10^{6} p^{24} + 2.56158\cdot10^{6} p^{25} + \mathcal{O}(p^{26})

\end{align*}

where the recursion steps were done with a computer for an infinite line (or a cirlce where $n$ is assumed to be much larger than the largest power of $p$ considered).

Note: in the first line at the second term it uses that with probability $(3p-6p^2 + 3p^3)$ the state goes to $\framebox{$101$}$ and then the expected number of resamplings is $1+R_{101}$. Note that the actual term in the recursive formula should be

$$(3p-6p^2+3p^3)\cdot\left( \sum_{\xi\in\paths{101}} \mathbb{P}[\xi] \cdot \left( 1 + |\xi|\right) \right) = (3p-6p^2+3p^3)\left( p_\mathrm{tot} + R_{101} \right)$$

where $p_\mathrm{tot} := \sum_{\xi\in\paths{b}} \mathbb{P}[\xi]$. However, since the state space is finite (for finite $n$) and there is always a non-vanishing probability to go to $\mathbf{1}$, we know that $p_\mathrm{tot}=1$, i.e. the process terminates almost surely.

\newpage

\subsection{Weak cancellation proof}

Here we prove claim \ref{claim:weakcancel}, the weaker version of the claim. We require the following definition

\begin{definition}[Path independence] \label{def:independence}

	We say two paths $\xi_i\in\paths{b_i}$ ($i=1,2$) of the Markov Chain are \emph{independent} if $\xi_1$ never resamples a site that was ever zero in $\xi_2$ and the other way around. It is allowed that $\xi_1$ resamples a $1$ to a $1$ that was also resampled from $1$ to $1$ by $\xi_2$ and vice versa. If the paths are not independent then we call the paths \emph{dependent}.

\end{definition}

\begin{definition}[Path independence - alternative] \label{def:independence2}

	Equivalently, on the infinite line $\xi_1$ and $\xi_2$ are independent if there is a site `inbetween' them that was never zero in $\xi_1$ and never zero in $\xi_2$. On the cycle $\xi_1$ and $\xi_2$ are independent if there are \emph{two} sites inbetween them that are never zero.

\end{definition}

\begin{claim}[Sum of expectation values] \label{claim:expectationsum}

	When $b=b_1\land b_2\in\{0,1\}^n$ is a state with two groups ($b_1\lor b_2 = 1^n$) of zeroes with $k$ $1$s inbetween the groups, then we have $R_b(p) = R_{b_1}(p) + R_{b_2}(p) + \bigO{p^{k}}$ where $b_1$ and $b_2$ are the configurations where only one of the groups is present and the other group has been replaced by $1$s. To be precise, the sums agree up to and including order $p^{k-1}$.

\end{claim}

\textbf{Example}: For $b_1 = 0111111$ and $b_2 = 1111010$ we have $b=0111010$ and $k=3$. The claim says that the expected time to reach $\mathbf{1}$ from $b$ is the time to make the first group $1$ plus the time to make the second group $1$, as if they are independent. Simulation shows that

\begin{align*}

R_{b_1} &= 1 + 3p + 7p^2 + 14.67p^3 + 29p^4 + \mathcal{O}(p^5)\\

R_{b_2} &= 2 + 5p + 10.67p^2 + 21.11p^3+40.26p^4 + \mathcal{O}(p^5)\\

R_{b} &= 3 + 8p + 17.67p^2 + 34.78p^3+65.27p^4 + \mathcal{O}(p^5)\\

R_{b_1} + R_{b_2} &= 3 + 8p + 17.67p^2+35.78p^3 + 69.26p^4 +\mathcal{O}(p^5)

\end{align*}

and indeed the sums agree up to order $p^{k-1}=p^2$. When going up to order $p^{k}$ or higher, there will be terms where the groups interfere so they are no longer independent.

\begin{proof}

	Consider a path $\xi_1\in\paths{b_1}$ and a path $\xi_2\in\paths{b_2}$ such that $\xi_1$ and $\xi_2$ are independent (Definition \ref{def:independence} or \ref{def:independence2}). The paths $\xi_1,\xi_2$ induce $\binom{|\xi_1|+|\xi_2|}{|\xi_1|}$ different paths of total length $|\xi_1|+|\xi_2|$ in $\paths{b_1\land b_2}$. In the sums $R_{b_1}$ and $R_{b_2}$, the contribution of these paths are $\mathbb{P}[\xi_1]\cdot |\xi_1|$ and $\mathbb{P}[\xi_2]\cdot |\xi_2|$. The next diagram shows how these $\binom{|\xi_1|+|\xi_2|}{|\xi_1|}$ paths contribute to $R_{b_1\land b_2}$. Point $(i,j)$ in the grid indicates that $i$ steps of $\xi_1$ have been done and $j$ steps of $\xi_2$ have been done. At every point (except the top and right edges of the grid) one has to choose between doing a step of $\xi_1$ or a step of $\xi_2$. The number of zeroes in the current state determine the probabilities with which this happens (beside the probabilities associated to the two original paths already). The grid below shows that at a certain point one can choose to do a step of $\xi_1$ with probability $p_i$ or a step of $\xi_2$ with probability $1-p_i$. These $p_i$ could in principle be different at every point in this grid. The weight of such a new path $\xi\in\paths{b_1\land b_2}$ is $p_\mathrm{grid}\cdot\mathbb{P}[\xi_1]\cdot\mathbb{P}[\xi_2]$ where $p_\mathrm{grid}$ is the weight of the path in the diagram. By induction one can show that the sum over the $\binom{|\xi_1|+|\xi_2|}{|\xi_1|}$ different terms $p_\mathrm{grid}$ is $1$.

	\begin{center}

		\includegraphics{diagram_paths.pdf}

	\end{center}

	Hence the contribution of all $\binom{|\xi_1|+|\xi_2|}{|\xi_1|}$ paths together to $R_{b_1\land b_2}$ is given by

\[

	\mathbb{P}[\xi_1]\cdot\mathbb{P}[\xi_2]\cdot(|\xi_1|+|\xi_2|) = \mathbb{P}[\xi_2]\cdot\mathbb{P}[\xi_1]\cdot|\xi_1| \;\; + \;\; \mathbb{P}[\xi_1]\cdot\mathbb{P}[\xi_2]\cdot|\xi_2|.

\]

	Ideally we would now like to sum this expression over all possible paths $\xi_1,\xi_2$ and use $p_\mathrm{tot}:=\sum_{\xi\in\paths{b_i}} \mathbb{P}[\xi] = 1$ (which also holds up to arbitrary order in $p$). The above expression would then become $R_{b_1} + R_{b_2}$. However, not all paths in the sum would satisfy the independence condition so it seems we can't do this. We now argue that it works up to order $p^{k-1}$.

	For all $\xi\in\paths{b_1\land b_2}$ we have that \emph{either} $\xi$ splits into two independent paths $\xi_1,\xi_2$ as above, \emph{or} it does not. In the latter case, when $\xi$ can not be split like that, we know $\mathbb{P}[\xi]$ contains a power $p^k$ or higher because there is a gap of size $k$  and the paths must have moved at least $k$ times `towards each other' (for example one path moves $m$ times to the right and the other path moves $k-m$ times to the left). So the total weight of such a combined path is at least order $p^k$. Therefore we have

\[

	R_{b_1\land b_2} = \sum_{\mathclap{\substack{\xi_{1,2}\in\paths{b_{1,2}}\\ \mathrm{independent}}}} \mathbb{P}[\xi_2]\mathbb{P}[\xi_1]|\xi_1| + \sum_{\mathclap{\substack{\xi_{1,2}\in\paths{b_{1,2}}\\ \mathrm{independent}}}} \mathbb{P}[\xi_1]\mathbb{P}[\xi_2]|\xi_2| + \sum_{\mathclap{\xi\;\mathrm{dependent}}} \mathbb{P}[\xi]|\xi|.

\]

	where last sum only contains only terms of order $p^{k}$ or higher. Now for the first sum, note that

\[

	\sum_{\mathclap{\substack{\xi_{1,2}\in\paths{b_{1,2}}\\ \mathrm{independent}}}} \mathbb{P}[\xi_2]\mathbb{P}[\xi_1]|\xi_1|

	= \sum_{\xi_1\in\paths{b_1}} \sum_{\substack{\xi_2\in\paths{b_2}\\ \text{independent of }\xi_1}} \mathbb{P}[\xi_2]\mathbb{P}[\xi_1]|\xi_1|

\]

	where the sum over independent paths could be empty for certain $\xi_1$. Now we replace this last sum by a sum over \emph{all} paths $\xi_2\in\paths{b_2}$. This will change the sum but only for terms where $\xi_1,\xi_2$ are dependent. For those terms we already know that $\mathbb{P}[\xi_1]\mathbb{P}[\xi_2]$ contains a factor $p^k$ and hence we have

	\begin{align*}

	\sum_{\mathclap{\substack{\xi_{1,2}\in\paths{b_{1,2}}\\ \mathrm{independent}}}} \mathbb{P}[\xi_2]\mathbb{P}[\xi_1]|\xi_1|

	&= \sum_{\xi_1\in\paths{b_1}} \sum_{\xi_2\in\paths{b_2}} \mathbb{P}[\xi_2]\mathbb{P}[\xi_1]|\xi_1| + \mathcal{O}(p^k) \\

	&= \sum_{\xi_1\in\paths{b_1}} \mathbb{P}[\xi_1]|\xi_1| + \mathcal{O}(p^k) \\

	&= R_{b_1} + \mathcal{O}(p^k)

	\end{align*}

	we can do the same with the second term and this proves the claim.

\end{proof}

~\\

\textbf{Proof of claim \ref{claim:weakcancel}}: We can assume $C$ consists of a group on the left with $l$ slots and a group on the right with $r$ slots (so $r+l=|C|$), with a gap of size $k=\mathrm{gap}(C)$ between these groups. Then on the left we have strings in $\{0,1'\}^l$ as possibilities and on the right we have strings in $\{0,1'\}^r$. The combined configuration can be described by strings $f=(a,b)\in\{0,1'\}^{l+r}$. The initial probability of such a state $C(a,b)$ is $\rho_{C(a,b)} = (-1)^{|a|+|b|} p^{r+l}$ and by claim \ref{claim:expectationsum} we know $R_{C(a,b)} = R_{C(a)} + R_{C(b)} + \mathcal{O}(p^k)$ where $C(a)$ indicates that only the left slots have been filled by $a$ and the other slots are filled with $1$s. The total contribution of these configurations is therefore

\begin{align*}

\sum_{f\in\{0,1'\}^{|C|}} \rho_{C(f)} R_{C(f)}

&= \sum_{a\in\{0,1'\}^l} \sum_{b\in\{0,1'\}^r} (-1)^{|a|+|b|}p^{r+l} \left( R_{C(a)} + R_{C(b)} + \mathcal{O}(p^k) \right) \\

&=\;\;\; p^{r+l}\sum_{a\in\{0,1'\}^l} (-1)^{|a|} R_{C(a)} \sum_{b\in\{0,1'\}^r} (-1)^{|b|} \\

&\quad + p^{r+l}\sum_{b\in\{0,1'\}^r} (-1)^{|b|} R_{C(b)} \sum_{a\in\{0,1'\}^l} (-1)^{|a|}

+ \mathcal{O}(p^{r+l+k})\\

&= 0 + \mathcal{O}(p^{|C|+k})

\end{align*}

where we used the identity $\sum_{a\in\{0,1\}^l} (-1)^{|a|} = 0$.

\begin{comment}

The intuition of the following lemma is that the far right can only affect the zero vertex if there is an interaction chain forming, which means that every vertex should get resampled to $0$ at least once.

\begin{lemma}\label{lemma:probIndep}

@@ -950,25 +978,6 @@ The intuition of the following lemma is that the far right can only affect the z

		Since the above expression is independent of $n$ the statement follows.

	\end{proof}

	%Final observation: Suppose $S={a,b}$

	%    \begin{align*}

	%    &\sum_{f_{\overline{P}}\in\{0,1'\}^{2}} \rho_{S(f_{\overline{P}})} \mathbb{P}_{S(f_{\overline{P}})}(\NZ{(P_{\min}-1)}\cap \NZ{(P_{\max}+1)})  \\

	%    &= \sum_{f_{\overline{P}}\in\{0,1'\}^{2}} \rho_{S(f_{\overline{P}})} \mathbb{P}_{S(f_{\overline{P}})}(\NZ{(P_{\min}-1)}) \mathbb{P}_{S(f_{\overline{P}})}(\NZ{(P_{\max}+1)})   +\mathcal{O}(p^{|\overline{P}|-|S|})\\

	%    &= \sum_{f_{\overline{P}}\in\{0,1'\}^{2}}  \rho_{a_f}\mathbb{P}_{S(f_{\overline{P}})}(\NZ{(P_{\min}-1)})  \rho_{b_f}\mathbb{P}_{S(f_{\overline{P}})}(\NZ{(P_{\max}+1)})   +\mathcal{O}(p^{|\overline{P}|-|S|})\\

	%    \end{align*}

	Questions:

	\begin{itemize}

		\item Can we generalise the proof to other translationally invariant spaces, like the torus?

		\item In view of this proof, can we better characterise $a_k^{(k+1)}$?

		\item Why did Mario's and Tom's simulation show that for fixed $C$ the contribution coefficients have constant sign? Is it relevant for proving \ref{it:pos}-\ref{it:geq}?

	\end{itemize}

	%I think the same arguments would translate to the torus and other translationally invariant spaces, so we could go higher dimensional as Mario suggested. Then I think one would need to replace $|S_{><}|$ by the minimal number $k$ such that there is a $C$ set for which $S\cup C$ is connected. I am not entirely sure how to generalise Lemma~\ref{lemma:probIndep} though, which has key importance in the present proof.

Here, I (Tom) tried to set do the same Lemma but for the cycle instead of the infinite chain.

\begin{lemma}[Startingstate dependence] \label{lemma:probIndepCycle}

    Let $d(a,b)$ be the distance between $a,b\in[n]$ on the cycle, so $d(a,b)=\min(|a-b| , n-|a-b|)$. Let $\dist{s}(a,b)$ be the distance between $a,b$ when taking the path that does \emph{not} cross $s$. Let $I\subseteq [n]$ be a non-empty set of vertices. Let $i_* \in I$ and define $I' = I \setminus \{i_*\}$. Let $j,s\notin I$, with $j\neq s$ be any vertices not in $I$.

@@ -1169,6 +1178,7 @@ Here, I (Tom) tried to set do the same Lemma but for the cycle instead of the in

        = \mathcal{O}(p^{|S_{><}|})

    \end{align*}

    as required. This finishes the proof.

\end{comment}

\begin{comment}

    \subsection{Sketch of the (false) proof of the linear bound \ref{it:const}}