AENC/resampling_chain Changeset - a43268439e66 · Centrum Wiskunde & Informatica (CWI)

@@ -119,97 +119,97 @@

	%\and

%?%

%}

%\thanksmarkseries{arabic}

%\renewcommand{\thefootnote}{\fnsymbol{footnote}}

%\date{\vspace{-12mm}}

\begin{document}

	\maketitle

	\begin{abstract}

		The model we consider is the following~\cite{ResampleLimit}: We have a cycle of length $n\geq 3$. Initially we set each site to $0$ or $1$ independently, choosing $0$ with probability $p$. Then, in each step, we select a uniformly random vertex with value $0$ and resample it together with its two neighbours, assigning $0$ with probability $p$ to each of the three vertices, just as initially. The question we try to answer is: what is the expected number of resamplings performed before reaching the all-$1$ state?

		We present strong evidence for a remarkable critical behaviour. We conjecture that there exists some $p_c\approx0.62$ such that for all $p\in[0,p_c)$ the expected number of resamplings is bounded by a $p$-dependent constant times $n$, whereas for all $p\in(p_c,1]$ the expected number of resamplings grows exponentially in $n$.

	\end{abstract}

	%Let $R(n)$ denote this quantity for a length $n\geq 3$ cycle.

	We can think about the resampling procedure as a Markov chain. To describe the corresponding matrix we introduce some notation. For $b\in\{0,1\}^n$ let $r(b,i,(x_{-1},x_0,x_1))$ denote the bit string which differs from $b$ by replacing the bits at indices $i-1$, $i$ and $i+1$ with the values in $x$, interpreting the indices modulo $n$. Also for $x\in\{0,1\}^k$ let $p(x)=p((x_1,\ldots,x_k))=\prod_{i=1}^{k}p^{(1-x_i)}(1-p)^{x_i}$. Now we can describe the matrix of the Markov chain. We use row vectors for probability distributions, indexed by bit strings of length $n$. Let $M_{(n)}$ denote the matrix of the leaking Markov chain:

$$

		M_{(n)}=\sum_{b\in\{0,1\}^n\setminus{\{1\}^n}}\sum_{i\in[n]:b_i=0}\sum_{x\in\{0,1\}^3}E_{(b,r(b,i,x))}\frac{p(x)}{n-|b|},

$$

	where $E_{(i,j)}$ denotes the matrix that is all $0$ except $1$ at the $(i,j)$th entry.
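	As a small illustration of this notation (a sketch, assuming that sites are indexed $1,\ldots,n$ and that $|b|$ denotes the number of $1$s in $b$, so that $n-|b|$ counts the $0$s), take $n=5$, $b=10110$, $i=2$ and $x=(1,1,0)$:
	\begin{align*}
		r(b,2,x) &= 11010, & p(x) &= (1-p)^2\,p, & \frac{p(x)}{n-|b|} &= \frac{(1-p)^2\,p}{2}.
	\end{align*}
	Under these assumptions the transition $10110\to 11010$ via site $2$ contributes $(1-p)^2p/2$ to the corresponding entry of $M_{(5)}$. Each row $b\neq 1^n$ sums to $1$, because the weights $p(x)$ sum to $1$ over $x\in\{0,1\}^3$ and there are $n-|b|$ admissible choices of $i$.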

	We want to calculate the average number of resamplings $R^{(n)}$, which we define as the expected number of resamplings divided by $n$. For this let $\rho,\mathbbm{1}\in[0,1]^{2^n}$ be indexed with elements of $\{0,1\}^n$ such that $\rho_b=p(b)$ and $\mathbbm{1}_b=1$. Then we use that the expected number of resamplings is just the expected hitting time of the all-$1$ state of the Markov chain:

	\begin{align*}

		R^{(n)}:&=\mathbb{E}(\#\{\text{resamplings before termination}\})/n\\

		&=\sum_{k=1}^{\infty}P(\text{at least } k \text{ resamplings are performed})/n\\

		&=\sum_{k=1}^{\infty}\rho M_{(n)}^k \mathbbm{1}/n\\

		&=\sum_{k=0}^{\infty}a^{(n)}_k p^k

	\end{align*}
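	A remark on evaluating this series in closed form (a sketch, not part of the derivation above; $I$ is introduced here for the $2^n\times 2^n$ identity matrix): for $p\in[0,1)$ the all-$1$ state can be reached from every other state, so the powers of the substochastic matrix $M_{(n)}$ decay and the geometric series can be summed:
	\begin{align*}
		R^{(n)} = \frac{1}{n}\sum_{k=1}^{\infty}\rho M_{(n)}^k \mathbbm{1}
		= \frac{1}{n}\,\rho\left((I-M_{(n)})^{-1}-I\right)\mathbbm{1}.
	\end{align*}
	Since the entries of $\rho$ and $M_{(n)}$ are polynomials in $p$, this shows that $R^{(n)}$ is a rational function of $p$.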

	\begin{table}[]

	\centering

	\caption{Table of the coefficients $a^{(n)}_k$}

	\label{tab:coeffs}

	\resizebox{\columnwidth}{!}{%

		\begin{tabular}{c|ccccccccccccccccccccc}

			\backslashbox[10mm]{$n$}{$k$} & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 & 16 & 17 & 18 & 19 & 20 \\		\hline

			3 &	0 & 1 & \cellcolor{blue!25}2 & 3+1/3 & 5.00 & 7.00 & 9.33 & 12.00 & 15.00 & 18.33 & 22.00 & 26.00 & 30.33 & 35.00 & 40.00 & 45.333 & 51.000 & 57.000 & 63.333 & 70.000 & 77.000 \\

			4 &	0 & 1 & 2 & \cellcolor{blue!25}3+2/3 & 6.16 & 9.66 & 14.3 & 20.33 & 27.83 & 37.00 & 48.00 & 61.00 & 76.16 & 93.66 & 113.6 & 136.33 & 161.83 & 190.33 & 222.00 & 257.00 & 295.50 \\

			5 &	0 & 1 & 2 & 3+2/3 & \cellcolor{blue!25}6.44 & 10.8 & 17.3 & 26.65 & 39.43 & 56.48 & 78.65 & 106.9 & 142.2 & 185.8 & 238.7 & 302.41 & 378.05 & 467.13 & 571.14 & 691.69 & 830.44 \\

			6 &	0 & 1 & 2 & 3+2/3 & 6.44 & \cellcolor{blue!25}11.0 & 18.5 & 30.02 & 47.10 & 71.68 & 106.0 & 152.9 & 215.4 & 297.4 & 403.1 & 537.21 & 705.25 & 913.31 & 1168.2 & 1477.4 & 1849.1 \\

			7 &	0 & 1 & 2 & 3+2/3 & 6.44 & 11.0 & \cellcolor{blue!25}18.7 & 31.21 & 50.83 & 80.80 & 125.3 & 189.7 & 280.8 & 407.0 & 578.6 & 808.13 & 1110.2 & 1502.6 & 2005.6 & 2643.2 & 3443.1 \\

			8 &	0 & 1 & 2 & 3+2/3 & 6.44 & 11.0 & 18.7 & \cellcolor{blue!25}31.44 & 52.08 & 84.95 & 136.0 & 213.6 & 328.9 & 496.5 & 735.6 & 1070.7 & 1532.5 & 2159.5 & 2998.8 & 4108.1 & 5556.7 \\

			9 &	0 & 1 & 2 & 3+2/3 & 6.44 & 11.0 & 18.7 & 31.44 & \cellcolor{blue!25}52.30 & 86.27 & 140.7 & 226.3 & 358.4 & 558.4 & 855.4 & 1289.0 & 1911.5 & 2791.4 & 4017.2 & 5701.4 & 7985.9 \\

			10&	0 & 1 & 2 & 3+2/3 & 6.44 & 11.0 & 18.7 & 31.44 & 52.30 & \cellcolor{blue!25}86.49 & 142.1 & 231.6 & 373.4 & 594.8 & 934.4 & 1447.1 & 2209.0 & 3324.6 & 4934.8 & 7226.9 & 10447. \\

            \vdots \\

            15& 0 & 1 & 2 & 3+2/3 & 6.44 & 11.08 & 18.76 & 31.45 & 52.31 & 86.49 & 142.33 & 233.31 & 381.17 & 621.02 & \cellcolor{blue!25}1009.38 & 1637.13 \\ % 2650.74 & 4285.68 & 6913.55 & 11171.2 & 18052.2
            16& 0 & 1 & 2 & 3+2/3 & 6.44 & 11.08 & 18.76 & 31.45 & 52.31 & 86.49 & 142.33 & 233.31 & 381.17 & 621.02 & 1009.38 & \cellcolor{blue!25}1637.13 \\ % 2650.74 & 4285.68 & 6913.55 & 11171.2 & 18052.2
        \end{tabular}%
	}

	\end{table}

	We observe that $R^{(n)}$ is a power series in $p$, and we discovered a very regular structure in it. It seems that for all $k\in\mathbb{N}$ the coefficient $a^{(n)}_k$ takes the same value for all $n>k$; we verified this conjecture by computer up to $n=14$.

	\newpage

	\noindent Based on our calculations presented in Table~\ref{tab:coeffs} and Figure~\ref{fig:coeffs_conv_radius} we make the following conjectures:

	\begin{enumerate}[label=(\roman*)]

		\item $\forall k\in\mathbb{N}, \forall n\geq 3 : a^{(n)}_k\geq 0$	\label{it:pos}

        (A simpler version: $\forall k>0: a_k^{(3)}=(k+1)(k+2)/6$)

		\item $\forall k\in\mathbb{N}, \forall n>m\geq 3 : a^{(n)}_k\geq a^{(m)}_k$ \label{it:geq}

		\item $\forall k\in\mathbb{N}, \forall n,m > \max(k,3) : a^{(n)}_k=a^{(m)}_k$ \label{it:const}

  		\item $\exists p_c=\lim\limits_{k\rightarrow\infty}1\left/\sqrt[k]{a_{k}^{(k+1)}}\right.$ \label{it:lim}

	\end{enumerate}

	\colorbox{red}{\ref{it:pos}--\ref{it:geq} are false since $a_{1114}^{(10)}<0$ -- needs to be double checked!}

	I figured this out by observing that $R^{(10)}(p)$ has a pole inside the disk of radius $0.96$. This also means that $R^{(10)}(p)=\sum_{k=0}^{\infty}a_k^{(10)}p^k$ holds only in the sense of analytic continuation, since for $p>0.96$ the right-hand side does not converge.

	We also conjecture that $p_c\approx0.61$, see Figure~\ref{fig:coeffs_conv_radius}.
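	As a rough numerical illustration of the limit in \ref{it:lim} (simple arithmetic on the rounded diagonal entries of Table~\ref{tab:coeffs}, so only indicative):
	\begin{align*}
		1\left/\sqrt[9]{a_{9}^{(10)}}\right. &\approx 86.49^{-1/9}\approx 0.609, &
		1\left/\sqrt[15]{a_{15}^{(16)}}\right. &\approx 1637.13^{-1/15}\approx 0.611.
	\end{align*}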

	\begin{figure}[!htb]\centering

	\includegraphics[width=0.5\textwidth]{coeffs_conv_radius.pdf}

	%\includegraphics[width=0.5\textwidth]{log_coeffs.pdf}

	\caption{$1\left/\sqrt[k]{a_{k}^{(k+1)}}\right.$} %$\frac{1}{\sqrt[k]{a_k^{(k+1)}}}$

	\label{fig:coeffs_conv_radius}

	\end{figure}

    For reference, we also explicitly give formulas for $R^{(n)}(p)$ for small $n$. We also give them in terms of $q=1-p$ because they sometimes look nicer that way.

    \begin{align*}

    	R^{(3)}(p) &= \frac{1-(1-p)^3}{3(1-p)^3}

        			= \frac{1-q^3}{3q^3}\\

    	R^{(4)}(p) &= \frac{p(6-12p+10p^2-3p^3)}{6(1-p)^4}

                    = \frac{(1-q)(1+q+q^2+3q^3)}{6q^4}\\

        R^{(5)}(p) &= \frac{p(90-300p+435p^2-325p^3+136p^4-36p^5+6p^6)}{15(1-p)^5(6-2p+p^2)}\\

                   &= \frac{(1-q)(6+5q+6q^2+21q^3+46q^4+6q^6)}{15q^5(5+q^2)}

    \end{align*}

    For $n=3$ the system becomes very simple because regardless of the current state, the probability of going to $111$ is always equal to $(1-p)^3$. Therefore the expected number of resamplings is simply the expectation of a geometric distribution. This gives the formula for $R^{(3)}(p)$ as shown above. Note that the $k$-th coefficient of the power series of a function $f(p)$ is given by $\frac{1}{k!}\left.\frac{d^k f}{dp^k}\right|_{p=0}$, i.e. the $k$-th derivative with respect to $p$ evaluated at $0$ and divided by $k!$. For the function $R^{(3)}(p) =\frac{(1-p)^{-3} - 1}{3} $ this yields $a^{(3)}_k = (k+2)(k+1)/6$ for $k\geq 1$ and $a^{(3)}_0=0$.
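    Spelled out as a sketch in the notation above, with $q=1-p$: for $n=3$ every resampling redraws all three sites, so each of the first $k$ states visited differs from $111$ independently with probability $1-q^3$, and
    \begin{align*}
    	\rho M_{(3)}^k \mathbbm{1} &= (1-q^3)^k, &
    	R^{(3)}(p) &= \frac{1}{3}\sum_{k=1}^{\infty}(1-q^3)^k = \frac{1-q^3}{3q^3}.
    \end{align*}
    The coefficients then follow from the binomial series $(1-p)^{-3}=\sum_{k=0}^{\infty}\binom{k+2}{2}p^k$:
    \begin{align*}
    	R^{(3)}(p) = \frac{(1-p)^{-3}-1}{3} = \sum_{k=1}^{\infty}\frac{1}{3}\binom{k+2}{2}p^k
    	= \sum_{k=1}^{\infty}\frac{(k+2)(k+1)}{6}\,p^k .
    \end{align*}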

    We can do the same for $n=4,5$, which gives, for $k\geq 1$ (with Mathematica):

    \begin{align*}

        a^{(3)}_k &= \frac{(k+2)(k+1)}{6}\\

        a^{(4)}_k &= \frac{1}{6}\left(2+\frac{(k+3)(k+2)(k+1)}{6}\right)\\

        a^{(5)}_k &= \frac{1}{15}\left(\frac{(k+4)(k+3)(k+2)(k+1)}{20} - \frac{(k+3)(k+2)(k+1)}{30} - \frac{(k+2)(k+1)}{50} + \frac{76(k+1)}{25}\right.\\

                  &  \qquad\quad \left. + \frac{626}{125} - \frac{4}{250}

                  \left( \left(\frac{1+i\sqrt{5}}{6}\right)^k(94-25\sqrt{5}i)+\left(\frac{1-i\sqrt{5}}{6}\right)^k(94+25\sqrt{5}i) \right)

                  \right)

    \end{align*}

    and from $n=6$ onwards the expressions become complicated: Mathematica can only give expressions involving roots of polynomials.
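    The $n=4$ coefficients can also be checked by hand (a sketch using the $q=1-p$ form of $R^{(4)}$ given above). Expanding the numerator gives $(1-q)(1+q+q^2+3q^3)=1+2q^3-3q^4$, hence
    \begin{align*}
    	R^{(4)}(p) &= \frac{1}{6}\left(q^{-4}+2q^{-1}-3\right)
    	= \frac{1}{6}\left(\sum_{k=0}^{\infty}\binom{k+3}{3}p^k + 2\sum_{k=0}^{\infty}p^k - 3\right),
    \end{align*}
    so that $a^{(4)}_0=0$ and $a^{(4)}_k=\frac{1}{6}\left(2+\frac{(k+3)(k+2)(k+1)}{6}\right)$ for $k\geq 1$, in agreement with the formula above.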

@@ -896,102 +896,125 @@ The intuition of the following lemma is that the far right can only affect the z

	Again the intuition of the final theorem is similar to the previous lemmas. A site can only realise the length of the cycle after an interaction chain has formed around the cycle, implying that every vertex was resampled to $0$ at least once.

	\begin{theorem} $R^{(n)}=\E^{[-m,m]}(\Res{0})+\bigO{p^{n}}$ for all $m\geq n \geq 3$, thus

		$R^{(n)}-R^{(m)}=\bigO{p^{n}}$.

	\end{theorem}

	\begin{proof} In the proof we identify the sites of the $n$-cycle with the remainder classes modulo $n$.

		\vskip-3mm

		\begin{align*}

			R^{(n)}

			&= \E^{(n)}(\Res{0}) \tag{by translation invariance}\\

			&= \sum_{k=1}^{\infty}\P^{(n)}(\Res{0}\!\geq\! k) \\

			&= \sum_{k=1}^{\infty}\sum_{\underset{v+w\leq n+1}{v,w\in [n]}}\P^{(n)}(\Res{0}\!\geq\! k\,\&\, \underset{P_{v,w}:=}{\underbrace{[-v\!+\!1,w\!-\!1]}}\in\mathcal{P}) \tag{partition}\\[-1mm]

			&= \sum_{k=1}^{\infty}\sum_{\underset{v+w\leq n}{v,w\in [n]}}\P^{(n)}(\Res{0}\!\geq\! k\,\&\, P_{v,w}\!\in\!\mathcal{P}) +\bigO{p^{n}}\\[-1mm]

			&= \sum_{k=1}^{\infty}\smash{\sum_{\underset{v+w\leq n}{v,w\in [n]}}}\P^{[-v,w]}_{b_{-v}=b_{w}=1}(\Res{0}\!\geq\! k\,\&\, P_{v,w}\!\in\!\mathcal{P}) \P^{[w,n-v]}(\NZ{w,n-v}) +\bigO{p^{n}} \tag{by Lemma~\ref{lemma:eventindependenceNewGen}}\\

			&= \sum_{k=1}^{\infty}\smash{\sum_{\underset{v+w\leq n}{v,w\in [n]}}}\P^{[-v,w]}_{b_{-v}=b_{w}=1}(\Res{0}\!\geq\! k\,\&\, P_{v,w}\!\in\!\mathcal{P})  \left(\left(\P^{[w,n-v]}(\NZ{w})\right)^{\!\!2}\!+\!\bigO{p^{n-v-w+1}}\right) +\bigO{p^{n}} \tag{by Lemma~\ref{lemma:independenetSidesNewGen}}\\

			&= \sum_{k=1}^{\infty}\smash{\sum_{\underset{v+w\leq n}{v,w\in [n]}}}\P^{[-v,w]}_{b_{-v}=b_{w}=1}(\Res{0}\!\geq\! k\,\&\, P_{v,w}\!\in\!\mathcal{P})  \left(\P^{[-m,-v]}(\NZ{-v})\P^{[w,m]}(\NZ{w})\!+\!\bigO{p^{n-v-w+1}}\right) +\bigO{p^{n}} \tag{by Lemma~\ref{lemma:independenetSidesNewGen}}\\

			&= \sum_{k=1}^{\infty}\smash{\sum_{\underset{v+w\leq n}{v,w\in [n]}}}\P^{[-v,w]}_{b_{-v}=b_{w}=1}(\Res{0}\!\geq\! k\,\&\, P_{v,w}\!\in\!\mathcal{P}) \P^{[-m,-v]}(\NZ{-v})\P^{[w,m]}(\NZ{w}) +\bigO{p^{n}} \tag{$|P_{v,w}|=v+w-1$}\\

			&= \sum_{k=1}^{\infty}\sum_{\underset{v+w\leq n}{v,w\in [n]}}\P^{[-m,m]}(\Res{0}\!\geq\! k\,\&\, P_{v,w}\!\in\!\mathcal{P}) +\bigO{p^{n}} \tag{by Lemma~\ref{lemma:eventindependenceNewGen}}\\[-1mm]

			&= \sum_{k=1}^{\infty}\sum_{\underset{|P|<n}{P\text{ patch}:0\in P}}\P^{[-m,m]}(\Res{0}\!\geq\! k\,\&\, P\in\mathcal{P}) +\bigO{p^{n}} \\[-1mm]

			&= \sum_{k=1}^{\infty}\sum_{P\text{ patch}:0\in P}\P^{[-m,m]}(\Res{0}\!\geq\! k\,\&\, P\in\mathcal{P}) +\bigO{p^{n}} \\

			&= \E^{[-m,m]}(\Res{0})+\bigO{p^{n}}.\\[-3mm]

		\end{align*}

		\noindent Repeating the same argument with $m$ and comparing the results completes the proof.

	\end{proof}
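	In terms of the coefficients $a^{(n)}_k$ of the power series $R^{(n)}=\sum_{k\geq 0}a^{(n)}_k p^k$, the theorem can be read as follows (a short sketch of the consequence, assuming $\bigO{p^{n}}$ denotes a power series whose terms of order below $p^n$ vanish):
	\begin{align*}
		R^{(n)}-R^{(m)}=\sum_{k=0}^{\infty}\left(a^{(n)}_k-a^{(m)}_k\right)p^k=\bigO{p^{n}}
		\quad\Longrightarrow\quad a^{(n)}_k=a^{(m)}_k \text{ for all } k<n\leq m .
	\end{align*}
	This is the stabilisation of the columns visible in Table~\ref{tab:coeffs}.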

\begin{comment}

		Let $N\geq \max(2n,2m)$, then

		\begin{align*}

		R^{(n)}

		&= \E^{(n)}(\Res{1}) \tag{by translation invariance}\\

		&= \sum_{k=1}^{\infty}\P^{(n)}(\Res{1}\geq k) \\

		%&= \sum_{k=1}^{\infty}\sum_{\underset{\ell\geq r-1}{\ell,r\in[n]}}\P^{(n)}(\Res{1}\geq k\,\&\, [\ell+1,r-1]\in\mathcal{P}) \tag{partition}\\

		%&= \sum_{k=1}^{\infty}\sum_{\underset{\ell\geq r}{\ell,r\in[n]}}\P^{(n)}(\Res{1}\geq k\,\&\, [\ell+1,r-1]\in\mathcal{P})  +\bigO{p^{n}} \\

		%&= \sum_{k=1}^{\infty}\sum_{\underset{\ell\geq r}{\ell,r\in[n]}}\P^{[l,r]}_{b_{\ell}=b_{r}=1}(\Res{1}\geq k\,\&\, [\ell+1,r-1]\in\mathcal{P}) \P^{[r,\ell]}(\NZ{\ell,r}) +\bigO{p^{n}} \tag{by Lemma~\ref{lemma:eventindependenceNewGen}}\\

		&= \sum_{k=1}^{\infty}\sum_{P\text{ patch}:1\in P}\P^{(n)}(\Res{1}\geq k\,\&\, P\in\mathcal{P}) \tag{partition}\\

		&= \sum_{k=1}^{\infty}\sum_{P\text{ patch}:1\in P}^{|P|<n}\P^{(n)}(\Res{1}\geq k\,\&\, P\in\mathcal{P}) +\bigO{p^{n}}\\

		&= \sum_{k=1}^{\infty}\sum_{P\text{ patch}:1\in P}^{|P|<n}\P^{[P\cup \partial P]}_{b_{\partial P}=1}(\Res{1}\geq k\,\&\, P\in\mathcal{P}) \P^{[\overline{P}]}(\NZ{\partial P}) +\bigO{p^{n}} \tag{by Lemma~\ref{lemma:eventindependenceNewGen}}\\

		&= \sum_{k=1}^{\infty}\sum_{P\text{ patch}:1\in P}^{|P|<n}\P^{[P\cup \partial P]}_{b_{\partial P}=1}(\Res{1}\geq k\,\&\, P\in\mathcal{P}) \left(\left(\P^{[|\overline{P}|]}(\NZ{1})\right)^2+\bigO{p^{|\overline{P}|}}\right) +\bigO{p^{n}} \tag{by Lemma~\ref{lemma:independenetSidesNewGen}}\\

		&= \sum_{k=1}^{\infty}\sum_{P\text{ patch}:1\in P}^{|P|<n}\P^{[P\cup \partial P]}_{b_{\partial P}=1}(\Res{1}\geq k\,\&\, P\in\mathcal{P}) \left(\left(\P^{[N]}(\NZ{1})\right)^2+\bigO{p^{|\overline{P}|}}\right) +\bigO{p^{n}} \tag{by Corollary~\ref{cor:probIndepNewGen}}\\

		&= \sum_{k=1}^{\infty}\sum_{P\text{ patch}:1\in P}^{|P|<n}\P^{[-N,N]}(\Res{1}\geq k\,\&\, P\in\mathcal{P}) +\bigO{p^{n}} \tag{by Lemma~\ref{lemma:eventindependenceNewGen}}\\

		&= \sum_{k=1}^{\infty}\sum_{P\text{ patch}:1\in P}\P^{[-N,N]}(\Res{1}\geq k\,\&\, P\in\mathcal{P}) +\bigO{p^{n}} \tag{by Lemma~\ref{lemma:eventindependenceNewGen}}\\

		&= \E^{[-N,N]}(\Res{1})+\bigO{p^{n}}.

		\end{align*}

\end{comment}

Questions:

\begin{itemize}

	\item Can we generalise the proof to other translationally invariant spaces, like the torus?

	\item Can we prove some upper bound on the coefficients in the difference, beyond the fact that they are zero for small powers?

	\item In view of this proof, can we better characterise $a_k^{(k+1)}$?

	\item Why did Mario's and Tom's simulation show that for fixed $C$ the contribution coefficients have constant sign? Is it relevant for proving \ref{it:pos}-\ref{it:geq}?

\end{itemize}

	%I think the same arguments would translate to the torus and other translationally invariant spaces, so we could go higher dimensional as Mario suggested. Then I think one would need to replace $|S_{><}|$ by the minimal number $k$ such that there is a $C$ set for which $S\cup C$ is connected. I am not entirely sure how to generalise Lemma~\ref{lemma:probIndepNewGen} though, which has key importance in the present proof.

\newpage

\section{Characterisation of $p_c$}

\textbf{Conjecture.} For a fixed $p\in [0,1]$ the following are equivalent:

\begin{enumerate}

	\item $\lim_{n\to\infty}\P^{[-n,n]}_{\overline{\{0\}}}(\Z{\{n\}})>0$

	\item $\P^{[-\infty,\infty]}_{\overline{\{0\}}}(\text{Not reaching the all 1 state})>0$

	\item $\P^{[-\infty,\infty]}(\NZ{\{0\}})>0$

	\item $\P^{[0,\infty]}(\NZ{\{0\}})>0$

	\item $\lim_{n\to\infty}\P^{[0,n]}(\NZ{\{0\}})>0$

	\item $\exists c,\lambda>0:\P^{[-\infty,\infty]}(\Z{[k]})<ce^{-\lambda k}$

	\item $\exists c,\lambda>0:\mathrm{Cov}^{[-\infty,\infty]}(A,B)<ce^{-\lambda d(A,B)}$

	\item $\exists c,\lambda>0\,\forall n\in\mathbb{N}:\mathrm{Cov}^{[n]}(A,B)<ce^{-\lambda d(A,B)}$

	\item $R^{(\infty)}<\infty$

\end{enumerate}

\begin{proof}

	$1\Leftrightarrow 2:$

	\begin{align*}

		&\P^{[-\infty,\infty]}_{\overline{\{0\}}}(\text{Not reaching the all 1 state})>0\\
		&\quad\Leftrightarrow\P^{[-\infty,\infty]}_{\overline{\{0\}}}(\text{Resampling arbitrarily far away})>0\\
		&\quad\Leftrightarrow\P^{[-\infty,\infty]}_{\overline{\{0\}}}\left(\bigcap_{n=1}^{\infty}\left(\Z{\{-n\}}\cup\Z{\{n\}}\right)\right)>0\\
		&\quad\Leftrightarrow\lim_{n\to\infty}\P^{[-\infty,\infty]}_{\overline{\{0\}}}(\Z{\{-n\}}\cup\Z{\{n\}})>0\\
		&\quad\Leftrightarrow\lim_{n\to\infty}\P^{[-n,n]}_{\overline{\{0\}}}(\Z{\{-n\}}\cup\Z{\{n\}})>0

	\end{align*}

\end{proof}

\newpage

\section{Quasiprobability method}

Let us first introduce notation for paths of the Markov chain.

\begin{definition}[Paths]

	We define a \emph{path} of the Markov chain as a sequence of states and resampling choices $\xi=((b_0,r_0),(b_1,r_1),\ldots,(b_k,r_k)) \in (\{0,1\}^n\times[n])^k$, indicating that at time $t$ the Markov chain was in state $b_t\in\{0,1\}^n$ and then resampled site $r_t$. We denote by $|\xi|$ the length $k$ of such a path, i.e. the number of resamplings that happened, and by $\mathbb{P}[\xi]$ the probability associated with this path.

	We denote by $\paths{b}$ the set of all valid paths $\xi$ that start in state $b$ and end in state $\mathbf{1} := 1^n$.

\end{definition}

We can write the expected number of resamplings per site $R^{(n)}(p)$ as

\begin{align}

R^{(n)}(p) &= \frac{1}{n}\sum_{b\in\{0,1\}^{n}} \rho_b \; R_b(p) \label{eq:originalsum} ,

\end{align}

where $R_b(p)$ is the expected number of resamplings when starting from configuration $b$

\begin{align*}

R_b(p) &= \sum_{\xi \in \paths{b}} \mathbb{P}[\xi] \cdot |\xi| .

\end{align*}

We consider $R^{(n)}(p)$ as a power series in $p$ and show that many terms in (\ref{eq:originalsum}) cancel out if we only consider the series up to some finite order $p^k$. The main idea is that whenever a path samples a $0$, $\mathbb{P}[\xi]$ gains a factor $p$, so paths that contribute to $p^k$ cannot be arbitrarily long.\\

To see this, we split the sum in (\ref{eq:originalsum}) into parts that will later cancel out. The initial probabilities $\rho_b$ contain a factor $p$ for every $0$ and a factor $(1-p)$ for every $1$. When expanding this product of $p$s and $(1-p)$s, each $1$ contributes either a factor $1$ or a factor $(-p)$, while each $0$ only gives a factor $p$. We want to expand this product explicitly and therefore we no longer consider bitstrings $b\in\{0,1\}^n$ but strings $b\in\{0,1,1'\}^n$. We view this as follows: every site can take one of the values $\{0,1,1'\}$ with `probabilities' $p$, $1$ and $-p$ respectively. A configuration $b=101'1'101'$ now has probability $\rho_{b} = 1\cdot p\cdot(-p)\cdot(-p)\cdot 1\cdot p\cdot(-p) = -p^5$ in the starting state $\rho$. It is not hard to see that we have

\begin{align*}

R^{(n)}(p) &= \frac{1}{n}\sum_{b\in\{0,1,1'\}^{n}} \rho_{b} \; R_{\bar{b}}(p) ,

\end{align*}

where $\bar{b}$ is the bitstring obtained by changing every $1'$ in it back to a $1$. It is simply the same sum as (\ref{eq:originalsum}) but now every factor $(1-p)$ is explicitly split into $1$ and $(-p)$.
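The identity can be verified site by site (a sketch, writing $|b|$ for the number of $1$s in $b$): for a fixed $b\in\{0,1\}^n$, the configurations $b'\in\{0,1,1'\}^n$ with $\bar{b'}=b$ are obtained by independently marking each $1$ of $b$ as either $1$ or $1'$, so
\begin{align*}
\sum_{b' : \bar{b'}=b} \rho_{b'} = p^{n-|b|}\prod_{j : b_j=1}\bigl(1+(-p)\bigr) = p^{n-|b|}(1-p)^{|b|} = \rho_b ,
\end{align*}
and grouping the terms of the sum over $\{0,1,1'\}^n$ by $\bar{b'}$ recovers (\ref{eq:originalsum}).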

Some terminology: for any configuration we call a $0$ a \emph{particle} (probability $p$) and a $1'$ an \emph{antiparticle} (probability $-p$). We use the word \emph{slot} for a position that is occupied by either a particle or an antiparticle ($0$ or $1'$). In the initial state, the probability of a configuration is given by $\pm p^{\mathrm{\#slots}}$, where the sign depends on the parity of the number of antiparticles.

We can further rewrite the sum over $b\in\{0,1,1'\}^n$ as a sum over all slot configurations $C\subseteq[n]$ and over all possible fillings of these slots.

\begin{align*}

R^{(n)}(p) &= \frac{1}{n} \sum_{C\subseteq[n]} \sum_{f\in\{0,1'\}^{|C|}} \rho_{C(f)} R_{C(f)} ,

\end{align*}

where $C(f)\in\{0,1,1'\}^n$ denotes a configuration with slots on the sites $C$ filled with (anti)particles described by $f$. The non-slot positions are filled with $1$s.
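For instance (an illustrative example with hypothetical values, indexing sites from $1$), take $n=5$ and $C=\{2,4\}$. The four fillings $f\in\{0,1'\}^2$ give
\begin{align*}
C(00) &= 10101, & C(01') &= 1011'1, & C(1'0) &= 11'101, & C(1'1') &= 11'11'1,\\
\rho_{C(00)} &= p^2, & \rho_{C(01')} &= -p^2, & \rho_{C(1'0)} &= -p^2, & \rho_{C(1'1')} &= p^2 .
\end{align*}
The four weights sum to zero, so any contribution of this $C$ must come from differences between the corresponding $R_{C(f)}$.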

\begin{definition}[Diameter and gaps] \label{def:diameter} \label{def:gaps}

	For a subset $C\subseteq[n]$, we define the \emph{diameter} $\diam{C}$ to be the minimum size of an integer interval $I$ containing $C$, where both $C$ and the interval are considered modulo $n$. In other words, $\diam{C} = \min\{ j \mid \exists i : C\subseteq [i,i+j-1] \}$. We define the \emph{gaps} of $C$ as $I\setminus C$ for such a minimal interval $I$, and denote this set by $\gaps{C}$. Note that $\diam{C} = |C| + |\gaps{C}|$. Define $\maxgap{C}$ as the size of the largest connected component of $\gaps{C}$. Figure~\ref{fig:diametergap} illustrates these concepts.

\end{definition}

\begin{figure}

	\begin{center}

		\includegraphics{diagram_gap.pdf}

	\end{center}

	\caption{\label{fig:diametergap} Illustration of Definition \ref{def:diameter}. A set $C=\{1,2,4,7,9\}\subseteq[n]$ consisting of 5 positions is shown by the red dots. The smallest interval containing $C$ is $[1,9]$, so the diameter is $\diam{C}=9$. The blue squares denote the set $\gaps{C} = \{3,5,6,8\}$. The dotted line at the top depicts the rest of the cycle which may be much larger. The largest gap of $C$ is $\maxgap{C}=2$ which is the largest connected component of $\gaps{C}$.}

\end{figure}

\begin{claim}[Strong cancellation claim] \label{claim:strongcancel}

	The lowest order term in

	\begin{align*}

	\sum_{f\in\{0,1'\}^{|C|}} \rho_{C(f)} R_{C(f)} ,

	\end{align*}