🐘 The Elephant Random Walk

May 2023, refreshed in early 2026

In the Simple Random Walk (ERW), a walker moves in random directions; all of her steps are independent. The variance of the displacement of a random walker after nn steps typically grows linearly with nn: for example in the 1d symmetric random walk defined by

SRWn=S1+β‹―+Sn\mathrm{SRW}_n = S_1 + \dotsb + S_n

with P(Sn=Β±1)=1/2\mathbf{P}(S_n =\pm 1) = 1/2, one has Var(SRWn)=n\mathrm{Var}(\mathrm{SRW}_n) = n. This behaviour is called diffusive, because it is a discrete analog of heat dissipation.

The introduction of long-term memory in Random Walks breaks this diffusive behaviour. In the Simple Random Walk, the walker takes a random step independently of its former moves, oblivious of her past; but instead, she could remember one of her previous steps and reproduce it, just like Elephants whose memory is said to be surprisingly vast. The Elephant Randow Walk (ERW) has been thoroughly studied since its introduction in the paper Elephants can always remember by SchΓΌtz and Trimper, both for its mathematical tractability and rich behaviour.

Here is a description of the Elephant Randow Walk (ERW) with memory parameter pp.

  • The Elephant starts at 0 and takes a first step S1=+1S_1=+1 with initial probability qq or S1=βˆ’1S_1 = -1 with probability 1βˆ’q1-q; the influence of qq is quite subtle, we’ll explain it later.

  • Then, at time nn, the Elephant randomly remembers one of the steps she took in the past; each of the former steps has the same chance to be remembered. Then, with probability pp, she reproduces this step; otherwise she goes the opposite way.

Formally, we set W0=0W_0=0. The first step is W1=Β±1W_1 = \pm 1 with some probability qq.

At step kk, let Uk∼Uniform({1,…,k})U_k \sim \mathrm{Uniform}(\{1, \dotsc, k\}) be the Β« remembered past step Β» and let Ξ΅k\varepsilon_k be whether the elephant goes in the same or opposite direction, that is P(Ξ΅k=1)=p\mathbf{P}(\varepsilon_k = 1) = p and P(Ξ΅k=βˆ’1)=1βˆ’p\mathbf{P}(\varepsilon_k = -1) = 1-p.

The n+1n+1-th step is Wn+1=Wn+Sn+1W_{n+1} = W_n + S_{n+1}, with the last step Sn+1=Ξ΅nSUnS_{n+1} = \varepsilon_n S_{U_n}.

When the memory parameter pp is close to 11 (full memory), the Elephant tends to reproduce always the same steps, thus going way further than the Simple Randow Walker corresponding to p=1/2p=1/2, and the histogram of its final position is less concentrated.

The goal of this note is to explain the appearance of a sudden and surprising transition, happening at the critical value p=3/4p=3/4: the ERW is still diffusive for p<3/4p<3/4, and becomes super-diffusive if p>3/4p>3/4. It all boils down to computing the variance.

Computing the variance

When the initial step has a probability q=1/2q=1/2 to jump left or right, the whole ERW is symmetric, hence centered. Its variance Vn=E[Wn2]V_n = \mathbf{E}[W_n^2] solves the following recursion.

V1=1,Vn+1=1+(1+4pβˆ’2n)Vn. V_1 = 1, \qquad\qquad V_{n+1} = 1 + \left(1 + \frac{4p-2}{n}\right)V_n.

Proof. We note Fn\mathscr{F}_n the sigma-algebra generated by the jumps up to time nn. We have

Vn+1=E[Wn+12]=E[Wn2+2Sn+1Wn+Sn+12]=Vn+1+2E[WnE[Sn+1∣Fn]].V_{n+1}=\mathbf{E}[W_{n+1}^2] = \mathbf{E}[W_n^2 + 2S_{n+1}W_n + S_{n+1}^2]=V_n + 1 + 2\mathbf{E}[W_n\mathbf{E}[S_{n+1}\mid \mathscr{F}_n]].

Conditionnally on Fn\mathscr{F}_n, the last jump Sn+1S_{n+1} is distributed as a Rademacher random variable with a certain parameter pnp_n. Let NnN_n denote the number of +1 jumps before nn;Β then pn=pNn/n+(1βˆ’p)(nβˆ’Nn)/np_n = pN_n/n + (1-p)(n-N_n)/n. Since Nn=(n+Wn)/2N_n = (n+W_n)/2, we obtain

pn=1n(pn+Wn2+(1βˆ’p)nβˆ’Wn2)=n+(2pβˆ’1)Wn2n.p_n = \frac{1}{n}\left(p\frac{n+W_n}{2} + (1-p)\frac{n-W_n}{2}\right) = \frac{n+(2p-1)W_n}{2n}.

The expectation of a Rademacher random variable with parameter pnp_n is pnβˆ’(1βˆ’pn)=2pnβˆ’1p_n - (1-p_n) = 2p_n -1, hence

E[Sn+1∣Fn]=(2pβˆ’1)Wnn\mathbf{E}[S_{n+1}\mid \mathscr{F}_n] = (2p-1)\frac{W_n}{n}

which gives (2) when plugged back in the first expression.

The variance of the Elephant Randow Walk has the following explicit expression depending on Ξ±=4pβˆ’2\alpha = 4p-2: Var(Wn)=Ξ“(Ξ±+n)Ξ“(Ξ±)Ξ“(n)∫01(1βˆ’x)(Ξ±βˆ’1)βˆ’1(1βˆ’xn)dx.\mathrm{Var}(W_n) = \frac{\Gamma(\alpha+n)}{\Gamma(\alpha)\Gamma(n)} \int_0^1 (1-x)^{(\alpha-1)-1} (1-x^n)\mathrm{d}x.

The proof is postponed to later. This expression might not be very informative at first sight, but its large-nn asymptotics are easily understood and unveil a qualitative change in the behaviour of the ERW at the critical value p=3/4p=3/4. By symmetry we restrict to p>1/2p>1/2.

Diffusive case: if p<3/4p<3/4,

Vn∼n3βˆ’4p.V_n \sim \frac{n}{3-4p}.

Critical case: if p=3/4p=3/4,

Vn∼nlog⁑(n).V_n \sim n\log(n).

Super-diffusive case: if p>3/4p>3/4,

Vn∼nΓ—n4pβˆ’3(4pβˆ’3)Ξ“(4pβˆ’2)V_n \sim \frac{n\times n^{4p-3}}{(4p-3)\Gamma(4p-2)}

Proof. Let us note In(Ξ±)I_n(\alpha) the integral in (6). Its behaviour depends on wheter Ξ±βˆ’1=4pβˆ’3\alpha-1 = 4p-3 is positive or negative. We'll make use of Stirling's formula, Ξ“(x)∼xxeβˆ’x2Ο€x\Gamma(x) \sim x^{x}e^{-x}\sqrt{2\pi x}, which ensures that

Ξ“(Ξ±+n)Ξ“(Ξ±)Ξ“(n)∼(Ξ±+n)Ξ±Ξ“(Ξ±). \frac{\Gamma(\alpha+n)}{\Gamma(\alpha)\Gamma(n)} \sim \frac{(\alpha+n)^\alpha}{\Gamma(\alpha)}.

Now,

  • If Ξ±βˆ’1>0\alpha-1>0 then In(Ξ±)∼∫01(1βˆ’x)Ξ±βˆ’2dx=(Ξ±βˆ’1)βˆ’1I_n(\alpha)\sim \int_0^1 (1-x)^{\alpha-2}dx = (\alpha-1)^{-1};

  • If Ξ±=1\alpha=1 then In(Ξ±)=1+1/2+1/3+β‹―+1/n∼log⁑(n)I_n(\alpha)=1+1/2+1/3 +\dotsb + 1/n\sim \log(n);

  • If Ξ±βˆ’1<0\alpha-1<0 then In(Ξ±)βˆΌΞ“(Ξ±)nβˆ’Ξ±n/(1βˆ’Ξ±)I_n(\alpha)\sim \Gamma(\alpha)n^{-\alpha}n/(1-\alpha)

Here is a log-plot of the empirical variance of 100 runs of Elephant Randow Walks with various memory parameters pp ranging between 0.50.5 and 0.90.9. The white line marks the transition at p=3/4p=3/4, between a diffusive behaviour and a super-diffusive behaviour. The red line represents the Simple Random Walk with a variance of nn.

Proof of (6):Β exact computation of the variance VnV_n

Recursions like (2) are easy to solve. Let us note Ξ±=4pβˆ’2\alpha = 4p-2 and xn=(n+Ξ±)/nx_n = (n + \alpha)/n so that Vn+1=1+xnVnV_{n+1} = 1 + x_n V_n. Define

cn=xnβˆ’1xnβˆ’2Γ—β‹―Γ—x2x1c_n = x_{n-1}x_{n-2}\times\dotsm\times x_2x_1

and Vnβ€²=Vn/cnV'_n = V_n/c_n –- we chose cnc_n so that cnxn=cn+1c_nx_n = c_{n+1}. Then, (2) becomes

Vn+1β€²=1cn+1+Vncncnxncn+1=1cn+1+Vnβ€². V'_{n+1} = \frac{1}{c_{n+1}} + \frac{V_n}{c_{n}}\frac{c_n x_n}{c_{n+1}} = \frac{1}{c_{n+1}} + V'_n.

By a telescoping sum and using V1β€²=1V'_1 = 1,

Vnβ€²=1cn+1cnβˆ’1+β‹―+1c2+1.V'_{n} = \frac{1}{c_{n}} + \frac{1}{c_{n-1}} + \dotsb + \frac{1}{c_2} + 1.

Now we see that

cn+1=xnxnβˆ’1Γ—β‹―Γ—x1=(Ξ±+n)(Ξ±+nβˆ’1)Γ—β‹―Γ—(2+Ξ±)(1+Ξ±)n(nβˆ’1)Γ—β‹―Γ—2Γ—1=Ξ“(Ξ±+n)Ξ“(n)Ξ“(Ξ±+1)c_{n+1} = x_n x_{n-1}\times \dotsm \times x_1 = \frac{(\alpha+n)(\alpha+n-1)\times \dotsm \times (2+\alpha)(1+\alpha)}{n(n-1)\times \dotsm \times 2\times 1}= \frac{\Gamma(\alpha+n)}{\Gamma(n)\Gamma(\alpha+1)}

hence the exact formula

Vnβ€²=βˆ‘k=1nΞ“(k)Ξ“(1+Ξ±)Ξ“(k+Ξ±).\begin{aligned}V'_{n} &= \sum_{k=1}^n \frac{\Gamma(k)\Gamma(1+\alpha)}{\Gamma(k+\alpha)}. \end{aligned}

Now since Vn=cnVnβ€²V_n = c_n V'_n, we get

Vn=Ξ“(Ξ±+n)Ξ“(Ξ±+1)Ξ“(n)βˆ‘k=1nΞ“(k)Ξ“(1+Ξ±)Ξ“(k+Ξ±)=Ξ“(Ξ±+n)Ξ“(Ξ±)Ξ“(n)βˆ‘k=1nΞ“(k)Ξ“(Ξ±)Ξ“(k+Ξ±).\begin{aligned} V_n = \frac{\Gamma(\alpha+n)}{\Gamma(\alpha+1)\Gamma(n)}\sum_{k=1}^n\frac{\Gamma(k)\Gamma(1+\alpha)}{\Gamma(k+\alpha)} &= \frac{\Gamma(\alpha+n)}{\Gamma(\alpha)\Gamma(n)}\sum_{k=1}^n\frac{\Gamma(k)\Gamma(\alpha)}{\Gamma(k+\alpha)}. \end{aligned}

This expression can be further simplified using the Ξ“βˆ’Ξ²\Gamma-\beta formula, valid as soon as Ξ±>0\alpha>0 (this is the case when p>1/2p>1/2).

Vn=Ξ“(Ξ±+n)Ξ“(Ξ±)Ξ“(n)βˆ‘k=1n∫01xkβˆ’1(1βˆ’x)Ξ±βˆ’1dx=Ξ“(Ξ±+n)Ξ“(Ξ±)Ξ“(n)∫01(1βˆ’x)Ξ±βˆ’11βˆ’xn1βˆ’xdx=Ξ“(Ξ±+n)Ξ“(Ξ±)Ξ“(n)∫01(1βˆ’x)Ξ±βˆ’2(1βˆ’xn)dx.\begin{aligned} V_n&=\frac{\Gamma(\alpha+n)}{\Gamma(\alpha)\Gamma(n)}\sum_{k=1}^n \int_0^1 x^{k-1}(1-x)^{\alpha-1}dx\\ &=\frac{\Gamma(\alpha+n)}{\Gamma(\alpha)\Gamma(n)} \int_0^1 (1-x)^{\alpha-1} \frac{1-x^n}{1-x}dx\\ &=\frac{\Gamma(\alpha+n)}{\Gamma(\alpha)\Gamma(n)} \int_0^1 (1-x)^{\alpha-2} (1-x^n)dx. \end{aligned}

The limiting distribution

The variance computation above is enough to show the existence of a transition in the behaviour of the ERW. However, it does not tell the whole story. Indeed, scaling limits for the ERW are now quite well understood. In the superdiffusive case, which is the most interesting one, the scaling limit was proved essentially in Baur and Bertoin’s paper (see the references after):β€―they proved the existence of a random variable LqL_q, depending on qq, the probability of the very first step going up, and such that with probability 100%, we have

lim⁑nβ†’βˆžWnn2pβˆ’1=Lq.\lim_{n\to\infty}\frac{W_n}{n^{2p - 1}} = L_q.

This random variable LqL_q is, still today, quite mysterious! In a very good paper by GuΓ©rin, Laulin, Raschel and Simon, I found some of its properties:β€―

Conclusion and references

Many aspects of the ERW are now well understood. Its introduction in the papers Elephants can always remember and Anomalous dynamics… was mostly motivated by the physics of non-Markovian diffusions;Β they even formulated the continuous-time version with a non-diffusive Fokker-Planck equation. The effect of the type of memory on diffusive properties was studied in this paper.

Mathematically, the functional limit for the ERW was proved in a very elegant way in this paper – it appears that the ERW converges towards a continuous Gaussian process with a particular covariance kernel. The use of martingale theory turned out to be very powerful and allows for an efficient identification of the scaling limits of the ERW, see Bercu's paper. Multi-dimensional versions were studied in this paper.

The limiting distribution of LqL_q is studied at lengths in this paper by GuΓ©rin et al.