Continuous Markov Processes

Module by: Dr Zdzislaw (Gustav) Meglicki, Jr.

Summary: Yet to be written. Preliminary.

Preliminaries

  1. “Probability Distributions,” Connexions module m43336
  2. “Introduction to Markov Processes,” Connexions module m44014
  3. “Integral of a Markov Process,” Connexions module m44376: this module will not be needed until the section about Brownian motion, "Brownian Motion". In turn, some of its content depends on the material covered in this module, m44258, prior to "Brownian Motion". Therefore one should read m44258 up to (but excluding) "Brownian Motion", then switch to module m44376, and then return to read about Brownian motion and continue with m44258. We postpone the discussion of the integral of a Markov process because we want to get to some interesting material, namely the Fokker-Planck equations and the Schrödinger equation (the primary motivation behind the development of this collection), as soon as possible. If your interest is in stochastic quantum mechanics, then you only need to read up to "The Schrödinger Equation".

Introduction

Why study continuous Markov processes?

Continuous Markov processes are surprisingly precisely defined. Seemingly general assumptions about the smoothness of the propagator, combined with the Central Limit Theorem, uniquely determine that the Markov propagator density function must have the following form

\Pi(x' \mid dt \mid (x,t)) = \frac{1}{\sqrt{2\pi D(x,t)\,dt}}\,\exp\left[-\frac{\bigl(x' - \mu(x,t)\,dt\bigr)^2}{2 D(x,t)\,dt}\right],
(1)

where μ(x,t) is called the drift function and D(x,t) is called the diffusion function. Together, μ(x,t) and D(x,t) are called the characterizing functions of a continuous Markov process.

The Kramers-Moyal equations that correspond to this propagator simplify too. In this form, they are called the Fokker-Planck equations:

  • forward Fokker-Planck equation:
    \frac{\partial}{\partial t} P((x,t)\mid(x_0,t_0)) = -\frac{\partial}{\partial x}\left[\mu(x,t)\,P((x,t)\mid(x_0,t_0))\right] + \frac{1}{2}\,\frac{\partial^2}{\partial x^2}\left[D(x,t)\,P((x,t)\mid(x_0,t_0))\right],
    (2)
  • backward Fokker-Planck equation:
    -\frac{\partial}{\partial t_0} P((x,t)\mid(x_0,t_0)) = \mu(x_0,t_0)\,\frac{\partial}{\partial x_0} P((x,t)\mid(x_0,t_0)) + \frac{1}{2}\,D(x_0,t_0)\,\frac{\partial^2}{\partial x_0^2} P((x,t)\mid(x_0,t_0)).
    (3)

The moment evolution equations are expressed similarly in terms of the characterizing functions μ(x,t) and D(x,t). Homogeneous processes become even simpler to analyze, because the characterizing functions either do not depend on t, for temporally homogeneous processes, or are constants, for completely homogeneous processes. Completely homogeneous continuous Markov processes are called Wiener processes. Processes for which the diffusion function D(x,t) is a constant, but the drift function assumes the form μ(x,t) = -kx, are called Ornstein-Uhlenbeck processes. The early analysis of Brownian motion, first by Einstein (1905) and Smoluchowski (1906) and somewhat later by Langevin (1908), presented it as just such a process.

One might think that, being so easy to capture, continuous Markov processes would not be interesting. But they are surprisingly interesting: first, because so much can be said about them, and, second, because jump Markov processes become continuous Markov processes in a certain limit. Crudely put, if one does not look at jump processes closely enough, they often seem smooth enough to fit into the continuous Markov process formalism. The Fokker-Planck equation can also be reformulated as a path integral, so here we suddenly find an interesting bridge to quantum field theory and the theory of critical dynamics.

But first, we are going to look at how Equation 1 comes about.

The Propagator

We assume that

  1. the process under consideration is Markovian. Furthermore...
  2. the propagator density function of the process, Π(x'|dt|(x,t)), is a smooth function of dt and (x,t), and has to satisfy the Chapman-Kolmogorov equation. Furthermore...
  3. we assume that Π(x'|dt|(x,t)) ≠ 0 in an infinitesimal neighbourhood of x' = 0 only. Of course, it must be δ(x') for dt = 0. So, for dt > 0, but still infinitesimal, we expect it to slowly dissolve the delta into some sort of a finite distribution that converges to the delta as dt → 0. Furthermore...
  4. we assume that x', the random variable associated with Π(x'|dt|(x,t)), has finite mean and variance, which automatically eliminates distributions, such as the Cauchy distribution, that do not have this property. Furthermore...
  5. we assume that ⟨x'⟩ and var(x') are smooth functions of dt and (x,t).

Additionally we need to prove the following

Lemma 3.1 (Proportional Function)

If

  1. f(x) is smooth,
  2. ∀n ∈ ℕ: f(x) = n f(x/n), where ℕ is the set of natural numbers (positive integers),

then f(x) = αx, where α is a constant in this context, that is, a parameter that does not depend on x (but it may depend on other arguments if such are present).

To prove the lemma we start with assumption 2 and differentiate both sides with respect to x:

\frac{df(x)}{dx} = n\,\frac{df(x/n)}{d(x/n)}\cdot\frac{1}{n} = \frac{df(x/n)}{d(x/n)}.
(4)

In particular, for n → ∞,

\forall x:\quad \frac{df(x)}{dx} = \lim_{n\to\infty}\frac{df(x/n)}{d(x/n)} = \left.\frac{df(x)}{dx}\right|_{x=0}.
(5)

The rightmost derivative, at x = 0, is just a number, a constant. Let us call it α. The equation then tells us that

\forall x:\quad \frac{df(x)}{dx} = \alpha,
(6)

wherefrom f(x) = αx + β, where β is another constant. But since f(x)/n = f(x/n), taking n → ∞ we find that f(0) = 0, therefore β = 0, which ends the proof.

We are going to make use of the lemma as follows. We will show that, given the assumptions made,

\langle x'(dt,(x,t))\rangle = n\,\langle x'(dt/n,(x,t))\rangle, \qquad \mathrm{var}\bigl(x'(dt,(x,t))\bigr) = n\,\mathrm{var}\bigl(x'(dt/n,(x,t))\bigr).
(7)

Hence, according to the lemma,

\langle x'(dt,(x,t))\rangle = \mu(x,t)\,dt, \qquad \mathrm{var}\bigl(x'(dt,(x,t))\bigr) = D(x,t)\,dt.
(8)

Furthermore, along the way, we'll show that

x'(dt,(x,t)) = \sum_{i=1}^{n} x_i'\bigl(dt/n,(x,t)\bigr),
(9)

which implies that x' is the sum of n statistically independent random variables associated with identical probability densities, each with the same well defined mean, μ(x,t)dt/n, and variance, D(x,t)dt/n. Because the n here can be arbitrarily large, according to the Central Limit Theorem the probability density of the sum, that is, of x', must be given by the Gaussian distribution centered on μ(x,t)dt and with the variance of D(x,t)dt. So this, in a nutshell, is how we arrive at Equation 1.
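
A quick numerical illustration of this Central Limit Theorem step may help; the following Python sketch is mine, with μ, D and dt as arbitrary placeholder values, not from the text. It draws the n sub-increments from a deliberately non-Gaussian (uniform) density with mean μ dt/n and variance D dt/n, and shows that their sum nevertheless has mean μ dt, variance D dt, and Gaussian skewness and kurtosis.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, D, dt = 0.5, 2.0, 0.01      # hypothetical characterizing values at a fixed (x, t)
n, samples = 1000, 100_000      # n sub-increments per sampled propagator value

half_width = np.sqrt(3.0 * D * dt / n)   # uniform density with the required variance
x_prime = np.zeros(samples)
for _ in range(n):               # Equation 9: x' is the sum of the sub-increments
    x_prime += rng.uniform(mu*dt/n - half_width, mu*dt/n + half_width, samples)

print(x_prime.mean(), mu*dt)     # ~ mu dt, as in Equation 8
print(x_prime.var(),  D*dt)      # ~ D dt, as in Equation 8
z = (x_prime - x_prime.mean()) / x_prime.std()
print((z**3).mean(), (z**4).mean())   # ~ 0 and ~ 3: Gaussian skewness and kurtosis
```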

But let's get back to the beginning.

We have some Markovian process described by (x, P((x,t)|(x_0,t_0))), where x is the random variable and P is its probability density. Assuming the process restarts from (x,t), after an infinitesimal time dt it deflects from x by some x' with the probability density of Π(x'|dt|(x,t)). To describe this process we use a notational shortcut

x(t+dt) - x(t) = x'(dt),
(10)

but this expression should not be understood “as it's written.” Here, x(t+dt) may assume any value with probability density P((x,t+dt)|(x,t)) and x'(dt) may assume any value with probability density Π(x'|dt|(x,t)). Knowing the latter and (x, P((x,t)|(x_0,t_0))) we can reconstruct P((x,t+dt)|(x,t)) by using the Random Variable Transformation theorem, which says that

Theorem 3.1 (Random Variable Transformation)

If x = (x_1, x_2, ..., x_m) are random variables described by a joint probability density P_x(x) and y_i = f_i(x), i = 1, ..., n, where n is not necessarily equal to m, then the probability density P_y(y), y = (y_1, y_2, ..., y_n), is given by

P_y(\boldsymbol{y}) = \int_{\Omega(x_1)} \cdots \int_{\Omega(x_m)} P_x(\boldsymbol{x}) \prod_{i=1}^{n} \delta\bigl(y_i - f_i(\boldsymbol{x})\bigr)\, dx_1 \cdots dx_m,
(11)

where Ω(x_i) is the domain of x_i.

So, it is in this sense that we understand Equation 10. The expression listed there represents the function f of Equation 11.
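
The theorem is easy to see in action. In the following Monte Carlo sketch (the uniform density and the map y = x² are my own illustrative choices, not from the text), Equation 11 gives P_y(y) = ∫₀¹ δ(y − x²) dx = 1/(2√y), whose integral Prob(y ≤ q) = √q the samples reproduce.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0.0, 1.0, 1_000_000)   # (x, P_x) with P_x uniform on [0, 1]
y = x**2                               # y = f(x), a one-variable case of Equation 11

# Equation 11 predicts P_y(y) = 1/(2 sqrt(y)), hence Prob(y <= q) = sqrt(q)
for q in (0.1, 0.25, 0.5, 0.9):
    print(q, (y <= q).mean(), np.sqrt(q))   # empirical vs. RVT prediction
```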

Now, let us divide the infinitesimal interval [t, t+dt] into n sub-infinitesimals, each of the same length dt/n. The subinfinitesimal intervals will be [t, t_1 = t + dt/n], [t_1, t_2 = t_1 + dt/n], and so on. The f equation for the corresponding x will be

x(t_i) = x(t_{i-1}) + x_i'\bigl(dt/n, (x(t_{i-1}), t_{i-1})\bigr), \qquad i = 1, 2, \ldots, n,
(12)

or

x(t_i) - x(t_{i-1}) = x_i'\bigl(dt/n, (x(t_{i-1}), t_{i-1})\bigr).
(13)

Clearly

\sum_{i=1}^{n} \bigl[x(t_i) - x(t_{i-1})\bigr] = x(t+dt) - x(t) = x'\bigl(dt, (x,t)\bigr).
(14)

Therefore

x'\bigl(dt,(x,t)\bigr) = \sum_{i=1}^{n} x_i'\bigl(dt/n, (x(t_{i-1}), t_{i-1})\bigr).
(15)

This is essentially the compounded Chapman-Kolmogorov equation.

Now we make use of the fact that dt is infinitesimal to begin with. Therefore we can replace x(t_{i-1}) and t_{i-1} with just x and t, so that

x'\bigl(dt,(x,t)\bigr) = \sum_{i=1}^{n} x_i'\bigl(dt/n, (x,t)\bigr).
(16)

This makes the x_i'(dt/n,(x,t)) independent of each other, so that we can apply the Central Limit Theorem to this process. Furthermore, making use of

Lemma 3.2 (Sum of Uncorrelated Variables)

If p and q are uncorrelated random variables and s = p + q in the RVT theorem sense, then

\langle s\rangle = \langle p\rangle + \langle q\rangle \quad\text{and}\quad \mathrm{var}(s) = \mathrm{var}(p) + \mathrm{var}(q).
(17)

...we can immediately write

\langle x'(dt,(x,t))\rangle = \sum_{i=1}^{n} \langle x_i'(dt/n,(x,t))\rangle, \qquad \mathrm{var}\bigl(x'(dt,(x,t))\bigr) = \sum_{i=1}^{n} \mathrm{var}\bigl(x_i'(dt/n,(x,t))\bigr).
(18)

Furthermore, because this is a Markov process, the x_i' all have the same probability density Π(x_i'|dt/n|(x,t)) and therefore the same mean and variance, hence

\langle x'(dt,(x,t))\rangle = n\,\langle x'(dt/n,(x,t))\rangle, \qquad \mathrm{var}\bigl(x'(dt,(x,t))\bigr) = n\,\mathrm{var}\bigl(x'(dt/n,(x,t))\bigr),
(19)

which completes our proof of Equation 1.

Moments and the Kramers-Moyal Equations

The propagator moment functions of the Markov process are defined by

\Pi_n(x,t)\,dt = \int_{\Omega(x')} x'^n\, \Pi\bigl(x' \mid dt \mid (x,t)\bigr)\, dx'.
(20)

Wherefrom we obtain immediately

\Pi_1(x,t)\,dt = \langle x'\rangle = \mu(x,t)\,dt.
(21)

The second moment is related to variance, namely

D(x,t)\,dt = \mathrm{var}(x') = \langle x'^2\rangle - \langle x'\rangle^2 = \Pi_2(x,t)\,dt - \bigl(\mu(x,t)\,dt\bigr)^2 = \Pi_2(x,t)\,dt - O\bigl((dt)^2\bigr).
(22)

But dt is infinitesimal; therefore, dropping powers of dt higher than linear,

\Pi_2(x,t)\,dt = D(x,t)\,dt.
(23)

In summary

\Pi_1(x,t) = \mu(x,t) \quad\text{and}\quad \Pi_2(x,t) = D(x,t).
(24)

The higher moments all vanish in the infinitesimal limit, because they are all of O((dt)^2) or higher. That this is so can be seen by making use of the general expression for the n-th moment of the Gaussian distribution

\langle x^n\rangle = n! \sum_{k=0,2,4,\ldots}^{n} \frac{x_0^{\,n-k}\,\sigma^k}{2^{k/2}\,(n-k)!\,(k/2)!}
(25)

and substituting √(D(x,t)dt) in place of σ and μ(x,t)dt in place of x_0. For example, for the third moment we'll have terms such as (neglecting coefficients)

\bigl(\mu\,dt\bigr)^3, \qquad \mu\,dt \times \left(\sqrt{D\,dt}\right)^2 = \mu\,dt \times D\,dt.
(26)
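
Equation 25 is easy to check numerically; in the sketch below (x₀ and σ are arbitrary placeholder values, chosen by me) it is compared against direct quadrature of xⁿ weighted by the Gaussian density.

```python
import numpy as np
from math import factorial

x0, sigma = 0.7, 1.3   # arbitrary mean and standard deviation

def gaussian_moment(n):
    """Equation 25: the n-th moment of the Gaussian N_{x0, sigma^2}."""
    return factorial(n) * sum(
        x0**(n - k) * sigma**k / (2**(k // 2) * factorial(n - k) * factorial(k // 2))
        for k in range(0, n + 1, 2))

xs = np.linspace(x0 - 12*sigma, x0 + 12*sigma, 200_001)
pdf = np.exp(-(xs - x0)**2 / (2*sigma**2)) / np.sqrt(2*np.pi*sigma**2)
for n in range(1, 6):
    print(n, gaussian_moment(n), np.trapz(xs**n * pdf, xs))   # the columns agree
```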

The Kramers-Moyal equations terminate after two terms only and look as follows

  • forward:
    \begin{align} \frac{\partial}{\partial t} P((x,t)\mid(x_0,t_0)) &= -\frac{\partial}{\partial x}\left[\Pi_1(x,t)\,P((x,t)\mid(x_0,t_0))\right] + \frac{1}{2}\,\frac{\partial^2}{\partial x^2}\left[\Pi_2(x,t)\,P((x,t)\mid(x_0,t_0))\right] \\ &= -\frac{\partial}{\partial x}\left[\mu(x,t)\,P((x,t)\mid(x_0,t_0))\right] + \frac{1}{2}\,\frac{\partial^2}{\partial x^2}\left[D(x,t)\,P((x,t)\mid(x_0,t_0))\right], \end{align}
    (27)
  • backward:
    \begin{align} -\frac{\partial}{\partial t_0} P((x,t)\mid(x_0,t_0)) &= \Pi_1(x_0,t_0)\,\frac{\partial}{\partial x_0} P((x,t)\mid(x_0,t_0)) + \frac{1}{2}\,\Pi_2(x_0,t_0)\,\frac{\partial^2}{\partial x_0^2} P((x,t)\mid(x_0,t_0)) \\ &= \mu(x_0,t_0)\,\frac{\partial}{\partial x_0} P((x,t)\mid(x_0,t_0)) + \frac{1}{2}\,D(x_0,t_0)\,\frac{\partial^2}{\partial x_0^2} P((x,t)\mid(x_0,t_0)), \end{align}
    (28)

which are the forward and the backward Fokker-Planck equations, see Equation 2 and Equation 3.

Introducing

J((x,t)\mid(x_0,t_0)) = \mu(x,t)\,P((x,t)\mid(x_0,t_0)) - \frac{1}{2}\,\frac{\partial}{\partial x}\left[D(x,t)\,P((x,t)\mid(x_0,t_0))\right]
(29)

we can rewrite the forward Fokker-Planck equation as

\frac{\partial}{\partial t} P((x,t)\mid(x_0,t_0)) = -\frac{\partial}{\partial x} J((x,t)\mid(x_0,t_0)),
(30)

which gives J((x,t)|(x_0,t_0)) the interpretation of the probability current density for the continuous Markov process: the change of probability within any enclosed volume is equal to the probability current flow through the boundary of the volume.
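
The conservation form is also a convenient starting point for numerics. The following crude explicit finite-difference sketch is mine (an Ornstein-Uhlenbeck-type drift μ = -kx, constant D, and all parameter values are illustrative assumptions, not from the text): it steps P forward by differencing the current J of Equation 29, and shows that the total probability stays at 1 while the variance relaxes toward the stationary value D/(2k) that will be derived below, in Equation 62.

```python
import numpy as np

k, D = 1.0, 0.5                      # assumed drift mu(x) = -k x, diffusion D
x = np.linspace(-5.0, 5.0, 501)
dx = x[1] - x[0]
dt = 0.1 * dx**2 / D                 # small step, for explicit-scheme stability

P = np.exp(-(x - 1.0)**2 / (2 * 0.01))   # narrow initial packet at x0 = 1
P /= np.trapz(P, x)

for _ in range(20_000):
    J = -k*x*P - 0.5*np.gradient(D*P, dx)   # probability current, Equation 29
    P -= dt * np.gradient(J, dx)            # dP/dt = -dJ/dx, Equation 30

print(np.trapz(P, x))                            # ~ 1: probability is conserved
mean = np.trapz(x*P, x)
print(np.trapz((x - mean)**2 * P, x), D/(2*k))   # variance relaxing toward D/(2k)
```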

The Evolution of the Moments

The Markov moments evolution equation

\frac{d}{dt}\langle x^n(t)\rangle = \sum_{k=1}^{n} \binom{n}{k} \left\langle x^{n-k}(t)\, \Pi_k\bigl(x(t),t\bigr)\right\rangle
(31)

simplifies too, because the sequence cuts off just after two terms:

\frac{d}{dt}\langle x^n(t)\rangle = n\left\langle x^{n-1}\,\Pi_1(x,t)\right\rangle + \frac{n(n-1)}{2}\left\langle x^{n-2}\,\Pi_2(x,t)\right\rangle = n\left\langle x^{n-1}\,\mu(x,t)\right\rangle + \frac{n(n-1)}{2}\left\langle x^{n-2}\,D(x,t)\right\rangle
(32)

with the initial condition

\langle x^n(t_0)\rangle = x_0^{\,n}.
(33)

Specifically, for the mean

\frac{d}{dt}\langle x(t)\rangle = \langle\mu(x,t)\rangle, \qquad \langle x(t_0)\rangle = x_0.
(34)

This is why the drift function μ(x,t)μ(x,t) is sometimes called the drift velocity.

For the variance we have the following evolution equation, which is derived from the moments, given that var(x) = ⟨x²⟩ - ⟨x⟩²:

\frac{d}{dt}\,\mathrm{var}\bigl(x(t)\bigr) = 2\left[\langle x\,\Pi_1\rangle - \langle x\rangle\langle\Pi_1\rangle\right] + \langle\Pi_2\rangle,
(35)

which for the continuous Markov process translates to

\frac{d}{dt}\,\mathrm{var}\bigl(x(t)\bigr) = 2\left[\langle x\,\mu(x,t)\rangle - \langle x\rangle\langle\mu(x,t)\rangle\right] + \langle D(x,t)\rangle, \qquad \mathrm{var}\bigl(x(t_0)\bigr) = 0.
(36)

Finally, the covariance equation reads

\frac{d}{dt_2}\,\mathrm{cov}\bigl(x(t_1), x(t_2)\bigr) = \left\langle x(t_1)\,\Pi_1\bigl(x(t_2),t_2\bigr)\right\rangle - \left\langle x(t_1)\right\rangle\left\langle\Pi_1\bigl(x(t_2),t_2\bigr)\right\rangle,
(37)

which here translates to

\frac{d}{dt_2}\,\mathrm{cov}\bigl(x(t_1),x(t_2)\bigr) = \left\langle x(t_1)\,\mu\bigl(x(t_2),t_2\bigr)\right\rangle - \left\langle x(t_1)\right\rangle\left\langle\mu\bigl(x(t_2),t_2\bigr)\right\rangle, \qquad \mathrm{cov}\bigl(x(t_1),x(t_1)\bigr) = \mathrm{var}\bigl(x(t_1)\bigr).
(38)

Example Processes

  • Liouville Processes: Joseph Liouville (1809–1882) was a French mathematician who taught mathematics and mechanics at the Collège de France. The Liouville process is a continuous Markov process characterized by its diffusion coefficient being zero:
    D(x,t) = 0.
    (39)
    But the Gaussian distribution for σ = 0 is a Dirac delta. Thus the Liouville process is not a real Markov process. It is a deterministic process instead, defined by the ordinary differential equation
    \frac{dx(t)}{dt} = \mu(x,t), \qquad x(t_0) = x_0.
    (40)
    The Markov probability density function P((x,t)|(x_0,t_0)) for the Liouville process remains a Dirac delta throughout the evolution. It is because this is a deterministic process that we can write the differential equation for it. This cannot be done for a real, stochastic Markov process.
  • Wiener Processes: Norbert Wiener (1894–1964) was an American mathematician, a professor at MIT, interested in stochastic and noise processes and their applications to electronic engineering, communication and control systems. Wiener processes arise when the characterizing functions of the continuous Markov process are constants, that is
    \mu(x,t) = \mu, \qquad D(x,t) = D.
    (41)
    Setting
    \mu = 0 \quad\text{and}\quad D = 1
    (42)
    creates an archetypal process of this type, called the special Wiener process, whose propagator density function is
    \Pi(x'\mid dt\mid(x,t)) = \frac{e^{-x'^2/(2\,dt)}}{\sqrt{2\pi\,dt}}.
    (43)
    For a not-necessarily special Wiener process the evolution equations for the moments, Equation 34, Equation 36 and Equation 38, solve trivially to
    \langle x(t)\rangle = x_0 + \mu\,(t-t_0), \qquad \mathrm{var}\bigl(x(t)\bigr) = D\,(t-t_0), \qquad \mathrm{cov}\bigl(x(t_1),x(t_2)\bigr) = D\,(t_1-t_0).
    (44)
    Wiener processes are completely homogeneous. For such processes we have a formula that lets us compute P((x,t)|(x_0,t_0)), namely
    P((x,t)\mid(x_0,t_0)) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik(x-x_0)}\left[\hat\Pi(k\mid dt\mid)\right]^n dk,
    (45)
    where Π̂ is the Fourier transform of Π,
    \hat\Pi(k\mid dt\mid) = \int_{-\infty}^{\infty} \Pi(x'\mid dt\mid)\, e^{-ikx'}\, dx'
    (46)
    and n is the number of divisions of the [t_0, t] interval, each of width dt. It is a truth universally acknowledged that a Fourier transform of a Gaussian is a Gaussian. Specifically,
    \int_{-\infty}^{\infty} e^{-ikx}\,\frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(x-x_0)^2/(2\sigma^2)}\,dx = e^{-ikx_0}\,e^{-k^2\sigma^2/2}.
    (47)
    Consequently,
    \hat\Pi(k\mid dt\mid) = \int_{-\infty}^{\infty} e^{-ikx'}\,\frac{1}{\sqrt{2\pi D\,dt}}\,e^{-(x'-\mu\,dt)^2/(2D\,dt)}\,dx' = e^{-ik\mu\,dt}\,e^{-k^2 D\,dt/2}.
    (48)
    This lets us evaluate the seemingly cumbersome Π̂^n, because n·dt = t - t_0, namely
    \left[\hat\Pi(k\mid dt\mid)\right]^n = \left[e^{-ik\mu\,dt}\,e^{-k^2 D\,dt/2}\right]^n = e^{-ik\mu n\,dt}\,e^{-k^2 D n\,dt/2} = e^{-ik\mu(t-t_0)}\,e^{-k^2 D(t-t_0)/2}.
    (49)
    We use this to rewrite Equation 45
    P((x,t)\mid(x_0,t_0)) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik(x-x_0)}\,e^{-ik\mu(t-t_0)}\,e^{-k^2 D(t-t_0)/2}\,dk = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik\left[(x-x_0)-\mu(t-t_0)\right]}\,e^{-k^2 D(t-t_0)/2}\,dk.
    (50)
    It is a truth universally acknowledged that an inverse Fourier transform of a Gaussian is a Gaussian too. We could use Equation 47 here and just change the sign of k. But with the π factor all over the place, it's easy to make a mistake, so the equation to follow this time is
    \int_{-\infty}^{\infty} e^{ikx}\,e^{-ak^2}\,dk = \sqrt{\frac{\pi}{a}}\,e^{-x^2/(4a)}.
    (51)
    This yields for Equation 50
    P((x,t)\mid(x_0,t_0)) = \frac{1}{\sqrt{2\pi D(t-t_0)}}\,\exp\left[-\frac{\bigl((x-x_0)-\mu(t-t_0)\bigr)^2}{2D(t-t_0)}\right].
    (52)
  • Ornstein-Uhlenbeck Processes: Leonard Ornstein (1880–1941) was a Dutch physicist and a professor at the University of Utrecht. George Eugene Uhlenbeck (1900–1988) was a Dutch physicist who held simultaneous positions at universities in the Netherlands and in the US, but eventually settled in the US permanently. He is best known for his co-discovery of electron spin with Goudsmit in 1925. What is nowadays called the Ornstein-Uhlenbeck process is characterized by
    \mu(x,t) = -kx, \qquad D(x,t) = D.
    (53)
    Whereas the diffusion is, as in the Wiener process, a constant, the drift is not. The equation for the mean is
    \frac{d}{dt}\langle x(t)\rangle = -k\langle x\rangle, \qquad \langle x(t_0)\rangle = x_0,
    (54)
    which solves to
    \langle x(t)\rangle = x_0\,e^{-k(t-t_0)}.
    (55)
    The variance equation is
    \frac{d}{dt}\,\mathrm{var}\bigl(x(t)\bigr) = 2\left[\langle x\mu\rangle - \langle x\rangle\langle\mu\rangle\right] + \langle D\rangle = -2k\left[\langle x^2\rangle - \langle x\rangle^2\right] + D = -2k\,\mathrm{var}(x) + D, \qquad \mathrm{var}\bigl(x(t_0)\bigr) = 0.
    (56)
    This is again a simple non-homogeneous equation. We try as a solution to the full non-homogeneous problem the sum of the homogeneous solution and a constant:
    \mathrm{var}(x) = A\,e^{-2kt} + B.
    (57)
    We can specify the two constants by testing the solution against the full, non-homogeneous equation and the initial condition.
    \frac{d}{dt}\left[A\,e^{-2kt} + B\right] = -2kA\,e^{-2kt} = -2k\left[A\,e^{-2kt} + B\right] + D = -2kA\,e^{-2kt} - 2kB + D.
    (58)
    Thus, to satisfy the equation we have to make
    B = \frac{D}{2k}.
    (59)
    Now, we must choose A so that var(x(t_0)) = 0:
    0 = A\,e^{-2kt_0} + \frac{D}{2k},
    (60)
    wherefrom
    A = -\frac{D}{2k}\,e^{2kt_0}.
    (61)
    Consequently,
    \mathrm{var}(x) = -\frac{D}{2k}\,e^{2kt_0}\,e^{-2kt} + \frac{D}{2k} = \frac{D}{2k}\left[1 - e^{-2k(t-t_0)}\right].
    (62)
    The equation for the covariance is
    \frac{d}{dt_2}\,\mathrm{cov}\bigl(x(t_1),x(t_2)\bigr) = \bigl\langle x(t_1)\,\mu(x(t_2),t_2)\bigr\rangle - \bigl\langle x(t_1)\bigr\rangle\bigl\langle\mu(x(t_2),t_2)\bigr\rangle = -k\left[\langle x(t_1)\,x(t_2)\rangle - \langle x(t_1)\rangle\langle x(t_2)\rangle\right] = -k\,\mathrm{cov}\bigl(x(t_1),x(t_2)\bigr), \qquad \mathrm{cov}\bigl(x(t_1),x(t_2{=}t_1)\bigr) = \mathrm{var}\bigl(x(t_1)\bigr),
    (63)
    the solution of which is
    \mathrm{cov}\bigl(x(t_1),x(t_2)\bigr) = \mathrm{var}\bigl(x(t_1)\bigr)\,e^{-k(t_2-t_1)} = \frac{D}{2k}\left[1 - e^{-2k(t_1-t_0)}\right]e^{-k(t_2-t_1)}.
    (64)
    The solution for P((x,t)|(x_0,t_0)) in this case is a little complicated. We'll write it here without derivation. It can be checked directly against the Fokker-Planck equation; the mean and variance formulas above are also checked numerically in the simulation sketch after this list.
    P((x,t)\mid(x_0,t_0)) = \frac{1}{\sqrt{\pi\,\frac{D}{k}\left[1-e^{-2k(t-t_0)}\right]}}\,\exp\left[-\frac{\left(x - x_0\,e^{-k(t-t_0)}\right)^2}{\frac{D}{k}\left[1-e^{-2k(t-t_0)}\right]}\right].
    (65)
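
Here is the promised numerical check of Equation 55 and Equation 62: a minimal Python sketch (all parameter values are arbitrary illustrative choices of mine, with t_0 = 0) that discretizes the Ornstein-Uhlenbeck process.

```python
import numpy as np

rng = np.random.default_rng(3)
k, D, x0, t = 1.0, 0.5, 2.0, 1.5     # arbitrary values; t0 = 0
n_steps, n_paths = 3000, 20_000
dt = t / n_steps

x = np.full(n_paths, x0)
for _ in range(n_steps):
    # one step: drift -k x dt plus a Gaussian kick of variance D dt
    x += -k*x*dt + np.sqrt(D*dt) * rng.standard_normal(n_paths)

print(x.mean(), x0*np.exp(-k*t))                # Equation 55
print(x.var(),  D/(2*k)*(1 - np.exp(-2*k*t)))   # Equation 62
```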

The Langevin Equation

This is pronounced the French way (the “e” in the middle is not pronounced), since Paul Langevin (1872–1946) was a French physicist. He was a student of Pierre Curie and briefly a lover of Marie Curie, his senior by five years (the affair took place in 1910–1911; Pierre Curie had been killed in a street accident in 1906), which led to a press scandal. He co-invented and patented ultrasonic methods of submarine detection in 1916–1917. He spent WWII under house arrest in Vichy France and after his death was entombed in the French Panthéon.

The Langevin equation describes continuous Markov processes not in terms of P((x,t)|(x_0,t_0)) or Π(x'|dt|(x,t)), but in terms of x' itself. However, x' is the propagator, the random variable of the propagator density function Π(x'|dt|(x,t)), and operations on such variables are tricky. They must be understood in terms of the Random Variable Transformation theorem, not in terms of direct arithmetic. Because the continuous Markov process propagator density function is a Gaussian, we begin by refreshing some properties of the Gaussian random variable and the rules that govern its arithmetic.

As we have remarked already in "The Propagator", a random variable is an argument of a probability density function. To emphasize this we write it down as an ordered pair (x, P(x)). If (x_1, P_1(x_1)) and (x_2, P_2(x_2)) are two independent random variables, and y = f(x_1, x_2), then the probability distribution of y is given by

P_y(y) = \int_{\Omega(x_1)}\int_{\Omega(x_2)} P_1(x_1)\,P_2(x_2)\,\delta\bigl(y - f(x_1,x_2)\bigr)\,dx_1\,dx_2.
(66)

Next, we introduce the following common notation for Gaussians, namely

N_{x_0,\sigma_x^2}(x) \stackrel{\text{def}}{=} \frac{1}{\sqrt{2\pi\sigma_x^2}}\,e^{-(x-x_0)^2/(2\sigma_x^2)}.
(67)

In particular,

N_{0,1} \stackrel{\text{def}}{=} N.
(68)

Now we can write down a number of formulas for operations on Gaussian variables

\bigl(y,\,N_{y_0,\sigma_y^2}\bigr),\ z = \sigma_z\,y + z_0 \;\Longrightarrow\; \bigl(z,\,P_z = N_{\sigma_z y_0 + z_0,\;\sigma_y^2\sigma_z^2}\bigr), \quad\text{specifically}\quad \bigl(y,\,N\bigr),\ z = \sigma_z\,y + z_0 \;\Longrightarrow\; \bigl(z,\,P_z = N_{z_0,\,\sigma_z^2}\bigr),
\bigl(x,\,N_{x_0,\sigma_x^2}\bigr),\ \bigl(y,\,N_{y_0,\sigma_y^2}\bigr),\ z = x + y \;\Longrightarrow\; \bigl(z,\,P_z = N_{x_0+y_0,\;\sigma_x^2+\sigma_y^2}\bigr),
\bigl(x,\,N_{0,1}\bigr),\ \bigl(y,\,N_{0,1}\bigr),\ z = x/y \;\Longrightarrow\; \bigl(z,\,P_z = C_{0,1}\bigr),
(69)

where

C_{z_0,a}(z) \stackrel{\text{def}}{=} \frac{1}{\pi}\,\frac{a}{(z-z_0)^2 + a^2}
(70)

is the Cauchy-Lorentz density. In this case

C_{0,1}(z) = \frac{1}{\pi}\,\frac{1}{z^2+1}.
(71)

Sometimes people write the above equations in a (horrible) form that seems to blur the distinction between random variables and traditionally understood variables. So, for example, Equation 69 would look as follows:

\sigma_z\,N_{y_0,\sigma_y^2} + z_0 = N_{\sigma_z y_0 + z_0,\;\sigma_y^2\sigma_z^2}, \quad\text{specifically}\quad \sigma_z\,N + z_0 = N_{z_0,\,\sigma_z^2}, \qquad N_{x_0,\sigma_x^2} + N_{y_0,\sigma_y^2} = N_{x_0+y_0,\;\sigma_x^2+\sigma_y^2}.
(72)

The probability density function is substituted in place of the random variable y. When one encounters such expressions, one needs to remember that what these equations really mean is Equation 69.
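
The rules of Equation 69 are easily confirmed by sampling. In the sketch below (all parameter values are arbitrary choices of mine), the sum rule is checked through the sample mean and variance, while the ratio rule is checked through quartiles, since C_{0,1} has no mean and no variance; this is precisely why assumption 4 of "The Propagator" excluded the Cauchy distribution.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1_000_000

# sum rule of Equation 69: N_{1,4} + N_{-0.5,2.25} = N_{0.5,6.25}
z = rng.normal(1.0, 2.0, n) + rng.normal(-0.5, 1.5, n)
print(z.mean(), z.var())            # ~ 0.5 and ~ 6.25

# ratio rule of Equation 69: N_{0,1} / N_{0,1} is Cauchy C_{0,1};
# quartiles are used because C_{0,1} has no mean and no variance
r = rng.standard_normal(n) / rng.standard_normal(n)
print(np.percentile(r, [25, 50, 75]))   # ~ [-1, 0, 1]
```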

It is in this spirit that we would write for the Markov propagator

x'(dt\mid(x,t)) = x(t+dt) - x(t)
(73)

or, renaming x' to dx,

dx(dt\mid(x,t)) = x(t+dt) - x(t).
(74)

The propagator dx is associated with the propagator density function N_{μ(x,t)dt, D(x,t)dt}, which we can rewrite as √(D(x,t)dt)·N + μ(x,t)dt. So, we end up with

dx(dt\mid(x,t)) = x(t+dt) - x(t) = \sqrt{D(x,t)\,dt}\;N + \mu(x,t)\,dt,
(75)

wherefrom

x(t+dt) = x(t) + N\,\sqrt{D(x(t),t)\,dt} + \mu(x(t),t)\,dt,
(76)

where we have emphasized that x in the arguments of D and μ is a function of t. This is called the first form of the Langevin equation.

A special Wiener process is a Wiener process for which μ dt = 0 (because μ = 0) and D dt = dt (because D = 1). Therefore for this process

\bigl(dw(dt),\ \Pi(w\mid dt\mid(x,t)) = N_{0,\,dt}(w)\bigr), \qquad\text{that is,}\qquad dw(dt) = \sqrt{dt}\;N.
(77)

This lets us replace √(D(x,t)dt)·N in the Langevin equation with dw(dt)·√(D(x,t)), the √dt factor being absorbed into the special Wiener process propagator. The resulting equation

x(t+dt) = x(t) + dw(dt)\,\sqrt{D(x(t),t)} + \mu(x(t),t)\,dt
(78)

is called the second form of the Langevin equation.
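
As code, the second form is a one-line update rule. Here is a minimal stepping function (a sketch; the function name and the test parameters are my own, not from the text), exercised on a Wiener process so that the result can be compared with Equation 44.

```python
import numpy as np

def langevin_step(x, t, mu, D, dt, rng):
    """One step of Equation 78:
    x(t+dt) = x(t) + dw(dt) sqrt(D(x,t)) + mu(x,t) dt,
    with dw(dt) a special Wiener increment drawn from N_{0,dt}."""
    dw = np.sqrt(dt) * rng.standard_normal(np.shape(x))
    return x + dw * np.sqrt(D(x, t)) + mu(x, t) * dt

rng = np.random.default_rng(4)
mu = lambda x, t: 0.25 + 0.0*x       # constant drift: a Wiener process, Equation 41
D  = lambda x, t: 1.50 + 0.0*x       # constant diffusion
x, t, dt = np.zeros(50_000), 0.0, 1e-3
for _ in range(1000):
    x = langevin_step(x, t, mu, D, dt, rng)
    t += dt
print(x.mean(), 0.25*t)   # Equation 44: <x(t)> = x0 + mu (t - t0)
print(x.var(),  1.50*t)   # Equation 44: var(x(t)) = D (t - t0)
```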

The Schrödinger Equation

In 1966, Edward Nelson, a Princeton professor of mathematics, demonstrated a derivation of the Schrödinger equation from Brownian motion. His seven-page paper, published in Physical Review, vol. 150, no. 4, pp. 1079–1085, presented an introduction to stochastic mechanics, Brownian motion and, eventually, the actual derivation of the Schrödinger equation, based on a number of scattered assumptions. A much simpler derivation, the one we're going to present here, was delivered a year later by Luis de la Peña-Auerbach, a professor of physics at Universidad Nacional de México, in a brief, two-page letter published in Physics Letters, vol. 24A, no. 11, pp. 603–604.

The starting point in de la Peña's derivation is the continuity equation, Equation 30,

\frac{\partial}{\partial t} P((x,t)\mid(x_0,t_0)) = -\frac{\partial}{\partial x} J((x,t)\mid(x_0,t_0)),
(79)

where, as in Equation 29,

J((x,t)\mid(x_0,t_0)) = \mu(x,t)\,P((x,t)\mid(x_0,t_0)) - \frac{1}{2}\,\frac{\partial}{\partial x}\left[D(x,t)\,P((x,t)\mid(x_0,t_0))\right].
(80)

The probability density function P((x,t)|(x_0,t_0)) is real and positive or zero everywhere; therefore we can represent it by

P((x,t)\mid(x_0,t_0)) = e^{2R(x,t)}.
(81)

Why 2R instead of just R will become clear later. It'll make our equations down the road prettier. Of course, R has to be such that

\int_{-\infty}^{\infty} e^{2R(x,t)}\,dx = 1
(82)

for all t.

We can rewrite Equation 80 in terms of R too:

\begin{align}
J((x,t)\mid(x_0,t_0)) &= \mu(x,t)\,e^{2R(x,t)} - \frac{1}{2}\,\frac{\partial}{\partial x}\left[D(x,t)\,e^{2R(x,t)}\right] \\
&= \mu(x,t)\,e^{2R(x,t)} - \frac{1}{2}\left[e^{2R(x,t)}\,\frac{\partial D(x,t)}{\partial x} + 2D(x,t)\,e^{2R(x,t)}\,\frac{\partial R(x,t)}{\partial x}\right] \\
&= e^{2R(x,t)}\left[\mu(x,t) - \frac{1}{2}\,\frac{\partial D(x,t)}{\partial x} - D(x,t)\,\frac{\partial R(x,t)}{\partial x}\right],
\end{align}
(83)

wherefrom we derive

v(x,t) = \mu(x,t) - \frac{1}{2}\,\frac{\partial D(x,t)}{\partial x} - D(x,t)\,\frac{\partial R(x,t)}{\partial x},
(84)

the expression for current velocity, given that

J = \rho\cdot v
(85)

normally, where in our case

\rho = P((x,t)\mid(x_0,t_0)) = e^{2R(x,t)}.
(86)

We return to Equation 79, substituting Equation 81 and Equation 84, which yields

\frac{\partial}{\partial t} P((x,t)\mid(x_0,t_0)) = \frac{\partial}{\partial t}\,e^{2R(x,t)} = 2e^{2R(x,t)}\,\frac{\partial R(x,t)}{\partial t}
(87)

on the left side and

-\frac{\partial}{\partial x}\left[e^{2R(x,t)}\,v(x,t)\right] = -2e^{2R(x,t)}\,v(x,t)\,\frac{\partial R(x,t)}{\partial x} - e^{2R(x,t)}\,\frac{\partial v(x,t)}{\partial x}
(88)

on the right which, upon division by 2e^{2R(x,t)}, yields

\frac{\partial R(x,t)}{\partial t} = -v(x,t)\,\frac{\partial R(x,t)}{\partial x} - \frac{1}{2}\,\frac{\partial v(x,t)}{\partial x}.
(89)

So far we have merely rewritten the Fokker-Planck equation using R(x,t) and v(x,t); Equation 89 is fully equivalent to the forward Fokker-Planck equation. But now we introduce an assumption that goes beyond Fokker-Planck (and beyond Markov). We assume that

v = \alpha\,\frac{\partial S(x,t)}{\partial x}.
(90)

If we were to carry out this reasoning in 3D, the assumption would be

v(\mathbf{r},t) = \alpha\,\nabla S(\mathbf{r},t).
(91)

This would imply that the velocity field is curl free

\nabla\times v(\mathbf{r},t) = 0.
(92)

This is also true in the opposite direction, that is, any continuous, differentiable and curl-free velocity field can be represented by a gradient. A vector field that is a gradient of a function is called conservative. Conservative vector fields have various interesting properties including path independence. The line integral of such a field from one point to another depends on end points only, and not on the path taken.

Having made this assumption, we combine R(x,t) and S(x,t) into a single complex-valued function

\Psi(x,t) = e^{R(x,t) + iS(x,t)},
(93)

where R and S are both real. It is easy to see that

\bar\Psi(x,t)\,\Psi(x,t) = e^{R(x,t) - iS(x,t)}\,e^{R(x,t) + iS(x,t)} = e^{2R(x,t)} = P((x,t)\mid(x_0,t_0)),
(94)

so Ψ(x,t) is the probability amplitude. The above also explains why we chose to have the 2 in e^{2R(x,t)}.

To derive a differential equation for Ψ we differentiate it first:

\begin{align}
\frac{\partial\Psi}{\partial t} &= \Psi\left[\frac{\partial R}{\partial t} + i\,\frac{\partial S}{\partial t}\right], \\
\frac{\partial\Psi}{\partial x} &= \Psi\left[\frac{\partial R}{\partial x} + i\,\frac{\partial S}{\partial x}\right], \\
\frac{\partial^2\Psi}{\partial x^2} &= \frac{\partial\Psi}{\partial x}\left[\frac{\partial R}{\partial x} + i\,\frac{\partial S}{\partial x}\right] + \Psi\left[\frac{\partial^2 R}{\partial x^2} + i\,\frac{\partial^2 S}{\partial x^2}\right] \\
&= \Psi\left[\frac{\partial R}{\partial x} + i\,\frac{\partial S}{\partial x}\right]^2 + \Psi\left[\frac{\partial^2 R}{\partial x^2} + i\,\frac{\partial^2 S}{\partial x^2}\right] \\
&= \Psi\left[\left(\frac{\partial R}{\partial x}\right)^2 + 2i\,\frac{\partial R}{\partial x}\frac{\partial S}{\partial x} - \left(\frac{\partial S}{\partial x}\right)^2\right] + \Psi\left[\frac{\partial^2 R}{\partial x^2} + i\,\frac{\partial^2 S}{\partial x^2}\right],
\end{align}
(95)

wherefrom

\Psi\,\frac{\partial R}{\partial t} = \frac{\partial\Psi}{\partial t} - i\,\Psi\,\frac{\partial S}{\partial t}.
(96)

This tells us that by multiplying both sides of Equation 89 by Ψ(x,t) we'll obtain an equation for Ψ itself:

\Psi\,\frac{\partial R}{\partial t} = \frac{\partial\Psi}{\partial t} - i\,\Psi\,\frac{\partial S}{\partial t} = -\Psi\left[v\,\frac{\partial R}{\partial x} + \frac{1}{2}\,\frac{\partial v}{\partial x}\right] = -\alpha\,\Psi\left[\frac{\partial S}{\partial x}\,\frac{\partial R}{\partial x} + \frac{1}{2}\,\frac{\partial^2 S}{\partial x^2}\right].
(97)

Now we make use of the expression for the second derivative of Ψ given by Equation 95 to rewrite the right side of this equation in terms of ∂²Ψ/∂x² minus stuff we don't want:

-\alpha\,\Psi\left[\frac{\partial S}{\partial x}\,\frac{\partial R}{\partial x} + \frac{1}{2}\,\frac{\partial^2 S}{\partial x^2}\right] = -\frac{\alpha}{2i}\left[\frac{\partial^2\Psi}{\partial x^2} - \Psi\left(\left(\frac{\partial R}{\partial x}\right)^2 - \left(\frac{\partial S}{\partial x}\right)^2 + \frac{\partial^2 R}{\partial x^2}\right)\right].
(98)

We combine this result with Equation 97, lift the imaginary unit from the denominator to the numerator (which changes the sign) and transfer -iΨ ∂S/∂t to the right side of the equation to obtain

\frac{\partial\Psi}{\partial t} = \frac{i\alpha}{2}\,\frac{\partial^2\Psi}{\partial x^2} - i\,\Psi\left[\frac{\alpha}{2}\left(\left(\frac{\partial R}{\partial x}\right)^2 - \left(\frac{\partial S}{\partial x}\right)^2 + \frac{\partial^2 R}{\partial x^2}\right) - \frac{\partial S}{\partial t}\right],
(99)

which, upon multiplication of both sides by i, becomes

i\,\frac{\partial\Psi}{\partial t} = -\frac{\alpha}{2}\,\frac{\partial^2\Psi}{\partial x^2} + V(x,t)\,\Psi(x,t),
(100)

where

V(x,t) = \frac{\alpha}{2}\left[\left(\frac{\partial R}{\partial x}\right)^2 - \left(\frac{\partial S}{\partial x}\right)^2 + \frac{\partial^2 R}{\partial x^2}\right] - \frac{\partial S}{\partial t}.
(101)

Equation 100 is the celebrated Schrödinger equation with a funny potential that depends on the probability density itself through R and on the velocity of the probability current through S.

In summary, we find that the Schrödinger equation is an equation of the continuous Markov process theory with some additional assumptions, one of which, given by Equation 90, makes the probability current velocity a conservative field. But we have made another non-trivial assumption on the way, about which we're going to say more towards the end of this section. The actual coefficient in the quantum Schrödinger equation is

\alpha = \frac{\hbar}{m},
(102)

but we should remember that the Schrödinger equation is not limited to quantum physics. For example, the nonlinear Schrödinger equation,

i\,\frac{\partial\Psi}{\partial t} = -\frac{1}{2}\,\frac{\partial^2\Psi}{\partial x^2} + \kappa\,|\Psi(x,t)|^2\,\Psi(x,t),
(103)

which we might say is slightly similar to Equation 100, on account of V in Equation 101 being a complicated function of Ψ, occurs in Manakov systems in fiber optics and in hydrodynamics, where it describes the formation of freak waves. So α can be anything, depending on the context. The Schrödinger equation is an equation of mathematics that shows up in various situations.

We are now going to show that the definition of the potential given by Equation 101 is consistent with the true quantum Schrödinger equation. Let Ψ(x,t) be an arbitrary solution of such an equation. We will demonstrate that this Ψ(x,t) is also a solution of Equation 100 with V defined by Equation 101, assuming that the Fokker-Planck equation in the form given by Equation 89 is also satisfied. So, for example, the seeming nonlinearity of Equation 100 should not exclude superpositions of various solutions of the “normal” Schrödinger equation.

We begin by substituting Ψ(x,t) = e^{R(x,t)+iS(x,t)} into

i\hbar\,\frac{\partial\Psi}{\partial t} = -\frac{\hbar^2}{2m}\,\frac{\partial^2\Psi}{\partial x^2} + V(x,t)\,\Psi(x,t),
(104)

where V(x,t) is an arbitrary function, not related to (or dependent on) R and S explicitly. In particular, we do not assume that it is given by Equation 101. First we find that upon such a substitution

\begin{align}
\frac{\partial\Psi}{\partial t} &= \Psi\left[\frac{\partial R}{\partial t} + i\,\frac{\partial S}{\partial t}\right], \\
\frac{\partial\Psi}{\partial x} &= \Psi\left[\frac{\partial R}{\partial x} + i\,\frac{\partial S}{\partial x}\right], \\
\frac{\partial^2\Psi}{\partial x^2} &= \Psi\left[\left(\frac{\partial R}{\partial x}\right)^2 + 2i\,\frac{\partial R}{\partial x}\frac{\partial S}{\partial x} - \left(\frac{\partial S}{\partial x}\right)^2 + \frac{\partial^2 R}{\partial x^2} + i\,\frac{\partial^2 S}{\partial x^2}\right].
\end{align}
(105)

Now we plug this into Equation 104 to find

i\hbar\,\Psi\left[\frac{\partial R}{\partial t} + i\,\frac{\partial S}{\partial t}\right] = -\frac{\hbar^2}{2m}\,\Psi\left[\left(\frac{\partial R}{\partial x}\right)^2 + 2i\,\frac{\partial R}{\partial x}\frac{\partial S}{\partial x} - \left(\frac{\partial S}{\partial x}\right)^2 + \frac{\partial^2 R}{\partial x^2} + i\,\frac{\partial^2 S}{\partial x^2}\right] + V\Psi.
(106)

For this to be true at all points where Ψ(x,t) ≠ 0 we must have

-\hbar\,\frac{\partial S}{\partial t} = -\frac{\hbar^2}{2m}\left[\left(\frac{\partial R}{\partial x}\right)^2 - \left(\frac{\partial S}{\partial x}\right)^2 + \frac{\partial^2 R}{\partial x^2}\right] + V, \qquad \hbar\,\frac{\partial R}{\partial t} = -\frac{\hbar^2}{m}\,\frac{\partial R}{\partial x}\frac{\partial S}{\partial x} - \frac{\hbar^2}{2m}\,\frac{\partial^2 S}{\partial x^2},
(107)

where the top equation corresponds to the real part of Equation 106 and the bottom equation to the imaginary part of Equation 106. Moving all but V in the top equation to its left side we obtain Equation 101. In turn, substituting

\frac{\hbar}{m}\,\frac{\partial S}{\partial x} = v
(108)

in the bottom equation yields

\frac{\partial R}{\partial t} = -v\,\frac{\partial R}{\partial x} - \frac{1}{2}\,\frac{\partial v}{\partial x},
(109)

which is the same as Equation 89. This is why we can call the function defined by Equation 101 a potential, V.
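
This consistency argument can be verified mechanically. The following SymPy sketch (my own check, not from the text) takes V from the real part and ∂R/∂t from the imaginary part of Equation 107, substitutes Ψ = e^{R+iS} into the quantum Schrödinger equation, and finds that the residual simplifies to zero.

```python
import sympy as sp

x, t = sp.symbols('x t', real=True)
hbar, m = sp.symbols('hbar m', positive=True)
R, S = sp.Function('R')(x, t), sp.Function('S')(x, t)
Psi = sp.exp(R + sp.I*S)

# V from the top (real) equation of (107)
V = -hbar*sp.diff(S, t) + hbar**2/(2*m)*(
        sp.diff(R, x)**2 - sp.diff(S, x)**2 + sp.diff(R, x, 2))
# dR/dt from the bottom (imaginary) equation of (107)
Rt = -hbar/m*sp.diff(R, x)*sp.diff(S, x) - hbar/(2*m)*sp.diff(S, x, 2)

residual = (sp.I*hbar*sp.diff(Psi, t)
            + hbar**2/(2*m)*sp.diff(Psi, x, 2) - V*Psi) / Psi
residual = residual.subs(sp.Derivative(R, t), Rt)
print(sp.simplify(sp.expand(residual)))   # prints 0
```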

We are now going to have an even closer look at the connection between the Fokker-Planck equation and the quantum Schrödinger equation and we will discover that in its construction we have unwittingly introduced two crucial assumptions that alter the physics of the underlying stochastic system dramatically.

Let us consider a ring of radius r with a constant Markovian probability density P and a constant probability current J. Equation 85 and Equation 86 imply that the probability current velocity v must be constant too. Since

v = \frac{\hbar}{m}\,\frac{\partial S}{\partial x}
(110)

we find that

S = \frac{mv}{\hbar}\,x + f(t).
(111)

Let us introduce k such that ℏk = mv; then

S = kx + f(t)
(112)

We are going to search for a solution of the Schrödinger-de la Peña equation such that V(x,t) = 0, because then a constant P would satisfy the remainder of the equation, as we well know from conventional quantum mechanics. Because P is constant, R is constant too; therefore ∂R/∂x = ∂²R/∂x² = 0. This way Equation 101 becomes

0 = -\frac{\hbar k^2}{2m} - \frac{df(t)}{dt},
(113)

which implies that

f(t) = -\frac{\hbar k^2}{2m}\,t + C,
(114)

where C is an additive constant and we can make it zero. Let us introduce ω such that

\hbar\omega = \frac{(\hbar k)^2}{2m},
(115)

then f(t) = -ωt and

S = kx - \omega t.
(116)

But now we have a problem: we are on a ring, so for t = 0, as we increase x from x = 0 the function S would become multi-valued, first at x = 2πr, and then everywhere else. Luckily, S enters Ψ through e^{iS}, that is, through

\cos S + i\,\sin S.
(117)

Therefore if we choose k such that

k\cdot 2\pi r = 2\pi n
(118)

both cos S and sin S will wrap onto themselves on having circumnavigated the ring, wherefrom we find that setting

k = \frac{n}{r}
(119)

ensures that Ψ at least is going to remain a well defined, single valued function on the ring. Because ℏω = (ℏk)²/(2m), this condition yields

\hbar\omega = \frac{\hbar^2 n^2}{2m r^2}.
(120)

In turn Equation 119 tells us that the velocity of the probability current in the ring must be quantized, too:

v = \frac{\hbar k}{m} = \frac{\hbar n}{m r}.
(121)

How did we get from the constant probability density and the constant probability current in the ring on the Fokker-Planck side to the quantized velocity on the Schrödinger side? Clearly, we must have added something on the way to get this result. The physics obtained from the Schrödinger equation is quite different from physics that corresponds to the Fokker-Planck equation in this case.

Two additional assumptions we've made on the way are responsible for the change. The first assumption was that

v = \alpha\,\frac{\partial S}{\partial x}
(122)

and this, together with Equation 101 led to

S = kx - \omega t,
(123)

where ω and k are related to each other by Equation 115. But this by itself does not give us the quantization. If we were to use S alone, we would have to conclude that k = ω = 0 is the only solution that would work on the ring.

The second assumption, however, that S should enter Ψ as e^{iS}, has made S into the phase of a wave. This is a new physical assumption. This is not just a substitution we can make willy-nilly. Together, the two assumptions, both outside the Markov process theory, introduce new physics described by

e^{i(kx - \omega t)}, \quad\text{where}\quad \hbar\omega = \frac{(\hbar k)^2}{2m}.
(124)

This is de Broglie's original 1924 postulate, the subject of his doctoral thesis. It is the addition of this postulate, a non-trivial step, that turns the Fokker-Planck equation into the Schrödinger equation.
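
To put numbers to the quantization, here is a tiny worked example (the electron mass is the standard value; the 1 nm ring radius is an arbitrary illustrative choice of mine) evaluating Equation 119, Equation 121 and ℏω = (ℏk)²/(2m) for the lowest few n.

```python
hbar = 1.054571817e-34            # J s
m, r = 9.1093837e-31, 1.0e-9      # electron mass (kg); ring radius 1 nm (assumed)

for n in range(1, 4):
    k = n / r                     # Equation 119
    v = hbar * k / m              # Equation 121: quantized current velocity
    omega = (hbar * k)**2 / (2 * m * hbar)   # from hbar omega = (hbar k)^2 / (2m)
    print(n, v, omega)            # v ~ 1.16e5 m/s for n = 1
```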

Brownian Motion

At this stage you should hop to module m44376, which discusses the integral of a Markov process. There is some tedium involved, unfortunately, that we tried to stave off until absolutely necessary, and now... it is necessary. Having worked through m44376, come back here.

The Langevin-equation-based analysis of Brownian motion is remarkably simple. But it is so only because we've covered so much material already. Capitalizing on this, we can derive the basic Brownian motion relationships in just a few lines.

The starting point is Newton's equation describing the movement of a Brownian particle of mass m and velocity v in a viscous fluid:

m\,\frac{dv}{dt} = -\gamma v,
(125)

where γ is the viscosity coefficient. We can rewrite this equation in the following form:

v(t+dt) = v(t) - \frac{\gamma}{m}\,v(t)\,dt.
(126)

But if the particle is small enough to feel individual hits of the fluid's molecules, its velocity becomes randomized. We assume that the randomization can be expressed in terms of a Gaussian random variable

N_{c\,dt}(v) = N_{0,\,c\,dt}(v) = \frac{1}{\sqrt{2\pi\,c\,dt}}\,e^{-v^2/(2c\,dt)},
(127)

where c is a constant to be determined. In effect v itself becomes a random variable, this time characterized by

v(t+dt) = v(t) - \frac{\gamma}{m}\,v(t)\,dt + N_{c\,dt}.
(128)

We compare this against Equation 76, the Langevin equation in its first form, to find that we're dealing with the continuous Markov process characterized by

\mu(v,t) = -\frac{\gamma}{m}\,v(t) \quad\text{and}\quad D(v,t) = c.
(129)

Because the diffusion function of the process is constant, this is an Ornstein-Uhlenbeck process, for which we laboriously evaluated expressions for the mean, Equation 55, and the variance, Equation 62, namely

\langle x(t)\rangle = x_0\,e^{-k(t-t_0)} \quad\text{and}\quad \mathrm{var}(x) = \frac{D}{2k}\left[1 - e^{-2k(t-t_0)}\right].
(130)

In our case, substituting t_0 → 0, x → v, k → γ/m and D → c, we obtain

\langle v(t)\rangle = v_0\,e^{-\gamma t/m} \quad\text{and}\quad \mathrm{var}\bigl(v(t)\bigr) = \frac{cm}{2\gamma}\left[1 - e^{-2\gamma t/m}\right].
(131)

The important observation is that it is the velocity of the particle, not its position, that is subject to the Markov process. The particle's position is the integral of the Markov process. To find expressions for this in the case of the Ornstein-Uhlenbeck process we need to look up module m44376, which tells us that

\langle s(t)\rangle = \frac{x_0}{k}\left[1 - e^{-k(t-t_0)}\right], \qquad \mathrm{var}\bigl(s(t)\bigr) = \frac{D}{k^2}\left[(t-t_0) - \frac{2}{k}\left(1 - e^{-k(t-t_0)}\right) + \frac{1}{2k}\left(1 - e^{-2k(t-t_0)}\right)\right].
(132)

Translating the constants to the Brownian motion case yields

\langle x(t)\rangle = v_0\,\frac{m}{\gamma}\left[1 - e^{-\gamma t/m}\right], \qquad \mathrm{var}\bigl(x(t)\bigr) = \frac{cm^2}{\gamma^2}\left[t - \frac{2m}{\gamma}\left(1 - e^{-\gamma t/m}\right) + \frac{m}{2\gamma}\left(1 - e^{-2\gamma t/m}\right)\right].
(133)

We determine the constant c by taking the limit t → ∞ and assuming that in this limit the Brownian particle becomes thermalized, its mean velocity zero, and its mean kinetic energy

\frac{m\langle v^2\rangle}{2} = \frac{m\,\mathrm{var}(v)}{2} = \frac{k_B T}{2}.
(134)

It is k_B T/2 because we have only one degree of freedom here, and it's k_B T/2 per degree of freedom. Hence

m\,\frac{cm}{2\gamma} = k_B T,
(135)

wherefrom

c = \frac{2\gamma k_B T}{m^2},
(136)

wherefrom, in the t → ∞, t ≫ m/γ limit,

\mathrm{var}(x) = \frac{cm^2}{\gamma^2}\,t = \frac{2\gamma k_B T}{m^2}\,\frac{m^2}{\gamma^2}\,t = \frac{2k_B T}{\gamma}\,t,
(137)

wherefrom

\sigma_x = \sqrt{\frac{2k_B T}{\gamma}\,t}.
(138)

This is Einstein's famous 1905 formula (the mean displacement of the particle is proportional to the square root of time), confirmed later by Chaudesaigues in 1908 and Perrin in 1909.
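
The whole chain can be checked end to end in a few lines. In the sketch below (m = γ = k_B T = 1 in arbitrary units; all choices are mine) the velocity process of Equation 128 is simulated with c taken from Equation 136, the position accumulates the velocity, and var(x) is compared against the exact Equation 133 and the asymptotic Einstein formula 2k_B T t/γ.

```python
import numpy as np

rng = np.random.default_rng(6)
m, gamma, kBT = 1.0, 1.0, 1.0        # arbitrary units
c = 2 * gamma * kBT / m**2           # Equation 136
dt, n_steps, n_paths = 1e-3, 50_000, 2_000

v = np.zeros(n_paths)
x = np.zeros(n_paths)
for _ in range(n_steps):
    v += -gamma/m * v * dt + np.sqrt(c*dt) * rng.standard_normal(n_paths)  # Eq. 128
    x += v * dt                      # the position is the integral of the velocity

t = n_steps * dt                     # t = 50, well past m/gamma = 1
exact = (c*m**2/gamma**2) * (t - 2*m/gamma*(1 - np.exp(-gamma*t/m))
                             + m/(2*gamma)*(1 - np.exp(-2*gamma*t/m)))
print(x.var(), exact, 2*kBT*t/gamma)   # simulated vs. Equation 133 vs. Einstein
```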
