The famous mathematician Bernhard Riemann did some novel and
exciting work in 1859 on the old question of how many primes there are
below a given number, which is usually written
πn
π
n
This function is equivalent to asking, "what is the value of the
kk-th prime for any given kk?" The latter is actually a more
specific question since an arbitrary
nn will in general fall between
primes. Indeed, the notation is most curious because
ππ refers to the number of primes up to
nn, not to any prime per se, while
nn actually refers to the values of
the primes (since ππ changes by unity
every time
nn
passes through a prime value. Thus if
nn is a prime, it is
the ππ-th prime, pretty much the inverse
of defining a function
pn
p
n
where pp is the value of the
nn-th prime. We will encounter
this inversion as a practical matter later. However, it is a testiment
to Riemann's genius that he derived closed expressions that gave
the exact number of primes. For example, the 4th prime is 7, but
if you chose n=8,9,10n8,9,10, Riemann's formula will give the same
answer (4) for each of these values for nn. For this reason, the
notation seems reversed, but actually makes sense.
The PNT as a theorem asserts that the following interpolation
formulas, which attempt to smoothly fit onto the trends in the
distribution of the primes, are asymptotically exact.
An early estimate was that
πn
π
n
was approximately
nlnn
n
n
This estimate systematically underestimates the numbers of
primes, except possibly for staggeringly large values of
nn. As a theorem it would state
that in the limit of nn going to
infinity, the ratio of
πn
π
n
to
nlnn
n
n
is unity. The theorem is of particular interest to
mathematicians in that it follows from an as-yet unproven
conjecture by Riemann on the distribution of zeros of the zeta
function, a complicated topic which is not required for our
simple analysis here.
An improved approximation is
Lin
Li
n
where LiLi, the log integral
function is defined to be
∫2n1lntdt
t
2
n
1
t
This estimate is substantially better and systematically
underestimates the number of primes (again until one gets to
staggeringly large values). Notice that the first approximation
results from considering the log term to be slowly varying and
taking it out of the integral to get the (crude) estimate of
nlnn
n
n
.
The log integral function diverges at
n=1
n
1
,
but this has no relevance to its applications at large
nn.
Some large values (from Derbyshire, p. 116).
| nn |
ππ |
Li-π
Li
π
|
|
108
10
8
|
5,761,455 |
754 |
|
109
10
9
|
50,847,534 |
1,701 |
|
1010
10
10
|
455,052,511 |
3,104 |
|
1011
10
11
|
4,118,054,813 |
11,588 |
|
1012
10
12
|
37,607,912,018 |
38,263 |
|
1013
10
13
|
346,065,536,839 |
108,971 |
|
1014
10
14
|
3,204,941,750,802 |
314,890 |
Anyone used to examining errors for trends would see that the
steady increase in the difference
Li-π
Li
π
by about a factor of 3 per decade in
nn suggests that there is a scaling
correction that needs to be applied.
The method of choice to calculate a list of primes is to use
the "sieve" of Eratosthenes. Here one starts with a list the
natural numbers. Then starting with 2, one removes every even
number. The next number is 3, so now one removes every third
number (half of these have already been removed by the 2).
Here it is easiest to simply replace the values with zero,
rather than literally removing the numbers. Now the next
(non-zero) number is 5, and we set every fifth number to zero.
Then 7 and then 11. By eleven, the first 121 non-zero numbers
will all be primes (i.e., 11 squared). The alternative of
testing each successive number to see if a smaller prime
divided it would be hopelessly inefficient.
Here is a simple MATLAB program (easily translated into the
program of your choice):
1 %make-prime routine using sieve
2 N = 100; %largest possible prime in list
3 rpr=linspace(1,N,N); %starting list=all numbers up to N
4 nextp = 2; %next prime to remove from list (this is ALSO its location)
5 for j=2:N
6 if (nextp*nextp)<N %quit once all remaining nonzeros are primes
7 for k=2:N %starts at 2 to preserve the prime itself
8 if nextp*k <= N %don't exceed length of vector
9 rpr(nextp*k) = 0; %the sieve!
10 end
11 end
12 for n = 1:N
13 if rpr(nextp+n) ~= 0 %run up list until first nonzero value
14 nextp = rpr(nextp+n); %update nextp
15 break %stop looking further, otherwise exit at N
16 end
17 end
18 primes=[ ]; %start with empty vector
19 for n=2:length(rpr) %now list the prime values only, excluding "1"
20 if (rpr(n)~=0)
21 primes=[primes,rpr(n)]; %add to existing vector of primes
22 end
23 end
Here one simply chooses the value of
NN, and after executing the
program, the vector "primes" will contain the list: 2; 3; 5; 7; 11; 13; ...; 97 (if N=100N100).
Note that the "sieve" itself never removes "1" and this number
has all of the properties of a prime (not divisible by any
other number besides 1 and itself). But it has no use in the
unique factoring of natural (whole) numbers into primes, and
is so excluded.
This program has been checked against the list given in
Abramowitz and Stegun (the primes up to 99,991, which are
roughly the first 10,000 primes).
The sieve suggests a simple way of estimating the distribution
of primes. After any given prime, there will be a density of
non-zero numbers remaining. For example, after 2, half the
remaining numbers will be nonzero. After the 3, one removes
1313 but 1212 are already gone, so we get
12-16=13
12
16
13
left. Since there are an infinite number of numbers left, it
is easier to think in terms of remaining blocks of numbers;
after removing 2 and 3, we have blocks of
2×3=6
2
3
6
,
and in any arbitrary block of 6 consecutive numbers beyond the
3, there will be exactly 2 non-zero numbers. If we go to 5,
the blocks become
2×3×5=30
2
3
5
30
,
and there will remain exactly 8 non-zero numbers in any block
of 30 numbers beyond 5. It is easy to verify that this
density evolves quite systematically. After each successive
prime pp, the average density per
block drops by exactly a factor of
p-1p
p
1
p
This density then corresponds to a mean distance between
primes, since the next prime is chosen from the first non-zero
number in the following block, and we can guess this distance
from the mean density. In particular, after each successive
prime pp, the density decreases
by
p-1p
p
1
p
and the mean distance correspondingly increases by
pp-1
p
p
1
.
Let us call this mean distance after the
kk-th prime
Δ
k
Δ
k
,
while the kk-th prime itself we
would call
P
k
P
k
.
It follows then that the next prime will be approximately
P
k
+
1
=
P
k
+
Δ
k
P
k
+
1
P
k
Δ
k
but given this new prime, the mean step size increases to
Δ
k
+
1
=
Δ
k
P
k
+
1
P
k
+
1
-1
Δ
k
+
1
Δ
k
P
k
+
1
P
k
+
1
1
But now we are all done. If we chose the first prime
(
P
1
P
1
)
and the distance to the next prime
(
Δ
1
Δ
1
),
we can simply iterate these relationships indefinetely. Indeed, we know
these values:
P
1
=2
P
1
2
and
Δ
1
=1
Δ
1
1
A MATLAB program would simply be
1 %recursion on steps to generate prime number counts
2 N=9500; %primes up to this number
3 pr(1) = 2; %initial prime
4 del(1) = 1; %intial step
5 for k= 1:N
6 pr(k+1) = pr(k) + del(k);
7 del(k+1) = del(k)*pr(k+1)/(pr(k+1)-1);
8 end
We chose
N=9500
N
9500
because this available on the tabulated lists, or if you created the
list of primes, you should get
primes9500=98,947
primes
9500
98,947
.
In contrast, the above program gives 102,014, which is too large by only about 3%.
This result is rather astonishingly good given that the initial primes and steps between primes are anything but regular.
It is easy to rearrange the equations as difference equations and then
write them as differential equations. First we write
P
k
+
1
-
P
k
=
Δ
k
P
k
+
1
P
k
Δ
k
and
Δ
k
+
1
-
Δ
k
=
Δ
k
1
P
k
+
1
-1
Δ
k
+
1
Δ
k
Δ
k
1
P
k
+
1
1
For large kk, we can approximate these as
ddkPk=Δk
k
P
k
Δ
k
and
ddkΔk=ΔkPk
k
Δ
k
Δ
k
P
k
One might worry about simplifying
P
k
+
1
-1
P
k
+
1
1
to
Pk
P
k
,
but the "corrections" are tiny, and unimportant as we will show.
The ratio is
ddΔP=P
Δ
P
P
which integrates to give
P=
P
1
ⅇΔ-
Δ
1
P
P
1
Δ
Δ
1
which then gives
∫1Ndk=∫
P
1
P1
Δ
1
+logP
P
1
dP
k
1
N
P
P
1
P
1
Δ
1
P
P
1
Notice that this is in fact the log integral function of the improved PNT. However, because of the curious choice of notation, the natural symbols for ππ (here NN) and nn (here PP) are reversed.
Notice that the
Δ
1
Δ
1
in the denominator can be absorbed into the
logP
P
1
P
P
1
and the rescaling PP moves these terms to the limits on the integral. Effectively, this simply rescales
Lin
Li
n
.
Accordingly, we can chose a
Δ
1
Δ
1
such that the curve is through the primes instead of being always off to one side. Such a rescaling makes perfect sense given that the PNT is just a fit to the primes and such a fit has to be global. Thus the fit is not obliged to either pass exactly through the prime 2 nor have the initial step size be exactly 1. Somehow the PNT derivation of
Lin
Li
n
has effectively slipped in an implicit assumption that
Δ
1
=log
P
1
Δ
1
P
1
(e.g.,
Δ
1
=log2=0.6931
Δ
1
2
0.6931
...).
In fact, choosing
Δ
1
=0.625
Δ
1
0.625
provides a better fit that at large kk gives an estimate for the value of the primes that oscillates about the correct values, being off a few percent to either side. This oscillation shows that there is little point in trying to do "better" (e.g., trying to solve a more exact differential equation above; in fact we just iterated instead). A significant improvement could be gotten by letting
Δ
1
Δ
1
be a weak function of kk.
Notice that we have not proven the PNT, we have just derived the same asymptotic functions cited in the proofs. One can create a number of interesting relationships using these approximate expressions and the PNT itself.
For example, if we integrate the log integral function between
two (large) successive primes, the result should be just unity
(i.e., one new prime). Since the
log
P
k
P
k
denominator will hardly change, the integral itself will be approximately
1=
P
k
+
1
-
P
k
log
P
k
1
P
k
+
1
P
k
P
k
(1)
which can be rewritten as the recursion relation
P
k
+
1
=
P
k
+log
P
k
P
k
+
1
P
k
P
k
(2)
which must be one of the simplest possible recursion relations for primes.
If we start it with the obviously simplest choice
P
1
=2
P
1
2
,
we get
Figure 1.
Another step is to replace
P
k
+
1
-
P
k
P
k
+
1
P
k
with
Δ
k
Δ
k
,
namely the mean distance between primes. This gives us
limk→∞log
P
k
1-1
P
1
1-1
P
2
...1-1
P
k
=const
k
k
P
k
1
1
P
1
1
1
P
2
...
1
1
P
k
const
and the result for the first million primes is plotted in
Figure 2.
The constant is actually
ⅇ-γ
γ
where gamma is Euler's constant, 0.5772157... and the exponential is 0.56145946... and we can see the convergence to this value. This result was derived in 1874 and is known as
Merten's theorem (but the proof was based on Riemann's
(1859) paper and not on a causal idea of a "mean distance between primes").
The above product form is a favorite of number theorists because its inverse
is a special case of the zeta function, namely the value of that function at
unity, which happens to be an infinity that just cancels the infinity in the
limit of
P
k
P
k
(in the sense of the above limit).
In Merten's theorem we have used the "natural" definition for
Δ
1
Δ
1
of
1212, whereas when we used it in the recursion relation for primes we instead defined
Δ
1
=1
Δ
1
1
,
since that was the actual spacing between the starting prime 2 and the next prime, 3. Alternatively, to agree with the log integral expression we would have
had to define
Δ
1
=log2
Δ
1
2
.
In recursion relations there is often such a freedom of choice, as well a "natural" choice.
-
Derbyshire, J. (2003). Prime Obsession. Joseph Henry Press.