Skip to content Skip to navigation

Connexions

You are here: Home » Content » The Prime Number Theorem (PNT)

Navigation

Content Actions

  • Download module PDF
  • Add to ...
    Add the module to:
    • My Favorites
    • A lens
    • An external social bookmarking service
    • My Favorites (What is 'My Favorites'?)
      'My Favorites' is a special kind of lens which you can use to bookmark modules and collections directly in Connexions. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need a Connexions account to use 'My Favorites'.
    • A lens (What is a lens?)

      Definition of a lens

      Lenses

      A lens is a custom view of Connexions content. You can think of it as a fancy kind of list that will let you see Connexions through the eyes of organizations and people you trust.

      What is in a lens?

      Lens makers point to Connexions materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

      Who can create a lens?

      Any individual Connexions member, a community, or a respected organization.

    • External bookmarks
  • E-mail the author

Recently Viewed

This feature requires Javascript to be enabled.

The Prime Number Theorem (PNT)

Module by: F. Michel

Summary: The Prime Number Theorem is of key interest to number theorists owing to the importance of Riemann's work on the subject. But it can also be viewed as a fitting function that approximates the distribution of the prime numbers. We provide such a "pedestrian" viewpoint and development using elementary mathematics.

The famous mathematician Bernhard Riemann did some novel and exciting work in 1859 on the old question of how many primes there are below a given number, which is usually written πn π n This function is equivalent to asking, "what is the value of the kk-th prime for any given kk?" The latter is actually a more specific question since an arbitrary nn will in general fall between primes. Indeed, the notation is most curious because ππ refers to the number of primes up to nn, not to any prime per se, while nn actually refers to the values of the primes (since ππ changes by unity every time nn passes through a prime value. Thus if nn is a prime, it is the ππ-th prime, pretty much the inverse of defining a function pn p n where pp is the value of the nn-th prime. We will encounter this inversion as a practical matter later. However, it is a testiment to Riemann's genius that he derived closed expressions that gave the exact number of primes. For example, the 4th prime is 7, but if you chose n=8,9,10n8,9,10, Riemann's formula will give the same answer (4) for each of these values for nn. For this reason, the notation seems reversed, but actually makes sense.

The PNT as a theorem asserts that the following interpolation formulas, which attempt to smoothly fit onto the trends in the distribution of the primes, are asymptotically exact.

An early estimate was that πn π n was approximately nlnn n n This estimate systematically underestimates the numbers of primes, except possibly for staggeringly large values of nn. As a theorem it would state that in the limit of nn going to infinity, the ratio of πn π n to nlnn n n is unity. The theorem is of particular interest to mathematicians in that it follows from an as-yet unproven conjecture by Riemann on the distribution of zeros of the zeta function, a complicated topic which is not required for our simple analysis here.

An improved approximation is Lin Li n where LiLi, the log integral function is defined to be 2n1lntdt t 2 n 1 t This estimate is substantially better and systematically underestimates the number of primes (again until one gets to staggeringly large values). Notice that the first approximation results from considering the log term to be slowly varying and taking it out of the integral to get the (crude) estimate of nlnn n n . The log integral function diverges at n=1 n 1 , but this has no relevance to its applications at large nn.

Some large values (from Derbyshire, p. 116).

nn ππ Li-π Li π
108 10 8 5,761,455 754
109 10 9 50,847,534 1,701
1010 10 10 455,052,511 3,104
1011 10 11 4,118,054,813 11,588
1012 10 12 37,607,912,018 38,263
1013 10 13 346,065,536,839 108,971
1014 10 14 3,204,941,750,802 314,890

Anyone used to examining errors for trends would see that the steady increase in the difference Li-π Li π by about a factor of 3 per decade in nn suggests that there is a scaling correction that needs to be applied.

Calculating the primes

The method of choice to calculate a list of primes is to use the "sieve" of Eratosthenes. Here one starts with a list the natural numbers. Then starting with 2, one removes every even number. The next number is 3, so now one removes every third number (half of these have already been removed by the 2). Here it is easiest to simply replace the values with zero, rather than literally removing the numbers. Now the next (non-zero) number is 5, and we set every fifth number to zero. Then 7 and then 11. By eleven, the first 121 non-zero numbers will all be primes (i.e., 11 squared). The alternative of testing each successive number to see if a smaller prime divided it would be hopelessly inefficient.

Here is a simple MATLAB program (easily translated into the program of your choice):


	
	1    %make-prime routine using sieve
	2    N = 100; %largest possible prime in list
	3    rpr=linspace(1,N,N); %starting list=all numbers up to N
	4    nextp = 2; %next prime to remove from list (this is ALSO its location)
	5    for j=2:N
	6    if (nextp*nextp)<N  %quit once all remaining nonzeros are primes
	7    for k=2:N %starts at 2 to preserve the prime itself
	8    if nextp*k <= N %don't exceed length of vector
	9    rpr(nextp*k) = 0; %the sieve!
	10   end
	11   end
	12   for n = 1:N
	13   if rpr(nextp+n) ~= 0  %run up list until first nonzero value
	14   nextp = rpr(nextp+n); %update nextp
	15   break %stop looking further, otherwise exit at N
	16   end
	17   end
	18   primes=[ ]; %start with empty vector
	19   for n=2:length(rpr) %now list the prime values only, excluding "1"
	20   if (rpr(n)~=0)
	21   primes=[primes,rpr(n)]; %add to existing vector of primes
	22   end
	23   end
	
      

Here one simply chooses the value of NN, and after executing the program, the vector "primes" will contain the list: 2; 3; 5; 7; 11; 13; ...; 97 (if N=100N100).

Note that the "sieve" itself never removes "1" and this number has all of the properties of a prime (not divisible by any other number besides 1 and itself). But it has no use in the unique factoring of natural (whole) numbers into primes, and is so excluded.

This program has been checked against the list given in Abramowitz and Stegun (the primes up to 99,991, which are roughly the first 10,000 primes).

Derivation of a PNT

The sieve suggests a simple way of estimating the distribution of primes. After any given prime, there will be a density of non-zero numbers remaining. For example, after 2, half the remaining numbers will be nonzero. After the 3, one removes 1313 but 1212 are already gone, so we get 12-16=13 12 16 13 left. Since there are an infinite number of numbers left, it is easier to think in terms of remaining blocks of numbers; after removing 2 and 3, we have blocks of 2×3=6 2 3 6 , and in any arbitrary block of 6 consecutive numbers beyond the 3, there will be exactly 2 non-zero numbers. If we go to 5, the blocks become 2×3×5=30 2 3 5 30 , and there will remain exactly 8 non-zero numbers in any block of 30 numbers beyond 5. It is easy to verify that this density evolves quite systematically. After each successive prime pp, the average density per block drops by exactly a factor of p-1p p 1 p

This density then corresponds to a mean distance between primes, since the next prime is chosen from the first non-zero number in the following block, and we can guess this distance from the mean density. In particular, after each successive prime pp, the density decreases by p-1p p 1 p and the mean distance correspondingly increases by pp-1 p p 1 . Let us call this mean distance after the kk-th prime Δ k Δ k , while the kk-th prime itself we would call P k P k .

It follows then that the next prime will be approximately P k + 1 = P k + Δ k P k + 1 P k Δ k but given this new prime, the mean step size increases to Δ k + 1 = Δ k P k + 1 P k + 1 -1 Δ k + 1 Δ k P k + 1 P k + 1 1

But now we are all done. If we chose the first prime ( P 1 P 1 ) and the distance to the next prime ( Δ 1 Δ 1 ), we can simply iterate these relationships indefinetely. Indeed, we know these values: P 1 =2 P 1 2 and Δ 1 =1 Δ 1 1

A MATLAB program would simply be


	
	1    %recursion on steps to generate prime number counts
	2    N=9500;     %primes up to this number
	3    pr(1) = 2;  %initial prime
	4    del(1) = 1; %intial step
	5    for k= 1:N
	6    pr(k+1) = pr(k) + del(k);
	7    del(k+1) = del(k)*pr(k+1)/(pr(k+1)-1);
	8    end
      

We chose N=9500 N 9500 because this available on the tabulated lists, or if you created the list of primes, you should get primes9500=98,947 primes 9500 98,947 . In contrast, the above program gives 102,014, which is too large by only about 3%.

This result is rather astonishingly good given that the initial primes and steps between primes are anything but regular.

Obtaining the PNT Results

It is easy to rearrange the equations as difference equations and then write them as differential equations. First we write P k + 1 - P k = Δ k P k + 1 P k Δ k and Δ k + 1 - Δ k = Δ k 1 P k + 1 -1 Δ k + 1 Δ k Δ k 1 P k + 1 1

For large kk, we can approximate these as ddkPk=Δk k P k Δ k and ddkΔk=ΔkPk k Δ k Δ k P k

One might worry about simplifying P k + 1 -1 P k + 1 1 to Pk P k , but the "corrections" are tiny, and unimportant as we will show.

The ratio is ddΔP=P Δ P P which integrates to give P= P 1 Δ- Δ 1 P P 1 Δ Δ 1 which then gives 1Ndk= P 1 P1 Δ 1 +logP P 1 dP k 1 N P P 1 P 1 Δ 1 P P 1 Notice that this is in fact the log integral function of the improved PNT. However, because of the curious choice of notation, the natural symbols for ππ (here NN) and nn (here PP) are reversed.

Notice that the Δ 1 Δ 1 in the denominator can be absorbed into the logP P 1 P P 1 and the rescaling PP moves these terms to the limits on the integral. Effectively, this simply rescales Lin Li n .

Accordingly, we can chose a Δ 1 Δ 1 such that the curve is through the primes instead of being always off to one side. Such a rescaling makes perfect sense given that the PNT is just a fit to the primes and such a fit has to be global. Thus the fit is not obliged to either pass exactly through the prime 2 nor have the initial step size be exactly 1. Somehow the PNT derivation of Lin Li n has effectively slipped in an implicit assumption that Δ 1 =log P 1 Δ 1 P 1 (e.g., Δ 1 =log2=0.6931 Δ 1 2 0.6931 ...).

In fact, choosing Δ 1 =0.625 Δ 1 0.625 provides a better fit that at large kk gives an estimate for the value of the primes that oscillates about the correct values, being off a few percent to either side. This oscillation shows that there is little point in trying to do "better" (e.g., trying to solve a more exact differential equation above; in fact we just iterated instead). A significant improvement could be gotten by letting Δ 1 Δ 1 be a weak function of kk.

Notice that we have not proven the PNT, we have just derived the same asymptotic functions cited in the proofs. One can create a number of interesting relationships using these approximate expressions and the PNT itself.

For example, if we integrate the log integral function between two (large) successive primes, the result should be just unity (i.e., one new prime). Since the log P k P k denominator will hardly change, the integral itself will be approximately

1= P k + 1 - P k log P k 1 P k + 1 P k P k (1)
which can be rewritten as the recursion relation
P k + 1 = P k +log P k P k + 1 P k P k (2)
which must be one of the simplest possible recursion relations for primes. If we start it with the obviously simplest choice P 1 =2 P 1 2 , we get Figure 1.
Figure 1: The first thousand or so primes, plotting P k P k against kk. The uppermost (ragged) line are the actual primes, the dashed line just below it is Lik Li k , and the solid line is the simple recursion relation P k + 1 = P k +log P k P k + 1 P k P k starting with P 1 =2 P 1 2 .
logrec2.png
Another step is to replace P k + 1 - P k P k + 1 P k with Δ k Δ k , namely the mean distance between primes. This gives us limklog P k 1-1 P 1 1-1 P 2 ...1-1 P k =const k k P k 1 1 P 1 1 1 P 2 ... 1 1 P k const and the result for the first million primes is plotted in Figure 2. The constant is actually -γ γ where gamma is Euler's constant, 0.5772157... and the exponential is 0.56145946... and we can see the convergence to this value. This result was derived in 1874 and is known as Merten's theorem (but the proof was based on Riemann's (1859) paper and not on a causal idea of a "mean distance between primes"). The above product form is a favorite of number theorists because its inverse is a special case of the zeta function, namely the value of that function at unity, which happens to be an infinity that just cancels the infinity in the limit of P k P k (in the sense of the above limit).
Figure 2: Plot of log P k P k divided by Δ k Δ k. If we were to plot this function on ordinary linear scale, it would look like a step function starting at 0.346 and almost immediately jumping to what looks like the asymptotic value, and then being constant out to a value of almost 100,000. So we have expanded the vertical scale to show the trend and plotted the horizontal scale as a log plot to show how the "fluctuations" die off with increasing size of the primes. Here a linear plot suppresses that trend because almost all of the points to the right of the origin would now correspond to "large" primes.
merten.png
In Merten's theorem we have used the "natural" definition for Δ 1 Δ 1 of 1212, whereas when we used it in the recursion relation for primes we instead defined Δ 1 =1 Δ 1 1 , since that was the actual spacing between the starting prime 2 and the next prime, 3. Alternatively, to agree with the log integral expression we would have had to define Δ 1 =log2 Δ 1 2 . In recursion relations there is often such a freedom of choice, as well a "natural" choice.

References

  1. Derbyshire, J. (2003). Prime Obsession. Joseph Henry Press.

Comments, questions, feedback, criticisms?

Send feedback