# Connexions

You are here: Home » Content » PSEUDO-NUMBERS

### Lenses

What is a lens?

#### Definition of a lens

##### Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

##### What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

##### Who can create a lens?

Any individual member, a community, or a respected organization.

##### What are tags?

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

#### In these lenses

• Statistics

This module is included inLens: Mathieu Plourde's Lens
By: Mathieu PlourdeAs a part of collection: "Introduction to Statistics"

Click the "Statistics" link to see all content selected in this lens.

### Recently Viewed

This feature requires Javascript to be enabled.

# PSEUDO-NUMBERS

Module by: Ewa Paszek. E-mail the author

Summary: This course is a short series of lectures on Introductory Statistics. Topics covered are listed in the Table of Contents. The notes were prepared by Ewa Paszek and Marek Kimmel. The development of this course has been supported by NSF 0203396 grant.

## UNIFORM PSEUDO-RANDOM VARIABLE GENERATION

In this paragraph, our goals will be to look at, in more detail, how and whether particular types of pseudo-random variable generators work, and how, if necessary, we can implement a generator of our own choosing. Below a list of requirements is listed for our uniform random variable generator:

1. A uniform marginal distribution,
2. Independence of the uniform variables,
3. Repeatability and portability,
4. Computational speed.

### CURRENT ALGORITHMS

The generation of pseudo-random variates through algorithmic methods is a mature field in the sense that a great deal is known theoretically about different classes of algorithms, and in the sense that particular algorithms in each of those classes have been shown, upon testing, to have good statistical properties. In this section, let describe the main classes of generators, and then let make specific recommendation about which generators should be implemented.

Congruential Generators

The most widely used and best understood class of pseudo-random number generators are those based on the linear congruential method introduced by Lehmer (1951). Such generators are based on the following formula:

U i =( a U i1 +c )modm , U i =( a U i1 +c )modm ,
(1)

where U i ,i=1,2,... U i ,i=1,2,... are the output random integers; U 0 U 0 is the chosen starting value for the recursion, called the seed and a,c, and m are prechosen constants.

##### Notice That:
to convert to uniform ( 0,1 ) ( 0,1 ) variates, we need only divide by modulus m, that is, we use the sequence { U i /m } { U i /m } .
##### The following properties of the algorithm are worth stating explicitly:
1. Because of the “mod m” operation (for background on modular operations, see Knuth, (1981) ), the only possible values the algorithm can produce are the integers 0,1,2,...,m1. 0,1,2,...,m1. This follows because, by definition, x mod m is the remainder after x is divided by m.
2. Because the current random integer U i U i depends only on the previous random integer U i1 U i1 once a previous value has been repeated, the entire sequence after it must be repeated. Such a repeating sequence is called a cycle, and its period is the cycle length. Clearly, the maximum period of the congruential generator is m. For given choices of a, c, and m, a generator may contain many short cycles, (see the Example 1 below), and the cycle you enter will depend on the seed you start with. Notice that the generator with many short cycles is not a good one, since the output sequence will be one of a number of short series, each of which may not be uniformly distributed or randomly dispersed on the line or the plane. Moreover, if the simulation is long enough to cause the random numbers to repeat because of the short cycle length, the outputs will not be independent.
3. If we are concern with a uniform ( 0,1 ) ( 0,1 ) variates, the finest partition of the interval ( 0,1 ) ( 0,1 ) that this generator can provide is [ 0,1/m,2/m,...,( m1/m ) ] [ 0,1/m,2/m,...,( m1/m ) ] . This is, of course, not truly a uniform ( 0,1 ) ( 0,1 ) distribution since, for any k in ( 0,m1 ) ( 0,m1 ) , we have P[ k/m<U<( k+1 )/m ]=0 P[ k/m<U<( k+1 )/m ]=0 , not 1/m 1/m are required by theory for continuous random variables.
4. Choices of a,c, and m, will determine not only the fineness of the partition of ( 0,1 ) ( 0,1 ) and the cycle length, and therefore, the uniformity of the marginal distribution, but also the independence properties of the output sequence. Properly choosing a,c, and m is a science that incorporates both theoretical results and empirical tests. The first rule is to select the modulus m to be “as large as possible”, so that there is some hope to address point 3 above and to generate uniform variates with an approximately uniform marginal distribution. However, simply having m large is not enough; one may still find that the generator has many short cycles, or that the sequence is not approximately independent. See example 1 below.
##### Example 1

Consider

U i =2 U i1 mod 2 32 U i =2 U i1 mod 2 32
(2)
Where a seed of the form 2 k 2 k creates a loop containing only integers that are powers of 2, or
U i =( U i1 +1 )mod 2 32 U i =( U i1 +1 )mod 2 32
(3)
which generates the nonrandom sequence of increasing integers. Therefore, the second equation gives a generator that has the maximum possible cycle length but is useless for simulating a random sequence.

Fortunately, one a value of the m has been selected; theoretical results exist that give conditions for choosing values of the multiplier a and the additive constant c such that all the possible integers, 0 through m1 m1 , are generated before any are repeated.

##### Notice, that:
this does not eliminate the second counterexample above, which already has the maximal cycle length, but is a useless random number generator.

THEOREM I

A linear congruential generator will have maximal cycle length m, if and only if:

• c is nonzero and is relatively prime to m (i.e., c and m have no common prime factors).
• ( amodq )=1 ( amodq )=1 for each prime factor q of m.
• ( amod4 )=1 ( amod4 )=1 if 4 is a factor of m.

PROOF

###### SEE:
Knuth (1981, p.16).

As a mathematical note, c is called relatively prime to m if and only if c and m have no common divisor other than 1, which is equivalent to c and m having no common prime factor.

A related result concerns the case of c chosen to be 0. This case does not conform to condition in a Theorem I, a value U i U i of zero must be avoided because the generator will continue to produce zero after the first occurrence of a zero. In particular, a seed of zero is not allowable. By Theorem I, a generator with c=0 c=0 , which is called a multiplicative congruential generator, cannot have maximal cycle length m. However, By Theorem II. It can have cycle length m1 m1 .

THEOREM II

If c=0 c=0 in a linear congruential generator, then U i =0 U i =0 can never be included in a cycle, since the 0 will always repeat. However, the generator will cycle through all m1 m1 integers in the set ( amodq ) ( amodq ) if and only if:

• m is a prime integer and
• m is a primitive element modulo m .

PROOF

###### SEE:
Knuth (1981, p.19).

A formal definition or primitive elements modulo m, as well as theoretical results for finding them, are given in Knuth (1981). In effect, when m is a prime, a is a primitive element if the cycle is of length m1 m1 . The results of Theorem II are not intuitively useful, but for our purposes, it is enough to note that such primitive elements exist and have veen computed by researchers,

###### SEE:
e.g., Table24.8 in Abramowitz and Stegun, 1965.
Hence, we now must select one of two possibilities:
• Choose a, c, and m according to Theorem I and work with a generator whose cycle length is known to be m.
• Choose c=0 c=0 , take a and m according to Theorem II, use a number other than zero as the seed, and work with a generator whose cycle length is known to be m1 m1 . A generator satisfying these conditions is known as a prime-modulus multiplicative congruential generator and, because of the simpler computation, it usually has an advantage in terms of speed over the mixed congruential generator.

Another method frequency speeding up a random number generator that has c=0 c=0 is to choose the modulus m to be computationally convenient. For instance, consider m= 2 k . m= 2 k . This is clearly not a prime number, but on a computer the modulus operation becomes a bit-shift operation in machine code. In such cases, Theorem III gives a guise to the maximal cycle length.

THEOREM III

If c=0 c=0 and m= 2 k m= 2 k with k>2 k>2 , then the maximal possible cycle length is 2 k2 2 k2 . This is achieved if and only if two conditions hold:

• a is a primitive element modulo m.
• the seed is odd.

PROOF

###### SEE:
Knuth (1981, p.19).

Notice that we sacrifice some of the cycle length and, as we will se in Theorem IV, we also lose some randomness in the low-order bits of the random variates. Having use any of Theorems I, II, or III to select triples (a, c, m) that lead to generators with sufficiently long cycles of known length, we can ask which triple gives the most random (i.e., approximately independent ) sequence. Although some theoretical results exist for generators as a whole, these are generally too weak to eliminate any but the worst generators. Marsaglia (1985) and Knuth(1981, Chap. 3.3.3) are good sources for material on that results.

THEOREM IV

If U i =a U i1 mod 2 k U i =a U i1 mod 2 k , and we define

Y i = U i mod 2 j ,0<j<k Y i = U i mod 2 j ,0<j<k
(4)
then
Y i =a Y i1 mod 2 j . Y i =a Y i1 mod 2 j .
(5)
In practical terms, this means that the sequence of j-lo-order binary bits of the U i U i sequence, namely Y i Y i cycle with cycle length at most 2 j 2 j . In particular, sequence of the least significant bit (i.e., j=1) in ( U 1 , U 2 , U 3 ,... ) ( U 1 , U 2 , U 3 ,... ) must behave as ( 0,0,0,0,... ),( 1,1,1,1,... ),( 0,1,0,1,... ) ( 0,0,0,0,... ),( 1,1,1,1,... ),( 0,1,0,1,... ) or ( 1,0,1,0,... ) ( 1,0,1,0,... ) .

PROOF

###### SEE:
Knuth (1981, pp. 12-14).

Such normal behavior in the low-order bits of a congruential generator with non-prime-modulus m is an undesirably property, which may be aggravated by techniques such as the recycling of uniform variates. It has been observed (Hutchinson, 1966) that prime-modulus multiplicative congruential generators with full cycle (i.e., when m is a positive primitive element) tend to have fairly randomly distributed low-order bits, although no theory exists to explain this.

THEOREM V

If our congruential generator produces the sequence ( U 1 , U 2 ,... ) ( U 1 , U 2 ,... ) , and we look at the following sequence of points in n dimensions:

( U 1 , U 2 , U 3 ,..., U n ),( U 2 , U 3 , U 4 ,..., U n+1 ),( U 3 , U 4 , U 5 ,..., U n+2 ),... ( U 1 , U 2 , U 3 ,..., U n ),( U 2 , U 3 , U 4 ,..., U n+1 ),( U 3 , U 4 , U 5 ,..., U n+2 ),...
(6)
then the points will all lie in fewer than ( n|m ) 1/n ( n|m ) 1/n parallel hyper planes.

PROOF

###### SEE:
Marsaglia (1976).

Given these known limitations of congruential generator, we are still left with the question of how to choose the “best” values for a, c, and m. To do this, researchers have followed a straightforward but time-consuming procedure:

1. Take values a, c, and m that give a sufficiently long, known cycle length and usa the generator to produce sequences of uniform variates.
2. Subject the output sequences to batteries of statistical tests for independence and a uniform marginal distribution. Document the results.
3. Subject the generator to theoretical tests. In particular, the spectral test of Coveyou and MacPherson (1967) is currently widely used and recognized as a very sensitive structural test for distinguishing between good and bad generators. Document the results.
4. As new, more sensitive tests appear, subject to generator to those tests. Several such tests are discussed in Marsaglia(1985).

## Content actions

PDF | EPUB (?)

### What is an EPUB file?

EPUB is an electronic book format that can be read on a variety of mobile devices.

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

#### Definition of a lens

##### Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

##### What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

##### Who can create a lens?

Any individual member, a community, or a respected organization.

##### What are tags?

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks