THE GAMMA AND CHI-SQUARE DISTRIBUTIONS

Ewa Alina Paszek (epaszek@liv.ac.uk), 2005/11/30, revised 2007/10/08.

This course is a short series of lectures on Introductory Statistics. Topics covered are listed in the Table of Contents. The notes were prepared by Ewa Paszek and Marek Kimmel.

The development of this course has been supported by NSF grant 0203396.

GAMMA AND CHI-SQUARE DISTRIBUTIONS
In the (approximate) Poisson process with mean λ, we have seen that the waiting time until the first change has an exponential distribution. Now let W denote the waiting time until the αth change occurs, and let us find the distribution of W. The distribution function of W, when w ≥ 0, is given by
F(w) = P(W \le w) = 1 - P(W > w) = 1 - P(\text{fewer than } \alpha \text{ changes occur in } [0, w]) = 1 - \sum_{k=0}^{\alpha-1} \frac{(\lambda w)^k e^{-\lambda w}}{k!},
since the number of changes in the interval [0,w] has a Poisson distribution with mean λw. Because W is a continuous-type random variable, F'(w) is equal to the p.d.f. of W whenever this derivative exists. We have, provided w>0, that
F'(w) = \lambda e^{-\lambda w} - e^{-\lambda w} \sum_{k=1}^{\alpha-1}\left[\frac{\lambda k(\lambda w)^{k-1}}{k!} - \frac{\lambda(\lambda w)^{k}}{k!}\right] = \lambda e^{-\lambda w} - e^{-\lambda w}\left[\lambda - \frac{\lambda(\lambda w)^{\alpha-1}}{(\alpha-1)!}\right] = \frac{\lambda(\lambda w)^{\alpha-1}}{(\alpha-1)!}\, e^{-\lambda w}.

Gamma Distribution
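The derivation above can be sanity-checked numerically: the p.d.f. λ(λw)^(α−1) e^(−λw)/(α−1)! should integrate over [0, w] to the same value as the Poisson-sum expression for F(w). A minimal sketch in Python (the function names and parameter values are ours, not from the text):

```python
import math

def gamma_pdf(w, alpha, lam):
    # p.d.f. of the waiting time until the alpha-th change (alpha a positive integer)
    return lam * (lam * w) ** (alpha - 1) * math.exp(-lam * w) / math.factorial(alpha - 1)

def cdf_poisson_sum(w, alpha, lam):
    # F(w) = 1 - sum_{k=0}^{alpha-1} (lam w)^k e^{-lam w} / k!
    return 1.0 - sum((lam * w) ** k * math.exp(-lam * w) / math.factorial(k)
                     for k in range(alpha))

def cdf_numeric(w, alpha, lam, n=20_000):
    # trapezoidal integration of the p.d.f. over [0, w]
    h = w / n
    total = (gamma_pdf(0.0, alpha, lam) + gamma_pdf(w, alpha, lam)) / 2
    total += sum(gamma_pdf(i * h, alpha, lam) for i in range(1, n))
    return total * h

print(cdf_poisson_sum(5, 2, 0.5))  # ~0.7127
print(cdf_numeric(5, 2, 0.5))      # agrees to several decimal places
```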
If w < 0, then F(w) = 0 and F'(w) = 0. A p.d.f. of this form is said to be of the gamma type, and the random variable W is said to have a gamma distribution.
The gamma function is defined by

\Gamma(t) = \int_0^\infty y^{t-1} e^{-y}\, dy, \qquad 0 < t.
This integral is positive for 0 < t because the integrand is positive. Its values are often given in a table of integrals. If t > 1, integrating the gamma function of t by parts yields
\Gamma(t) = \left[-y^{t-1} e^{-y}\right]_0^\infty + \int_0^\infty (t-1) y^{t-2} e^{-y}\, dy = (t-1)\int_0^\infty y^{t-2} e^{-y}\, dy = (t-1)\Gamma(t-1).
For example, Γ(6) = 5Γ(5) and Γ(3) = 2Γ(2) = (2)(1)Γ(1). Whenever t = n, a positive integer, we have, by repeated application of Γ(t) = (t−1)Γ(t−1), that

\Gamma(n) = (n-1)\Gamma(n-1) = (n-1)(n-2)\cdots(2)(1)\Gamma(1).

However,

\Gamma(1) = \int_0^\infty e^{-y}\, dy = 1.

Thus, when n is a positive integer, we have that Γ(n) = (n−1)!; for this reason, the gamma function is called the generalized factorial.
Incidentally, Γ(1) corresponds to 0!, and we have noted that Γ(1) = 1, which is consistent with earlier discussions.
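The identity Γ(n) = (n−1)! is easy to confirm with Python's standard library, which exposes the gamma function as math.gamma (a quick check, not part of the original text):

```python
import math

# Gamma(t) = (t-1)*Gamma(t-1) together with Gamma(1) = 1 gives Gamma(n) = (n-1)!
for n in range(1, 10):
    assert math.isclose(math.gamma(n), math.factorial(n - 1))

print(math.gamma(6))  # 120.0, i.e. 5!
```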
SUMMARIZING
The random variable X has a gamma distribution if its p.d.f. is defined by

f(x) = \frac{1}{\Gamma(\alpha)\theta^{\alpha}}\, x^{\alpha-1} e^{-x/\theta}, \qquad 0 \le x < \infty.
Hence W, the waiting time until the αth change in a Poisson process, has a gamma distribution with parameters α and θ = 1/λ.
The function f(x) actually has the properties of a p.d.f., because f(x) ≥ 0 and

\int_{-\infty}^{\infty} f(x)\, dx = \int_0^\infty \frac{x^{\alpha-1} e^{-x/\theta}}{\Gamma(\alpha)\theta^{\alpha}}\, dx,

which, by the change of variables y = x/θ, equals

\int_0^\infty \frac{(\theta y)^{\alpha-1} e^{-y}}{\Gamma(\alpha)\theta^{\alpha}}\, \theta\, dy = \frac{1}{\Gamma(\alpha)} \int_0^\infty y^{\alpha-1} e^{-y}\, dy = \frac{\Gamma(\alpha)}{\Gamma(\alpha)} = 1.
The mean and variance are: μ=αθ and σ2=αθ2.
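These formulas can be checked by simulation; Python's random.gammavariate(alpha, beta) uses the same shape–scale parameterization (α, θ). A rough Monte Carlo check (the sample size, seed, and parameter values are our choices):

```python
import random
import statistics

random.seed(0)
alpha, theta = 2.0, 2.0  # example parameters: mean should be 4, variance 8
sample = [random.gammavariate(alpha, theta) for _ in range(200_000)]

print(statistics.mean(sample))      # close to alpha * theta = 4
print(statistics.variance(sample))  # close to alpha * theta**2 = 8
```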
Suppose that an average of 30 customers per hour arrive at a shop in accordance with a Poisson process. That is, if a minute is our unit, then λ = 1/2. What is the probability that the shopkeeper will wait more than 5 minutes before both of the first two customers arrive? If X denotes the waiting time in minutes until the second customer arrives, then X has a gamma distribution with α = 2 and θ = 1/λ = 2.
Hence,
P(X > 5) = \int_5^\infty \frac{x^{2-1} e^{-x/2}}{\Gamma(2)\, 2^2}\, dx = \int_5^\infty \frac{x e^{-x/2}}{4}\, dx = \frac{1}{4}\left[(-2)x e^{-x/2} - 4 e^{-x/2}\right]_5^\infty = \frac{7}{2}\, e^{-5/2} \approx 0.287.
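The closed-form answer can be double-checked by integrating the p.d.f. numerically over a long finite interval (the cutoff at x = 60 and the step count are arbitrary choices of ours; the tail beyond 60 is negligible):

```python
import math

def pdf(x):
    # gamma p.d.f. with alpha = 2, theta = 2: x e^{-x/2} / 4
    return x * math.exp(-x / 2) / 4.0

# trapezoid rule for P(X > 5) over [5, 60]
n, a, b = 100_000, 5.0, 60.0
h = (b - a) / n
tail = h * ((pdf(a) + pdf(b)) / 2 + sum(pdf(a + i * h) for i in range(1, n)))

print(round(tail, 3))                  # 0.287
print(round(3.5 * math.exp(-2.5), 3))  # (7/2) e^{-5/2}, also 0.287
```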
We could also have used the equation below with λ = 1/θ, because α is an integer:

P(X > x) = \sum_{k=0}^{\alpha-1} \frac{(x/\theta)^k e^{-x/\theta}}{k!}.

Thus, with x = 5, α = 2, and θ = 2, this is equal to

P(X > 5) = \sum_{k=0}^{1} \frac{(5/2)^k e^{-5/2}}{k!} = e^{-5/2}\left(1 + \frac{5}{2}\right) = \frac{7}{2}\, e^{-5/2}.

Chi-Square Distribution
Let us now consider a special case of the gamma distribution that plays an important role in statistics.
Let X have a gamma distribution with θ = 2 and α = r/2, where r is a positive integer. The p.d.f. of X is then

f(x) = \frac{1}{\Gamma(r/2)\, 2^{r/2}}\, x^{r/2-1} e^{-x/2}, \qquad 0 \le x < \infty.

We say that X has a chi-square distribution with r degrees of freedom, which we abbreviate by saying X is χ²(r).
The mean and the variance of this chi-square distribution are

\mu = \alpha\theta = \left(\frac{r}{2}\right) 2 = r \quad \text{and} \quad \sigma^2 = \alpha\theta^2 = \left(\frac{r}{2}\right) 2^2 = 2r.
That is, the mean equals the number of degrees of freedom and the variance equals twice the number of degrees of freedom.
Figure 2 gives the graphs of the chi-square p.d.f. for r = 2, 3, 5, and 8. Note the relationship between the mean μ = r and the point at which the p.d.f. attains its maximum.
Because the chi-square distribution is so important in applications, tables have been prepared giving the values of the distribution function for selected values of r and x,

F(x) = \int_0^x \frac{1}{\Gamma(r/2)\, 2^{r/2}}\, w^{r/2-1} e^{-w/2}\, dw.
Let X have a chi-square distribution with r = 5 degrees of freedom. Then, using tabularized values,
P(1.145≤X≤12.83)=F(12.83)−F(1.145)=0.975−0.050=0.925
and P(X>15.09)=1−F(15.09)=1−0.99=0.01.
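Table look-ups like these can be reproduced by integrating the chi-square p.d.f. directly. A small sketch (the function name is ours; the trapezoid rule is adequate for r ≥ 3, where the p.d.f. vanishes at 0):

```python
import math

def chi2_cdf(x, r, n=50_000):
    # F(x) = ∫_0^x w^{r/2-1} e^{-w/2} / (Gamma(r/2) 2^{r/2}) dw, by the trapezoid rule
    c = 1.0 / (math.gamma(r / 2) * 2 ** (r / 2))
    def pdf(w):
        return c * w ** (r / 2 - 1) * math.exp(-w / 2)
    h = x / n
    return h * ((pdf(0.0) + pdf(x)) / 2 + sum(pdf(i * h) for i in range(1, n)))

print(round(chi2_cdf(12.83, 5), 3))  # ~0.975
print(round(chi2_cdf(1.145, 5), 3))  # ~0.050
```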
If X is χ²(7), two constants a and b such that P(a < X < b) = 0.95 are a = 1.690 and b = 16.01. Other constants a and b can be found; the choices above are restricted only by the limited table.
Probabilities like that in Example 4 are so important in statistical applications that special symbols are used for a and b. Let α be a positive probability (usually less than 0.5) and let X have a chi-square distribution with r degrees of freedom. Then χ²_α(r) is a number such that P[X ≥ χ²_α(r)] = α.
That is, χ²_α(r) is the 100(1−α)th percentile (or upper 100α percent point) of the chi-square distribution with r degrees of freedom. Then the 100αth percentile is the number χ²_{1−α}(r) such that P[X ≤ χ²_{1−α}(r)] = α. That is, the probability to the right of χ²_{1−α}(r) is 1 − α.
See Figure 3.
Let X have a chi-square distribution with seven degrees of freedom. Then, using tabularized values, χ²_{0.05}(7) = 14.07 and χ²_{0.95}(7) = 2.167. These are the points indicated on Figure 3.
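Percentile points such as χ²_{0.05}(7) = 14.07 can also be recovered numerically by inverting the distribution function with bisection. A sketch under our own helper names (r ≥ 3 assumed, so the p.d.f. is bounded at 0):

```python
import math

def chi2_cdf(x, r, n=20_000):
    # trapezoidal integration of the chi-square p.d.f. over [0, x]
    c = 1.0 / (math.gamma(r / 2) * 2 ** (r / 2))
    def pdf(w):
        return c * w ** (r / 2 - 1) * math.exp(-w / 2)
    h = x / n
    return h * ((pdf(0.0) + pdf(x)) / 2 + sum(pdf(i * h) for i in range(1, n)))

def chi2_upper_point(alpha, r):
    # bisection for the c with P(X >= c) = alpha, i.e. F(c) = 1 - alpha
    lo, hi = 0.0, 100.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, r) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(chi2_upper_point(0.05, 7), 2))  # ~14.07
print(round(chi2_upper_point(0.95, 7), 2))  # ~2.17
```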