Introduction
One of the most common problems complicating an economist's research is the inclusion of an endogenous variable as an explanatory variable. The variable on the left-hand side of a regression is an endogenous variable; its level is determined by the levels of the explanatory variables, that is, the variables on the right-hand side of the equation. In OLS we assume that the explanatory variables are independent of the error term. However, if the level of one of these explanatory variables is itself determined by the levels of the other variables in the model, that explanatory variable is actually an endogenous variable. In a nutshell, the problem is that an endogenous explanatory variable is correlated with the error term in the model, which causes the OLS estimator to be biased. This problem is also known as simultaneous equation bias, and it is subtly different from sample selection bias. See "What is the difference between 'endogeneity' and 'sample selection bias'?" for an excellent discussion of the difference between these two econometric problems.
In this module we explore both the statistical and algebraic issues raised by the inclusion of endogenous explanatory variables in a model. This introduction is too brief to give you a thorough understanding of the many problems raised by simultaneous equation bias. By the time you finish the module and the problem set, however, you should have at least an intuitive understanding of the problem and be able to recognize it when you come across it in your own research. If you think the model you are estimating may have simultaneous equation bias, you should seek the advice of an econometrician.
The Statistical Problem
Imagine we know with certainty that the following model fully describes the true state of the supply and demand for wheat. First, the demand for wheat in any year, qt, is a function of the price of wheat,
Demand:
and
Supply:
We assume that the error terms each are normally distributed with a mean of zero and a constant variance. Moreover, we assume that the two error terms are independent of each other—that is, we are assuming that:
Finally, we assume that income, the price of corn, and the weather index are non-stochastic variables; that is, these variables are independent of the two error terms. Clearly, the price of wheat and the quantity of wheat are stochastic variables.
What we have here is an ideal model in the sense that we know and can measure all of the variables in the model. The model as written has two endogenous variables—qt and
What we ultimately want to know is whether we can use ordinary least squares (OLS) to obtain unbiased estimates of the parameters in Equations (1) and (2). One of the assumptions of OLS is that each of the explanatory variables is independent of the error term,
It is convenient in answering our question to use the two structural equations to find what are known as the reduced form equations: one equation for each endogenous variable, in which the endogenous variable is written as a function solely of exogenous variables and error terms. We can find the reduced form equations by solving the structural equations simultaneously for the endogenous variables. Substituting (2) into (1), we get:
or
Substituting (1) into (2) yields:
or
Equations (4) and (5) are the reduced form equations for this model. We can use them to calculate
or
Factoring out the non-stochastic terms from the expected value operators gives:
Moreover, by assumption
A similar analysis yields:
Equations (6) and (7) are what create the endogeneity problem (or simultaneous equation bias)—using OLS to estimate the parameters of equations that have an endogenous variable as an explanatory variable yields biased estimates of the unknown parameters. Figure 1 illustrates the endogeneity problem. In this figure we have demand and supply equations that have both risen due to changes in exogenous variables. What the researcher observes are two (red) points: (1) the intersection of the old demand and supply curves and (2) the intersection of the new demand and supply curves.
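[Figure 1: The endogeneity problem. Demand and supply curves before and after the shift, with the two observed equilibrium points in red.]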
The thick red line shows the regression that would result from using OLS to estimate either of the two structural equations. Because this line simply connects the two observed equilibrium points, it traces out neither the demand curve nor the supply curve, so the OLS estimate of the slope is biased. We need an estimation technique other than OLS.
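The bias can also be checked numerically. The Python sketch below simulates a deliberately simplified market; the structural equations, the parameter values, and the function names are assumptions chosen for illustration and are not the wheat model of Equations (1) and (2). It shows that the equilibrium price is correlated with the demand shock and that the OLS slope from regressing quantity on price misses the true demand slope.

```python
import random


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def cov(x, y):
    """Sample covariance of two equal-length lists."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x."""
    return cov(x, y) / cov(x, x)


def simulate_market(n, seed=0):
    """Return lists (p, q, w, u) holding n equilibrium observations.

    Assumed structural model (illustrative only):
        demand: q = 10 - 1.0 * p + u
        supply: q =  2 + 1.0 * p + 1.0 * w + v
    where w is an exogenous weather index and u, v are independent shocks.
    """
    rng = random.Random(seed)
    p, q, w, u = [], [], [], []
    for _ in range(n):
        w_t = rng.gauss(0, 1)  # exogenous supply shifter
        u_t = rng.gauss(0, 1)  # demand shock
        v_t = rng.gauss(0, 1)  # supply shock
        # Reduced form for price: set demand equal to supply and solve for p.
        p_t = (8 - w_t + u_t - v_t) / 2
        # Quantity then follows from the demand equation.
        q_t = 10 - p_t + u_t
        p.append(p_t)
        q.append(q_t)
        w.append(w_t)
        u.append(u_t)
    return p, q, w, u


if __name__ == "__main__":
    p, q, w, u = simulate_market(20000)
    print("cov(price, demand shock):", round(cov(p, u), 3))
    print("OLS slope of q on p     :", round(ols_slope(p, q), 3))
    print("true demand slope       : -1.0")
```

Under these assumed parameters the population covariance between price and the demand shock is 0.5, and the OLS slope converges to about -1/3 rather than to the true demand slope of -1, no matter how large the sample is.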
Estimation
As noted earlier, the basic problem created by the endogeneity problem is that the endogenous explanatory variable is correlated with the error term. The most logical approach would be to replace this variable with one that is not correlated with the error term but highly correlated with the endogenous variable. Consider the value of the price predicted by the reduced form equation (5):
where
Clearly,
Two-stage least squares
The easiest way to understand two-stage least squares is to think of the estimation process as being in the following two steps (although the computer programs calculate the estimators in one step):
Stage 1: obtain OLS predictions of each endogenous variable on the right-hand side of the equation to be estimated, using all of the exogenous variables in the system as the explanatory variables.
Stage 2: estimate the parameters of the equation using OLS, replacing each endogenous variable on the right-hand side by its predictions from Stage 1.
For obvious reasons the TSLS method works best when the full model is specified or when you know and can measure all of the exogenous variables in the system.
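The two stages can be sketched in a few lines of Python. This is a minimal illustration under assumed conditions, not a replacement for a statistical package: the simulated market, its parameter values, and all function names are hypothetical, and the sketch handles only one endogenous regressor and one exogenous variable.

```python
import random


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def simulate_market(n, seed=0):
    """Return lists (p, q, w) of n equilibrium observations.

    Assumed structural model (illustrative only):
        demand: q = 10 - 1.0 * p + u
        supply: q =  2 + 1.0 * p + 1.0 * w + v
    """
    rng = random.Random(seed)
    p, q, w = [], [], []
    for _ in range(n):
        w_t, u_t, v_t = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        p_t = (8 - w_t + u_t - v_t) / 2  # equilibrium price
        p.append(p_t)
        q.append(10 - p_t + u_t)         # quantity from the demand equation
        w.append(w_t)
    return p, q, w


def two_stage_least_squares(y, endog, exog):
    """TSLS (intercept, slope) with one endogenous regressor and one exogenous variable."""
    # Stage 1: regress the endogenous regressor on the exogenous variable
    # and form its predicted values.
    c0, c1 = ols(exog, endog)
    endog_hat = [c0 + c1 * z for z in exog]
    # Stage 2: regress the dependent variable on the Stage 1 predictions.
    return ols(endog_hat, y)


if __name__ == "__main__":
    p, q, w = simulate_market(20000)
    print("OLS  (intercept, slope):", ols(p, q))
    print("TSLS (intercept, slope):", two_stage_least_squares(q, p, w))
    print("true demand curve      : intercept 10, slope -1")
```

The point estimates from running the stages by hand match what a one-step routine would produce, but the standard errors reported by the second-stage OLS regression are not the correct TSLS standard errors. That is one reason to rely on a dedicated routine in practice.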
Instrumental variables (IV)
While the use of instrumental variable (IV) estimators is appropriate in a large number of situations, the two situations where they are most commonly used are (1) in the presence of endogenous explanatory variables and (2) in cases when errors arise in the measurement of an explanatory variable (or the errors-in-variables problem). Since I have already described the endogeneity problem, I now turn to a brief discussion of errors-in-variables.
Consider the following simple model:
In this model the researcher observes
The important thing to note in estimating (10) using OLS is that the explanatory variable,
In both examples, ordinary least squares estimation is biased because an explanatory variable in the regression is correlated with the error term in the regression. Such a correlation can result from an endogenous explanator, a mismeasured explanator, an omitted explanator, or a lagged dependent variable among the explanators. I call all such explanators “troublesome.” Instrumental variable estimation can consistently estimate coefficients when ordinary least squares cannot—that is, the instrumental variable estimate of the coefficient will almost certainly be very close to the coefficient’s true value if the sample is sufficiently large—despite troublesome explanators. [Murray (2006a): 112]
Consider a regression that includes a “troublesome explanator,” like
It is usually fairly easy to identify instances when IV estimation methods are appropriate. This is especially true when one of the explanatory variables is possibly an endogenous variable. The real problem arises in finding an instrumental variable or a set of instrumental variables. However, assuming you have one or more instrumental variables, the IV method follows the same steps as described above for TSLS. In the first stage you estimate a regression of the “troublesome variable” as a function of the instruments and the exogenous variables in the equation—i.e., you estimate the reduced form equation. In the second stage you use OLS to estimate the original equation with the value of the “troublesome variable” predicted by the first stage regression substituted for the actual values of the “troublesome variable.”
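In the simplest case the two stages collapse into a single formula. As a sketch under assumed notation (the symbols $y_i$, $x_i$, $z_i$, $\beta_0$, $\beta_1$, and $\varepsilon_i$ are illustrative and are not taken from the numbered equations of this module), suppose the equation of interest is $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, with one troublesome variable $x_i$ and one instrument $z_i$ satisfying $\operatorname{Cov}(z_i, \varepsilon_i) = 0$ and $\operatorname{Cov}(z_i, x_i) \neq 0$. Taking the covariance of both sides with $z_i$ gives:

$$
\operatorname{Cov}(z_i, y_i) = \beta_1 \operatorname{Cov}(z_i, x_i) + \operatorname{Cov}(z_i, \varepsilon_i) = \beta_1 \operatorname{Cov}(z_i, x_i)
\quad\Longrightarrow\quad
\hat{\beta}_1^{IV} = \frac{\sum_i (z_i - \bar{z})(y_i - \bar{y})}{\sum_i (z_i - \bar{z})(x_i - \bar{x})}.
$$

The denominator shows why the instrument must be correlated with the troublesome variable: when their sample covariance is close to zero, small sampling errors in the numerator are greatly magnified.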
In a sense, TSLS is an IV estimator: the exogenous variables not included in a particular regression play the role of the instruments. In the IV estimation of (1), the weather index is the instrument. In the estimation of (2), the price of corn and the income level are the instruments. Thus, in a fully specified model, the exogenous variables excluded from the regression play the role of instrumental variables. In other situations the choice of an appropriate instrument can be very difficult. The selection process demands creativity both in finding the instrument and in defending the choice.
The use of either IV or TSLS comes at a cost. First, the OLS estimators are more precise (i.e., have smaller standard errors) than the TSLS or IV estimators. Second, selecting invalid or weak instruments can produce results that are not meaningful. So how do you know whether you have chosen a good set of instruments? There is no easy answer to this question. Murray (2006a: 116-117) discusses some possible tests of the validity of an instrumental variable. In the end, however, the “success” of your instrument may depend more on how convincing your justifications are than on any statistical test. Some economists, like Steven Levitt, make a living coming up with and justifying the use of some very creative instrumental variables. Murray (2006a) offers a detailed discussion of IV and should be read by any student planning to use either TSLS or IV regression estimators.
The identification problem
There is an additional issue that arises with estimating systems of equations—identification. Essentially, identification is an algebraic problem. Consider the reduced form equations given earlier in (4) and (5):
and
OLS estimation of both of these equations yields unbiased estimates of the parameters in the reduced form equations. Identification asks if we can retrieve the parameters of the structural equations from the reduced form equations. Say, for instance, that we re-write the reduced form equations as:
and
Table 1 shows each of the parameters in (11) and (12) in terms of the parameters of the two reduced form equations. We can recover the parameters of the structural equations by algebraic manipulation of the relationships in Table 1. (This method of estimation—that is, estimating the reduced form equations of a model using OLS and then solving algebraically for the parameters of the structural equations is referred to in the literature as indirect least squares.) For instance,
and
| Explanatory variable | Equation (11) | Equation (12) |
|---|---|---|
| Intercept | | |
| | | |
| | | |
| | | |
| | | |
| Error term | | |
One can continue in like manner to find formulae for the other structural parameters. However, an interesting problem does arise in that it is also true that
Clearly, we need to know how to determine whether an equation is over-identified, exactly identified, or under-identified. A necessary condition is that the number of exogenous variables in the system of equations that are excluded from a particular equation must be greater than or equal to the number of endogenous variables on the right-hand side of that equation for the equation to be either exactly or over-identified. Consider the following three-equation model, where the endogenous variables are
The error terms in these three equations are omitted because they are irrelevant to determining whether an equation is identified; remember, identification is an algebraic problem, not a statistical issue. There are 3 endogenous variables and 3 equations in the system. Also, there are 5 exogenous variables in the system of equations. Equation (13) is exactly identified; Equation (14) is over-identified; and Equation (15) is under-identified. What this means is that (1) Equation (13) can be estimated directly from the reduced form equations (using indirect least squares) or using TSLS; (2) Equation (14) must be estimated using TSLS; and (3) Equation (15) cannot be estimated. Table 2 summarizes how to determine whether an equation is identified. Basically, if the number in column 2 equals the number in column 3, the equation is exactly identified. If the number in column 2 is less than the number in column 3, the equation is over-identified. Finally, if the number in column 2 is greater than the number in column 3, the equation is under-identified.
| Equation | Number of endogenous variables on right-hand-side | Number of exogenous variables excluded from the equation | Identification |
|---|---|---|---|
| (13) | 2 | 2 | Exactly |
| (14) | 1 | 4 | Over |
| (15) | 1 | 0 | Under |
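The counting rule summarized in Table 2 is simple enough to express as a small function. The Python sketch below is illustrative only; the function name and the wording of its return values are assumptions. Because the rule is a necessary condition and not a sufficient one, a result of "exactly identified" or "over-identified" means only that the equation passes this check.

```python
def order_condition(n_endog_rhs, n_exog_excluded):
    """Classify an equation by the counting rule described in the text.

    n_endog_rhs     -- endogenous variables on the right-hand side
    n_exog_excluded -- exogenous variables in the system that are
                       excluded from this equation
    """
    if n_endog_rhs < 0 or n_exog_excluded < 0:
        raise ValueError("counts must be non-negative")
    if n_exog_excluded == n_endog_rhs:
        return "exactly identified"
    if n_exog_excluded > n_endog_rhs:
        return "over-identified"
    return "under-identified"


if __name__ == "__main__":
    # Counts for Equations (13), (14), and (15) as reported in Table 2.
    table_2 = {"(13)": (2, 2), "(14)": (1, 4), "(15)": (1, 0)}
    for equation, counts in table_2.items():
        print(equation, order_condition(*counts))
```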
One other thing to notice is the similarity of TSLS to IV estimation. The exogenous variables play the role of instruments in TSLS estimation. By implication, the instruments in an IV estimation must not include any of the exogenous variables in the equation. Similarly, one way to isolate potential instruments for a regression is to think of the system of equations to which the equation belongs and then ask which exogenous variables in that system are not included in the equation. These excluded exogenous variables are potential instruments.
TSLS and IV in Stata
The command for estimating an equation in Stata using two-stage least squares (TSLS) is a bit tricky. Assume that you want to estimate equations (13) and (14) in the model discussed above. For simplicity assume that each variable assumes the name for it in Table 2. Thus, in our Stata commands y1 refers to variable
| Equation to be estimated | Stata command |
|---|---|
| (13) | . ivreg y1 x2 x3 x5 (y2 y3 = x1 x4) |
| (14) | . ivreg y2 x3 (y1 = x1 x2 x4 x5) |
Example 1
An example from Stata. The Stata manual offers the following example analysis. Assume that you want to use state level data from the 1980 census to estimate the following system of equations:
and
where hsngval is the median dollar value of owner-occupied housing; rent is the median monthly gross rent; fainc is family income; pcturban is the percent of the state population living in an urban area; and reg2, reg3, and reg4 are dummy variables that designate the region of the country where the state is located. In this example we focus on estimating (17).
We begin by loading the data set and describing the data.
. use http://www.stata-press.com/data/r8/hsng2
(1980 Census housing data)
. describe
Now we estimate equation (17) using TSLS as shown in Figure 2.
[Figure 2: Two-stage least squares estimate of the example.]
The manual continues the example to include some testing of the model including the Hausman test. Students using TSLS and IV should read the discussion in the Stata manual thoroughly.
Exercises
Exercise 1
Cigarette advertising and sales. A great deal of controversy exists over the issue of whether advertising expenditures affect sales. This controversy is particularly sharp when it affects policy decisions. An example of this phenomenon is the controversy over the impact of cigarette advertising on cigarette sales. While many public policy experts advocate bans on cigarette advertising, a majority of economists caution against such bans. The economists point out that there is little theoretical reason to believe that cigarette advertising affects total demand for cigarettes. Instead, economists argue that cigarette advertising only affects brand choice and not the number of cigarettes that people smoke. Moreover, these economists point out that there is also little empirical evidence that supports the argument that cigarette advertising affects the demand for cigarettes. Given the negative impact advertising bans have on freedom of speech, most economists conclude that the negative effects of cigarette advertising bans outweigh the benefits of the bans.
In this exercise we address this issue by using data used originally by Richard Schmalensee (1972) in his Ph.D. dissertation. You will use these data to estimate a simple two-equation model of the cigarette advertising industry.
We use annual data for the period 1955 to 1967 to estimate the impact of cigarette advertising on aggregate demand for cigarettes and the impact of cigarette consumption on cigarette advertising. We begin with a model of the demand for cigarettes. We assume that the demand for cigarettes is given by:
where
qt = cigarettes consumed per person over age 15,
pct = retail price of cigarettes,
yt = real disposable personal income per capita (1958 dollars),
At = real advertising expenditures per individual over age 15 (1960 dollars), and
D64 = a dummy variable equal to 1 for the years 1964 through 1967 and zero otherwise.
We include the dummy variable for 1964 and later years to pick up the negative impact on cigarette sales of the 1964 report of the US Surgeon General’s Advisory Committee (1964), which announced that the government believed that there was enough evidence available to conclude that cigarette smoking causes cancer. We expect the signs of the parameters on the price of cigarettes and the dummy variable to be negative. We expect the signs of the parameters on income and advertising to be positive.
Next we turn to a model of the supply of advertising. We assume:
where:
pat = advertising price index, and
mt = gross profits as a percentage of gross sales.
The last variable needs a bit of explaining. The amount of advertising in the industry should be a function of the degree of competition in the industry. If the market were perfectly competitive, there would be no reason for any firm to advertise. If the firm were a monopoly, there also would be no reason to advertise. However, if the market is an oligopoly, then a firm would advertise in an effort to gain market share by differentiating its product from the products of its competitors.
The traditional measure of the degree of monopoly power that a firm has is the ratio of its marginal profits to its marginal cost:
where p is output price, mc is marginal cost, and m is the measure of monopoly power. Since we cannot observe the firms’ marginal costs, we approximate m by the ratio of gross profits to gross sales. We expect the degree of monopoly to have a non-linear impact on advertising expenditures.
The data used to estimate our two equations are listed in Table 5 and are available in the MS Excel file Cigarette sales and advertising data.xls. With the exception of disposable personal income, these data are from Schmalensee (1972: 273-290). The disposable personal income data are from the Department of Commerce (1975: Table F26, page 225).
Specification of the Model. Equations (18) and (19) are, as written, very general and need further specification before they can be estimated. We will assume that the two equations take a log-log form. In particular, we assume that we want to estimate:
and
| Year | Cigarettes Sold per Person Over Age 15 | Retail Price of Cigarettes | Real Advertising per Person Over Age 15 (1960 dollars) | Advertising Price Index | Degree of Monopoly (gross profits as % of gross sales) | Real Disposable Personal Income per Capita (1958 dollars) |
|---|---|---|---|---|---|---|
| 1955 | 3163.090 | 93.9693 | 0.96100 | 95.4775 | 18.595 | 1659 |
| 1956 | 3230.517 | 94.7049 | 1.09969 | 94.3800 | 19.207 | 1673 |
| 1957 | 3313.033 | 94.2535 | 1.22180 | 96.2125 | 20.165 | 1683 |
| 1958 | 3479.063 | 94.7712 | 1.40471 | 97.8300 | 21.736 | 1666 |
| 1959 | 3584.930 | 98.1779 | 1.45816 | 98.2800 | 22.042 | 1735 |
| 1960 | 3676.912 | 100.0000 | 1.37863 | 100.0000 | 22.04 | 1749 |
| 1961 | 3743.354 | 99.8677 | 1.31871 | 102.0400 | 22.465 | 1756 |
| 1962 | 3733.504 | 99.6761 | 1.35467 | 102.9725 | 22.226 | 1814 |
| 1963 | 3775.886 | 101.3630 | 1.51345 | 103.9525 | 22.848 | 1867 |
| 1964 | 3648.211 | 102.3110 | 1.73665 | 103.4775 | 23.168 | 1948 |
| 1965 | 3710.075 | 105.7510 | 1.59761 | 103.7225 | 23.598 | 2047 |
| 1966 | 3689.386 | 108.0450 | 1.71062 | 104.2200 | 25.085 | 2127 |
| 1967 | 3652.016 | 109.2490 | 1.71444 | 104.6125 | 26.310 | 2164 |
Answer the following six questions:
a) Which variables in the model are exogenous and which are endogenous?
b) Check and see if equations (18) and (19) are underidentified, exactly identified, or overidentified.
c) Estimate equations (21) and (22) using ordinary least squares.
d) Estimate equations (21) and (22) using two-stage least squares. Present the results in a table that, for comparison, also includes the results from the OLS estimation. Be sure to include the R² and the Durbin-Watson statistic.
e) Which side of the advertising-sales controversy do your results appear to support?
f) How well-specified does your model appear to be? Why?
Exercise 2
Demand and supply of commercial loans. We are interested in estimating the demand for commercial loans by business firms and the supply of commercial loans by banks. Table 6 presents monthly data from the U.S. commercial loan market for the period from January 1979 through December 1984; the data are also available in the MS Excel file Exercise 2.xls. Define:
Qt = total commercial loans (billions of dollars)
Rt = average prime rate charged by banks
RSt = 3-month Treasury bill rate (represents an alternative rate of return for banks)
RDt = Aaa corporate bond rate (represents the price of alternative financing to firms)
Xt = industrial production index (represents firms’ expectation about future economic activity)
yt = total bank deposits (billions of dollars) (represents a scale variable).
The demand and supply equations to be estimated, respectively, are as follows:
and
Questions
a) What are the endogenous and exogenous variables in this model?
b) Solve for the two “reduced form” equations of this model. Estimate these two equations using the data in Table 6.
c) Check the “order” condition for identification of each equation of the model.
d) Estimate equations (23) and (24) using ordinary least squares using the data in Table 6.
e) Estimate equations (23) and (24) using two-stage least squares. Report the results of the estimations for parts d) and e) in a single table. Be sure to include the t-ratios, R²s, and Durbin-Watson statistics for each of the equations estimated.
f) Perform the Hausman specification test on both equations.
g) When presenting this model, Maddala notes “[T]he model postulated here is not necessarily the right model for the problem of analyzing the commercial loan market.” Is there anything in the results reported above that suggests that the model may be mis-specified?
| N | Date | Q | R | RD | X | RS | y |
|---|---|---|---|---|---|---|---|
| 1 | January-79 | 251.8 | 11.75 | 9.25 | 150.8 | 9.35 | 994.3 |
| 2 | February-79 | 255.6 | 11.75 | 9.26 | 151.5 | 9.32 | 1002.5 |
| 3 | March-79 | 259.8 | 11.75 | 9.37 | 152.0 | 9.48 | 994.0 |
| 4 | April-79 | 264.7 | 11.75 | 9.38 | 153.0 | 9.46 | 997.4 |
| 5 | May-79 | 268.8 | 11.75 | 9.50 | 150.8 | 9.61 | 1013.2 |
| 6 | June-79 | 274.6 | 11.65 | 9.29 | 152.4 | 9.06 | 1015.6 |
| 7 | July-79 | 276.9 | 11.54 | 9.20 | 152.6 | 9.24 | 1012.3 |
| 8 | August-79 | 280.5 | 11.91 | 9.23 | 152.8 | 9.52 | 1020.9 |
| 9 | September-79 | 288.1 | 12.90 | 9.44 | 151.6 | 10.26 | 1043.6 |
| 10 | October-79 | 288.3 | 14.39 | 10.13 | 152.4 | 11.70 | 1062.6 |
| 11 | November-79 | 287.9 | 15.55 | 10.76 | 152.4 | 11.79 | 1058.5 |
| 12 | December-79 | 295.0 | 15.30 | 11.31 | 152.1 | 12.64 | 1076.3 |
| 13 | January-80 | 295.1 | 15.25 | 11.86 | 152.2 | 13.50 | 1063.1 |
| 14 | February-80 | 298.5 | 15.63 | 12.36 | 152.7 | 14.35 | 1070.0 |
| 15 | March-80 | 301.7 | 18.31 | 12.96 | 152.6 | 15.20 | 1073.5 |
| 16 | April-80 | 302.0 | 19.77 | 12.04 | 152.1 | 13.20 | 1101.1 |
| 17 | May-80 | 298.1 | 16.57 | 10.99 | 148.3 | 8.58 | 1097.1 |
| 18 | June-80 | 297.8 | 12.63 | 10.58 | 144.0 | 7.07 | 1088.7 |
| 19 | July-80 | 301.2 | 11.48 | 11.07 | 141.5 | 8.06 | 1099.9 |
| 20 | August-80 | 304.7 | 11.12 | 11.64 | 140.4 | 9.13 | 1111.1 |
| 21 | September-80 | 308.1 | 12.23 | 12.02 | 141.8 | 10.27 | 1122.2 |
| 22 | October-80 | 315.6 | 13.79 | 12.31 | 144.1 | 11.62 | 1161.4 |
| 23 | November-80 | 323.1 | 16.06 | 11.94 | 146.9 | 13.73 | 1200.6 |
| 24 | December-80 | 330.6 | 20.35 | 13.21 | 149.4 | 15.49 | 1239.9 |
| 25 | January-81 | 330.9 | 20.16 | 12.81 | 151.0 | 15.02 | 1223.5 |
| 26 | February-81 | 331.3 | 19.43 | 13.35 | 151.7 | 14.79 | 1207.1 |
| 27 | March-81 | 331.6 | 18.04 | 13.33 | 151.5 | 13.36 | 1190.6 |
| 28 | April-81 | 336.2 | 17.15 | 13.88 | 152.1 | 13.69 | 1206.0 |
| 29 | May-81 | 340.9 | 19.61 | 14.32 | 151.9 | 16.30 | 1221.4 |
| 30 | June-81 | 345.5 | 20.03 | 13.75 | 152.7 | 14.73 | 1236.7 |
| 31 | July-81 | 350.3 | 20.39 | 14.38 | 152.9 | 14.95 | 1221.5 |
| 32 | August-81 | 354.2 | 20.50 | 14.89 | 153.9 | 15.51 | 1250.3 |
| 33 | September-81 | 366.3 | 20.08 | 15.49 | 153.6 | 14.70 | 1293.7 |
| 34 | October-81 | 361.7 | 18.45 | 15.40 | 151.6 | 13.54 | 1224.6 |
| 35 | November-81 | 365.5 | 16.84 | 14.22 | 149.1 | 10.86 | 1254.1 |
| 36 | December-81 | 361.4 | 15.75 | 14.23 | 146.3 | 10.85 | 1288.7 |
| 37 | January-82 | 359.8 | 15.75 | 15.18 | 143.4 | 12.28 | 1251.5 |
| 38 | February-82 | 364.6 | 16.56 | 15.27 | 140.7 | 13.48 | 1258.3 |
| 39 | March-82 | 372.4 | 16.50 | 14.58 | 142.7 | 12.68 | 1295.0 |
| 40 | April-82 | 374.7 | 16.50 | 14.46 | 141.5 | 12.70 | 1272.1 |
| 41 | May-82 | 379.3 | 16.50 | 14.26 | 140.2 | 12.09 | 1286.1 |
| 42 | June-82 | 386.7 | 16.50 | 14.81 | 139.2 | 12.47 | 1325.8 |
| 43 | July-82 | 384.4 | 16.26 | 14.61 | 138.7 | 11.35 | 1307.3 |
| 44 | August-82 | 384.5 | 14.39 | 13.71 | 138.8 | 8.68 | 1321.7 |
| 45 | September-82 | 395.0 | 13.50 | 12.94 | 138.4 | 7.92 | 1335.5 |
| 46 | October-82 | 393.7 | 12.52 | 12.12 | 137.3 | 7.71 | 1345.2 |
| 47 | November-82 | 398.9 | 11.85 | 11.68 | 135.7 | 8.07 | 1358.1 |
| 48 | December-82 | 395.3 | 11.50 | 11.83 | 134.9 | 7.94 | 1409.7 |
| 49 | January-83 | 392.4 | 11.16 | 11.79 | 135.2 | 7.86 | 1385.4 |
| 50 | February-83 | 392.3 | 10.98 | 12.01 | 137.4 | 8.11 | 1412.6 |
| 51 | March-83 | 395.9 | 10.50 | 11.73 | 138.1 | 8.35 | 1419.5 |
| 52 | April-83 | 393.5 | 10.50 | 11.51 | 140.0 | 8.21 | 1411.0 |
| 53 | May-83 | 391.7 | 10.50 | 11.46 | 142.6 | 8.19 | 1413.1 |
| 54 | June-83 | 395.3 | 10.50 | 11.74 | 144.4 | 8.79 | 1443.8 |
| 55 | July-83 | 397.7 | 10.50 | 12.15 | 146.4 | 9.08 | 1438.1 |
| 56 | August-83 | 400.6 | 10.89 | 12.51 | 149.7 | 9.34 | 1461.4 |
| 57 | September-83 | 402.7 | 11.00 | 12.37 | 151.8 | 9.00 | 1448.9 |
| 58 | October-83 | 405.3 | 11.00 | 12.25 | 153.8 | 8.64 | 1459.0 |
| 59 | November-83 | 412.0 | 11.00 | 12.41 | 155.0 | 8.76 | 1499.4 |
| 60 | December-83 | 420.1 | 11.00 | 12.57 | 155.3 | 9.00 | 1508.9 |
| 61 | January-84 | 424.4 | 11.00 | 12.20 | 156.2 | 8.90 | 1504.1 |
| 62 | February-84 | 428.8 | 11.00 | 12.08 | 158.5 | 9.09 | 1499.3 |
| 63 | March-84 | 433.1 | 11.21 | 12.57 | 160.0 | 9.52 | 1494.5 |
| 64 | April-84 | 439.7 | 11.93 | 12.81 | 160.8 | 9.69 | 1501.5 |
| 65 | May-84 | 447.3 | 12.39 | 13.28 | 162.1 | 9.83 | 1541.3 |
| 66 | June-84 | 452.9 | 12.60 | 13.55 | 162.8 | 9.87 | 1532.9 |
| 67 | July-84 | 454.4 | 13.00 | 13.44 | 164.4 | 10.12 | 1535.5 |
| 68 | August-84 | 455.2 | 13.00 | 12.87 | 165.9 | 10.47 | 1539.0 |
| 69 | September-84 | 459.9 | 12.97 | 12.66 | 166.0 | 10.37 | 1549.9 |
| 70 | October-84 | 467.7 | 12.58 | 12.63 | 165.0 | 9.74 | 1578.9 |
| 71 | November-84 | 468.7 | 11.77 | 12.29 | 164.4 | 8.61 | 1578.2 |
| 72 | December-84 | 476.8 | 11.06 | 12.13 | 164.8 | 8.06 | 1631.2 |
References
Angrist, Joshua D. and Alan B. Krueger (2001). Instrumental Variables and the Search for Identification: From Supply and Demand to Natural Experiments. Journal of Economic Perspectives 15(4): 69–85.
Berndt, Ernst R. (1991). The Practice of Econometrics (Reading, MA: Addison-Wesley Publishing Company).
Greene, William H. (1990). Econometric Analysis (New York: Macmillan Publishing Company).
Murray, Michael P. (2006a). Avoiding Invalid Instruments and Coping with Weak Instruments. Journal of Economic Perspectives 20(4): 111-132.
Murray, Michael P. (2006b). Econometrics: A Modern Introduction. (Boston: Addison-Wesley): Chapter 13.
Schmalensee, Richard (1972). The Economics of Advertising (Amsterdam: North-Holland Publishing Company).
StataCorp (2003). Stata Statistical Software: Release 8 (College Station, TX: Stata Corporation): Volume 2: Reference G-M, pages 186-194.
Stock, James H. and Mark W. Watson (2003). Introduction to Econometrics (Boston, MA: Addison-Wesley): Chapter 10.
US Department of Commerce (1975). Historical Statistics of the United States: Colonial Times to 1970 (Washington: Government Printing Office).
US Surgeon General’s Advisory Committee (1964). Smoking and Health (Washington: Government Printing Office).





