At this point we are ready to begin the replication. Because it is easy to get lost in the process, I have created a list of steps that includes both instructions on what to do and questions you need to answer. As part of this exercise you will be asked to complete several tables of results. To make this easier, I have provided an MS Word file, Tables for ordered probit discussion.doc, that contains the tables to be completed.
1. Load the data in Stata from Excel.
2. Convert MSAT and VSAT to MSAT/100 and VSAT/100, respectively, using the commands:
.replace msat = msat/100
.replace vsat = vsat/100
3. Common sense dictates that we should calculate the means and standard deviations of the variables to be sure that there are no entry errors. We need to construct a table that compares the means and standard deviations reported in BFS with those in our dataset. Table 2, which has the means and standard deviations reported by BFS, gives a place to put the means and standard deviations for the variables in our dataset. Fill in the information missing from Table 2.
Table 2: Means and standard deviations of the data.
| Variable | Our data: Mean | Our data: Std. Dev. | Butler, et al.: Mean | Butler, et al.: Std. Dev. |
|---|---|---|---|---|
| msat | | | 6.25 | 0.60 |
| foreign | | | 0.11 | 0.32 |
| female | | | 0.39 | 0.49 |
| emecon | | | 0.34 | 0.48 |
| emoss | | | 0.17 | 0.38 |
| emns | | | 0.21 | 0.41 |
| emh | | | 0.07 | 0.25 |
| am1 | | | 0.49 | 0.50 |
| am2 | | | 0.45 | 0.50 |
| am3 | | | 0.01 | 0.11 |
| phy1 | | | 0.67 | 0.47 |
| phy2 | | | 0.02 | 0.14 |
| chem1 | | | 0.82 | 0.39 |
| chem2 | | | 0.12 | 0.32 |
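A minimal sketch of the check in step 3, assuming the data are already loaded and the variables carry the names used in Table 2 and in the oprobit command of step 4:

```stata
* Means and standard deviations for the fourteen variables listed in Table 2.
* Run this after rescaling msat in step 2 so that its mean is comparable
* with the 6.25 reported by Butler, et al.
summarize msat foreign female emecon emoss emns emh am1 am2 am3 phy1 phy2 chem1 chem2
```

The Mean and Std. Dev. columns of the output supply the "Our data" entries of Table 2.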
4. Estimate the ordered probit regression using (in Stata) the commands:
.global indvar msat foreign female emecon emoss emns emh am1 am2 am3 phy1 phy2 chem1 chem2
.oprobit highestmath $indvar
5. Use the result of this estimation to complete Table 3.
Table 3: Results of Stata ordered-probit regression.
| highestmath | Coef. | Std. Err. | z | P>z | [95% Conf. | Interval] |
|---|---|---|---|---|---|---|
| msat1 | | | | | | |
| foreign | | | | | | |
| female | | | | | | |
| emecon | | | | | | |
| emoss | | | | | | |
| emns | | | | | | |
| emh | | | | | | |
| am1 | | | | | | |
| am2 | | | | | | |
| am3 | | | | | | |
| phy1 | | | | | | |
| phy2 | | | | | | |
| chem1 | | | | | | |
| chem2 | | | | | | |
| | | | | | | |
| _cut1 | | | | | | |
| _cut2 | | | | | | |
| _cut3 | | | | | | |
| _cut4 | | | | | | |
| _cut5 | | | | | | |
| _cut6 | | | | | | |
| Observations | | | | | | |
| Log likelihood | | | | | | |
| LR χ2(14) | | | | | | |
| Prob > χ2 | | | | | | |
| Pseudo-R2 | | | | | | |
6. Compare your results with the table reported in the article. The table in the article is Table II on page 193 and is reproduced in Figure 3. What we are interested in is comparing column 4 in Figure 3 with columns 2 and 4 in Table 3. Table 4 below offers a model for this comparison.
Table 4: Comparison of ordered-probit estimations.
| Variable | Our estimates: Estimate | Our estimates: z | Butler, et al. estimates: Estimate | Butler, et al. estimates: t-value |
|---|---|---|---|---|
| msat1 | | | 0.05 | 6.12 |
| foreign | | | 0.02 | 0.14 |
| female | | | 0.25 | 2.59 |
| emecon | | | -0.11 | 0.86 |
| emoss | | | -0.29 | 1.99 |
| emns | | | 0.43 | 3.10 |
| emh | | | -0.37 | 1.78 |
| am1 | | | 0.24 | 1.07 |
| am2 | | | 0.93 | 4.04 |
| am3 | | | 0.77 | 1.70 |
| phy1 | | | 0.26 | 2.71 |
| phy2 | | | 0.38 | 1.07 |
| chem1 | | | -0.12 | 0.69 |
| chem2 | | | 0.17 | 0.75 |
| Intercept | | | -3.09 | 5.48 |
| _cut1 | | | 0.27 | 7.29 |
| _cut2 | | | 0.33 | 8.16 |
| _cut3 | | | 1.52 | 20.32 |
| _cut4 | | | 1.79 | 23.07 |
| _cut5 | | | 2.04 | 23.72 |
| _cut6 | | | | |
7. It is easy to see from Table 4 that, almost without exception, the estimates of the parameters and their t-ratios are very similar. The exception arises with the estimates of the truncation points (_cut# in the Stata results). We will have to figure out what these are estimates of in order to make sense of them. Figure 1 shows the "cutoffs" that are being estimated. Footnote c in the BFS Table II on page 193 (shown in Figure 3) offers a useful observation:
In an ordered probit, an underlying, normally distributed, latent variable has a mean which is a function of observable variables. The latent variable gives rise to a set of observed dummy variables for ordered categories based on ranges between unobserved but estimable truncation points which correspond to levels of effort, ability, or other factors reflected in the explanatory variables. If L categories are observed, there are $L-1$ truncation points, of which the first is normalized to be zero, so that $L-2$ truncation points are estimated and reported in the table. The values correspond to standard deviations of the latent normally distributed variable.
The key idea is that the values of cutoffs are relative and can be normalized around any value. Notice that the Stata results do not report an intercept term but do report six cutoff values. Moreover, the difference between the estimate by Stata for the first cutoff (3.08402) and the estimate for the second cutoff (3.356916) is equal to 0.272896, which is itself equal to the first truncation point reported by BFS (1998: 193). Use Table 5 to report the difference between the first cutoff value and each of the cutoff points reported by Stata.
Table 5: Reconciling Stata estimates of cutoff points with Butler, et al.'s truncation points.
| Cutoff | Estimate | Estimate - _cut1 | BFS Truncation Points |
|---|---|---|---|
| _cut1 | 3.0840 | | |
| _cut2 | 3.3569 | | 0.27 |
| _cut3 | 3.4146 | | 0.33 |
| _cut4 | 4.6013 | | 1.52 |
| _cut5 | 4.8774 | | 1.79 |
| _cut6 | 5.1202 | | 2.04 |
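The third column of Table 5 can be filled in by hand from the Estimate column, or with a short loop such as the sketch below. It assumes the oprobit results from step 4 are still in memory and that the six cutoffs sit in columns 15 to 20 of the coefficient vector, after the 14 slope coefficients; that layout is an assumption, so list the matrix first to confirm it. The names b and cutdiff are illustrative only.

```stata
* Run right after the oprobit command in step 4.
matrix b = e(b)
matrix list b          // confirm that columns 15-20 hold the six cutoffs
forvalues i = 2/6 {
    local k = 14 + `i'
    * difference between the i-th cutoff and the first cutoff
    scalar cutdiff = b[1,`k'] - b[1,15]
    display "_cut`i' - _cut1 = " %8.4f cutdiff
}
```

Indexing the matrix by position avoids relying on how a particular Stata release labels the cutoff parameters.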
The second part of the reconciliation of the two sets of results is to compute the t-ratios. To do this we need to compute the standard deviation of the estimates of the cutoff points reported by Stata, which requires retrieving the variance-covariance matrix from the regression. First, let's see what we are interested in computing. Let $\hat{\beta}_i$ be the estimate of the ith cutoff point. In column 3 of Table 5 you computed $\hat{\alpha}_i = \hat{\beta}_i - \hat{\beta}_1$ for $i = 2, \ldots, 6$.
The variance of the new variable is:
$$V(\hat{\alpha}_i) = V(\hat{\beta}_i) - 2\,\mathrm{Cov}(\hat{\beta}_i, \hat{\beta}_1) + V(\hat{\beta}_1) = \sigma_i^2 - 2\sigma_{i1} + \sigma_1^2 \qquad (5)$$
The variance-covariance matrix will give us estimates of these variances and covariances. When there are k parameters in a regression equation, this matrix is defined to be:
$$\hat{\Sigma} = \begin{bmatrix} \hat{\sigma}^2_{\beta_1} & \hat{\sigma}_{\beta_1 \beta_2} & \cdots & \hat{\sigma}_{\beta_1 \beta_k} \\ \hat{\sigma}_{\beta_2 \beta_1} & \hat{\sigma}^2_{\beta_2} & \cdots & \hat{\sigma}_{\beta_2 \beta_k} \\ \vdots & \vdots & \ddots & \vdots \\ \hat{\sigma}_{\beta_k \beta_1} & \hat{\sigma}_{\beta_k \beta_2} & \cdots & \hat{\sigma}^2_{\beta_k} \end{bmatrix}.$$
If you type the command .vce, Stata will report $\hat{\Sigma}$, as shown in Figure 4. We need the section of this matrix shown in Part A of Table 6. Use equation (5) to estimate the standard errors of the estimates of the cutoff points (measured relative to _cut1), complete Part B of Table 6, and compare the t-ratios with the values reported by Butler, et al. (shown in the last column of Part B of Table 6). Are you satisfied that we have been able to come reasonably close to the results reported in the article?
Table 6: Calculation of the t-ratios for the cutoff estimates.
Part A. Relevant portion of the variance-covariance matrix.
| | _cut1 | _cut2 | _cut3 | _cut4 | _cut5 | _cut6 |
|---|---|---|---|---|---|---|
| _cut1 | 0.329 | | | | | |
| _cut2 | 0.329 | 0.330 | | | | |
| _cut3 | 0.329 | 0.330 | 0.331 | | | |
| _cut4 | 0.332 | 0.333 | 0.334 | 0.341 | | |
| _cut5 | 0.333 | 0.334 | 0.334 | 0.341 | 0.343 | |
| _cut6 | 0.333 | 0.334 | 0.335 | 0.342 | 0.343 | 0.345 |
Part B. Calculation of the t-ratios (with comparison of values reported in BFS).
| | $V(\hat{\alpha}_i)$ | St. Dev.($\hat{\alpha}_i$) | t-ratio | BFS t-ratio |
|---|---|---|---|---|
| _cut2 | | | | 7.29 |
| _cut3 | | | | 8.16 |
| _cut4 | | | | 20.32 |
| _cut5 | | | | 23.07 |
| _cut6 | | | | 23.72 |
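Equation (5) can also be evaluated directly from the stored estimation results rather than from the rounded entries in Part A. The sketch below is one way to do it, assuming the oprobit results are still the active estimates and that the cutoffs occupy positions 15 to 20 of e(b) and e(V), after the 14 slope coefficients. That ordering is an assumption to verify with matrix list, and the names b, V, a_hat, and a_var are illustrative.

```stata
* Run after the oprobit command and before any other estimation command.
matrix b = e(b)
matrix V = e(V)
forvalues i = 2/6 {
    local k = 14 + `i'
    * alpha-hat_i = cutoff i minus cutoff 1
    scalar a_hat = b[1,`k'] - b[1,15]
    * equation (5): V(b_i) - 2*Cov(b_i, b_1) + V(b_1)
    scalar a_var = V[`k',`k'] - 2*V[`k',15] + V[15,15]
    display "_cut`i': V = " %9.6f a_var "  St. Dev. = " %8.5f sqrt(a_var) "  t = " %6.2f a_hat/sqrt(a_var)
}
```

Because the entries in Part A are rounded to three decimals, hand calculations from the table may differ noticeably from the values this loop prints.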
8. The next step in the process is to generate the term we will use in the estimation of the grade regression to account for the potential sample selection bias. To do this we will need to find a reference in the literature that offers a clear description of what we need to do. As it turns out, a reasonable explanation of the appropriate estimation technique is available in Jimenez and Kugler (1987). Since much of what follows comes directly from this article, I highly recommend you read it yourself.
The gist of the method suggests that the potential sample bias is accounted for by an inverse Mills ratio for each of the categories. What we need to do is calculate:
$$\hat{\lambda}_i = \frac{\phi(\hat{\mu}_j - \hat{z}_i^*) - \phi(\hat{\mu}_{j+1} - \hat{z}_i^*)}{\Phi(\hat{\mu}_{j+1} - \hat{z}_i^*) - \Phi(\hat{\mu}_j - \hat{z}_i^*)} \qquad (6)$$
for the category that the individual actually is in. What we will do is calculate (6) for all of the categories and then sum the product of this number and a dummy variable indicating if a course is the highest math class completed by an individual. Since the dummy variables will equal 0 for math categories an individual is not in, the resulting sum will preserve the value of (6) that is associated with the category the individual does belong to.
It is clear from (6) that we will need to retain the 6 cutoffs. We can do this with the commands:
. generate cutoff1 = _b[_cut1]
. generate cutoff2 = _b[_cut2]
. generate cutoff3 = _b[_cut3]
. generate cutoff4 = _b[_cut4]
. generate cutoff5 = _b[_cut5]
. generate cutoff6 = _b[_cut6]
Technically, this step is not necessary since the parameter estimates are preserved until the next regression is estimated; I suggest doing this purely as a precaution.
9. Preserve the predicted values of the ordered probit using the commands:
. predict zhat, xb
. predict phat1 phat2 phat3 phat4 phat5 phat6 phat7, p
These two commands generate, for each observation, the linear prediction zhat (the estimated mean of the latent variable) and the probability that the individual falls in each category. To see what is going on we will retrieve some representative values of these variables and then graph them for one individual. Table 7 reports these values for 14 individuals in the sample. Now consider individual 2. Fitting a normal distribution with a mean of 4.25 and using the cutoff values from our estimation yields the probabilities that the individual is in each of the categories. For example, the probability that individual 2 will have completed no math classes is about 0.12 (Pr(0) = 0.1217 in Table 7). Figure 5 illustrates the results for this individual. The dashed vertical lines are the six cutoff values. The solid vertical line is the zhat for the individual. The heavy blue line represents the normal probability density function for this individual. While there is, of course, a different probability distribution for each individual, the cutoff values are the same for all members of the sample.
Table 7: Predicted values of the ordered probit regression.
| Observation | Highest Math Class | zhat | Pr(0) | Pr(1) | Pr(2) | Pr(3) | Pr(4) | Pr(5) | Pr(6) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 3 | 3.9657 | 0.1890 | 0.0824 | 0.0194 | 0.4467 | 0.0816 | 0.0568 | 0.1241 |
| 2 | 0 | 4.2507 | 0.1217 | 0.0640 | 0.0158 | 0.4355 | 0.0975 | 0.0731 | 0.1923 |
| 165 | 0 | 3.5982 | 0.3036 | 0.1011 | 0.0225 | 0.4149 | 0.0575 | 0.0364 | 0.0640 |
| 166 | 6 | 4.6914 | 0.0540 | 0.0370 | 0.0098 | 0.3633 | 0.1097 | 0.0922 | 0.3340 |
| 214 | 3 | 3.4533 | 0.3560 | 0.1056 | 0.0229 | 0.3900 | 0.0483 | 0.0294 | 0.0478 |
| 215 | 3 | 4.0840 | 0.1587 | 0.0749 | 0.0180 | 0.4459 | 0.0887 | 0.0637 | 0.1501 |
| 225 | 3 | 3.5250 | 0.3296 | 0.1036 | 0.0228 | 0.4031 | 0.0528 | 0.0328 | 0.0553 |
| 226 | 3 | 3.6990 | 0.2693 | 0.0969 | 0.0219 | 0.4285 | 0.0641 | 0.0417 | 0.0776 |
| 453 | 3 | 3.9713 | 0.1875 | 0.0820 | 0.0194 | 0.4468 | 0.0819 | 0.0571 | 0.1253 |
| 454 | 5 | 4.1650 | 0.1399 | 0.0697 | 0.0170 | 0.4422 | 0.0932 | 0.0684 | 0.1697 |
| 495 | 3 | 4.4168 | 0.0913 | 0.0533 | 0.0135 | 0.4151 | 0.1043 | 0.0816 | 0.2409 |
| 496 | 0 | 2.9811 | 0.5410 | 0.1055 | 0.0212 | 0.2797 | 0.0236 | 0.0127 | 0.0162 |
| 526 | 0 | 2.9247 | 0.5633 | 0.1039 | 0.0207 | 0.2653 | 0.0214 | 0.0114 | 0.0141 |
| 527 | 3 | 3.9757 | 0.1863 | 0.0817 | 0.0193 | 0.4469 | 0.0822 | 0.0574 | 0.1262 |
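The link between zhat, the cutoffs, and the predicted probabilities can be checked by hand for a single row of Table 7. The sketch below does this for individual 2, using the cutoff estimates listed in Table 5 and the document's norm() function for the standard normal cumulative distribution (newer Stata releases document the same function as normal()).

```stata
* Individual 2 in Table 7: zhat = 4.2507

* Pr(0): the latent variable falls below _cut1
display norm(3.0840 - 4.2507)

* Pr(3): the latent variable falls between _cut3 and _cut4
display norm(4.6013 - 4.2507) - norm(3.4146 - 4.2507)

* Pr(6): the latent variable falls above _cut6
display 1 - norm(5.1202 - 4.2507)
```

Up to rounding, the three results should match the Pr(0), Pr(3), and Pr(6) entries for individual 2 in Table 7 (0.1217, 0.4355, and 0.1923).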
Now we are ready to calculate (6). The commands are:
.generate lambda0 = (-normden(cutoff1-zhat))/(norm(cutoff1-zhat))
.generate lambda1 = (normden(cutoff1-zhat)-normden(cutoff2-zhat))/(norm(cutoff2-zhat)-norm(cutoff1-zhat))
.generate lambda2 = (normden(cutoff2-zhat)-normden(cutoff3-zhat))/(norm(cutoff3-zhat)-norm(cutoff2-zhat))
.generate lambda3 = (normden(cutoff3-zhat)-normden(cutoff4-zhat))/(norm(cutoff4-zhat)-norm(cutoff3-zhat))
.generate lambda4 = (normden(cutoff4-zhat)-normden(cutoff5-zhat))/(norm(cutoff5-zhat)-norm(cutoff4-zhat))
.generate lambda5 = (normden(cutoff5-zhat)-normden(cutoff6-zhat))/(norm(cutoff6-zhat)-norm(cutoff5-zhat))
.generate lambda6 = (normden(cutoff6-zhat))/(1-norm(cutoff6-zhat))
.generate lambda = m170*lambda0 + m171a*lambda1 + m172a*lambda2 + m171b*lambda3 + m172b*lambda4 + m221a*lambda5+m221b*lambda6
One thing to notice in these calculations is that cutoff0 is assumed to be $-\infty$ and cutoff7 is assumed to be $\infty$.
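Those two limits determine the form of (6) in the lowest and highest categories. Because $\phi(\pm\infty) = 0$, $\Phi(-\infty) = 0$, and $\Phi(\infty) = 1$, the expressions reduce as sketched below, where $\hat{\mu}_1$ and $\hat{\mu}_6$ denote the first and sixth cutoffs and the superscripts (0) and (6) are labels introduced here only to mark the category:

```latex
\hat{\lambda}_i^{(0)} = \frac{0 - \phi(\hat{\mu}_1 - \hat{z}_i^*)}{\Phi(\hat{\mu}_1 - \hat{z}_i^*) - 0}
                      = \frac{-\phi(\hat{\mu}_1 - \hat{z}_i^*)}{\Phi(\hat{\mu}_1 - \hat{z}_i^*)},
\qquad
\hat{\lambda}_i^{(6)} = \frac{\phi(\hat{\mu}_6 - \hat{z}_i^*) - 0}{1 - \Phi(\hat{\mu}_6 - \hat{z}_i^*)}
                      = \frac{\phi(\hat{\mu}_6 - \hat{z}_i^*)}{1 - \Phi(\hat{\mu}_6 - \hat{z}_i^*)}.
```

The commands that generate lambda0 and lambda6 should have exactly these forms, with the cutoff variables and zhat in place of $\hat{\mu}_j$ and $\hat{z}_i^*$.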
10. Now we are ready to estimate our regression explaining the grade that each individual received in intermediate microeconomics. Use Table 8 to report the regression results for four specifications of the model. The first question is whether the null hypothesis of no sample selection bias can be rejected. How does your conclusion compare with BFS's conclusions? (See Table 9.) Second, since many of the potential explanatory variables, such as class size and SAT scores, do not seem to be statistically significant, it is reasonable to focus our comments on the results reported in column (4) of Table 8.
What can you conclude about the impact of calculus on how well a student will do in intermediate microeconomics? Do the final grades earned in a majority of the math classes affect the grade earned in intermediate microeconomics? Do the grades earned in any of the math classes positively and significantly affect the grade earned in intermediate microeconomics? Can you explain the impact of the freshman GPA on the grade earned in intermediate microeconomics? What, if anything, is your bottom-line conclusion about what matters in determining the grades earned in intermediate microeconomics?
Table 8: Determinants of Final Grade in Intermediate Microeconomics. Robust t-ratios are in parentheses.
| Explanatory variables | Model (1) | Model (2) | Model (3) | Model (4) |
|---|---|---|---|---|
| Lambda | | | — | — |
| | | | | |
| Sophomore | | — | | — |
| | | | | |
| Senior | | — | | — |
| | | | | |
| Same | | | | |
| | | | | |
| Skip | | — | | — |
| | | | | |
| M171a | | | | |
| | | | | |
| M172a | | | | |
| | | | | |
| M171b | | | | |
| | | | | |
| M172b | | | | |
| | | | | |
| M221a | | | | |
| | | | | |
| M221b | | | | |
| | | | | |
| GE100 | | | | |
| | | | | |
| GDE100 | | | | |
| | | | | |
| GE101 | | | | |
| | | | | |
| GDE101 | | | | — |
| | | | | |
| GDE231 | | | | |
| | | | | |
| Size | | | | — |
| | | | | |
| FGPA | | | | |
| | | | | |
| Female | | | | |
| | | | | |
| MSAT | | | | — |
| | | | | |
| VSAT | | | | — |
| | | | | |
| Grade in highest Math class | — | | — | — |
| | | | | |
| GM170 | | — | | |
| | | | | |
| GM171a | | — | | |
| | | | | |
| GM172a | | — | | |
| | | | | |
| GM171b | | — | | |
| | | | | |
| GM172b | | — | | |
| | | | | |
| GM221a | | — | | |
| | | | | |
| GM221b | | — | | |
| | | | | |
| Intercept | | | | |
| | | | | |
| F(28, 580) | | — | — | — |
| Prob > F | | — | — | — |
| F(27, 581) | — | — | | — |
| Prob > F | — | — | | — |
| F(20, 588) | — | — | — | |
| Prob > F | — | — | — | |
| F(19, 589) | — | | — | — |
| Prob > F | — | | — | — |
| R-Squared | | | | |
| Root MSE | | | | |
| Sample Size | 609 | 609 | 609 | 609 |
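As a sketch of how Model (1) might be estimated and the selection term tested, the do-file fragment below uses hypothetical variable names. Here grade stands for the final grade in intermediate microeconomics, and the regressors are lower-case versions of the Table 8 row labels; neither is confirmed by the text, so substitute the names in your dataset. The 28 regressors correspond to the F(28, 580) row of the table.

```stata
* Hypothetical names: replace 'grade' and the regressors with the names
* in your own dataset. Run from a do-file so that /// continues the line.
global model1 lambda sophomore senior same skip ///
    m171a m172a m171b m172b m221a m221b ///
    ge100 gde100 ge101 gde101 gde231 size fgpa female msat vsat ///
    gm170 gm171a gm172a gm171b gm172b gm221a gm221b

* OLS with heteroskedasticity-robust standard errors, as Table 8 calls for
regress grade $model1, robust

* Test of the hypothesis that the coefficient on lambda is zero
test lambda
```

Dropping the variables marked with a dash in a given column of Table 8 from the list gives the other three specifications.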
Table 9: Results reported in BFS (p. 195).
| Variable^a | Expected sign | MICRO-2: Mean (SD) | MICRO-2: Coefficient (t-value) |
|---|---|---|---|
| Intercept | — | — | -1.64 (3.48) |
| Selection bias correction (Predicted residual) | + | -0.00 (0.92) | 0.10 (1.29) |
| Level of calculus attained: | | | |
| Math 171A | + | 0.08 (0.27) | 0.39 (1.04) |
| Math 172A | + | 0.02 (0.13) | -0.18 (0.21) |
| Math 171B | + | 0.37 (0.48) | 1.02^b (3.49) |
| Math 172B | + | 0.07 (0.25) | 1.52^b (3.53) |
| Math 221A | + | 0.05 (0.22) | 1.33^c (2.27) |
| Math 221B or 222 | + | 0.14 (0.35) | 0.75^c (1.67) |
| Grade in last calculus course: | | | |
| Math 170 | + | 3.06 (0.70) | 0.36^b (4.36) |
| Math 171A | + | 2.22 (0.86) | 0.26^c (2.21) |
| Math 172A | + | 2.94 (0.80) | 0.42 (1.54) |
| Math 171B | + | 2.62 (0.93) | 0.10^c (1.85) |
| Math 172B | + | 2.63 (0.90) | -0.01 (0.10) |
| Math 221A | + | 3.10 (0.77) | -0.09 (0.55) |
| Math 221B or 222 | + | 3.15 (0.76) | 0.11 (1.04) |
| Grade deflator of instructor in intermediate theory course | + | -0.16 (0.27) | 0.88^b (8.28) |
| Taken in Sophomore year | ? | 0.32 (0.47) | 0.07 (0.94) |
| Taken in Senior year | - | 0.06 (0.24) | -0.02 (0.13) |
| MICRO-1 and MICRO-2 in same academic year | + | 0.35 (0.48) | 0.04 (0.46) |
| At least one semester between MICRO-1 and MICRO-2 | - | 0.27 (0.44) | 0.13 (1.85) |
| Grade in MACRO-1 | + | 2.73 (0.73) | 0.20^b (3.93) |
| Grade in MICRO-1 | + | 2.67 (0.74) | 0.29^b (5.93) |
| Instructor's grade deflator: | | | |
| MACRO-1 | - | -0.32 (0.20) | -0.33^c (2.20) |
| MICRO-1 | - | -0.29 (0.16) | -0.11 (0.53) |
| Class size (intermediate theory course) | ? | 28.2 (5.5) | -0.002 (0.45) |
| Freshman Grade Point Average | + | 2.79 (0.46) | 0.29^b (3.04) |
| Sex (female = 1; male = 0) | ? | 0.39 (0.49) | 0.13^c (2.09) |
| SAT-Math score × 10^-2 | + | 6.25 (0.60) | 0.12^c (1.75) |
| SAT-Verbal score × 10^-2 | + | 5.56 (0.67) | 0.04 (0.78) |
| OVERALL RESULTS | | | |
| Mean (SD) of dependent variable | | | |
| Adjusted R2 | | 0.44 | |
| Number of observations | | 609 | |
a Omitted reference groups in MICRO-2 regression: attained Math 170; took MICRO-2 in Junior year; took MICRO-1 in spring, MICRO-2 next fall.
b Significant at 0.01 level, one- or two-tailed test as appropriate.
c Significant at 0.05 level, one- or two-tailed test as appropriate.