
Replication of econometric studies

Module by: Christopher Curran

Summary: This module leads the advanced undergraduate student through the process of replicating the econometrics in a published study.

Replication

Introduction

One of the most important first steps in scientific research is to replicate the results of earlier work. For a variety of reasons (most of them practical rather than theoretically sound), economists generally do not undertake this step. Instead, they tend to report the results of earlier papers and compare their own results with them, without asking whether the earlier results were reported accurately. In a world of honest, careful researchers, omitting this step might seem to be a minor problem. However, there is enough casual evidence to suggest that a large portion of the econometric results reported in the journals cannot be replicated because the original researcher (1) no longer has the data set used in the research because it has been lost, (2) cannot share the data set because it is proprietary, (3) is unwilling to share the data set because there are other issues he or she wishes to investigate using it, or (4) is simply unwilling to share the data set. For this reason much of the published econometrics research has never been replicated.

In recognition of this problem, several journals, such as the Journal of Applied Econometrics, now require that authors submit the data set they used to the journal so that it can be posted on the web for use by any other researcher. Whether this effort has been successful will not be clear unless someone undertakes to replicate the work in this journal to see whether all of the data necessary to replicate an article have been posted and whether the regressions included in the article can actually be replicated. It is very unlikely that anyone would undertake such an effort, given that no journal will publish results that are merely a replication of previously published articles.

In this module we explore some of the difficulties of replicating existing research by attempting to replicate some of the results reported in the Butler, Finegan, and Siegfried (1998) (BFS, hereafter) article analyzing the effect of a student's calculus background on the grade he or she earns in intermediate microeconomics or in intermediate macroeconomics.¹ The goals of this module are to (1) help students learn how to read in detail an article that appears in a typical economics trade journal, (2) introduce them to ordered probit, an advanced econometrics tool, and (3) teach them how to present and discuss the results of an estimation of a model in an economics paper. While most of the discussion in this module focuses on using Stata in this replication, you can use almost any econometrics program you are comfortable with to replicate some of the results reported in the BFS article.

Butler, Finegan, and Siegfried (1998).

The obvious first step is to find and print a copy of the article by Butler, Finegan, and Siegfried. In fact, do not read any further in this module until you have read the article. We will discuss in class what the authors do in the paper and how clearly they present their conclusions. In this first pass at the article, pay attention to how convincing you find their arguments. Since everyone in the class has completed an intermediate microeconomics course, your discussion of their conclusions should reflect your own experiences. You also need to be able to discuss in class the estimation strategy they use in the paper. In particular, you will need to be able to identify the source of the data and the equations they estimated. Also, try to determine how the estimations in the "first" stage are used in the estimations of the "second" stage. Why did the authors use a two-stage estimation strategy?

Also, what do you think the authors mean by the following description of their estimation methods?

Estimation Methods and Expectations
To cope with the selection bias problem, we use a two-stage estimation procedure. The first stage employs an ordered probit model to predict the highest level of calculus attained by each student prior to taking each intermediate economic theory course.... In the second stage, the student's grade in MICRO-2 ... (the `outcome') is regressed on the actual level of calculus attained, the grade earned in that calculus course, the predicted residual in the grade equation that we would expect on the basis of the actual level of calculus attained, and a roster of control variables reflecting ability and motivation. Individuals are the unit of observation. Ordinary least squares estimation is used because there are twelve categories of grades which are commonly interpreted as cardinal measures of performance (as is implied by the calculation of `grade point averages'). (Butler, Finegan, and Siegfried, 1998: 188)

The ordered-probit model

In what follows you are to “replicate” the equations the authors estimate in the paper for the intermediate microeconomics course. In order to complete this assignment you will need to figure out several things including (1) what an ordered-probit model is and (2) how to use Stata to estimate an ordered-probit model. In this section of the module we introduce the ordered-probit model. I strongly encourage you to consult Greene (1990: 703-706) for an excellent and clear discussion of the ordered-probit model. The discussion here follows Greene closely.

It is common for surveys to have questions that require the respondent to choose one of several categories that have an innate order to them. For instance, most course evaluations ask the respondent to choose an answer that reflects their agreement with a statement about the course. The question might read, "The Professor was interested in the material taught in the class," where the student completing the evaluation chooses a number from 1 to 9, with 1 indicating complete disagreement with the statement and 9 indicating complete agreement. Thus, there is an order to the potential answers. Using a logit, probit, or multinomial logit model would completely ignore this order. A linear regression is inappropriate because OLS treats the difference between answers of 1 and 2 as being the same as the difference between a 7 and an 8, when in fact the numbers only provide a ranking.

Consider a latent variable, $y^*$, that is not observed but where $y^* = \boldsymbol{\beta}'\mathbf{x} + \varepsilon$. We want to estimate the $\beta_k$'s in the vector $\boldsymbol{\beta} = (\beta_0, \beta_1, \ldots, \beta_K)$.² We may not observe $y^*$ but we do observe:

$$
y = \begin{cases}
0 & \text{if } y^* < 0, \\
1 & \text{if } 0 \le y^* < \mu_1, \\
2 & \text{if } \mu_1 \le y^* < \mu_2, \\
\vdots & \\
J & \text{if } \mu_{J-1} \le y^*.
\end{cases}
\tag{1}
$$

The $\mu_i$'s in (1) are parameters that must be estimated along with $\boldsymbol{\beta}$. As usual, we assume that the error term $\varepsilon$ is normally distributed (with its mean and variance normalized to 0 and 1, respectively). It is trivial to estimate the model with the error terms having a logistic distribution, but this change in assumptions appears to make virtually no difference in practice.³ With the normal distribution, we have:

$$
\begin{aligned}
\Pr(y=0) &= \Phi(-\boldsymbol{\beta}'\mathbf{x}), \\
\Pr(y=1) &= \Phi(\mu_1 - \boldsymbol{\beta}'\mathbf{x}) - \Phi(-\boldsymbol{\beta}'\mathbf{x}), \\
\Pr(y=2) &= \Phi(\mu_2 - \boldsymbol{\beta}'\mathbf{x}) - \Phi(\mu_1 - \boldsymbol{\beta}'\mathbf{x}), \\
&\;\;\vdots \\
\Pr(y=J) &= 1 - \Phi(\mu_{J-1} - \boldsymbol{\beta}'\mathbf{x}),
\end{aligned}
\tag{2}
$$

where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal. In order for all of the probabilities to be positive, we need $0 < \mu_1 < \mu_2 < \cdots < \mu_{J-1}$, as shown in Figure 1. One thing to note in Figure 1 is that the location of each cutoff relative to the error term ($-\boldsymbol{\beta}'\mathbf{x}$, $\mu_1 - \boldsymbol{\beta}'\mathbf{x}$, and so on) changes when the values of the explanatory variables change.

Figure 1: Distribution of the error term in the ordered-probit model.
This is the graph of the distribution of the error term in the ordered-probit model.
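
The probabilities in equation (2) are easy to evaluate numerically. The following is a minimal Python sketch, not part of the original module; the function name `ordered_probit_probs` and the values of the index and the cutoffs are assumptions chosen only for illustration.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF, the Phi(.) of equation (2)


def ordered_probit_probs(xb, mus):
    """Return [Pr(y=0), ..., Pr(y=J)] for a given index xb = beta'x.

    `mus` holds the cutoffs mu_1 < ... < mu_{J-1}; the first cutoff is
    normalized to 0, as in equation (1).
    """
    cutoffs = [0.0] + list(mus)
    # Cumulative probabilities Pr(y <= j) = Phi(cutoff_j - beta'x), ending at 1.
    cum = [PHI(c - xb) for c in cutoffs] + [1.0]
    # Each category probability is the difference of adjacent cumulatives.
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]


# Hypothetical values, for illustration only (four categories, J = 3).
probs = ordered_probit_probs(xb=0.5, mus=[1.0, 2.0])
print([round(p, 4) for p in probs], round(sum(probs), 10))
```

Because each probability is a difference of adjacent values of the cumulative distribution, the probabilities are positive only when the cutoffs are increasing, and they always sum to one.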

The estimation strategy from here follows the usual maximum likelihood method. The computer program forms the likelihood function and then chooses the values of the parameters (including the cutoffs) that maximize this likelihood function.
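
To make the likelihood function concrete, here is a small Python sketch for a three-category model. It only evaluates the log-likelihood at two candidate parameter vectors; it is not the optimizer Stata uses. The data are made up for illustration and are not the BFS data, and the function names are assumptions.

```python
from math import log
from statistics import NormalDist

PHI = NormalDist().cdf


def category_prob(y, xb, cutoffs):
    """Pr(y = j) from equation (2); cutoffs = [0, mu_1, ..., mu_{J-1}]."""
    upper = 1.0 if y == len(cutoffs) else PHI(cutoffs[y] - xb)
    lower = 0.0 if y == 0 else PHI(cutoffs[y - 1] - xb)
    return upper - lower


def log_likelihood(beta0, beta1, mu, data):
    """Sum of log Pr(y_i | x_i) for a three-category model."""
    cutoffs = [0.0, mu]
    return sum(log(category_prob(y, beta0 + beta1 * x, cutoffs)) for x, y in data)


# Made-up (x, y) pairs, for illustration only; not the BFS data.
toy = [(-1.0, 0), (-0.5, 0), (0.0, 1), (0.5, 1), (1.0, 2), (1.5, 2)]

ll_flat = log_likelihood(0.0, 0.0, 1.0, toy)   # x has no effect
ll_slope = log_likelihood(0.0, 1.0, 1.0, toy)  # y* rises with x
print(ll_flat, ll_slope)
```

In these toy data higher values of x go with higher categories, so the parameter vector with a positive slope yields the larger log-likelihood. A maximum likelihood routine searches over the slope, intercept, and cutoffs for the values that make this sum as large as possible.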

The estimated coefficients are not equal to the marginal effects of a change in one of the explanatory variables (as is also true with the logit and probit models). Consider the simple example Greene (1990, 704) describes. Assume that there are three categories. Then (2) becomes:

$$
\begin{aligned}
\Pr(y=0) &= 1 - \Phi(\boldsymbol{\beta}'\mathbf{x}), \\
\Pr(y=1) &= \Phi(\mu - \boldsymbol{\beta}'\mathbf{x}) - \Phi(-\boldsymbol{\beta}'\mathbf{x}), \\
\Pr(y=2) &= 1 - \Phi(\mu - \boldsymbol{\beta}'\mathbf{x}).
\end{aligned}
\tag{3}
$$

Figure 2 shows this situation. The solid curve shows the distribution of $y^*$, which determines $y$. Increasing one of the x's while holding $\boldsymbol{\beta}$ constant (that is, changing $\hat{\boldsymbol{\beta}}'\mathbf{x}_0$ to $\hat{\boldsymbol{\beta}}'\mathbf{x}_1$) is the same as shifting the entire distribution of $y^*$ to the right with $\hat{\mu}$ remaining constant. As a result the probabilities that y takes on the values of 0, 1, and 2 change.
Clearly, as shown in Figure 2, $\Pr(y=0)$ decreases and $\Pr(y=2)$ increases. $\Pr(y=1)$, on the other hand, may increase or decrease and, thus, the effect of an increase in one of the explanatory variables on it is ambiguous. It is easy to show this result algebraically. The marginal effects for the 3 probabilities in (3) are, assuming $\boldsymbol{\beta} > 0$:

$$
\begin{aligned}
\frac{\partial \Pr(y=0)}{\partial \mathbf{x}} &= -\phi(\boldsymbol{\beta}'\mathbf{x})\,\boldsymbol{\beta} < 0, \\
\frac{\partial \Pr(y=1)}{\partial \mathbf{x}} &= \phi(\boldsymbol{\beta}'\mathbf{x})\,\boldsymbol{\beta} - \phi(\mu - \boldsymbol{\beta}'\mathbf{x})\,\boldsymbol{\beta}, \\
\frac{\partial \Pr(y=2)}{\partial \mathbf{x}} &= \phi(\mu - \boldsymbol{\beta}'\mathbf{x})\,\boldsymbol{\beta} > 0.
\end{aligned}
\tag{4}
$$

Figure 2: A rise in one of the explanatory variables whose parameter is positive will shift the probability distribution of the outcome to the right (from the solid line to the dashed line).
The figure shows the impact of a change in one of the explanatory variables on the probabilities that y equals 0, 1, or 2.

In general, only the signs of the changes in $\Pr(y=0)$ and $\Pr(y=J)$ are unambiguous. Greene (1990, 705) cautions that "[w]e must be very careful in interpreting the coefficients in this model.... Indeed, without a fair amount of extra calculation, it is quite unclear how the coefficients in the ordered-probit model should be interpreted."
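
The ambiguity in the middle category can be checked numerically. The Python sketch below differentiates the three probabilities in equation (3) with respect to a single explanatory variable and evaluates the result at two values of x. The parameter values and function names are hypothetical, chosen only so that the sign of the middle effect flips.

```python
from statistics import NormalDist

N = NormalDist()


def probs(xb, mu):
    """Three-category probabilities from equation (3)."""
    return [N.cdf(-xb), N.cdf(mu - xb) - N.cdf(-xb), 1 - N.cdf(mu - xb)]


def marginal_effects(xb, mu, beta):
    """Derivatives of the probabilities in equation (3) with respect to x."""
    return [
        -N.pdf(xb) * beta,                    # Pr(y=0): always negative if beta > 0
        (N.pdf(xb) - N.pdf(mu - xb)) * beta,  # Pr(y=1): sign depends on x
        N.pdf(mu - xb) * beta,                # Pr(y=2): always positive if beta > 0
    ]


beta, mu = 0.8, 1.2  # hypothetical parameter values
for x in (0.2, 1.5):
    me = marginal_effects(beta * x, mu, beta)
    print(x, [round(m, 4) for m in me])
```

The three effects sum to zero at every x because the probabilities must always sum to one, so whatever probability leaves one category has to show up in the others.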

The BFS Dataset

The data used by BFS are available at the Journal of Applied Econometrics data website or in the MS Excel file Vanderbilt data set.xls. Table 1 identifies the variables in the dataset.

Table 1: Definition of the variables included in the Vanderbilt data set.
Column Code Variable definition
A Obs Observation number
B SID Student ID
C Grade Grade earned in Economics 231, A = 4, A- = 3.7, etc.
D SelCorr Variable correcting for selection bias
E Soph Dummy variable = 1 if student is a sophomore
F Senior Dummy variable = 1 if student is a senior
G Same Dummy variable = 1 if student took both intermediate classes the same year
H Skip Dummy variable = 1 if student took the intermediate classes at least one semester apart
I HighestMath Highest level of math attained (the dependent variable, 0-6 corresponding to Math 170, 171a, 172a, 171b, 172b, 221a, 221b)
J M170 Dummy variable = 1 if student's highest level of math was Math 170
K M171a Dummy variable = 1 if student's highest level of math was Math 171a
L M172a Dummy variable = 1 if student's highest level of math was Math 172a
M M171b Dummy variable = 1 if student's highest level of math was Math 171b
N M172b Dummy variable = 1 if student's highest level of math was Math 172b
O M221a Dummy variable = 1 if student's highest level of math was Math 221a
P M221b Dummy variable = 1 if student's highest level of math was Math 221b
Q GE100 Grade in Economics 100
R GDE100 Individual instructor grade deflator in Economics 100
S GE101 Grade in Economics 101
T GDE101 Individual instructor grade deflator in Economics 101
U GDE231 Individual instructor grade deflator in Economics 231
V Size Class size
W FGPA Freshman GPA
X Female Dummy variable = 1 if student is a female
Y MSAT Score on Math section of the SAT
Z VSAT Score on Verbal section of the SAT
AA TE231 Teacher of Economics 231 (numerical code)
AB SE231 Section of Economics 231 (numerical code)
AC GM170 Grade in highest math class: Math 170
AD GM171a Grade in highest math class: Math 171a
AE GM172a Grade in highest math class: Math 172a
AF GM171b Grade in highest math class: Math 171b
AG GM172b Grade in highest math class: Math 172b
AH GM221a Grade in highest math class: Math 221a
AI GM221b Grade in highest math class: Math 221b
AJ GHM Grade in highest math class
AK Foreign Dummy variable = 1 if student passed foreign language proficiency test
AL EMEcon Dummy variable = 1 if expected major is economics
AM EMOSS Dummy variable = 1 if expected major is another social science
AN EMNS Dummy variable = 1 if expected major is a natural science
AO EMH Dummy variable = 1 if expected major is in the humanities
AP AM1 Dummy variable = 1 if student completed 1 year of advanced math in high school
AQ AM2 Dummy variable = 1 if student completed 2 years of advanced math in high school
AR AM3 Dummy variable = 1 if student completed 3 years of advanced math in high school
AS Phy1 Dummy variable = 1 if student completed 1 course in physics in high school
AT Phy2 Dummy variable = 1 if student completed 2 courses in physics in high school
AU Chem1 Dummy variable = 1 if student completed 1 course in chemistry in high school
AV Chem2 Dummy variable = 1 if student completed 2 courses in chemistry in high school

Replication of the Ordered Probit Regression

At this point we are ready to begin the replication. Since it is easy to get lost in the process, I have created a list of steps that includes both instructions on what to do and questions you need to answer. As part of this exercise you will be asked to complete several tables of results. To make this effort easier, I have provided an MS Word file, Tables for ordered probit discussion.doc, that contains the tables to be completed.

1. Load the data in Stata from Excel.

2. Convert MSAT and VSAT to MSAT/100 and VSAT/100, respectively, using the commands:

. replace msat = msat/100

. replace vsat = vsat/100

3. Common sense dictates that we should calculate the means and standard deviations of the variables to be sure that there are no entry errors. We need to construct a table that compares the means and standard deviations reported in BFS with those in our dataset. Table 2, which has the means and standard deviations reported by BFS, gives a place to put the means and standard deviations for the variables in our dataset. Fill in the information missing from Table 2.

Table 2: Means and standard deviations of the data.
  Our data Butler, et al.
Variable Mean Std. Dev. Mean Std. Dev.
msat     6.25 0.60
foreign     0.11 0.32
female     0.39 0.49
emecon     0.34 0.48
emoss     0.17 0.38
emns     0.21 0.41
emh     0.07 0.25
am1     0.49 0.50
am2     0.45 0.50
am3     0.01 0.11
phy1     0.67 0.47
phy2     0.02 0.14
chem1     0.82 0.39
chem2     0.12 0.32
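
One quick way to screen the reported figures for entry errors, before comparing them with your own data, is to use the fact that a 0/1 dummy variable with mean p has a standard deviation close to sqrt(p(1 - p)). The Python sketch below applies this check to the BFS values in Table 2. The check itself is an addition to the module, not something BFS report, and the 0.02 tolerance mentioned afterward is an assumption that allows for two-decimal rounding.

```python
from math import sqrt

# (mean, standard deviation) pairs as reported by BFS in Table 2.
# msat is omitted because it is not a 0/1 variable.
bfs = {
    "foreign": (0.11, 0.32), "female": (0.39, 0.49), "emecon": (0.34, 0.48),
    "emoss": (0.17, 0.38), "emns": (0.21, 0.41), "emh": (0.07, 0.25),
    "am1": (0.49, 0.50), "am2": (0.45, 0.50), "am3": (0.01, 0.11),
    "phy1": (0.67, 0.47), "phy2": (0.02, 0.14), "chem1": (0.82, 0.39),
    "chem2": (0.12, 0.32),
}

# For a 0/1 variable with mean p, the standard deviation is close to
# sqrt(p * (1 - p)); a large gap would hint at a coding or entry error.
gaps = {name: abs(sqrt(p * (1 - p)) - sd) for name, (p, sd) in bfs.items()}
for name, gap in gaps.items():
    print(f"{name:8s} gap = {gap:.3f}")
```

A gap larger than about 0.02 would be worth a second look. The same check can be applied to the means and standard deviations you compute from the Vanderbilt data set.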

4. Estimate the ordered probit regression using (in Stata) the commands:

. global indvar msat foreign female emecon emoss emns emh am1 am2 am3 phy1 phy2 chem1 chem2

. oprobit highestmath $indvar

5. Use the result of this estimation to complete Table 3.⁴

Table 3: Results of Stata ordered-probit regression.
highestmath Coef. Std. Err. z P>z [95% Conf. Interval] 
msat1            
foreign            
female            
emecon            
emoss            
emns            
emh            
am1            
am2            
am3            
phy1            
phy2            
chem1            
chem2            
             
_cut1      
_cut2            
_cut3            
_cut4            
_cut5            
_cut6            
Observations            
Log likelihood            
LR χ2(14)            
Prob > χ2            
Pseudo-R2            

6. Compare your results with the table reported in the article. The table in the article is Table II on page 193 and is reproduced in Figure 3. What we are interested in is comparing column 4 in Figure 3 with columns 2 and 4 in Table 3. Table 4 below offers a model for this comparison.

Figure 3: Results of ordered probit regression as reported in Butler, et al.
This figure shows a copy of the results as reported by BFS.

Table 4: Comparison of ordered-probit estimations.
  Our estimates Butler, et al. estimates
  Estimate z Estimate t-value
msat1     0.05 6.12
foreign     0.02 0.14
female     0.25 2.59
emecon     -0.11 0.86
emoss     -0.29 1.99
emns     0.43 3.10
emh     -0.37 1.78
am1     0.24 1.07
am2     0.93 4.04
am3     0.77 1.70
phy1     0.26 2.71
phy2     0.38 1.07
chem1     -0.12 0.69
chem2     0.17 0.75
Intercept     -3.09 5.48
_cut1     0.27 7.29
_cut2     0.33 8.16
_cut3     1.52 20.32
_cut4     1.79 23.07
_cut5     2.04 23.72
_cut6        

7. It is easy to see from Table 4 that, almost without exception, the estimates of the parameters and their t-ratios are very similar. The exception arises with the estimates of the truncation points (_cut# in the Stata results). We will have to figure out what these are estimates of in order to make sense of them. Figure 1 shows the "cutoffs" that are being estimated. Footnote c in the BFS Table II on page 193 (shown in Figure 3) offers a useful observation:

In an ordered probit, an underlying, normally distributed, latent variable has a mean which is a function of observable variables. The latent variable gives rise to a set of observed dummy variables for ordered categories based on ranges between unobserved but estimable truncation points which correspond to levels of effort, ability, or other factors reflected in the explanatory variables. If L categories are observed, there are $L-1$ truncation points, of which the first is normalized to be zero, so that $L-2$ truncation points are estimated and reported in the table. The values correspond to standard deviations of the latent normally distributed variable.

The key idea is that the values of the cutoffs are relative and can be normalized around any value. Notice that the Stata results do not report an intercept term but do report six cutoff values, whereas BFS report an intercept and only five truncation points because they normalize the first truncation point to zero. Moreover, the difference between Stata's estimate of the first cutoff (3.08402) and its estimate of the second cutoff (3.356916) is 0.272896, which rounds to the first truncation point reported by BFS (1998: 193), 0.27. Use Table 5 to report the difference between each of the cutoff points reported by Stata and the first cutoff value.
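
The claim that the cutoffs can be normalized around any value can be verified directly: subtracting the same constant from the linear index and from every cutoff leaves each category probability unchanged. The Python sketch below uses Stata's six cutoff estimates from Table 5. The value of the index `xb` is hypothetical, and treating the intercept as the negative of the first cutoff is an interpretation that appears consistent with the intercept of -3.09 shown for BFS in Table 4.

```python
from statistics import NormalDist

PHI = NormalDist().cdf

# Stata's six cutoff estimates, as listed in Table 5.
stata_cuts = [3.0840, 3.3569, 3.4146, 4.6013, 4.8774, 5.1202]


def probs(index, cuts):
    """Category probabilities given a linear index and a list of cutoffs."""
    cum = [PHI(c - index) for c in cuts] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]


xb = 4.0  # hypothetical value of the slope part of the index

# Stata-style parameterization: no intercept, six free cutoffs.
p_stata = probs(xb, stata_cuts)

# BFS-style parameterization: first cutoff fixed at 0, intercept = -_cut1.
intercept = -stata_cuts[0]
bfs_cuts = [c - stata_cuts[0] for c in stata_cuts]
p_bfs = probs(intercept + xb, bfs_cuts)

# Largest difference between the two sets of probabilities.
print(max(abs(a - b) for a, b in zip(p_stata, p_bfs)))
```

The two parameterizations describe the same model, which is why only the differences between cutoffs can be compared across the two sets of results.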

Table 5: Reconciling Stata estimates of cutoff points with Butler, et al.'s truncation points.
Cutoff Estimate Estimate - _cut1 BFS Truncation Points
_cut1 3.0840    
_cut2 3.3569   0.27
_cut3 3.4146   0.33
_cut4 4.6013   1.52
_cut5 4.8774   1.79
_cut6 5.1202   2.04
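
The arithmetic for the third column of Table 5 can be sketched in a few lines of Python. All numbers come from Table 5; the variable names are assumptions.

```python
# Stata cutoff estimates and BFS truncation points, both from Table 5.
stata_cuts = [3.0840, 3.3569, 3.4146, 4.6013, 4.8774, 5.1202]
bfs_points = [0.27, 0.33, 1.52, 1.79, 2.04]

# Column 3 of Table 5: each cutoff minus the first cutoff, to two decimals.
diffs = [round(c - stata_cuts[0], 2) for c in stata_cuts[1:]]

for k, (d, b) in enumerate(zip(diffs, bfs_points), start=2):
    print(f"_cut{k}: difference = {d:.2f}, BFS = {b:.2f}")
```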

The second part of the reconciliation of the two sets of results is to compute the t-ratios. To do this we need the standard errors of the differences between the cutoff estimates reported by Stata, which in turn requires the variance-covariance matrix from the regression. First, let's see what we are interested in computing. Let $\hat{\beta}_i$ be the estimate of the ith cutoff point. In column 3 of Table 5 you computed $\hat{\alpha}_i = \hat{\beta}_i - \hat{\beta}_1$ for $i = 2, \ldots, 6$. The variance of the new variable is:

$$
V(\hat{\alpha}_i) = V(\hat{\beta}_i) - 2\,\mathrm{Cov}(\hat{\beta}_i, \hat{\beta}_1) + V(\hat{\beta}_1) = \sigma_i^2 - 2\sigma_{i1} + \sigma_1^2
\tag{5}
$$

The variance-covariance matrix will give us estimates of these variances and covariances. When there are k parameters in a regression equation, this matrix is defined to be:

$$
\hat{\Sigma} = \begin{bmatrix}
\hat{\sigma}^2_{\beta_1} & \hat{\sigma}_{\beta_1\beta_2} & \cdots & \hat{\sigma}_{\beta_1\beta_k} \\
\hat{\sigma}_{\beta_2\beta_1} & \hat{\sigma}^2_{\beta_2} & \cdots & \hat{\sigma}_{\beta_2\beta_k} \\
\vdots & \vdots & \ddots & \vdots \\
\hat{\sigma}_{\beta_k\beta_1} & \hat{\sigma}_{\beta_k\beta_2} & \cdots & \hat{\sigma}^2_{\beta_k}
\end{bmatrix}.
$$

If you type the command . vce, Stata will report $\hat{\Sigma}$, as shown in Figure 4. We need the section of this matrix shown in Part A of Table 6. Use equation (5) to estimate the standard errors of the cutoff estimates, complete Part B of Table 6, and compare your t-ratios with the values reported by Butler, et al. (shown in the last column of Part B of Table 6). Are you satisfied that we have come reasonably close to the results reported in the article?

Figure 4: Stata estimate of the variance-covariance matrix.

Table 6: Calculation of the t-ratios for the cutoff estimates.
Part A. Relevant portion of the variance-covariance matrix.
  _cut1 _cut2 _cut3 _cut4 _cut5 _cut6
_cut1 0.329          
_cut2 0.329 0.330        
_cut3 0.329 0.330 0.331      
_cut4 0.332 0.333 0.334 0.341    
_cut5 0.333 0.334 0.334 0.341 0.343  
_cut6 0.333 0.334 0.335 0.342 0.343 0.345
Part B. Calculation of the t-ratios (with comparison to the values reported in BFS)
        $V(\hat{\beta})$    St. Dev.($\hat{\beta}$)    t-ratio    BFS t-ratio
_cut2                                                             7.29
_cut3                                                             8.16
_cut4                                                             20.32
_cut5                                                             23.07
_cut6                                                             23.72
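The arithmetic for Part B can be cross-checked outside Stata. The following is a minimal Python sketch, not part of the original exercise. It assumes, as equation (5) suggests, that each cutoff is measured relative to _cut1. The function names are hypothetical, and the `estimate` argument of `t_ratio` must come from your own ordered-probit output.

```python
import math

# Cutoff block of the variance-covariance matrix, copied from Part A of
# Table 6 (lower triangle only; values are rounded to three decimals).
VCOV = {
    (1, 1): 0.329,
    (2, 1): 0.329, (2, 2): 0.330,
    (3, 1): 0.329, (3, 2): 0.330, (3, 3): 0.331,
    (4, 1): 0.332, (4, 2): 0.333, (4, 3): 0.334, (4, 4): 0.341,
    (5, 1): 0.333, (5, 2): 0.334, (5, 3): 0.334, (5, 4): 0.341, (5, 5): 0.343,
    (6, 1): 0.333, (6, 2): 0.334, (6, 3): 0.335, (6, 4): 0.342, (6, 5): 0.343,
    (6, 6): 0.345,
}


def variance_of_difference(i):
    """Equation (5): V(cut_i - cut_1) = V(cut_i) - 2 Cov(cut_i, cut_1) + V(cut_1)."""
    return VCOV[(i, i)] - 2.0 * VCOV[(i, 1)] + VCOV[(1, 1)]


def t_ratio(estimate, i):
    """t-ratio of a cutoff estimate that is measured relative to _cut1."""
    return estimate / math.sqrt(variance_of_difference(i))


for i in range(2, 7):
    v = variance_of_difference(i)
    print(f"_cut{i}: variance = {v:.3f}, st. dev. = {math.sqrt(v):.4f}")
```

Because Part A is rounded to three decimals and the three terms nearly cancel, these variances are only rough approximations. Using the unrounded matrix from Stata is likely to change the resulting t-ratios noticeably.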

8. The next step in the process is to generate the term we will use in the estimation of the grade regression to account for the potential sample selection bias. To do this we will need to find a reference in the literature that offers a clear description of what we need to do. As it turns out, a reasonable explanation of the appropriate estimation technique is available in Jimenez and Kugler (1987). Since much of what follows comes directly from this article, I highly recommend you read it yourself.

The gist of the method is that the potential sample selection bias is accounted for by an inverse Mills ratio for each of the categories. What we need to do is calculate:

$$\hat{\lambda}_i = \frac{\phi(\hat{\mu}_j - \hat{z}_i) - \phi(\hat{\mu}_{j+1} - \hat{z}_i)}{\Phi(\hat{\mu}_{j+1} - \hat{z}_i) - \Phi(\hat{\mu}_j - \hat{z}_i)}$$
(6)

for the category that the individual actually is in. What we will do is calculate (6) for all of the categories and then sum the product of this number and a dummy variable indicating if a course is the highest math class completed by an individual. Since the dummy variables will equal 0 for math categories an individual is not in, the resulting sum will preserve the value of (6) that is associated with the category the individual does belong to.
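As a worked special case of (6), consider the two end categories. Assume the lowest category has no lower cutoff and the highest has no upper cutoff, so the missing bounds are $-\infty$ and $+\infty$. The superscripts (0) and (6) are labels introduced here for the lowest and highest math categories. At the missing bound the density term vanishes and the distribution function equals 0 or 1, so (6) reduces to:

```latex
\hat{\lambda}_i^{(0)} = \frac{-\phi(\hat{\mu}_1 - \hat{z}_i)}{\Phi(\hat{\mu}_1 - \hat{z}_i)},
\qquad
\hat{\lambda}_i^{(6)} = \frac{\phi(\hat{\mu}_6 - \hat{z}_i)}{1 - \Phi(\hat{\mu}_6 - \hat{z}_i)}
```

The first expression is always negative and the second always positive, which is a quick sanity check on the generated variables.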

It is clear from (6) that we will need to retain the 6 cutoffs. We can do this with the commands:

. generate cutoff1 = _b[_cut1]

. generate cutoff2 = _b[_cut2]

. generate cutoff3 = _b[_cut3]

. generate cutoff4 = _b[_cut4]

. generate cutoff5 = _b[_cut5]

. generate cutoff6 = _b[_cut6]

Technically, this step is not necessary since the parameter estimates are preserved until the next regression is estimated; I suggest doing this purely as a precaution.

9. Preserve the predicted values of the ordered probit using the commands:

. predict zhat, xb

. predict phat1 phat2 phat3 phat4 phat5 phat6 phat7, p

These two commands generate, for each observation, the linear prediction zhat (the mean of the latent variable underlying the math-class category) and the probability that the individual falls in each category. To see what is going on we will retrieve some representative values of these variables and then graph them for one individual. Table 7 reports these values for 14 individuals in the sample. Now consider individual 2. Fitting a normal distribution with a mean of 4.25 and using the cutoff values from our estimation yields the probabilities that the individual is in each of the categories. For example, the probability that individual 2 will have completed no math classes is equal to 0.1217 (see Table 7). Figure 5 illustrates the results for individual 2. The dashed vertical lines are the six cutoff values. The solid vertical line is the zhat for individual 2. The heavy blue line represents the normal probability density function for this individual. While there is, of course, a different probability distribution for each individual, the cutoff values are the same for all members of the sample.

Table 7: Predicted values of the ordered probit regression.
Observation Highest Math Class zhat Pr(0) Pr(1) Pr(2) Pr(3) Pr(4) Pr(5) Pr(6)
1 3 3.9657 0.1890 0.0824 0.0194 0.4467 0.0816 0.0568 0.1241
2 0 4.2507 0.1217 0.0640 0.0158 0.4355 0.0975 0.0731 0.1923
165 0 3.5982 0.3036 0.1011 0.0225 0.4149 0.0575 0.0364 0.0640
166 6 4.6914 0.0540 0.0370 0.0098 0.3633 0.1097 0.0922 0.3340
214 3 3.4533 0.3560 0.1056 0.0229 0.3900 0.0483 0.0294 0.0478
215 3 4.0840 0.1587 0.0749 0.0180 0.4459 0.0887 0.0637 0.1501
225 3 3.5250 0.3296 0.1036 0.0228 0.4031 0.0528 0.0328 0.0553
226 3 3.6990 0.2693 0.0969 0.0219 0.4285 0.0641 0.0417 0.0776
453 3 3.9713 0.1875 0.0820 0.0194 0.4468 0.0819 0.0571 0.1253
454 5 4.1650 0.1399 0.0697 0.0170 0.4422 0.0932 0.0684 0.1697
495 3 4.4168 0.0913 0.0533 0.0135 0.4151 0.1043 0.0816 0.2409
496 0 2.9811 0.5410 0.1055 0.0212 0.2797 0.0236 0.0127 0.0162
526 0 2.9247 0.5633 0.1039 0.0207 0.2653 0.0214 0.0114 0.0141
527 3 3.9757 0.1863 0.0817 0.0193 0.4469 0.0822 0.0574 0.1262
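The link between zhat, the cutoffs, and the Pr(0) through Pr(6) columns can be sketched in a few lines of Python. This is an illustration only: the cutoffs below are hypothetical placeholders (the estimated _cut values are not reproduced in this section), and the function names are made up for the example.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function; also valid at -inf and +inf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def category_probabilities(zhat, cutoffs):
    """Probability of each ordered category for one individual.

    Category j lies between cutoff j and cutoff j+1, with -inf below the
    first cutoff and +inf above the last one.
    """
    bounds = [-math.inf] + list(cutoffs) + [math.inf]
    cdf = [normal_cdf(b - zhat) for b in bounds]
    return [upper - lower for lower, upper in zip(cdf, cdf[1:])]


# Hypothetical cutoffs for illustration only; substitute the six _cut
# estimates from your own oprobit output.
example_cutoffs = [3.0, 3.3, 3.4, 4.5, 4.8, 5.1]
probs = category_probabilities(4.2507, example_cutoffs)
print([round(p, 4) for p in probs], round(sum(probs), 4))
```

The seven probabilities always sum to one, which is a useful check on each row of Table 7.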

Now we are ready to calculate (6). The commands are:

. generate lambda0 = (-normden(cutoff1-zhat))/norm(cutoff1-zhat)

. generate lambda1 = (normden(cutoff1-zhat)-normden(cutoff2-zhat))/(norm(cutoff2-zhat)-norm(cutoff1-zhat))

. generate lambda2 = (normden(cutoff2-zhat)-normden(cutoff3-zhat))/(norm(cutoff3-zhat)-norm(cutoff2-zhat))

. generate lambda3 = (normden(cutoff3-zhat)-normden(cutoff4-zhat))/(norm(cutoff4-zhat)-norm(cutoff3-zhat))

. generate lambda4 = (normden(cutoff4-zhat)-normden(cutoff5-zhat))/(norm(cutoff5-zhat)-norm(cutoff4-zhat))

. generate lambda5 = (normden(cutoff5-zhat)-normden(cutoff6-zhat))/(norm(cutoff6-zhat)-norm(cutoff5-zhat))

. generate lambda6 = normden(cutoff6-zhat)/(1-norm(cutoff6-zhat))

. generate lambda = m170*lambda0 + m171a*lambda1 + m172a*lambda2 + m171b*lambda3 + m172b*lambda4 + m221a*lambda5 + m221b*lambda6

One thing to notice in these calculations is that cutoff0 is assumed to be $-\infty$ and cutoff7 is assumed to be $+\infty$. At these limits the normal density is zero and the cumulative normal equals 0 and 1, respectively, which is why lambda0 and lambda6 each contain only one density term.
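For readers who want to check the generated values by hand, here is a minimal Python sketch of equation (6) with the infinite end cutoffs, followed by the dummy-weighted sum. The function names and the cutoff values used in the example call are hypothetical; only the formula comes from the text.

```python
import math


def normal_pdf(x):
    """Standard normal density; zero at -inf and +inf."""
    if math.isinf(x):
        return 0.0
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Standard normal distribution function; also valid at -inf and +inf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def selection_term(zhat, lower, upper):
    """Equation (6) for the category bounded by the cutoffs lower and upper."""
    numerator = normal_pdf(lower - zhat) - normal_pdf(upper - zhat)
    denominator = normal_cdf(upper - zhat) - normal_cdf(lower - zhat)
    return numerator / denominator


def own_category_lambda(zhat, cutoffs, category):
    """Dummy-weighted sum of the seven terms; only the term for the
    individual's own category (0 to 6) survives."""
    bounds = [-math.inf] + list(cutoffs) + [math.inf]
    lambdas = [selection_term(zhat, lo, hi) for lo, hi in zip(bounds, bounds[1:])]
    dummies = [1 if j == category else 0 for j in range(len(lambdas))]
    return sum(d * lam for d, lam in zip(dummies, lambdas))


# Hypothetical cutoffs for illustration only.
example_cutoffs = [3.0, 3.3, 3.4, 4.5, 4.8, 5.1]
print(own_category_lambda(4.0, example_cutoffs, 3))
```

If zhat lies very far from a category's cutoffs the denominator can underflow to zero, so extreme observations may need separate handling.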

Figure 5: The probability distribution of math class category for individual 2.

10. Now we are ready to estimate our regression explaining the grade that each individual received in intermediate microeconomics. Use Table 8 to report the regression results for four specifications of the model. The first question is whether the null hypothesis of no sample selection bias can be rejected. How does this conclusion compare with BFS's conclusions? (See Table 9.) Second, since many of the potential explanatory variables, like class size and SAT scores, do not seem to be statistically significant, it is reasonable to focus our comments on the results reported in column (4) of Table 8.

What can you conclude about the impact of calculus on how well a student will do in intermediate microeconomics? Do the final grades earned in a majority of the math classes affect the grade earned in intermediate microeconomics? Do the grades earned in any of the math classes positively and significantly affect the grade earned in intermediate microeconomics? Can you explain the impact of the freshman GPA on the grade earned in intermediate microeconomics? What, if any, are your bottom-line conclusions about what matters in determining the grades earned in intermediate microeconomics?

Table 8: Determinants of Final Grade in Intermediate Microeconomics.
Robust t-ratios are in parentheses.
Explanatory variables Model (1) Model (2) Model (3) Model (4)
Lambda    
         
Sophomore    
         
Senior    
         
Same        
         
Skip    
         
M171a        
         
M172a        
         
M171b        
         
M172b        
         
M221a        
         
M221b        
         
GE100        
         
GDE100        
         
GE101        
         
GDE101      
         
GDE231        
         
Size      
         
FGPA        
         
Female        
         
MSAT      
         
VSAT      
         
Grade in highest Math class
GM170      
         
GM171a      
         
GM172a      
         
GM171b      
         
GM172b      
         
GM221a      
         
GM221b      
         
Intercept        
         
F( 28, 580)  
Prob > F  
F( 27, 581)  
Prob > F  
F( 20, 588)    
Prob > F    
F( 19, 589)  
Prob > F  
R-Squared        
Root MSE        
Sample Size 609 609 609 609
Table 9: Results reported in BFS (p. 195).
a Omitted reference groups in MICRO-2 regression: attained Math 170; took MICRO-2 in Junior year; took MICRO-1 in spring, MICRO-2 next fall. b Significant at 0.01 level, one- or two-tailed test as appropriate. c Significant at 0.05 level, one- or two-tailed test as appropriate.
    MICRO-2
Variable (a) Expected sign Mean (SD) Coefficient (t-value)
Intercept -1.64
      (3.48)
Selection bias correction (predicted residual) + -0.00 0.10
    (0.92) (1.29)
Level of calculus attained:
Math 171A + 0.08 0.39
    (0.27) (1.04)
Math 172A + 0.02 -0.18
    (0.13) (0.21)
Math 171B + 0.37 1.02b
    (0.48) (3.49)
Math 172B + 0.07 1.52b
    (0.25) (3.53)
Math 221A + 0.05 1.33c
    (0.22) (2.27)
Math 221B or 222 + 0.14 0.75c
    (0.35) (1.67)
Grade in last calculus course:
Math 170 + 3.06 0.36b
    (0.70) (4.36)
Math 171A + 2.22 0.26c
    (0.86) (2.21)
Math 172A + 2.94 0.42
    (0.80) (1.54)
Math 171B + 2.62 0.10c
    (0.93) (1.85)
Math 172B + 2.63 -0.01
    (0.90) (0.10)
Math 221A + 3.10 -0.09
    (0.77) (0.55)
Math 221B or 222 + 3.15 0.11
    (0.76) (1.04)
Grade deflator of instructor in intermediate theory course + -0.16 0.88b
    (0.27) (8.28)
Taken in Sophomore year ? 0.32 0.07
    (0.47) (0.94)
Taken in Senior year - 0.06 -0.02
    (0.24) (0.13)
MICRO-1 and MICRO-2 in same academic year + 0.35 0.04
    (0.48) (0.46)
At least one semester between MICRO-1 and MICRO-2 - 0.27 0.13
    (0.44) (1.85)
Grade in MACRO-1 + 2.73 0.20b
    (0.73) (3.93)
Grade in MICRO-1 + 2.67 0.29b
    (0.74) (5.93)
Instructor's grade deflator:
 
MACRO-1 - -0.32 -0.33c
    (0.20) (2.20)
MICRO-1 - -0.29 -0.11
    (0.16) (0.53)
Class size (intermediate theory course) ? 28.2 -0.002
    (5.5) (0.45)
Freshman Grade Point Average + 2.79 0.29b
    (0.46) (3.04)
Sex (female = 1; male = 0) ? 0.39 0.13c
    (0.49) (2.09)
SAT-Math score × 10^-2 + 6.25 0.12c
    (0.60) (1.75)
SAT-Verbal score × 10^-2 + 5.56 0.04
    (0.67) (0.78)
OVERALL RESULTS
Mean (SD) of dependent variable      
       
Adjusted R2   0.44  
Number of observations   609  

Exercises

Exercise 1

Quite often health professionals ask a patient to report their perception of their health status on a scale of 0 to 10, where 0 is the lowest possible health status and 10 is the highest. This type of data is well suited to analysis with an ordered probit. In this exercise you will analyze a data set of responses to a survey made in Germany between 1984 and 1995. The question we are interested in analyzing is the respondent’s perception of their own health status.

The file Riphahn, Wambach, Million data.xls is an MS Excel file that contains 27,326 observations on 25 variables, one observation per line. The data are from Riphahn, Wambach, and Million (2003) and are also available on the web. The variables are defined in Table 10. As a first step you will need to load these data into Stata. However, due to the large sample size you will need to first expand the size of the memory that is available to Stata with the command: . set memory 1G. Here I have increased the memory to 1 gigabyte. This amount may be overkill but it seemed to be big enough on my computer to handle the data.

Table 10: Variables in the German Socioeconomic Panel Data Set.
Column Variable Variable definition
A ID individual's ID number
B Female female = 1; male = 0
C Year calendar year of the observation
D Age age in years
E HSAT health satisfaction, coded 0 (low) - 10 (high)
F Handdum handicapped = 1; otherwise = 0
G Handper degree of handicap in percent (0 - 100)
H HhnINC household nominal monthly net income in German marks / 1000
I HHKIDS children under age 16 in the household = 1; otherwise = 0
J Educ years of schooling
K Married married = 1; otherwise = 0
L Haupts highest schooling degree is Hauptschul degree = 1; otherwise = 0
M Reals highest schooling degree is Realschul degree = 1; otherwise = 0
N FachHS highest schooling degree is Polytechnical degree = 1; otherwise = 0
O Abitur highest schooling degree is Abitur = 1; otherwise = 0
P Univ highest schooling degree is university degree = 1; otherwise = 0
Q Working employed = 1; otherwise = 0
R BlueC blue collar employee = 1; otherwise = 0
S WhiteC white collar employee = 1; otherwise = 0
T Self self employed = 1; otherwise = 0
U Beamt civil servant = 1; otherwise = 0
V DocVis number of doctor visits in last three months
W HospVis number of hospital visits in last calendar year
X Public insured in public health insurance = 1; otherwise = 0
Y Addon insured by add-on insurance = 1; otherwise = 0

Figure 6: Distribution of responses on health status.

One of the major problems with survey indices is that the numbers seem to mean different things to different respondents. One way to reduce this problem is to collapse the index into fewer outcomes by combining some of the responses. However, any way we do this is going to be ad hoc. Figure 6 shows a histogram of the responses to this question. Based on this graph, we will create 5 categories: (0) HSat = 0, 1, or 2; (1) HSat = 3, 4, or 5; (2) HSat = 6, 7, or 8; (3) HSat = 9; and (4) HSat = 10. We can create a new categorical variable called hsatnew with the command:

. recode hsat (0/2 = 0) (3/5 = 1) (6/8 = 2) (9 = 3) (10 = 4), generate(hsatnew)
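The mapping performed by this recode can be written out explicitly. The following Python sketch mirrors the five categories defined above. The function name and the sample responses are made up for illustration and are not drawn from the data set.

```python
from collections import Counter


def recode_hsat(hsat):
    """Collapse the 0-10 health satisfaction score into five categories:
    0-2 -> 0, 3-5 -> 1, 6-8 -> 2, 9 -> 3, 10 -> 4."""
    if hsat not in range(11):
        raise ValueError(f"hsat must be an integer from 0 to 10, got {hsat!r}")
    if hsat <= 2:
        return 0
    if hsat <= 5:
        return 1
    if hsat <= 8:
        return 2
    return 3 if hsat == 9 else 4


# Made-up responses for illustration only; not taken from the data set.
sample_responses = [0, 2, 3, 5, 6, 8, 8, 9, 10, 10]
counts = Counter(recode_hsat(h) for h in sample_responses)
print(sorted(counts.items()))
```

Tabulating the recoded values this way corresponds to the collapsed histogram of the new variable.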

Figure 7 shows the histogram of the new variable.

Figure 7: The collapsed distribution of health status responses.

  1. Create a table of summary statistics for (1) health status, (2) age, (3) household income, (4) years of education, (5) marital status, and (6) number of children by year and sex. (You might want to use the command . bysort year female: summarize varlist, where varlist is the list of variables.)
  2. Estimate an ordered probit regression for 1988 for health status (the new variable) using age, income, education, married, and kids as the explanatory variables. Here you might want to use the command: . oprobit hsatnew age hhninc educ married hhkids if year==1988.
  3. Use the predict newvariable, xb command to calculate the predicted mean values for each individual for the 1988 observations and plot a histogram of them. Compare this histogram to one that uses the 1988 regression parameters to estimate xb for all years.
  4. Estimate the ordered probit model for each of the years in the sample and put the results into a table like Table 11. (Here you might want to make use of the command: . bysort year: oprobit hsatnew varlist)
Table 11: Sample table for part 4 of Exercise 1.
t-ratios are in parentheses.
Variable 1984 1985 1986 1987 1988 1991 1994
age              
income              
education              
married              
kids              
_cut1              
_cut2              
_cut3              
_cut4              
Observations              
LR χ2(5)              
Prob > χ2              
Log likelihood              
Pseudo-R2              

References

Amemiya, T. (1985). Advanced Econometrics (Cambridge, MA: Harvard University Press).

Bourguignon, François, Martin Fournier, and Marc Gurgand (2007). Selection bias corrections based on the multinomial logit model: Monte Carlo comparisons. Journal of Economic Surveys 21(1): 174-205.

Butler, J. S., T. Aldrich Finegan, and John J. Siegfried (1998). Does more calculus improve student learning in intermediate micro- and macroeconomic theory? Journal of Applied Econometrics 13(2): 185-202.

Chiburis, Richard and Michael Lokshin (2007). Maximum likelihood and two–step estimation of an ordered–probit selection model. The Stata Journal 7(2): 167-182.

Dahl, Gordon B. (2002). Mobility and the returns to education: testing a Roy model with multiple markets. Econometrica 70(6): 2367–2420.

Dubin, Jeffrey A. and Daniel L. McFadden (1984). An econometric analysis of residential electric appliance holdings and consumption. Econometrica 52(2): 345–362.

Greene, William H. (1990). Econometric Analysis (New York: Macmillan Publishing Company).

Heckman, James J. (1979). Sample selection bias as a specification error. Econometrica 47(1): 153–161.

Jimenez, Emmanuel and Bernardo Kugler (1987). The earnings impact of training duration in a developing country: an ordered probit selection model of Colombia's Servicio Nacional de Aprendizaje (SENA). Journal of Human Resources 22(2): 230-233.

Lee, Lung-Fei (1983). Generalized econometric models with selectivity. Econometrica 51(2): 507–512.

Maddala, G. S. (1983). Limited-Dependent and Qualitative Variables in Econometrics (Cambridge: Cambridge University Press).

Main, B. and B. Reilly (1993). The employer size-wage gap: Evidence for Britain. Economica 60: 125–142.

McFadden, Daniel L. (1973). Conditional logit analysis of qualitative choice behavior. In P. Zarembka (ed.) Frontiers in Econometrics (New York: Academic Press).

Newey, W. K. and Daniel L. McFadden (1994). Large sample estimation and hypothesis testing. In R. F. Engle and Daniel L. McFadden (eds.) Handbook of Econometrics, vol. IV (Amsterdam: North Holland).

Riphahn, Regina T., Achim Wambach, and Andreas Million (2003). Incentive effects in the demand for health care: a bivariate panel count data estimation. Journal of Applied Econometrics 18(4): 387-405.

Schmertmann, Carl P. (1994). Selectivity bias correction methods in polychotomous sample selection models. Journal of Econometrics 60(1): 101–132.

Vella, Francis (1998). Estimating models with sample selection bias. The Journal of Human Resources 33(1): 127-169.

Footnotes

  1. Butler, J. S., T. Aldrich Finegan, and John J. Siegfried (1998). Does more calculus improve student learning in intermediate micro- and macroeconomic theory? Journal of Applied Econometrics 13(2): 185-202.
  2. This particular notation implies that there are $k-1$ explanatory variables.
  3. See Greene (1990): 704.
  4. One way to make the conversion from the Stata output to the neater table relatively easily is to follow these steps: (1) replace each double space by a single space until there were none left; (2) replace each space with a tab (^t); (3) convert the material into a table using the "Insert/Table" command with a tab as the separator; and (4) clean up the table by moving the data into an Excel file, fixing the formatting, and returning the data to the Word file (alternatively, you can use formatting commands in Stata to control how the output appears).
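The first two steps of footnote 4 (collapsing runs of spaces and turning the remaining spaces into tabs) can also be scripted. This is a minimal Python sketch under the assumption that no label in the output contains an internal space. The function name and the sample text are hypothetical.

```python
import re


def stata_output_to_tsv(text):
    """Collapse runs of spaces to one space, then replace each space with a tab."""
    rows = []
    for line in text.splitlines():
        squeezed = re.sub(r" {2,}", " ", line.strip())
        if squeezed:
            rows.append(squeezed.replace(" ", "\t"))
    return "\n".join(rows)


# Made-up sample lines for illustration only.
sample = "varA    1.25    0.40\nvarB   2.50   0.80"
print(stata_output_to_tsv(sample))
```

The tab-separated result can then be pasted into a spreadsheet or converted to a table as described in steps (3) and (4).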
