# Replication of econometric studies

Module by: Christopher Curran

Summary: This module leads the advanced undergraduate student through the process of replicating the econometrics in a published study.

## Replication

### Introduction

One of the most important first steps in an experimental science is to replicate the results of earlier research. For a variety of reasons (most of them practical rather than principled), economists generally do not undertake this step; instead, they report the findings of earlier papers and compare them with their own results without asking whether those earlier findings were reported accurately. Omitting this step in a world of honest, careful researchers might seem to be a minor problem. However, there is enough casual evidence to suggest that a large portion of the econometric results reported in the journals cannot be replicated because the original researcher (1) no longer has the data set used in the research because it has been lost for any of a variety of reasons, (2) cannot share the data set because it is proprietary, (3) is unwilling to share the data set because there are other issues he or she wishes to investigate with it, or (4) is simply unwilling to share it. For this reason, much of the published econometric research has never been replicated. In recognition of this problem, several journals, such as the Journal of Applied Econometrics, now require authors to submit the data sets they used so that the journal can post them on the web for use by other researchers. Whether this effort has been successful will not be clear unless someone undertakes to replicate the work published in such a journal to see whether all of the data necessary to replicate each article have been posted and whether the regressions included in each article actually can be reproduced. It is very unlikely anyone will undertake such an effort, given that no journal will publish results that merely replicate previously published articles.

In this module we explore some of the difficulties that exist in replicating existing research by undertaking to replicate some of the results reported in the Butler, Finegan, and Siegfried (1998) (BFS, hereafter) article analyzing the effect of a student's calculus background on the grade he or she earns in intermediate microeconomics or intermediate macroeconomics.1 The goal of this module is to (1) help students learn how to read in detail an article that appears in a typical economics journal, (2) introduce them to ordered probit, an advanced econometric tool, and (3) teach them how to present and discuss the results of estimating a model in an economics paper. While most of the discussion in this module focuses on using Stata in this replication, you can use almost any econometrics program you are comfortable with to replicate some of the results reported in the BFS article.

## Butler, Finegan, and Siegfried (1998)

The obvious first step is to find and print a copy of the article by Butler, Finegan, and Siegfried. In fact, do not proceed any further in this module until you have read the article. We will discuss in class what the authors do in the paper and how clearly they present their conclusions. In this first pass at the article, pay attention to how convincing you find their arguments to be. Since everyone in the class has completed an intermediate microeconomics course, your discussion of their conclusions should reflect your own experiences. Also, be prepared to discuss in class the estimation strategy they use in the paper. In particular, you will need to be able to identify the source of the data and which equations the authors estimated. Also, try to determine how the estimations in the "first" stage are used in the estimations of the "second" stage. Why did the authors use a two-stage estimation strategy?

Also, what do you think the authors mean by the following statement describing the estimation methods they use?

Estimation Methods and Expectations
To cope with the selection bias problem, we use a two-stage estimation procedure. The first stage employs an ordered probit model to predict the highest level of calculus attained by each student prior to taking each intermediate economic theory course.... In the second stage, the student's grade in MICRO-2 ... (the 'outcome') is regressed on the actual level of calculus attained, the grade earned in that calculus course, the predicted residual in the grade equation that we would expect on the basis of the actual level of calculus attained, and a roster of control variables reflecting ability and motivation. Individuals are the unit of observation. Ordinary least squares estimation is used because there are twelve categories of grades which are commonly interpreted as cardinal measures of performance (as is implied by the calculation of 'grade point averages'). (Butler, Finegan, and Siegfried, 1998: 188)

## The ordered-probit model

In what follows you are to “replicate” the equations the authors estimate in the paper for the intermediate microeconomics course. In order to complete this assignment you will need to figure out several things including (1) what an ordered-probit model is and (2) how to use Stata to estimate an ordered-probit model. In this section of the module we introduce the ordered-probit model. I strongly encourage you to consult Greene (1990: 703-706) for an excellent and clear discussion of the ordered-probit model. The discussion here follows Greene closely.

It is common for surveys to include questions that require the respondent to choose one of several categories that have an innate order to them. For instance, most course evaluations ask the respondent to choose an answer that reflects his or her agreement with a statement about the course. The question might read, "The professor was interested in the material taught in the class," where the student completing the evaluation chooses a number from 1 to 9, with 1 indicating complete disagreement with the statement and 9 indicating complete agreement. Thus, there is an order to the potential answers. Using a logit, probit, or multinomial logit model would completely ignore this order. A linear regression is inappropriate because OLS treats the difference between answers of 1 and 2 as being the same as the difference between a 7 and an 8, when in fact the numbers only provide a ranking.

Consider a latent variable, y*, that is not observed but where y* = β′x + ε. We want to estimate the βk's in the vector β = (β0, β1, …, βK).2 We may not observe y*, but we do observe:

$$y=\begin{cases}0 & \text{if } y^{*}<0,\\ 1 & \text{if } 0\le y^{*}<\mu_{1},\\ 2 & \text{if } \mu_{1}\le y^{*}<\mu_{2},\\ \;\vdots\\ J & \text{if } \mu_{J-1}\le y^{*}.\end{cases}$$

(1)

The μi's in (1) are parameters that must be estimated along with β. As usual, we assume that the error term ε is normally distributed (with its mean and variance normalized to 0 and 1, respectively). It is just as easy to estimate the model with an error term that follows a logistic distribution, but this change in assumptions appears to make virtually no difference in practice.3 With the normal distribution, we have:

$$\begin{aligned}\Pr(y=0)&=\Phi(-\beta'x),\\ \Pr(y=1)&=\Phi(\mu_{1}-\beta'x)-\Phi(-\beta'x),\\ \Pr(y=2)&=\Phi(\mu_{2}-\beta'x)-\Phi(\mu_{1}-\beta'x),\\ &\;\;\vdots\\ \Pr(y=J)&=1-\Phi(\mu_{J-1}-\beta'x),\end{aligned}$$

(2)

where Φ(·) is the cumulative normal distribution function. In order for all of the probabilities to be positive, we need μ1 < μ2 < ⋯ < μJ−1, as shown in Figure 1. One thing to note in Figure 1 is that, relative to the distribution of the latent variable, the cutoff locations change when the values of the explanatory variables change.
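As a concrete illustration of (2), the short Python sketch below (Python rather than Stata, and with a made-up index value and cutoffs, not estimates from BFS) computes the full set of category probabilities implied by an ordered probit:

```python
from math import erf, sqrt

def Phi(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def category_probs(xb, mus):
    """Pr(y = j) for the ordered probit in (2): the lowest cutoff is
    normalized to 0 and mus holds mu_1 < ... < mu_{J-1}."""
    bounds = [0.0] + list(mus)
    probs = [Phi(bounds[0] - xb)]                  # Pr(y = 0) = Phi(-b'x)
    for lo, hi in zip(bounds, bounds[1:]):
        probs.append(Phi(hi - xb) - Phi(lo - xb))  # interior categories
    probs.append(1.0 - Phi(bounds[-1] - xb))       # top category
    return probs

# Hypothetical index b'x = 0.7 and cutoffs mu_1 = 0.5, mu_2 = 1.3:
probs = category_probs(0.7, [0.5, 1.3])
```

Because adjacent terms telescope, the probabilities always sum to one, and the ordering of the cutoffs guarantees each one is positive.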

The estimation strategy from here follows the usual maximum likelihood method. The computer program forms the likelihood function and then chooses the values of the parameters (including the cutoffs) that maximize this likelihood function.
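A minimal sketch of that likelihood function, in Python with hypothetical data (the cutoff and index values below are illustrative, not BFS estimates), assuming the parameterization of (1) with the first cutoff normalized to zero:

```python
from math import erf, log, sqrt

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def log_likelihood(y, xb, mus):
    """Ordered-probit log likelihood: sum_i log Pr(y_i | x_i).
    y holds observed categories 0..J, xb the fitted index b'x_i for each
    observation, mus the free cutoffs mu_1 < ... < mu_{J-1}."""
    bounds = [float("-inf"), 0.0] + list(mus) + [float("inf")]
    ll = 0.0
    for yi, xbi in zip(y, xb):
        lo, hi = bounds[yi], bounds[yi + 1]
        ll += log(Phi(hi - xbi) - Phi(lo - xbi))
    return ll

# Three hypothetical observations and one free cutoff (three categories):
ll = log_likelihood([0, 1, 2], [0.0, 0.2, 1.0], [0.8])
```

The maximizer searches over the β's (through the fitted index) and the cutoffs jointly; Stata's `oprobit` does exactly this search internally.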

The estimated coefficients are not equal to the marginal effects of a change in one of the explanatory variables (as is also true of the logit and probit models). Consider the simple example Greene (1990, 704) describes. Assume that there are three categories. Then (2) becomes:

$$\begin{aligned}\Pr(y=0)&=\Phi(-\beta'x),\\ \Pr(y=1)&=\Phi(\mu-\beta'x)-\Phi(-\beta'x),\\ \Pr(y=2)&=1-\Phi(\mu-\beta'x).\end{aligned}$$

(3)

Figure 2 shows this situation. The solid curve shows the distribution of y and y*. Increasing one of the x's while holding β constant (that is, moving from β̂′x0 to β̂′x1) is the same as shifting the entire distribution of y and y* to the right with μ̂ remaining constant. As a result, the probabilities that y takes on the values 0, 1, and 2 change. Clearly, as shown in Figure 2, Pr(y = 0) decreases and Pr(y = 2) increases. Pr(y = 1), on the other hand, may either increase or decrease; thus, the effect of an increase in one of the explanatory variables on the middle category is ambiguous. It is easy to show this result algebraically. The marginal effects for the three probabilities in (3) are, assuming β > 0:

$$\begin{aligned}\frac{\partial\Pr(y=0)}{\partial x}&=-\phi(\beta'x)\,\beta<0,\\ \frac{\partial\Pr(y=1)}{\partial x}&=\left[\phi(\beta'x)-\phi(\mu-\beta'x)\right]\beta,\\ \frac{\partial\Pr(y=2)}{\partial x}&=\phi(\mu-\beta'x)\,\beta>0.\end{aligned}$$

(4)

In general, only the signs of the changes in Pr(y = 0) and Pr(y = J) are unambiguous. Greene (1990, 705) cautions that “[w]e must be very careful in interpreting the coefficients in this model.... Indeed, without a fair amount of extra calculation, it is quite unclear how the coefficients in the ordered-probit model should be interpreted.”
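A quick numerical check of (4), in Python with purely illustrative values of β′x, μ, and β (none of them taken from BFS), confirms that the outer probabilities always move in opposite directions while the middle effect depends on the densities at the two cutoffs:

```python
from math import exp, pi, sqrt

def phi(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def marginal_effects(xb, mu, beta):
    """Marginal effects in the three-category model of (4), for beta > 0."""
    dp0 = -phi(xb) * beta                   # Pr(y = 0) always falls
    dp1 = (phi(xb) - phi(mu - xb)) * beta   # sign depends on the two densities
    dp2 = phi(mu - xb) * beta               # Pr(y = 2) always rises
    return dp0, dp1, dp2

# Illustrative values only: b'x = 0.4, mu = 1.0, beta = 0.5.
dp0, dp1, dp2 = marginal_effects(0.4, 1.0, 0.5)
```

Since the three probabilities must still sum to one after the change, the three marginal effects sum to zero, which is a useful check on any hand calculation.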

## The BFS Dataset

The data used by BFS are available at the Journal of Applied Econometrics data website or in the MS Excel file Vanderbilt data set.xls. Table 1 identifies the variables in the dataset.

## Replication of the Ordered Probit Regression

At this point we are ready to begin the replication. Since it is easy to get lost in the process, I have created a list of steps that include both instructions on what to do and questions you need to answer. As part of this exercise you will be asked to complete several tables of results. In order to make this effort easier, I have provided a MS Word file, Tables for ordered probit discussion.doc, with the tables to be completed in it.

1. Load the data in Stata from Excel.

2. Convert MSAT and VSAT to MSAT/100 and VSAT/100, respectively, using the commands:

. replace msat = msat/100

. replace vsat = vsat/100

3. Common sense dictates that we should calculate the means and standard deviations of the variables to be sure that there are no entry errors. We need to construct a table that compares the means and standard deviations reported in BFS with those in our dataset. Table 2, which has the means and standard deviations reported by BFS, gives a place to put the means and standard deviations for the variables in our dataset. Fill in the information missing from Table 2.

| Variable | Our mean | Our std. dev. | BFS mean | BFS std. dev. |
|---|---|---|---|---|
| msat | | | 6.25 | 0.60 |
| foreign | | | 0.11 | 0.32 |
| female | | | 0.39 | 0.49 |
| emecon | | | 0.34 | 0.48 |
| emoss | | | 0.17 | 0.38 |
| emns | | | 0.21 | 0.41 |
| emh | | | 0.07 | 0.25 |
| am1 | | | 0.49 | 0.50 |
| am2 | | | 0.45 | 0.50 |
| am3 | | | 0.01 | 0.11 |
| phy1 | | | 0.67 | 0.47 |
| phy2 | | | 0.02 | 0.14 |
| chem1 | | | 0.82 | 0.39 |
| chem2 | | | 0.12 | 0.32 |
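The same sanity check can be sketched outside Stata. The calculation is just the sample mean and the sample standard deviation (with an n − 1 denominator); here is a Python illustration on a made-up 0/1 variable, not the BFS data:

```python
from statistics import mean, stdev

# Hypothetical values for a dummy variable such as `female`:
female = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]

m = round(mean(female), 2)   # for a dummy, the mean is the sample proportion
s = round(stdev(female), 2)  # sample standard deviation, n - 1 denominator
print(m, s)
```

A useful side effect of this check: for a dummy variable the mean must lie between 0 and 1, so an out-of-range mean in Table 2 signals a data-entry error immediately.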

4. Estimate the ordered probit regression using (in Stata) the commands:

. global indvar msat foreign female emecon emoss emns emh am1 am2 am3 phy1 phy2 chem1 chem2

. oprobit highestmath $indvar

5. Use the result of this estimation to complete Table 3.4

| highestmath | Coef. | Std. Err. | z | P>z | [95% Conf. Interval] |
|---|---|---|---|---|---|
| msat1 | | | | | |
| foreign | | | | | |
| female | | | | | |
| emecon | | | | | |
| emoss | | | | | |
| emns | | | | | |
| emh | | | | | |
| am1 | | | | | |
| am2 | | | | | |
| am3 | | | | | |
| phy1 | | | | | |
| phy2 | | | | | |
| chem1 | | | | | |
| chem2 | | | | | |
| _cut1 | | | | | |
| _cut2 | | | | | |
| _cut3 | | | | | |
| _cut4 | | | | | |
| _cut5 | | | | | |
| _cut6 | | | | | |
| Observations | | | | | |
| Log likelihood | | | | | |
| LR χ²(14) | | | | | |
| Prob > χ² | | | | | |
| Pseudo-R² | | | | | |

6. Compare your results with the table reported in the article. The table in the article is Table II on page 193 and is reproduced in Figure 3. What we are interested in is comparing column 4 in Figure 3 with columns 2 and 4 in Table 3. Table 4 below offers a model for this comparison.

Table 4. Comparison of ordered probit estimations.

| Variable | Our estimate | Our z | BFS estimate | BFS t-value |
|---|---|---|---|---|
| msat1 | | | 0.05 | 6.12 |
| foreign | | | 0.02 | 0.14 |
| female | | | 0.25 | 2.59 |
| emecon | | | -0.11 | 0.86 |
| emoss | | | -0.29 | 1.99 |
| emns | | | 0.43 | 3.10 |
| emh | | | -0.37 | 1.78 |
| am1 | | | 0.24 | 1.07 |
| am2 | | | 0.93 | 4.04 |
| am3 | | | 0.77 | 1.70 |
| phy1 | | | 0.26 | 2.71 |
| phy2 | | | 0.38 | 1.07 |
| chem1 | | | -0.12 | 0.69 |
| chem2 | | | 0.17 | 0.75 |
| Intercept | | | -3.09 | 5.48 |
| _cut1 | | | 0.27 | 7.29 |
| _cut2 | | | 0.33 | 8.16 |
| _cut3 | | | 1.52 | 20.32 |
| _cut4 | | | 1.79 | 23.07 |
| _cut5 | | | 2.04 | 23.72 |
| _cut6 | | | | |

7. It is easy to see from Table 4 that, almost without exception, the estimates of the parameters and their t-ratios are very similar. The exception arises with the estimates of the truncation points (_cut# in the Stata results). We will have to figure out what these are estimates of in order to make sense of them. Figure 1 shows the "cutoffs" that are being estimated. Footnote c in the BFS Table II on page 193 (shown in Figure 3) offers a useful observation:

In an ordered probit, an underlying, normally distributed, latent variable has a mean which is a function of observable variables. The latent variable gives rise to a set of observed dummy variables for ordered categories based on ranges between unobserved but estimable truncation points which correspond to levels of effort, ability, or other factors reflected in the explanatory variables. If L categories are observed, there are L − 1 truncation points, of which the first is normalized to be zero, so that L − 2 truncation points are estimated and reported in the table. The values correspond to standard deviations of the latent normally distributed variable.

The key idea is that the values of cutoffs are relative and can be normalized around any value. Notice that the Stata results do not report an intercept term but do report six cutoff values. Moreover, the difference between the estimate by Stata for the first cutoff (3.08402) and the estimate for the second cutoff (3.356916) is equal to 0.272896, which is itself equal to the first truncation point reported by BFS (1998: 193). Use Table 5 to report the difference between the first cutoff value and each of the cutoff points reported by Stata.

| Cutoff | Estimate | Estimate − _cut1 | BFS truncation points |
|---|---|---|---|
| _cut1 | 3.0840 | | |
| _cut2 | 3.3569 | | 0.27 |
| _cut3 | 3.4146 | | 0.33 |
| _cut4 | 4.6013 | | 1.52 |
| _cut5 | 4.8774 | | 1.79 |
| _cut6 | 5.1202 | | 2.04 |
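The arithmetic behind column 3 can be verified in a few lines of Python using the cutoff estimates from Table 5; the differences from _cut1, rounded to two decimals, do reproduce the BFS truncation points:

```python
# Stata's cutoff estimates from Table 5; subtracting _cut1 from each of
# the others should recover the truncation points BFS (1998: 193) report.
cuts = [3.0840, 3.3569, 3.4146, 4.6013, 4.8774, 5.1202]
diffs = [round(c - cuts[0], 2) for c in cuts[1:]]
print(diffs)  # -> [0.27, 0.33, 1.52, 1.79, 2.04]
```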

The second part of the reconciliation of the two sets of results is to compute the t-ratios. To do this we need the standard deviations of the estimates of the cutoff points reported by Stata, which means we need to retrieve the variance-covariance matrix from the regression. First, let's see what we are interested in computing. Let β̂i be the estimate of the ith cutoff point. In column 3 of Table 5 you computed α̂i = β̂i − β̂1 for i = 2, …, 6. The variance of this new variable is:

$$V(\hat{\alpha}_{i})=V(\hat{\beta}_{i})-2\,\mathrm{Cov}(\hat{\beta}_{i},\hat{\beta}_{1})+V(\hat{\beta}_{1})=\sigma_{i}^{2}-2\sigma_{i1}+\sigma_{1}^{2}$$

(5)

The variance-covariance matrix will give us estimates of these variances and covariances. When there are j parameters in a regression equation, this matrix is defined to be:

$$\Sigma=\begin{pmatrix}\sigma_{1}^{2} & \sigma_{12} & \cdots & \sigma_{1j}\\ \sigma_{21} & \sigma_{2}^{2} & \cdots & \sigma_{2j}\\ \vdots & \vdots & \ddots & \vdots\\ \sigma_{j1} & \sigma_{j2} & \cdots & \sigma_{j}^{2}\end{pmatrix}.$$

If you type the command .vce, Stata will report the estimate of this matrix, as shown in Figure 4. We need the section of this matrix shown in Part A of Table 6. Use equation (5) to estimate the standard errors of the estimates of the cutoff points, complete Part B of Table 6, and compare the resulting t-ratios with the values reported by Butler, et al. (shown in the last column of Table 6). Are you satisfied that we have been able to come reasonably close to the results reported in the article?

Part A. Relevant portion of the variance-covariance matrix.

| | _cut1 | _cut2 | _cut3 | _cut4 | _cut5 | _cut6 |
|---|---|---|---|---|---|---|
| _cut1 | 0.329 | | | | | |
| _cut2 | 0.329 | 0.330 | | | | |
| _cut3 | 0.329 | 0.330 | 0.331 | | | |
| _cut4 | 0.332 | 0.333 | 0.334 | 0.341 | | |
| _cut5 | 0.333 | 0.334 | 0.334 | 0.341 | 0.343 | |
| _cut6 | 0.333 | 0.334 | 0.335 | 0.342 | 0.343 | 0.345 |

Part B. Calculation of the t-ratios (with comparison to the values reported in BFS).

| | V(α̂) | Std. Dev.(α̂) | t-ratio | BFS t-ratio |
|---|---|---|---|---|
| _cut2 | | | | 7.29 |
| _cut3 | | | | 8.16 |
| _cut4 | | | | 20.32 |
| _cut5 | | | | 23.07 |
| _cut6 | | | | 23.72 |
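The Part B calculation can be sketched in Python using equation (5). The numbers below are read from Part A of Table 6 and from Table 5; note that the three-decimal rounding in Part A is too coarse to reproduce the BFS t-ratio (7.29) exactly, so treat the result as a rough check only:

```python
from math import sqrt

def t_ratio_of_difference(var_i, var_1, cov_i1, diff):
    """t-ratio of a cutoff difference alpha_i = _cut_i - _cut1 using
    equation (5): V(alpha_i) = V(cut_i) - 2 Cov(cut_i, cut_1) + V(cut_1)."""
    var_alpha = var_i - 2.0 * cov_i1 + var_1
    return diff / sqrt(var_alpha)

# _cut2 - _cut1: variances 0.330 and 0.329, covariance 0.329 (Part A),
# difference 0.2729 (Table 5).
t2 = t_ratio_of_difference(0.330, 0.329, 0.329, 0.2729)
```

Because the variances and covariance nearly cancel, V(α̂) is tiny and very sensitive to rounding; this is why working from the full-precision matrix that .vce reports matters.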

8. The next step in the process is to generate the term we will use in the estimation of the grade regression to account for the potential sample selection bias. To do this we will need to find a reference in the literature that offers a clear description of what we need to do. As it turns out, a reasonable explanation of the appropriate estimation technique is available in Jimenez and Kugler (1987). Since much of what follows comes directly from this article, I highly recommend you read it yourself.

The gist of the method is that the potential sample selection bias is accounted for by an inverse Mills ratio term for each of the categories. What we need to do is calculate:

$$\lambda_{j}=\frac{\phi(c_{j}-\hat{\beta}'x)-\phi(c_{j+1}-\hat{\beta}'x)}{\Phi(c_{j+1}-\hat{\beta}'x)-\Phi(c_{j}-\hat{\beta}'x)}$$

(6)

(where cj and cj+1 are the estimated cutoffs bounding math category j, with c0 = −∞ and c7 = +∞, and β̂′x is the fitted index from the ordered probit) for the category that the individual actually is in. What we will do is calculate (6) for all of the categories and then sum the products of these numbers and dummy variables indicating whether a given course is the highest math class completed by the individual. Since the dummy variables equal 0 for the math categories an individual is not in, the resulting sum preserves the value of (6) associated with the category the individual does belong to.

It is clear from (6) that we will need to retain the 6 cutoffs. We can do this with the commands:

. generate cutoff1 = _b[_cut1]

. generate cutoff2 = _b[_cut2]

. generate cutoff3 = _b[_cut3]

. generate cutoff4 = _b[_cut4]

. generate cutoff5 = _b[_cut5]

. generate cutoff6 = _b[_cut6]

Technically, this step is not necessary since the parameter estimates are preserved until the next regression is estimated; I suggest doing this purely as a precaution.

9. Preserve the predicted values of the ordered-probit estimation using the commands:

. predict zhat, xb

. predict phat1 phat2 phat3 phat4 phat5 phat6 phat7, p

These two commands will generate, for each observation, the fitted value of the latent index (zhat) and the probability that the individual falls in each of the seven categories. To see what is going on, we will retrieve some representative values of these variables and then graph them for one individual. Table 7 reports these values for 14 individuals in the sample. Now consider individual 2. Fitting a normal distribution with a mean of 4.25 and using the cutoff values from our estimation yields the probabilities that this individual is in each of the categories. For example, the probability that individual 2 will have completed no math classes is equal to 0.1217. Figure 5 illustrates the results for individual 1. The dashed vertical lines are the six cutoff values, which are the same for every individual. The solid vertical line is the zhat for individual 1. The heavy blue line represents the normal probability density function for this individual. While there is, of course, a different probability distribution for each individual, the cutoff values are the same for all members of the sample.

| Observation | Highest math class | zhat | Pr(0) | Pr(1) | Pr(2) | Pr(3) | Pr(4) | Pr(5) | Pr(6) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 3 | 3.9657 | 0.1890 | 0.0824 | 0.0194 | 0.4467 | 0.0816 | 0.0568 | 0.1241 |
| 2 | 0 | 4.2507 | 0.1217 | 0.0640 | 0.0158 | 0.4355 | 0.0975 | 0.0731 | 0.1923 |
| 165 | 0 | 3.5982 | 0.3036 | 0.1011 | 0.0225 | 0.4149 | 0.0575 | 0.0364 | 0.0640 |
| 166 | 6 | 4.6914 | 0.0540 | 0.0370 | 0.0098 | 0.3633 | 0.1097 | 0.0922 | 0.3340 |
| 214 | 3 | 3.4533 | 0.3560 | 0.1056 | 0.0229 | 0.3900 | 0.0483 | 0.0294 | 0.0478 |
| 215 | 3 | 4.0840 | 0.1587 | 0.0749 | 0.0180 | 0.4459 | 0.0887 | 0.0637 | 0.1501 |
| 225 | 3 | 3.5250 | 0.3296 | 0.1036 | 0.0228 | 0.4031 | 0.0528 | 0.0328 | 0.0553 |
| 226 | 3 | 3.6990 | 0.2693 | 0.0969 | 0.0219 | 0.4285 | 0.0641 | 0.0417 | 0.0776 |
| 453 | 3 | 3.9713 | 0.1875 | 0.0820 | 0.0194 | 0.4468 | 0.0819 | 0.0571 | 0.1253 |
| 454 | 5 | 4.1650 | 0.1399 | 0.0697 | 0.0170 | 0.4422 | 0.0932 | 0.0684 | 0.1697 |
| 495 | 3 | 4.4168 | 0.0913 | 0.0533 | 0.0135 | 0.4151 | 0.1043 | 0.0816 | 0.2409 |
| 496 | 0 | 2.9811 | 0.5410 | 0.1055 | 0.0212 | 0.2797 | 0.0236 | 0.0127 | 0.0162 |
| 526 | 0 | 2.9247 | 0.5633 | 0.1039 | 0.0207 | 0.2653 | 0.0214 | 0.0114 | 0.0141 |
| 527 | 3 | 3.9757 | 0.1863 | 0.0817 | 0.0193 | 0.4469 | 0.0822 | 0.0574 | 0.1262 |
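You can verify a row of Table 7 by hand. The Python sketch below (independent of the Stata workflow) rebuilds the seven category probabilities for individual 2 from the cutoff estimates in Table 5 and that individual's zhat; the results should agree with the corresponding table row to roughly four decimal places:

```python
from math import erf, sqrt

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Cutoff estimates from Table 5 and individual 2's fitted index from Table 7:
cuts = [3.0840, 3.3569, 3.4146, 4.6013, 4.8774, 5.1202]
zhat = 4.2507

# Bracket the cutoffs with -inf and +inf, then take CDF differences:
bounds = [float("-inf")] + cuts + [float("inf")]
probs = [Phi(hi - zhat) - Phi(lo - zhat) for lo, hi in zip(bounds, bounds[1:])]
print([round(p, 4) for p in probs])
```

For instance, the first entry is Φ(3.0840 − 4.2507) ≈ 0.1217, matching the Pr(0) reported for observation 2.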

Now we are ready to calculate (6). The commands are:

.generate lambda0 = (-normden(cutoff1-zhat))/(norm(cutoff1-zhat))

.generate lambda1 = (normden(cutoff1-zhat)-normden(cutoff2-zhat))/(norm(cutoff2-zhat)-norm(cutoff1-zhat))

.generate lambda2 = (normden(cutoff2-zhat)-normden(cutoff3-zhat))/(norm(cutoff3-zhat)-norm(cutoff2-zhat))

.generate lambda3 = (normden(cutoff3-zhat)-normden(cutoff4-zhat))/(norm(cutoff4-zhat)-norm(cutoff3-zhat))

.generate lambda4 = (normden(cutoff4-zhat)-normden(cutoff5-zhat))/(norm(cutoff5-zhat)-norm(cutoff4-zhat))

.generate lambda5 = (normden(cutoff5-zhat)-normden(cutoff6-zhat))/(norm(cutoff6-zhat)-norm(cutoff5-zhat))

.generate lambda6 = (normden(cutoff6-zhat))/(1-norm(cutoff6-zhat))

.generate lambda = m170*lambda0 + m171a*lambda1 + m172a*lambda2 + m171b*lambda3 + m172b*lambda4 + m221a*lambda5+m221b*lambda6

One thing to notice in these calculations is that cutoff0 is assumed to be −∞ and cutoff7 is assumed to be +∞.
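The seven .generate commands above all apply one formula, λⱼ = (φ(cⱼ − zhat) − φ(cⱼ₊₁ − zhat)) / (Φ(cⱼ₊₁ − zhat) − Φ(cⱼ − zhat)), with the ±∞ conventions just noted for cutoff0 and cutoff7. A compact Python sketch (with hypothetical cutoff values) may make the pattern easier to check:

```python
# Sketch of the seven selection-correction terms lambda0 ... lambda6:
# lambda_j = (phi(c_j - z) - phi(c_{j+1} - z)) / (Phi(c_{j+1} - z) - Phi(c_j - z)),
# with c_0 = -infinity and c_7 = +infinity, so that
# lambda_0 = -phi(c_1 - z)/Phi(c_1 - z) and
# lambda_6 =  phi(c_6 - z)/(1 - Phi(c_6 - z)).
from math import erf, exp, pi, sqrt

def phi(x):
    """Standard normal pdf; evaluates to 0.0 at +/- infinity."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def Phi(x):
    """Standard normal CDF; math.erf handles +/- infinity."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def lambdas(zhat, cutoffs):
    """All seven lambda terms for one observation (cutoffs are placeholders)."""
    bounds = [float("-inf")] + list(cutoffs) + [float("inf")]
    return [(phi(lo - zhat) - phi(hi - zhat)) / (Phi(hi - zhat) - Phi(lo - zhat))
            for lo, hi in zip(bounds[:-1], bounds[1:])]
```

A useful sanity check on the generated variables: because the φ terms telescope, the probability-weighted sum of the λⱼ over the seven categories is exactly zero for every observation.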

10. Now we are ready to estimate our regression explaining the grade each individual received in intermediate microeconomics. Use Table 8 to report the regression results for four specifications of the model. The first question is whether the null hypothesis of no sample selection bias can be rejected. How does this conclusion compare with BFS's conclusions? (See Table 9.) Second, since many of the potential explanatory variables, such as class size and the SAT scores, do not appear to be statistically significant, it is reasonable to focus our comments on the results reported in column (4) of Table 8.

What can you conclude about the impact of calculus on how well a student does in intermediate microeconomics? Do the final grades earned in a majority of the math classes affect the grade earned in intermediate microeconomics? Do the grades earned in any of the math classes positively and significantly affect the grade earned in intermediate microeconomics? Can you explain the impact of the freshman GPA on the grade earned in intermediate microeconomics? What, if any, are your bottom-line conclusions about what matters in determining the grade earned in intermediate microeconomics?

| Explanatory variables | Model (1) | Model (2) | Model (3) | Model (4) |
|---|---|---|---|---|
| Lambda |  | — |  | — |
| Sophomore |  |  | — | — |
| Senior |  |  | — | — |
| Same |  |  |  |  |
| Skip |  |  | — | — |
| M171a |  |  |  |  |
| M172a |  |  |  |  |
| M171b |  |  |  |  |
| M172b |  |  |  |  |
| M221a |  |  |  |  |
| M221b |  |  |  |  |
| GE100 |  |  |  |  |
| GDE100 |  |  |  |  |
| GE101 |  |  |  |  |
| GDE101 |  |  |  | — |
| GDE231 |  |  |  |  |
| Size |  |  |  | — |
| FGPA |  |  |  |  |
| Female |  |  |  |  |
| MSAT |  |  |  | — |
| VSAT |  |  |  | — |
| Grade in highest Math class | — | — | — |  |
| GM170 |  |  |  | — |
| GM171a |  |  |  | — |
| GM172a |  |  |  | — |
| GM171b |  |  |  | — |
| GM172b |  |  |  | — |
| GM221a |  |  |  | — |
| GM221b |  |  |  | — |
| Intercept |  |  |  |  |
| F(28, 580) |  | — | — | — |
| Prob > F |  | — | — | — |
| F(27, 581) | — |  | — | — |
| Prob > F | — |  | — | — |
| F(20, 588) | — | — |  | — |
| Prob > F | — | — |  | — |
| F(19, 589) | — | — | — |  |
| Prob > F | — | — | — |  |
| R-Squared |  |  |  |  |
| Root MSE |  |  |  |  |
| Sample Size | 609 | 609 | 609 | 609 |
MICRO-2 (dependent variable)

| Variableᵃ | Expected sign | Mean (SD) | Coefficient (t-value) |
|---|---|---|---|
| Intercept | — | — | -1.64 (3.48) |
| Selection bias correction (predicted residual) | + | -0.00 (0.92) | 0.10 (1.29) |
| *Level of calculus attained:* |  |  |  |
| Math 171A | + | 0.08 (0.27) | 0.39 (1.04) |
| Math 172A | + | 0.02 (0.13) | -0.18 (0.21) |
| Math 171B | + | 0.37 (0.48) | 1.02ᵇ (3.49) |
| Math 172B | + | 0.07 (0.25) | 1.52ᵇ (3.53) |
| Math 221A | + | 0.05 (0.22) | 1.33ᶜ (2.27) |
| Math 221B or 222 | + | 0.14 (0.35) | 0.75ᶜ (1.67) |
| *Grade in last calculus course:* |  |  |  |
| Math 170 | + | 3.06 (0.70) | 0.36ᵇ (4.36) |
| Math 171A | + | 2.22 (0.86) | 0.26ᶜ (2.21) |
| Math 172A | + | 2.94 (0.80) | 0.42 (1.54) |
| Math 171B | + | 2.62 (0.93) | 0.10ᶜ (1.85) |
| Math 172B | + | 2.63 (0.90) | -0.01 (0.10) |
| Math 221A | + | 3.10 (0.77) | -0.09 (0.55) |
| Math 221B or 222 | + | 3.15 (0.76) | 0.11 (1.04) |
| Grade deflator of instructor in intermediate theory course | + | -0.16 (0.27) | 0.88ᵇ (8.28) |
| Taken in Sophomore year | ? | 0.32 (0.47) | 0.07 (0.94) |
| Taken in Senior year | - | 0.06 (0.24) | -0.02 (0.13) |
| MICRO-1 and MICRO-2 in same academic year | + | 0.35 (0.48) | 0.04 (0.46) |
| At least one semester between MICRO-1 and MICRO-2 | - | 0.27 (0.44) | 0.13 (1.85) |
| Grade in MACRO-1 | + | 2.73 (0.73) | 0.20ᵇ (3.93) |
| Grade in MICRO-1 | + | 2.67 (0.74) | 0.29ᵇ (5.93) |
| *Instructor's grade deflator:* |  |  |  |
| MACRO-1 | - | -0.32 (0.20) | -0.33ᶜ (2.20) |
| MICRO-1 | - | -0.29 (0.16) | -0.11 (0.53) |
| Class size (intermediate theory course) | ? | 28.2 (5.5) | -0.002 (0.45) |
| Freshman Grade Point Average | + | 2.79 (0.46) | 0.29ᵇ (3.04) |
| Sex (female = 1; male = 0) | ? | 0.39 (0.49) | 0.13ᶜ (2.09) |
| SAT-Math score × 10⁻² | + | 6.25 (0.60) | 0.12ᶜ (1.75) |
| SAT-Verbal score × 10⁻² | + | 5.56 (0.67) | 0.04 (0.78) |
| *Overall results:* |  |  |  |
| Mean (SD) of dependent variable |  |  |  |
| Adjusted R² |  |  | 0.44 |
| Number of observations |  |  | 609 |

## Exercises

### Exercise 1

Quite often health professionals ask a patient to report their perception of their health status on a scale of 0 to 10, where 0 is the lowest possible health status and 10 is the highest. Data of this type are best analyzed using an ordered probit model. In this exercise you will analyze a data set of responses to a survey conducted in Germany between 1984 and 1995. The question we are interested in analyzing is the respondent's perception of his or her own health status.

The file Riphahn, Wambach, Million data.xls is an MS Excel file that contains 27,326 observations on 25 variables, one observation per line. The data are from Riphahn, Wambach, and Million (2003) and are also available on the web. The variables are defined in Table 10. As a first step you will need to load these data into Stata. However, because of the large sample size, you will first need to expand the memory available to Stata with the command: . set memory 1G. Here I have increased the memory to 1 gigabyte. This amount may be overkill, but it was big enough on my computer to handle the data.

| Column | Variable | Variable definition |
|---|---|---|
| A | ID | individual's ID number |
| B | Female | female = 1; male = 0 |
| C | Year | calendar year of the observation |
| D | Age | age in years |
| E | HSAT | health satisfaction, coded 0 (low) - 10 (high) |
| F | Handdum | handicapped = 1; otherwise = 0 |
| G | Handper | degree of handicap in percent (0 - 100) |
| H | HhnINC | household nominal monthly net income in German marks / 1000 |
| I | HHKIDS | children under age 16 in the household = 1; otherwise = 0 |
| J | Educ | years of schooling |
| K | Married | married = 1; otherwise = 0 |
| L | Haupts | highest schooling degree is Hauptschul degree = 1; otherwise = 0 |
| M | Reals | highest schooling degree is Realschul degree = 1; otherwise = 0 |
| N | FachHS | highest schooling degree is Polytechnical degree = 1; otherwise = 0 |
| O | Abitur | highest schooling degree is Abitur = 1; otherwise = 0 |
| P | Univ | highest schooling degree is university degree = 1; otherwise = 0 |
| Q | Working | employed = 1; otherwise = 0 |
| R | BlueC | blue collar employee = 1; otherwise = 0 |
| S | WhiteC | white collar employee = 1; otherwise = 0 |
| T | Self | self employed = 1; otherwise = 0 |
| U | Beamt | civil servant = 1; otherwise = 0 |
| V | DocVis | number of doctor visits in last three months |
| W | HospVis | number of hospital visits in last calendar year |
| X | Public | insured in public health insurance = 1; otherwise = 0 |
| Y | Addon | insured by add-on insurance = 1; otherwise = 0 |

One of the major problems with survey indices is that the numbers seem to mean different things to different respondents. One way to reduce this problem is to collapse the index into fewer outcomes by combining some of the responses. However, any way we do this is going to be ad hoc. Figure 6 shows a histogram of the responses to this question. Based on this graph, we will create five categories—(0) HSat = 0, 1, or 2; (1) HSat = 3, 4, or 5; (2) HSat = 6, 7, or 8; (3) HSat = 9; and (4) HSat = 10. We can create a new categorical variable called hsatnew with the command:

. recode hsat (0/2 = 0) (3/5 = 1) (6/8 = 2) (9 = 3) (10 = 4), generate(hsatnew)

Figure 7 shows the histogram of the new variable.
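If you want to double-check the category boundaries, the recode mapping can be mirrored in plain Python; the helper below is a sketch, and the sample values fed to it are illustrative only, not taken from the data.

```python
# Sketch of the Stata recode above: map the 0-10 health-satisfaction
# index into the five collapsed categories 0-4.
def collapse_hsat(hsat):
    """Return the collapsed category for an HSAT value in 0..10."""
    if 0 <= hsat <= 2:
        return 0
    if 3 <= hsat <= 5:
        return 1
    if 6 <= hsat <= 8:
        return 2
    if hsat == 9:
        return 3
    if hsat == 10:
        return 4
    raise ValueError(f"hsat out of range: {hsat}")

# Illustrative values, one from each band:
print([collapse_hsat(h) for h in [0, 2, 3, 5, 7, 9, 10]])  # [0, 0, 1, 1, 2, 3, 4]
```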

1. Create a table of summary statistics for (1) health status, (2) age, (3) household income, (4) years of education, (5) marital status, and (6) number of children, by year and sex. (You might want to use the command: .bysort year female: summarize varlist)
2. Estimate an ordered probit regression for 1988 for health status (the new variable) using age, income, education, married, and kids as the explanatory variables. Here you might want to use the command: .oprobit hsatnew age hninc educ married hhkids if year==1988.
3. Use the .predict newvariable, xb command to calculate the predicted mean values for each individual for the 1988 observations. Plot a histogram of these predicted values, and compare it to a histogram of xb computed for all years using the 1988 parameter estimates.
4. Estimate the ordered probit model for all of the years in the sample and put the results into a table like Table 11. (Here you might want to make use of the command: .bysort year: oprobit hsatnew varlist)
| Variable | 1984 | 1985 | 1986 | 1987 | 1988 | 1991 | 1994 |
|---|---|---|---|---|---|---|---|
| age |  |  |  |  |  |  |  |
| income |  |  |  |  |  |  |  |
| education |  |  |  |  |  |  |  |
| married |  |  |  |  |  |  |  |
| kids |  |  |  |  |  |  |  |
| _cut1 |  |  |  |  |  |  |  |
| _cut2 |  |  |  |  |  |  |  |
| _cut3 |  |  |  |  |  |  |  |
| _cut4 |  |  |  |  |  |  |  |
| Observations |  |  |  |  |  |  |  |
| LR χ²(5) |  |  |  |  |  |  |  |
| Prob > χ² |  |  |  |  |  |  |  |
| Log likelihood |  |  |  |  |  |  |  |
| Pseudo-R² |  |  |  |  |  |  |  |
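For reference, the objective that .oprobit maximizes can be written out directly. The sketch below evaluates the ordered probit log-likelihood on toy data; the coefficient, cutoffs, and observations are made up for illustration and are not from the survey.

```python
# Sketch of the ordered probit log-likelihood that .oprobit maximizes:
# Pr(y_i = j) = Phi(cut_{j+1} - x_i'beta) - Phi(cut_j - x_i'beta),
# with cut_0 = -infinity and cut_J = +infinity.
from math import erf, log, sqrt

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def oprobit_loglik(beta, cuts, X, y):
    """Log-likelihood for categories y_i in {0, ..., len(cuts)}."""
    bounds = [float("-inf")] + list(cuts) + [float("inf")]
    ll = 0.0
    for xi, yi in zip(X, y):
        xb = sum(b * v for b, v in zip(beta, xi))
        ll += log(Phi(bounds[yi + 1] - xb) - Phi(bounds[yi] - xb))
    return ll

# Toy example: one regressor, three categories (0, 1, 2), two cutoffs.
X = [(0.5,), (1.0,), (2.0,), (3.0,)]
y = [0, 1, 1, 2]
print(oprobit_loglik(beta=(1.0,), cuts=(1.0, 2.5), X=X, y=y))
```

Stata searches over beta and the cutoffs to maximize this sum; the _cut1 through _cut4 rows in Table 11 report the estimated cutoffs for the five collapsed health categories.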

## References

Amemiya, T. (1985). Advanced Econometrics (Cambridge, MA: Harvard University Press).

Bourguignon, François, Martin Fournier, and Marc Gurgand (2007). Selection bias corrections based on the multinomial logit model: Monte Carlo comparisons. Journal of Economic Surveys 21(1): 174-205.

Butler, J. S., T. Aldrich Finegan, and John J. Siegfried (1998). Does more calculus improve student learning in intermediate micro- and macroeconomic theory? Journal of Applied Econometrics 13(2): 185-202.

Chiburis, Richard and Michael Lokshin (2007). Maximum likelihood and two–step estimation of an ordered–probit selection model. The Stata Journal 7(2): 167-182.

Dahl, Gordon B. (2002). Mobility and the returns to education: testing a roy model with multiple markets. Econometrica 70(6): 2367–2420.

Dubin, Jeffrey A. and Daniel L. McFadden (1984). An econometric analysis of residential electric appliance holdings and consumption. Econometrica 52(2): 345–362.

Greene, William H. (1990). Econometric Analysis (New York: Macmillan Publishing Company).

Heckman, James J. (1979). Sample selection bias as a specification error. Econometrica 47(1): 153–161.

Jimenez, Emmanuel and Bernardo Kugler (1987). The earnings impact of training duration in a developing country an ordered probit selection model of Colombia's Servicio Nacional de Aprendizaje (SENA). Journal of Human Resources 22(2): 230-233.

Lee, Lung-Fei (1983). Generalized econometric models with selectivity. Econometrica 51(2): 507–512.

Maddala, G. S. (1983). Limited-Dependent and Qualitative Variables in Econometrics (Cambridge: Cambridge University Press).

Main, B. and B. Reilly (1993). The employer size-wage gap: Evidence for Britain. Economica 60: 125–142.

McFadden, Daniel L. (1973). Conditional logit analysis of qualitative choice behavior. In P. Zarembka (ed.) Frontiers in Econometrics (New York: Academic Press).

Newey, W. K. and Daniel L. McFadden (1994). Large sample estimation and hypothesis testing. In R. F. Engle and Daniel L. McFadden (eds.) Handbook of Econometrics, vol. IV (Amsterdam: North Holland).

Riphahn, Regina T., Achim Wambach, and Andreas Million (2003). Incentive effects in the demand for health care: a bivariate panel count data estimation. Journal of Applied Econometrics 18(4): 387-405.

Schmertmann, Carl P. (1994). Selectivity bias correction methods in polychotomous sample selection models. Journal of Econometrics 60(1): 101–132.

Vella, Francis (1998). Estimating models with sample selection bias. The Journal of Human Resources 33(1): 127-169.

## Footnotes

1. Butler, J. S., T. Aldrich Finegan, and John J. Siegfried (1998). Does more calculus improve student learning in intermediate micro- and macroeconomic theory? Journal of Applied Econometrics 13(2): 185-202.
2. This particular notation implies that there are k − 1 explanatory variables.
3. See Greene (1990): 704.
4. One way to make the conversion from the Stata output to the neater table relatively easily is to follow these steps: (1) replace each double space by a single space until there were none left; (2) replace each space with a tab (^t); (3) convert the material into a table using the "Insert/Table" command with a tab as the separator; and (4) clean up the table by moving the data into an Excel file, fixing the formatting, and returning the data to the Word file (alternatively, you can use formatting commands in Stata to control how the output appears).
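Steps (1) and (2) of the procedure above can also be automated. The sketch below collapses runs of spaces in a line of Stata output into single tabs, ready for a word processor's text-to-table conversion; the sample line is illustrative only.

```python
# Sketch of footnote steps (1)-(2): replace each run of spaces in a line
# of Stata output with a single tab so the line can be converted to a
# table row.  The sample line is made up for illustration.
import re

def spaces_to_tabs(line):
    """Trim the line and replace every run of spaces with one tab."""
    return re.sub(r" +", "\t", line.strip())

row = "female      0.13      0.06      2.09"
print(spaces_to_tabs(row))  # fields are now separated by single tabs
```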
