# A sample Honors paper

Module by: Christopher Curran

Summary: This module offers an introduction to how to put together an empirical research paper. The topic is the impact of per se 0.08 BAC laws on automobile fatalities. The module leads the reader through the whole process of putting together the paper and includes a data set for the student to work with.

Traditionally, empirical research papers in economics journals have five or more sections. In the first section, unimaginatively known as the introduction, the researcher briefly (1) describes what question he or she is attempting to answer, (2) indicates why the reader should be interested in the answer to the question, and (3) often summarizes the paper's conclusions. It is traditional in the second section for authors to discuss the institutional background to the question and provide a theoretical model to be used in the estimation process. Quite often it makes more sense to refer to the variables in conceptual terms in this section and leave the actual specification of the variables to later parts of the paper. A traditional example of this is the ubiquitous "socioeconomic variables" included in many economic models. The reason for this generality is that perfect measures of the variables conceived in most models are not available, and most researchers are forced to use proxies for the variables in the model when completing their empirical work. For this reason it is traditional in the third section of the paper to discuss what variables are used as proxies for the variables mentioned in the model. For instance, many papers use this section to specify what variables will proxy the "socioeconomic variables." It is also appropriate to discuss shortcomings of the data set in the third section.

Economists use the fourth section of the paper to describe the econometric model estimated along with the statistical issues created by the shortcomings of data and the model. The fourth section of the paper also usually includes a presentation of the empirical estimations and a discussion of the implications of the estimations for the central questions of the paper. The fifth section of the paper usually includes a recap of the research, a discussion of the implications of the empirical work, and suggestions for further research.

Obviously, not all economics journal articles are split into the five sections described above; every author has his or her own way of organizing an argument. Indeed, how a paper is organized will reflect the story the author is trying to tell. It is as James Joyce noted in A Portrait of the Artist as a Young Man, in art "the whole is related to the parts and the parts are related to the whole." In a well-crafted paper the author's message dictates the organizational structure of the paper, and the material in each section must relate back to this message. In what follows we will outline what might go into each of these sections, leaving it to you to fill in the missing parts.

## Section 1. Introduction

In this hypothetical Honors paper we examine the impact of a law change on a desired outcome of the law. In particular, sometime during the years leading up to 2007 all of the states adopted a 0.08 per se rule on the blood alcohol content (BAC) for determining whether a driver is drunk: after passage of the law, any driver with a BAC of 0.08 or higher is presumed to be driving under the influence. Some of the states also have a "zero tolerance for underage drinking and driving" level that applies only to drivers under age 21. Defense of drivers accused of DUI is, not surprisingly, big business for lawyers. Table 1 reports some of the current DUI laws by state as reported on the website of a law firm specializing in DUI cases.

| State | Per se BAC Level | Zero Tolerance BAC Level | Enhanced Penalty BAC Level | State | Per se BAC Level | Zero Tolerance BAC Level | Enhanced Penalty BAC Level |
|---|---|---|---|---|---|---|---|
| Alabama | 0.08 | 0.02 | N/A | Montana | 0.08 | 0.02 | 0.18 |
| Alaska | 0.08 | 0.00 | 0.16 | Nebraska | 0.08 | 0.02 | 0.15 |
| Arizona | 0.08 | 0.00 | 0.15 | Nevada | 0.08 | 0.02 | 0.18 |
| Arkansas | 0.08 | 0.02 | 0.15 | New Hampshire | 0.08 | 0.02 | 0.16 |
| California | 0.08 | 0.01 | 0.15 | New Jersey | 0.08 | 0.01 | N/A |
| Colorado | 0.08 | 0.02 | 0.20 | New Mexico | 0.08 | 0.02 | 0.16 |
| Connecticut | 0.08 | 0.02 | 0.16 | New York | 0.08 | 0.02 | 0.18 |
| Delaware | 0.08 | 0.02 | 0.15 | North Carolina | 0.08 | 0.00 | 0.16 |
| DC | 0.08 | 0.00 | 0.20 | North Dakota | 0.08 | 0.02 | 0.18 |
| Florida | 0.08 | 0.02 | 0.15 | Ohio | 0.08 | 0.02 | 0.17 |
| Georgia | 0.08 | 0.02 | 0.15 | Oklahoma | 0.08 | 0.00 | 0.15 |
| Hawaii | 0.08 | 0.02 | 0.15 | Oregon | 0.08 | 0.00 | N/A |
| Idaho | 0.08 | 0.02 | 0.20 | Pennsylvania | 0.08 | 0.02 | 0.16 |
| Illinois | 0.08 | 0.00 | 0.16 | Rhode Island | 0.08 | 0.02 | 0.15 |
| Indiana | 0.08 | 0.02 | 0.15 | South Carolina | 0.08 | 0.02 | 0.15 |
| Iowa | 0.08 | 0.02 | 0.15 | South Dakota | 0.08 | 0.02 | 0.17 |
| Kansas | 0.08 | 0.02 | 0.15 | Tennessee | 0.08 | 0.02 | 0.20 |
| Kentucky | 0.08 | 0.02 | 0.18 | Texas | 0.08 | 0.00 | 0.15 |
| Louisiana | 0.08 | 0.02 | 0.15 | Utah | 0.08 | 0.00 | 0.16 |
| Maine | 0.08 | 0.00 | 0.15 | Vermont | 0.08 | 0.02 | N/A |
| Maryland | 0.08 | 0.02 | N/A | Virginia | 0.08 | 0.02 | 0.15 |
| Massachusetts | 0.08 | 0.02 | 0.20 | Washington | 0.08 | 0.02 | 0.15 |
| Michigan | 0.08 | 0.02 | N/A | West Virginia | 0.08 | 0.02 | N/A |
| Minnesota | 0.08 | 0.00 | 0.20 | Wisconsin | 0.08 | 0.00 | 0.17 |
| Mississippi | 0.08 | 0.02 | N/A | Wyoming | 0.08 | 0.02 | 0.15 |
| Missouri | 0.08 | 0.02 | 0.15 | | | | |

The theoretical justifications for the per se BAC level rule are (1) that it will provide a disincentive for individuals to drive after drinking and (2) that it will reduce the cost of prosecuting DUI drivers. In terms of economics, the law aims to reduce the negative externalities created by drunk drivers. The question to be examined in this paper is whether the per se laws have reduced the number of automobile fatalities. Presumably, if the law is successful in reducing the number of DUI drivers, it will reduce the number of accidents they cause and, thus, reduce the number of DUI fatalities. Whether the per se BAC law does reduce the number of automobile fatalities, and thus is a useful law, is the empirical issue this paper proposes to investigate.

### Exercises

1. The introduction or section 2 should include a discussion of the current state of the literature. What, if anything, is written in economics journals about the impact of DUI laws on the automobile fatality rate?
2. The introduction presented above is very "thin". How would you fill out this discussion? Is this the appropriate place to introduce a discussion of the institutional history of the adoption of the per se BAC law?
3. How would your introduction be affected by the results you report later in the paper?
4. A priori, do you think that the per se BAC law is an effective way of reducing drunk driving, or is it just a placebo for voters upset with drunk drivers (like MOM)? Does it "matter" to you as a researcher whether the per se BAC law is effective?

## Theoretical issues

Any model of automobile fatalities is a function of the unit of observation. Since we are interested in the impact of state laws on automobile fatalities, it seems reasonable that we construct a model to explain the differences in automobile fatalities at the state level (although it is tempting to use county-level data). There are interstate differences that potentially explain differences in fatalities. First, people drive more in physically larger states and in states with larger populations than they do in other states. Since more driving increases the probability of an accident, we need to standardize our measure of fatalities by the vehicle miles driven in the state. It is traditional in the empirical literature to measure fatalities as fatalities per 100 million vehicle miles driven rather than as the raw number of fatalities; in the interest of simplicity we follow this tradition.
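The standardization described above is simple arithmetic, and a minimal Python sketch may make it concrete. The function name and the two example states are hypothetical and are not drawn from the module's data set.

```python
def fatalities_per_100m_vmt(fatalities, vehicle_miles):
    """Return fatalities per 100 million vehicle miles driven."""
    if vehicle_miles <= 0:
        raise ValueError("vehicle_miles must be positive")
    return fatalities / (vehicle_miles / 100_000_000)


# Hypothetical states: B has twice the deaths of A but four times the driving.
rate_a = fatalities_per_100m_vmt(500, 25_000_000_000)
rate_b = fatalities_per_100m_vmt(1_000, 100_000_000_000)
print(rate_a, rate_b)
```

In this made-up comparison the state with more fatalities has the lower fatality rate, which is the reason for standardizing by vehicle miles driven.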

A second physical characteristic that affects the fatality rate is the type of road used in a state. In particular, it is well known that in the United States perhaps the safest roads are rural interstate highways. Thus, in our model we will need to hold constant the type of highway in the state. An additional variable that potentially affects the fatality rate is the mix of drivers. In particular, given the propensity of insurance companies to charge higher rates to individuals under the age of 25, it is reasonable to assume that the more young drivers in the state, the higher the fatality rate. Similarly, given the tendency of the elderly to have slower reaction times, it is possible that the presence of more elderly drivers would drive up the automobile accident rate.

There are several behavioral variables that might affect driving habits and, thus, automobile accident rates. First, it seems reasonable to assume that the value of time and the cost of death are higher for wealthier people than they are for less wealthy drivers. However, the direction of the effect of income on driver behavior is unclear. A person with a higher value of time might be more willing to speed than one with a lower value of time because time spent driving is time not spent earning income or engaging in leisure. Additionally, and here the issue is very uncertain, a wealthier person may be less willing to engage in risky driving or drinking behavior because he or she has more income to lose than a poorer individual.

A second variable that affects the behavior of individuals is the cost of gasoline. Higher gas prices will cause individuals to drive less and closer to the gas-efficient speed. Most often, driving closer to the gas-efficient speed implies a slower and safer speed. Moreover, since all drivers are driven toward the gas-efficient speed, the variance in speeds on the highways should be reduced. In either case, a higher price of gasoline should cause the number of automobile fatalities to fall. Since gasoline is purchased on the world market, the major source of differences in state-level gasoline prices is differences among the state gasoline taxes. Similarly, we would expect things like state taxes on alcohol consumption and the strictness of the DUI laws to reduce both the amount of alcohol consumption and the amount of driving under the influence.

In the most general terms the model to be estimated is:

$$FPVMD = f(\text{type of roads},\ \text{mix of drivers},\ \text{income},\ \text{cost of gasoline},\ \text{state laws}) \qquad (1)$$

where FPVMD is a measure of the number of automobile fatalities per vehicle mile driven annually in a state. In the next section of the paper we will make this model usable by choosing specific variables to proxy the explanatory variables.

### Exercises

1. The model described is incomplete (as are almost all useful models). What, if anything, would you add to the model?
2. Often the models in economics papers involve constrained optimization models that yield the predictions that are tested in the empirical part of the paper. Are there any optimization models implicit in the description above?

## The data

Many of the states adopted the 0.08 BAC per se standard between 1994 and 2008; in fact, all states had adopted this standard by 2007. Thus, a panel data set covering all 50 states and the District of Columbia should offer enough variance in this variable to enable us to evaluate the effectiveness of the law. The Department of Transportation and the Census Bureau provide enough data to enable us to construct a reasonable data set for all of the states for this period. What should ensue here is a detailed description of all of the variables in the data set along with the sources used to collect the data. However, we leave the construction of this part of the paper to you and resort to summarizing the variables included in the data set in Table 2. The data are available in the "Data set" sheet of Auto_fatalities_data.xls; the definitions of the FIPS codes are included in a sheet named "State FIPS codes" in the same file. Table 3 defines the variables included by column in the "Data set" sheet of Auto_fatalities_data.xls.

Care needs to be taken when gathering the data because some sources list the states in alphabetical order by the full name of the state, which is the way the FIPS codes order the states. In this case Delaware precedes the District of Columbia. Other sources list the states in alphabetical order by each state's abbreviation. In these cases the District of Columbia precedes Delaware because DC precedes DE. This sorting causes several states to appear in an order different from their order in the FIPS codes. A similar problem occurs when working with county-level data because some government sources list all county names beginning with Mc ahead of all other county names beginning with M, while other sources list county names beginning with Mc after county names beginning with Ma. In both cases, ordering all of the state or county data by FIPS code prevents confusion about the order of the observations.
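The ordering problem can be illustrated with a short Python sketch. The four-state list is only an excerpt chosen for illustration, and the variable names are hypothetical.

```python
# (FIPS code, full name, postal abbreviation) for a few neighboring entries.
states = [
    (9, "Connecticut", "CT"),
    (10, "Delaware", "DE"),
    (11, "District of Columbia", "DC"),
    (12, "Florida", "FL"),
]

# Order produced by each sorting rule, shown as abbreviations.
by_name = [s[2] for s in sorted(states, key=lambda s: s[1])]
by_abbrev = [s[2] for s in sorted(states, key=lambda s: s[2])]
by_fips = [s[2] for s in sorted(states, key=lambda s: s[0])]

print(by_name)    # full-name order
print(by_abbrev)  # abbreviation order swaps DC and DE
print(by_fips)    # FIPS order matches the full-name order here
```

Sorting every source by FIPS code before combining them, or better still matching observations on the FIPS code rather than on row position, avoids pairing one state's data with another's.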

| Variable | Source | Period |
|---|---|---|
| FIPS code identifying each state | | 1994-2008 |
| Fatalities from automobile accidents | http://www-fars.nhtsa.dot.gov/States/StatesFatalitiesFatalityRates.aspx | 1994-2008 |
| Fatalities per 100 million vehicle miles driven | www-fars.nhtsa.dot.gov | 1994-2008 |
| State gas tax rate per gallon in dollars | www.fhwa.dot.gov/policyinformation/statistics | 1994-2008 |
| Real state gas tax rate per gallon in 2009 dollars | State gas tax rate per gallon in dollars divided by the CPI with a base year of 2009 | 1994-2008 |
| State cigarette tax per pack in dollars | State Sales, Gasoline, Cigarette, and Alcohol Tax Rates by State, 2000-2010 | 2000-2008 |
| State tax on spirits | State Sales, Gasoline, Cigarette, and Alcohol Tax Rates by State, 2000-2010 | 2000-2008 |
| State tax on wine | State Sales, Gasoline, Cigarette, and Alcohol Tax Rates by State, 2000-2010 | 2000-2008 |
| State tax on beer | State Sales, Gasoline, Cigarette, and Alcohol Tax Rates by State, 2000-2010 | 2000-2008 |
| Vehicle miles driven on state rural interstates | Table VM-202 for various years on: http://www.fhwa.dot.gov | 1994-2008 |
| Total vehicle miles driven on state rural roads | Table VM-202 for various years on: http://www.fhwa.dot.gov | 1994-2008 |
| Vehicle miles driven on state urban interstates | Table VM-202 for various years on: http://www.fhwa.dot.gov | 1994-2008 |
| Total vehicle miles driven on state urban roads | Table VM-202 for various years on: http://www.fhwa.dot.gov | 1994-2008 |
| Percent of the registered drivers under the age of 20 | Table VM-202 for various years on: http://www.fhwa.dot.gov | 1994-2008 |
| Percent of the registered drivers under the age of 25 | Table DL-22 for various years on: http://www.fhwa.dot.gov | 1994-2008 |
| Percent of the registered drivers over age 70 | Table DL-22 for various years on: http://www.fhwa.dot.gov | 1994-2008 |
| Percent of the registered drivers over age 75 | Table DL-22 for various years on: http://www.fhwa.dot.gov | 1994-2008 |
| Percent of the registered drivers over age 80 | Table DL-22 for various years on: http://www.fhwa.dot.gov | 1994-2008 |
| Percent of the registered drivers over age 85 | Table DL-22 for various years on: http://www.fhwa.dot.gov | 1994-2008 |
| State mean family income in 2009 dollars (see footnote 1) | http://www.census.gov | 1994-2008 |
| Dummy variable = 1 if the state has passed the 0.08 per se BAC law; 0 otherwise | NHTSA, Regional Office. Updated as of December 1, 2008. | 1994-2008 |

| Column | Column title | Variable |
|---|---|---|
| A | FIPS | FIPS code identifying each state |
| B | Year | Variable denoting the year; ranges from 1994 to 2008 |
| C | Fatalities | Fatalities from automobile accidents |
| D | DPVM | Fatalities per 100 million vehicle miles driven |
| E | SGasTax | State tax on gasoline, $/gallon |
| F | RSGasTax | Real state tax on gasoline, 2009$/gallon |
| G | CigTax | State tax on cigarettes, dollars per 20-pack |
| H | SpTax | State tax on spirits, dollars per gallon |
| I | WineTax | State tax on wine, dollars per gallon |
| J | BeerTax | State tax on beer, dollars per gallon |
| K | RuralInterstateVMD | Vehicle-miles driven in a year on rural interstates, 100 million |
| L | RuralTotalVMD | Vehicle-miles driven in a year on all rural roadways, 100 million |
| M | UrbanInterstateVMD | Vehicle-miles driven in a year on urban interstates, 100 million |
| N | UrbanTotalVMD | Vehicle-miles driven in a year on all urban roadways, 100 million |
| O | PU20 | Percent of licensed drivers under the age of 20 |
| P | PU25 | Percent of licensed drivers under the age of 25 |
| Q | PO70 | Percent of licensed drivers over the age of 70 |
| R | PO75 | Percent of licensed drivers over the age of 75 |
| S | PO80 | Percent of licensed drivers over the age of 80 |
| T | PO85 | Percent of licensed drivers over the age of 85 |
| U | BACPS | Dummy variable equal to 1 if the state has adopted the 0.08 BAC per se law; 0 otherwise |
| V | RMFI09 | Median family income in a state in 2009 dollars |

### Exercises

1. At this point in your thesis you would want to point out that each of the variables in the data set is a proxy for one of the variables discussed in part 2 of your paper. As an exercise, explain how each of the explanatory variables in Table 2 is a proxy for an explanatory variable mentioned in the theory section.
2. It would seem that the "cleanest" variable in the whole data set is "fatalities." Look up the official definition of how a fatality from an automobile accident is measured. Does this variable still seem to have a clear and unequivocal meaning?

## Empirical estimation

Now we are almost ready to present the estimation results from the model. There are a few things we need to cover before we move to presenting the estimation results. First, what, if any, are the econometric issues raised by the model and the data set? In this case we are using a panel data set to estimate the regression:

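$$fpvmd_{it} = \beta_0 + \sum_{j=1}^{k-1} \beta_j x_{jit} + \beta_k D_{it}^{BAC} + \varepsilon_{it} \qquad (2)$$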

where $fpvmd_{it}$ is the number of fatalities per 100 million vehicle miles driven in state $i$ in year $t$, $x_{jit}$ is the $j$th explanatory variable in state $i$ in year $t$, and $D_{it}^{BAC}$ is the dummy variable equal to 1 if state $i$ has a 0.08 per se BAC law in year $t$. From a policy point of view, what we are interested in is the sign of $\beta_k$, the coefficient on $D_{it}^{BAC}$, and whether $\beta_k$ is statistically different from zero. At this point it would be appropriate to discuss whether you intend to use a fixed effects or a random effects model. In the interest of simplicity, we will use a fixed effects model, but in your own research you would need to consider using either model.
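To see what a fixed effects model does with panel data, the following Python sketch computes the within estimator for a single regressor: each variable is demeaned by its state mean, which removes any time-invariant state effect, and the slope is then estimated from the demeaned data. The function name and the tiny panel are made up for illustration; the module's own estimates come from Stata with several regressors and robust standard errors, which this sketch does not attempt.

```python
from collections import defaultdict


def within_estimator(rows):
    """Fixed effects (within) slope for one regressor.

    rows: iterable of (state, x, y) tuples.
    """
    groups = defaultdict(list)
    for state, x, y in rows:
        groups[state].append((x, y))

    sxy = 0.0
    sxx = 0.0
    for obs in groups.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            # Deviations from the state mean sweep out the state effect.
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    if sxx == 0:
        raise ValueError("regressor has no within-state variation")
    return sxy / sxx


# Made-up panel: state A has a base rate of 1.5, state B of 2.0, and the
# law (x = 1) lowers the rate by 0.1 in both states.
panel = [
    ("A", 0, 1.5), ("A", 0, 1.5), ("A", 1, 1.4), ("A", 1, 1.4),
    ("B", 0, 2.0), ("B", 0, 2.0), ("B", 0, 2.0), ("B", 1, 1.9),
]
beta_bac = within_estimator(panel)
print(round(beta_bac, 4))
```

Because the two hypothetical states differ in their base fatality rates, a pooled regression that ignored the state effect would mix that difference into the estimate; the within transformation recovers the assumed effect of the law.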

A second issue that needs to be considered is whether you plan to use a linear model as specified above or whether you might use the natural logarithm of the fatality rate. Since we have no a priori reason to believe that the relationship between the fatality rate and the explanatory variables is linear, we will also estimate both a log-linear and a log-log model. In this way we can test whether our policy conclusions are sensitive to the mathematical specification of our model.
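One way to write the three specifications side by side, inferred from the Stata commands listed later in this module, is sketched below. The state effect $\alpha_i$, the error term $\varepsilon_{it}$, and the index set $C$ (the regressors that are not percentages, namely the gasoline tax and family income) are notation assumed here for illustration rather than taken from the original.

```latex
\begin{align*}
\text{Linear:} \quad & fpvmd_{it} = \alpha_i + \sum_{j=1}^{k-1} \beta_j x_{jit} + \beta_k D_{it}^{BAC} + \varepsilon_{it} \\
\text{Log-linear:} \quad & \ln fpvmd_{it} = \alpha_i + \sum_{j=1}^{k-1} \beta_j x_{jit} + \beta_k D_{it}^{BAC} + \varepsilon_{it} \\
\text{Log-log:} \quad & \ln fpvmd_{it} = \alpha_i + \sum_{j \in C} \beta_j \ln x_{jit} + \sum_{j \notin C} \beta_j x_{jit} + \beta_k D_{it}^{BAC} + \varepsilon_{it}
\end{align*}
```

In all three forms the BAC dummy and the percentage variables enter in levels; only the dependent variable and, in the log-log case, the gasoline tax and income are transformed.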

Now we are ready to report the results of the estimation. The key here is to avoid writing a travelog of the estimations. Instead, report all of the regressions in one or more tables and then discuss the results presented in each table.

### Exercises

1. In our estimations we use (a) a linear model, (b) a log-linear model, and (c) a log-log model. What is the economic interpretation of the estimated parameters in each of the models? Be sure to discuss both dummy variables and continuous variables.
2. Why does it make more sense to use an explanatory variable itself rather than the log of that explanatory variable when the explanatory variable is a percentage?

## Notes on the estimation of the model

Since you will find it useful to replicate the estimation of the basic results, this section consists mainly of a set of instructions in Table 4 for use with Stata.

| Instruction | Stata commands |
|---|---|
| 1. Open Stata and copy the data in Auto_fatalities_data.xls into the data editor. You will have 765 observations of 22 variables. | |
| 2. Tell Stata what variable denotes the state | `iis fips` |
| 3. Tell Stata what variable denotes the year | `tis year` |
| 4. Create the new variable the percentage of the total vehicle miles driven that are on rural interstate roads | `generate privmd = ruralinterstatevmd/(ruraltotalvmd + urbantotalvmd)` |
| 5. Create the new variable the percentage of the total vehicle miles driven that are on urban interstate roads | `generate puivmd = urbaninterstatevmd/(ruraltotalvmd + urbantotalvmd)` |
| 6. Create the logarithmic transformation of all of the variables that are not percentages | `generate lz = log(z)`, where z = dpvm, sgastax, rsgastax, and rmfi09 |
| 7a. Estimate the fixed effects model for the linear model (see output in Figure 1) | `xtreg dpvm rsgastax pu25 po70 privmd puivmd rmfi09 bacps, fe vce(robust) vsquish` |
| 7b. Estimate the fixed effects model for the log-linear model (see output in Figure 2) | `xtreg ldpvm rsgastax pu25 po70 privmd puivmd rmfi09 bacps, fe vce(robust) vsquish` |
| 7c. Estimate the fixed effects model for the log-log model (see output in Figure 3) | `xtreg ldpvm lrsgastax pu25 po70 privmd puivmd lrmfi09 bacps, fe vce(robust) vsquish` |
| 8. Place the results into a table, making it easier to compare your results; Table 5 is one such table. | |
| 9a. The results in Table 5 suggest that the per se 0.08 BAC is a successful way to reduce automobile deaths. However, the sign on the real gasoline tax rate is the opposite of what we might reasonably expect. Let's check the sensitivity of our results by rerunning the same three regressions with the real gasoline tax replaced by the nominal gasoline tax. See Table 6 for the results of these regressions. | `xtreg dpvm sgastax pu25 po70 privmd puivmd rmfi09 bacps, re vce(robust) vsquish` |
| 9b. | `xtreg ldpvm sgastax pu25 po70 privmd puivmd rmfi09 bacps, fe vce(robust) vsquish` |
| 9c. | `xtreg ldpvm lsgastax pu25 po70 privmd puivmd lrmfi09 bacps, fe vce(robust) vsquish` |
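For readers who want to check the constructed variables outside Stata, steps 4 through 6 can be sketched in Python as follows. The function name and the example record are hypothetical; the dictionary keys follow the lowercase variable names used in the Stata commands.

```python
import math


def add_constructed_variables(row):
    """Return a copy of one state-year record with the constructed variables."""
    out = dict(row)
    total_vmd = row["ruraltotalvmd"] + row["urbantotalvmd"]
    # Steps 4 and 5: shares of total vehicle miles driven on interstates.
    out["privmd"] = row["ruralinterstatevmd"] / total_vmd
    out["puivmd"] = row["urbaninterstatevmd"] / total_vmd
    # Step 6: natural logs of the variables that are not percentages.
    for name in ("dpvm", "sgastax", "rsgastax", "rmfi09"):
        out["l" + name] = math.log(row[name])
    return out


# Hypothetical state-year record; the numbers are invented for illustration.
record = {
    "ruralinterstatevmd": 50.0, "ruraltotalvmd": 200.0,
    "urbaninterstatevmd": 75.0, "urbantotalvmd": 300.0,
    "dpvm": 1.5, "sgastax": 0.20, "rsgastax": 0.25, "rmfi09": 60000.0,
}
result = add_constructed_variables(record)
print(result["privmd"], result["puivmd"])
```

Note that the two interstate variables are computed as fractions between 0 and 1, and that a logarithm is undefined for a zero value, so `math.log` would raise an error for a state-year with a zero tax rate.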

At this point it makes some sense to compare the parameter estimates for the 0.08 BAC per se law; this comparison, shown in Table 5, suggests that the effect of the per se 0.08 BAC law was to reduce fatalities. Moreover, the estimates for each of the models are very stable whether one uses the real price of gasoline or the nominal price of gasoline, thus giving us some more confidence in our conclusions.

| | Linear | Log-linear | Log-log |
|---|---|---|---|
| **State tax on gasoline in 2009 dollars** | | | |
| State has a 0.08 per se BAC law | -0.1054 | -0.0692 | -0.0594 |
| | (-3.88) | (-3.67) | (-3.18) |
| **State tax on gasoline in current dollars** | | | |
| State has a 0.08 per se BAC law | -0.1191 | -0.0778 | -0.0762 |
| | (-4.83) | (-4.54) | (-4.52) |
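When the dependent variable is a natural logarithm, the coefficient on a dummy variable can be converted into an exact percentage change with $100 \times (e^{\beta} - 1)$. The Python sketch below applies this standard conversion to the two logarithmic estimates reported above for the real gasoline tax; the function name is hypothetical.

```python
import math


def dummy_effect_percent(beta):
    """Percent change in the outcome when a dummy switches from 0 to 1
    in a model whose dependent variable is a natural logarithm."""
    return 100.0 * (math.exp(beta) - 1.0)


log_linear_real = dummy_effect_percent(-0.0692)
log_log_real = dummy_effect_percent(-0.0594)
print(round(log_linear_real, 1), round(log_log_real, 1))
```

Under this conversion the reported estimates correspond to reductions in the fatality rate of roughly 6.7 percent and 5.8 percent. The linear-model coefficient is instead read directly as a change in fatalities per 100 million vehicle miles driven.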

The balance of this section of the paper would be devoted to further tests of the stability of our results under varying assumptions. Among other tests one would expect to see if the choice of a fixed-effects model affects your policy conclusions.

### Exercises

1. Complete the Lagrange test for random effects for each of the three models, using the nominal price of gasoline. Organize the results of this test into a table.
2. Re-estimate the three models, replacing the percent of registered drivers under the age of 25 with the percent of drivers under 20. Make the same kind of replacement for the percent of drivers over age 70 (i.e., experiment with the alternative age cutoffs: over 75, over 80, and over 85). Do any of your major conclusions change?
3. What, if any, explanation can you give for the differences in the parameter estimates for the price of gasoline generated when the real price of gasoline is replaced by the nominal price of gasoline?

## Conclusions and further research

This section of your paper should be devoted to a careful recapping of your results and to suggestions for further research. Such a discussion might include some cautious guesses at why the 0.08 BAC per se standard appears to affect driver behavior. The discussion could also include some estimates of the number of lives saved by the introduction of a per se standard.

## Footnotes

1. This variable is the mean family income in a state divided by the Consumer Price Index for 2009. The CPI is from Table B-60 of the Economic Report of the President, 2010.

