The importance of predicting the United States presidential election cannot be overstated, because of the impact the U.S. president has on national policy. The president holds the veto power, which can make it very difficult for Congress to pass legislation without the president's approval. In addition, the president appoints all federal judges, most notably the justices of the Supreme Court. While all nominees must be confirmed by the Senate, the confirmation process is usually a rubber stamp and few nominees are turned down, with some notable exceptions (such as Ronald Reagan's failed nomination of Robert Bork). The president is also the Commander in Chief of the U.S. military and, under the War Powers Resolution of 1973, can commit forces unilaterally provided Congress is notified within 48 hours. Lastly, presidents use executive orders to carry out their wishes, perhaps outside the constraints of the Constitution, and can choose which laws they want to enforce. For example, President Obama enforces neither the Defense of Marriage Act nor the deportation of illegal immigrants under the age of 30, Harry Truman ended racial segregation in the military, and Jimmy Carter created the Federal Emergency Management Agency (FEMA). Clearly the U.S. president has enormous power in the U.S. and abroad. Therefore, a company or an individual making decisions that demand foresight about the political state of the country would benefit from knowing who the next president will be.
Despite the large demand for election predictions, most existing methods are either unscientific or unreliable. Unscientific judgements include evaluations of a candidate's character or analyses of a candidate's rhetoric. It is not wrong to say that one candidate will win over another because of greater charisma or more appeal to the party base. Such statements are meaningful because political intangibles like charisma and appeal to the base are vital assets for any politician. The problem is that such intangibles lack a reliable measuring stick, something that can be viewed objectively rather than through the lens of political opinion, in which people may view candidates as they want to see them rather than as they really are.
The most common answer to the measuring stick problem is to use public opinion polls. However, public opinion polls are unreliable. Polls are no doubt useful as a first approximation when predicting an election, but they contain many pitfalls for the election forecaster. For example, national polls may not be able to predict who will win the electoral college. The electoral college evolved to limit direct democracy, and an important consequence is that a candidate can win a presidential election without winning the popular vote. Four U.S. presidents have done so: John Quincy Adams in 1824, Rutherford B. Hayes in 1876, Benjamin Harrison in 1888, and most recently George W. Bush in 2000[4]. Furthermore, national polls include voters from states of no consequence to the outcome because those states are solidly Republican or Democratic (e.g. Democratic voters from Texas or Republican voters from Massachusetts or New York). For these reasons, state-level polls are more accurate.
But where might a state poll go wrong? The most obvious answer is with undecided voters, those who have not made up their minds yet. For example, Rasmussen's latest Colorado poll has 5% of people undecided, 6% of people for some third-party candidate, and equal percentages for the two major party candidates[2]. State-level polls work well about a month before an election, when vice presidential candidates have been chosen, undecided voters are becoming less and less common, primary struggles are long finished, and party conventions are over. But as of the writing of this paper, these factors, crucial for the accuracy of polls, are not yet decided.
A key to predicting a presidential election is using swing states as the focus of the analysis. In the United States, swing states such as Ohio, Florida, Colorado, Virginia, and Nevada differ from other states in that they sit on the border between Republican and Democratic, with the vote hovering around 50% in recent elections. These swing states are often bellwethers for the presidency. For example, no president has been elected without winning Ohio since John F. Kennedy in 1960. Also, by focusing on the state level, data become more meaningful. For example, while the national unemployment rate is hovering around 8.2% according to the June jobs report from the Bureau of Labor Statistics, North Dakota has unemployment below 5% whereas Nevada and California have statewide unemployment rates above 10%[7]. Unemployment levels by county can vary even more from the national average[1]. The point here is that while national averages are indicative of the state of the country as a whole, they are less meaningful for the states in question.
One important insight about the level of detail available at the state and local level concerns the relationship between counties in a state. Two observations are critical for the prediction method described herein. The first is that strongly Republican counties are usually not right next to strongly Democratic counties, and that the change between Democratic and Republican counties in a state is gradual rather than sudden. The second is that the difference between two neighboring counties is not random and does not vary greatly between elections. If one knows the voting percentages of one county, it is possible to estimate the voting percentages of the neighboring counties using historical data. To take advantage of these facts, we can model the counties as a Markov Random Field and find the most likely outcome for each county (maximum a posteriori inference).
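
To make the idea concrete, the following Python sketch builds a tiny pairwise Markov Random Field over a chain of four counties and finds the maximum a posteriori labeling by exhaustive enumeration. It is only an illustration under stated assumptions, not the model developed in this paper: the county names, the adjacency structure, the reduction of each county to a binary Democratic/Republican label, and every numeric score are hypothetical, and brute-force search stands in for a scalable inference procedure.

```python
from itertools import product

# Hypothetical chain of four adjacent counties (names are illustrative only).
COUNTIES = ["Adams", "Baker", "Clark", "Dixon"]
EDGES = [("Adams", "Baker"), ("Baker", "Clark"), ("Clark", "Dixon")]

LABELS = ("D", "R")  # assumed binary outcome per county

# Made-up unary scores (log-potentials): local evidence for each label.
UNARY = {
    "Adams": {"D": 2.0, "R": 0.0},  # strong Democratic evidence
    "Baker": {"D": 0.3, "R": 0.0},  # weak evidence
    "Clark": {"D": 0.0, "R": 0.5},  # weak evidence
    "Dixon": {"D": 0.0, "R": 2.0},  # strong Republican evidence
}

# Assumed pairwise log-potential: neighbouring counties that share a label
# earn this bonus, encoding the "gradual change" observation.
AGREE_BONUS = 1.0


def score(assignment):
    """Unnormalised log-probability of a full labeling under the toy MRF."""
    total = sum(UNARY[c][assignment[c]] for c in COUNTIES)
    total += sum(AGREE_BONUS for u, v in EDGES if assignment[u] == assignment[v])
    return total


def map_assignment():
    """Exact MAP by enumerating all 2^n labelings (feasible only for tiny n)."""
    best, best_score = None, float("-inf")
    for labels in product(LABELS, repeat=len(COUNTIES)):
        assignment = dict(zip(COUNTIES, labels))
        s = score(assignment)
        if s > best_score:
            best, best_score = assignment, s
    return best, best_score
```

Calling `map_assignment()` on this toy graph labels Adams and Baker Democratic and Clark and Dixon Republican: the weakly informed middle counties are pulled toward their strongly informed neighbors, which is the behavior the two observations above suggest. Enumeration grows exponentially with the number of counties, so a real state would require approximate or structure-exploiting inference instead.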










