2016 Postmortem

Loki Liesmith

(4,602 posts) Tue Sep 27, 2016, 11:31 AM

How to do as well as 538 with only National Poll Averages (or why I'm jealous of Nate Silver).

Last edited Tue Sep 27, 2016, 04:50 PM - Edit history (1)

So I was recently accused by a DUer (whom I hold in the highest esteem) of being “jealous of Nate Silver”. While this is undoubtedly true, I want to really dig into the roots of my jealousy. Trauma buried this deep needs to be excavated and examined, before one can really begin to heal.

You see, I’ve been very disturbed by Silver’s methodology this year, the way it gyrates almost in sync with the polls, when what it is SUPPOSED TO DO is aggregate the information in the polls and give us a stabler and truer picture of where the race is (and maybe where it’s headed).

My suspicion has been that all Nate has really constructed is a glorified poll average, with all his other bells and whistles providing marginal added value (if any at all).

So with that in mind, I converted two graphics into data series, one from the Huffington Post pollster site (formerly pollster.com) and one from Silver’s site, and looked at the data from the last 117 days or so. From HuffPost, I rasterized the default smoothed national poll averages for HRC and DJT; from 538, the polls-only win probabilities.

I then constructed two non-linear models from the HuffPost data. Model 1 performs a multiple regression of Trump’s 538 win probability (you only need his, because if you know his, you know hers, assuming no third parties can win) against the Trump national poll average from HuffPost and the square of that poll average. Model 2 adds the Clinton poll average and the square of her poll average to the mix. The resulting plots can be seen below (note the graph is inverted because of the direction convention on JPEG indices; if it bothers you, think of it as HRC's win prob).
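For anyone who wants to try the same kind of fit, here is a minimal sketch of the two regressions in Python with NumPy. It is not the code used for this post: the helper name `fit_poly_model` is hypothetical, and the three series are synthetic placeholders standing in for the digitized HuffPost averages and 538 probabilities, so the numbers it prints say nothing about the real data.

```python
import numpy as np

def fit_poly_model(target, *poll_series):
    """Ordinary least squares of `target` on an intercept plus each
    poll series and its square. Returns (coefficients, fitted, R^2)."""
    cols = [np.ones_like(target)]
    for s in poll_series:
        cols += [s, s ** 2]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    fitted = X @ coef
    ss_res = np.sum((target - fitted) ** 2)
    ss_tot = np.sum((target - target.mean()) ** 2)
    return coef, fitted, 1.0 - ss_res / ss_tot

# SYNTHETIC placeholder data (assumption): 117 daily values, not the real series.
rng = np.random.default_rng(0)
days = 117
trump_avg = 40 + np.cumsum(rng.normal(0, 0.3, days))
clinton_avg = 45 + np.cumsum(rng.normal(0, 0.3, days))
trump_win_prob = (1 / (1 + np.exp(-0.5 * (trump_avg - clinton_avg)))
                  + rng.normal(0, 0.02, days))

# Model 1: Trump average and its square.
coef_trump, fitted_trump, r2_trump = fit_poly_model(trump_win_prob, trump_avg)
# Model 2: adds the Clinton average and its square.
coef_both, fitted_both, r2_both = fit_poly_model(
    trump_win_prob, trump_avg, clinton_avg)

print(f"Model 1 R^2: {r2_trump:.3f}")
print(f"Model 2 R^2: {r2_both:.3f}")
```

Because Model 1 is nested inside Model 2, the second R² can never come out lower than the first on the same data, so a jump between the two shows only that the extra terms help in-sample.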

As you can see, Model 1 (Polynomial Model Trump Poll Average in the legend) approximates the 538 curve reasonably well. Really very well. It captures 71% of the variability in the 538 Trump win probability. Model 2 (Polynomial Model Both Poll Average in the legend) is even better. It captures 87% of the variability in the 538 Trump win probability. The lower graph shows just how well this model tracks the 538 probabilities.
Plotting the output of the model against the 538 probabilities, you can see a strongly linear relationship.
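A side note on that scatter plot, sketched under assumptions: for a least-squares fit that includes an intercept, the squared correlation between the model output and the target is the same number as the "variability captured" (R²), so the linear look of the plot and the percentage are two views of one quantity. The data below is synthetic and the variable names are illustrative only.

```python
import numpy as np

# SYNTHETIC placeholder data (assumption), 117 points to mirror the post.
rng = np.random.default_rng(1)
poll_avg = rng.uniform(35, 45, 117)
target = 0.02 * (poll_avg - 40) ** 2 + 0.05 * poll_avg + rng.normal(0, 0.05, 117)

# Quadratic least-squares fit with an intercept.
X = np.column_stack([np.ones_like(poll_avg), poll_avg, poll_avg ** 2])
coef, *_ = np.linalg.lstsq(X, target, rcond=None)
fitted = X @ coef

# Pearson correlation between model output and target.
r = np.corrcoef(fitted, target)[0, 1]
# R^2 from the residual and total sums of squares.
r2 = 1.0 - np.sum((target - fitted) ** 2) / np.sum((target - target.mean()) ** 2)

print(f"corr^2 = {r ** 2:.6f}, R^2 = {r2:.6f}")
```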

There is a little structure to the noise, but it’s nothing to write home about, and I could probably reduce it further by adding a few more terms to the model. But capturing almost 90% of the variability, well, you are probably going to be right in ~90% of elections. Unless a razor-thin year 2000 scenario comes up, you should be just fine looking at the national poll averages to see who will win.

So…why am I jealous of Nate Silver? Because he’s got one hell of a scam going on right now.

I coulda been a contender.

2 replies

ffr

(22,671 posts)

1. I'd agree. I think his model's weaknesses are too heavily weighted

Reply to Loki Liesmith (Original post)

Tue Sep 27, 2016, 11:38 AM

on polling data from unscientific polls that some media outlets are pushing to present a picture that there is a horse race and it's a close horse race. Those polls have no value. They even create their own false narrative about where the campaign stands between the two candidates.

I guess we'll find out how far off his prediction models are after the election. I'm thinking a couple of percentage points and millions of votes in favor of Hillary.

Loki Liesmith

(4,602 posts)

2. He actually did lowball Obama in 2012

Reply to ffr (Reply #1)

Tue Sep 27, 2016, 11:51 AM

although this model is new and is designed to be hyper-reactive to new information...because he got so badly burned by predicting against Trump in the primaries.

Essentially the model is designed to assume a 50-50 race and any bit of info that moves the estimate back in that direction is weighted more heavily than information in the opposite direction.

He designed this model to save his reputation. I'm wondering if the opposite might happen.
