Talk Elections » Election Archive » 2016 U.S. Presidential Election
Author Topic: 538 Model Megathread  (Read 86232 times)
Mr. Morden
Atlas Legend
*****
Posts: 44,066
United States


« Reply #25 on: November 02, 2016, 02:17:41 PM »

What is the uncertainty of the model, given the # of simulations they run?  That is, each time they enter new polls, they apparently run 10,000 simulations based on the latest #s, and that produces (among other things) an overall projected vote margin and win probability.  But let's say they then ran *another* 10,000 simulations with the same input #s but a different random seed?  How different would the results be?  Because if they'd be different, then in theory you could put in a favorable poll for one candidate and it would end up "helping" the other candidate, just because of simulation noise.  But I'm assuming that 10,000 is enough for the simulation noise to be small?


This is purely a statistical noise effect; given Clinton's win percentage of 70% or so, we shouldn't be surprised by jumps of 0.6% or so in Clinton's win percentage just from rerunning the simulations (1 sigma).  If you start looking at individual battlegrounds, we definitely shouldn't be surprised if one of them jumps a percent or two just from statistical noise.

I wouldn't have expected changes that significant with 10,000 trials, although I will say my technical experience in statistics is fairly limited.

Thinking about it some more, I guess it's just Poisson noise.  It's like taking a poll of 10,000 people.  Your margin of error is small, but it's not going to be as low as 0.1%.  You should expect the win probability for one of the candidates to shift by ~0.5% from one set of simulations to the next (meaning that the gap between them will shift by ~1% from one set of simulations to the next).

Which means, yeah, if the win probability changes by about half a percent, it is statistically meaningless.  You would literally expect a shift of that size just from using exactly the same set of polls, but running the simulations a second time.  And given that there are 50 states, there are going to be at least a few states where the simulation creates a big shift, even when the polls haven't changed at all.
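A quick way to see the size of this run-to-run simulation noise (a toy sketch, not 538's actual code; the ~70% win probability and the 10,000-trial count are taken from the discussion above):

```python
import random

def win_prob_estimate(p_true, n_sims, rng):
    # Estimate a win probability by Monte Carlo: count wins in n_sims draws.
    wins = sum(1 for _ in range(n_sims) if rng.random() < p_true)
    return wins / n_sims

rng = random.Random(0)
p_true, n_sims = 0.70, 10_000

# Two independent runs with the same "true" inputs, different random draws:
run1 = win_prob_estimate(p_true, n_sims, rng)
run2 = win_prob_estimate(p_true, n_sims, rng)

# Theoretical 1-sigma noise on a single run: sqrt(p * (1 - p) / N), ~0.46%
sigma = (p_true * (1 - p_true) / n_sims) ** 0.5
print(f"run 1: {run1:.1%}, run 2: {run2:.1%}, 1-sigma noise: {sigma:.2%}")
```

Rerunning this with different seeds shows the estimate wandering by roughly half a point even though nothing about the "polls" changed, which is exactly the phantom-shift effect described above.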
Mr. Morden


« Reply #26 on: November 02, 2016, 02:25:47 PM »

And the T+1 Nevada poll gives her .2% nationally too huh.

Yeah, I just noticed that too.  It's a poll that's very mildly more pro-Trump than the Nevada average, but it's a very low-weight poll.  It's only the 8th most heavily weighted poll for Nevada at the moment, meaning that it should have virtually no impact on the overall win statistics.  And we see the national win numbers shift to Clinton by 0.2%, which again is not really because of this poll, but because of the statistical noise in the model.
Mr. Morden


« Reply #27 on: November 02, 2016, 02:43:07 PM »

IIRC, the old polls from the same pollster get less weight at the moment a newer poll is added to the database.

It might be part of the explanation.

I don't think you need that to explain it.  As I said, shifts of around ~0.5% in the win probability are going to happen even when you don't add any new polls at all, simply because of statistical noise.  And that's an average.  Sometimes the "phantom shifts" that cannot be explained by any particular poll are larger than that.
Mr. Morden


« Reply #28 on: November 02, 2016, 02:48:44 PM »

IIRC, the old polls from the same pollster get less weight at the moment a newer poll is added to the database.

It might be part of the explanation.

I don't think you need that to explain it.  As I said, shifts of around ~0.5% in the win probability are going to happen even when you don't add any new polls at all, simply because of statistical noise.  And that's an average.  Sometimes the "phantom shifts" that cannot be explained by any particular poll are larger than that.


OK. 10,000 seems like a lot for a 0.5% shift to me.

As I said, it's basically equivalent to taking a poll with 10,000 people in it.  The MoE is not going to be less than 0.5%.
Mr. Morden


« Reply #29 on: November 02, 2016, 04:44:05 PM »

Now I think about it, when Silver went on a mini-rant during a recent podcast, emphasising how big he thought Trump's chance of winning was, he sounded genuinely scared. I wonder if he sees it as part of his mission to get Democrats to take the threat seriously and go out and vote. If so, I can't help feeling there are better ways of doing it.

He applied punditry over his own model's numbers during the primary...

Not really.  He didn't have a model during the primaries.  The reason being, as he put it, modeling the primaries is too difficult.  Once we were deep into the primaries, he made some toy models based on the demographics of people who had voted so far, but by that time he was acknowledging that Trump was the favorite.
Mr. Morden


« Reply #30 on: November 02, 2016, 04:53:26 PM »

He didn't have an overall prediction model like for the presidential race, but he did have a per-state model which took into account all the polls done for the state.

He didn't roll that out until around Christmas time, by which time he was already saying that Trump had a good chance of winning the nomination, if not the favorite.  Most of the "Trump can't win" comments he made were back in the summer / early fall of 2015, before he had any such model.
Mr. Morden


« Reply #31 on: November 02, 2016, 05:11:47 PM »

He didn't have an overall prediction model like for the presidential race, but he did have a per-state model which took into account all the polls done for the state.

He didn't roll that out until around Christmas time, by which time he was already saying that Trump had a good chance of winning the nomination, if not the favorite.  Most of the "Trump can't win" comments he made were back in the summer / early fall of 2015, before he had any such model.

To follow up on the above: I just dug up the blog post in which Nate announced the primary forecasts.  Looks like it started on Jan. 12, 2016:

http://fivethirtyeight.com/features/how-we-are-forecasting-the-2016-presidential-primary-election/

By that time he was no longer saying that Trump was doomed.  At the time that he was saying Trump was doomed, he didn't have any model.
Mr. Morden


« Reply #32 on: November 03, 2016, 09:24:03 AM »

For polls-plus, the addition of the AZ Saguaro Strategies poll, which had Clinton +1 (adjusted to Trump +1), actually increased Trump's chances from 34.1 to 34.4, which seems bizarre to me given where Trump should be in AZ.  The only reason I can think of is that this poll is slightly more positive for Trump than the previous Saguaro Strategies poll, which had Clinton +2.

Again, shifts of about half a percent are totally meaningless, given that it's based on 10,000 simulations.  You can take the exact same set of polls and run them in a second set of 10,000 simulations, and you'll get, on average, a shift of about half a percent in Clinton's win probability.
Mr. Morden


« Reply #33 on: November 03, 2016, 02:32:55 PM »

Like, how important is the ground game in this state compared to the nation? Is a certain state harder or easier to poll? What about early voting?

How can you put ground game and early voting into the model?  There's no way to quantify "ground game" in any meaningful way, and early voting?  It hasn't been around for very long.  The model is based on how accurately various parameters have predicted election outcomes over the past X years.  You can't plug in things that have only been important for the past couple of elections.  You'll just get garbage.

But as I've said before, that's kind of missing the point here.  Yes, there are additional factors that you might think are important in making subjective assessments of the probabilities.  And you're free to use them when you make subjective assessments!  The analogy I used before was if you're watching a football game, and the commentator says "28% of the time that a game has this score, and this field position, with this much time left, the underdog wins."  Maybe there are other factors that you think are important that are harder to quantify, like the weather.  If so, you're free to use them in setting your own subjective probabilities.  You're free to mentally shift that 28% higher or lower.  It's just a baseline, telling you how other teams have done in the past under those conditions.

That's what the 538 model is.  It's just telling you "X% of the time that the underdog has been behind by this much in the polls, he's won."  You're welcome to mentally adjust that up or down depending on what other factors you think are relevant.
Mr. Morden


« Reply #34 on: November 04, 2016, 03:00:40 PM »

Ok... I'm annoyed by this. One poll from UT comes in, with Trump up 7, and the national probability of his winning goes UP 1.3%. That's a ridiculous amount of oversensitivity.

Mr. Morden explained it, but I think he made a mistake about the "on average" part.

I don't think you need that to explain it.  As I said, shifts of around ~0.5% in the win probability are going to happen even when you don't add any new polls at all, simply because of statistical noise.  And that's an average.  Sometimes the "phantom shifts" that cannot be explained by any particular poll are larger than that.

What's wrong with what I said about 0.5% being an average?  That's around the average shift you'd expect when you run a second set of 10,000 simulations with the exact same polls.  Do you disagree?
Mr. Morden


« Reply #35 on: November 04, 2016, 05:19:36 PM »

Somehow, a Clinton +18 VA poll helped Trump's chances (combined with a T+6 UT poll).  I guess they view Virginia as gone and say that that means Clinton is wasting a ton of her popular vote there? 

Again, it is pointless to read anything into shifts in win probability of a few tenths of a percent, since such shifts could simply be due to random chance in the simulations rather than any real change in outcome of the model being introduced by the latest polls.
Mr. Morden


« Reply #36 on: November 04, 2016, 05:56:04 PM »

What's wrong with what I said about 0.5% being an average?  That's around the average shift you'd expect when you run a second set of 10,000 simulations with the exact same polls.  Do you disagree?


IDK, it sounded to me as if you were talking about the mean when you said "the average shift".  The mean of the shift should be zero.
Or were you talking about abs(mean) or the variance/std?

I assume he means root mean square, which in this case should work out to sqrt(2) * the usual standard deviation for sample proportion.  That usual standard deviation is sqrt(p * (1-p) / N), which for p=.643 in the polls-only and N = 10000, is 0.5 percentage points.  Multiplying that by the sqrt(2) factor gives 0.7 percentage points.

A bit more math if you like:

The difference between two independent runs of a normally distributed random variable with some mean μ and some standard deviation σ is itself a normally distributed random variable with mean 0 and standard deviation √2 · σ.

What this means in practice is that, at the moment, 68% of the time the (statistical error only) shift between simulation runs should be less than 0.7 percentage points, and 95% of the time it should be less than 1.4 percentage points.

If you are seeing shifts of 1.4 percentage points more than 5% of the time that means:

1) The poll updates are actually having an effect (which is why they are put into the model in the first place, of course).

and/or

2) You're cherry-picking (consciously or not) the large shifts.
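Erc's arithmetic above can be checked directly; a quick sketch (p = 0.643 and N = 10,000 are the numbers from the post):

```python
import math

p, n = 0.643, 10_000

# Standard deviation of a single Monte Carlo win-probability estimate:
sigma = math.sqrt(p * (1 - p) / n)       # ~0.0048, i.e. ~0.5 points

# The difference of two independent runs has standard deviation sqrt(2) * sigma:
sigma_diff = math.sqrt(2) * sigma        # ~0.0068, i.e. ~0.7 points

# ~95% of run-to-run shifts should fall within 1.96 * sigma_diff:
bound_95 = 1.96 * sigma_diff             # ~1.3-1.4 points

print(f"sigma = {sigma:.4f}, sigma_diff = {sigma_diff:.4f}, 95% bound = {bound_95:.4f}")
```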

Here’s what I was wondering about though Erc:

If you were talking about the forecast for 10,000 simulations of the outcome of a single state, then yes, your math as above is correct.  However, what the simulations actually do is simulate all 50 states (albeit with heavy correlations between them) and then sum up the electoral votes for each candidate in the simulation to project an overall national winner for that run of the simulation.  Since each state has its own run of random numbers, doesn’t that bring the statistical noise down at the national level?

Now, it probably doesn’t bring it down by that much, because 1) the states are heavily correlated with each other and 2) there are only a few swing states with any realistic chance of flipping the national winner from one candidate to the other.  But it should still bring the noise down at the national level a little, right?
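One way to play with the effect described above is a toy sketch with made-up state margins and electoral votes (not 538's actual states, correlations, or model), where a shared national error correlates the states and each state also gets its own independent noise:

```python
import random

# Hypothetical toy race: (electoral votes, leader's margin in points) per "state".
STATES = [(29, 1.0), (20, 2.5), (18, -0.5), (16, 3.0), (15, -2.0),
          (38, -8.0), (55, 15.0), (29, 9.0), (20, 6.0), (29, 0.5)]
TOTAL_EV = sum(ev for ev, _ in STATES)

def simulate(n_sims, rng, national_sd=3.0, state_sd=2.0):
    """Fraction of simulations the leader wins.  A shared national error
    correlates all the states; each state also draws independent noise."""
    wins = 0
    for _ in range(n_sims):
        national_err = rng.gauss(0, national_sd)
        ev = 0
        for votes, margin in STATES:
            if margin + national_err + rng.gauss(0, state_sd) > 0:
                ev += votes
        if ev > TOTAL_EV / 2:
            wins += 1
    return wins / n_sims

rng = random.Random(1)
# Two runs with identical inputs differ only by simulation noise:
print(simulate(10_000, rng), simulate(10_000, rng))
```

Varying `state_sd` relative to `national_sd` in this sketch is a way to probe the question above: the more the per-state noise is independent, the more it averages out across states, while the shared national error does not.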
Mr. Morden


« Reply #37 on: November 05, 2016, 11:08:16 AM »

At the top of their Friday daily podcast, the 538ers said they were getting a lot of tweets and emails asking "Why did this poll with this rating push the win probabilities this much?", and in his answer Harry didn't mention statistical noise at all.  Nate himself wasn't on the podcast, though.  I'm assuming that Nate has thought about statistical noise in the model, but I'm not sure if anyone else there has.
Mr. Morden


« Reply #38 on: November 05, 2016, 11:54:20 AM »

Nick Gourevitch @nickgourevitch  1 hour ago
There has not been a large (2 points+) polling error that has gone against Democrats since 1996. Chart via @ForecasterEnten:



But that's only five elections ago, so I'm not sure it really means much one way or the other.  Small-number statistics based on five events aren't that compelling.
Mr. Morden


« Reply #39 on: January 02, 2017, 09:35:24 PM »

That makes a lot of sense. I think 538 might use past election results as a baseline for how nationwide margins translate into statewide ones - thus, it tends to underestimate the extent to which States are trending away from their prior results.

Do they actually do that in the polls only model on states that actually get polled though?

Even if they do, what about if you compare how straight up polling averages in the more heavily polled states (taken from somewhere like HuffPo or RCP, which doesn't include the 538 secret sauce) did relative to their actual vote results?  Do you see the same correlation with the trend map?  If so, then maybe it means that the pollsters themselves were weighting their results in a way that would match up with the 2012 results, and so were blind to real shifts?
Mr. Morden


« Reply #40 on: January 02, 2017, 09:47:03 PM »

In the polls-only they used national polls to estimate the outcome in each state, but gave individual state polls more weight than national polls. Polls-plus was exactly the same but was set to be less reactive to sudden polling shifts and modify the numbers based on economic conditions, which predicted that the race should be around 50-50.

But were they using 2012 vote results as part of the prediction in any of the models?  Or maybe they were doing that, but only in states that never got polled?
Mr. Morden


« Reply #41 on: January 02, 2017, 10:44:46 PM »

In the polls-only they used national polls to estimate the outcome in each state, but gave individual state polls more weight than national polls. Polls-plus was exactly the same but was set to be less reactive to sudden polling shifts and modify the numbers based on economic conditions, which predicted that the race should be around 50-50.

But were they using 2012 vote results as part of the prediction in any of the models?  Or maybe they were doing that, but only in states that never got polled?


Not entirely sure. They explained the usage of national polls to predict state polls as (basically) "Well, if the polls were showing the republican pulling away nationally, you'd expect them to be doing well in all or almost all of the battleground states".

Well, in any case, like I said, it would be interesting to see this: if you take just a straight-up polling average for each state, ignoring all the extra things that 538 does, and compare it to the actual results, does the map of deviations correlate with the trend map posted upthread?  If so, then I think it hints that the pollsters themselves were weighting things to look like 2012, which caused them to miss the real changes that were taking place.