538 Model Megathread (user search)
Author Topic: 538 Model Megathread  (Read 84482 times)
Erc
Junior Chimp
*****
Posts: 5,823
Slovenia


« on: July 21, 2016, 10:59:42 AM »

The main difference between polls only and polls plus, by my understanding, is that polls plus bakes in an expected reversion to the mean: the "fundamentals" of the race (i.e. the economy is doing decently but this would be a third consecutive term for the Democrats) suggest this should be a close race, and polls plus gives some weight to that fact.

Of course, I would think that the real fundamental of this race is that Donald Trump is the Republican nominee, and that the natural "mean" in such a race is a 10-point loss for Trump.  Then again, such thinking wasn't really helpful in the primary.

Erc
« Reply #1 on: July 21, 2016, 11:25:52 AM »

People don't seem to understand that Trump has no turnout game, which as a Republican means almost no chance. Look for an Iowa Caucus repeat.

Turnout game matters a lot less in the general than for caucuses, for obvious reasons, though it could matter on the margins.

That said, there are a lot of things generally relating to the likely/registered voter distinction (ground game, the Latino vote, the undecideds, Johnson, etc.) that I'm not confident the polls will ever have a good handle on, and they could prove to be a surprise in November.

Erc
« Reply #2 on: July 25, 2016, 01:31:50 PM »

Serious question - why does the Nowcast do a trendline? That doesn't make sense to me. If Clinton leads in Nevada now (as she does per their poll model) why would the Nowcast not predict her as ahead?

Maybe I'm misunderstanding something, but the latest 5 it's showing have Trump up in 3 and Clinton up in 1. The most recent is Clinton +4, but it gets adjusted to a tie based on other factors (e.g. national polling, covariance with other states). So I guess it doesn't surprise me.

If you look at the Nevada Nowcast it has a polling average of Clinton +1.5%.

Adjustment for some of their stuff brings it to Clinton +1.2%.

THEN they adjust for "trend" which swings it by 4.5% in Trump's favour and gives it as Trump +3.4%.

I don't get what trend they're adjusting for if it's "election held today". The net swing is roughly the same as in the forecasts for November 8th.

Perhaps it's the fact that Trump has gained in recent national polling, and all of the polls are at least two weeks old?  Since Trump has gained nationally, you'd expect him to have gained in Nevada as well compared to the old polls.
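
To make that guess concrete, here is a rough sketch of the idea: shift an old state poll by however much the national average has moved since the poll was taken. This is only an illustration of the reasoning, not 538's actual trend adjustment; the function name and all of the national numbers are made up.

```python
def trend_adjusted_margin(state_margin, national_then, national_now):
    """Shift an old state poll margin by the national movement since it was taken.

    Margins are in percentage points; positive means Clinton leads.
    Hypothetical simplification, not 538's published method.
    """
    national_shift = national_now - national_then
    return state_margin + national_shift


# Hypothetical numbers: a state average of Clinton +1.2 built from polls taken
# when the national average was Clinton +5.0; the national average has since
# fallen to Clinton +0.5.
adjusted = trend_adjusted_margin(1.2, national_then=5.0, national_now=0.5)
print(f"Adjusted margin: {adjusted:+.1f}")
```

With those made-up inputs, a 4.5-point national move toward Trump turns a small Clinton lead into a Trump lead, even in an "election held today" estimate.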

It's amusing that the current polls-only forecast is the 269-269 tie, but there's something seriously wrong with any model that gives Trump a better than 1-in-3 chance of winning.

Erc
« Reply #3 on: August 04, 2016, 08:18:37 PM »

After AZ and GA, the next state to flip is.... SOUTH CAROLINA! Y'all better wake up - I've been saying this for too long!

We here on Atlas seemed to disagree, with it placing a rather distant 4th among next-tier pickups (behind Alaska, Montana, and Texas; IN/MO/MS/UT not included in that poll).

What's this based off of?  Some close polls from back in November?  South Carolina, like other heavily racially polarized Southern states with relatively little Yankee influx, doesn't seem like a great target.

Erc
« Reply #4 on: August 04, 2016, 08:26:56 PM »
« Edited: August 04, 2016, 09:01:47 PM by Erc »

Here's a map that 538 seems to think is as likely as a 270-268 Trump win (according to the ever-excitable Now-Cast):



468 Clinton - 70 Trump (assuming NE-1 and NE-2 for Clinton).

EDIT: next target is apparently Kansas.

Erc
« Reply #5 on: September 22, 2016, 05:49:57 PM »

I thought everything about Silver's most recent post...

fivethirtyeight.com/features/clintons-leading-in-exactly-the-states-she-needs-to-win/

...was spot-on, except for this puzzling line near the end: "She has one really good Electoral College path, but it’s only one path, instead of the robust electoral map that President Obama had in 2008 and 2012."

WTH? I mean, he JUST went into tremendous detail about how Clinton's closest state, New Hampshire, is safer than Trump's five closest states. And how states don't appear to be as well-correlated this election as they have been in the past.

So saying Clinton doesn't have a robust path is just silly...

Isn't that precisely the point?  If state-to-state correlation is low, that should increase the chance of surprises (in either direction).  All else equal, that helps the candidate who's down (i.e. Trump).  You can talk about the firewall to death, but that's contingent on her winning every single state behind the firewall.

As the correlation between state results goes down, the odds that some state behind that firewall does fall goes up.  In an election of surprises (both on the national and local level), we should not be surprised when there's one surprise on election day (even if we can and should be surprised by a particular result).
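
A quick way to see this effect is a toy simulation. The sketch below assumes a one-factor setup (one shared national shock plus an independent shock per state), which is my own simplification and not how 538 builds its correlations; the win probabilities are hypothetical as well.

```python
import random
from statistics import NormalDist


def prob_hold_all(win_probs, rho, trials=20000, seed=0):
    """Estimate the chance of winning every state in win_probs.

    Each state's latent error is a mix of a shared national shock and a
    state-specific shock; rho (0 to 1) is the pairwise correlation between
    those latent errors. Illustrative assumption, not 538's model.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    # A state is won when its latent error falls below this threshold,
    # which reproduces the state's stated win probability.
    thresholds = [std_normal.inv_cdf(p) for p in win_probs]
    shared_w = rho ** 0.5
    own_w = (1.0 - rho) ** 0.5
    held = 0
    for _ in range(trials):
        national = rng.gauss(0.0, 1.0)
        if all(shared_w * national + own_w * rng.gauss(0.0, 1.0) < t
               for t in thresholds):
            held += 1
    return held / trials


# Hypothetical win probabilities for five firewall states.
firewall = [0.65, 0.70, 0.75, 0.80, 0.85]
for rho in (0.0, 0.5, 0.9):
    print(rho, round(prob_hold_all(firewall, rho), 3))
```

Each state keeps the same individual win probability in every run; only the correlation changes, and the chance of holding all of them drops as rho drops.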

All of the states behind the firewall seem like a relatively sure bet, but there are a lot of them.  Taking everything in 538 more secure than Rhode Island, you have 9 states: NH, ME, PA, VA, MI, WI, MN, CO, NM. 

If the states were perfectly correlated, the odds of Clinton winning all 9 would be the odds of her winning her weakest (NH, at 64.7%).  If they were perfectly uncorrelated, it would be the product of all of them (around 4.9%).  Obviously, Nate thinks it's somewhere in between those two numbers.
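
For anyone who wants to plug in their own numbers, the two bounds are just a product and a minimum. In the sketch below only the 64.7% for NH comes from this post; the other eight probabilities are placeholders, so the product will not reproduce the 4.9% figure.

```python
from math import prod


def firewall_bounds(win_probs):
    """Return (independent, perfectly_correlated) chances of winning every state."""
    return prod(win_probs), min(win_probs)


# Only the 0.647 (NH) is taken from the post; the rest are placeholder values.
probs = [0.647, 0.70, 0.72, 0.75, 0.78, 0.80, 0.82, 0.85, 0.88]
low, high = firewall_bounds(probs)
print(f"independent: {low:.1%}, perfectly correlated: {high:.1%}")
```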

Of course, there's also the potential of surprises in the other direction, as well (Clinton pulling off Georgia, or something).  But the possibility of surprises in Trump's favor is what should keep us up at night.

Personally, I think Clinton will win this one and it won't be close.  But that's one part wishful thinking, and one part a belief that all the LV screens are wrong when it comes to Latino voters.  Of course, a Trump supporter could think that the LV screens are wrong when it comes to uneducated white voters.  We could go through the whole unskewing thing, but that's a sign of desperation.

TL;DR  We should expect polling surprises on November 8, both in terms of the overall PV margin and in terms of state-by-state results.  This is a different election than the last 3-4 we've had.  I tend to think the surprises will be in Clinton's favor, but I can't say that for sure.  And that's why I'm worried.

Erc
« Reply #6 on: September 23, 2016, 03:00:34 PM »

Discussions like these are why I wish Nate would actually release the 538 model (or just the results of the simulations) for us to play around with.  We could then actually get a sense of whether the firewall really exists within his model.

Erc
« Reply #7 on: October 10, 2016, 05:10:10 PM »

He never thought Rhode Island was a swing state. That one poll showing Clinton up by only 3 (with no other polls showing otherwise) did cause Trump to have over a 10 percent chance of winning Rhode Island for a while, which while seemingly ridiculous given past results there, is probably about right given a lack of any other evidence.

That same pollster, Emerson, now has Clinton up by 20 in their latest poll, thereby "fixing" the Rhode Island odds.

It was 25% on the now-cast. lmao. That's completely embarrassing.

Well I'm glad you know better than everyone else that Trump wouldn't have had a 25% chance to win Rhode Island at that exact moment (which is the variable the Now-Cast "predicts").

Any person with a brain cell knew/knows that, but thanks for being glad for me anyway. Wink

Hillary Clinton had a 70% chance of winning Alabama on August 21st, 2016 at 8:03 AM. Prove me wrong.

Yes, I understand that we're talking in circles. My point is that their model is based on data points only. If you want a model that has subjective "checks," don't use 538.

There still is some element of subjectivity in the baseline (which has some value in their model when there are few/no polls) and in the polls-plus model; of course they set this subjectivity earlier in the year based on demographics / previous election results.

My model-based issue with this (and it's a small issue) is that this subjectivity should have a larger weight (i.e. there should be a narrower prior), so that a crappy poll or two shouldn't affect the probability so much (as in Rhode Island).

The issue with that is you'll be slower to pick up on actual shifts (e.g. Utah, though that's been polled enough now, or potentially Alaska).  But I think that's a fair tradeoff.
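
To illustrate what a narrower prior does, here is a minimal sketch using a textbook normal-normal update (precision weighting). I'm assuming the baseline and the poll can both be treated as normal estimates of the margin; that is my simplification, not a description of 538's internals, and all the numbers are hypothetical.

```python
def posterior_margin(prior_mean, prior_sd, poll_margin, poll_sd):
    """Combine a prior and one poll by precision weighting (normal-normal update).

    Returns the posterior mean and standard deviation of the margin,
    in percentage points.
    """
    prior_prec = 1.0 / prior_sd ** 2
    poll_prec = 1.0 / poll_sd ** 2
    post_prec = prior_prec + poll_prec
    post_mean = (prior_mean * prior_prec + poll_margin * poll_prec) / post_prec
    return post_mean, post_prec ** -0.5


# Hypothetical inputs: a baseline of Clinton +20 in a safe state and a single
# poll showing Clinton +3 with a 5-point standard error.
for prior_sd in (10.0, 5.0):
    mean, sd = posterior_margin(20.0, prior_sd, 3.0, 5.0)
    print(f"prior sd {prior_sd}: posterior Clinton +{mean:.1f} (sd {sd:.1f})")
```

With the wide prior the single poll drags the estimate to about Clinton +6.4; with the narrower prior it only falls to about Clinton +11.5. The same stickiness is what would slow the model down when a state really is moving.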

Erc
« Reply #8 on: November 02, 2016, 01:56:17 PM »

What is the uncertainty of the model, given the # of simulations they run?  That is, each time they enter new polls, they apparently run 10,000 simulations based on the latest #s, and that produces (among other things) an overall projected vote margin and win probability.  But let's say they then ran *another* 10,000 simulations with the same input #s but a different random seed?  How different would the results be?  Because if they'd be different, then in theory you could put in a favorable poll for one candidate and it would end up "helping" the other candidate, just because of simulation noise.  But I'm assuming that 10,000 is enough for the simulation noise to be small?


This is purely a statistical noise effect; given Clinton's win percentage of 70% or so, we shouldn't be surprised by jumps of 0.6% or so in Clinton's win percentage just from rerunning the simulations (1 sigma).  If you start looking at individual battlegrounds, we definitely shouldn't be surprised if one of them jumps a percent or two just from statistical noise.
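
The arithmetic behind that number, as a small sketch. It assumes each of the 10,000 simulations is an independent draw, so the win percentage behaves like a sample proportion; the function name is mine.

```python
from math import sqrt


def simulation_noise(p, n_sims):
    """Standard error of a win probability estimated from n_sims simulations,
    plus the typical (1 sigma) gap between two independent reruns."""
    se = sqrt(p * (1.0 - p) / n_sims)
    return se, sqrt(2.0) * se


se, rerun_gap = simulation_noise(0.70, 10_000)
print(f"one run: {se:.2%}, run-to-run: {rerun_gap:.2%}")
# prints: one run: 0.46%, run-to-run: 0.65%
```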

Erc
« Reply #9 on: November 02, 2016, 02:16:19 PM »

What is the uncertainty of the model, given the # of simulations they run?  That is, each time they enter new polls, they apparently run 10,000 simulations based on the latest #s, and that produces (among other things) an overall projected vote margin and win probability.  But let's say they then ran *another* 10,000 simulations with the same input #s but a different random seed?  How different would the results be?  Because if they'd be different, then in theory you could put in a favorable poll for one candidate and it would end up "helping" the other candidate, just because of simulation noise.  But I'm assuming that 10,000 is enough for the simulation noise to be small?


This is purely a statistical noise effect; given Clinton's win percentage of 70% or so, we shouldn't be surprised by jumps of 0.6% or so in Clinton's win percentage just from rerunning the simulations (1 sigma).  If you start looking at individual battlegrounds, we definitely shouldn't be surprised if one of them jumps a percent or two just from statistical noise.

I wouldn't have expected changes that significant with 10,000 trials, although I will say my technical experience in statistics is fairly limited.

Statistical noise scales as 1 / sqrt(number of trials), so it takes a lot of trials to get your error below the percent level.  To decrease your statistical error by a factor of 10, you need to run your simulation 100 times longer.
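
Turning that scaling around gives the number of simulations needed for a target error. A minimal sketch, again treating the simulations as independent draws (the function name is mine):

```python
from math import ceil


def sims_needed(p, target_se):
    """Number of independent simulations for a given standard error on p."""
    return ceil(p * (1.0 - p) / target_se ** 2)


# Target standard errors of half a point and a twentieth of a point.
for target in (0.005, 0.0005):
    print(target, sims_needed(0.70, target))
```

Cutting the target error by a factor of 10 multiplies the required number of simulations by about 100.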

This does mean that 538 really shouldn't be listing tenths of a percentage point on their probabilities, unless they feel like running a million simulations each time.

Erc
« Reply #10 on: November 04, 2016, 03:23:18 PM »
« Edited: November 04, 2016, 03:47:14 PM by Erc »

What's wrong with what I said about 0.5% being an average?  That's around the average shift you'd expect when you run a second set of 10,000 simulations with the exact same polls.  Do you disagree?


IDK, it sounded to me as if you were talking about the mean when you said "the average shift". The mean of the shift should be zero.
Or were you talking about abs(mean) or variance/std?

I assume he means root mean square, which in this case should work out to sqrt(2) * the usual standard deviation for a sample proportion.  That usual standard deviation is sqrt(p * (1-p) / N), which for p = .643 in the polls-only and N = 10000 is 0.5 percentage points.  Multiplying that by the sqrt(2) factor gives 0.7 percentage points.

A bit more math if you like:

The difference between two independent runs of a normally-distributed random variable with some mean \mu and some standard deviation \sigma is itself a normally-distributed random variable with mean 0 and standard deviation \sqrt(2) * \sigma.
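
Spelled out, for two independent runs with the same mean and standard deviation (the normal approximation to the simulated win percentage is the assumption here):

```latex
\begin{aligned}
X_1, X_2 &\sim \mathcal{N}(\mu, \sigma^2) \quad \text{(independent)} \\
\mathbb{E}[X_1 - X_2] &= \mu - \mu = 0 \\
\operatorname{Var}(X_1 - X_2) &= \operatorname{Var}(X_1) + \operatorname{Var}(X_2) = 2\sigma^2 \\
\operatorname{sd}(X_1 - X_2) &= \sqrt{2}\,\sigma
\end{aligned}
```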

What this means in practice is that, at the moment, 68% of the time the (statistical error only) shift between simulation runs should be less than 0.7 percentage points, and 95% of the time it should be less than 1.4 percentage points.
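
If you'd rather check those percentages empirically than trust the algebra, here is a rough sketch that reruns a fake "model" many times. It assumes each simulation is an independent coin flip with win probability p, which is a stand-in for the real model, not a copy of it.

```python
import random
from math import sqrt


def rerun_shifts(p, n_sims, n_pairs, seed=0):
    """Simulate pairs of independent n_sims-trial runs and return the
    absolute differences between their estimated win probabilities."""
    rng = random.Random(seed)

    def one_run():
        wins = sum(rng.random() < p for _ in range(n_sims))
        return wins / n_sims

    return [abs(one_run() - one_run()) for _ in range(n_pairs)]


shifts = rerun_shifts(p=0.643, n_sims=10_000, n_pairs=200)
# Standard deviation of the difference between two runs (about 0.7 points).
sigma = sqrt(2 * 0.643 * (1 - 0.643) / 10_000)
within_1 = sum(s < sigma for s in shifts) / len(shifts)
within_2 = sum(s < 2 * sigma for s in shifts) / len(shifts)
print(f"within 1 sigma: {within_1:.0%}, within 2 sigma: {within_2:.0%}")
```

With only 200 pairs the fractions will wobble by a few points, but they should land near 68% and 95%.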

If you are seeing shifts of 1.4 percentage points more than 5% of the time that means:

1) The poll updates are actually having an effect (which is why they are put into the model in the first place, of course).

and/or

2) You're cherry-picking (consciously or not) the large shifts.

Erc
« Reply #11 on: November 07, 2016, 09:48:48 AM »

Sure, he bombed with Trump in the primaries - mainly because he suddenly played pundit instead of sticking to crunching numbers, which is what he is renowned for - but since then he has returned to number crunching and stayed away from over-confident predictions.

The guy is not pro-Trump in any way, like some are acting. Anybody who has listened to the 538 podcast should know this. He has just been careful in not ruling Trump out, which is very understandable after the primaries, and if you ask me, aligned with the actual data, which has shown Clinton ahead but not consistently enough to rule out a Trump win by any means - particularly with the uncertainties surrounding this strange election with a highly "unconventional" candidate and weird polls.

He gets flak for including polls that seem trashy - like the 50 state polls or the LA Times poll - but basically he is just staying true to the methodology that predicted the 2012 and 2008 elections pretty accurately. People act as if they want Silver to subjectively throw out every poll that looks like an outlier, but that is exactly when you end up with a biased result.

People complain about the "house effect" calls and that's the one part I am sceptical about as well, but at least they are applying consistent methods in working this out.

Overall, it is just a model and a model that has proven to be pretty damn good in 2008 and 2012. Maybe it isn't so good this time around, maybe it is. That's the way it works. These models are only as good as the numbers they put into them. If the polling is off the models will be off. 538 is doing their stuff in a fairly transparent way.

Most of the most important things in the model (how he derives the correlation matrix between states, for example) are not transparent at all.  This makes business sense, of course, but makes it hard to assess it properly.

Otherwise, you're mostly right.  This is a model that worked well for the last couple elections (when we had very consistent and accurate state polling) but is failing now because it can't handle terrible polling and Donald Trump.  It's still doing as well as it is because of its proprietary secret sauce, and it is the most honest about the uncertainty of its forecast.

Its baseline is still way off though, but that's because the polls are.