Nate Silver finds and quantifies evidence of (D)-favoring voter fraud
November 10, 2012, 10:22 pm

Nate Silver has an interesting post summarizing how the various polling firms did in their state polls. Generally it looks like he finds most of them had consistently overestimated the (R) vote. He throws out the usual cuffed explanation for this (not enough cell phones, younger people have cell phones, younger people are more (D), bla bla).

But he appears to miss the elephant in the room, which is cheating. But surely no serious, scientific, quantitative genius person such as Nate Silver can possibly forget or just un-scientifically ignore the fact that the final vote contains some nonzero amount of cheating.

And cheating (or just, mistakes in tabulating) can’t possibly, even in principle, be picked up by pre-election polling. So it is always a possible, nonzero factor in explaining and understanding any mismatch between polls and “the actual vote” (i.e. the actual vote + cheating/mistakes).


Now – to echo a bunch of arguments I made against Silverbating righties – the fact that almost all these polls from all these different polling companies with all different sorts of methodologies find a consistent, systematic “(R) bias” just beggars belief. No quantitative-minded person can just accept that as the result of random chance. Sure, there will be errors and biases but wouldn’t the errors and biases cancel each other out? How likely is it that virtually all polls would come out with an (R) bias? That is an extraordinary claim which requires extraordinary evidence, which Nate Silver does not have.

The more parsimonious and scientific inference is that the “(R) bias” Nate Silver has found is, of course, nothing other than an estimate of the (D) cheating advantage. What else could it be, after all? Yes, it could theoretically be something else – but that would require an explanation, and evidence. Surely the null hypothesis is that the “bias” showing up from these polls is just the result of voter fraud.

I have taken the liberty of using Nate Silver’s data to estimate this fraud. Silver reports a table of the empirical bias from 23 polling companies who did at least 5 LV polls in the last 21 days. Only 4 of these 23 showed a (D) bias – again, highly unlikely to be the result of chance.
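As a rough check on the "unlikely to be the result of chance" claim, here is a minimal sketch of a one-sided sign test. It assumes (my assumption, not anything in Silver's table) that the 23 firms' biases are independent and that, absent any systematic effect, each firm is equally likely to lean (D) or (R). The function name is purely illustrative.

```python
from math import comb

def binomial_lower_tail(k: int, n: int, p: float = 0.5) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Chance that 4 or fewer of 23 firms show a (D) bias if each firm
# leans (D) or (R) with probability 1/2, independently of the others.
p_value = binomial_lower_tail(4, 23)
print(f"{p_value:.4f}")  # 0.0013
```

Under those assumptions the chance is roughly 1 in 770. The independence assumption is generous, since the firms share methods and polled the same states, so the true figure is probably somewhat higher.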

Tossing out the high (R) bias (Gallup with R+7.2) and the low (Pharos with D+2.5), the average bias of the remaining 21 polling firms was about R+1.2.

In other words, we now have scientific, quant-friendly evidence here that the (D)s get something like a ~1.2% advantage from fucking cheating.

Again, if you have a better explanation why all these polls would come out with a R+1.2 bias on average, you are welcome to advance your argument, along with your evidence. But fair warning, if you mumble some facile BS about ‘cell phones’ or ‘hurricane Sandy’ I’m going to fucking make just as much fun of you as I made of righties who BS’ed stuff about lefties more likely to lie about being likely voters or pollsters ‘using a 2008 turnout model even though (R) enthusiasm is really high’.

I think I’ve earned the right to say this because I have been and continue to be consistently on the side of the quants: if you don’t see Nate Silver’s table as, absent other quantified and supported explanations, prima facie evidence of the size of the (D) cheating advantage, then guess what? You’re not on the side of the quants, and you must hate math.

UPDATE: South Bend Seven respectfully disagrees.


I would point out that this stands on much more solid ground: there are very few assumptions being made here. You don’t need a shaky model to compute the error for the poll, given the outcome.

What would be very useful here would be county-level polling, compared with county-level results, especially in black dominated areas of Ohio. If someone has that data, it would at least point out the most likely fraudulent precincts.

Of course we don’t need any data to come up with a good guess for the locations of said precincts…

Comment by Dave

At least, it’s perfectly analogous to Silver-defenders’ critique of the Silverbaters. They (and I) would say:

“In order to believe Obama’s not the odds-on favorite, you have to believe virtually all these polls are systematically-biased to the (D)s. How likely is that? And if that’s what you think, why would it happen, and where’s your evidence? Until you answer that, you’re just denying data and math.”

I am saying:

In order to believe the (D)s didn’t vote-cheat to the tune of something like ~1.2%, you have to believe virtually all these polls are systematically-biased to the (R)s. How likely is that? And if that’s what you think, why would it happen, and where’s your evidence? Until you answer that, you’re just denying data and math.

I guess I can see rejecting both of these paragraphs, perhaps – but not just *one* of them.

Comment by Sonic Charmer

Well, if you read the Silver post to which you linked, Nate spends several paragraphs trying to come up with a different explanation for the systematic bias. None of his possible explanations are more compelling than “there’s at least 1% Dem. cheating”.

Comment by Dave

Exactly. His explanations are all post hoc just-so stories, every bit as much as ‘Bradley effect’ and ‘lying on the likely-voter screening questions’ the righties were clinging to.

My explanation is more parsimonious and just simple math:

Poll Bias = Poll Finding – Actual Vote% Reported
Poll Finding = Actual Vote% + Poll Error
Actual Vote% Reported = Actual Vote% + Cheated Vote% (which can include cheating & tabulation error)


Poll Bias = Poll Error – Cheated Vote%

But one assumes the mean Poll Error is 0. Thus Poll Bias = – Cheated Vote%, and so if you find a statistically-significant poll bias over a large number of polls, *you should assume it is due to cheating in the election*. This should be the default position, absent strong evidence to the contrary.
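The algebra above can be sketched in a few lines of Python. The function name and the sign convention (positive numbers lean (R), negative lean (D)) are assumptions made for this illustration only. The second call shows how the answer moves if the zero-mean assumption is dropped.

```python
def implied_cheated_vote(poll_bias: float, mean_poll_error: float = 0.0) -> float:
    """Rearrange Poll Bias = Poll Error - Cheated Vote%.

    Sign convention (an assumption for this sketch): positive leans (R),
    negative leans (D). All values are in percentage points.
    """
    return mean_poll_error - poll_bias

# Average bias of R+1.2 with mean poll error assumed to be 0.
print(implied_cheated_vote(1.2))       # -1.2, i.e. 1.2 points toward (D)

# Same observed bias, but assuming the polls themselves lean (R) by 1.2.
print(implied_cheated_vote(1.2, 1.2))  # 0.0
```

The observed bias only pins down the difference between the two terms, so the size of the cheating estimate depends entirely on what is assumed about the mean poll error.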

Because again, to think otherwise is to think that there is a systematic Poll Error. And we know that all the Smart People of the world think that is a silly, unquantitative, and innumerate thing to think. Or at least, it requires strong evidence. And Nate Silver doesn’t have it or if he does he doesn’t think it is worth taking seriously. We know that from literally all of his work leading up to the election.

Comment by Sonic Charmer

Well, before the elections you stated that D’s are perhaps/probably more likely to respond to calls. Perhaps the pollsters factor that in and try to calibrate. This would be systematic. There’s also the ‘shy tory’ effect (whether real or not, it’s perceived to be real) / ‘I vote for the black guy’ effect – and this is a well known effect the pollsters certainly (well, probably certainly) took into account and tried to calibrate for.
Before the elections, many said they systematically didn’t calibrate properly for it. Perhaps they over-did it. Whether they under-calibrate or over-calibrate, it seems reasonable that it would happen not randomly but across the board.

To my mind, that renders your hypothesis null. At best one could argue that part, but by no means all, of the discrepancy is due to cheating but there’s no way to tell what portion that is without having the pollsters give raw data. If they would, it still wouldn’t be a good fit, since then we might see raw data pointing to an even bigger D vote, due to the reasons stated above.

Comment by Anon.

When talking about lefties-answer-phone-more (and shy Tories, etc), I was trying to think of the best/most convincing poll-truther porn I could muster. I never used it to second-guess polls though, i.e. my official position was still that these poll-truther stories were pretty weak and that polls presumptively weren’t systematically biased and that Nate Silver’s probability calc was just fine (remember?). And this was Silver’s position too, of course.

I have no idea whether polling companies try to do a shy Tory ‘adjustment’. I have never heard that (doesn’t mean they don’t, as I only started paying attention to polls maybe a month ago). In any event I made no such assumption when running my model prior to the election and I’m making no such assumption now. But if you have genuine info that this is widespread polling practice that would be interesting. I actually doubt it (how would such an ‘adjustment’ even work? And if they were doing a uniform ad hoc ‘adjustment’ of conservative white respondents’ answer%, surely that fact would have made its way into R poll-truther canon?), but who knows…you tell me…

Comment by Sonic Charmer

Man, I was clinging, but now I’m not bitter! Thanks, Sonic Charmer!

Comment by A Lady

Sonic Charmer, you know that I’m a Denier (Global Warming and other stuff), but I think that the biggest argument against this is that it’s simply impossible to keep cheating of this magnitude secret (I used to work in the Intel community, and if there are more than 3 people who know a secret, then it’s not a secret anymore).

Sure, the press would want to cover up for the Democrats, and sure the GOP isn’t called the “Stupid Party” for nothing, but this would be very well known to anyone active in campaigns. *Everyone* would know, and a bunch would be highly motivated to get the story out.

I’m at a loss as to how this would stay hushed up by everyone.

Comment by Borepatch

I guess that’s why we totally know the full story on Benghazi by now – it’s just impossible to keep secrets bc people always talk. For that matter, this also explains why we know so much about Obama’s bio as a young man. ;)

But seriously, to meet you halfway (since you have a point), let’s keep in mind such ‘cheating’ doesn’t have to be the deliberate result of a tiny cabal. For example how do you interpret the news of these Florida precincts with more than 120% of registered voters voting? Maybe I’m missing something but to me that is de facto evidence of vote fraud of some sort.

But the fraud could, I guess, just be something like a bunch of NY/NJ/CT residents with winter homes in FL who are double-voting. That wouldn’t require any cabal, just widespread amorality (not hard to believe) and laziness/negligence – whether inadvertent or kinda deliberate – on the part of the FL officials in policing their voter rolls. So, there would be nothing to ‘keep secret’ per se. Yet it would still be fraud.

Also, you say ‘of this magnitude’, but let’s not overestimate the magnitude required. I’ve read that Obama’s margin boils down to about 400k votes in four key states. That isn’t all that huge; the St. Lucie county incident alone appears to have created about 75k ‘extra’ voters. Of course I don’t know that all or even a majority went for Obama, but it does show that squeezing an extra 400k votes out of 4 large states wouldn’t have been all that difficult.

Comment by Sonic Charmer

Just saw that Ace of Spades thinks the overturnout story is phony…FYI

Comment by Sonic Charmer

PS None of which is to say that I think Obama won due to fraud. I could see a Perfect Accounting showing that fraud tipped Florida to him. Obviously, that wouldn’t have been enough though. My overall sense, without looking into the fraud too deeply, is that he would have won, fraud or no fraud. The polls suggested it anyway.

But anyone saying that must admit that the very same logic suggests that the final vote count includes *some* fraud – about 1.2% give or take. According to my handy eleven-state table that probably still wouldn’t have been enough for Romney to win though.

Comment by Sonic Charmer

Gonna have to agree with Borepatch on this: the idea that vote cheating constitutes over a percent of the D votes is just ridiculous. Besides, if it really is as consistent as you say, then that defeats your own argument: why the hell would they bother cheating in Texas/California/Alabama/etc. where there’s no point? If we take your cheating hypothesis, we should expect bigger R bias in the polls in closer-swinging states.

Sorry, man, but reality is cheating your theory with all its stubborn annoying “facts”.

Comment by anon

1. Why is 1% a ridiculous number? What is the threshold from easy to not-easy? How steep is that difficulty curve really and how do you know? Note, here are some possible methods of cheating: people double-voting (i.e. in more than one state or locale), filling up dead or nonvoting registered voters with votes you control, ‘losing’ boxes of votes in precincts known to vote a certain way, controlling low-info people who have voting rights (i.e. recent immigrants) and telling them how to vote. And of course, simply programming a trojan horse into the vote machines to flip the count to whatever you want. Maybe there are others I just don’t have the imagination to think of.

Why couldn’t they add up to 1%? Some of them (the box-‘losing’) can take care of thousands at a time. The electronic method can flip a vote to any number, there is no limit in principle, no threshold which is “hard” – changing a vote count in a machine by 10% is no harder than changing it by 1 vote right?

2. You are right that rational cheaters wouldn’t focus their cheating on all states uniformly, they would focus on a few key states. Right now I only have the aggregate numbers (since Silver reported the *average* poll bias for all these pollsters) but the data could certainly be drilled into further to see if, say, Gallup’s “R+7.2 bias” mostly comes from Florida or what. In any event, the fact that cheating would likely be concentrated in a few key states rather than spread out amongst all states actually should make it *easier* not harder to believe that cheating swings elections. As noted above, I don’t really think this was one of them, but that doesn’t mean that cheating didn’t happen. Cheating presumptively happens, the only question is how big. We now have an estimate: D+1.2%.

3. So, you’re also right that according to my thesis, if the cheating were deliberate and centrally-controlled, we’d expect it – i.e. the empirical “(R) bias” in polls – to be bigger in closer or more key states like OH, VA, FL. That is something that can and should be checked. I just don’t have the data yet because Silver didn’t break it out that way. So, that means the jury’s still out – why do you think it means the case is closed?


Comment by Sonic Charmer

It is extremely disingenuous (and you know it [1]) to lump all those things under the term “voter fraud”. A better term for almost all of them would be “election fraud”. The person losing the box is not cheating with their vote, they’re cheating with their box-carrying duty.

I assert that if you claim >1% of the dem vote is due to “voter fraud”, virtually all readers will parse this as a claim that >1% of all dem votes occur due to double-registration/graveyard-registration. Which is patently absurd. With the number of participants involved, there would be whole communities and forums and subreddits of these people. (Basically, see Borepatch’s comment)


Comment by anon

I’ll plead guilty to being sloppy/lazy, and that I could speak more carefully. The fact is I am pretty agnostic as to whether we call this vote fraud or election fraud or election error or tabulation error or cheating or whatever the hell. As far as I’m concerned, *any* mismatch between the Final Certified Count By The S.o.S., and the *actual cast votes* of *only legitimate voters* is equally problematic and troublesome. So if only for ease in presentation and quicker typing I decided to sweep it all into the umbrella term ‘Cheating’, if that’s ok with you.

Also, I didn’t specify any particular type of ‘Cheating’, I just gave some examples. Confusion in ‘parsing’ on that score is the fault of the reader.

I don’t think there’s any fallacy of composition going on. I haven’t accused any particular person of anything here, nor am I reasoning from one particular kind of ‘Cheating’ to any larger conclusion. I am interested solely in how much ‘Cheating’ there is, for the purpose of enlightenment and discussion and perhaps further research (by other people ;-) ). The answer – unless you’ve got a better one – appears to be something around ~1.2%.

Or do you just hate math? ;-)

Comment by Sonic Charmer

Let’s cast it in your Austerity Logic language (which I love, btw).

1. Voter fraud is defined to be multi-registration, graveyard-registration, intentionally losing ballots, hacking voting machines, and deceiving voters.
2. Evidence suggests that ~1.2% of Dem votes are due to voter fraud, this is a dire situation.
3. A national ID card would greatly hinder multi-registration and graveyard-registration.
4. By definition, that means a national ID card would greatly hinder voter fraud.
5. Therefore, we urgently need national ID cards.

You see the problem now?

Comment by anon


Did I say anything about national ID cards?

Comment by Sonic Charmer

No, but statements have consequences. Besides getting webtraffic and/or getting the Democratic party outlawed, what exactly are you hoping to accomplish with this post? Or maybe a better question is, which interests could most easily hijack your post/argument? Answer: national ID card advocates.

Comment by anon

I dunno, if you’re gonna accuse me of bad-logic fallacies I guess I think it should be based on what I actually said instead of on what you imagine some reader might think after reading what I say. Call me old-fashioned.

Can I just take this pointless side-trip goose chase as evidence that you won’t/can’t rebut the actual logic of what I said?

Comment by Sonic Charmer

It’s a common (and legitimate) debate technique to formally accept the opponent’s argument, and then show them where it ultimately leads (even if this involves bringing in some element the opponent himself didn’t mention).

Sure, your logic is ok* given your unusual definition of “voter fraud”. Likewise, I can easily prove Fermat’s Last Theorem if I’m allowed to redefine the meaning of the symbol “+”.

*(Well, modulo other possibilities such as the cell phone thing, but we won’t go there)

Comment by anon

‘Where it leads’ (in your thought experiment, if you do it right) is that everyone in the country magically adopts my concern over vote fraud and accepts obvious measures to ensure the votes are cast legitimately and counted correctly.

Of course, this will not happen, and the fraudulent vote% breaking disproportionately toward one party goes a long way to explaining why.

Comment by Sonic Charmer

It is, after all, at the state level that national elections are regulated. Why not simply allow states’ secretaries of state to require that a voter present one of any number of acceptable forms of identification prior to voting? Why jump immediately to a “national ID card” which has negative (fascist) connotations to some Americans?
Of course, some states already have such laws, and several states have tried to enact such laws but have been rebuffed by DOJ on the ground that it would place an inordinate burden upon those who may find it difficult to obtain such identification (objectors suggest that the poor and minorities would be most susceptible.) I have no idea how effective such laws are.
PS: I also have no idea what SC’s “Austerity Logic” language is, so please excuse if I’m being too literal or obtuse here.

Comment by colocomment

That is staggeringly wrong.

SC is making a factual assertion (backed by evidence) and you’re claiming that he’s wrong because there are unsavory consequences to acting on those facts (in a specific way).


We’re on a life raft and running low on water. One more day of hot sunshine and we will all be too dehydrated to survive. However, if one person had all the water they could survive to reach land. Everyone in the boat knows this and so a life and death struggle might result if we know that the sun will come up tomorrow. I argue that the sun will rise tomorrow. You disagree based on the “common, legitimate debate technique” of pointing out the negative consequences of my argument. Conclusion? The sun will not come up tomorrow!

Unless the point you’re actually making is more like “‘shut up’, he explained” and you accept that SC’s argument is correct but you just don’t want people to think about it or do anything about it.

Comment by Steve Johnson

Cheating on elections does seem relatively congruent with the lefty “By Any Means Necessary”, OWS, SEIU bully boy way of doing things.

That said, mumble mumble Hurricane Sandy mumble mumble.

Looking at the charts at 538, what I see is that the candidates’ positions started at the end of May relatively close. As the summer went on, however, the Democratic advantage in marketing/culture war and their advantage in the way the MSM spins everything their way steadily took its toll, and Obama steadily pulled further away, to the point of having an 86.1% chance to win on Oct 3.

Romney closed much of the gap in the time after the first debate, as he had the opportunity to directly reveal to people that he and Obama were not who the Obama campaign (and the MSM, but I repeat myself) said they were. But from Romney’s high point of 37.1% on Oct 13, the control-the-discourse BS machine revved up again, and Obama started trending steeply up again, and Romney down. And yes, spin around Sandy was a big part of it – the need for FEMA and global warming becoming the story, rather than the government’s lack of intelligent preparations (disaster plans or harbor barriers) and minority looting.

So, all told, it would not surprise me if polls taken five days before the election underestimated Obama’s winning % by about one percent.

Comment by Ian

You may be confusing % chance to win with % margin of victory here. % chance to win for the favorite will (the way I understand/conjecturize Silver’s model) tend to creep upward even if polls stay the same and nothing else changes, ie just due to the passage of time. Also 1% chance to win is not equiv to 1% predicted vote differential.

That said, it’s true that % chance to win had shot way up prior to the election, but this was largely explainable by poll movements. In other words, the recent pre-election polling had all pretty much ‘priced in’ the Obama surge you are pointing to in the % to win movement. Yet it is these polls that ended up R+1 compared to the final tally.

To say that this is fully explainable by some sort of final surge in the last 3-5 days not picked up by polling is basically to make the hurricane Sandy mumble mumble argument :)

To be clear, it could be true. So could ‘lefties lie more about being likely voters’. But as of now neither has much in the way of hard evidence behind them, which is why (as good scientists) our default position is that the polls are unbiased and we do NOT assume the systematic bias is there and then invent some post hoc reason for it. In either case.


Comment by Sonic Charmer

As always, good points.

Comment by Ian

The great thing about the US’s distributed election process is that fraud is fractal. No matter what level you look at, you can probably find examples of voter fraud, tabulation fraud, election official misconduct, etc. See for example this picayune story of probable fraud where a guy officially received 9 votes in an election but had 28 people who claimed to have voted for him:

But the fractal nature of fraud also means that there’s no central conspiracy where Comandante Pelosi orders up 100K votes in NE Ohio or whatever. It’s just people at all levels acting in their perceived self-interest. Most of the time, it doesn’t occur on a scale that would change election results and thus it usually receives zero scrutiny.

When you write
Poll Bias = Poll Error – Cheated Vote%
It seems obvious that all the fraud terms must sum up to +D given which party opposes voter ID laws, cleaning the voting rolls of ineligible voters, etc. But the assumption of 0 mean poll error is questionable – it is certainly not hard to imagine that polls contain systematic errors AND that all elections contain fraud.

Comment by SkepticalCynical

You’re right there need not be any central conspiracy. Many forces contribute to the size and direction of cheating. A lot of ‘cheating’ probably just occurs at the individual level, e.g. when a NY resident decides to also vote in the Florida county of their summer home. No such person needs any marching orders.

Of course, that doesn’t preclude people in more or less ‘central’ positions from identifying/recognizing such forces and who they favor, and consciously acting (or not-acting) to boost one or suppress another. For example, to name two that presumably work in opposite directions: by whining about voter-ID laws, or by supporting S.o.S. ‘purges’ of voter rolls that arguably will predictably delete a disproportionate # of one side’s valid voters.

These forces (decentralized and distributed, or centralized and ‘conspiratorial’) all act together on the cheating. The net effect and direction of all of them summed together is an empirical question. But there is no a priori reason to believe that the effects all ‘cancel out’. It is perfectly conceivable that for one reason or another the dynamics create a persistent advantage for one side.

And that is what Silver’s table appears to indicate, anyway, since the final official vote-count (which by definition includes cheating) appears to be more tilted toward Ds than (presumably unbiased) polls would have suggested.

Finally you’re right that the assumption of unbiased polling is questionable, but it’s one that we’ve all been going with and calculating probabilities off of since well before the election. At the very least, it would be very strange for people who have spent the past six+ months saying ‘for this % to be wrong, all the state polls would have to be systematically biased, and that’s unlikely’ to suddenly flip to ‘hey, I guess all the state polls are systematically biased after all, let’s figure out how much’ merely due to the final vote tally not matching the polls.

Because (again), the final vote tally is NOT the true measure of valid voters’ votes; it includes nonzero cheating. Surely Smart People who care about math & science would not have forgotten that undeniable fact! :-)

But a more careful way to state things is that some combination of poll systematic bias, and cheating, adds to ~1.2% for the Ds. People who have spent the past months telling us we’re not allowed to assume systematic bias are on quite shaky ground if they insist this is evidence of systematic poll bias and not cheating.

Comment by Sonic Charmer

Oh, I agree. It’s a beautiful dilemma this particular set of facts poses for Smart People. Unfortunately, despite your brilliant analysis of right wing poll trutherism, this probably means you’ll have to wait a while for a link from Krugman.

Comment by SkepticalCynical

[…] just as a reminder, that F factor – if I’m right – will, just as Nate Silver (whether consciously or not) helped document in the 2012 Presidential election, serve as an unbiased estimate of the systematic advantage (D)s obtain in our electoral system from […]

Pingback by Election prediction and the F Factor | RWCG
