(Lack of) Statistical Thinking in the NYT

The New York Times yesterday (Nov 2, 2023) had a story and podcast about the "only [US county] that has voted for the winner of the [US] presidential race every year since 1980."

They write:

"Prediction," you say? That will awaken any quant!

Also, it's about Clallam County, Washington — here in my home state. To paraphrase a famous political quote, I can see it from my house (at least its mountains). It's a great place, and one of my favorite bookstores is there in Port Angeles, WA. You should check out the superb photography in the NYT story!

OK, it's a human interest story — complete with the required diner! — and shouldn't be taken seriously. Still, it is in the Times and it makes a somewhat explicit statistical claim. So it's fair game to examine whether voters in Clallam County, Washington (in a great diner) are "resembling a prediction."

In this post, I examine the following question: given 11 elections, how many of the 3000 counties would be on the "winning" side in all of them, simply due to random chance?

Is the NYT reporting something interesting? Or are they making a story from random data? Here are some simple ways I answer that.

Approach 1: Mental Arithmetic

For simplicity, let's assume that every election is exactly 50/50. If we make a single guess, we have p=0.5 odds of getting it correct, or 1 chance out of 2 outcomes. With a second election, there are 4 possible outcomes (2^2), and the random odds of getting both correct are 1 out of 4.

In 11 elections there are 2^11 outcomes. If you know the powers of 2, as many programmers do, that's 2048 outcomes.

Or you might approximate it as follows: 2^8 = 256 and 2^3 = 8, so 2^11 is 256 * 8, or about 250 * 8, which is 2000. Or 250 * 10 less 20%, which is 2500 - 500 = 2000.

From that, we might expect about 1/2000 counties to get 11 elections correct, randomly. An observation of 1/3000 is roughly what we'd expect randomly.

Approach 2: Binomial Confidence Intervals

We can formalize that with a binomial test. Suppose we have 11 independent elections, with 0.5 odds in each one, 3000 counties, and observe 1 success case.

Here's the R code:

# number of US presidential elections 1980 - 2020 
n.elec <- (2020-1980) / 4 + 1
# supposed odds of voting with the "winning" side in one election
p.elec <- 0.50
# number of counties in the US
n.cty  <- 3000   # roughly

# odds of observing 1 success in 3000 counties, assuming 11 elections at 0.5 odds
binom.test(x=1, n=n.cty, p = p.elec ^ n.elec)

The result is:

To expand on that, let's multiply the confidence intervals by the sample size to get the expected number of successes rather than the proportion:

# confidence intervals around that observation 
binom.test(x=1, n=n.cty, p = p.elec ^ n.elec)$conf.int * n.cty  # (0-6 expected "successes")

That is:

Under our assumptions, we expect 0 - 6 counties to get it right. So observing 1 county is not interesting statistically. (It's still a well-photographed story!)

R has other binomial estimation methods that are robust to extremely small proportions, in the binom package. Here's the code and result:

# estimates using multiple methods
library(binom)              # install if needed
binom.confint(x=1, n=n.cty)
# see the confidence intervals (0-7 successes expected)
binom.confint(x=1, n=n.cty)[ , c("lower", "upper")] * n.cty

Using every binomial estimation method, we expect somewhere between 0 and 3-7 counties to get it right (again, under the assumptions we made).

Approach 3: Binomial Distribution

Another binomial approach is to examine the density — i.e., the likelihood — of particular outcomes. In R, dbinom() is the density function for the exact binomial test.

What's the most likely observation for the number of counties? Let's plot the likelihood of observing 0:10 successes (i.e., counties) that correctly call all 11 elections in a row under our assumptions. First the R code:

# plot of the density function shows the most likely number of successes
plot(x=0:10, y=dbinom(0:10, n.cty, p = p.elec ^ n.elec), 
     main="Expected using coin flip odds", xlab="Number of counties, assuming independent random events")

Here's the plot:

If the county results are random and independent, the single most likely result is that exactly 1 county would align with 11 elections. Maybe 0 or 2, or 3, or 4. Almost certainly fewer than 8.

Although I love Clallam County, WA, and enjoy reading about the people there, the result highlighted in the New York Times appears to be random.

But It's Not Random

If you follow elections closely, you'll be aware of the limitations of such simple analyses. Each variable in my simple binomial assumptions is imprecise:

The odds of aligning with the outcome are not 50/50. When there is a landslide election (like Reagan in 1984) more counties will have a higher likelihood of aligning with the vote.
Votes are not equally distributed across counties. Most counties have small populations, and it's possible to win in a relative landslide while winning few counties (In 2012, Obama won 51% of the vote and 62% of the electoral college, but only 22% of counties.)
Side question: are there articles about counties that are wrong in every election? Statistically, that would be just as interesting!
Counties are not predictors of, or otherwise related to, national elections. Instead, for national elections, counties are just a way to aggregate and report votes. The nests are Electoral College <== States <== Voters, and separately Counties <== Voters. (Although counties are nested inside states geographically, they are not meaningful units in an electoral sense.)
Results from one election to the next — the 11 wins in a row — are not independent probabilities. Many counties never vary, always voting the same way in presidential elections.
Counties are correlated and not independent of one another, so there are not 3000 independent actors.

One might model each of those things under various assumptions (hello, FiveThirtyEight). Overall I would expect that such modeling to have modest effects — highly significant in terms of predicting elections but unlikely to have a vastly different answer to the question we examined.

However, the primary point of this post remains: simple stats and mental arithmetic suggest that although the New York Times story might be fun, it is not serious. They should know better.

But did I mention that the photos are excellent? I recommend visiting Clallam County!

All the Code

Here's the complete R code — plus a brief, bonus snippet for the Electoral College vote shares. (For more on binomial models, see Chapter 4 in the R book or Python book.)

# probability estimates
# re New York Times, https://www.nytimes.com/2023/11/02/podcasts/run-up-trump-biden-election.html

# number of US presidential elections 1980 - 2020 
n.elec <- (2020-1980) / 4 + 1
# supposed odds of voting with the "winning" side in one election
p.elec <- 0.50
# number of counties in the US
n.cty  <- 3000   # roughly

# odds of observing 1 success in 3000 counties, assuming 11 elections at 0.5 odds
binom.test(x=1, n=n.cty, p = p.elec ^ n.elec)
# confidence intervals around that observation 
binom.test(x=1, n=n.cty, p = p.elec ^ n.elec)$conf.int * n.cty  # (0-6 expected "successes")

# estimates using multiple methods
library(binom)              # install if needed
binom.confint(x=1, n=n.cty)
# see the confidence intervals (0-7 successes expected)
binom.confint(x=1, n=n.cty)[ , c("lower", "upper")] * n.cty

# plot of the density function shows the most likely number of successes
plot(x=0:10, y=dbinom(0:10, n.cty, p = p.elec ^ n.elec), 
     main="Expected using coin flip odds", xlab="Number of counties, assuming independent random events")

# history of electoral college proportions
margin.elec <- c(0.909, 0.976, 0.792, 0.688, 0.704, 0.504, 0.532, 0.678, 0.617, 0.565, 0.569)
(mean.elec   <- exp(mean(log(margin.elec)))) # geometric mean

# expected number of "correct calls", using electoral college probabilities
plot(x=0:60, y=dbinom(0:60, n.cty, p = mean.elec ^ n.elec), 
     main="Expected using electoral college", xlab="Number of 'wins' out of 3000")

(Lack of) Statistical Thinking in the NYT

Approach 1: Mental Arithmetic

Approach 2: Binomial Confidence Intervals

Approach 3: Binomial Distribution

But It's Not Random

All the Code

Comments

R Language posts

Multidimensional Sentiment Analysis (Part 2)

More from this blog

What Is Your T?

My UX Research "Rolodex"

"Rigor" in Quant UX Research

Qualitative Inquiry Can Make Quantitative Work More Meaningful: But Which Comes First?

Things I'm Hearing about UX Research

Command Palette

Approach 1: Mental Arithmetic

Approach 2: Binomial Confidence Intervals

Approach 3: Binomial Distribution

But It's Not Random

All the Code

Comments

R Language posts

Multidimensional Sentiment Analysis (Part 2)

More from this blog