I came across an article on American life satisfaction yesterday. The trend analysis was fine, but the local news story version of it decided to take things a step further and do some sub-group analysis. While some news stories are pretty obviously wrong - if a reporter were to run a story about how much better the grocery stores in Moscow are, for instance, you wouldn't take long to realize it's nonsense* - there's something about numbers that lulls us into a "looks right, I guess" sense.
*Scott Lincicome worked the numbers, BTW, and concluded the same amount of groceries would cost about $130 over here, vs. the $400 the segment estimated. The more you know.
I had several questions that I was hoping to resolve by getting the complete data (more on that later), but the most I could find was a cross-tab with some (not all) of the demographics in the report. This isn't to pick on this poll - it looks to be well done, and you can't blame the pollsters if somebody else reads too much into it. Still, having recently read Steve Wexler's excellent take on checking uncertainty in data, this seemed like a good place to put it into practice.
```r
library(gt)
library(gtExtras)
library(hrbrthemes)
library(tidyverse)

sat_data <- tibble(demo = character(), class = character(), n = numeric(), sat = numeric()) %>%
  add_row(demo = 'Total', class = 'Total', n = 1011, sat = 791) %>%
  add_row(demo = 'Gender', class = 'Male', n = 502, sat = 392) %>%
  add_row(demo = 'Gender', class = 'Female', n = 500, sat = 394) %>%
  add_row(demo = 'Race', class = 'White', n = 649, sat = 518) %>%
  add_row(demo = 'Race', class = 'Non-White', n = 334, sat = 253) %>%
  add_row(demo = 'Age', class = '18-34', n = 276, sat = 210) %>%
  add_row(demo = 'Age', class = '35-54', n = 310, sat = 256) %>%
  add_row(demo = 'Age', class = '55+', n = 396, sat = 301) %>%
  add_row(demo = 'Education', class = 'College Grad', n = 374, sat = 317) %>%
  add_row(demo = 'Education', class = 'Some College', n = 279, sat = 212) %>%
  add_row(demo = 'Education', class = 'HS Grad or Less', n = 355, sat = 260) %>%
  add_row(demo = 'Party ID', class = 'Republican', n = 255, sat = 197) %>%
  add_row(demo = 'Party ID', class = 'Independent', n = 461, sat = 346) %>%
  add_row(demo = 'Party ID', class = 'Democrat', n = 273, sat = 230) %>%
  add_row(demo = 'Household Income', class = 'Less Than $50,000', n = 313, sat = 219) %>%
  add_row(demo = 'Household Income', class = '$50,000 - 100,000', n = 322, sat = 247) %>%
  add_row(demo = 'Household Income', class = '$100,000+', n = 278, sat = 247) %>%
  mutate(sat_perc = sat / n)

sat_data %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(columns = c('n', 'sat'), decimals = 0) %>%
  fmt_percent(columns = c('sat_perc'), decimals = 0)
```
| demo | class | n | sat | sat_perc |
|---|---|---|---|---|
| Total | Total | 1,011 | 791 | 78% |
| Gender | Male | 502 | 392 | 78% |
| Gender | Female | 500 | 394 | 79% |
| Race | White | 649 | 518 | 80% |
| Race | Non-White | 334 | 253 | 76% |
| Age | 18-34 | 276 | 210 | 76% |
| Age | 35-54 | 310 | 256 | 83% |
| Age | 55+ | 396 | 301 | 76% |
| Education | College Grad | 374 | 317 | 85% |
| Education | Some College | 279 | 212 | 76% |
| Education | HS Grad or Less | 355 | 260 | 73% |
| Party ID | Republican | 255 | 197 | 77% |
| Party ID | Independent | 461 | 346 | 75% |
| Party ID | Democrat | 273 | 230 | 84% |
| Household Income | Less Than $50,000 | 313 | 219 | 70% |
| Household Income | $50,000 - 100,000 | 322 | 247 | 77% |
| Household Income | $100,000+ | 278 | 247 | 89% |
The smallest group here has a sample size of 255. Surely that's a big enough number, right?* Let's calculate the standard error - that is, the range we'd expect the number to land in if we drew another sample of the same size. The bars indicate this range; if they overlap, we're less sure that our difference is real rather than an artifact of the sample we happened to get.
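Here's a minimal sketch of those error bars, reusing sat_data from above; it assumes the usual normal approximation for a proportion, with 95% intervals:

```r
# Minimal sketch: 95% intervals via the normal approximation,
# SE = sqrt(p * (1 - p) / n), then horizontal error bars per class.
sat_data %>%
  filter(demo != 'Total') %>%
  mutate(
    se = sqrt(sat_perc * (1 - sat_perc) / n),
    lower = sat_perc - 1.96 * se,
    upper = sat_perc + 1.96 * se
  ) %>%
  ggplot(aes(x = sat_perc, y = class)) +
  geom_point() +
  geom_errorbarh(aes(xmin = lower, xmax = upper), height = 0.3) +
  facet_wrap(~demo, scales = 'free_y') +
  scale_x_percent() +
  labs(x = 'Satisfied', y = NULL) +
  theme_ipsum()
```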
We see that both college grads and those making over $100K report higher satisfaction*. This is where the cross-tab lets us down, though: it's reasonable to assume that most (not all, but most) of those making over $100K are also college grads. If we were building a model on this, we'd test that relationship and either keep just one variable or use an interaction term, i.e. measure satisfaction for each combination of education and income (see the sketch below). Political party shows clear separation between Independents and Democrats, but everything else has overlap.
*This of course assumes that we have a representative sample. Their methodology seems sound - random calls spread across the country - but you should always check how people were polled. If they opted in to an online poll on a partisan website, you can bet the results will be skewed.
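For reference, here's what that interaction term might look like. This is purely hypothetical - satisfied, education, income, and the respondents data frame are all assumed names, since the cross-tab doesn't include respondent-level records:

```r
# Hypothetical sketch: education * income fits a separate effect for each
# education/income combination instead of assuming the two effects just add.
# `respondents` (one row per person, with a binary `satisfied` column) is an
# assumed data frame, not something we can build from the cross-tab.
glm(satisfied ~ education * income, family = binomial, data = respondents)
```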
Eyeballing stuff is fun, but we can be a bit more rigorous and run a chi-squared test within each group. If the p-value is less than 0.05, we'll say the differences are likely real - though we should incorporate our own knowledge of how the data was generated, not just check the box and call it good.
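A sketch of that test, again reusing sat_data: within each demographic, we compare the satisfied and not-satisfied counts across classes.

```r
# Chi-squared test of homogeneity within each demographic group:
# rows are classes, columns are satisfied / not-satisfied counts.
sat_data %>%
  filter(demo != 'Total') %>%
  group_by(demo) %>%
  summarize(p_value = chisq.test(cbind(sat, n - sat))$p.value)
```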
There may be other lurking variables as well. For instance, age may be more important than race, but if the survey contained a higher proportion of White respondents in the 35-54 group, that would skew the White results upward, even if the two groups were dead even after adjusting for age. The news report also mentioned religious attendance; that's not in the cross-tab, but it would certainly add flavor to a full-blown analysis.
*Ryan Burge delves into religious/demographic/political data quite regularly; his blog is well worth the follow.
Reporting this way is certainly better than a plain old cross-tab, and the error bars should keep us somewhat in line. There are situations where we have to take this route, either because of time constraints or because this is the only data we have. Ideally, we'd set up a regression model that would tell us how much each demographic contributes (a sketch follows the footnote below).
*Note to aspiring data analysts: someday, you'll get an emergency request for data, likely a cross-tab. Providing even a basic linear regression, t-test, etc. alongside it can help avoid some of the sample-size issues we're trying to stay away from.
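Again as a sketch only: if we had the respondent-level file (the assumed respondents data frame from the earlier snippet), a logistic regression would estimate each demographic's contribution while holding the others constant. Every column name here is illustrative, not from the actual poll:

```r
# Hypothetical: logistic regression on assumed respondent-level data.
# Each coefficient reflects a demographic's association with satisfaction
# after accounting for the other predictors in the model.
model <- glm(
  satisfied ~ gender + race + age_group + education + income + party,
  family = binomial,
  data = respondents
)
summary(model)
```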
But at a fundamental level, knowing the questions to ask will keep you from being led astray by bad studies. I used R for this, but there's nothing here that couldn't be done in Excel or Google Sheets. I'm not saying you need to crunch the numbers yourself every time you see a study, but slowing down and thinking it through will benefit you in the long run.