Analyzing Supreme Court Arguments

Categories: statistics, r, text mining

Author: Mark Jurries II

Published: October 12, 2023

I was listening to the Advisory Opinions podcast when they mentioned some stats available at Empirical SCOTUS around word counts for the current Supreme Court session. While the hosts didn’t put much stock into word counts (which are neat but ultimately just trivia), my curiosity was piqued enough to see if the data was available for download. Sure enough, it’s there (big kudos to that team!), so time to play ball.

First, I wanted to measure positive/negative sentiment by each justice. Sentiment analysis is a bit of a blunt tool - it's not great at picking up context, sarcasm*, etc. - but it's a starting point. It's been a while since I've done any text mining, so "Text Mining with R" was an invaluable resource. Note the sentiment join only returns words the lexicon can flag, so the word counts here will drop most words, including "stop words" like "the" and "and" - the toy example below shows the effect.

*I would never be sarcastic, of course.
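To see what that sentiment join does, here's a toy example - the sentence is made up, not from the transcripts:

library(tidytext)
library(tidyverse)

# nine words go in; only the ones in the bing lexicon come out
# (likely just 'good' as positive and 'harsh' as negative)
tibble(text = "the opinion was good but the dissent seemed harsh") %>%
  unnest_tokens(word, text) %>%
  inner_join(get_sentiments("bing"))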
Show the code
library(hrbrthemes)
library(gt)
library(gtExtras)
library(tidytext)
library(tidyverse)

sc_data <- readxl::read_xlsx('scotus_transcripts_23.xlsx')

# one row per word spoken
sc_df <- sc_data %>%
  unnest_tokens(word, text)

# keep only the words the bing lexicon can score
sc_sentiment <- sc_df %>%
  inner_join(get_sentiments("bing"))

sc_sentiment %>%
  mutate(speaker = str_to_title(speaker)) %>%
  # collapse everyone who isn't a justice into a single 'Lawyer' bucket
  mutate(speaker = case_when(grepl('Justice', speaker) ~ speaker, TRUE ~ 'Lawyer')) %>%
  group_by(speaker, sentiment) %>%
  summarise(n = n()) %>%
  pivot_wider(names_from = sentiment, values_from = n) %>%
  mutate(total = negative + positive,
         perc_negative = negative / total,
         perc_positive = positive / total) %>%
  ungroup() %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_percent(c(perc_negative, perc_positive), decimals = 1) %>%
  fmt_number(c(positive, negative, total), decimals = 0)
| speaker | negative | positive | total | perc_negative | perc_positive |
|---|---:|---:|---:|---:|---:|
| Chief Justice Roberts | 66 | 98 | 164 | 40.2% | 59.8% |
| Justice Alito | 113 | 145 | 258 | 43.8% | 56.2% |
| Justice Barrett | 89 | 111 | 200 | 44.5% | 55.5% |
| Justice Gorsuch | 102 | 123 | 225 | 45.3% | 54.7% |
| Justice Jackson | 224 | 210 | 434 | 51.6% | 48.4% |
| Justice Kagan | 116 | 150 | 266 | 43.6% | 56.4% |
| Justice Kavanaugh | 83 | 115 | 198 | 41.9% | 58.1% |
| Justice Sotomayor | 114 | 94 | 208 | 54.8% | 45.2% |
| Justice Thomas | 30 | 22 | 52 | 57.7% | 42.3% |
| Lawyer | 1,206 | 1,432 | 2,638 | 45.7% | 54.3% |

The lawyers tend to be more positive, though this level of aggregation really should split out counsel for the petitioner and the respondent. The sample sizes overall are pretty small - I especially wouldn't read much into Justice Thomas' 52 words - but on the whole it's a relatively positive group.
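If the transcripts were tagged with which side each lawyer argued for, that split would be straightforward. A minimal sketch, assuming a hypothetical side column ('petitioner' / 'respondent') that the downloaded file doesn't have:

# hypothetical: assumes a `side` column tagged per lawyer,
# which is not in the Empirical SCOTUS file
sc_sentiment %>%
  filter(speaker_type != 'Justice') %>%
  count(side, sentiment) %>%
  pivot_wider(names_from = sentiment, values_from = n) %>%
  mutate(perc_positive = positive / (positive + negative))

What if we look by case instead?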

Show the code
sc_sentiment %>%
  group_by(case_name, speaker_type, sentiment) %>%
  summarise(n = n()) %>%
  # share of each sentiment within every case / speaker-type pair
  group_by(case_name, speaker_type) %>%
  mutate(total_n = sum(n),
         perc = n / total_n) %>%
  filter(sentiment == 'positive') %>%
  ggplot(aes(x = perc, y = case_name, color = speaker_type))+
  geom_point()+
  theme_ipsum()+
  scale_x_percent()+
  scale_color_manual(values = c('grey', 'black'))+
  xlab("")+
  ylab("")+
  ggtitle(label = 'Positive % by Supreme Court Case')+
  labs(caption = "Data courtesy empiricalscotus.com")

Not too far apart, except for the Consumer Financial Protection case - the lawyers were more upbeat than the Justices by a wide margin. But we skipped a step - how good are the positive/negative word choices? Let’s break down the top five words by justice and see what we get.

Show the code
data(stop_words)

sc_justice <- sc_sentiment %>%
  filter(speaker_type == 'Justice') %>%
  mutate(speaker = str_to_title(speaker))

sc_justice %>%
  anti_join(stop_words) %>%
  group_by(speaker, word) %>%
  summarise(n = n()) %>%
  # re-attach the sentiment label dropped by summarise()
  inner_join(get_sentiments("bing")) %>%
  arrange(desc(n)) %>%
  group_by(speaker) %>%
  # top five words per justice (ties can add a few extra rows)
  slice_max(n, n = 5) %>% 
  ungroup() %>%
  # order words within each facet independently
  mutate(word = reorder_within(word, n, speaker)) %>%
  ggplot(aes(x = n, y = word, fill = sentiment))+
  geom_col(show.legend = FALSE)+
  facet_wrap(~speaker, scales = "free_y")+
  theme_ipsum()+
  scale_fill_manual(values = c('#852545', '#295396'))+
  scale_y_reordered()

A lot of the talk is about discrimination, issues, etc., which our sentiment dictionary views as negative, but you can't argue a case and only discuss rainbows and unicorns. So while our model thinks the justices are being gloomy, in reality it's just legalese, which by nature has to deal with that sort of thing. We could use a different sentiment library, or we could tune this a bit more (a quick sketch below), but for a quick look we can call it good.
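One cheap tuning pass is a custom stoplist of legal terms the lexicon misreads, dropped before the sentiment join. A sketch - the word list here is illustrative, not a vetted legal stoplist - and swapping in another lexicon (e.g. get_sentiments("afinn"), which needs the textdata package) would work much the same way:

# illustrative only: hand-picked legalese that reads as neutral
# in oral argument even if a general lexicon scores it as negative
legal_neutral <- tibble(word = c('discrimination', 'issue', 'issues',
                                 'objection', 'moot'))

sc_df %>%
  anti_join(legal_neutral, by = 'word') %>%  # drop domain terms first
  inner_join(get_sentiments("bing")) %>%
  count(speaker, sentiment)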

Finally, let’s see what separates our cases so far.

Show the code
sc_sentiment %>%
  anti_join(stop_words) %>%
  group_by(case_name, word) %>%
  summarise(n = n()) %>%
  # re-attach the sentiment label dropped by summarise()
  inner_join(get_sentiments("bing")) %>%
  arrange(desc(n)) %>%
  group_by(case_name) %>%
  # top ten words per case (ties can add a few extra rows)
  slice_max(n, n = 10) %>% 
  ungroup() %>%
  # order words within each facet independently
  mutate(word = reorder_within(word, n, case_name)) %>%
  ggplot(aes(x = n, y = word, fill = sentiment))+
  geom_col(show.legend = FALSE)+
  facet_wrap(~case_name, scales = "free_y")+
  theme_ipsum()+
  scale_fill_manual(values = c('#852545', '#295396'))+
  scale_y_reordered()

This all checks out - Laufer deals with alleged discrimination against the disabled at hotels, and there are questions about whether the case is still relevant (if not, it'd be moot, hence that word's inclusion), South Carolina with redistricting (hence "partisan" and "Trump"), CFPB with how agencies can enforce law without Congress, and so on.

This is pretty high-quality text data, so we can get some decent stuff out of it. That isn't always the case - I've worked with extremely sparse text data in the past, so it's fun to get a good dataset like this. Given enough time, one could probably build a model to forecast how a Justice might vote in a case given their history plus the current case. A rough sketch of what that could look like follows.
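This last part is entirely hypothetical - there are no vote labels in this dataset, so the votes table below (case_name, speaker, vote_for_petitioner) would have to be built from another source, such as the eventual published opinions:

# hypothetical sketch: per-justice sentiment share as a predictor
features <- sc_sentiment %>%
  filter(speaker_type == 'Justice') %>%
  count(case_name, speaker, sentiment) %>%
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
  mutate(perc_negative = negative / (negative + positive))

# `votes` is assumed, not provided: one row per justice per case
model_df <- features %>%
  inner_join(votes, by = c('case_name', 'speaker'))

# does a justice's share of negative words predict their vote?
glm(vote_for_petitioner ~ perc_negative, data = model_df, family = binomial)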