Pitch Clock Impact on Batters

statistics
r
baseball
Author: Mark Jurries II

Published: May 4, 2023

MLB’s new pitch clock has sped games up considerably this year, dropping the average length of the game by almost half an hour. One can’t help but wonder how this is impacting batters*. We could just look at league averages and compare March/April of 2022 to March/April of 2023, but we could miss some individual variation.

*Pitchers too, but that’s another study for another day.

There are some other confounders this year - bigger bases, shift bans, limits on pickoffs - that could also impact a player’s outcomes. With that caveat in mind, we’ll compare strikeout rate, walk rate, swing rate, outside-the-zone swing rate, and inside-the-zone swing rate for all players with plate appearances in March/April of both 2022 and 2023. These are all stats we could reasonably expect to be impacted by the pitch clock. We’ll also look at wOBA, because we want some idea of the impact on offensive production. Data is from the always-excellent FanGraphs.

Before we get going, we need a rough idea of batter pace last year. And just to make sure there’s a difference, we’ll compare to this year.

library(hrbrthemes)
library(gt)
library(gtExtras)
library(tidyverse)
library(plotly)

set.seed(20230502)

fg_2022 <- read.csv('fg_batters_2022_april.csv') %>%
  mutate(season = 'Mar_April_2022')

fg_2023 <- read.csv('fg_batters_2023_april.csv') %>%
  mutate(season = 'Mar_April_2023')

fg_2022 %>%
  rbind(fg_2023) %>%
  ggplot(aes(x = Pace, fill = season))+
  geom_density(alpha = 0.5)+
  scale_fill_manual(values = c('grey', '#0076B6'))+
  theme_ipsum()

Well, that’s interesting right there. We expected faster pacing, and with a ceiling now in place there’s also less spread. There’s some overlap between the two seasons, but so little we may as well ignore it. To set our groups, we’ll put batters into four buckets based on 2022 pace: slowest (above the 75th percentile), kinda slow (50th to 75th), kinda fast (25th to 50th), and fastest (bottom 25%). Those breakpoints are:

quantile(fg_2022$Pace, c(0.25, 0.5, 0.75))
  25%   50%   75% 
22.25 23.00 24.15 
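
As a rough sketch of how percentile buckets like these can be built without typing the breakpoints by hand, `quantile()` can feed `cut()` directly. The pace values and short labels below are hypothetical toy data, not the FanGraphs numbers, and `cut()` treats a value sitting exactly on a breakpoint slightly differently than hard-coded comparisons might.

```r
# Toy pace values in seconds; hypothetical, not taken from the FanGraphs data
pace <- c(20.1, 21.8, 22.4, 22.9, 23.3, 23.9, 24.6, 26.0)

# Quartile breakpoints, the same kind of call as above
breaks <- quantile(pace, c(0.25, 0.5, 0.75))

# cut() assigns each value to an interval; -Inf/Inf close the outer bins,
# and right = TRUE puts a value equal to a breakpoint in the faster bin
pace_bin <- cut(pace,
                breaks = c(-Inf, breaks, Inf),
                labels = c('1. Fastest', '2. Kinda Fast', '3. Kinda Slow', '4. Slowest'),
                right = TRUE)

# Count of toy players per bucket
table(pace_bin)
```

With quartile breaks, each bucket ends up holding roughly a quarter of the players, which is the point of binning this way.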

First, we’ll look at all players, colored by the bins we just made, plotting their change in pace against their change in other stats.

last_year_pacing <- fg_2022 %>%
  mutate(last_year_pace_bin = case_when(Pace <= 22.25 ~ '1. Fastest (22.25 and below)',
                                        Pace < 23.00 ~ '2. Kinda Fast (22.26 to 22.99)',
                                        Pace < 24.15  ~ '3. Kinda Slow (23.00 to 24.14)',
                                        TRUE ~ '4. Slowest (24.15 and above)')) %>%
  select(playerid, last_year_pace_bin)

#csv adds percentage to some stats, use this function to remove and convert to numeric
perc_to_double <- function(x){
  y <- as.numeric(gsub('%', '', x)) / 100
  y
}

fg_all <- fg_2022 %>%
  rbind(fg_2023) %>%
  select(season, playerid, Name, PA, BB., K., wOBA, O.Swing., Z.Swing., Swing., Pace) %>%
  rename(BB_Rate = BB.,
         K_Rate = K.,
         O_Swing = O.Swing.,
         Z_Swing = Z.Swing.,
         Swing = Swing.) %>%
  mutate(BB_Rate = perc_to_double(BB_Rate),
         K_Rate = perc_to_double(K_Rate),
         O_Swing = perc_to_double(O_Swing),
         Z_Swing = perc_to_double(Z_Swing),
         Swing = perc_to_double(Swing)
         ) %>%
  pivot_longer(cols = c('PA', 'BB_Rate', 'K_Rate', 'O_Swing', 'Z_Swing', 'Swing', 'wOBA', 'Pace')) %>%
  pivot_wider(names_from = season, values_from = value) %>%
  filter(!is.na(Mar_April_2022) & !is.na(Mar_April_2023)) %>%
  inner_join(last_year_pacing, by = 'playerid') %>%
  mutate(dif = Mar_April_2023 - Mar_April_2022)

scatter_stats <- fg_all %>%
  group_by(playerid) %>%
  mutate(pace_dif = mean(case_when(name == 'Pace' ~ dif, TRUE ~ NA), na.rm = TRUE)) %>%
  filter(name != 'Pace' & name != 'PA') %>%
  ggplot(aes(x = dif, y = pace_dif))+
  geom_point(aes(color = last_year_pace_bin,
                 # text is only used by ggplotly for the hover tooltip
                 text = paste(Name,
                              '<br>Metric: ', name,
                              '<br>Difference: ', round(dif, digits = 3),
                              '<br>Pace Dif: ', round(pace_dif, digits = 3),
                              '<br>Mar/Apr 2022: ', round(Mar_April_2022, 3),
                              '<br>Mar/Apr 2023: ', round(Mar_April_2023, 3))))+
  stat_smooth()+
  theme_ipsum()+
  facet_wrap(name ~ ., scales = 'free')+
  scale_color_manual(values = c('#a2d7d8', '#bfe1bf', '#fcd059', '#de5842'))+
  theme(legend.position = 'bottom')+
  labs(color = "2022 Pace Grouping")

scatter_stats_plotly <- ggplotly(scatter_stats, tooltip = 'text') %>%
  layout(legend = list(orientation = 'h'))

scatter_stats_plotly
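
The reshaping in the middle of that block is the part that's easiest to lose track of, so here is a minimal sketch of the same long-then-wide pattern on a made-up table. The players and values in `toy` are hypothetical; only the column naming follows the code above.

```r
library(dplyr)
library(tidyr)

# Hypothetical two-season table: player 3 only appears in 2023
toy <- tibble(
  season   = c('Mar_April_2022', 'Mar_April_2022',
               'Mar_April_2023', 'Mar_April_2023', 'Mar_April_2023'),
  playerid = c(1, 2, 1, 2, 3),
  K_Rate   = c(0.25, 0.18, 0.22, 0.20, 0.30),
  Pace     = c(24.5, 22.0, 19.0, 18.5, 18.0)
)

toy_dif <- toy %>%
  # One row per player, season, and stat
  pivot_longer(cols = c('K_Rate', 'Pace')) %>%
  # One column per season, so the two months sit side by side
  pivot_wider(names_from = season, values_from = value) %>%
  # Keep only players who appear in both seasons
  filter(!is.na(Mar_April_2022) & !is.na(Mar_April_2023)) %>%
  # Year-over-year change for every stat at once
  mutate(dif = Mar_April_2023 - Mar_April_2022)

toy_dif
```

Going long first means the difference is computed once for every stat, rather than with one subtraction per column, and the player who only appears in one season drops out at the filter.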

We don’t see a lot of association here. Curiously, though, the hitters who were already pretty quick have seen an increase in swing rate, mostly inside the zone. That could be a sampling issue, or the result of perceived pressure. Regardless, it doesn’t seem to be impacting strikeout or walk rates.

We’ll run a correlation just to validate our eye test above.

fg_all %>%
  group_by(playerid) %>%
  mutate(pace_dif = mean(case_when(name == 'Pace' ~ dif, TRUE ~ NA), na.rm = TRUE)) %>%
  filter(name != 'Pace' & name != 'PA') %>%
  group_by(name) %>%
  summarise(correlation = cor(dif, pace_dif),
            r_squared = correlation ^ 2) %>%
  gt() %>%
  fmt_number(columns = 2:3, decimals = 3) %>%
  tab_style(
    locations = cells_column_labels(columns = everything()),
      style = list(
        cell_text(weight = "bold")
      )
    ) %>%
  gt_theme_espn()
name     correlation  r_squared
BB_Rate       −0.047      0.002
K_Rate         0.120      0.014
O_Swing        0.209      0.044
Swing          0.247      0.061
Z_Swing        0.279      0.078
wOBA          −0.002      0.000
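
For anyone wondering how to read the r_squared column: with a single predictor, the squared correlation is the same number as the R-squared from a simple linear regression, that is, the share of variance in one variable that a straight-line fit on the other accounts for. The sketch below checks this on simulated data; the simulated values are illustrative only and are not drawn from the FanGraphs files.

```r
# Simulated data; the numbers are illustrative only
set.seed(1)
pace_dif <- rnorm(200)
stat_dif <- 0.3 * pace_dif + rnorm(200)

# Pearson correlation, as in the summarise() call above
r <- cor(stat_dif, pace_dif)

# R-squared from a one-predictor linear model
fit_r2 <- summary(lm(stat_dif ~ pace_dif))$r.squared

# The two agree up to floating-point error
all.equal(r^2, fit_r2)

# Reading the table: the largest correlation, 0.279, squares to about 0.078
round(0.279^2, 3)
```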

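Our highest r-squared is a mere 0.078, meaning change in pace explains less than 8% of the variation in even the most strongly related stat. So far, anyway, there’s little evidence of a meaningful relationship between change in pace and the stats shown. What if we just group players by last year’s speed - will that show anything?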

fg_all %>%
  filter(name != 'PA' & name != 'Pace') %>%
  ggplot(aes(x = dif, color =  last_year_pace_bin))+
  geom_density()+
  facet_wrap(name ~ ., scales = 'free')+
  theme_ipsum()+
  geom_vline(xintercept = 0)+
  scale_color_manual(values = c('#a2d7d8', '#bfe1bf', '#fcd059', '#de5842'))+
  theme(legend.position = 'bottom')+
  labs(color = "2022 Pace Grouping")+
  guides(colour = guide_legend(nrow = 2))

Nothing new here, either. Players either got better or worse, regardless of last year’s pace. These are still small samples so perhaps things will shift over the season, but so far there’s not much reason to think the rule change is having an impact on most individual players.