Shohei vs The Babe

statistics
r
baseball
Author

Mark Jurries II

Published

December 15, 2023

Shohei Ohtani just signed with the Dodgers for $700 million over 10 years. With the deferred money, the present value is closer to $46 million a year (FanGraphs has a great writeup with the details), which seems fair, since he’s the first player to both hit and pitch at an elite level since Babe Ruth.
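To see roughly how deferrals shrink the headline number, here is a minimal sketch of the discounting arithmetic. Every input is an assumption drawn from public reporting on the contract rather than from this post's data: $2 million paid during each season, $68 million per season deferred for ten years, and a discount rate of about 4.43%. The variable names are made up for illustration.

```r
# All inputs are assumptions from public reporting, not from this post's data files
salary_paid_now <- 2       # $ millions paid during each contract season (assumed)
salary_deferred <- 68      # $ millions per season that is deferred (assumed)
deferral_years  <- 10      # each deferred payment arrives ten years later (assumed)
discount_rate   <- 0.0443  # annual discount rate (assumed)

# Discount one season's deferred payment back to the season it was earned in
deferred_present_value <- salary_deferred / (1 + discount_rate)^deferral_years

# Present value of a single contract season, in $ millions
present_value_per_year <- salary_paid_now + deferred_present_value
print(round(present_value_per_year, 1))
```

Under those assumptions the result lands close to the $46 million a year quoted above. A different discount rate would move it noticeably.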

So naturally, we want to see how they compare to each other. We’ll get data from the aforementioned FanGraphs. This doesn’t include Ohtani’s time in Japan, so bear in mind we’re comparing MLB careers, not baseball careers. We’ll start with WAR: which player added the most estimated value, both by season and in total? We’ll also look at batting and pitching separately, and view the trends both by seasons played and by age.

library(hrbrthemes)
library(gridExtra)
library(gt)
library(gtExtras)
library(patchwork)
library(tidyverse)

batting_data <- read.csv('batting.csv')
pitching_data <- read.csv('pitching.csv')

batting_long <- batting_data %>%
  select(-NameASCII, -PlayerId, -MLBAMID) %>%
  pivot_longer(-c(Season, Name, Team, Age), names_to = 'metric', values_to = 'value') %>%
  mutate(type = 'Batting')

pitching_long <- pitching_data %>%
  select(-NameASCII, -PlayerId, -MLBAMID) %>%
  pivot_longer(-c(Season, Name, Team, Age), names_to = 'metric', values_to = 'value') %>%
  mutate(type = 'Pitching')

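# Combine batting and pitching WAR into one row per player-season,
# then accumulate career totals and number each player's seasons
WAR <- batting_long %>%
  filter(metric == 'WAR') %>%
  rbind(pitching_long %>% filter(metric == 'WAR')) %>%
  # A season with no batting (or pitching) line gets 0 WAR for that side
  pivot_wider(names_from = type, values_from = value, values_fill = list(value = 0)) %>%
  mutate(Total = Batting + Pitching) %>%
  arrange(Name, Season) %>%
  group_by(Name) %>%
  # Running totals within each player; season_number is 1 for the first MLB season
  mutate(Batting_to_Date = cumsum(Batting),
         Pitching_to_Date = cumsum(Pitching),
         Total_to_Date = cumsum(Total),
         season_number = dense_rank(Season))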

season_war_plot <- WAR %>%
  ggplot(aes(x = season_number, y = Total_to_Date, color = Name))+
  geom_line()+
  theme_ipsum()+
  scale_color_manual(values = c('#C4CED3', '#005A9C'))+
  ylab('Total WAR to Date')+
  xlab('Seasons Played')

age_war_plot <- WAR %>%
  ggplot(aes(x = Age, y = Total_to_Date, color = Name))+
  geom_line()+
  theme_ipsum()+
  scale_color_manual(values = c('#C4CED3', '#005A9C'))+
  ylab('Total WAR to Date')+
  xlab('Age')

season_bat_war_plot <- WAR %>%
  ggplot(aes(x = season_number, y = Batting_to_Date, color = Name))+
  geom_line()+
  theme_ipsum()+
  scale_color_manual(values = c('#C4CED3', '#005A9C'))+
  ylab('Batting WAR to Date')+
  xlab('Seasons Played')

age_bat_war_plot <- WAR %>%
  ggplot(aes(x = Age, y = Batting_to_Date, color = Name))+
  geom_line()+
  theme_ipsum()+
  scale_color_manual(values = c('#C4CED3', '#005A9C'))+
  ylab('Batting WAR to Date')+
  xlab('Age')

season_pitch_war_plot <- WAR %>%
  ggplot(aes(x = season_number, y = Pitching_to_Date, color = Name))+
  geom_line()+
  theme_ipsum()+
  scale_color_manual(values = c('#C4CED3', '#005A9C'))+
  ylab('Pitching WAR to Date')+
  xlab('Seasons Played')

age_pitch_war_plot <- WAR %>%
  ggplot(aes(x = Age, y = Pitching_to_Date, color = Name))+
  geom_line()+
  theme_ipsum()+
  scale_color_manual(values = c('#C4CED3', '#005A9C'))+
  ylab('Pitching WAR to Date')+
  xlab('Age')

season_war_plot + age_war_plot + season_bat_war_plot + age_bat_war_plot + season_pitch_war_plot + age_pitch_war_plot + plot_layout(guides = "collect", ncol = 2) & theme(legend.position = "top")
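The six plot definitions above differ only in their x column, y column, and axis labels, so they could be generated by one small function. The sketch below is one way to do that, not the code used for this post: `war_line_plot` and its arguments are hypothetical names, and it defaults to `theme_minimal()` so it runs with ggplot2 alone (pass `hrbrthemes::theme_ipsum()` to match the post's styling).

```r
library(ggplot2)

# Hypothetical helper (not part of the original post): one cumulative line
# chart per call, colored by player, with the post's two-color palette.
war_line_plot <- function(data, x_col, y_col, x_label, y_label,
                          plot_theme = theme_minimal()) {
  # .data[[...]] lets the column names be passed in as strings
  ggplot(data, aes(x = .data[[x_col]], y = .data[[y_col]], color = Name)) +
    geom_line() +
    plot_theme +
    scale_color_manual(values = c('#C4CED3', '#005A9C')) +
    xlab(x_label) +
    ylab(y_label)
}

# Example calls, assuming the WAR data frame built earlier in the post:
# season_war_plot <- war_line_plot(WAR, 'season_number', 'Total_to_Date',
#                                  'Seasons Played', 'Total WAR to Date')
# age_war_plot    <- war_line_plot(WAR, 'Age', 'Total_to_Date',
#                                  'Age', 'Total WAR to Date')
```

The resulting objects combine with patchwork exactly as before, since each call returns an ordinary ggplot object.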

WAR %>%
  filter(season_number == 6) %>%
  ungroup() %>%
  select(Name, Age, Batting_to_Date, Pitching_to_Date, Total_to_Date) %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(columns = 3:5, decimals = 1) %>%
  tab_header(title = 'WAR After 6 Seasons')
WAR After 6 Seasons
Name           Age  Batting_to_Date  Pitching_to_Date  Total_to_Date
Babe Ruth       24             18.3              12.3           30.5
Shohei Ohtani   28             19.9              11.8           31.8
# Same table, but through each player's age-28 season
WAR %>%
  filter(Age == 28) %>%
  ungroup() %>%
  select(Name, Age, Batting_to_Date, Pitching_to_Date, Total_to_Date) %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(columns = 3:5, decimals = 1) %>%
  tab_header(title = 'WAR After Age 28')
WAR After Age 28
Name           Age  Batting_to_Date  Pitching_to_Date  Total_to_Date
Babe Ruth       28             66.3              12.1           78.4
Shohei Ohtani   28             19.9              11.8           31.8

If we go by seasons played, they’re neck and neck; in fact, Shohei has a slight edge. But Ruth was four years younger at that point, and while he was just about done pitching, he had a long and productive career at the plate still to come. Ohtani isn’t pitching in 2024 as he recovers from surgery, but presumably he will be back on the mound after that. If he keeps going, he’ll likely pass Ruth in pitching value.

Of course, Ruth is known more for being a home run hitter than for being a two-way player. So how does Ohtani compare there?

hr_totals <- batting_long %>%
  filter(metric == 'HR') %>%
  arrange(Name, Season) %>%
  group_by(Name) %>%
  mutate(HR_to_Date = cumsum(value),
         season_number = dense_rank(Season))

season_hr_totals_plot <- hr_totals %>%
  ggplot(aes(x = season_number, y = HR_to_Date, color = Name))+
  geom_line()+
  theme_ipsum()+
  scale_color_manual(values = c('#C4CED3', '#005A9C'))+
  ylab('Home Runs to Date')+
  xlab('Seasons Played')

age_hr_totals_plot <- hr_totals %>%
  ggplot(aes(x = Age, y = HR_to_Date, color = Name))+
  geom_line()+
  theme_ipsum()+
  scale_color_manual(values = c('#C4CED3', '#005A9C'))+
  ylab('Home Runs to Date')+
  xlab('Age')

season_hr_totals_plot + age_hr_totals_plot + plot_layout(guides = "collect", ncol = 2) & theme(legend.position = "top")

hr_totals %>%
  filter(season_number == 6) %>%
  ungroup() %>%
  select(Name, Age, HR_to_Date) %>%
  gt() %>%
  gt_theme_espn() %>%
  tab_header(title = 'HR after 6 Seasons')
HR after 6 Seasons
Name           Age  HR_to_Date
Babe Ruth       24          49
Shohei Ohtani   28         171
# Same table, but through each player's age-28 season
hr_totals %>%
  filter(Age == 28) %>%
  ungroup() %>%
  select(Name, Age, HR_to_Date) %>%
  gt() %>%
  gt_theme_espn() %>%
  tab_header(title = 'HR at Age 28')
HR at Age 28
Name           Age  HR_to_Date
Babe Ruth       28         238
Shohei Ohtani   28         171

If we just look at seasons played, Ohtani is looking pretty good. But Ruth had a four-year head start, which puts Shohei behind in the age comparison. We can add in the 48 home runs he hit in the Japanese leagues to bring him up to 219, but he still trails the Bambino.
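The adjustment above is simple enough to check in a few lines. This sketch uses only the numbers already quoted (the age-28 MLB totals from the table and the 48 home runs from Japan); the column names `mlb_hr`, `npb_hr`, `total_hr`, and `gap_to_leader` are made up for illustration.

```r
library(dplyr)

# Home runs through age 28: MLB totals from the table above, plus Ohtani's
# 48 home runs in Japan as quoted in the text
hr_through_age_28 <- data.frame(
  Name   = c('Babe Ruth', 'Shohei Ohtani'),
  mlb_hr = c(238, 171),
  npb_hr = c(0, 48)
) %>%
  mutate(total_hr = mlb_hr + npb_hr,
         # How far each player trails the higher of the two totals
         gap_to_leader = max(total_hr) - total_hr)

print(hr_through_age_28)
```

Even with the Japanese totals included, Ohtani is 19 home runs behind Ruth at the same age.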

Of course, we’re dealing with two vastly different eras of baseball. Ruth played when hitting home runs was rare, the league wasn’t integrated, and the sport was in many ways still figuring itself out. That he’s still regarded as the best player of all time is a testament to his prowess. That Ohtani is being legitimately compared to him is all the more reason to watch him play. Even if it’s with the Dodgers.