Range Defense Added and OAA- Outfield Edition

TL;DR It massively cheats and it’s bad, just ignore it.

First, OAA finally lets us compare all outfielders to each other regardless of position and without need for a positional adjustment.  Range Defense Added unsolves that problem and goes back to comparing position-by-position.  It also produces some absolutely batshit numbers.

From 2022:

| Name | Position | Innings | Range Out Score | Fielded Plays |
|---|---|---|---|---|
| Giancarlo Stanton | LF | 32 | -21.5 | 6 |
| Giancarlo Stanton | RF | 280.7 | -6.8 | 90 |

Stanton was -2 OAA on the year in ~300 innings (about a quarter of a season).  A Range Out Score (ROS) of -21.5 over a full season is equivalent to pushing -50 OAA.  The worst qualified season in the Statcast era is 2016 Matt Kemp (-26 OAA in 240 opportunities), and that isn’t even -*10*% success probability added (analogous to ROS), much less -21.5%.  The worst seasons at 50+ attempts (~300 innings) are 2017 Trumbo and 2019 Jackson Frazier at -12%.  Maybe 2022 Yadier Molina converted to a full-time CF could have pulled off -21.5%, but nobody who’s actually put in the outfield voluntarily for 300 innings in the Statcast era is anywhere near that terrible.  That’s just not a number a sane model can put out without a hell of a reason, and 2022 Stanton was just bad in the field, not “craploads worse than end-stage Kemp and Trumbo” material.
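The scaling behind that comparison is plain arithmetic. Here is a minimal sketch, assuming a full season is roughly 240 opportunities (borrowed from the Kemp line above, not an official figure) and treating ROS as percentage points of success probability added per opportunity. The function name and the constant are hypothetical, chosen for illustration only.

```python
# Assumption: ~240 opportunities in a full season (taken from the 2016 Kemp
# line above) and ROS read as percentage points of success probability added.
FULL_SEASON_OPPORTUNITIES = 240


def rate_to_outs(rate_pct: float, opportunities: int = FULL_SEASON_OPPORTUNITIES) -> float:
    """Convert a per-opportunity rate (in percent) into outs above average."""
    return rate_pct / 100.0 * opportunities


# Stanton's LF Range Out Score stretched over a full season of opportunities.
stanton_lf_full_season = rate_to_outs(-21.5)
print(round(stanton_lf_full_season, 1))  # about -51.6
```

Under the same assumptions, `rate_to_outs(-12, 50)` puts a -12% season at 50 attempts at about 6 outs below average.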

| Name | Position | Innings | Range Out Score | Fielded Plays |
|---|---|---|---|---|
| Luis Barrera | CF | 1 | 6.1 | 2 |
| Luis Barrera | LF | 98.7 | 2 | 38 |
| Luis Barrera | RF | 101 | 4.6 | 37 |

I thought CF was supposed to be the harder position.  No idea where that number comes from.  Barrera has played OF quite well in his limited time, but not +6.1% over the average CF well.

As I did with the infield edition, I’ll be using rate stats (Range Out Score and OAA/inning) for correlations, and each player-position-year combo is treated separately.  It’s important to repeat the reminder that BP will blatantly cheat to improve correlations without mentioning anything about what they’re doing in the announcements, and they’re almost certainly doing that again here.

Here’s a chart with year-to-year correlations broken down by inning tranches (weighted by the lower innings total of the two paired years):

| LF | OAA to OAA | ROS to ROS | ROS to OAA | Lower innings | Higher Innings | Inn at other positions year T | Inn at other positions year T | n |
|---|---|---|---|---|---|---|---|---|
| 0 to 10 | -0.06 | 0.21 | -0.11 | 6 | 102 | 246 | 267 | 129 |
| 10 to 25 | -0.04 | 0.43 | 0.08 | 17 | 125 | 287 | 332 | 128 |
| 25 to 50 | 0.10 | 0.73 | 0.30 | 35 | 175 | 355 | 318 | 135 |
| 50 to 100 | 0.36 | 0.67 | 0.23 | 73 | 240 | 338 | 342 | 120 |
| 100 to 200 | 0.27 | 0.78 | 0.33 | 142 | 384 | 310 | 303 | 121 |
| 200 to 400 | 0.49 | 0.71 | 0.37 | 284 | 581 | 253 | 259 | 85 |
| 400+ inn | 0.52 | 0.56 | 0.32 | 707 | 957 | 154 | 124 | 75 |

| RF | OAA to OAA | ROS to ROS | ROS to OAA | Lower innings | Higher Innings | Inn at other positions year T | Inn at other positions year T | n |
|---|---|---|---|---|---|---|---|---|
| 0 to 10 | 0.10 | 0.34 | 0.05 | 5 | 91 | 303 | 322 | 121 |
| 10 to 25 | 0.05 | 0.57 | 0.07 | 16 | 140 | 321 | 299 | 128 |
| 25 to 50 | 0.26 | 0.59 | 0.14 | 36 | 186 | 339 | 350 | 101 |
| 50 to 100 | 0.09 | 0.75 | 0.16 | 68 | 244 | 367 | 360 | 168 |
| 100 to 200 | 0.38 | 0.72 | 0.42 | 137 | 347 | 376 | 370 | 83 |
| 200 to 400 | 0.30 | 0.68 | 0.43 | 291 | 622 | 245 | 210 | 83 |
| 400+ inn | 0.60 | 0.58 | 0.32 | 725 | 1026 | 120 | 129 | 92 |

| CF | OAA to OAA | ROS to ROS | ROS to OAA | Lower innings | Higher Innings | Inn at other positions year T | Inn at other positions year T | n |
|---|---|---|---|---|---|---|---|---|
| 0 to 10 | 0.00 | 0.16 | 0.09 | 5 | 161 | 337 | 391 | 83 |
| 10 to 25 | 0.00 | 0.42 | -0.01 | 17 | 187 | 314 | 362 | 95 |
| 25 to 50 | 0.04 | 0.36 | 0.03 | 34 | 234 | 241 | 294 | 73 |
| 50 to 100 | 0.16 | 0.56 | 0.09 | 70 | 305 | 299 | 285 | 100 |
| 100 to 200 | 0.34 | 0.70 | 0.42 | 148 | 434 | 314 | 305 | 95 |
| 200 to 400 | 0.47 | 0.66 | 0.25 | 292 | 581 | 228 | 230 | 86 |
| 400+ inn | 0.48 | 0.45 | 0.22 | 754 | 995 | 134 | 77 | 58 |
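For anyone who wants to reproduce this kind of table, the weighting can be sketched as a weighted Pearson correlation where each player-position pair is weighted by the smaller of its two innings totals. This is a minimal sketch of that idea, not the exact script behind the numbers above; the function name and the sample rows are made up for illustration.

```python
import math


def weighted_corr(x, y, w):
    """Weighted Pearson correlation of x and y with non-negative weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(var_x * var_y)


# Hypothetical rows: (rate in first year, rate in second year,
#                     innings in first year, innings in second year)
pairs = [
    (0.02, 0.01, 40, 300),
    (-0.05, -0.01, 30, 150),
    (0.00, 0.03, 45, 45),
    (0.04, 0.02, 28, 600),
]
first_year = [p[0] for p in pairs]
second_year = [p[1] for p in pairs]
weights = [min(p[2], p[3]) for p in pairs]  # the smaller innings total of each pair

r = weighted_corr(first_year, second_year, weights)
print(round(r, 3))
```

Weighting by the smaller innings total keeps a pair with 5 innings in one year from counting as much as a pair with 45 innings in both.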

Focus on the left side of the chart first.  OAA/inning behaves reasonably, being completely useless for very small numbers of innings and then doing fine for players who actually play a lot.  ROS is simply insane.  Outfielders in aggregate get an opportunity to make a catch every ~4 innings (where opportunity is a play that the best fielders would have a nonzero chance at, not something completely uncatchable that they happen to pick up after it’s hit the ground).

ROS is claiming meaningful correlations on 1-2 opportunities, and after ~10 opportunities it’s posting year-to-year correlations on par with OAA’s after a full season.  That’s simply impossible (or beyond astronomically unlikely) to do with ~10 yes/no outcome data points with average talent variation well under +/-10%.  The only way to do it is by using some kind of outside information to cheat (time spent at DH/1B? who knows, who cares).
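A back-of-the-envelope version of that argument is sketched below. It uses a standard measurement-error model (true talent plus independent play-outcome noise), not anything taken from BP or Statcast. Every parameter is an assumption: a talent standard deviation of 5 percentage points (generous, given the spread above), a per-play outcome variance of 0.1, and talent that doesn't change between years.

```python
import math


def expected_corr(talent_sd, n1, n2, per_play_var=0.1):
    """Expected year-to-year correlation of observed success rates.

    Assumes each observed rate is true talent plus independent noise with
    variance per_play_var / n, and that true talent is identical in both
    years. All default values are illustrative assumptions.
    """
    talent_var = talent_sd ** 2
    noise_1 = per_play_var / n1
    noise_2 = per_play_var / n2
    return talent_var / math.sqrt((talent_var + noise_1) * (talent_var + noise_2))


# ~10 opportunities in both years
print(round(expected_corr(0.05, 10, 10), 2))  # 0.2
# ~10 opportunities in one year, ~50 in the other
print(round(expected_corr(0.05, 10, 50), 2))  # 0.33
```

Under these assumptions, honest play-outcome data tops out well short of 0.7 at ~10 opportunities. Getting there with 10 opportunities in each year would take a talent standard deviation above 15 percentage points.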

I don’t know why the 0-10 inning correlations are so low- those players played a fair bit at other positions (see the right side of the table), so any proxy cheat measures should have reasonably stabilized- but maybe the model is just generically batshit nonsense at extremely low opportunities at a position for some unknown reason as happened with the DRC+ rollout (look at the gigantic DRC+ spread on 1 PA 1 uBB pitchers in the cheating link above).

Also, once ROS crosses the 200-inning threshold, it starts getting actively worse at correlating to itself.  Across all three positions, it correlates much better at lower innings totals and then shits the bed once it starts trying to correlate full-time seasons to full-time seasons.  This is obviously completely backwards of how a metric should behave and more evidence that the basic model behavior here is “good correlation based on cheating (outside information) that’s diluted by mediocre correlation on actual play-outcome data.”

They actually do “improve” on team switchers here relative to nonswitchers, instead of being the worst as they were in the infield, again likely due to overfitting to a fairly small number of players, but it’s still nothing of note given how bad they are relative to OAA’s year-to-year correlations for regular players, even with the cheating.
