Revisiting the DRC+ team switcher claim

The algorithm has changed a fair bit since I investigated that claim.  At the very least, it’s gotten rid of most of its park factor and (effectively) regresses less than it used to.  It’s not impossible that it could grade out differently now than it did before, and I told somebody on Twitter that I’d check it out again, so here we are.  First of all, let’s remind everybody what the claim is.  From https://www.baseballprospectus.com/news/article/45383/the-performance-case-for-drc/, Jonathan Judge says:


Table 2: Reliability of Team-Switchers, Year 1 to Year 2 (2010-2018); Normal Pearson Correlations[3]

Metric Reliability Error Variance Accounted For
DRC+ 0.73 0.001 53%
wOBA 0.35 0.001 12%
wRC+ 0.35 0.001 12%
OPS+ 0.34 0.001 12%
OPS 0.33 0.002 11%
True Average 0.30 0.002 9%
AVG 0.30 0.002 9%
OBP 0.30 0.002 9%

With this comparison, DRC+ pulls far ahead of all other batting metrics, park-adjusted and unadjusted. There are essentially three tiers of performance: (1) the group at the bottom, ranging from correlations of .3 to .33; (2) the middle group of wOBA and wRC+, which are a clear level up from the other metrics; and finally (3) DRC+, which has almost double the reliability of the other metrics.

You should pay attention to the “Variance Accounted For” column, more commonly known as r-squared. DRC+ accounts for over three times as much variance between batters than the next-best batting metric. In fact, one season of DRC+ explains over half of the expected differences in plate appearance quality between hitters who have switched teams; wRC+ checks in at a mere 16 percent.  The difference is not only clear: it is not even close.

Let’s look at Predictiveness.  It’s a very good sign that DRC+ correlates well with itself, but games are won by actual runs, not deserved runs. Using wOBA as a surrogate for run-scoring, how predictive is DRC+ for a hitter’s performance in the following season?

Table 3: Reliability of Team-Switchers, Year 1 to Year 2 wOBA (2010-2018); Normal Pearson Correlations

Metric Predictiveness Error
DRC+ 0.50 0.001
wOBA 0.37 0.001
wRC+ 0.37 0.002
OPS+ 0.37 0.001
OPS 0.35 0.002
True Average 0.34 0.002
OBP 0.30 0.002
AVG 0.25 0.002

If we may, let’s take a moment to reflect on the differences in performance we see in Table 3. It took baseball decades to reach consensus on the importance of OBP over AVG (worth five points of predictiveness), not to mention OPS (another five points), and finally to reach the existing standard metric, wOBA, in 2006. Over slightly more than a century, that represents an improvement of 12 points of predictiveness. Just over 10 years later, DRC+ now offers 13 points of improvement over wOBA alone.


 

Reading that, you’re pretty much expecting a DIPS-level revelation.  So let’s see how good DRC+ really is at predicting team switchers.  I put DRC+ on the wOBA scale, normalized each performance to the league-average wOBA that season (it ranged from .315 to .326), and measured the mean absolute error (MAE) of wOBA projections for the next season, weighted by the harmonic mean of the PAs in each season.  DRC+ had a MAE of 34.2 points of wOBA for team-switching position players.  Projecting every team-switching position player to be exactly league average had a MAE of 33.1 points of wOBA.  That’s not a mistake.  After all that build-up, DRC+ is literally worse at projecting team-switching position players than assuming that they’re all league average.
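For anybody who wants to replicate the scoring, here is a minimal sketch of a MAE weighted by the harmonic mean of the two seasons' PAs. The function names and the two sample rows are made up for illustration, and it assumes the projection (raw DRC+ on the wOBA scale, league average, whatever) has already been computed.

```python
def harmonic_mean(a, b):
    """Harmonic mean of two positive PA counts."""
    return 2.0 * a * b / (a + b)


def weighted_mae(rows):
    """rows: iterable of (pa_t, pa_t1, projected_woba, actual_woba) tuples."""
    total_weight = 0.0
    total_error = 0.0
    for pa_t, pa_t1, projected, actual in rows:
        w = harmonic_mean(pa_t, pa_t1)
        total_weight += w
        total_error += w * abs(projected - actual)
    return total_error / total_weight


# Made-up illustration, not real player data.
sample = [
    (100, 100, 0.320, 0.340),  # weight 100, miss of 20 points
    (300, 100, 0.330, 0.320),  # weight 150, miss of 10 points
]
mae_points = 1000 * weighted_mae(sample)  # MAE in "points" of wOBA
```

The harmonic mean keeps a 600 PA season followed by a 5 PA season from counting like a real full-time pair, since the weight is dominated by the smaller of the two PA totals.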

If you want to say something about pitchers at the plate…
[Image: "I don't think so, Homey don't play that"]

 

Even though Jonathan Judge felt like calling me a total asshole incompetent troll last night, I’m going to show how his metric could be not totally awful at this task if it were designed and quality-tested better.  As I noted yesterday, DRC+’s weightings are *way* too aggressive on small numbers of PAs.  DRC+ shouldn’t *need* to be regressed after the fact- the whole idea of the metric is that players should only be getting credit for what they’ve shown they deserve (in the given season), and after a few PAs, they barely deserve anything, but DRC+ doesn’t grasp that at all and its creator doesn’t seem to realize or care that it’s a problem.

If we regress DRC+ after the fact to try to correct that flaw, it’s actually not a dumpster fire.  All weightings are harmonic means of the PAs.  Every position player pair of consecutive 2010-18 seasons with at least 1 PA in each is eligible.  All tables are MAEs, in points of wOBA, for projecting year T+1 wOBA.

First off, I determined the regression amounts for DRC+ and wOBA to minimize the weighted MAE for all position players, and that came out to adding 416 league average PAs for wOBA and 273 league average PAs for DRC+.  wOBA assigns 100% credit to the batter.  DRC+ *still* needs to be regressed 65% as much as wOBA.  DRC+ is ridiculously overaggressive assigning “deserved” credit.
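A minimal sketch of what "adding league average PAs" means and how an amount like 416 or 273 could be found. This is an assumed brute-force search over whole numbers of PAs, not necessarily the exact procedure used for the numbers above, and the toy rows are invented so that the right answer is known in advance.

```python
def regress(rate, pa, lg_rate, k):
    """Blend an observed rate with k plate appearances of league average."""
    return (rate * pa + lg_rate * k) / (pa + k)


def weighted_mae(rows, k):
    """rows: (pa_t, pa_t1, rate_t, lg_rate, actual_t1) tuples.

    Weights are the harmonic mean of the two seasons' PAs.
    """
    num = 0.0
    den = 0.0
    for pa_t, pa_t1, rate_t, lg_rate, actual_t1 in rows:
        w = 2.0 * pa_t * pa_t1 / (pa_t + pa_t1)
        num += w * abs(regress(rate_t, pa_t, lg_rate, k) - actual_t1)
        den += w
    return num / den


def best_k(rows, candidates=range(0, 1001)):
    """Regression amount (in league-average PAs) that minimizes weighted MAE."""
    return min(candidates, key=lambda k: weighted_mae(rows, k))


# Toy data built so that k = 200 is a perfect projection.
toy = [
    (pa, 300, rate, 0.320, regress(rate, pa, 0.320, 200))
    for pa, rate in [(50, 0.400), (250, 0.280), (600, 0.350)]
]
```

With k = 300, a .400 hitter over 100 PA in a .320 league regresses to .340. The bigger the optimal k, the less of the raw number the metric should have been crediting in the first place.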

Table 1.  MAEs for all players

lgavg raw DRC+ raw wOBA reg wOBA reg DRC+
33.21 31.00 33.71 29.04 28.89

Table 2. MAEs for all players broken down by year T PAs

Year T PA lgavg raw DRC+ raw wOBA reg wOBA reg DRC+ T+1 wOBA
1-99 PAs 51.76 48.84 71.82 49.32 48.91 0.284
100-399 PA 36.66 36.64 40.16 34.12 33.44 0.304
400+ PA 30.77 27.65 28.97 25.81 25.91 0.328

Didn’t I just say DRC+ had a problem with being too aggressive in small samples?  Well, this is one area where that mistake pays off.  The hitters who finish a full season with only 1-99 PA are terrible as a group, so being overaggressive about crediting their suckiness works.  But if you’re in a situation like now, where the real players instead of just the scrubs and callups have 1-99 PAs, being overaggressive is terribly inaccurate.  Once the population mean approaches league-average quality, the need for, and benefit of, regression is clear.  If we cheat and regress each bucket to its own population mean, it’s clear that DRC+ wasn’t actually doing anything special in the low-PA bucket.  Regressing toward a league average 36 points of wOBA above that group’s mean just wasn’t a great corrector.

Table 3. (CHEATING) MAEs for all players broken down by year T PAs, regressed to their group means (same regression amounts as above).

Year T PA lgavg raw DRC+ raw wOBA reg wOBA reg DRC+ T+1 wOBA
1-99 PAs 51.76 48.84 71.82 46.17 46.30 0.284
100-399 PA 36.66 36.64 40.16 33.07 33.03 0.304
400+ PA 30.77 27.65 28.97 26.00 25.98 0.328

There’s very little difference between regressed wOBA and regressed DRC+ here.  DRC+ “wins” over wOBA by 0.00015 wOBA MAE over all position players, clearly justifying the massive amount of hype Jonathan Judge pumped us up with.  If we completely ignore the trash position players and only optimize over players who had 100+PA in year T, then the regression amounts increase slightly- 437 PA for wOBA and 286 for DRC+, and we get this chart:

Table 4. MAEs for all players broken down by year T PAs, optimized on 100+ PA players

Year T PA lgavg raw DRC+ raw wOBA reg wOBA reg DRC+ T+1 wOBA
100+ PA 32.55 30.37 32.36 28.32 28.19 0.321
100-399 PA 36.66 36.64 40.16 34.12 33.45 0.304
400+ PA 30.77 27.65 28.97 25.81 25.91 0.328

Nothing to see here either, DRC+ with a 0.00013 MAE advantage again.  Using only 400+PA players to optimize over only changes the DRC+ entry to 25.90, so regressed wOBA wins a 0.00009 MAE victory here.

In conclusion, regressed wOBA and regressed DRC+ are so close that there’s no meaningful difference, and I’d grade DRC+ a microscopic winner.  Raw DRC+ is completely awful in comparison, even though DRC+ shouldn’t need anywhere near this amount of extra regression if it were working correctly to begin with.

I’ve slowrolled the rest of the team-switcher nonsense.  It’s not very exciting either.  I defined 3 classes of players, Stay = played both years entirely for the same team, Switch = played year T entirely for 1 team and year T+1 entirely for 1 other team, Midseason = switched midseason in at least one of the years.
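In code form, the three classes look something like this. It's a sketch that assumes each player-season comes with the list of teams he appeared for that year; the function name and team codes are just for illustration.

```python
def classify(teams_t, teams_t1):
    """Label a pair of consecutive seasons as 'stay', 'switch', or 'mid'.

    teams_t / teams_t1: teams the player appeared for in year T and year T+1.
    """
    if len(set(teams_t)) > 1 or len(set(teams_t1)) > 1:
        return "mid"      # changed teams midseason in at least one of the years
    if set(teams_t) == set(teams_t1):
        return "stay"     # one team, the same one, in both years
    return "switch"       # one team each year, but different teams


labels = [
    classify(["NYY"], ["NYY"]),
    classify(["NYY"], ["BOS"]),
    classify(["NYY", "BOS"], ["BOS"]),
]
```

The midseason check comes first on purpose, so a player traded in July and back with the same club the next year still lands in the midseason group.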

Table 5. MAEs for all players broken down by stay/switch, any number of year T PAs

stay/switch lgavg raw DRC+ raw wOBA reg wOBA reg DRC+ T+1 wOBA
stay 33.21 29.86 32.19 27.91 27.86 0.325
switch 33.12 34.20 37.89 31.57 31.53 0.312
mid 33.29 33.01 36.47 31.67 31.00 0.305
sw+mid 33.21 33.60 37.17 31.62 31.26 0.309

It’s the same story as before.  Raw DRC+ sucks balls at projecting T+1 wOBA and is actually worse than “everybody’s league average” for switchers, regressed DRC+ wins a microscopic victory over regressed wOBA for stayers and switchers.  THERE’S (STILL) LITERALLY NOTHING TO THE CLAIM THAT DRC+, REGRESSED OR OTHERWISE, IS ANYTHING SPECIAL WITH RESPECT TO PROJECTING TEAM SWITCHERS.  These are the same conclusions I found the first time I looked, and they still hold for the current version of the DRC+ algorithm.

 

 

DRC+ weights TTO relatively *less* than BIP after 10 games than after a full season

This is a cut-out from a longer post I was running some numbers for, but it’s straightforward enough and absurd enough that it deserves a standalone post.  I’d previously looked at DRAA linear weights and the relevant chart for that is reproduced here.  This is using seasons with 400+PA.

relative to average PA 1b 2b 3b hr bb hbp k bip out
old DRAA 0.22 0.38 0.52 1.16 0.28 0.24 -0.24 -0.13
new DRAA 0.26 0.45 0.62 1.17 0.26 0.30 -0.24 -0.15
wRAA 0.44 0.74 1.01 1.27 0.27 0.33 -0.26 -0.27

 

I reran the same analysis on 2019 YTD stats, with all position players and with a 25 PA minimum, and these are the values I recovered.  Full year is the new DRAA row above, and the percentages are the percent relative to those values.

1b 2b 3b hr bb hbp k BIP out
YTD 0.13 0.21 0.29 0.59 0.11 0.08 -0.14 -0.10
min 25 PA 0.16 0.27 0.37 0.63 0.12 0.09 -0.15 -0.11
Full Year 0.26 0.45 0.62 1.17 0.26 0.30 -0.24 -0.15
YTD %s 48% 47% 46% 50% 41% 27% 57% 64%
min 25PA %s 61% 59% 59% 54% 46% 30% 61% 74%

So.. this is quite something.  First of all, events are “more-than-half-deserved” relative to the full season after only 25-50 PA.  There’s no logical or mathematical reason for that to be true that quickly, for any reasonable definition of “deserved”.  Second, BIP hits are discounted *LESS* in a small sample than walks are, and BIP outs are discounted *LESS* in a small sample than strikeouts are.  The whole premise of DRC+ is that TTO outcomes belong to the player more than the outcomes of balls in play, and are much more important in small samples, but here we are, with small samples, and according to DRC+, the TTO OUTCOMES ARE RELATIVELY LESS IMPORTANT NOW THAN THEY ARE AFTER A FULL SEASON.  Just to be sure, I reran with wRAA and extracted almost exactly the same values as in chart 1, so there’s nothing super weird going on here.  This is complete insanity.  It’s completely backwards from what’s actually true, and even from what BP has stated is true.  The algorithm has to be complete nonsense to “come to that conclusion”.

Reading the explanation article, I kept thinking the same thing over and over.  There’s no clear logical or mathematical justification for most steps involved, and it’s just a pile of junk thrown together and tinkered with enough to output something resembling a baseball stat most of the time if you don’t look too closely. It’s not the answer to any articulable, well-defined question.  It’s not a credible run-it-back projection (I’ll show that unmistakably in the next post, even though it’s already ruled out by the.. interesting.. weightings above).

Whenever a hodgepodge model is thrown together like DRC+ is, it becomes difficult-to-impossible to constrain it to obey things that you know are true.  At what point in the process did it “decide” that TTO outcomes were relatively less important now?  Probably about 20 different places where it was doing nonsense-that-resembles-baseball-analysis and optimizing functions that have no logical link to reality.  When it’s failing basic quality testing- and even worse, when obvious quality assurance failures are observed and not even commented on (next post)- it’s beyond irresponsible to keep running it out as something useful solely on the basis of a couple of apples-to-oranges comparisons on rigged tests.

 

A new look at the TTOP, plus a mystery

I had the bright idea to look at the familiarity vs. fatigue TTOP debate, which has MGL on the familiarity side and Pizza Cutter on the fatigue side, by measuring performance based on the number of pitches the batter had seen previously and the number of pitches that the pitcher had thrown to other players in between the PAs in question.  After all, a fatigue effect on the TTOP shouldn’t be from “fatigue”, but “relative change in fatigue”, and that seemed like a cleaner line of inquiry than just total pitch count.  Not a perfect one, but one that should pick up a signal if it’s there.  Then I realized MGL had already done the first part of that experiment, which I’d somehow completely forgotten even though I’d read that article and the followup around the time they came out.  Oh well.  It never hurts to redo the occasional analysis to make sure conclusions still hold true.

I found a baseline 15 point PA1-PA2 increase as well as another 15 point PA2-PA3 increase.  I didn’t bother looking at PA4+ because the samples were tiny and usage is clearly changing.  In news that should be surprising to absolutely nobody reading this, PAs given to starters are on the decline overall and the number of PA4+ is absolutely imploding lately.

Season Total PAs 1st TTO 2nd 3rd 4th 5th
2008 116960 42614 40249 30731 3359 7
2009 116963 42628 40186 30736 3406 7
2010 119130 42621 40457 32058 3990 4
2011 119462 42588 40458 32333 4080 3
2012 116637 42506 40336 30741 3050 4
2013 116872 42570 40422 31026 2851 3
2014 117325 42612 40618 31235 2856 4
2015 114797 42658 40245 29580 2314 0
2016 112480 42461 40128 28193 1698 0
2017 110195 42478 39912 26476 1329 0
2018 106051 42146 38797 24057 1051 0

Looking specifically at PA2 based on the number of pitches seen in PA1, I found a more muted effect than MGL did using 2008-2018 data with pitcher-batters and IBB/sac-bunt PAs removed.  My data set consisted of (game,starter,batter,pa1,pa2,pa3) rows where the batter had to face the starter at least twice, the batter wasn’t the pitcher, and any ibb/sac bunt PA in the first three trips disqualified the row (pitch counts do include pitches to non-qualified rows where relevant).  For a first pass, that seemed less reliant on individual batter-pitcher projections than allowing each set of PAs to be biased by crap hitters sac-bunting and good hitters getting IBBd would have been.
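Here is a rough sketch of that row construction, including the count of pitches thrown to other batters between PA1 and PA2. The record layout (`PlateAppearance` and its field names) is hypothetical, not the actual data source, and it assumes the input holds every PA against the starter in game order.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PlateAppearance:  # hypothetical record layout
    game: str
    starter: str
    batter: str
    pitches: int
    batter_is_pitcher: bool = False
    ibb_or_sac_bunt: bool = False


def build_rows(pas):
    """pas: every PA against a starter, in game order."""
    by_start = defaultdict(list)
    for pa in pas:
        by_start[(pa.game, pa.starter)].append(pa)

    rows = []
    for (game, starter), seq in by_start.items():
        trips = defaultdict(list)  # batter -> positions of his PAs in seq
        for i, pa in enumerate(seq):
            trips[pa.batter].append(i)
        for batter, idx in trips.items():
            first3 = idx[:3]
            if len(first3) < 2:
                continue  # never saw the starter a second time
            if any(seq[i].batter_is_pitcher or seq[i].ibb_or_sac_bunt for i in first3):
                continue  # pitcher batting, or an IBB/sac bunt in the first three trips
            # Pitches to everybody else between PA1 and PA2, qualified rows or not.
            between = sum(seq[i].pitches for i in range(first3[0] + 1, first3[1]))
            rows.append({
                "game": game,
                "starter": starter,
                "batter": batter,
                "pa1_pitches": seq[first3[0]].pitches,
                "intervening_pitches": between,
            })
    return rows


demo = [
    PlateAppearance("g1", "SP", "A", 4),
    PlateAppearance("g1", "SP", "B", 3),
    PlateAppearance("g1", "SP", "C", 5),
    PlateAppearance("g1", "SP", "A", 2),
    PlateAppearance("g1", "SP", "B", 1, ibb_or_sac_bunt=True),
]
rows = build_rows(demo)
```

In the demo, batter A qualifies with 8 intervening pitches (3 to B and 5 to C), B is thrown out by the IBB/sac bunt, and C never gets a second look.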

Pitches in PA 1 wOBA in PA 2 Expected** n
1 0.338 0.336 39832
2 0.341 0.335 69761
3 0.336 0.335 79342
4 0.334 0.335 82847
5 0.339 0.337 74786
6 0.347 0.338 51374
7+ 0.349 0.337 36713

MGL found a 15 point bonus for seeing 5+ pitches the first time up (on top of the baseline 10 he found), but I only get about an 11 point bonus on 6+ pitches and 3 points of that are from increased batter/worse pitcher quality (“Expected” is just a batter/pitcher quality measure, not an actual 2nd TTO prediction). The SD of each bucket is on the order of .002, so it’s extremely likely that this effect is real, and also likely that it’s legitimately smaller than it was in MGL’s dataset, assuming I’m using a similar enough sampling/exclusion method, which I think I am.  It’s not clear to me that that has to be an actual familiarity effect, because I would naively expect to see more of a monotonic increase throughout the number of pitches seen instead of the J-curve, but the buckets have just enough noise that the J-curve might simply be an artifact anyway, and short PAs are an odd animal in their own right as we’ll see later.
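A back-of-the-envelope check on that bucket SD, as a sketch. The per-PA standard deviation of wOBA used here (0.5) is my assumption for illustration and not a number measured from this dataset; the bucket sizes are the n column from the table above.

```python
import math

PER_PA_SD = 0.5  # assumed spread of single-PA wOBA outcomes, not measured here

# n for each "pitches in PA 1" bucket, copied from the table above
bucket_n = {"1": 39832, "2": 69761, "3": 79342, "4": 82847,
            "5": 74786, "6": 51374, "7+": 36713}


def standard_error(per_pa_sd, n):
    """Standard error of a bucket's mean wOBA over n independent PAs."""
    return per_pa_sd / math.sqrt(n)


bucket_se = {k: standard_error(PER_PA_SD, n) for k, n in bucket_n.items()}
```

Under that assumption every bucket comes out between roughly .0017 and .0026, which is where "on the order of .002" comes from, and why a bonus several times that size is unlikely to be noise.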

Doing the new part of the analysis, looking at the wOBA difference in PA2-PA1 based on the number of intervening pitches to other batters, I wasn’t sure I was going to find much fatigue evidence early in the game, but as it turns out, the relationship is clear and huge.

intervening pitches wOBA PA2-PA1 vs base .015 TTOP n
<=20 -0.021 -0.036 9476
21 -0.005 -0.020 5983
22 -0.005 -0.020 8652
23 0.004 -0.011 11945
24 0.000 -0.015 15683
25 0.004 -0.011 19592
26 0.001 -0.014 23057
27 0.005 -0.010 26504
28 0.009 -0.006 29690
29 0.015 0.000 31453
30 0.021 0.006 32356
31 0.014 -0.001 32250
32 0.020 0.005 30723
33 0.018 0.003 28390
34 0.027 0.012 25745
35 0.028 0.013 22407
36 0.023 0.008 18860
37 0.030 0.015 15429
38 0.025 0.010 12420
39 0.012 -0.003 9558
40 0.045 0.030 7362
41-42 0.032 0.017 9241
43+ 0.027 0.012 7879
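To put a number on the trend in that table, here is a simple unweighted least-squares slope through the single-pitch-count rows (21 through 40). It's a sketch: it ignores the bucket sizes and leaves out the pooled <=20, 41-42, and 43+ rows, so treat the result as a rough summary only.

```python
# (intervening pitches, PA2-PA1 wOBA change), copied from the table above
points = [
    (21, -0.005), (22, -0.005), (23, 0.004), (24, 0.000), (25, 0.004),
    (26, 0.001), (27, 0.005), (28, 0.009), (29, 0.015), (30, 0.021),
    (31, 0.014), (32, 0.020), (33, 0.018), (34, 0.027), (35, 0.028),
    (36, 0.023), (37, 0.030), (38, 0.025), (39, 0.012), (40, 0.045),
]


def ols_slope(pts):
    """Ordinary least-squares slope of y on x."""
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    return sxy / sxx


slope_per_pitch = ols_slope(points)  # wOBA change per extra intervening pitch
```

That lands right around .002 of wOBA per intervening pitch.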

That’s a monster effect, 2 points of TTOP wOBA per intervening pitch with an unmistakable trend.  Jackpot.  Hareeb’s a genius.  That’s big enough that it should result in actionable game situations all the time.  Let’s look at it in terms of actual 2nd time wOBAs (quality-adjusted).

intervening pitches PA2 wOBA (adj)
<=20 0.339
21 0.346
22 0.343
23 0.344
24 0.340
25 0.341
26 0.339
27 0.339
28 0.337
29 0.340
30 0.341
31 0.338
32 0.347
33 0.336
34 0.345
35 0.344
36 0.336
37 0.340
38 0.328
39 0.335
40 0.340
41-42 0.338
43+ 0.344

Wait what??!?!? Those look almost the same everywhere.  If you look closely, the higher-pitch-count PA2 wOBAs even average out to be a tad (4-5 points) *lower* than the low-pitch-count ones (and the same for PA1-PA3, though that needs a closer look). If I didn’t screw anything up, that can only mean..

intervening pitches PA1 wOBA (adj)
<=20 0.361
21 0.351
22 0.348
23 0.339
24 0.340
25 0.336
26 0.338
27 0.335
28 0.327
29 0.325
30 0.320
31 0.325
32 0.326
33 0.319
34 0.318
35 0.316
36 0.312
37 0.311
38 0.303
39 0.323
40 0.295
41-42 0.306
43+ 0.318

Yup.  The number of intervening pitches TO OTHER BATTERS between somebody’s first and second PA has a monster “effect on” the PA1 wOBA.  I started hand-checking more rows of pitch counts and PA results, you name it.  I couldn’t believe this was possibly real.  I asked one of my friends to verify that for me, and he did, and I mentioned the “effect” to Tango and he also observed the same pattern.  This is actually real.  It also works the same way between PA2 and PA3. I couldn’t keep looking at other TTOP stuff with this staring me in the face, so the rest of this post is going down this rabbit hole showing my path to figuring out what was going on.  If you want to stop here and try to work it out for yourself, or just think about it for awhile before reading on, I thought it was an interesting puzzle.

It’s conventional sabermetric wisdom that the box-score-level outcome of one PA doesn’t impart giant predictive effects, but let’s make sure that still holds up.

Reached base safely in PA1 PA2 wOBA (adj) Batter quality Pitcher quality
Yes 0.348 0.338 0.339
No 0.336 0.334 0.336

That’s a 12 point effect, but 7 of it is immediately explained by talent differences (4 points of batter quality and 3 of pitcher quality), and given the plethora of other factors I didn’t control for, all of which will also skew hitter-friendly like the batter and pitcher quality did, there’s just nothing of any significance here.  Maybe the effect is shorter-term than that?

Reached base safely in PA1 Next batter wOBA (adj) Next batter quality Pitcher quality
Yes 0.330 0.337 0.339
No 0.323 0.335 0.336

A 7 point effect where 5 is immediately explained by talent.  Also nothing here.  Maybe there’s some effect on intervening pitch count somehow?

Reached base safely in PA1 Average intervening pitches intervening wOBA (adj)
Yes 30.58 0.3282
No 30.85 0.3276

Barely, and the intervening batters don’t even hit quite as well as expected given that we know the average pitcher is 3 points worse in the Yes group.  Alrighty then.  There’s a big “effect” from intervening pitch count on PA1 wOBA, but PA1 wOBA has minimal to no effect on intervening pitch count, intervening wOBA, PA2 wOBA, or the very next hitter’s wOBA.  That’s… something.

In another curious note to this effect,

intervening pitches intervening wOBA (adj)
<=20 0.381
21 0.373
22 0.363
23 0.358
24 0.351
25 0.344
26 0.343
27 0.335
28 0.333
29 0.328
30 0.324
31 0.322
32 0.319
33 0.316
34 0.316
35 0.312
36 0.310
37 0.310
38 0.307
39 0.311
40 0.308
41-42 0.309
43+ 0.311

Another monster correlation, but that one has a much simpler explanation: short PAs show better results for hitters.

Pitches in PA wOBA (adj) n
1 0.401 133230
2 0.383 195614
3 0.317 215141
4 0.293 220169
5 0.313 198238
6 0.328 133841
7 0.347 57396
8+ 0.369 37135

Throw a bunch of shorter PAs together and you get the higher aggregate wOBA seen in the table right above this one. It seems like the PA length effect has to be a key.  Maybe there’s a difference in the next batter’s pitch distribution depending on PA1?

Pitches in PA Fraction of PA after reached base Fraction of PA after out wOBA after reached base wOBA after out OBP after reached base OBP after out
1 0.109 0.089 0.394 0.402 0.362 0.359
2 0.164 0.158 0.375 0.376 0.348 0.343
3 0.183 0.182 0.308 0.303 0.284 0.278
4 0.186 0.191 0.289 0.276 0.299 0.281
5 0.165 0.174 0.311 0.301 0.339 0.323
6 0.112 0.120 0.323 0.320 0.367 0.360
7 0.049 0.052 0.346 0.339 0.393 0.386
8+ 0.032 0.034 0.356 0.360 0.401 0.405

Now we’re cooking with gas.  That’s a huge likelihood ratio difference for 1-pitch PAs (0.109/0.089, about 1.22).  Using our PA1 OBP of about .324, we’d expect to see a PA1 OBP of .370 given a 1-pitch followup PA, which is exactly what we get.  The longer PAs are weighted more toward previous outs because the odds ratio favors outs once we get to 4 pitches.

Next PA pitches This PA1 OBP This PA1 wOBA
1 0.370 0.373
2 0.333 0.332
3 0.326 0.325
4 0.319 0.318
5 0.313 0.313
6 0.311 0.313
7 0.314 0.310
8 0.313 0.309
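The odds-ratio arithmetic behind that .370 is just Bayes' rule in odds form. A minimal sketch, using the rounded "fraction of PA" columns from two tables up as the likelihoods (so the outputs will be close to, but not identical to, the observed values):

```python
def posterior_prob(prior, p_obs_given_yes, p_obs_given_no):
    """Update a probability with a likelihood ratio (Bayes' rule in odds form)."""
    prior_odds = prior / (1.0 - prior)
    post_odds = prior_odds * (p_obs_given_yes / p_obs_given_no)
    return post_odds / (1.0 + post_odds)


PRIOR_OBP = 0.324  # overall PA1 OBP quoted in the text

# Share of next-batter PAs lasting 1, 2, ..., 7, 8+ pitches
frac_after_reached = [0.109, 0.164, 0.183, 0.186, 0.165, 0.112, 0.049, 0.032]
frac_after_out = [0.089, 0.158, 0.182, 0.191, 0.174, 0.120, 0.052, 0.034]

# Implied PA1 OBP given the length of the very next PA
implied_pa1_obp = [
    posterior_prob(PRIOR_OBP, r, o)
    for r, o in zip(frac_after_reached, frac_after_out)
]
```

The first entry is the 1-pitch case: prior odds of .324/.676 times a likelihood ratio of about 1.22 comes out to roughly .370.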

It seems like this should be a big cause of the observed effect. I used the 2nd/6th and 3rd/7th columns from two tables up to create a process that would “play through” the next 8 PAs starting after an out or a successful PA, deciding on the number of pitches and then whether it was an out or not based on the average values.  Then I calculated the expected OBP for PA1 based on the likelihood ratios of each number of total pitches to happen (the same way I got .370 from the odds ratio for a 1-pitch followup PA).

As it turns out, that effect alone can reproduce the shape and a little over half the spread

intervening pitches PA1 OBP (adj) model PA1 OBP
<=20 0.366 0.340
21 0.351 0.336
22 0.349 0.329
23 0.339 0.338
24 0.343 0.332
25 0.336 0.328
26 0.335 0.327
27 0.335 0.328
28 0.328 0.328
29 0.325 0.326
30 0.320 0.326
31 0.324 0.323
32 0.324 0.323
33 0.318 0.321
34 0.318 0.324
35 0.317 0.323
36 0.312 0.317
37 0.313 0.318
38 0.307 0.320
39 0.320 0.310
40 0.300 0.317
41-42 0.308 0.309
43+ 0.320 0.317

and that simple model is deficient at a number of things (correlations longer than 1 pa, different batters, base-out states, etc).  I don’t know everything that’s causing the effect, but I have a good chunk of it, and that reverse pitch count selection bias isn’t something I’ve ever seen mentioned before.  This is also a caution to any kind of analysis involving pitch counts to be very careful to avoid walking into this effect.
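For the curious, here is a sketch of the kind of play-through model described above, done as an exact probability calculation instead of a simulation. The choices in it are my assumptions for illustration: 8 intervening PAs, an 8+ pitch PA counted as exactly 8 pitches, and each PA depending only on whether the previous batter reached. It won't reproduce the model column above exactly.

```python
PRIOR_OBP = 0.324
N_FOLLOWING = 8  # PAs between a hitter's PA1 and PA2 (assumed, 9-man order)

# Index 0..7 = PAs of 1, 2, ..., 7, 8+ pitches (8+ treated as exactly 8).
# Keys: did the previous batter reach base?  Values from the table above.
FRAC = {
    True: [0.109, 0.164, 0.183, 0.186, 0.165, 0.112, 0.049, 0.032],
    False: [0.089, 0.158, 0.182, 0.191, 0.174, 0.120, 0.052, 0.034],
}
OBP = {
    True: [0.362, 0.348, 0.284, 0.299, 0.339, 0.367, 0.393, 0.401],
    False: [0.359, 0.343, 0.278, 0.281, 0.323, 0.360, 0.386, 0.405],
}


def total_pitch_distribution(pa1_reached):
    """P(total pitches over the next N_FOLLOWING PAs | PA1 outcome)."""
    state = {(0, pa1_reached): 1.0}  # (pitches so far, previous batter reached)
    for _ in range(N_FOLLOWING):
        nxt = {}
        for (total, prev), p in state.items():
            for i, frac in enumerate(FRAC[prev]):
                p_reach = OBP[prev][i]
                for reached, q in ((True, p_reach), (False, 1.0 - p_reach)):
                    key = (total + i + 1, reached)
                    nxt[key] = nxt.get(key, 0.0) + p * frac * q
        state = nxt
    dist = {}
    for (total, _), p in state.items():
        dist[total] = dist.get(total, 0.0) + p
    return dist


AFTER_REACH = total_pitch_distribution(True)
AFTER_OUT = total_pitch_distribution(False)


def model_pa1_obp(total_pitches):
    """PA1 OBP implied by the intervening pitch count, via the odds ratio."""
    odds = PRIOR_OBP / (1.0 - PRIOR_OBP)
    odds *= AFTER_REACH[total_pitches] / AFTER_OUT[total_pitches]
    return odds / (1.0 + odds)
```

Calling `model_pa1_obp(20)` versus `model_pa1_obp(40)` shows the same direction as the real data: low intervening pitch counts imply a better PA1, purely from the selection effect.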

 

A look at DRC+’s guts (part 1 of N)

In trying to better understand what DRC+ changed with this iteration, I extracted the “implied” run values for each event by finding the best linear fit to DRAA over the last 5 seasons.  To avoid regression hell (and the nonsense where walks can be worth negative runs when pitchers draw them), I only used players with 400+ PA.  To make sure this should actually produce reasonable values, I did the same for wRAA.

relative to average out 1b 2b 3b hr bb hbp k bip out
old DRAA 0.419 0.416 0.75 1.37 0.44 0.41 -0.08 0.03
new DRAA 0.48 0.57 0.56 1.36 0.44 0.49 -0.06 0.02
wRAA 0.70 1.00 1.27 1.53 0.54 0.60 0.01 0.00

Those are basically the accepted linear weights in the wRAA row, but DRAA seems to have some confusion around the doubles.  In the first iteration, doubles came out worth fewer runs than singles, and in the new iteration, triples come out worth fewer runs than doubles.  Pepsi might be ok, but that’s not.
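The "best linear fit" idea is ordinary least squares of runs above average on event counts. A minimal sketch, assuming numpy is available and using synthetic player-seasons instead of the real DRAA data (the event rates are invented, and the "true" weights are just the wRAA row above, used as a known answer to recover):

```python
import numpy as np

EVENTS = ["1b", "2b", "3b", "hr", "bb", "hbp", "k", "bip_out"]


def implied_run_values(counts, runs_above_avg):
    """Least-squares weight per event.

    counts: (player-seasons x events) array of event totals.
    runs_above_avg: one DRAA/wRAA-style total per player-season.
    """
    weights, *_ = np.linalg.lstsq(counts, runs_above_avg, rcond=None)
    return dict(zip(EVENTS, weights))


# Synthetic check: build runs from known weights, then recover them.
rng = np.random.default_rng(0)
true_weights = np.array([0.70, 1.00, 1.27, 1.53, 0.54, 0.60, 0.01, 0.00])
season_rates = [100, 25, 3, 18, 50, 5, 110, 280]  # invented per-season averages
counts = rng.poisson(lam=season_rates, size=(200, 8)).astype(float)
fitted = implied_run_values(counts, counts @ true_weights)
```

With a metric that really is linear in these events, the fit hands the weights straight back. With something like DRAA, it returns the closest linear approximation, which is how you end up with doubles "worth" less than singles.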

If we force the 1b/2b/3b ratio to conform to the wRAA ratios and regress again (on 6 free variables instead of 8), then we get something else interesting.

relative to average PA 1b 2b 3b hr bb hbp k bip out
old DRAA 0.22 0.38 0.52 1.16 0.28 0.24 -0.24 -0.13
new DRAA 0.26 0.45 0.62 1.17 0.26 0.30 -0.24 -0.15
wRAA 0.44 0.74 1.01 1.27 0.27 0.33 -0.26 -0.27

Old DRAA was made up of about 90% of TTO runs and 50% of BIP runs, and that changed to about 90% of TTO runs and 60% of BIP runs in the new iteration.  So it’s like the component wOBA breakdown Tango was doing recently, except regressing the TTO component 10% and the BIP part 40% (down from 50%).
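Forcing the 1b/2b/3b ratio can be done by collapsing the three hit columns into one pre-weighted column and fitting a single scale factor for it. A sketch under the same assumptions as any toy version of this (numpy, synthetic counts, invented event rates), with the wRAA ratios taken from the table above:

```python
import numpy as np

HIT_RATIO = np.array([0.44, 0.74, 1.01])  # wRAA 1b/2b/3b values from the table


def constrained_fit(counts, runs_above_avg):
    """counts columns: 1b, 2b, 3b, hr, bb, hbp, k, bip_out.

    Returns 8 weights with 1b:2b:3b pinned to HIT_RATIO.
    """
    hits = counts[:, :3] @ HIT_RATIO                 # one combined non-HR hit column
    design = np.column_stack([hits, counts[:, 3:]])  # 6 free variables instead of 8
    coef, *_ = np.linalg.lstsq(design, runs_above_avg, rcond=None)
    scale, rest = coef[0], coef[1:]
    return np.concatenate([scale * HIT_RATIO, rest])


# Synthetic check: BIP hits credited at 60% of wRAA, other events at known values.
rng = np.random.default_rng(1)
true_weights = np.concatenate([0.6 * HIT_RATIO, [1.17, 0.26, 0.30, -0.24, -0.15]])
counts = rng.poisson(lam=[100, 25, 3, 18, 50, 5, 110, 280], size=(200, 8)).astype(float)
fitted = constrained_fit(counts, counts @ true_weights)
```

The fitted scale on the combined hit column is the "percent of BIP runs" number: about 0.5 for old DRAA and about 0.6 for the new one.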

I also noticed that there was something strange about the total DRAA itself.  In theory, the aggregate runs above average should be 0 each year, but the new version of DRAA managed to uncenter itself by a couple of percent (that’s about -2% of total runs scored each season)

year old DRAA new DRAA
2010 210.8 -559.1
2011 127.9 -550
2012 226.8 -735.9
2013 190.4 -447.5
2014 33.7 -659.9
2015 60.1 -89.1
2016 63.3 -401.2
2017 -37.8 -318.3
2018 -50.2 -240.4

Breaking that down into full-time players (400+ PA), part-time position players (<400 PA), and pitchers, we get

2010-18 runs old DRAA new DRAA wRAA
Full-time 13912 11223 15296
part-time -6033 -7850 -9202
pitchers -7054 -7369 -6730
total 825 -3996 -636

I don’t know why it decided players suddenly deserved 4800 fewer runs, but here we are, and it took 520 offensive BWARP (10% of their total) away from the batters in this iteration too, so it didn’t recalibrate at that step either.  This isn’t an intentional change in replacement level or anything like that. It’s just the machine going haywire again without sufficient internal or external quality control.

 

US sports unions are so, so screwed

TL;DR It’s always a good time to be a billionaire, but when you get to exploit people with super-short prime earning periods, it’s even better.

I’ve been seeing chatter about potential upcoming labor unrest in the NFL, the NHL and NBA both had a stoppage this decade, and baseball players haven’t been very happy about the lack of progress on the Harper and Machado fronts.  Furthermore, this is an era where norms have been giving way to the raw exercise of power, so I thought it would be interesting to look at upcoming negotiations under the assumption that the owners were going to try to make more money and that the players were willing to be extremely antagonistic.

Sports labor negotiations are positive sum- if the games are played, owners and players alike are much better off, over a wide range of revenue splits, than if the games weren’t played.  Under an absolute take-it-or-leave-it-forever ultimatum, the players would be willing to play for far less, and the owners would be willing to pay the players more.  The former is true because the four leagues mentioned are destination leagues- there’s nowhere else to play baseball, football, basketball, or hockey for nearly as much money.  The owners would be willing to pay more because less profit is still better than no profit.  If there were alternative markets (MLS is nowhere close to a destination league for soccer, for example), the following analysis wouldn’t be relevant.

None of the owners are going to suggest a revenue split anywhere near the minimum players might accept in a pure ultimatum (KHL might pay 20% of NHL at the top end, NPB, KBO, and EuroLeague are much worse).  That would be reducing player revenue share from ~50% to ~5-10%.  Nobody’s stupid enough to even float a proposal like that.  How much higher the owners would be willing to go is a much more interesting question.

There are reports of team revenue and operating income (profit), but if you’re skeptical of those numbers, there’s a fairly safe way to estimate an upper bound on profit.  Whatever a franchise valuation is, would the owners still be happy to own it if they also had to dump X% of the valuation into a black hole every year? If X is 0.01%, sure- that’s a 400k/year extra cost to own the Yankees (4 billion franchise value), and that’s not going to move the needle at all.  They make far, far more than that.  If X is 20%, hell no- 800MM/year down the toilet to own the Yankees would be completely insane.  Even 5% (200MM) seems like a bad idea in normal times, but let’s run with that and see where it gets us.

League averages (millions) Franchise Valuation Revenue Profit 5% valuation Profit % of Revenue Payroll % Previous CBA %
MLB 1645 315 29 82.25 9 54 N/A
NFL 2500 412.5 101 125 24 ~48 53
NBA 1650 246.7 52 82.5 21 ~50 50/57
NHL 630 157 25 31.5 16 50 57

(Source: Forbes articles)
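The derived columns in that table are simple arithmetic, sketched here so the "black hole" bound can be rerun with a different X. The league figures are copied from the table; the function name is just for illustration.

```python
# (franchise valuation, revenue, reported profit) league averages, $ millions
LEAGUES = {
    "MLB": (1645, 315, 29),
    "NFL": (2500, 412.5, 101),
    "NBA": (1650, 246.7, 52),
    "NHL": (630, 157, 25),
}


def profit_upper_bound(valuation, pct_of_valuation=5.0):
    """Yearly 'black hole' cost an owner would supposedly refuse to eat."""
    return valuation * pct_of_valuation / 100.0


summary = {
    name: {
        "bound_5pct": profit_upper_bound(valuation),
        "profit_pct_of_revenue": round(100.0 * profit / revenue),
    }
    for name, (valuation, revenue, profit) in LEAGUES.items()
}

yankees_tiny = profit_upper_bound(4000, 0.01)  # 0.01% of a 4 billion valuation
```

At 0.01% of a 4 billion valuation the cost is 0.4MM (the 400k in the text), and at 5% the average MLB team's bound is 82.25MM against 29MM of reported profit.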

MLB is structured differently, so maybe the profit % actually is lower because teams bid against each other with no hard cap, or maybe it’s fudged lower because it’s not a number that has to be signed off on by the players.  Either way, players could attempt to capture 60% of revenue as payroll, and outside of MLB (and maybe even in MLB), the owners would say yes on an ultimatum, since it’s not that far above previous CBA levels.  Let’s create a hypothetical league that’s an amalgam of the non-MLB leagues to work with and assume that the owners come with some proposal around 45% of revenue to payroll in the next CBA and the players counter with 60%.

League averages (millions) Franchise Valuation Revenue Profit Profit % of Revenue Payroll % Non-payroll expenses
Amalgam (current) 1800 300 60 20 50 30% / 90M
Owner offer 1800 300 75 25 45 30% / 90M
Player offer 1800 300 30 10 60 30% / 90M

If the owners cancel a season and win- the players come back the next year at 45%- then in 4 years total time, they’ll make -90MM*4 (expenses) - 45%*300MM*3 (payroll) + 300MM*3 (revenue) = 135MM in profit, and the players threw away 45%*300MM = 135MM by holding out and then folding (and cost the owners 165MM relative to the 4*75 = 300MM they’d have made with no stoppage).  If the owners had just accepted the player offer from the start, they’d have made 4*30 = 120MM in profit.  So they make up for this really fast if they win.

On the player side, if they hold out and win- the owners agree to 60%- then in 4 years total time, they earn 3*60%*300 = 540MM, exactly what they would have earned by accepting the owner offer initially (4*45%*300 = 540MM), and the owners threw away 120MM by holding out and then folding.  The players also make up for this really fast if they win (ignoring “harm to the game” effects, which hurt both sides).
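Here are the four scenarios as a small sketch, using the amalgam league numbers from the table (300MM revenue, 90MM non-payroll expenses) and assuming, as the arithmetic above does, that the non-payroll expenses still get paid in a cancelled season. Function names are mine.

```python
REVENUE = 300.0   # $ millions per season, amalgam league
EXPENSES = 90.0   # non-payroll expenses per year, paid even with no games
YEARS = 4


def owner_profit(player_share, seasons_played):
    """Owner profit over YEARS when only seasons_played seasons happen."""
    return seasons_played * REVENUE * (1.0 - player_share) - YEARS * EXPENSES


def player_pay(player_share, seasons_played):
    """Total payroll over the seasons that are actually played."""
    return seasons_played * REVENUE * player_share


scenarios = {
    "owners win after stoppage": (owner_profit(0.45, 3), player_pay(0.45, 3)),
    "owners accept 60% up front": (owner_profit(0.60, 4), player_pay(0.60, 4)),
    "players win after stoppage": (owner_profit(0.60, 3), player_pay(0.60, 3)),
    "players accept 45% up front": (owner_profit(0.45, 4), player_pay(0.45, 4)),
}
```

Changing `YEARS` (and the seasons played to match) is the quick way to see how the break-even point moves: the side that wins the stoppage is ahead on any horizon longer than about 4 years.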

This looks like it might be a difficult kind of battle to handicap, but it’s not, for two main reasons.  The first is that the owner timescale is clearly longer than 4 years.  They can make decisions to maximize profit or future franchise value that far down the road, easily.  The sports unions however are not in the business of maximizing the amount of money that goes to players- they’re in the business of maximizing the amount of money that goes to *current voting members*, or more precisely, a bloc of current voting members large enough to certify a new agreement.

The last column in the top table shows the players’ share under the previous CBA.  The NBA had a stoppage in 2011 when the owners tried to drop the players’ revenue share from 57% to 47% with a harder salary cap, and after the stoppage, the players settled for 50% and a worse-but-not-as-bad-as-originally-offered cap change.  The NHL lockout in 2012 was close to what’s being discussed- the league was trying to drop salary from 57% to 43% (the reverse of a 32% increase) with a bunch of player-unfriendly contract terms as well, and settled for 50% without the contract issues.  The NFL lockout in 2011 (no games missed) was an attempt to drop from 53% to 42% and lengthen the season.  They settled for 48%.

Given that the average career length ranges from 3.5 years (NFL) to 5.6 (MLB) and medians are lower, it’s actually impressive leadership- or, more likely, complete player delusion about their expected future career length and anger about something they had being taken away- to get many takers on a threatened holdout that only pays off if you’re still playing 4+ years later. The owners won- huge- in all three lockouts. NHL and NBA owners got an extra 7% of revenue over 10 years at the cost of a few percent of revenue in the first year.  NFL owners got 5% extra for 10 years for nothing.

Perhaps, if the NBA and NHL had aimed a little more conservatively (say, proposing 57% to 50% with the intent of settling at 52% with no games missed), they could have come out even better, but it’s not clear that they would have.  As it was, the NHLPA offered settlements at 54% instead of even trying to fully defend its territory, as did the NBPA at 53%, and they might have stuck harder to those numbers in the face of a more reasonable proposal.

It’s hard to find any example of the players outright winning a labor dispute or CBA negotiation since 1990.  Even following 1994 MLB, the players conceded ground- they averted disaster, in that they avoided the salary cap and hard-line revenue sharing, but they agreed to luxury tax numbers, and that’s just one of a number of anti-spending measures MLB has adopted since.  They can’t directly negotiate salary percentages down, so instead they reduce the club-level financial rewards of winning to limit salary growth.  Every form of revenue sharing, luxury tax, lost free agent compensation, etc. decreases the marginal revenue from spending and thereby works to suppress payroll.

Players might be able to fight back and get a consensus somewhere around a 50% jump (40% of revenue to 60%) if it were guaranteed to succeed, but of course, it’s not.  The owners would say yes to a pure ultimatum, but how do the players make it an ultimatum? It’s well-known that the best strategy in a game of chicken between non-suicidal players is to be the first one to throw your steering wheel out the window where the other person can see it.  By visibly taking away your options, you’ve left the other player in a swerve or die scenario, and you win.  Unfortunately for the players, they have no way to do that, and they’ve been demonstrably weak in every sport even when they’ve taken it to a holdout.  Against that backdrop, the owners haven’t quite thrown their steering wheel away, but the players should have absolutely no expectation that the owners will be in any kind of a hurry to use it.

The closest to strong the players have been is 1994 MLB, and that was in response to the league trying to unilaterally impose a salary cap and revenue sharing, after the owners had already been blatantly colluding to suppress free-agent contracts.  Not “collusion” in quotes, but literally the commissioner publicly telling teams that long contracts were bad and the owners paying out multiple settlements for hundreds of millions of 1990s dollars.  And in the face of all of that, the players only stayed where they were and then conceded ground shortly after.  “Winning”, for a modern sports union, is now defined as “not losing ground horribly”.

The takeaway from that is that even if player share of revenue continues to drop to the 40% range where the players appear to have a reasonably credible ultimatum-level threat, they still don’t because they’ll just fold to a lesser offer.  If players were trying to go from 40% to 60%, and the ownership (miraculously) countered with 55%, the players would trip over themselves to ratify that agreement.  And they’d do the same thing at 50% and 45%.  Assuming they have the self-awareness to understand that in advance (the owners certainly do), they know they don’t have a credible threat at 40% (sitting out a season to go from 40% to 45% is moronic even with guaranteed success).  And in the same vein, sitting out a season to avoid going from 50% to 45% feels worse, because they’re benchmarked at 50%, but it’s equally moronic.

It’s also moronic for the owners to actually follow through with it, but the owners have three things going for them: the players have folded so many times in a row, the players’ shorter timescale means that holding out cuts far more strongly against their individual dollar self-interest, and the players’ marginal utility of money is much higher than that of the zillionaire owners/conglomerates.  The players are likely only going to stay irrational for so long, and it’s a well-calculated risk at this point that they’re just going to fold again before too much damage is done.  The true floor of what the players will play for is still nowhere in sight IMO.

The upcoming NFL renegotiation in 2021 has all the makings of a total bloodbath for the players.  NFL players are in the worst position to defend themselves, with the shortest careers, and yet the union is already saber-rattling, two years in advance, with talk of reclaiming what they lost in the last CBA, and players like Richard Sherman are saying players have to be willing to strike.  That’s true, but… being willing to strike doesn’t mean you’re actually going to get your money back, and if players really are willing to strike without realizing that there’s a good chance that it’s just going to completely blow up in their faces, and a very high chance that most individuals come out worse even if they somehow fully win the dispute after only missing half a season… well, good luck with that.  The players are going to come in thinking they’re going to make gains, and if the owners channel their inner Nate Diaz and give the players the double birds while they wait for the inevitable tapout at something under 45% of revenue… well, I guess I can pat myself on the back.  !remindme 2 years.

Their best hope, and it’s a slim one at that, is that the NFL owners simply aren’t in a mood for a fight.  The NBPA skated through a negotiation period in 2017 with minimal changes (and the ones approved look to me to be more like “good governance of league operations” agreements than one side trying to get over on the other), most likely because leaguewide revenues were absolutely exploding along with attendance, TV ratings, merchandise sales, etc. and neither side wanted to battle when they were both making more money than they’d even dreamed of a couple of years prior.  Maybe the NFLPA knows it has no chance in a lockout and is just trying to bluff the owners into not fighting or into aiming for fewer concessions- after all, the head of the union isn’t getting elected over and over by telling the membership that they’re all going to bend over and take it every time the owners come looking for more, even if he knows that’s true.

On the other hand, MLB players who’ve spoken out appear to be confused on a different level.  They think owners have started colluding again, and while I can’t rule that out, especially given their history, the situation appears to me to be explainable by a confluence of three factors.  First, teams are much smarter analytically and realize that big free-agent contracts to older players have been piss-poor investments (and may actually be getting worse post-steroid-era).  Second, teams are spending with more of an eye to marginal revenue than ever before.  Third, the anti-spending measures MLB has been winning concessions on for at least the last 25 years have really started coming home to roost.  Teams have been explicitly not spending money because of the luxury tax, and it should have been obvious that this sort of thing would happen more.  The owners wouldn’t have been harping on anti-spending measures for longer than most of the players have been alive if they hadn’t expected it to yield dividends.

That being said, MLB players are *still* in a better position than the other three leagues, although it’s likely to keep decaying, and trying to get much more money is like blood from a stone at this point, especially if the operating revenue estimates above are close to accurate.  MLB is harder to understand than “bargain for X% of revenue, then talk about how it gets divided” leagues, but the players- or at least enough of them that an informed union can negotiate on reality-based terms- need to understand that they’re 100% “getting screwed” currently by the concessions they’ve repeatedly made to the owners since the 1994 stoppage and most likely not getting screwed harder by a sudden recurrence of prohibited behavior.

 

2/05/19 DRC+ update- some partial fixes, some new problems

BP released an update to DRC+ yesterday purporting to fix/improve several issues that have been raised on this blog.  One thing didn’t change at all though- DRC+ still isn’t a hitting metric.  It still assigns pitchers artificially low values no matter how well they hit, and the areas of superior projection (where actually true) are largely driven by this.  The update claimed two real areas of improvement.

Valuation

The first is in treating outlier players.  As discussed in “C’mon Man- Baseball Prospectus DRC+ Edition”, by treating player seasons individually and regressing them, instead of treating careers, DRC+ will continually fail to realize that outliers are really outliers. Their fix is, roughly, to make a prior distribution based on all player performances in surrounding years, and hopefully not regress the outliers as much because it realizes something like them might actually exist.  That mitigates the problem a little, sometimes, but it’s still an essentially random fix.  Some cases previously mentioned look better, and others, like Don Kessinger vs. Larry Bowa, still don’t make any sense at all.  They’re very similar offensive players, in the same league, overlapping in most of their careers, and yet Kessinger gets bumped from a wRC+ of 72 to a DRC+ of 80 while Bowa only goes from 70 to 72, even though Kessinger was *more* TTO-based.

To their credit- or at least to credit their self-awareness, they seem to know that their metric is not reliable at its core for valuation.  Jonathan Judge says

“As always, you should remember that, over the course of a career, a player’s raw stats—even for something like batting average—tend to be much more informative than they are for individual seasons. If a hitter consistently seems to exceed what DRC+ expects for them, at some point, you should feel free to prefer, or at least further account for, the different raw results.”

Roughly translated, “Regressed 1-year performance is a better estimation of talent than 1-year raw performance, but ignoring the rest of a player’s career and re-estimating talent 1 year at a time can cause discrepancies, and if it does, trust the career numbers more.” I have no argument with that.  The question remains how BP will actually use the stat- if we get more fluff pieces on DRC+ outliers who are obviously just the kind of career discrepancies Judge and I talked about, that’s bad.  If it is mainly used to de-luck balls in play for players who haven’t demonstrated that they deserve much outlier consideration, that’s basically fine and definitely not the dumbest thing I’ve seen lately.

 

This, on the other hand, well might be.

NAME YEAR PA BB DRC+ DRC+ SD DRAA
Mark Melancon 2011 1 1 -3 2 -0.1
Dan Runzler 2011 1 1 -17 2 -0.1
Matt Guerrier 2011 1 1 -13 2 -0.1
Santiago Casilla 2011 1 1 -12 2 -0.1
Josh Stinson 2011 1 1 -15 2 -0.1
Jose Veras 2011 1 1 -14 2 -0.1
Javy Guerra 2011 1 1 -15 2 -0.1
Joey Gathright 2011 1 1 81 1 0

Not just the blatant cheating (Gathright is the only position player on the list), but the DRC+ SDs make no sense.  Based on one identical PA, DRC+ claims that there’s a 1 in hundreds of thousands chance that Runzler is a better hitter than Melancon and also assigns negative runs to a walk because a pitcher drew it.  The DRC+ SDs were pure nonsense before, but now they’re a new kind of nonsense. These players ranged from 9-31 SD in the previous iteration of DRC+, and while the low end of that was still certainly too low, SDs of 1-2 are beyond absurd, and the fact that they’re that low *only for players with almost no PAs* is a huge red flag that something inside the black box is terribly wrong.  Tango recently explored the SD of wRC+/WAR and found that the SDs should be similar for most players with the same number of PA.  DRC+ SDs done correctly could legitimately show up as slightly lower, because they’re the SD of a regressed stat, but that’s with an emphasis on slightly.  Not SDs of 1 or 2 for anybody, and not lower SDs for pitchers and part-time players who aren’t close to a season full of PAs.
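One way to put a number on that is to read each DRC+ and its SD as an independent normal distribution. That reading is my assumption for illustration, not something BP documents, and the helper name is made up. Using the rounded values from the two tables on this page, the new SDs put the chance that Runzler is the better hitter well under one in a hundred thousand, while the previous iteration's SDs for the same two seasons left it close to a coin flip's neighborhood. Rounding in the displayed values leaves a lot of slack in the exact tail figure.

```python
import math

def prob_a_better_than_b(mean_a, sd_a, mean_b, sd_b):
    """P(A > B) when A and B are independent normals (an assumption)."""
    diff_mean = mean_a - mean_b
    diff_sd = math.sqrt(sd_a ** 2 + sd_b ** 2)
    z = diff_mean / diff_sd
    # Standard normal CDF at z, written with erfc to stay accurate in the tail.
    return 0.5 * math.erfc(-z / math.sqrt(2))

# New iteration: Runzler -17 +/- 2 vs. Melancon -3 +/- 2.
p_new = prob_a_better_than_b(-17, 2, -3, 2)

# Previous iteration: Runzler 2 +/- 19 vs. Melancon 15 +/- 14.
p_old = prob_a_better_than_b(2, 19, 15, 14)
```

Same players, same single walk each, and the implied certainty about who is better changes by several orders of magnitude between iterations.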

Park Adjustments

I’d observed before that DRC+ still contains a lot of park factor and they’ve taken steps to address this.  They adjusted Colorado hitters more in this iteration while saying there wasn’t anything wrong with their previous park factors.  I’m not sure exactly how that makes sense, unless they just weren’t correcting for park factor before, but they claim to be park-isolated now and show a regression against their park factors to prove it.  Of course the key word in that claim is THEIR park factors.  I reran the numbers from the linked post with the new DRC+s, and while they have made an improvement, they’re still correlated to both Fangraphs park factor and my surrounding-years park factor estimate at the r=0.17-0.18 level, with all that entails (still overrating Rockies hitters, for one, just not by as much).

 

DRC+ and Team Wins

A reader saw a television piece on DRC+, googled and found this site, and asked me a simple question: how does a DRC+ value correlate to a win? I answered that privately, but it occurred to me that team W-L record was a simple way to test DRC+’s claim of superior descriptiveness without having to rely on its false claim of being park-adjusted.

I used seasons from 2010-2018, with all stats below adjusted for year and league- i.e. the 2018 Braves are compared to the 2018 NL average.  Calculations were done with runs/game and win% since not all seasons were 162 games.

Team metric r^2 to team winning %
Run Differential 0.88
wRC+ 0.47
Runs Scored 0.43
OBP 0.38
wOBA 0.37
OPS 0.36
DRC+ 0.35
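For anyone who wants to rerun this kind of comparison, here is a rough sketch of the calculation. It assumes the league adjustment is a simple subtraction of the league-season average (a ratio would also be defensible), and the row layout, field names, and the four rows themselves are placeholders for illustration, not real team seasons.

```python
from collections import defaultdict
from statistics import mean

def league_relative(rows, key):
    """Subtract the (year, league) average from each team's value."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["year"], row["league"])].append(row[key])
    averages = {group: mean(values) for group, values in groups.items()}
    return [row[key] - averages[(row["year"], row["league"])] for row in rows]

def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

# Placeholder rows only: made-up teams and numbers to show the data shape.
rows = [
    {"year": 2018, "league": "NL", "team": "A", "runs_per_game": 4.9, "win_pct": 0.580},
    {"year": 2018, "league": "NL", "team": "B", "runs_per_game": 4.1, "win_pct": 0.450},
    {"year": 2018, "league": "AL", "team": "C", "runs_per_game": 5.2, "win_pct": 0.600},
    {"year": 2018, "league": "AL", "team": "D", "runs_per_game": 4.4, "win_pct": 0.500},
]

metric = league_relative(rows, "runs_per_game")
wins = league_relative(rows, "win_pct")
r2 = r_squared(metric, wins)
```

Swapping `"runs_per_game"` for any other team-level metric column gives the corresponding row of the comparison.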

Run differential is cheating of course, since it’s the only one on the list that knows about runs allowed, but it does show that at the seasonal level, scoring runs and not allowing them is the overwhelming driver of W-L record and that properly matching RS to RA- i.e. not losing 5 1-run games and winning a 5-run game to “balance out”- is a distant second.

Good offense is based on three major things- being good, sequencing well, and playing in a friendly park.  Only the first two help you to outscore your opponent who’s playing the game in the same park, and Runs Scored can’t tell the difference between a good offense and a friendly park.  As it turns out, properly removing park factor noise (wRC+) is more important than capturing sequencing (Runs Scored).

Both clearly beat wOBA, as expected, because wRC+ is basically wOBA without park factor noise, and Runs Scored is basically wOBA with sequencing added.  OBP beating wOBA is kind of an accident- wOBA *differential* would beat OBP *differential*- but because park factor is more prevalent in SLG than OBP, offensive wOBA is more polluted by park noise and comes out slightly worse.

And then there’s DRC+.  Not only does it not know sequencing, it doesn’t even know what component events (BB, 1B, HR, etc) actually happened, and the 25% or so of park factor that it does neutralize is not enough to make up for that.  It’s not a good showing for the fancy new most descriptive metric ever when it’s literally more valuable to know a team’s OBP than its DRC+ to predict its W-L record, especially when wRC+ crushes the competition at the same task.

 

Mashers underperform xwOBA on air balls

Using the same grouping methodology as “The Statcast GB speed adjustment seems to capture about 40% of the speed effect”, except grouping by barrel% (barrels/batted balls) instead of speed, I got the following for air balls (FB, LD, Popup):

barrel group FB BA-xBA FB wOBA-xwOBA n
high-barrel% 0.006 -0.005 22993
avg 0.006 0.010 22775
low-barrel% -0.002 0.005 18422

These numbers get closer to the noise range (+/- 0.003), but mashers simultaneously OUTPERFORMING on BA while UNDERPERFORMING on wOBA while weak hitters do the opposite is a tough parlay to hit by chance alone because any positive BA event is a positive wOBA event as well.  The obvious explanation to me, which Tango is going with too, is that mashers just get played deeper in the OF, and that that alignment difference is the major driver of what we’ve each measured.

 

The Statcast GB speed adjustment seems to capture about 40% of the speed effect

Statcast recently rolled out an adjustment to its ground ball xwOBA model to account for batter speed, and I set out to test how well that adjustment was doing.  I used 2018 data for players with at least 100 batted balls (n=390).  To get a proxy for sprint speed, I used the average difference between the speed-unadjusted xwOBA and the speed-adjusted xwOBA for ground balls.  Billy Hamilton graded out fast.  Welington Castillo didn’t.  That’s good.  Grouping the players into thirds by their speed-proxy, I got the following

 

speed Actual GB wOBA basic xwOBA speed-adjusted xwOBA Actual-basic Actual- (speed-adjusted) n
slow 0.215 0.226 0.215 -0.011 0.000 14642
avg 0.233 0.217 0.219 0.016 0.014 16481
fast 0.247 0.208 0.218 0.039 0.029 18930

The slower players seem to hit the ball better on the ground according to basic xwOBA, but they still have worse actual outcomes.  We can see that the fast players outperform the slow ones by 50 points in unadjusted wOBA-xwOBA and only 29 points after the speed adjustment.
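The "about 40%" in the title follows from those two gaps. A quick check using the numbers from the table above; the dictionary layout and function name are just for illustration:

```python
# Actual and expected ground-ball wOBA by speed group, copied from the table.
groups = {
    "slow": {"actual": 0.215, "basic": 0.226, "adjusted": 0.215},
    "avg": {"actual": 0.233, "basic": 0.217, "adjusted": 0.219},
    "fast": {"actual": 0.247, "basic": 0.208, "adjusted": 0.218},
}

def fast_minus_slow_gap(model):
    """How much more the fast group beats its xwOBA than the slow group does."""
    fast = groups["fast"]["actual"] - groups["fast"][model]
    slow = groups["slow"]["actual"] - groups["slow"][model]
    return fast - slow

basic_gap = fast_minus_slow_gap("basic")        # about 0.050
adjusted_gap = fast_minus_slow_gap("adjusted")  # about 0.029

# Share of the fast-vs-slow gap that the speed adjustment removes.
captured = 1 - adjusted_gap / basic_gap         # about 0.42
```

So the adjustment closes roughly 21 of the 50 points, leaving most of the speed effect still unaccounted for.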

 

DRC+ isn’t even a hitting metric

At least not as the term is used in baseball.  Hitting metrics can adjust for nothing (box score stats, AVG, OBP, etc), league and park (OPS+, wRC+, etc), or more detailed conditions (opposing pitcher and defense, umpire, color of the uniforms, proximity of Snoop Dogg, whatever).  They don’t adjust for the position played.  Hitting is hitting, regardless of who does it.  Unless it’s not.  While fooling around some more with the data for “DRC+ really isn’t any good at predicting next year’s wOBA for team switchers” and “The DRC+ team-switcher claim is utter statistical malpractice”, it looked for all the world like DRC+ had to be cheating, and it is.

To prove that, I looked at seasons with exactly 1 PA and 1 unintentional walk for the entire season, and the DRC+ for those seasons.

NAME YEAR TEAM DRC+ DRC+ SD
Audry Perez 2014 Cardinals 104 20
Spencer Kieboom 2016 Nationals 96 29
John Hester 2013 Angels 93 16
Joey Gathright 2011 Red Sox 89 24
J.C. Boscan 2010 Braves 78 25
Mark Melancon 2011 Astros 15 14
George Sherrill 2010 Dodgers 4 23
Antonio Bastardo 2014 Phillies 3 22
Dan Runzler 2011 Giants 2 19
Jose Veras 2011 Pirates 1 15
Matt Reynolds 2010 Rockies 1 12
Tony Cingrani 2016 Reds 0 25
Antonio Bastardo 2017 Pirates -1 17
Javy Guerra 2011 Dodgers -2 31
Josh Stinson 2011 Mets -10 11
Aaron Thompson 2011 Pirates -12 14
Brandon League 2013 Dodgers -13 17
J.J. Hoover 2014 Reds -14 32
Santiago Casilla 2011 Giants -15 12
Jason Garcia 2015 Orioles -16 12
Chris Capuano 2016 Brewers -17 17
Edubray Ramos 2016 Phillies -19 15
Matt Guerrier 2011 Dodgers -22 9
Liam Hendriks 2015 Blue Jays -24 15
Phillippe Aumont 2015 Phillies -28 20
Randy Choate 2015 Cardinals -28 52
Joe Blanton 2017 Nationals -30 12
Jacob Barnes 2017 Brewers -31 26
Sean Burnett 2012 Nationals -33 20
Robert Carson 2013 Mets -43 7

That’s a pretty good spread.  The top 5 are position players, the rest are pitchers.  DRC+ is blatantly cheating by assigning pitchers very low DRC+ values even when their offensive performance is good and not doing the same for 1-PA position players.  wOBA and wRC+ don’t do this, as evidenced by Kieboom (#5) right there with 3 pitchers with the same seasonal stat line.  It’s also not using data from prior seasons because that was Kieboom’s only career PA to date, and when Livan Hernandez debuted in 1996 for one game with 1 PA and 1 single, he got a DRC+ of -14 for his efforts.  It’s just cheating, period.  And it doesn’t learn either.  Even when Bumgarner was hitting in 2014-2017, his DRC+s were -15, 4, -17, and -19.

I also included the DRC+ SDs here just to show that they’re complete nonsense.  Pitcher Mark Melancon (15 +/- 14) has one career PA. Pitcher Robert Carson (-43 +/- 7) also has one career PA. Pitcher Randy Choate (-28 +/- 52) had one PA that year and 5 a decade earlier.  What in the actual fuck?

The entire DRC+ project is a complete farce at this point.  The outputs are a joke.***  The SD values are nonsense (table above). The pillars it stands on are complete bullshit.  It’s more descriptive of the current season than park-adjusted stats because it’s not anywhere near a park-adjusted stat, even though it claims to be.  It’s more predictive than park-adjusted stats for next year’s team because it’s somewhat regressed, meaning it basically can’t lose, and it’s also cheating the same way descriptiveness does by keeping a bunch of park factor.  Its claimed “substantial improvement over predicting wOBA for team switchers” is statistical malpractice to begin with, and now we see that the one area where it did predict significantly better than regressed wOBA, very-low-PA players, is driven by (almost) ignoring actual results for pitchers and saying they sucked at the plate no matter how well they really hit (and treating low-PA position players with the exact same stat lines as average-ish).

***Check out DRA- land where Billy Wagner is 26 percent more valuable on a per-inning basis than Mariano Rivera and almost as valuable for his career.  I love Billy Wagner, but still, come on.

RIP 12/29/2018.  Comment F to pay respects.