C’mon Man- Baseball Prospectus DRC+ Edition

Required knowledge: A couple of “advanced” baseball stats.  If you know BABIP, wRC+, and WAR, you shouldn’t have any trouble here.  If you know box score stats, you should be able to get the gist.

Baseball Prospectus recently introduced its Deserved Runs Created offensive metric that purports to isolate player contribution to PA outcomes instead of just tallying up the PA outcomes, and they’re using that number as an offensive input into their version of WAR.  On top of that, they’re pushing out articles trying to retcon the 2012 Trout vs. Cabrera “debate” in favor of Cabrera and trying to give Graig Nettles 15 more wins out of thin air. They appear to be quite serious and all-in on this concept as a more accurate measure of value.  It’s not.

The exact workings of the model are opaque, but there’s enough description of the basic concept, and the gigantic biases are so obvious, that I feel comfortable describing it in broad strokes. Instead of measuring actual PA outcomes (like OPS/wOBA/wRC+/etc) or being a competitive forecasting system (Steamer/ZiPS/PECOTA), it’s effectively just a shitty forecast based on one hitter-season of data at a time****.

It weights the more reliable (K/BB/HR) components more and the less reliable (BABIP) components less, like projections do, but because it’s wearing blinders and can’t see more than one season at a time, it NEVER FUCKING LEARNS**** that some players really do have outlier BABIP skill and keeps over-regressing them year after year.  This is methodologically fatal.  It’s impossible to salvage a one-year-of-stats-regressed framework.  It might work as a career-level metric, but then year X WAR would change based on year X+1 performance.

Addendum for clarity: If DRC+ regresses each season as though that’s all the information it knows, then adds those regressed seasons up to determine career value, that is *NOT* the same as correctly regressing the total career.  If, for example, BABIP skill got regressed 50% each year, then DRC+ would effectively regress the final career value 50% as well (as the result of adding up 50%-regressed seasons), even though the proper regression after 8000 PAs is much, much less.  This is why the entire DRC+ concept and the other similarly constructed regressed-season BP metrics are broken beyond all repair.  /addendum
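
To put rough numbers on the addendum, here is a minimal sketch of the arithmetic, not BP’s actual model. Everything in it is a made-up assumption for illustration: a hypothetical hitter who posts a .340 BABIP every year against a .300 league average, 500 PAs per season for 16 seasons (8000 PAs total), and a regression constant K chosen so that one season gets regressed exactly 50%. It also treats PAs as the sample unit for BABIP, which is loose but matches the framing above.

```python
# Sketch only: the numbers below are illustrative assumptions, not DRC+ internals.

def shrink(observed, league_mean, n, k):
    """Regress an observed rate toward the league mean.

    k is the sample size at which the observed rate gets 50% weight.
    """
    weight = n / (n + k)
    return league_mean + weight * (observed - league_mean)

LEAGUE_BABIP = 0.300      # assumed league average
OBSERVED_BABIP = 0.340    # hypothetical hitter, same observed BABIP every season
PA_PER_SEASON = 500       # assumed
SEASONS = 16              # 16 * 500 = 8000 PAs
K = 500                   # assumed so a single season is regressed 50%

# Regress each season in isolation, then average the regressed seasons.
season_estimates = [
    shrink(OBSERVED_BABIP, LEAGUE_BABIP, PA_PER_SEASON, K)
    for _ in range(SEASONS)
]
career_from_seasons = sum(season_estimates) / SEASONS

# Regress the whole career once, using all 8000 PAs.
career_proper = shrink(OBSERVED_BABIP, LEAGUE_BABIP, PA_PER_SEASON * SEASONS, K)

print(f"sum of regressed seasons: {career_from_seasons:.3f}")
print(f"career regressed once:    {career_proper:.3f}")
```

Under these assumptions the season-by-season approach leaves the career estimate stuck halfway back to league average no matter how many seasons pile up, while regressing the full career keeps about 94% of the observed gap (8000 / 8500).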

 

****The description is vague enough that it might actually use multiple years and slowly learn over a player’s career, but it definitely doesn’t understand that a career of outlier skill means that the outlier skill (likely) was there the whole time, so the general problem of over-regressing year after year would still apply, just more to the earlier years. Trout has 7 full years and he’s still being underrated by 18, 18, and 11 points the last 3 years compared to wRC+ and 17 points over his whole career.

DRC+ loves good hitters with terrible BABIPs and particularly ones with bad BABIPs and lots of HRs.  Graig Nettles and his career .245 +/- .005 BABIP / 390 HRs look great to DRC+ (120 vs 111 wRC+, +14.7 wins at the plate), as do Mark McGwire (164 vs 157, +8.5 wins), Harmon Killebrew (150 vs 142, +16.2 wins), Ernie Banks (129 vs 118, +20.8 wins), etc.  Guys who beat the hell out of the ball and run average-ish BABIPs are rated similarly to wRC+: Barry Bonds (175 vs 173), Hank Aaron (150 vs 153), Willie Mays (150 vs 154), Albert Pujols (147 vs 146), etc.

The flip side of that is that DRC+ really, really hates low-ISO/high-BABIP quality hitters.  It underrates Tony Gwynn (119 vs 132, -12.9 wins) because it can’t figure out that the 8-time batting champ can hit. In addition, it hates Roberto Alomar (110 vs 118, -10.4 wins), Derek Jeter (105 vs 119, -17.9 wins), Rod Carew (112 vs 132, -18.7 wins), etc.  This is simply absurd.

C’mon man.

 

One thought on “C’mon Man- Baseball Prospectus DRC+ Edition”

  1. This is a really good analysis that identifies the core structural flaw in DRC+ and DRA-. The one thing that I would add is that the same reliance on a single season of data, coupled with strong regression, also results in radically shrunk park factors. So Coors is rated at 104 in 2018 when it actually increased scoring by about 17%, and pitchers’ parks like Petco and Citi get rated as effectively neutral. For most players this won’t be a huge deal, but it badly distorts the value of players in extreme parks.

    To get a sense of the impact, DRA- says that San Diego pitchers have been 9% better than average over the past two decades — which would rank them as one of the very best staffs in MLB, equivalent to having Cuellar or Radke on the mound — but they have a 4.70 road ERA over the same time. Similarly, the Rockies have a 102 DRC+ over their history, which would make them one of the top 3-4 hitting teams since 1993 (league average DRC+ is about 96), but their road wOBA ranks 30th. These are just nonsensical results, and tell us more about the parks these teams played in than their actual performance.
