Required knowledge: you MUST HAVE READ/SKIMMED *DRC+ really isn’t any good at predicting next year’s wOBA for team switchers*, and a non-technical understanding of what a correlation coefficient means wouldn’t hurt.

In doing the research for the other post, it was baffling to me what BP could have been doing to come up with the claim that DRC+ was a revolutionary advance for team-switchers. It became completely obvious that there was nothing particularly meaningful there with respect to switchers and that it would take a totally absurd way of looking at the data to come to a different conclusion. With that in mind, I clicked some buttons and stumbled into figuring out what they had to be doing wrong. One would assume that any sophisticated practitioner doing a correlation where some season pairs had 600+ PA each and other season pairs had 5 PA each would weight them differently… and one would be wrong.

I decided to check four simple ways of weighting the correlation: unweighted, by year T PA, by year T+1 PA, and by the harmonic mean of year T PA and year T+1 PA.
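For concreteness, here is a minimal sketch (my own illustration, not the post's actual code) of what a PA-weighted Pearson correlation looks like, including the harmonic-mean weight used in the tables below:

```python
import numpy as np

def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with observation weights w."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    mx = np.average(x, weights=w)
    my = np.average(y, weights=w)
    cov = np.average((x - mx) * (y - my), weights=w)
    vx = np.average((x - mx) ** 2, weights=w)
    vy = np.average((y - my) ** 2, weights=w)
    return cov / np.sqrt(vx * vy)

def harmonic_weight(pa_t, pa_t1):
    """Harmonic mean of year T and year T+1 plate appearances."""
    pa_t = np.asarray(pa_t, dtype=float)
    pa_t1 = np.asarray(pa_t1, dtype=float)
    return 2.0 * pa_t * pa_t1 / (pa_t + pa_t1)
```

With all weights equal, `weighted_pearson` reduces to the ordinary ("normal") Pearson correlation, which is the connection that matters later in this post.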

### Table 1. Correlation coefficients to year T+1 wOBA% by different weighting methods, minimum 400 PAs year T.

| 400+ PA | Metric | Harmonic | Year T PA | Year T+1 PA | Unweighted | N |
| --- | --- | --- | --- | --- | --- | --- |
| switch | wOBA | 0.34 | 0.35 | 0.34 | 0.34 | 473 |
| switch | DRC+ | 0.35 | 0.35 | 0.34 | 0.35 | 473 |
| same | wOBA | 0.55 | 0.53 | 0.55 | 0.51 | 1124 |
| same | DRC+ | 0.57 | 0.55 | 0.57 | 0.54 | 1124 |

The way to read this chart is to compare the wOBA and DRC+ correlations within each group of hitters: switch to switch (rows 1 and 2) and same to same (rows 3 and 4). It’s obvious that wOBA should correlate much better for “same” than for “switch,” because wOBA contains the entire park effect, which is preserved for “same” and lost for “switch.” But DRC+ behaves the same way, because DRC+ also contains a lot of park factor even though it shouldn’t.

In the 400+ PA year T group, the choice of weighting method is almost completely irrelevant. DRC+ correlates marginally better across the board, and that has nothing to do with switching or staying. Let’s add group 2 to the mix and see what we get.

### Table 2. Correlation coefficients to year T+1 wOBA% by different weighting methods, minimum 100 PAs year T.

| 100+ PA | Metric | Harmonic | Year T PA | Year T+1 PA | Unweighted | N |
| --- | --- | --- | --- | --- | --- | --- |
| switch | wOBA | 0.31 | 0.29 | 0.29 | 0.26 | 1100 |
| switch | DRC+ | 0.33 | 0.31 | 0.32 | 0.29 | 1100 |
| same | wOBA | 0.51 | 0.47 | 0.50 | 0.44 | 2071 |
| same | DRC+ | 0.54 | 0.51 | 0.53 | 0.47 | 2071 |

The values change, but DRC+’s slight correlation lead doesn’t, and again, nothing is special about switchers except that they’re less reliable overall. Some of the gaps widen by a point or two, but there’s no real sign of the impending disaster when the low-PA stuff that favors DRC+ comes in. But what a disaster there is…

### Table 3. Correlation coefficients to year T+1 wOBA% by different weighting methods, all season pairs.

| 1+ PA | Metric | Harmonic | Year T PA | Year T+1 PA | Unweighted | N |
| --- | --- | --- | --- | --- | --- | --- |
| switch | wOBA | 0.45 | 0.41 | 0.38 | 0.37 | 1941 |
| switch | DRC+ | 0.54 | 0.47 | 0.58 | 0.57 | 1941 |
| same | wOBA | 0.62 | 0.58 | 0.53 | 0.52 | 3639 |
| same | DRC+ | 0.67 | 0.62 | 0.66 | 0.66 | 3639 |

The two weightings that minimize the weight of low-data garbage projections (harmonic and year T) stay saner, while the two that don’t (year T+1 and unweighted) go bonkers and diverge by around what BP reports. If I had to guess at the remaining differences, I have more pitchers in my sample, for a slightly bigger effect, and regressed DRC+% correlates a bit better. And to repeat yet again, the effect has nothing to do with staying or switching. It’s entirely a mirage created by flooding the sample with bunches of low-data garbage projections based on handfuls of PAs and weighting them equally with pairs of qualified seasons.
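The mirage is easy to reproduce with a toy simulation (all numbers here are made up for illustration: normally distributed talent, rough binomial-style noise, and an assumed regression constant `K`; nothing below uses real DRC+ internals). A metric that merely shrinks noisy low-PA seasons toward the league mean posts a big unweighted correlation “win” over raw wOBA, and the gap collapses once pairs are weighted by the harmonic mean of their PAs:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000
pa_t = rng.integers(5, 650, n).astype(float)    # year T plate appearances
pa_t1 = rng.integers(5, 650, n).astype(float)   # year T+1 plate appearances
talent = rng.normal(0.320, 0.030, n)            # hypothetical true wOBA talent
noise_sd = lambda pa: 0.5 / np.sqrt(pa)         # rough binomial-style noise
woba_t = talent + rng.normal(0.0, noise_sd(pa_t))
woba_t1 = talent + rng.normal(0.0, noise_sd(pa_t1))

K = 200.0  # assumed regression constant; stand-in for a regressed metric like DRC+
regressed = (pa_t * woba_t + K * 0.320) / (pa_t + K)

def r(x, y, w=None):
    """Pearson correlation, optionally weighted."""
    w = np.ones_like(x) if w is None else w
    mx, my = np.average(x, weights=w), np.average(y, weights=w)
    cov = np.average((x - mx) * (y - my), weights=w)
    return cov / np.sqrt(np.average((x - mx) ** 2, weights=w)
                         * np.average((y - my) ** 2, weights=w))

hw = 2.0 * pa_t * pa_t1 / (pa_t + pa_t1)        # harmonic-mean PA weights
r_raw_unw, r_reg_unw = r(woba_t, woba_t1), r(regressed, woba_t1)
r_raw_harm, r_reg_harm = r(woba_t, woba_t1, hw), r(regressed, woba_t1, hw)
print("unweighted: raw %.3f vs regressed %.3f" % (r_raw_unw, r_reg_unw))
print("harmonic:   raw %.3f vs regressed %.3f" % (r_raw_harm, r_reg_harm))
```

The unweighted gap is driven entirely by the 5-PA garbage seasons that shrinkage neutralizes; harmonic weighting mostly removes their influence, and the “revolutionary” lead shrinks accordingly.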

You might be thinking that sounds crazy and wondering why I’m confident that’s what really happened. Well, as it turns out (and I didn’t realize this until after the analysis), they actually freaking told us that’s what they did. The caption for the chart is **“Table 3: Reliability of Team-Switchers, Year 1 to Year 2 wOBA (2010-2018); Normal Pearson Correlations.”** Normal Pearson correlations are unweighted. Mystery confirmed solved.


Is the correct way to read these tables to compare each row of wOBA performance to the next row of DRC+ performance? i.e. row 1 vs. 2, row 3 vs. 4, in each of the above tables and for each of the above weighting methods?


Yes. I’ll put in a note to that effect.


I see. And higher correlation coefficients are obviously better. I thought I was missing something when I read the tables.

So after all this talk about statistical malpractice and fraud, what you actually found is that DRC+ at least equals, and 95% of the time outperforms, wOBA in every category of player and by every weighting method that you tried?


No, in actually making projections- using one regression for DRC+ and one regression for wOBA to project all players- DRC+ projects full-time team-switchers not only less accurately than wOBA, but also *less accurately than assuming that every full-time team switcher will hit league average*, which is about the lowest bar possible.

The correlations in this post were to address the specific claim that DRC+ correlated *much better* than wOBA, which it clearly doesn’t for full-time position players and regular bench position players (if DRC+ were described honestly, as a somewhat regressed and not-at-all-park-neutral metric, taking a slight correlation victory over wOBA in this specific test is not even a surprise), and only does for low-PA players when 5PA projections and 500PA projections are weighted ~equally. If anybody’s takeaway from the right-hand side of table 3 is “YAY HIGHER CORRELATION THIS IS GREAT”, and not “this is a spurious result caused by weighting 5-PA seasons ~equally to 500-PA seasons, and we all know that’s a silly thing to do”, then I’m not sure what to tell you. The purpose of productive analysis is to provide an answer to an interesting or useful question, not to do a bunch of silly things until you find a high r and then define whatever you’re measuring at the time to be the question.
