Saturday, September 22, 2012

The OPSBI Fallacy

Yes, I realize I'm a year too late to this one.  But the MVP talk is a-startin', and I just know some idiot sportswriter is going to reference this "stat" when explaining their vote, so I thought it was worth another look.  I realize it's been torn apart by saberists before.  But I wanted to take a different look at OPSBI.

For those who don't know, OPSBI was created by Jim Bowden, the former Reds/Nats GM.  The basic idea is this:  you take the player's on-base percentage and add his slugging percentage (that's the OPS).  Now, eliminate the decimal point (or multiply by 1000, however you prefer to think about it).  Then, add the player's RBI.  That's it.
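If it helps to see the recipe spelled out, here's a minimal Python sketch of it.  The function name, the rounding step, and the sample stat line are my own assumptions for illustration; Bowden's description doesn't say how (or whether) OPS gets rounded before the decimal point is dropped.

```python
def opsbi(obp: float, slg: float, rbi: int) -> int:
    """OPSBI as described above: OPS with the decimal point dropped, plus RBI."""
    # Assumption: round OPS to three decimals' worth of "points" before adding RBI.
    ops_points = round((obp + slg) * 1000)
    return ops_points + rbi


# Hypothetical stat line (not a real player): .400 OBP, .600 SLG, 100 RBI.
print(opsbi(0.400, 0.600, 100))  # 1100
```

Notice that the OPS part runs on a scale of roughly 1000 while RBI tops out somewhere past 100, so the two pieces aren't even weighted on purpose.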

So, first, I want to point out why this is stupid.  For starters, it doesn't measure anything.  At all.  It arbitrarily adds a rate stat (two rate stats, actually) to a counting stat, without any consideration for why.  The sum doesn't correlate to anything.  As we know, OPS has a reasonable correlation with run-scoring.  RBI represent actual runs batted in (though not runs scored).  So, you're adding something which correlates to run-scoring to something that actually is one-half of run-scoring.  Now, someone clever could probably make the argument that if you added Runs to this, you'd solve some of that problem.  But here's the question - why add anything to OPS at all?  I don't get it.

Anyway, Bill James says, "For a statistic to have value, it has to be meaningful with reference to something other than its own formula" (The New Bill James Historical Abstract, in the player comment on Craig Biggio).  OPSBI fails that test.

Of course, there are still defenders out there.  I don't feel like looking for articles right now, but I remember reading at least two of them last offseason.  Here's the thing they'll say, more or less:  "Who cares if it doesn't measure anything - it gives the right answer!"  Okay, well, in my mind, it gives the "right" answer - that is, it affirms (much of the time) the conclusion sportswriters have already made.  But I personally believe that I can throw out a lot of other BS stats that will do the same thing (more or less).  Anyway, I'll be looking at the last five years of data (not including 2012, of course, since the season is still underway), taking the top five MVP finishers (non-pitchers only) in each league for each season, and comparing them by different metrics.  Those metrics are:

MVP Finish - where did the player finish in subjective MVP voting?
OPSBI - of course.
RunAvg - (R+RBI)/AB
ButTheKitchenSink*Games - Games*(RBI+R+TB+BB+HBP+SB)/(3*AB) ; I already posted about this before.
ButTheKitchenSink(rate) - (RBI+R+TB+BB+HBP+SB)/(3*AB) ; same thing, but as a rate stat (scaled to batting average); really, what this is, is RunAvg+BattingAvg+SecondaryAvg, divided by three so that it looks like the player's batting average.
StolenHomes - SB+HR
TripleCrowns - (3*HR)+((1000*Avg-100)/2)+(RBI)
HitByBallSacs - because I couldn't resist:  HBP+BB+SF+SH
rWAR - because, wouldn't it be fun if I used a metric that actually seemed to represent real value?
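For anyone who wants to check my arithmetic, the made-up metrics above can be sketched in Python roughly like this.  The `SeasonLine` class, its field names, and the sample numbers are hypothetical and only there for illustration; the formulas themselves are copied straight from the list.

```python
from dataclasses import dataclass


@dataclass
class SeasonLine:
    """One player's season totals (field names are my own, for illustration)."""
    g: int      # games
    ab: int     # at-bats
    r: int      # runs
    rbi: int    # runs batted in
    tb: int     # total bases
    bb: int     # walks
    hbp: int    # hit by pitch
    sb: int     # stolen bases
    hr: int     # home runs
    sf: int     # sacrifice flies
    sh: int     # sacrifice hits
    avg: float  # batting average, e.g. 0.300


def run_avg(p: SeasonLine) -> float:
    return (p.r + p.rbi) / p.ab


def btks_rate(p: SeasonLine) -> float:
    return (p.rbi + p.r + p.tb + p.bb + p.hbp + p.sb) / (3 * p.ab)


def btks_games(p: SeasonLine) -> float:
    return p.g * btks_rate(p)


def stolen_homes(p: SeasonLine) -> int:
    return p.sb + p.hr


def triple_crowns(p: SeasonLine) -> float:
    return 3 * p.hr + (1000 * p.avg - 100) / 2 + p.rbi


def hit_by_ball_sacs(p: SeasonLine) -> int:
    return p.hbp + p.bb + p.sf + p.sh


# Hypothetical stat line, not a real player.
example = SeasonLine(g=150, ab=600, r=100, rbi=110, tb=300, bb=70,
                     hbp=5, sb=15, hr=30, sf=5, sh=0, avg=0.300)
print(round(run_avg(example), 3), round(btks_rate(example), 3),
      stolen_homes(example), hit_by_ball_sacs(example))
```

None of these functions knows anything about value, which is sort of the point.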

One last note before presenting:  in 2008, Manny Ramirez only played 53 games in the NL, so only his NL stats are included.  Also, I hope this formats okay.  Here come the charts:

2011 NL        MVP  OPSBI  RunAvg  BTKS  BTKSr  StoHos  3Crowns  BallSacs  rWAR
Braun            1   1105    .391  57.9   .386      66    343.0        66   7.7
Kemp             2   1112    .400  63.7   .395      79    366.0        87   7.8
Fielder          3   1101    .378  62.2   .384      39    345.5       123   4.3
Upton            4    986    .326  54.2   .341      52    294.5        82   5.7
Pujols           5   1005    .352  50.0   .340      46    322.5        72   5.1

2011 AL        MVP  OPSBI  RunAvg  BTKS  BTKSr  StoHos  3Crowns  BallSacs  rWAR
Ellsbury         1   1033    .339  54.9   .347      71    329.5        69   8.0
Bautista         2   1159    .405  64.6   .433      52    340.0       142   7.7
Granderson       3   1035    .437  62.3   .400      66    332.0       108   5.3
Cabrera          4   1138    .378  62.3   .387      32    337.0       116   7.3
Cano             5   1000    .356  52.1   .327      36    325.0        58   5.2

2010 NL        MVP  OPSBI  RunAvg  BTKS  BTKSr  StoHos  3Crowns  BallSacs  rWAR
Votto            1   1137    .400  60.4   .403      53    349.0       101   6.7
Pujols           2   1129    .397  63.6   .400      56    358.0       113   7.3
C. Gonzalez      3   1091    .388  53.3   .367      60    353.0        49   5.8
A. Gonzalez      4   1005    .318  52.8   .330      31    362.0       101   4.1
Tulowitzki       5   1044    .391  44.6   .365      38    306.5        59   6.5

2010 AL        MVP  OPSBI  RunAvg  BTKS  BTKSr  StoHos  3Crowns  BallSacs  rWAR
Hamilton         1   1144    .376  49.6   .373      40    343.5        53   8.4
Cabrera          2   1168    .432  61.4   .409      41    366.0       100   6.1
Cano             3   1023    .339  52.3   .327      32    326.5        70   7.8
Bautista         4   1119    .409  66.3   .412      63    362.0       114   6.6
Konerko          5   1088    .365  54.1   .363      39    345.0        83   4.3

2009 NL        MVP  OPSBI  RunAvg  BTKS  BTKSr  StoHos  3Crowns  BallSacs  rWAR
Pujols           1   1236    .456  72.6   .454      63    392.5       132   9.4
H. Ramirez       2   1060    .359  53.9   .357      51    325.0        76   7.1
Howard           3   1072    .399  59.5   .372      53    370.5        87   3.5
Fielder          4   1155    .413  65.9   .407      48    382.5       128   6.0
Tulowitzki       5   1022    .355  54.6   .362      52    304.5        85   6.3

2009 AL        MVP  OPSBI  RunAvg  BTKS  BTKSr  StoHos  3Crowns  BallSacs  rWAR
Mauer            1   1107    .363  50.9   .369      32   3028.5        83   7.6
Teixeira         2   1070    .369  56.7   .363      41    346.0        98   5.1
Jeter            3    937    .273  46.3   .302      48    269.0        82   6.4
Cabrera          4   1045    .326  53.4   .334      40    333.0        74   4.7
Morales          5   1032    .343  50.8   .334      37    329.0        56   4.0

2008 NL        MVP  OPSBI  RunAvg  BTKS  BTKSr  StoHos  3Crowns  BallSacs  rWAR
Pujols           1   1230    .412  63.5   .429      44    368.5       117   9.0
Howard           2   1027    .411  59.0   .364      49    367.5        90   1.5
Braun            3    994    .324  49.3   .326      51    322.5        52   4.3
M. Ramirez       4   1285    .476  25.3   .478      19    285.0        42   3.4
Berkman          5   1092    .397  62.9   .396      47    320.0       111   6.6

2008 AL        MVP  OPSBI  RunAvg  BTKS  BTKSr  StoHos  3Crowns  BallSacs  rWAR
Pedroia          1    952    .308  48.1   .306      37    280.0        73   6.8
Morneau          2   1002    .363  53.7   .330      23    325.0        89   3.9
Youkilis         3   1073    .383  52.9   .365      32    329.0        83   6.0
Mauer            4    949    .341  46.4   .318      10    267.0        97   5.3
Quentin          5   1065    .408  50.8   .391      43    316.0        89   5.1

2007 NL        MVP  OPSBI  RunAvg  BTKS  BTKSr  StoHos  3Crowns  BallSacs  rWAR
Rollins          1    969    .325  53.5   .331      71    314.0        62   6.0
Holliday         2   1149    .404  60.2   .381      47    379.0        77   5.8
Fielder          3   1132    .398  63.2   .400      52    363.0       108   3.4
Wright           4   1070    .364  60.4   .377      64    329.5       107   8.1
Howard           5   1112    .435  59.2   .411      48    364.0       119   2.8

2007 AL        MVP  OPSBI  RunAvg  BTKS  BTKSr  StoHos  3Crowns  BallSacs  rWAR
Rodriguez        1   1223    .513  73.6   .466      78    421.0       125   9.2
Ordonez          2   1168    .430  60.9   .388      32    376.5        83   6.9
Guerrero         3   1075    .373  53.1   .354      29    341.0        86   4.3
Ortiz            4   1183    .424  62.6   .420      38    353.0       118   6.1
Lowell           5    999    .338  48.2   .313      24    324.0        64   4.6

So, what often happens with a chart like this is, people either skip it and wait for the conclusion, or they read it and go, "so what?"  If you're in the former group, that's annoying, because charts take the most time to make.  So authors are upset at you for not reading the chart.  If that's you, go back and look at them.  We'll wait.

Okay, now that everyone's caught up, what the heck does any of this mean?

Well, if the goal of OPSBI is to correlate to a player's true value, we have to ask, "how do we best measure a player's true value?"  Thus, the first and last columns of the chart.  The last, WAR, is a statistical measure.  The first, MVP voting, is a completely subjective measure.  If OPSBI is so good at perceiving value, we'd really like it to have some correlation with at least one of the two.

The problem is, it doesn't.  Not at all.  So either both the MVP voters and WAR are wrong, while OPSBI is the true best measure... or OPSBI is crap.  Now, I'll admit freely that both WAR and voters take defense into account, which OPSBI doesn't... still though, that's not (for the most part) how MVPs are won, or how WAR is decided, since offense bears so much more weight.
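If you'd rather check this than take my word for it, one way is a rank correlation on each chart.  Here's a minimal Python sketch, assuming a plain Spearman formula with no tie handling; the helper names are mine, and the numbers are the 2011 AL rows from the chart above.  Keep in mind that a single chart of five players can't settle anything by itself, so you'd want to run it over all ten.

```python
def ranks(values):
    """Rank values from highest (1) to lowest (n); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(xs, ys):
    """Spearman rank correlation via the no-ties shortcut formula."""
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# 2011 AL chart, in order: Ellsbury, Bautista, Granderson, Cabrera, Cano.
mvp_finish = [1, 2, 3, 4, 5]
opsbi = [1033, 1159, 1035, 1138, 1000]
rwar = [8.0, 7.7, 5.3, 7.3, 5.2]

# A lower MVP finish is better, so negate it to make "higher is better".
mvp_score = [-finish for finish in mvp_finish]

print("OPSBI vs MVP finish:", round(spearman(opsbi, mvp_score), 2))
print("OPSBI vs rWAR:", round(spearman(opsbi, rwar), 2))
```

Swap in any of the other columns for `opsbi` to see how the made-up stats stack up against it.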

For instance, if OPSBI is such a good measure, why would Jacoby Ellsbury have finished above Jose Bautista last year in MVP voting?  Bautista was the better offensive player... even by these made-up metrics.  I just don't get what OPSBI is doing that's different from... ANY of these other things I made up.  Seriously.  Well, maybe not HitByBallSacs, but that's just too hilarious to not include.  Anyway, the others do just as good a job predicting WAR and predicting MVPs... maybe better.  So why OPSBI?  I see no reason, since it doesn't even reflect voter tendencies.
