Saturday, January 28, 2012

Baseball Hall of Fame Rankings

For a while, I've been envisioning a system to rank players by Hall of Fame worthiness. There are a couple of problems, though. First, I had no access to good data. Second, the market is already flooded with good systems - Bill James's Hall of Fame Monitor and Hall of Fame Standards (which rank likelihood of entering the Hall, not merit), Jay Jaffe's JAWS system at Baseball Prospectus, Mike Hoban's CAWS system at Seamheads, and Adam Darowski's wWAR system at Baseball Think Factory. They each use different mathematical models and bases - James uses standard, "newspaper" statistics; Jaffe uses Baseball Prospectus' WARP (Wins Above Replacement Player); Hoban uses Bill James's Win Shares; and Darowski uses rWAR (or, as some would call it, bWAR) - the system invented by Rally (Sean Smith) and the most common WAR system, thanks to being hosted at baseball-reference.com. While the systems are different (and I don't really want to get into the nitty-gritty here - go to their sites if you're really interested), they reach largely the same conclusions. So why would I want to do it myself when I don't have the data and there's already good stuff out there?

Well, for one, I have always found value in doing something myself, even if others have already come up with a way to do things. Second of all, these systems all do some things with which I disagree. Third, I found a way to get the data I needed. Over at one of the sites I frequent, The Baseball Gauge, the proprietor, Dan Hirsch, has all of his data free for download. I actually needed a little extra help, but he's super nice about stuff, and we corresponded over e-mail and he helped to give me what I needed. He developed a WAR system over there (Base Runs for offense; Runs Saved, similar to Win Shares, for defense; DIPS 2.0 for pitching), and it's free to use.

So now that I had the data, what was my problem with the other systems? Well, several of them (and others online) use some arbitrary cutoffs - something along the lines of [peakWAR+careerWAR]/2. That's the basic formula. Of course, there's really nothing wrong with that. However, how does one define "peak" WAR? 4 seasons? 5 seasons? 7 seasons? 10 seasons? I've seen all of those iterations. So I went to work on the problem. Here are the basic principles to which I stuck through this process.

1.) 10 seasons is the key. Why? Well, it's not arbitrary, for one. The Baseball Hall of Fame requires 10 seasons played for entry. Since that's one of the few rules, it makes sense to me to stick to it.

2.) Big seasons are better than consistency. Imagine two guys with 8 WAR over two seasons. One of them has 7 WAR one year, 1 the next. The second guy has 4 each year. I prefer the first guy, because with a season that big, you're almost guaranteed to make the playoffs, and have a shot at the World Series. With the second guy, well, lots of guys manage 4 WAR. That may not help the team win. And sure, the first guy may not help at all the second year (with only 1 WAR, he'd be a sub-average player), but, as they say, "Flags fly forever" - in other words, the goal is to win, and you can't take away a pennant. So the more a player helps to win a pennant, the more it counts. Of course, I'm not counting actual pennants, so I'm going with the seasons that give you the best chance of winning pennants - the biggest seasons.

3.) More different things will give a better estimate than just one thing. I used three different systems, each consisting of two parts. I'll explain.

So, here's the actual system. First, rank the seasons, descending from best WAR to worst. Then...

#1: Add up total WAR. (=A)
#2: Add up total WAR in top ten seasons. (=B)
#3: Add up total WAR, counting the top season 30 times, the next season 29 times, the next season 28 times, the next season 27 times, etc., until you've exhausted all the player's seasons. Then divide by 30. (=C)
#4: Add up total WAR in top ten seasons, counting the top season 10 times, the next season 9 times, the next season 8 times, the next season 7 times, etc., until you've used up all ten seasons. Then divide by 10. (=D)
#5: Add up total WAR, counting the top season 45 times, the next season 44 times, the next season 43 times, the next season 42 times, etc., until you've exhausted all the player's seasons. Then divide by 45. (=E)
#6: Add up total WAR in top ten seasons, counting the top season 15 times, the next season 14 times, the next season 13 times, the next season 12 times, etc., until you reach the tenth season, which will count six times. Then divide by 15. (=F)
#7: You'll now have six numbers, any of which could be the best indicator. So now, we take the harmonic mean of the six numbers:

6/[(1/A)+(1/B)+(1/C)+(1/D)+(1/E)+(1/F)]
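Here's a rough Python sketch of the seven steps above, for anyone who'd rather read code. The function name `hof_score` and the input format (a plain list of one player's season WAR totals) are made up for illustration; they aren't taken from The Baseball Gauge's files. It assumes a career shorter than 30 seasons and that all six components come out positive, since the harmonic mean doesn't behave otherwise.

```python
def hof_score(season_wars):
    """Hypothetical helper: return ([A, B, C, D, E, F], score) for one player."""
    # Rank the seasons, descending from best WAR to worst.
    ranked = sorted(season_wars, reverse=True)
    top10 = ranked[:10]

    def weighted(seasons, top_weight):
        # Best season counts top_weight times, the next top_weight - 1, etc.,
        # then the total is divided by top_weight.
        total = sum(war * (top_weight - i) for i, war in enumerate(seasons))
        return total / top_weight

    a = sum(ranked)            # 1: total WAR
    b = sum(top10)             # 2: WAR in top ten seasons
    c = weighted(ranked, 30)   # 3: all seasons, weights 30, 29, 28, ...
    d = weighted(top10, 10)    # 4: top ten, weights 10 down to 1
    e = weighted(ranked, 45)   # 5: all seasons, weights 45, 44, 43, ...
    f = weighted(top10, 15)    # 6: top ten, weights 15 down to 6
    components = [a, b, c, d, e, f]

    # 7: harmonic mean of the six numbers (assumes every component > 0).
    score = 6 / sum(1 / x for x in components)
    return components, score


# The two hypothetical players from principle 2: same 8 WAR, different shape.
_, big_season = hof_score([7, 1])
_, consistent = hof_score([4, 4])
print(big_season > consistent)  # the 7-and-1 player should come out ahead
```

The little comparison at the bottom is just the two 8-WAR guys from principle 2: A and B are identical for both, so any gap in the final score comes entirely from the weighted sums C through F.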

Voila!

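For example, here are some lists. Each row shows the player ID, the name, the six components A through F in order, and then the final score. The top 11 at third base: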

schmimi01 Mike Schmidt 94.7 70.9 75.5 41.3 81.9 51.2 63.9
matheed01 Eddie Mathews 88.2 68.9 71.1 40.7 76.8 50.1 61.6
boggswa01 Wade Boggs 76.5 61.9 62.8 38.5 67.3 46.3 55.8
brettge01 George Brett 78.0 59.4 63.1 36.6 68.1 44.2 54.5
jonesch06 Chipper Jones 71.9 52.8 56.7 31.2 61.7 38.4 48.1
bakerfr01 Frank Baker 56.5 54.4 49.6 35.4 51.9 41.7 47.0
santoro01 Ron Santo 56.1 53.4 48.9 34.1 51.3 40.6 46.0
hackst01 Stan Hack 58.6 49.4 48.9 30.8 52.1 37.0 44.0
evansda01 Darrell Evans 59.1 45.5 47.8 28.4 51.6 34.1 41.7
mcgrajo01 John McGraw 48.0 46.9 42.8 32.4 44.5 37.3 41.2
rolensc01 Scott Rolen 53.7 47.7 45.2 28.7 48.0 35.0 41.1
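As a quick sanity check on how to read these rows, the last number should be the harmonic mean of the six before it. A small sketch using Python's standard `statistics.harmonic_mean`, with the Schmidt row typed in by hand (the variable name is just for illustration):

```python
from statistics import harmonic_mean

# Mike Schmidt's six components (A through F) as listed in the table above.
schmidt = [94.7, 70.9, 75.5, 41.3, 81.9, 51.2]

# Rounded to one decimal, this should match the final column of his row.
print(round(harmonic_mean(schmidt), 1))  # 63.9
```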

The top 12 at catcher (because I was part of the discussion over at Baseball: Past and Present):

benchjo01 Johnny Bench 75.2 63.5 63.0 39.8 67.1 47.7 56.7
cartega01 Gary Carter 68.7 60.2 57.6 37.0 61.3 44.7 52.5
berrayo01 Yogi Berra 75.2 58.4 60.4 34.6 65.3 42.6 52.3
piazzmi01 Mike Piazza 61.0 54.7 52.2 35.3 55.1 41.8 48.3
dickebi01 Bill Dickey 67.5 52.9 54.9 32.7 59.1 39.4 48.1
cochrmi01 Mickey Cochrane 58.7 53.8 50.0 33.0 52.9 40.0 46.2
fiskca01 Carlton Fisk 67.8 47.4 53.2 29.5 58.1 35.5 44.8
hartnga01 Gabby Hartnett 64.4 47.5 51.3 29.1 55.6 35.2 43.8
rodriiv01 Ivan Rodriguez 65.2 47.0 51.2 28.3 55.9 34.5 43.4
torrejo01 Joe Torre 55.3 46.2 46.2 29.5 49.2 35.1 41.6
simmote01 Ted Simmons 52.5 47.9 45.3 28.6 47.7 35.0 41.0
tenacge01 Gene Tenace 48.0 44.3 41.4 28.5 43.6 33.7 38.6

How about the only 3 DHs I checked, because that's a short list:

molitpa01 Paul Molitor 64.9 49.9 51.8 28.8 56.2 35.9 44.4
martied01 Edgar Martinez 57.9 49.5 48.1 29.2 51.4 36.0 42.9
baineha01 Harold Baines 36.9 28.1 30.0 17.4 32.3 21.0 25.9

One last one. How about top 11 CF:

cobbty01 Ty Cobb 153.2 95.1 112.3 57.3 126.0 69.9 91.4
speaktr01 Tris Speaker 138.5 88.5 103.7 52.3 115.3 64.4 83.9
mantlmi01 Mickey Mantle 120.4 87.8 95.1 53.4 103.6 64.8 81.1
mayswi01 Willie Mays 136.5 85.0 101.0 49.0 112.8 61.0 80.4
dimagjo01 Joe DiMaggio 80.8 69.3 67.6 42.0 72.0 51.1 60.7
griffke02 Ken Griffey 74.4 59.6 60.2 35.5 65.0 43.8 52.9
hamilbi01 Billy Hamilton 70.8 61.2 58.9 36.0 62.9 44.4 52.8
snidedu01 Duke Snider 62.5 54.5 52.8 34.9 56.0 41.4 48.4
wynnji01 Jimmy Wynn 57.4 55.5 50.1 35.3 52.6 42.1 47.4
careyma01 Max Carey 67.7 52.6 54.5 31.3 58.9 38.4 47.2
edmonji01 Jim Edmonds 59.2 52.0 49.7 31.9 52.9 38.6 45.3

Hope that was interesting. I'd love to hear thoughts about this kind of stuff.

As one final thing, I'd just like to thank Dan Hirsch for all his help, and to recommend The Baseball Gauge to anyone out there reading this. It's a great resource.

1 comment:

  1. By the way, I've decided that this needed an acronym. How could I not have thought of that earlier? I'm going to go with WARSCOR.

    Wins Above Replacement Score for Completely Objective Ranking. Yup. That'll do. Plus it rhymes.
