Saturday, September 29, 2012

Fun with Runs Created

I've been rather prolific lately.  It's been fun while it's lasted, though I don't know how much longer that'll be.

Regardless, when I was looking at all of that Chipper Jones stuff yesterday, I couldn't help but start looking at Runs Created.  Why?  Well, when I was thinking "What might Chipper Jones have 1500 of?" one of the things that crossed my mind was RC.  And, in fact, Chipper does have over 1500 RC.

For the uninitiated, Runs Created has many versions, but the basic version is this:

RC=((H+BB)*TB)/(AB+BB)
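If you'd rather let a computer do the arithmetic, here's a minimal Python sketch of the basic formula in its usual form, with AB+BB in the denominator.  The function name and the stat line are made up for illustration; it's not anybody's real season.

```python
def runs_created_basic(ab, h, bb, tb):
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    opportunities = ab + bb  # at-bats plus walks
    if opportunities == 0:
        return 0.0  # no playing time, no runs created
    return (h + bb) * tb / opportunities


# Hypothetical season line, invented for this example.
rc = runs_created_basic(ab=550, h=165, bb=70, tb=280)
print(f"{rc:.1f} runs created")
```

The first factor is how often the hitter gets on base, the second is how far his hits carry him, and the denominator scales the product back down to a run total.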

It's a handy formula, because it only takes four variables.  Just for funsies, I checked seven teams this year, to see how closely RC matched the runs they actually scored.  I tried to get teams from different types of parks, overachievers, underachievers, good teams, and bad.  And I didn't want to do all 30 teams.  So anyway, you can see how ridiculously accurate RC is.  The team is listed, and then actual Runs/RC:

Milwaukee:  752/748
Pittsburgh:  639/626
Los Angeles Dodgers: 616/606
Colorado Rockies:  745/777
New York Mets:  639/640
New York Yankees:  765/789
Boston Red Sox: 721/714

So you can see that Colorado's been a pretty extreme example... but they've been awful, so it makes sense that they've been underachieving what their components would have you believe they'd do.
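To put numbers on "ridiculously accurate," here's a rough Python sketch that takes the seven actual/RC pairs listed above and works out how far off RC was for each team.  The variable names are just for illustration; the only data in it is what's in the list.

```python
# (actual runs, basic RC) for each team, copied from the list above.
teams = {
    "Milwaukee": (752, 748),
    "Pittsburgh": (639, 626),
    "Los Angeles Dodgers": (616, 606),
    "Colorado Rockies": (745, 777),
    "New York Mets": (639, 640),
    "New York Yankees": (765, 789),
    "Boston Red Sox": (721, 714),
}

# Positive gap: RC says the team "should" have scored more than it did.
gaps = {team: rc - runs for team, (runs, rc) in teams.items()}

# Biggest misses first, in runs and as a share of actual runs.
for team, gap in sorted(gaps.items(), key=lambda item: abs(item[1]), reverse=True):
    runs = teams[team][0]
    print(f"{team:<20} {gap:+4d} runs ({gap / runs:+.1%})")

mean_abs_gap = sum(abs(gap) for gap in gaps.values()) / len(gaps)
print(f"Average miss: {mean_abs_gap:.1f} runs")
```

Sorting by the size of the miss puts Colorado at the top, which lines up with the point above.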

Anyway, RC doesn't actually work great for individual hitters.  The reason is that the formula assumes that these four factors (AB, H, BB, TB) actually interact with one another.  Well, obviously, Chipper Jones doesn't just interact with his own AB/H/BB/TB... he actually interacts with other people's... and those other people weren't as good as Chipper Jones.  So RC naturally overestimates the abilities of good hitters.  But who cares?  It's still fun, and it's a good, summative way to look at basically all parts of hitting while still being easy enough to calculate yourself.  So keep this in your pocket, because it's fun to pull out sometimes.

And on to the crux of this post...

So, after goofin' around with RC a little, I looked at the leaderboard on Baseball-Reference (mad shout-out to them... that's where I've gotten the stats for the last few posts I've done, but I completely forgot to credit them - sorry, Sean and Neil!), and realized that they used a more complex version of RC*.  No biggy.  I just used their top 100 players, and recalculated the "basic" version of RC for each player.

*Under normal circumstances, the various "technical" versions of RC don't differ that much from the basic one, but they do for a couple of people.  Specifically, the basic version crushes the guys who have mad base-stealing skills.  Barry Bonds loses 246 runs using the basic version, and so does Joe Morgan - seriously, both lost 246, on the nose.  Rickey Henderson lost - get this - 334 runs.  Those are HUGE differences, and I'm sorry to leave the extra components out.  But the more you add, the more you take away from the beautiful simplicity of RC.  And most other players were affected by 40 runs or fewer.  That sounds huge, but since we're dealing with guys who, for the most part, had 20-year careers, we're talking 2 R/year, which is pretty insignificant.  74/100 were affected by 60 runs or fewer, and no one had 60 or more runs added by using the less-technical version.  Besides, while it does affect the number, it only rarely has a significant effect on the order of players, so I decided to do this as simply as possible.

Anyway, this is really just for fun.  So I calc'd it, and looked at the results.  Interesting, no doubt.  But then I decided to look at it as a rate stat.  Actually, as a ratio stat, because I didn't want to estimate PAs, or use ABs, or use AB+BB (which would have been fine, I guess).  So I used batting outs (AB-H).  My question was, who has "hit" .300, using RC/O (that's runs created/out)?

Well, of these 100 guys, fewer than a third of them (32).  Ruth tops the list, with nearly 1 RC for every 2 outs (.495).  Hank Aaron "hit" exactly .300 by this calculation.  As always, the list was populated by pre-integration guys, and guys who played in the Selig Era.  Every one of the 32 guys fits into one of those two categories, or is Hank Aaron, Willie Mays, or Mickey Mantle.  I was expecting Mike Schmidt, but he finished lower at .273 than (get this) Will Clark (.274).  In case you're curious, the worst finisher by rate was Lou Brock, who "hit" .198.  Now, I know what you're thinking - "but Brock was a basestealer!  So he probably got robbed by using the basic formula!"  Nope.  Brock lost only 46 runs by using the basic version of RC.  Still a hefty load, yes, but only enough to push him up from #100 to #98.  So, yeah.
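For anyone who wants to try the rate stat at home, here's a small Python sketch of RC per batting out.  The function names are invented for this example, and the Ruth career totals in it are an assumption (they're not given in this post), so check them against Baseball-Reference before leaning on the result.

```python
def runs_created_basic(ab, h, bb, tb):
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    return (h + bb) * tb / (ab + bb)


def rc_per_out(ab, h, bb, tb):
    """Runs created per batting out, where batting outs = AB - H."""
    outs = ab - h
    if outs <= 0:
        raise ValueError("need at least one batting out to compute a rate")
    return runs_created_basic(ab, h, bb, tb) / outs


# Assumed career totals for Babe Ruth; not taken from the post itself.
ruth = dict(ab=8399, h=2873, bb=2062, tb=5793)
ruth_rc = runs_created_basic(**ruth)
ruth_rate = rc_per_out(**ruth)
print(f"RC = {ruth_rc:.0f}, RC/O = {ruth_rate:.3f}")
```

If those totals are right, this should land on the same 2733 and .495 quoted for Ruth.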

But then I thought, well, why don't I take the rate, and multiply it by the raw number.  That should give a nice compromise.  And, of course, it does.  If you're into algebra, write out the equation, look at the cancelling, and be amused.  I was.  But it doesn't really matter, because it pretty much looks right.  Anyway, by this reckoning, here are the top ten hitters of all-time, balancing rate and total.  They're presented as Runs Created/RCRate/RC*RCRate, with the third of those being the organizing principle.

Babe Ruth 2733/.495/1352
Ted Williams 2347/.465/1091
Barry Bonds 2646/.383/1013
Lou Gehrig 2250/.426/959
Stan Musial 2551/.348/887
Ty Cobb 2510/.346/870
Jimmie Foxx 2119/.386/818
Rogers Hornsby 2030/.387/786
Hank Aaron 2576/.300/772
Willie Mays 2333/.307/716

Something about having Aaron and Mays next to one another feels really right about this.
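Since RC times RC/O is just RC squared over outs, the combined number leans on rate harder than the raw total does.  Here's a quick Python sketch of how much that reshuffles the ten names above compared with a plain RC ranking.  It uses only the first and third columns of the list, and the ranks are among these ten only, not the full 100.

```python
# (name, basic RC, RC * RCRate), copied from the top-ten list above.
top_ten = [
    ("Babe Ruth", 2733, 1352),
    ("Ted Williams", 2347, 1091),
    ("Barry Bonds", 2646, 1013),
    ("Lou Gehrig", 2250, 959),
    ("Stan Musial", 2551, 887),
    ("Ty Cobb", 2510, 870),
    ("Jimmie Foxx", 2119, 818),
    ("Rogers Hornsby", 2030, 786),
    ("Hank Aaron", 2576, 772),
    ("Willie Mays", 2333, 716),
]

# Order the same ten players two ways: by raw total and by the combined number.
by_total = [row[0] for row in sorted(top_ten, key=lambda row: row[1], reverse=True)]
by_combined = [row[0] for row in sorted(top_ten, key=lambda row: row[2], reverse=True)]

for name in by_combined:
    total_rank = by_total.index(name) + 1
    combined_rank = by_combined.index(name) + 1
    # Positive move: the player climbs once rate is factored in.
    move = total_rank - combined_rank
    print(f"{name:<15} total #{total_rank:<2} -> combined #{combined_rank:<2} ({move:+d})")
```

The guys who piled up their totals over very long careers slide down, and the high-rate guys climb.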

In case you're curious, since we've been talking Chipper lately, he ranks #19, for now.  I say "for now" not just because he's active, but because he's two spots behind A-Rod, one ahead of Todd Helton, and two ahead of Jim Thome.  Actually by B-R RC, RC rate, or RC*RCRate, Thome and Jones end up right next to each other, so I guess they belong together.  But anyway, there could still be some movement among those guys, even by the end of the season, so nothing's really set in stone there.  Manny Ramirez (#12) is the highest ranking "active" player.  Among "active" players who actually are active, Albert Pujols tops the list (#14).  The lowest ranking player among these 100 was Steve Finley.  At #99 was Lou Brock.  Derek Jeter ranks 1 point behind Lance Berkman (432-433).  I wouldn't have guessed that.  Edgar Martinez ranks at #37 - and people say he's not a HOFer.  Milwaukee's own Al Simmons ranks #22.  Speaking of Milwaukee, Robin Yount is all over the map.  He ranks #55 by RC (behind another Milwaukee connection, Eddie Mathews), by rate he ranks #98 (ahead of Finley and Brock), and overall, he ranks #90.  Frankly, it's not bad for a SS, I think.  Molly fares better, #56 overall; and since I mentioned Mathews, he's one spot ahead of Molitor.

So, that's pretty much it, I guess.
