Ryan Theriot is popular with the fans in the stands and with his manager, Lou Piniella. When he won the Cubs’ shortstop job early last season he was dubbed “The Riot,” and he was popular with everybody, including the blogerati who influence serious baseball discussion on the web and in the papers. Now that he has a season’s worth of statistics to his name, however, there is a lot of murmuring and grumbling about him. He takes a bit of a beating on the web every day for not measuring up to the two standards that are applied to every hitter, OBP (on-base percentage) and SLG (slugging percentage). A low slugging percentage is not a big issue for Theriot, since he’s not that kind of hitter; but the statistically minded crowd expects him to compensate with a high on-base number. His .326 in that department is about 25 points too low.
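OBP is like batting average (BA) except that it also counts walks: a walk goes into the numerator as a time on base, and the plate appearance that produced it goes into the denominator. (A walk is not an official at-bat, which is why it never shows up in BA. The full formula also counts hit-by-pitches and, in the denominator, sacrifice flies.) In other words, a walk is as good as a hit. Some players receive a lot of walks, and they always have fans and supporters.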
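For readers who want to see the arithmetic, here is a minimal sketch of the two formulas. The season line is invented for illustration and belongs to no real player; the function names are my own.

```python
def batting_average(hits, at_bats):
    """BA: hits divided by official at-bats (walks are not at-bats)."""
    return hits / at_bats


def on_base_percentage(hits, walks, at_bats, hit_by_pitch=0, sac_flies=0):
    """OBP: (H + BB + HBP) / (AB + BB + HBP + SF)."""
    times_on_base = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sac_flies
    return times_on_base / chances


# Hypothetical hitter: 150 hits in 550 at-bats, with two different walk totals.
ba = batting_average(150, 550)
obp_few_walks = on_base_percentage(150, 40, 550)
obp_many_walks = on_base_percentage(150, 100, 550)

print(f"BA {ba:.3f}, OBP with 40 walks {obp_few_walks:.3f}, "
      f"OBP with 100 walks {obp_many_walks:.3f}")
```

The same bat produces very different OBPs depending only on how often the pitcher is willing to put the hitter on first.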
I disagree with this stress on the importance of OBP, especially for a speed player like Theriot. OBP is a game that beefy sluggers play, not singles hitters who can’t buy a walk for the simple reason that opposing pitchers and managers are not stupid. The OBP of a base-stealing singles hitter will always be pegged pretty closely to his BA. MVP voters in the National League must have realized this when they gave the award in 2007 to a base-stealing shortstop with an OBP below .350.
So I’m inclined to mount a defense for Ryan Theriot. I won’t be defending Theriot in the way that a Chicago writer tried to a couple of weeks ago, by citing Theriot’s penchant for getting his uniform dirty along with other so-called intangibles. I’ll try never to use the word “scrappy.” If Theriot is valuable, there are numbers that can illustrate his worth. The numbers may simply not be in the public domain.
Let me give an example of baserunning numbers that the statistical community has designated “not in stock, no delivery date.” Suppose I wanted to know the number of times Theriot scored–and his scoring percentage–after reaching first with none out. This would be an interesting stat to me, because in the same way that a power hitter’s job is to drive in runs, a skilled baserunner’s job is to score after reaching base early in an inning. His job is not merely to steal bases; that’s a means to an end.
The place to look for a stat like this is baseball-reference.com. If there’s an offensive number, they have it. I can go there, pull up Theriot’s page, and look at “splits” for 2007–a large set of game situations to which the usual set of offensive numbers can be applied.
Sure enough, I find a “split” for Theriot’s performance when “leading off an inning.” A couple of columns are blank, though: runs and stolen bases. I can tell everything that Theriot did leading off an inning except what he did once he got on base.
Is this a conspiracy against baserunning numbers? No, it’s just the nature of baseball offensive stats in general. Everything is geared toward a single at bat. In this one at bat, “A” did this or that; and “B” scored, ran into a force at second or stole a base. But if a player singles with none out and scores on a groundout two outs and three at bats later, there’s nothing in the database to correlate the leadoff single with the scoring.
I am convinced that the poor state of baserunning stats is the reason that baserunning is undervalued in baseball today, and why statistically minded fans scratch their heads every day and wonder what the manager is thinking, putting a fast guy with a low OBP at the top of the lineup.
You first have to put the right data into a database before you can pull it out. So I did exactly that. Having some experience with software and databases, I used readily available game logs to build a database of the Cubs’ 2007 season at the plate and on the bases, one base at a time.
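To make the idea concrete, here is a minimal sketch of what "one base at a time" bookkeeping could look like. It is not the actual database described above: the `Play` record, its fields, and the player labels are all hypothetical, and real game logs would need far more care (pinch runners, batting around, errors). The point is only that once each play records who moved from which base to which, a leadoff single can be linked to a run scored three batters later.

```python
from dataclasses import dataclass

HOME = 4   # a move ending here is a run scored
OUT = -1   # a move ending here is an out


@dataclass
class Play:
    outs_after: int  # outs in the inning once the play is over
    moves: dict      # player -> (from_base, to_base); 0 is the batter's box


def first_base_chances(half_inning):
    """Map each runner who stood on first with none out and second empty
    to True if he later scored in the same half-inning, else False."""
    bases = {}    # base number (1-3) -> runner currently there
    chances = {}  # runner -> did he score?
    for play in half_inning:
        # Clear the bases the moving runners are leaving.
        for runner, (start, _end) in play.moves.items():
            if bases.get(start) == runner:
                del bases[start]
        # Put runners on their new bases, or record a run.
        for runner, (_start, end) in play.moves.items():
            if end in (1, 2, 3):
                bases[end] = runner
            elif end == HOME and runner in chances:
                chances[runner] = True
        # A new opportunity: someone is on first, none out, second open.
        for runner, (_start, end) in play.moves.items():
            if end == 1 and play.outs_after == 0 and 2 not in bases:
                chances.setdefault(runner, False)
    return chances


# Invented inning: single, stolen base, two groundouts that move him along.
inning = [
    Play(0, {"Runner A": (0, 1)}),                           # leadoff single
    Play(0, {"Runner A": (1, 2)}),                           # steals second
    Play(1, {"Runner A": (2, 3), "Batter B": (0, OUT)}),     # groundout, to third
    Play(2, {"Runner A": (3, HOME), "Batter C": (0, OUT)}),  # groundout, scores
]
print(first_base_chances(inning))
```

Counting the `True` and `False` values over a season gives the successes and opportunities for a "score from first, none out" number.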
Armed with this new data, I am proposing five speed-related metrics for judging how well a player helps his team score runs. I use Cub data from 2007 to illustrate what those metrics might look like when applied. Here are five scenarios that I used to separate the runners from the sloggers:
1) runner on first, second unoccupied, batter hits a single. Does the runner get to third or does he stop at second?
2) runner on first, batter hits a double. Does the runner reach home or stop at third?
3) runner on second, batter hits a single. Does the runner score or stop at third?
4) runner on third with fewer than two outs. Does the runner score in the inning?
5) runner on first with none out and no one on second. Does the runner score in the inning?
I especially like number five because it’s where a player’s speed and savvy are the most highly leveraged. A speed player has this insight into one of baseball’s mysterious secrets: if you’re on first with none out, you can score even if the three hitters behind you make outs.
Number five is different from the other four in that it overlaps with them. Ideally it’s nice to have a set of numbers where you don’t count the same event twice. But tell that to the guys who lay out the current popular hitting stats. When you hit a home run, practically every other offensive number gets larger except doubles and triples.
The number columns come in sets of three: the first number is successes, the second is opportunities, and the third is the success rate as a percentage. So, for example, column 1 is the number of times the runner went from first to third on a single; column 2 is the number of times he was on first when the batter singled; and column 3 is column 1 divided by column 2, times 100. The final three columns on the right are total successes, total opportunities, and the cumulative rate of success.
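The column layout just described can be sketched in a few lines. The counts below are invented purely to show the arithmetic and are not taken from the Cubs data; `rate_row` is a hypothetical helper name.

```python
def rate_row(pairs):
    """Build one player's row from (successes, opportunities) pairs,
    one pair per scenario, followed by the three totals columns."""
    row = []
    total_successes = 0
    total_opportunities = 0
    for successes, opportunities in pairs:
        # Percent success; None when the player never had the opportunity.
        pct = round(100 * successes / opportunities) if opportunities else None
        row += [successes, opportunities, pct]
        total_successes += successes
        total_opportunities += opportunities
    total_pct = (round(100 * total_successes / total_opportunities)
                 if total_opportunities else None)
    row += [total_successes, total_opportunities, total_pct]
    return row


# Made-up counts for the five scenarios, in order.
example = [(5, 10), (3, 4), (6, 8), (12, 20), (9, 20)]
print(rate_row(example))
```

Because scenario five overlaps with the other four, the totals columns count some trips around the bases more than once.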
Cubs 2007 baserunning:
Mostly this table shows that fast runners do more with their baserunning opportunities, which we knew already, but it’s good to see the numbers in black and white. A couple of the numbers are worth mulling over, and could change the way someone thinks about the game. A low-OBP hitter (Theriot or Soriano) with speed gets to third with fewer than two out–here I’m referring to the middle column under “score from 3rd”–more often than a high-OBP guy like DeRosa. As every Cub fan has heard by now, Theriot’s OBP was .326, Soriano’s .337, DeRosa’s .371. If Theriot is on third with fewer than two out, he scores 57% of the time. In baseball these are good odds. It suggests that stealing second is not just a gimmick but part of a strategy to get to third with one out after you have led off an inning with a single. By whatever means, Theriot scored 48% of the time after reaching first with nobody out and no one on second. Let’s ponder this number a moment.
If Theriot has less than a 30% chance of getting a hit in the first place, but once on first, an almost 50% chance of scoring, you’re not going to throw him too many pitches out of the strike zone if you can help it. No matter how good his eye, his OBP will never exceed his BA by very much. It’s not his fault; he couldn’t help it, he was born fast. Now take a slugger who doesn’t run well. His chance of getting a hit–let’s say he has a batting average in the .270 to .295 range–may be about the same as his score-from-first-nobody-out ratio. (See Aramis Ramirez in the table above.) If there are outs already in the inning, he is even less likely to score from first, but just as likely to drive the ball. From a defensive point of view, you’re going to be hesitant to throw strikes, especially when any pitch could end up against the outfield wall or over it, which was not a worry with Theriot.
Of course, Theriot’s 48% number is with none out. His scoring percentage when he reached first with one out (and nobody on second) fell to 38%. This is still much higher than his 27% batting average, and still an excellent reason not to walk him.
Another reason to put a slow man on base is that he gets in the way of faster men, as funny as that sounds. A couple of years ago, former Cub manager Dusty Baker was ridiculed for saying that a lot of men on base was not necessarily a good thing, because they might clog the bases. As a disgruntled fan at the time I joined in the laughter. But consider the following two scenarios. Suppose Theriot and Albert Pujols, a majestic slugger but a slow runner, are on the same team. Scenario A, Pujols on second and Theriot on first. Scenario B, Theriot is on first with nobody on second or third. In which scenario is the team more likely to score one run as opposed to none?
The answer is obviously B, since Theriot will want to steal second, with an 87.5% likelihood of success if he tries, at which point the team will have a hare instead of a tortoise in scoring position.
Take this concept a couple of steps further. If clogging is real, then aren’t pitchers and their managers likely to factor it into their strategy? Doesn’t this imply that a base-on-balls can be a net plus or a net minus for either side, depending on whether the hitter is fast or slow and whether his bat is a cannon or a popgun? I point this out because in current baseball thinking, where OBP is king, a walk is always a plus for the offense. If a walk can be a minus, then a speed-challenged slugger with a high OBP may not be the offensive force that he is reckoned to be by those who admire his slugging and on-base percentages.
Take another hypothetical. Suppose you have a player who is twice the hitter that Pujols is, but even slower. Call this prodigy Ruthols. Who would you rather have come to bat with the bases empty in the eighth inning and your team down a run, Pujols or Ruthols? Since either is likely to be given a free pass to first, and neither is likely to be pinch-run for, the answer should be the weaker hitter but relatively swifter runner.
To state it as simply as possible, you can usually pitch around Bonds or Pujols. But how do you pitch around Rollins or Reyes or Hanley Ramirez?
When a pitcher intentionally (or semi-intentionally) walks a man, is he failing to “challenge the hitter,” as the phrase is, or is he simply issuing a different challenge? “Here’s first base, now go die on the bases.” Slugging and OBP, which had seemed independent of each other, begin to seem correlated. While plodding sluggers are given first base and challenged to run, which inflates their OBP, speedy non-sluggers are still challenged to hit, and OBP falls until it can’t fall any lower, somewhere in the neighborhood of BA. Sluggers who walk a lot also see a lot of hitter’s counts, 2-0 and 3-1, which doesn’t hurt their SLG.
If you doubt that OBP tracks inversely with speed, here is a list of OBP overachievers in the NL in 2007, the only hitters I found with a season’s worth of at-bats who had OBPs that were at least a hundred points higher than their BA.
This is also a list, obviously, of the hitters who draw the most walks. And it’s a list of the slowest first-baseman-left-fielder types in the league: sluggers who run like slugs.
Obviously walks lead to runs, no matter how slow you are. All of these hitters except Bonds and Burrell scored more than Theriot’s 80 runs. If we look at runs as a percentage of plate appearances, however, Theriot scores higher than Burrell and Helton and is only eight to twelve points behind Berkman, Howard and Pujols. Soriano, meanwhile, another low-OBP guy, is ahead of Pujols and trails only Fielder, Dunn and Bonds in RSP–that is, runs-scored percentage, or runs divided by plate appearances. See the table below.
A companion stat to RSP would be something like RDIP, or runs-driven-in percentage, your RBI divided by plate appearances. One little problem with these two run-related metrics side by side is that you’ve counted home runs twice. I solve this by splitting the baby in half: half of a run and half an RBI for each homer.
If you compute RSP this way–runs by plate appearances, with a home run counted as half a run–the chart gets really interesting, because Theriot and Soriano, the speed guys, rise to the top. The two players with the lowest OBP score the most runs per plate appearance. They are also the two skilled baserunners in the group. The formula for RSP here is (R-(HR/2))/PA.
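Here is a minimal sketch of the two run metrics with the half-homer adjustment. The RSP formula is the one given above; applying the same half-credit to RDIP as (RBI-(HR/2))/PA is my reading of "half an RBI for each homer," so treat it as an assumption. Both stat lines are invented and belong to no real player.

```python
def rsp(runs, home_runs, plate_appearances, split_homers=True):
    """Runs-scored percentage: (R - HR/2) / PA, or plain R / PA."""
    credit = runs - home_runs / 2 if split_homers else runs
    return credit / plate_appearances


def rdip(rbi, home_runs, plate_appearances, split_homers=True):
    """Runs-driven-in percentage: (RBI - HR/2) / PA (assumed form)."""
    credit = rbi - home_runs / 2 if split_homers else rbi
    return credit / plate_appearances


# Hypothetical speedster: 80 runs, 4 homers, 600 plate appearances.
# Hypothetical slugger: 100 runs, 40 homers, 650 plate appearances.
for label, runs, homers, pa in [("speedster", 80, 4, 600),
                                ("slugger", 100, 40, 650)]:
    raw = rsp(runs, homers, pa, split_homers=False)
    adjusted = rsp(runs, homers, pa)
    print(f"{label}: raw RSP {raw:.3f}, half-homer RSP {adjusted:.3f}")
```

With these made-up lines the slugger leads on raw runs per plate appearance, and the order flips once each home run counts as half a run.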
I realize that a home run does count, and should count, as a run for the hitter, but we were discussing the relevance of OBP–of walks, essentially–to run-scoring. The thirty-five home runs, on average, belted by these walking-and-slugging Clydesdales do not strengthen the correlation between walks and runs, since a walk never leads to a home run and only rarely leads to an RBI for the walker. Statistically, a walk can only contribute to (and therefore correlate with) a few totals: Walks, OBP, Stolen Bases, and Runs Minus Home Runs, to name the obvious ones. A walk could push a runner over, and he could score. But is anybody counting such events, or looking critically at OBP in terms of runs actually produced?
RSP is a simple formula with a clear meaning, and yet I’ve never seen it invoked. I had to make it up, actually. OBP is a simple formula with a clear meaning, and one sees it used everywhere. Isn’t this a little strange, since RSP is just the run-related version of OBP? Wouldn’t a runner crossing the plate be more significant than a batter attaining first?
The explanation, I think, is that a runner crossing the plate is messy, because there are so many variables. How did he get home, and to third before that, and to second, and to first? Where did he hit in the batting order? Who was behind him? Who are his teammates, his manager? Who deserves credit, and how much, for the run?
OBP is a relatively neutral stat, as is SLG. They are fairly pure, not all muddied up, like runs. Runs are the currency of baseball, but money is considered dirty, too, I suppose. It depends how much of a purist you are.
If runs are the currency of baseball, OBP and SLG are the currency of baseball statistics. This may be an example of the perfect being the enemy of the good. OBP and SLG try to remain uncontaminated by any reference to actual runs scored. If you think about it, OBP is just a run-scoring stat, and SLG a run-producing stat, without the runs. They may be pure and perfect in that sense, but as a measure of offensive performance, one of them, at least, OBP, is not particularly good.
I’ll go with runs and with the players who knock them in and with those who steal a base and take the extra base in pursuit of runs; and with their old-school managers. I’ve never heard a manager say, “Well, we lost, but at least we left twelve men on base. I’ll take that any day!”