Ryan Theriot is popular with the fans in the stands and with his manager, Lou Piniella. When he won the Cubs’ shortstop job early last season he was dubbed “The Riot” and was popular with everybody, including the blogerati who influence serious baseball discussion on the web and in the papers. Now that he has a season’s worth of statistics to his name, however, there is a lot of murmuring and grumbling about him. He takes a bit of a beating on the web every day for not measuring up to the two standards that are applied to every hitter, OBP (on-base percentage) and SLG (slugging percentage). A low slugging percentage is not a big issue for Theriot, since he’s not that kind of hitter; but the statistically minded crowd expects him to compensate with a high on-base number. His .326 in that department is about 25 points too low.
OBP is like batting average (BA) except that it also counts walks and the plate appearances that produce walks. In other words, a walk is as good as a hit. Some players receive a lot of walks, and they always have fans and supporters.
I disagree with this stress on the importance of OBP, especially for a speed player like Theriot. OBP is a game that beefy sluggers play, not singles hitters who can’t buy a walk for the simple reason that opposing pitchers and managers are not stupid. The OBP of a base-stealing singles hitter will always be pegged pretty closely to his BA. MVP voters in the National League must have realized this when they gave the award in 2007 to a base-stealing shortstop with an OBP below .350.
So I’m inclined to mount a defense for Ryan Theriot. I won’t be defending Theriot in the way that a Chicago writer tried to a couple of weeks ago, by citing Theriot’s penchant for getting his uniform dirty along with other so-called intangibles. I’ll try never to use the word “scrappy.” If Theriot is valuable, there are numbers that can illustrate his worth. The numbers may simply not be in the public domain.
Let me give an example of baserunning numbers that the statistical community has designated “not in stock, no delivery date.” Suppose I wanted to know the number of times Theriot scored–and his scoring percentage–after reaching first with none out. This would be an interesting stat to me, because in the same way that a power hitter’s job is to drive in runs, a skilled baserunner’s job is to score after reaching base early in an inning. His job is not merely to steal bases; that’s a means to an end.
The place to look for a stat like this is baseball-reference.com. If there’s an offensive number, they have it. I can go there, pull up Theriot’s page, and look at “splits” for 2007–a large set of game situations to which the usual set of offensive numbers can be applied.
Sure enough, I find a “split” for Theriot’s performance when “leading off an inning.” A couple of columns are blank, though: runs and stolen bases. I can tell everything that Theriot did leading off an inning except what he did once he got on base.
Is this a conspiracy against baserunning numbers? No, it’s just the nature of baseball offensive stats in general. Everything is geared toward a single at bat. In this one at bat, “A” did this or that; and “B” scored, ran into a force at second or stole a base. But if a player singles with none out and scores on a groundout two outs and three at bats later, there’s nothing in the database to correlate the leadoff single with the scoring.
I am convinced that the poor state of baserunning stats is the reason that baserunning is undervalued in baseball today, and why statistically minded fans scratch their heads every day and wonder what the manager is thinking, putting a fast guy with a low OBP at the top of the lineup.
You have to put the right data into a database before you can pull it out, so I did exactly that. Drawing on my experience with software and databases, I used readily available game logs to build a database of the Cubs’ 2007 season at the plate and on the bases, one base at a time.
Armed with this new data, I am proposing five speed-related metrics for judging how well a player helps his team score runs. I use Cub data from 2007 to illustrate what those metrics might look like when applied. Here are five scenarios that I used to separate the runners from the sloggers:
1) runner on first, second unoccupied, batter hits a single. Does the runner get to third or does he stop at second?
2) runner on first, batter hits a double. Does the runner reach home or stop at third?
3) runner on second, batter hits a single. Does the runner score or stop at third?
4) runner on third with less than two outs. Does the runner score in the inning?
5) runner on first with none out and no one on second. Does the runner score in the inning?
I especially like number five because it’s where a player’s speed and savvy are the most highly leveraged. A speed player has this insight into one of baseball’s mysterious secrets: if you’re on first with none out, you can score even if the three hitters behind you make outs.
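To make scenario five concrete, here is a minimal Python sketch of how one might count it from play-by-play data. The `Play` record and its fields are a hypothetical schema invented for this illustration, not the layout of the database described above, and the sample inning is made up.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Play:
    """One plate appearance in a half-inning (hypothetical schema)."""
    batter: str
    outs_after: int          # outs recorded once the play is over
    batter_on_first: bool    # batter ended the play standing on first
    second_empty: bool       # second base unoccupied after the play
    scored: List[str] = field(default_factory=list)  # runners who scored on this play


def scenario_five(half_inning: List[Play], runner: str) -> Tuple[int, int]:
    """Return (opportunities, successes) for `runner` in one half-inning."""
    for i, play in enumerate(half_inning):
        if (play.batter == runner and play.batter_on_first
                and play.outs_after == 0 and play.second_empty):
            # Opportunity found: look ahead for the runner crossing the plate.
            came_home = any(runner in later.scored for later in half_inning[i + 1:])
            return 1, int(came_home)
    return 0, 0


# Made-up inning: leadoff single, two groundouts that advance the runner,
# then a single that brings him home.
inning = [
    Play("Theriot", 0, True, True),
    Play("Soriano", 1, False, False),   # runner now on second
    Play("Lee", 2, False, True),        # runner now on third
    Play("Ramirez", 2, True, True, scored=["Theriot"]),
]
print(scenario_five(inning, "Theriot"))  # (1, 1)
```

The point of the look-ahead is that the leadoff single and the eventual run are tied together, even though they happen several plate appearances apart.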
Number five is different from the other four in that it overlaps with them. Ideally it’s nice to have a set of numbers where you don’t count the same event twice. But tell that to the guys who lay out the current popular hitting stats. When you hit a home run, practically every other offensive number gets larger except doubles and triples.
The number columns are given in sets of threes, where the first number indicates success, the second, opportunities, and the third, percent success against opportunities. So for example, column 1 is number of times going from first to third on a single; column 2 is number of times on first when the batter singled; and column 3 is #1 (x100) divided by #2. The final three columns on the right are total successes, total opportunities, and cumulative rate of success.
Cubs 2007 baserunning (S = successes, Opp = opportunities):

          1st to 3rd      score from 1st  score from 2nd  score from 3rd  score from 1st  total
          on a single     on a double     on a single     < 2 out         none out
Player      S  Opp     %    S  Opp     %    S  Opp     %    S  Opp     %    S  Opp     %    S  Opp     %
Theriot     8   21   38%    7   11   63%   11   16   68%   20   35   57%   22   45   48%   68  128   53%
Soriano     8   13   61%    5    5  100%   11   16   68%   17   36   47%   23   53   43%   64  123   52%
DeRosa      5   18   27%    5   10   50%   13   16   81%   11   30   36%   17   41   41%   51  115   44%
Pie         2    6   33%    2    3   66%    2    6   33%   10   22   45%    4   10   40%   20   47   42%
Lee         6   34   17%    4   12   33%   21   24   87%   10   29   34%   16   47   34%   57  146   39%
Jones       2   19   10%    3    6   50%    8   10   80%    8   20   40%   14   33   42%   35   88   39%
Izturis     2    6   33%    1    2   50%    5    7   71%    2   10   20%    9   29   31%   19   54   35%
Murton      1    7   14%    3    3  100%    5   10   50%    6   20   30%    0   10    0%   15   50   30%
Ramirez     3   21   14%    5    9   55%    8   14   57%    6   27   22%   11   40   27%   33  111   29%
Barrett     2    9   22%    1    4   25%    1    3   33%    3    7   42%    6   22   27%   13   45   28%
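As a quick check on how each set of three columns fits together, the sketch below recomputes Theriot's row from his success and opportunity counts. It assumes the percentages were rounded down to whole numbers, which is how the published figures appear to behave; the helper name `pct` is my own.

```python
# (successes, opportunities) for Theriot in scenarios 1-5, from the table
theriot = [(8, 21), (7, 11), (11, 16), (20, 35), (22, 45)]


def pct(successes, opportunities):
    """Whole-number percentage, rounded down (an assumption about the table)."""
    return successes * 100 // opportunities


rates = [pct(s, o) for s, o in theriot]
total_s = sum(s for s, _ in theriot)
total_o = sum(o for _, o in theriot)

print(rates)                                     # [38, 63, 68, 57, 48]
print(total_s, total_o, pct(total_s, total_o))   # 68 128 53
```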


Mostly this table shows that fast runners do more with their baserunning opportunities, which we knew already, but it’s good to see the numbers in black and white. A couple of the numbers are worth mulling over, and could change the way someone thinks about the game. A low-OBP hitter (Theriot or Soriano) with speed gets to third with less than two out–here I’m referring to the middle column under “score from 3rd”–more often than a high-OBP guy like DeRosa. As every Cub fan has heard by now, Theriot’s OBP was .326, Soriano’s .337, DeRosa’s .371. If Theriot is on third with less than two out, he scores 57% of the time. In baseball this is good odds. It suggests that stealing second is not just a gimmick but part of a strategy to get to third with one out after you have led off an inning with a single. By whatever means, Theriot scored 48% of the time after reaching first with nobody out and no one on second. Let’s ponder this number a moment.
If Theriot has less than a 30% chance of getting a hit in the first place, but once on first, an almost 50% chance of scoring, you’re not going to throw him too many pitches out of the strike zone if you can help it. No matter how good his eye, his OBP will never exceed his BA by very much. It’s not his fault; he couldn’t help it, he was born fast. Now take a slugger who doesn’t run well. His chance of getting a hit–let’s say he has a batting average in the .270 to .295 range–may be about the same as his score-from-first-nobody-out ratio. (See Aramis Ramirez in the table above.) If there are outs already in the inning, he is even less likely to score from first, but just as likely to drive the ball. From a defensive point of view, you’re going to be hesitant to throw strikes, especially when any pitch could end up against the outfield wall or over it, which was not a worry with Theriot.
Of course, Theriot’s 48% number is with none out. His scoring percentage when he reached first with one out (and nobody on second) fell to 38%. This is still much higher than his 27% batting average, and still an excellent reason not to walk him.
Another reason to put a slow man on base is that he gets in the way of faster men, as funny as that sounds. A couple of years ago, former Cub manager Dusty Baker was ridiculed for saying that a lot of men on base was not necessarily a good thing, because they might clog the bases. As a disgruntled fan at the time I joined in the laughter. But consider the following two scenarios. Suppose Theriot and Albert Pujols, a majestic slugger but a slow runner, are on the same team. Scenario A, Pujols on second and Theriot on first. Scenario B, Theriot is on first with nobody on second or third. In which scenario is the team more likely to score one run as opposed to none?
The answer is obviously B, since Theriot will want to steal second, with an 87.5% likelihood of success if he tries, at which point the team will have a hare instead of a tortoise in scoring position.
Take this concept a couple of steps further. If clogging is real, then aren’t pitchers and their managers likely to factor it into their strategy? Doesn’t this imply that a base on balls can be a net plus or a net minus for both sides, depending on whether the hitter is fast or slow and whether his bat is a cannon or a popgun? I point this out because in current baseball thinking where OBP is king, a walk is always a plus for the offense. If a walk can be a minus, then a speed-challenged slugger with a high OBP may not be the offensive force that he is reckoned by those who admire his slugging and on-base percentages.
Take another hypothetical. Suppose you have a player who is twice the hitter that Pujols is, but even slower. Call this prodigy Ruthols. Who would you rather have come to bat with the bases empty in the eighth inning and your team down a run, Pujols or Ruthols? Since either is likely to be given a free pass to first, and neither is likely to be pinch-run for, the answer should be the weaker hitter but relatively swifter runner.
To state it as simply as possible, you can usually pitch around Bonds or Pujols. But how do you pitch around Rollins or Reyes or Hanley Ramirez?
When a pitcher intentionally (or semi-intentionally) walks a man, is he failing to “challenge the hitter,” as the phrase is, or is he simply issuing a different challenge? “Here’s first base, now go die on the bases.” Slugging and OBP, which had seemed independent of each other, begin to seem correlated. While plodding sluggers are given first base and challenged to run, which inflates their OBP, speedy non-sluggers are still challenged to hit, and OBP falls until it can’t fall any lower, somewhere in the neighborhood of BA. Sluggers who walk a lot also see a lot of hitter’s counts, 2-0 and 3-1, which doesn’t hurt their SLG.
If you doubt that OBP tracks inversely with speed, here is a list of OBP overachievers in the NL in 2007, the only hitters I found with a season’s worth of at-bats who had OBPs that were at least a hundred points higher than their BA.

Player    OBP   BA     HR  RBI    R   BB   SB
Bonds     .480  .276   28   66   75  132    5
Helton    .434  .320   17   91   86  116    0
Pujols    .429  .327   32  103   99   99    2
Burrell   .400  .256   30   97   77  114    0
Fielder   .395  .288   50  119  109   90    2
Howard    .392  .268   47  136   94  107    1
Dunn      .386  .264   40  106  101  101    9
Berkman   .386  .278   34  102   95   94    7
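The hundred-point cutoff is easy to express in code. The sketch below is only an illustration: it keeps OBP and BA as whole thousandths to avoid floating-point surprises, and it adds Theriot and Soriano (whose lines appear elsewhere in this post) to show which side of the cutoff the speed players fall on.

```python
# (OBP, BA) in thousandths, copied from the figures in this post
lines = {
    "Bonds": (480, 276),
    "Pujols": (429, 327),
    "Fielder": (395, 288),
    "Soriano": (337, 299),
    "Theriot": (326, 266),
}

# Gap between on-base percentage and batting average, in points
gaps = {name: obp - ba for name, (obp, ba) in lines.items()}
overachievers = [name for name, gap in gaps.items() if gap >= 100]

print(gaps)           # {'Bonds': 204, 'Pujols': 102, 'Fielder': 107, 'Soriano': 38, 'Theriot': 60}
print(overachievers)  # ['Bonds', 'Pujols', 'Fielder']
```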


This is also a list, obviously, of the hitters who draw the most walks. And it’s a list of the slowest first-baseman-left-fielder types in the league: sluggers who run like slugs.
Obviously walks lead to runs, no matter how slow you are. All of these hitters except Bonds and Burrell scored more than Theriot’s 80 runs. If we look at runs as a percentage of plate appearances, however, Theriot scores higher than Burrell and Helton and is only eight to twelve points behind Berkman, Howard and Pujols. Soriano, meanwhile, another low-OBP guy, is ahead of Pujols and trails only Fielder, Dunn and Bonds in RSP–that is, runs-scored percentage, or runs divided by plate appearances. See the table below.

Player    OBP   BA     HR  RBI    R   BB   SB    PA  RSP
Fielder   .395  .288   50  119  109   90    2   681  .160
Dunn      .386  .264   40  106  101  101    9   632  .160
Bonds     .480  .276   28   66   75  132    5   477  .157
Soriano   .337  .299   33   97   97   31   19   617  .157
Pujols    .429  .327   32  103   99   99    2   679  .146
Howard    .392  .268   47  136   94  107    1   648  .145
Berkman   .386  .278   34  102   95   94    7   668  .142
Theriot   .326  .266    3   45   80   49   28   597  .134
Burrell   .400  .256   30   97   77  114    0   598  .129
Helton    .434  .320   17   91   86  116    0   682  .126
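For anyone who wants to reproduce the RSP column, here is a short sketch using a handful of the rows above. The function name `rsp` is simply my label for runs divided by plate appearances.

```python
# (runs, plate appearances) taken from the table above
players = {
    "Fielder": (109, 681),
    "Soriano": (97, 617),
    "Pujols": (99, 679),
    "Theriot": (80, 597),
    "Helton": (86, 682),
}


def rsp(runs, plate_appearances):
    """Runs-scored percentage: runs divided by plate appearances."""
    return runs / plate_appearances


# Sort from highest RSP to lowest and print to three decimals
ranked = sorted(players, key=lambda name: rsp(*players[name]), reverse=True)
for name in ranked:
    print(f"{name:8s} {rsp(*players[name]):.3f}")
```

Run as written, this orders the five players Fielder, Soriano, Pujols, Theriot, Helton, matching their relative order in the table.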


A companion stat to RSP would be something like RDIP, or runs-driven-in percentage, your RBI divided by plate appearances. One little problem with these two run-related metrics side by side is that you’ve counted home runs twice. I solve this by splitting the baby in half: half of a run and half an RBI for each homer.
If you compute RSP this way–runs by plate appearances, with a home run counted as half a run–the chart gets really interesting, because Theriot and Soriano, the speed guys, rise to the top. The two players with the lowest OBP score the most runs per plate appearance. They are also the two skilled baserunners in the group. The formula for RSP here is (R - (HR/2))/PA.

Player    OBP   BA     HR  RBI    R   BB   SB    PA  RSP
Theriot   .326  .266    3   45   80   49   28   597  .131
Soriano   .337  .299   33   97   97   31   19   617  .130
Dunn      .386  .264   40  106  101  101    9   632  .128
Bonds     .480  .276   28   66   75  132    5   477  .128
Fielder   .395  .288   50  119  109   90    2   681  .123
Pujols    .429  .327   32  103   99   99    2   679  .122
Berkman   .386  .278   34  102   95   94    7   668  .117
Helton    .434  .320   17   91   86  116    0   682  .114
Howard    .392  .268   47  136   94  107    1   648  .109
Burrell   .400  .256   30   97   77  114    0   598  .104
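The half-a-homer adjustment can be sketched the same way. The function names below are mine, and the RDIP version assumes that "half an RBI for each homer" means subtracting HR/2 from the RBI total, which is my reading of the split rather than a formula given above.

```python
def adjusted_rsp(runs, home_runs, plate_appearances):
    """Runs per plate appearance, counting each home run as half a run."""
    return (runs - home_runs / 2) / plate_appearances


def adjusted_rdip(rbi, home_runs, plate_appearances):
    """RBI per plate appearance, subtracting half an RBI per home run (assumed reading)."""
    return (rbi - home_runs / 2) / plate_appearances


# Theriot: 80 R, 3 HR, 45 RBI, 597 PA; Howard: 94 R, 47 HR, 136 RBI, 648 PA
print(f"{adjusted_rsp(80, 3, 597):.3f}")   # 0.131
print(f"{adjusted_rsp(94, 47, 648):.3f}")  # 0.109
print(adjusted_rdip(136, 47, 648) > adjusted_rdip(45, 3, 597))  # True
```

The last line is a reminder that the slugger still comes out well ahead on the run-producing side; the adjustment only changes the picture on the run-scoring side.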


I realize that a home run does count, and should count, as a run for the hitter, but we were discussing the relevance of OBP–of walks, essentially–to run-scoring. The thirty-five home runs, on average, belted by these walking-and-slugging Clydesdales do not strengthen the correlation between walks and runs, since a walk never leads to a home run and only rarely leads to an RBI for the walker. Statistically, a walk can only contribute to (and therefore correlate with) a few totals: Walks, OBP, Stolen Bases, and Runs Minus Home Runs, to name the obvious ones. A walk could push a runner over, and he could score. But is anybody counting such events, or looking critically at OBP in terms of runs actually produced?
RSP is a simple formula with a clear meaning, and yet I’ve never seen it invoked. I had to make it up, actually. OBP is a simple formula with a clear meaning, and one sees it used everywhere. Isn’t this a little strange, since RSP is just the run-related version of OBP? Wouldn’t a runner crossing the plate be more significant than a batter attaining first?
The explanation, I think, is that a runner crossing the plate is messy, because there are so many variables. How did he get home, and to third before that, and to second, and to first? Where did he hit in the batting order? Who was behind him? Who are his teammates, his manager? Who deserves credit, and how much, for the run?
OBP is a relatively neutral stat, as is SLG. They are fairly pure, not all muddied up, like runs. Runs are the currency of baseball, but money is considered dirty, too, I suppose. It depends how much of a purist you are.
If runs are the currency of baseball, OBP and SLG are the currency of baseball statistics. This may be an example of the perfect being the enemy of the good. OBP and SLG tried to remain uncontaminated by any reference to actual runs scored. If you think about it, OBP is just a run-scoring stat, and SLG a run-producing stat, without the runs. They may be pure and perfect in that sense, but as a measure of offensive performance, one of them, at least, OBP, is not particularly good.
I’ll go with runs and with the players who knock them in and with those who steal a base and take the extra base in pursuit of runs; and with their old-school managers. I’ve never heard a manager say, “Well, we lost, but at least we left twelve men on base. I’ll take that any day!”
Comments 1
Ryan Theriot is awesome. I say he is the best player ever. But he has 6 homeruns not 3. SO that even makes him better.
Posted 19 Jun 2009 at 1:13 pm