With more than 1,200 games in a Major League season, no one sees the performance of every player every inning or even most players most innings. Nor is the average fan trained to evaluate players by eyewitness observation like a professional scout. Statistics are therefore a helpful tool to compare performances among players. A quickly calculated, easily understood statistic that attempts to combine several elements of a player’s performance is particularly welcome in this pursuit.
Fellow columnist Breedlove recently examined whether OPS — on-base percentage plus slugging average — is adequate for this purpose. He found it lacking for at least three reasons: it fails to distinguish individual characteristics, it ignores defense, and it omits baserunning. None of these assertions is false. But they imply that OPS is something more than it actually is: an all-around measure of a player’s contributions at the plate.
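For readers who want the arithmetic spelled out, here is a minimal sketch of how OPS is assembled from a batting line. It uses the standard definitions of on-base percentage and slugging average; the function names and the sample season are invented for illustration and do not come from Breedlove's column or from this one.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Times on base divided by the plate appearances that count toward OBP."""
    return (hits + walks + hit_by_pitch) / (at_bats + walks + hit_by_pitch + sac_flies)


def slugging_average(singles, doubles, triples, home_runs, at_bats):
    """Total bases divided by at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def ops(singles, doubles, triples, home_runs, walks, hit_by_pitch, at_bats, sac_flies):
    """On-base percentage plus slugging average."""
    hits = singles + doubles + triples + home_runs
    obp = on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies)
    slg = slugging_average(singles, doubles, triples, home_runs, at_bats)
    return obp + slg


# Hypothetical season (made-up numbers): 150 hits in 500 at-bats.
sample_ops = ops(singles=90, doubles=30, triples=5, home_runs=25,
                 walks=60, hit_by_pitch=5, at_bats=500, sac_flies=5)
print(f"{sample_ops:.3f}")
```

Note that nothing in the calculation refers to fielding or baserunning, which is the point at issue.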
What Do The Numbers Say?
Is there any empirical value to OPS in this regard? The quantitative evidence indicates there is. One way to answer this question is to see how well OPS correlates with generating runs. Correlation is the extent to which two variables vary together. Does X rise as Y rises, does X rise as Y falls, or is there no discernible relationship? Correlation is measured on a scale of -1.00 to 1.00; the closer to 1.00, the stronger the positive correlation, and the closer to -1.00, the stronger the negative one.
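As a rough illustration of what a correlation coefficient is, the sketch below computes the ordinary Pearson correlation by hand. This is an assumption about the kind of correlation meant here; the column does not name a specific formula. The five team seasons are made-up numbers, not the 1955-and-later sample.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences, from -1.00 to 1.00."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length, at least 2 long")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two variables move together around their means.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own (zero if a sequence is constant).
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical team seasons (illustrative values only).
team_ops = [0.700, 0.720, 0.750, 0.780, 0.800]
runs_per_game = [4.0, 4.3, 4.6, 5.0, 5.2]
print(round(pearson(team_ops, runs_per_game), 2))
```

A result near 1.00 means the teams with higher OPS are also the teams scoring more runs per game.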
For all teams since 1955, a sample of 1,092 team seasons, OPS correlates with runs per game as well as any offensive measure:
Statistic            Correlation
--------------------------------
OPS                      .96
Slugging Average         .93
On-Base Percentage       .90
Batting Average          .83
Isolated Power           .83
If you put both numbers on the same scale, OPS deviates from runs per game by an average of 3 percent. For 59 percent of the teams, OPS varies from runs per game by less than 3 percent. For 83 percent of the teams, OPS varies from runs per game by less than 5 percent. And only 1 percent of the teams have an OPS that deviates from runs per game by more than 10 percent. On average, the divergence represents 20 runs per season or .13 runs per game.
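The column does not say how the two numbers were put on the same scale. One plausible way, offered here only as an assumption, is to fit a straight line that converts OPS into runs per game and then average the percentage gaps between the converted figure and the actual one. The function names and sample values below are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_abs_pct_deviation(ops_values, runs_per_game):
    """Average gap between OPS-implied and actual runs per game, as a fraction."""
    slope, intercept = fit_line(ops_values, runs_per_game)
    implied = [intercept + slope * value for value in ops_values]
    gaps = [abs(p - actual) / actual for p, actual in zip(implied, runs_per_game)]
    return sum(gaps) / len(gaps)


# Hypothetical team seasons (illustrative values only).
team_ops = [0.690, 0.720, 0.750, 0.780, 0.810]
runs_per_game = [3.9, 4.4, 4.5, 5.0, 5.3]
print(f"{mean_abs_pct_deviation(team_ops, runs_per_game):.1%}")
```

Multiplying a per-game gap by the number of games in a season gives the per-season figure.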
Whether that degree of accuracy is sufficient is a matter of opinion. But OPS is by no means only weakly or moderately associated with run production. Another point of debate is whether the close relationship between OPS and runs per game for teams applies to individual players. Since the events that increase individual OPS — hits, walks, total bases — also increase team OPS, there is reason to believe they similarly contribute to run production.
Weaknesses Or Red Herrings?
What does won-loss record tell you about a pitcher’s individual characteristics? Does it tell you if he’s a power pitcher? A finesse pitcher? A knuckleballer? A spitballer? How about batting average? Is the player a line-drive hitter? A slugger? A slap hitter? Does he bunt well for base hits? Does he hit ’em where they ain’t? Neither of these measures, on its own, answers these questions. That doesn’t make them useless, in part because these aren’t the questions they set out to answer.
One virtue of OPS is that it takes different kinds of contributions at the plate and attempts to express them in a combined measure. Hitting .300 is a good thing. So is swatting 40 home runs or drawing 100 walks. How do these fit together? OPS tries to figure this out. A batter’s goal is not to be a prolific lead-off man or an efficient clean-up hitter. His goal is to produce runs for his team. And the most important things a hitter does to produce runs — reach base and hit for power — are included in OPS.
OPS doesn’t evaluate defensive performance. Nor is it intended to, any more than earned-run average is designed to assess a pitcher’s contributions with the bat or the glove. OPS measures no less defense, however, than batting average, home runs, or stolen bases. You’d never hear someone say the problem with runs or RBI is that they don’t tell you how good a fielder a player is. Why should OPS, likewise an offensive measure, be any different?
The Need For Speed
Baserunning is the most obvious offensive element ignored by OPS. How significant is this omission? Breedlove suggests that things like advancing on groundball and flyball outs, going from first to third on a single, and scoring from second on a single and from first on a double are statistically significant factors in run production. Unfortunately, data on these events is difficult to obtain.
Assume, for lack of a better approach, that stolen bases are an acceptable if imperfect proxy for the events cited by Breedlove. If baserunning were a crucial missing link in OPS, stolen bases would presumably correlate in some way with the divergence between OPS and runs per game. In other words, if a team’s OPS didn’t match its runs per game, baserunning would tend to be a reason why.
Yet for all teams since 1955, stolen bases — expressed both as a percentage of times on base and as a percentage of stolen-base attempts — show only faint correlation with the gap between OPS and runs per game. Remember, the scale is -1.00 to 1.00:
Statistic                       Correlation
-------------------------------------------
Stolen Bases per Time on Base       .09
Stolen Base Success Rate            .03
While a low correlation exists for all teams as a group, it doesn’t necessarily apply to those at the extremes. For the 25 best-running teams since 1955, with an average of 232 stolen bases vs. 85 caught stealing, OPS underestimates their scoring by an average of 22 runs per season. For the 25 worst-running teams since 1955, with an average of 24 stolen bases vs. 26 caught stealing, OPS overestimates their scoring by an average of 15 runs per season.
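To put those stolen-base totals in terms of success rate, defined earlier as stolen bases as a percentage of attempts, here is a small sketch. The function name is invented for illustration; the inputs are the group averages quoted above.

```python
def stolen_base_success_rate(stolen_bases, caught_stealing):
    """Stolen bases as a share of all attempts (successes plus times caught)."""
    attempts = stolen_bases + caught_stealing
    return stolen_bases / attempts


# Average totals for the 25 best-running and 25 worst-running teams cited above.
best_running = stolen_base_success_rate(232, 85)
worst_running = stolen_base_success_rate(24, 26)
print(f"best: {best_running:.1%}, worst: {worst_running:.1%}")
```

The best-running group succeeds on roughly three of every four attempts; the worst-running group is caught more often than it succeeds.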
Breedlove uses another measure to examine this issue, runs per time on base. This approach has appeal: a good baserunner should make better use of his times on base than a poor baserunner. Here are the top 10 and bottom 10 active major-leaguers in this category since 1994 with a minimum of 1,000 times on base. For these lists home runs are excluded from runs and from times on base:
Top 10               R     OB   R/OB
------------------------------------
Kenny Lofton       666   1550   .430
Tom Goodwin        483   1127   .429
Chuck Knoblauch    683   1673   .408
Steve Finley       525   1295   .405
Ray Durham         532   1313   .405
Derek Jeter        527   1319   .400
Omar Vizquel       577   1465   .394
Craig Biggio       674   1716   .393
Johnny Damon       439   1119   .392
Alex Rodriguez     627   1118   .392

Bottom 10            R     OB   R/OB
------------------------------------
Carlos Delgado     303   1124   .270
Todd Zeile         357   1327   .269
Robin Ventura      331   1241   .267
Wally Joyner       292   1124   .260
Eric Karros        322   1255   .257
Fred McGriff       365   1435   .254
Jeff Conine        279   1099   .254
Mo Vaughn          382   1509   .253
Harold Baines      244   1025   .238
Mark McGwire       274   1201   .228
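The R/OB column in these lists is simply runs divided by times on base, with home runs taken out of both. A minimal sketch follows; the function name and the `home_runs` parameter are illustrative, and the R and OB figures in the lists are assumed to have home runs already removed, as described above.

```python
def runs_per_time_on_base(runs, times_on_base, home_runs=0):
    """Runs scored per time on base, excluding home runs from both totals."""
    return (runs - home_runs) / (times_on_base - home_runs)


# Figures from the lists above (home runs already excluded, so home_runs=0).
lofton = runs_per_time_on_base(666, 1550)
mcgwire = runs_per_time_on_base(274, 1201)
print(f"Lofton {lofton:.3f}, McGwire {mcgwire:.3f}")
```

Excluding home runs matters because a home run scores the batter regardless of how well he runs.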
No doubt the top 10 players are distinctly better baserunners than the bottom 10. And for all players since 1994 with at least 1,000 times on base, stolen bases per time on base has a .79 correlation with runs per time on base: a notable positive relationship. But other factors must be considered as well. The top 10 players above all bat chiefly at the top of the line-up. Below is the career runs per time on base for these 10 players as a group by position in the batting order:
Position          R     OB   R/OB
---------------------------------
Batting #1     3554   8710   .408
Batting #2     2115   5355   .395
Batting #3      337    923   .365
Batting #4       56    128   .438
Batting #5      128    390   .328
Batting #6       73    232   .315
Batting #7      137    382   .359
Batting #8      126    457   .276
Batting #9      365   1031   .354
The pattern here suggests that even for the best baserunners, line-up position is a factor in runs per time on base. So is the hitting quality of a runner’s teammates. Here’s how a number of statistics correlate with runs per time on base for all teams since 1955:
Statistic                       Correlation
-------------------------------------------
Stolen Bases per Time on Base       .18
Stolen Base Success Rate            .35
Batting Average                     .78
On-Base Percentage                  .73
Slugging Average                    .76
While the stolen base categories share a positive correlation with runs per time on base, even better correlated are a team’s batting average, on-base percentage, and slugging average.
The correlation with batting average and slugging average is fairly intuitive. You might wonder, though, what on-base percentage has to do with scoring runners once they’re already on base. The answer is that on-base percentage measures the frequency of outs as well as how often someone reaches base. Teams with high on-base percentages make outs less often and give their runners a better chance to score.
Thus, although there’s little doubt that baserunning is reflected in runs per time on base, other factors, such as rank in the batting order and the batting average, on-base percentage, and slugging average of teammates, affect this measure, too. But the numbers generally support what Breedlove asserts: baserunning is a noteworthy aspect of run production that OPS ignores.
That’s Not All
Breedlove failed to mention four additional shortcomings of OPS: it doesn’t consider situational hitting, playing time, park effects, or league offensive levels.
Given equal numbers of times on base and total bases, a player who hits better with runners on base is likely to generate more runs than one who doesn’t. This is not to say that players have clutch ability, rather that they should receive credit for situational performance.
Like any rate statistic, OPS doesn’t reflect gross output. Thus, it doesn’t make for easy comparisons among players with vastly different playing time. And OPS, like any unadjusted statistic, can’t distinguish between Coors Field and Dodger Stadium, or between the 1930 National League, with an .808 league OPS, and the 1968 American League, with a .637 league OPS.
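To see how much league context can matter, the sketch below divides a player's OPS by his league's OPS. This plain ratio is an assumption chosen for simplicity; it is not a park-adjusted figure and not any official adjusted statistic. The .800 hitter is hypothetical, while the two league figures are the ones cited above.

```python
def relative_ops(player_ops, league_ops):
    """Player OPS as a multiple of league OPS (1.00 means league average)."""
    return player_ops / league_ops


# The same hypothetical .800 OPS in two very different leagues.
in_1930_nl = relative_ops(0.800, 0.808)
in_1968_al = relative_ops(0.800, 0.637)
print(f"1930 NL: {in_1930_nl:.2f}, 1968 AL: {in_1968_al:.2f}")
```

The same raw OPS is slightly below average in one league and about a quarter better than average in the other.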
So while OPS is a handy indicator of overall performance at the plate, it doesn’t include a lot of things. The next time you feel inclined to compare player A to player B solely using OPS, do Breedlove and yourself a favor and, after using OPS as a starting point, bear in mind the following criteria as well:
1) individual characteristics;
2) defense;
3) baserunning;
4) situational performance;
5) playing time;
6) park effects; and
7) league offensive levels.
OPS isn’t wholly unreliable: it tracks runs per game within an average of 3 percent. But, like any statistic, it has its swings and misses.