As I explained in my August introduction post, I’m going to attempt to calculate FanGraphs WAR accurately for Chris Taylor’s 2017 season, in my own spreadsheet. To do this, I expect to make heavy use of FanGraphs’ documentation. I also have to give a big thanks to FanGraphs owner Dave Appelman as well as my sabermetric sage Matt Swartz. Here’s FanGraphs’ overview of WAR For Position Players. The basic formula is this:
WAR = (Batting Runs + Base Running Runs + Fielding Runs + Positional Adjustment + League Adjustment + Replacement Runs) / (Runs Per Win)
This doesn’t look too daunting. Add up the three different ways a position player can create value, make adjustments for position and league, and put it on the correct scale. OK, let’s calculate batting runs!
Show of hands, who knows anything about batting runs? Offhand, I couldn’t tell you how batting runs are tabulated, or what benchmarks for success are. So batting runs is a stat unto itself that requires a full exploration. Here’s the batting runs formula:
Batting Runs = wRAA + (lgR/PA – (PF*lgR/PA))*PA + (lgR/PA – (AL or NL non-pitcher wRC/PA))*PA
Huh. OK, when I look at that formula, the only acronym I’m familiar with is PA, which is plate appearances. We can all agree that we know what a plate appearance is.
I do not, however, know what wRAA is. FanGraphs says it stands for Weighted Runs Above Average. And, well, it has its own formula:
wRAA = ((wOBA – lgwOBA)/wOBA Scale) * PA
It seems that to calculate wRAA, we first need to calculate wOBA. Now, before I lose you in this sea of acronyms, wOBA is actually useful and fairly easy to understand. It stands for weighted on-base average. According to FanGraphs, wOBA “is a rate statistic that attempts to credit a hitter for the value of each outcome (single, double, etc) rather than treating all hits or times on base equally.” Intuitively, I find wOBA to be a simple and useful offensive statistic. At MLBTR, we often cite a batter’s “triple slash” line. Chris Taylor’s triple slash in 2017 was .288 (batting average)/.354 (on-base percentage)/.496 (slugging percentage). These days, people worry a lot less about batting average, since OBP counts a player’s hits, walks, and hit-by-pitches. But OBP fails to give a complete picture, since a walk is valued the same as a home run. That’s why we have slugging percentage, right? SLG is just total bases divided by at-bats, but it wrongly suggests a home run is worth four times as much as a single or twice as much as a double.
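For anyone who wants to check the triple slash line, here is a minimal sketch that rebuilds it from the same counting stats that appear in the wOBA calculation later in this post. The variable names are just illustrative. Note how slugging percentage hard-codes weights of 1, 2, 3, and 4 for the four hit types, which is exactly the assumption wOBA replaces.

```python
# Chris Taylor's 2017 counting stats, as used in the wOBA calculation in this post
ab, bb, hbp, sf = 514, 50, 3, 1
singles, doubles, triples, hr = 88, 34, 5, 21

hits = singles + doubles + triples + hr
# SLG implicitly weights hits 1/2/3/4 by bases gained
total_bases = 1 * singles + 2 * doubles + 3 * triples + 4 * hr

avg = hits / ab
obp = (hits + bb + hbp) / (ab + bb + hbp + sf)
slg = total_bases / ab

print(f"{avg:.3f}/{obp:.3f}/{slg:.3f}")  # prints 0.288/0.354/0.496
```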
The purpose of that aside was to illustrate that wOBA is indeed a strong foundation for the batting runs component of WAR. Here’s the wOBA formula for 2017:
wOBA = (0.693×uBB + 0.723×HBP + 0.877×1B + 1.232×2B + 1.552×3B + 1.980×HR) / (AB + BB – IBB + SF + HBP)
In this formula, there are six things a batter can do to create value: draw an unintentional walk, get hit by a pitch, or hit a single, double, triple, or home run. As I learned from Appelman, and by just playing around with some example numbers, the batter also gets credit for intentional walks, by virtue of those being subtracted in the denominator.
You can see there is a weight assigned to each possibility, like 0.877 for a single or 1.980 for a home run. These weights change a little bit each year, and can be found here at FanGraphs. The concept of linear weights is explained well in this FanGraphs article. There are 24 different base-out states, such as “runner on second with one out” or “bases loaded, nobody out.” FanGraphs explains, “In order to calculate the run expectancy for that base-out state, we need to find all instances of that base-out state from the entire season (or set of seasons) and find the total number of runs scored from the time that base-out state occurred until the end of the innings in which they occurred. Then we divide by the total number of instances to get the average.” So if you know that the bases are loaded with nobody out in the year 2017, you should expect 2.32 runs to score. 50 years prior, you would have expected 2.13 runs to score in that situation.
We have 24 different run expectancy numbers, and each plate appearance moves the team from one box to another. The difference between the two, plus any runs that scored on the play, is the run value of that plate appearance. Averaging those run values for each type of event gives us the linear weights for each of the six batting outcomes. This concept dates back well before FanGraphs and is worth exploring.
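The averaging FanGraphs describes, and the step from run expectancy to the value of a single plate appearance, can be sketched roughly like this. The records are toy values made up for illustration, not real 2017 data, and the function and state names are hypothetical. The sketch follows the usual convention of adding any runs that scored on the play to the change in run expectancy.

```python
from collections import defaultdict

def run_expectancy(records):
    """Average runs scored from each base-out state to the end of the inning.

    records: iterable of (state, runs_to_end_of_inning) pairs.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for state, runs in records:
        totals[state] += runs
        counts[state] += 1
    return {state: totals[state] / counts[state] for state in totals}

def run_value(re_table, before, after, runs_scored):
    """Change in run expectancy across one plate appearance, plus runs that scored."""
    return re_table[after] - re_table[before] + runs_scored

# Toy records, invented for illustration (NOT real 2017 data)
records = [
    ("empty, 0 out", 0), ("empty, 0 out", 1),
    ("empty, 0 out", 0), ("empty, 0 out", 1),
    ("runner on 1st, 0 out", 1), ("runner on 1st, 0 out", 0),
    ("runner on 1st, 0 out", 2), ("runner on 1st, 0 out", 1),
]

re_table = run_expectancy(records)
# A leadoff single moves the team from "empty, 0 out" to "runner on 1st, 0 out"
single_value = run_value(re_table, "empty, 0 out", "runner on 1st, 0 out", runs_scored=0)
print(re_table, single_value)
```

With real play-by-play data, averaging `run_value` over every single hit in a season would approximate the run value of a single, before the scaling that turns it into a wOBA weight.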
One thing to note, from Neil Weinberg of FanGraphs: “the inventors of wOBA decided that it would probably be best to scale it to something familiar to make it easier to understand,” so they made the “aesthetic choice” to scale wOBA to on-base percentage. As we’ll see later in the wRAA calculation, this scaling choice has to be undone to get us back on a run scale. That seems needlessly convoluted, but I’m probably the only one trying to do this by hand.
In theory, one could create a version of wOBA that doesn’t just include these six positive batting outcomes, but rather every batting outcome. To quote Weinberg, “If you wanted to, you could build wOBA with more nuanced stats like fly ball outs, ground outs, strikeouts, etc; it would just get more complicated without much added value.” Well, hold up. First off, we shouldn’t care about making wOBA more complicated, since (this exercise aside), no one is computing it by hand. In fact, in a different FanGraphs wOBA explainer, the author says, “OBP or SLG might be easier to calculate with pencil and paper, but wOBA is extremely easy to find and use on our site, meaning any computational costs of moving to wOBA are minuscule.” I agree with that point, and since WAR is already a very complicated stat, why not incorporate the nuances of all batting events into it by using the most advanced wOBA possible? For example, take two players who have the exact same number of unintentional walks, HBPs, singles, doubles, triples, and home runs. Say those players each also made 400 outs in a season, but one player made every out by strikeout and the other made every out by flyball. Wouldn’t the flyball guy be a more valuable hitter?
In response to that question, Dave Appelman pointed me to this link, a seven-year-old Hardball Times article in which JT Jordan re-calculated wOBA with strikeouts included for batters. Jordan concluded, “The difference is incredibly small. So really, it’s not a big deal to ignore strikeouts when using a context-neutral method like linear weights and wOBA. But it can be done. When all is said and done, we’re talking about a run or two of difference.” Swartz remarked, “I have never gotten a beat on when sabermetricians deem it okay to call something ‘close enough.’” Bottom line: wOBA could be made a tiny bit more accurate, but the keepers of the stat must feel that there is little added value in incorporating other batting outcomes.
Ultimately, a batter’s wOBA is a strong foundation for calculating his offensive value. Let’s calculate that number for Chris Taylor. If we want to cheat, we can just pull up his FanGraphs page to see that his wOBA was .361 in 2017. We don’t want to cheat, though.
wOBA = (0.693×50 + 0.723×3 + 0.877×88 + 1.232×34 + 1.552×5 + 1.980×21) / (514 + 50 – 0 + 1 + 3)
wOBA = 0.3613
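The same arithmetic can be checked in a few lines of code instead of a spreadsheet. This is only a sketch with illustrative variable names; the weights and Taylor's counts are the ones plugged into the formula above.

```python
# 2017 weights and Chris Taylor's 2017 counts, as used in the formula above
weights = {"uBB": 0.693, "HBP": 0.723, "1B": 0.877, "2B": 1.232, "3B": 1.552, "HR": 1.980}
taylor = {"uBB": 50, "HBP": 3, "1B": 88, "2B": 34, "3B": 5, "HR": 21}
ab, bb, ibb, sf, hbp = 514, 50, 0, 1, 3

numerator = sum(weights[k] * taylor[k] for k in weights)
denominator = ab + bb - ibb + sf + hbp
woba = numerator / denominator

print(f"{numerator:.3f} / {denominator} = {woba:.4f}")  # prints 205.223 / 568 = 0.3613
```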
Now, we need to turn wOBA into wRAA. wRAA is a counting stat that “measures the number of offensive runs a player contributes to their team compared to the average player.” Here’s the formula again:
wRAA = ((wOBA – lgwOBA)/wOBA Scale) * PA
I feel pretty good about my understanding of wOBA, which required only the number of unintentional walks, hit-by-pitches, singles, doubles, triples, and home runs Taylor hit, as well as the linear weights of each of those events in 2017. I can understand the league average wOBA as well, which FanGraphs shows was .321 in 2017. Keep in mind that lgwOBA does not refer to the National and American Leagues; it refers to all of MLB for that year.
Our next step, wRAA, isn’t that hard to comprehend either. It uses the aforementioned linear weights but presents its results in a cumulative manner, unlike wOBA. wRAA is also scaled such that zero is the league average, so it can be compared across different seasons. Finally, wRAA uses a number called the “wOBA scale” to undo the “scale to OBP” choice that is baked into wOBA. I know from Taylor’s player page that his wRAA in 2017 was 19.3.
wRAA = ((0.3613 – .321)/1.185) * 568
wRAA = 19.317
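As a quick check on how much the rounding of wOBA matters here, this sketch runs the wRAA formula twice: once with the rounded 0.3613 and once with the unrounded ratio from the wOBA step (205.223 / 568). The function name is illustrative, and the league wOBA of .321 is itself a rounded figure from FanGraphs, so the last digits shouldn't be taken too seriously.

```python
def wraa(woba, lg_woba, woba_scale, pa):
    """Weighted Runs Above Average: undo the OBP scaling, then multiply by PA."""
    return (woba - lg_woba) / woba_scale * pa

rounded = wraa(0.3613, 0.321, 1.185, 568)           # wOBA rounded to four places
unrounded = wraa(205.223 / 568, 0.321, 1.185, 568)  # wOBA carried at full precision

print(f"{rounded:.3f} vs {unrounded:.3f}")
```

Both versions land on 19.3 at one decimal place, matching Taylor's player page.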
So far, we’ve found our way to the correct “weighted runs above average” amount for Chris Taylor. It’s worth pausing to appreciate that nothing overly complicated or debatable has been done so far: Taylor received the correct amount of credit (linear weights) for each of the positive batting outcomes (single, double, etc.) and that was scaled against the league’s offensive production since the value of a home run was very different in 2017 vs. 1917. We are most of the way to Batting Runs, which along with fielding and baserunning is one of the three pillars of WAR. What we need to do next is adjust these batting runs for Taylor’s ballpark and league. Here’s the batting runs formula again:
Batting Runs = wRAA + (lgR/PA – (PF*lgR/PA))*PA + (lgR/PA – (AL or NL non-pitcher wRC/PA))*PA
I believe the number we’re aiming for, based on Taylor’s FanGraphs player page, is 18.7, which suggests minimal adjustments were needed to his 19.3 wRAA.
- wRAA = 19.317
- lgR = all the runs scored in all of baseball in 2017 = 22,582
- lgPA = all the plate appearances in all of baseball in 2017 = 185,295 (this is the PA in lgR/PA; the PA that multiplies each adjustment is Taylor’s own 568)
- lgR/PA = 0.1219
At this point, we need to pause and talk about park factors. Neil Weinberg wrote an informative beginner’s guide to park factors here. Intuitively, it’s logical to make an adjustment for the player’s home stadium. In the case of Taylor, Dodger Stadium suppressed overall run scoring by about 8% from 2013-17, so we apply half of that under the assumption that he played half his games at home. Taylor actually did play half of his games at home in 2017, but even if he didn’t, the park factor would be applied as if he did. Additionally, as Weinberg explains in his article, “parks don’t affect every player evenly and our park factors sort of assume that they do.” If for some reason Dodger Stadium actually improves Taylor’s hitting (due to handedness, batted ball profile, weather, or any number of things) he’d still get a boost in this WAR calculation to account for Dodger Stadium suppressing offense on average. An assumption is also being made that the player played his road games in “a pretty average setting,” which is not necessarily true.
Weinberg wrote his park factor article in January 2015, noting, “We want to know how parks influence each moment of the game, but we simply don’t have granular enough data to really get there. A ball hit at 15 degrees directly over the shortstop while traveling at 93 miles per hour will travel how far and land where? That’s basically what we want to know for every possible angle and velocity, but we just don’t have the data and we don’t have it for every type of weather in every park.” In 2018, we do have most of that data, due to Statcast. I asked Appelman about potential efforts to reboot the park factor component in WAR using Statcast data, and he replied, “I have not personally done much work on park factors. They are in my opinion, very annoying. I just don’t really like dealing with them and they make everything much more complicated. However, they’re obviously good to have.” Swartz was of the same mind, explaining that park factors are “very noisy” and while you could possibly improve them with Statcast or weather data, the precision gained would be minimal. Imperfect as park factors are, Swartz told me it would be “disastrous” to leave them out.
- PF = 2013-17 park factor for Dodger Stadium = 0.955055 (Good luck finding a park factor this precise. FanGraphs’ Guts page just gives you .96 for the Dodgers. Were I not able to speak directly to Appelman, I wouldn’t know how to get the more precise figure, nor would I know that 2013-17 is the current time period used for the listed five-year park factor).
In this example, the park adjustment adds a meaningful number of batting runs to account for Taylor playing half his games in Dodger Stadium: about 3, on top of the 19 we started with.
Now, we need to talk about one more mini-calculation, for which a custom FanGraphs league-level, non-pitcher leaderboard is needed.
- NL non-pitcher wRC = 11,282
- NL non-pitcher plate appearances = 87,753
Batting Runs = 19.317 + (.1219 – .1164)*568 + (.1219 – .1286)*568
Batting Runs = 19.317 + 3.111 + (-3.803) = 18.625
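To keep the rounding out of the intermediate steps, the whole batting runs formula can be run on the unrounded inputs listed above. This is a sketch of the published formula with illustrative function and parameter names, not FanGraphs' actual code.

```python
def batting_runs(wraa, pa, lg_r, lg_pa, park_factor, lg_wrc, lg_np_pa):
    """Batting Runs = wRAA + park adjustment + league adjustment.

    lg_wrc and lg_np_pa are the non-pitcher wRC and PA for the player's league (NL here).
    """
    lg_r_per_pa = lg_r / lg_pa
    park_adj = (lg_r_per_pa - park_factor * lg_r_per_pa) * pa
    league_adj = (lg_r_per_pa - lg_wrc / lg_np_pa) * pa
    return wraa + park_adj + league_adj, park_adj, league_adj

total, park_adj, league_adj = batting_runs(
    wraa=19.317, pa=568,             # Taylor's wRAA and plate appearances
    lg_r=22_582, lg_pa=185_295,      # all of MLB, 2017
    park_factor=0.955055,            # 2013-17 Dodger Stadium factor
    lg_wrc=11_282, lg_np_pa=87_753,  # NL non-pitchers, 2017
)

print(f"park {park_adj:+.3f}, league {league_adj:+.3f}, total {total:.3f}")
```

The park term comes out to roughly +3.1 runs and the league term to roughly -3.8, for a total a little above 18.6.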
That last part of the formula, where we ended up subtracting 3.8 batting runs? That comes from this part:
(lgR/PA – (AL or NL non-pitcher wRC/PA))*PA
I asked Swartz exactly what is being adjusted there, and why it exists. He answered, “What it appears to be doing is some sort of league adjustment (AL vs. NL), but I’m not sure it really makes sense.” He added, “It’s really a very specific approach, so I have to imagine whoever put that together had something in mind. And it needs to be some sort of league adjustment, even if the adjustment is only about the run environment of the league.” I’m left without a clear understanding of the purpose of this part of the batting runs formula.
In the end, I didn’t quite arrive at the 18.7 listed under the Batting section on Taylor’s FanGraphs page. While I used unrounded numbers wherever possible, I believe rounding is the reason I’m slightly off. Getting this close to the correct batting runs number was arduous. Perhaps that’s because WAR isn’t meant to be calculated by hand, but attempting to do so increased my understanding of batting runs well beyond just looking at the formula. It’s easy to read an explanation and think you understand, even when you don’t. I hope MLBTR readers will learn and ask questions along with me. We’ll tackle the baserunning component of FanGraphs WAR next time.