Over the last two seasons, I have noticed a surprisingly large pocket of Atlanta Braves fans who like to question Freddie Freeman’s performance, jumping on him for any stretch of poor play during the season. Part of this might come from overly high expectations (in which case, simply readjust your expectations from “MVP” to the more reasonable “consistently well above average”). Others, however, really get caught up in small pockets of poor performance. A great example occurred during the four-day span from August 2 through August 5. A surprisingly large number of posts and comments targeted Freeman, who is easily the Braves’ best performer this season, giving him a hard time about his bucketful of strikeouts and poor plate appearances during those four days.
I ask all fans, but particularly these fans, to consider the following and grow in appreciation for the cyclical nature of baseball:
- 4 day drought of despair (18 PAs): .063/.167/.125 (-17 wRC+) 11K 1BB
- 4 days before drought (18 PAs): .375/.444/.438 (146 wRC+) 0K 2BB
- 4 days after drought (19 PAs): .286/.474/.571 (183 wRC+) 2K 5BB
Combining these 12 games, which is still a pretty small sample but does start to expand out a bit, we can see that Freeman has been a slightly above average hitter over the past 12 days despite one of his worst four-game stretches of the season:
- 12 days combined (55PAs): .239/.364/.370 (105 wRC+) 13K 8BB
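For anyone who wants to check the combined wRC+, here is a minimal sketch. It assumes wRC+ can be combined across stretches by weighting each one by its plate appearances, which should be close over a span this short; the function name is my own, and the numbers are the ones listed above.

```python
# PA-weighted average of wRC+ across the three four-day stretches.
# Assumption: wRC+ combines linearly by plate appearances over a short span.
stretches = [
    ("before drought", 18, 146),
    ("drought", 18, -17),
    ("after drought", 19, 183),
]

def pa_weighted_wrc_plus(rows):
    """Return (total PAs, PA-weighted wRC+) for (label, PA, wRC+) rows."""
    total_pa = sum(pa for _, pa, _ in rows)
    weighted_sum = sum(pa * wrc for _, pa, wrc in rows)
    return total_pa, weighted_sum / total_pa

total_pa, combined = pa_weighted_wrc_plus(stretches)
print(total_pa, round(combined))  # 55 105
```

The drought drags the number down, but it only counts for 18 of the 55 plate appearances, so the two good stretches outweigh it.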
What Does Regression Toward the Mean Mean?
Now, even within the last few games, Freeman had one game that was really great and another that was pretty poor, but that is baseball, isn’t it? It is practically a law in baseball for players to have a tendency to regress – or move – toward their average performance over time. (Although there are obviously exceptions, which is part of the fun of it.)
Regression toward the mean is a statistical term that basically means a specific data point might be well off the overall mean (in this case Freeman’s four-day average is well off his season stat line or career stat line), but over time, the points should move closer toward the overall mean that has lots of data points in it (which in this case constitutes his 487 season plate appearances or 3584 career plate appearances).
For example, looking at the overall mean of all recorded coin flips of all time, there is a pretty much 50/50 chance that flipping a coin will lead to a result of heads or tails. So the average over 10 coin flips should be .500, indicating a 50% chance of heads. However, if you’ve ever flipped coins, you know that it will rarely be exactly .500 across 10 flips. It could be four of 10 land heads (.400) or nine of 10 land heads (.900). But that doesn’t mean the true mean is off by .100 or .400. Instead it means the number of flips in each sample is small, and if you add that 10-flip sample to nine other 10-flip samples (for a total of 100 flips), the average should be closer to the expected mean of all coin flips of all time. Maybe adding the 10 samples of 10 flips together results in 55 coins landing heads, or a .550 average. That still isn’t at the expected mean of .500, but with the larger sample, it has moved (or regressed) toward the expected mean of .500. Hence, we have regression toward the mean.
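If you would rather not flip coins all afternoon, the idea can be sketched with a quick simulation. This is only an illustration under a simple assumption (a perfectly fair coin, modeled with Python's standard `random` module); the function names and the seed are my own choices.

```python
import random

def heads_rate(n_flips, rng):
    """Fraction of heads in n_flips flips of a fair coin."""
    return sum(rng.random() < 0.5 for _ in range(n_flips)) / n_flips

def mean_abs_miss(n_flips, trials, rng):
    """Average distance between a sample's heads rate and the true .500."""
    misses = (abs(heads_rate(n_flips, rng) - 0.5) for _ in range(trials))
    return sum(misses) / trials

rng = random.Random(2016)  # arbitrary fixed seed so reruns give the same numbers
for n in (10, 100, 1000):
    # Larger samples should miss .500 by less, on average.
    print(n, round(mean_abs_miss(n, 2000, rng), 3))
```

The typical miss shrinks as the number of flips per sample grows, which is the same reason an 18 plate appearance slump tells you far less than a full season.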
Regression Toward the Mean Applied to Baseball
So what does that mean for baseball? It means that when Freddie is in a pit of hitting despair for 20 plate appearances, he’s likely not going to stay there very long. Eventually the ups and downs of 20 plate appearance samples added together and averaged will trend back toward his career averages (remembering that those averages are impacted by sustained up and down streaks). Likewise, when Freddie is going crazy [Note: as in the four games since I originally wrote this article, August 10 – 13, where he’s slashing a ridiculous .636/.737/1.909 with a 461 wRC+], he’s probably not going to stay on those highs very long, but unfortunately (for Braves fans like me, anyway) he will start regressing toward his mean then, too. The point here is that you can regress toward the mean from either direction; it’s not always a bad thing even though we tend to think of the word “regress” in a negative way.
In a vacuum with robots, the expectation would be that any player would regress to his mean over a certain number of plate appearances. However, as you might have noticed, baseball is not played by robots in a vacuum; it is played by humans in a variable-rich environment. So when applying the statistical concept to baseball, it is not always perfect; it can be impacted by a number of other factors, meaning a change from the mean might indicate the player’s performance is changing, and therefore the mean will be pulled up or down with a sustained change.
A few obvious variables that could impact the mean are changes in swing mechanics, injury, or age, which is why age regressions are often considered in projections of a player’s performance. A variable impacting the regression doesn’t mean it’s all random. The variables can often be determined by watching the player and his stats, and comparing him to what happens, on average, to other players impacted by the same variables. For instance, while individual players vary, there are clear indicators from the mean of all players of how aging impacts performance, so there is an age regression as well. Some people who study these things even specify these aging trends a little further by considering trends in aging by position (catchers tend to age a bit faster, for instance) or type of player. (This Talking Chop article references a Capital Avenue Club (RIP) article that basically nailed the issues with Uggla when the extension was announced. It did so by comparing him to the performance of similar types of players at his ages during the contract.)
It is important to remember these regressions don’t just apply to players like Freddie Freeman, who goes through cycles that bring him down but likely mean he will bounce back. Another good example is a player whose first impression is very good, even though the means indicate that’s not his typical level of performance (cough, Jose Constanza, cough, cough). Fans tend to quickly become enamored with a player who is lighting the world on fire and not consider the bigger picture of the smaller sample mean in context of the larger mean or other variables such as age and position.
A good current example for the Atlanta Braves is Anthony Recker. There is a lot of uncertainty about who Atlanta’s catcher of the future will be, with no catching prospects anywhere close (which is one of the main reasons I was so frustrated with how the front office botched – in my opinion – the Christian Bethancourt situation). Within the larger conversation of the Braves’ catching future, I keep hearing the recurring thought that Anthony Recker is at least a viable short-term option for next season. He is, after all, hitting .389/.477/.528 (170 wRC+) this season.
However, what we have to remember in this discussion of regression to the mean is that Recker has put up this amazing stat line in only 46 plate appearances. That is a very small sample on which to decide Recker should be the man going forward, even if for a season.
With Recker, we do have a reasonable career sample of about a season’s worth of plate appearances, or at least a “catcher’s season.” In 557 career plate appearances, he’s slashed .200/.278/.348 (76 wRC+). So his mean, even when including the en fuego 46 plate appearances, is well below what he’s doing now.
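To get a rough feel for how much those 46 plate appearances prop up his career line, here is a back-of-the-envelope sketch. It assumes wRC+ can be split by plate appearances, which ignores year-to-year league and park adjustments, so treat the result as an approximation and not an official figure; the variable names are my own.

```python
# Back out an approximate wRC+ for Recker's career before the current hot streak.
# Assumption: career wRC+ is roughly the PA-weighted average of its parts.
career_pa, career_wrc_plus = 557, 76
hot_pa, hot_wrc_plus = 46, 170

earlier_pa = career_pa - hot_pa
earlier_wrc_plus = (career_pa * career_wrc_plus - hot_pa * hot_wrc_plus) / earlier_pa

print(earlier_pa, round(earlier_wrc_plus, 1))  # 511 67.5
```

In other words, under that assumption the roughly 500 plate appearances before this season look even weaker than the career line suggests.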
Could he get better and sustain a higher level of production? Well, there’s always a chance. But this is where other variables come in. Recker turns 33 this month and plays catcher, so that alone indicates he’s unlikely to show sustained improvement all of a sudden. Consider also that his .483 batting average on balls in play (BABIP) is about 100 points higher than some of the best BABIPs of all time and 211 points above his career .272 BABIP. What we are seeing is a flash in the pan flashing at the right time for a team that is in desperate need of a catcher to finish off the season. He is not a person we should entrust the catching duties to with any expectations of even decent offensive output.
My goal here is not to recker expectations. (I thought that was punny.) Enjoy it while it lasts. My goal instead is to get people to zoom out and consider how small changes over the course of 18, or even 180, plate appearances don’t necessarily indicate a player is fundamentally different than who they’ve been in the past. In fact, I would say this quip generally holds true: The best indicator of future performance is past performance, especially the most recent seasons. However, we also have to consider that many other factors might be influencing a change in performance and could lead to sustained trends up or down in a player’s overall performance.
The TL;DR Takeaway
The takeaways boil down to this:
- Players tend to regress toward their means with more plate appearances, so be cautious of small sample sizes in either direction.
- Additional variables could impact the means shifting up or down over time, and those need to be considered along with the actual change in performance when projecting a player.
That is why Freeman struggling for four games as a 26-year-old first baseman with a sustained history of well above average performance doesn’t bother me right now, while at the same time trusting the catching duties to a 33-year-old catcher with 500+ plate appearances of abysmal performance would be crazy, even if he is – at this moment in time – outhitting our best offensive player.
Do you have thoughts? Let me know here or join the discussion on the Outfield Fly Rule Facebook Page.