Thursday, February 25, 2010

The "Decline" of Johan Santana

We here at Fonzie Forever are gearing up for yet another epic battle in our fantasy baseball league, and in our research we came across an article about Johan Santana. The article, by Tristan Cockcroft at ESPN, was entitled "Johan Santana's Decline Has Begun."

I wouldn't usually call a fantasy baseball article to everyone's attention, but Cockcroft's conclusions in the article are obviously founded in reality - he believes that Santana's decline in real life will affect his fantasy value. Cockcroft, who usually does great work, concludes:
It's at this point I get on my soapbox: Johan Santana's downside is a finish outside the top 100 players of 2010 and easily out of the top 10 starting pitchers. Not only that, but I'd even argue that it's almost twice as probable an outcome as a season that puts his name into serious Cy Young discussion.

That's not based entirely upon concerns with his ongoing rehabilitation from surgery, though that's absolutely a factor. It's more than that: Santana's declining strikeout rate, diminished velocity and the New York Mets' weakened offense are all warning signs that a collapse might be coming.
I agree with much of his premise - it is of course possible that ANY player's downside is a finish outside the Top 100 fantasy players - but his main idea, that Santana is in linear decline, is incorrect.

People commonly make the mistake of looking at a series of values and assuming the pattern will continue. It pleases us to draw conclusions, even faulty ones, from the information we have at hand. If you are given a series of numbers "2-4-6-8..." you are going to guess the next number is 10. Similarly, if you are given "2-4-8-16..." you will probably end up with 32.

For any kind of rational, mathematical system, this makes sense. However, in baseball, it is faulty to look at a set of numbers trending in one direction in a linear fashion and simply conclude that the trend will continue in that direction. Here is an example from Cockcroft:
Santana's swing-and-miss percentage -- usually a good indicator of a pitcher's strikeout potential -- has been in precipitous decline since joining the Mets, especially last season. Here are his numbers and rankings among qualified major league pitchers in the category since 2004:

2004: 66.3 percent contact rate on swings (1st)
2005: 74.2 percent (2nd)
2006: 74.8 percent (1st)
2007: 73.2 percent (2nd)
2008: 77.0 percent (10th)
2009: 78.4 percent (21st)
In a vacuum, you could easily say that his contact rate on swings will increase yet again this season. However, in real life, and baseball is a great example, there are countless variables that go into numbers such as "contact rate." Could Santana have been pitching to contact? Could the NL East be loaded disproportionately with contact hitters? Could Citi Field have an excellent batter's eye?
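To make those two readings concrete, here is a rough Python sketch (mine, not anything from Cockcroft's article) that takes the six contact rates quoted above and compares a straight-line extrapolation with simply repeating last year's number. The function names are made up for illustration, and a least-squares line is only one assumed way of "continuing the trend."

```python
# Contact rate on swings (percent), as quoted from Cockcroft's article above.
YEARS = [2004, 2005, 2006, 2007, 2008, 2009]
CONTACT = [66.3, 74.2, 74.8, 73.2, 77.0, 78.4]


def linear_forecast(xs, ys, x_next):
    """Fit an ordinary least-squares line through (xs, ys) and read it at x_next."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y + slope * (x_next - mean_x)


def persistence_forecast(ys):
    """Assume next season simply repeats the most recent one."""
    return ys[-1]


# Hypothetical comparison: same data, three different "projections" for 2010.
trend_all = linear_forecast(YEARS, CONTACT, 2010)
trend_no_2004 = linear_forecast(YEARS[1:], CONTACT[1:], 2010)
same_again = persistence_forecast(CONTACT)

print(f"Line through 2004-2009: {trend_all:.1f}")
print(f"Line through 2005-2009: {trend_no_2004:.1f}")
print(f"Same as 2009:           {same_again:.1f}")
```

Run as written, the three answers come out to roughly 80.7, 78.7 and 78.4 percent. Dropping the single 2004 season moves the "trend" projection by two full points, which says more about the fragility of a six-point line than about Santana.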

The point is, you cannot look at a simple numerical series and assume that it will continue. Doing this is how people get KILLED in the stock market. Sets of numbers -- like a contact rate, or a stock value -- are simply recordings of things that have happened in history. They carry little or no predictive value for the future.

If Santana's fastball was 90.5 MPH on average last season, what do we know about the future? Nothing. All we know is that it was 90.5 last year. Even if his fastball declined 1 MPH per season for three seasons in a row, the most reliable information that we have tells us that his fastball was 90.5 MPH last year. And since we are dealing with a human being - a flesh and bone creature which isn't beholden to any mathematical trend - my money is on his fastball being 90.5 MPH again.

Indeed, real-life evidence indicates that Santana has a chance to be better this year than last. Santana himself has claimed that he feels fantastic this year, and that at times last season he couldn't even bend his elbow. He is healthier now than before - and smart money would be on him succeeding greatly this year.

As far as fantasy goes, I actually agree with Cockcroft that there are enough reasons in fantasy to not draft Santana within the first 50 picks. But rumors of Santana's decline have been greatly exaggerated[1].

[1] As for his claim that he's the best pitcher in the NL East... well... I like Santana and all, but I hear there is a new guy in Philly that might have something to say about that, not to mention a guy in Miami. It'll be very interesting to see how that shakes out.


Anonymous said...

Seems like you are rationalizing quite a bit here. There is compelling statistical evidence that Santana's skills are on the decline. It is understood that trends such as his declining velocity and increasing contact rate are no guarantee of the future, but these individual trends taken in the aggregate offer, at the very least, compelling evidence of a pitcher in decline.

I would also throw in the fact he moved from the AL to the NL and did not show an improvement in performance, the fact that he is coming off arm surgery, and above all else, his age, as additional reasons to expect a decline in performance from Johan.

Your argument just wasn't worth making. No one can dispute that apparent statistical trends are often just illusions of normal statistical variation. You have offered no compelling counter-evidence that shows Johan is not declining; you have simply questioned the reliability of the evidence of his decline, and your reasons for dismissing it preclude us from using statistical trends to make any conclusions whatsoever.

Also, you may want to re-think the following statement:

"Sets of numbers -- like a contact rate, or a stock value -- are simply recordings of things that have happened in history. They carry little or no predictive value for the future."

Perhaps you could better specify what you meant by this because it seems outrageously broad to me. You also appear to contradict it later in the piece:

"the most reliable information that we have tells us that his fastball was 90.5 MPH last year. [...] my money is on his fastball being 90.5 MPH again."

How is your method of reasoning any different from the ESPN writer's? You are both taking a sample of data about the past and making a judgement on what is probable for the future. You use the phrase "my money", which seems equivalent to saying "odds are" or "it is probable", all of which refer to some judgement about a future--understood to be unknowable--using data from the past. So you have acknowledged that it is more likely than not that at least some sets of numbers have some predictive value for the future.

Yet, if what you said previously is true, and that 90.5 number had "little or no predictive value" on what this year's number will be, why would it be a good idea for you to "put your money" on it being the same?

Brian said...

Thanks for the comment!

I'm actually mobile at the moment, but when I get the chance later today or tonight, I will most definitely respond to this. It's a hard concept to articulate properly, so I know I fudged it a little.

Brian said...

Let me try and distill what I meant in the original blog.

1. It is incorrect (and simplistic) to simply look at a series of numbers trending in one direction and project them to continue upon that trend.

Johan Santana and his peripherals HAVE DECLINED in the last few years. However, it is a mistake to call him, as you do above, "a pitcher in decline."

2. Those statistics are simply records of what has happened before. His average FB may have been 93 in 2007, 92 in 2008, and 91 in 2009.

But there is just as much (or more) reason to believe that his FB for 2010 will be 91 as there is to believe it will be 90.

When a stock is valued at $5, and then $10, and then $15 - that is because at those precise moments in history, that was its value.

A stock value, just like a player's inherent ability to throw a baseball, or miss bats, or run a 100 meter dash, does not have its own inherent MOMENTUM.
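For anyone who wants to check the "no momentum" idea against actual numbers, here is a rough Python sketch using the contact rates quoted in the post. It replays each season from 2006 to 2009 and asks which guess would have landed closer: "same as last year" or "last year plus the same change again." The function names are made up for illustration, and four seasons from one pitcher is far too small a sample to prove anything either way.

```python
# Contact rate on swings (percent), 2004-2009, as quoted in the post.
CONTACT = [66.3, 74.2, 74.8, 73.2, 77.0, 78.4]


def no_momentum(history):
    """Guess that next season repeats the most recent season."""
    return history[-1]


def momentum(history):
    """Guess that the most recent year-over-year change happens again."""
    return history[-1] + (history[-1] - history[-2])


def mean_abs_error(forecaster, series, first_target=2):
    """Average miss when forecasting each season using only earlier seasons.

    first_target=2 starts at the third season (2006), the first one for
    which both forecasters have enough history to make a guess.
    """
    errors = []
    for i in range(first_target, len(series)):
        guess = forecaster(series[:i])
        errors.append(abs(guess - series[i]))
    return sum(errors) / len(errors)


err_flat = mean_abs_error(no_momentum, CONTACT)
err_momentum = mean_abs_error(momentum, CONTACT)

print(f"Average miss, same as last year:    {err_flat:.2f} points")
print(f"Average miss, trend keeps going:    {err_momentum:.2f} points")
```

On these six numbers the "same as last year" guess misses by about 1.85 points on average, and the "trend keeps going" guess misses by about 4.33. That fits the point above, though only as an illustration on a tiny sample.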

His WHIP has risen six years in a row - that is true. But there is NO evidence that gives us ANY reason to believe his WHIP will be higher in 2010 than it was in 2009.

3. Trends are nice to look at, but projecting them to continue is simplistic. It is an excuse to abandon reason - to skip the step of making an honest evaluation of the future.

If Santana declines next year, then everyone who projected him to have this continual linear decline will be correct. If he does not decline again, those people will be able to point to the previous mathematical evidence and claim that his improvement was unexpected.

That's silly. Like I said above, statistics are not real.

4. And this is most important, because I need to clarify and the poster was right in calling me out for this.

When I said that stats "carry little or no predictive value for the future," I misspoke. What I meant is that the TREND established by those numbers carries no predictive value for the future.

You are what you are, on an objective level, in real life. Santana in 2010 will be Santana in 2010 -- he will not be an echo established by Santana 2003-2009.

Projecting him to decline again is undoubtedly the safer bet because a) he is 31 and b) it is hard for ANY superstar to maintain his level of performance. But as far as I'm concerned, I'd prefer to use my knowledge and make a projection rather than proclaim him to be "in decline."