As previously noted, GoodEnough For Me is the companion series to Smells Like Pujols which runs on The Extrapolator.
Last week, we got down and dirty with Bill James’s Similarity Scores formula. We tore at it and eventually got a reasonable series of numbers from it. Well, we’ve decided those numbers aren’t good enough for us. Reasonable as they may be, they don’t intentionally reflect the results we want. They merely coincidentally give us appropriate scores. Given the nature of the coincidence, we feel that if we use Similarity Scores, even with our adjustments, we won’t get appropriate results in a number of cases.
And so we move on and discover a statistic provided by Baseball Prospectus called “Stuff”, which measures the general dominance of a pitcher. Stuff takes rate stats for strikeouts, walks, and home runs, among others, and combines them into a single number. These are not your standard rate stats, though. They’ve been translated. For instance, in a year like 1968, in which hardly anyone got on base, the league averaged less than 1 hit per inning, 1 home run per game, etc. In prolific scoring years, the league averaged well over 9 hits/1 homer/3 walks, etc. per game. To put every season on the same footing, Baseball Prospectus translates all the necessary stats into a hypothetical season with the following averages:
- 1 hit/inning
- 3 walks/9 innings
- 1 home run/9 innings
- 6 strikeouts/9 innings
- ERA = 4.50
- 6 IP/game
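As a quick sanity check on what those averages mean, here is a minimal Python sketch that turns a raw counting line into the rate stats listed above. The `PitchingLine` class and `rate_stats` function are names made up for this illustration, and the sample line is the “MLB Ideal Avg” row from the table further down (30 games, 180 IP, 180 hits, 120 strikeouts, 60 walks, 20 home runs).

```python
from dataclasses import dataclass


@dataclass
class PitchingLine:
    """Raw counting stats for one pitcher-season (hypothetical container)."""
    games: int
    ip: float      # innings pitched, thirds written as .33 / .67
    hits: int
    so: int
    walks: int
    hr: int


def rate_stats(line: PitchingLine) -> dict:
    """Convert counting stats into the rate stats used in the list above."""
    return {
        "H/IP": line.hits / line.ip,
        "BB/9": 9 * line.walks / line.ip,
        "HR/9": 9 * line.hr / line.ip,
        "K/9": 9 * line.so / line.ip,
        "IP/G": line.ip / line.games,
    }


# The "MLB Ideal Avg" row from the table later in this post.
ideal = PitchingLine(games=30, ip=180, hits=180, so=120, walks=60, hr=20)

for name, value in rate_stats(ideal).items():
    print(f"{name}: {value:.2f}")
```

Run on that row, the sketch gives back 1 hit per inning, 3 walks, 1 homer and 6 strikeouts per 9 innings, and 6 innings per game, which is exactly the hypothetical season described above.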
Now, it takes a fair amount of time to translate stats. Time that, frankly, Mr. Thursday probably doesn’t have, unless he manages to find a source that organizes information in a way more conducive to his needs. So, we’ll be using, roughly, the Stuff formula, but we’re eliminating the translated stats.
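For the curious, here is a rough Python sketch of what a Stuff-style calculation on untranslated stats might look like. This post does not print the formula, so everything here is an assumption: the coefficients are recalled from the Baseball Prospectus glossary entry for Stuff, the function name `stuff_sketch` is made up, and the peripheral ERA term (PERA) is simply set equal to ERA when no other value is supplied.

```python
def stuff_sketch(games, ip, so, walks, hr, era, pera=None):
    """Rough Stuff-style rating from raw (untranslated) stats.

    Assumed form (not confirmed by this post):
        6*K/9 - 1.333*(ERA + PERA) - 3*BB/9 - 5*HR/9 - 3*max(6 - IP/G, 0)
    """
    if pera is None:
        pera = era  # assumption: no peripheral ERA available, reuse ERA
    k9 = 9 * so / ip
    bb9 = 9 * walks / ip
    hr9 = 9 * hr / ip
    # Pitchers averaging fewer than 6 innings per game are docked a bit.
    short_outing_penalty = 3 * max(6 - ip / games, 0)
    return 6 * k9 - 1.333 * (era + pera) - 3 * bb9 - 5 * hr9 - short_outing_penalty


# Lines taken from the table later in this post.
print(round(stuff_sketch(30, 180, 120, 60, 20, 4.50)))     # MLB Ideal Avg
print(round(stuff_sketch(25, 144.33, 101, 47, 18, 4.50)))  # MLB 2006 Avg
print(round(stuff_sketch(15, 62, 35, 17, 17, 8.42)))       # Josh Towers
```

Under these assumptions the sketch lands on 10, 11 and -17 for those three lines, which matches the ratings without the innings adjustment shown later in the post. It comes out several points lower than the post’s figures for Gooden and Santana, so the formula actually used here evidently differs somewhere, quite possibly in how hits feed into the ERA-like term.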
Now, the Stuff formula expresses dominance, but it doesn’t distinguish value over a full season from value over a single night. That is, Johan Santana can pitch better than almost everybody for a full year, but if a pitcher comes in and pitches great for one night (let’s say 7 IP, 11 strikeouts, 1 run scored on 1 home run, 2 walks, 2 hits), then our one-night wonder’s score will probably exceed Santana’s.
As a result, we’re calibrating the scores to account for Innings Pitched relative to our man, Doc Gooden.
If the information is readily available, we may do some (extraordinarily) rough translation regarding league ERAs, but that’ll have to wait until later.
Let’s see our new formula in action, with our version of Stuff listed, for the time being, as Pitcher Rating.
Player – Games – IP – Hits – SO – Walks – HR – ERA – Pitcher Rating
MLB 2006 Avg – 25 – 144.33 – 151 – 101 – 47 – 18 – 4.50 – 7
MLB Ideal Avg – 30 – 180 – 180 – 120 – 60 – 20 – 4.50 – 8
Doc Gooden – 31 – 218 – 161 – 276 – 73 – 7 – 2.60 – 55
Johan Santana – 34 – 233.67 – 186 – 245 – 47 – 24 – 2.77 – 45
Josh Towers – 15 – 62 – 93 – 35 – 17 – 17 – 8.42 – -5
“Thursday” – 6 – 39 – 35 – 42 – 22 – 5 – 3.50 – 5
Now, it’s possible that Santana would overtake Gooden if their stats were both normalized. At the very least, the scores would be significantly closer. We can also see that our “Ideal” average and the 2006 average are very similar. Thus, any score below 8 is below average, and any score above 8 is above average. This, however, doesn’t mean that Thursday is a below-average pitcher; he’s well above average. However, his limited service time means that he contributed less than a league-average pitcher pitching for a full season would.
To see each player’s value on a per-appearance basis, let’s see their scores if we take away the value that innings pitched gives them:
Player ~ Pitcher Rating w/o IP ~ Pitcher Rating with IP
MLB 2006 ~ 11 ~ 7
MLB Ideal ~ 10 ~ 8
Gooden ~ 55 ~ 55
Santana ~ 42 ~ 45
Towers ~ -17 ~ -5
Thursday ~ 29 ~ 5
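The post never spells out how the innings calibration works, but the numbers are consistent with a simple linear scaling: take the rating without innings (11, 10, 55, 42, -17, 29) and multiply it by the pitcher’s IP divided by Gooden’s 218. The Python sketch below checks that guess against the rows above. It is an inference from the published figures, not a statement of the method actually used, and the function name `calibrate_for_innings` is invented for the example.

```python
GOODEN_IP = 218.0  # Doc Gooden's innings from the first table


def calibrate_for_innings(rating_without_ip, ip, reference_ip=GOODEN_IP):
    """Scale a per-appearance rating by workload relative to Gooden.

    Assumption: a plain linear scaling, inferred from the tables in this post.
    """
    return rating_without_ip * ip / reference_ip


# (rating without IP, innings pitched) for each row, copied from the tables.
rows = {
    "MLB 2006": (11, 144.33),
    "MLB Ideal": (10, 180),
    "Gooden": (55, 218),
    "Santana": (42, 233.67),
    "Towers": (-17, 62),
    "Thursday": (29, 39),
}

for name, (rating, ip) in rows.items():
    print(f"{name}: {round(calibrate_for_innings(rating, ip))}")
```

Rounded to whole numbers, this reproduces 7, 8, 55, 45, -5 and 5, the same values listed as Pitcher Rating in the first table. It also shows why Santana gains a few points (he threw more innings than Gooden) while Thursday drops from 29 to 5 on just 39 innings.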
Once a week, throughout the course of the season, we will display a chart like the one above, giving our Stuff score for each player, while taking total IP into account, and again while ignoring total IP. Thus you’ll be able to see who’s pitching best vs. who’s just pitching most, and who really has a chance to compare to Gooden.
If we can get enough information regarding league averages, we will try to take them into account and translate the overall stats, but no guarantees on that yet.