Posted by stevesommer05 on November 14, 2009
After a solid 4 months or so here I’ll be moving on, changing WordPress addresses if you will. Erik Manning has restarted Play a Hard Nine and has graciously allowed me to ride his coattails over there. Starting now I’ll be posting there full time (or at least as close to full time as I get). I’ll be doing exactly the same kind of stuff there as I have been doing here, so update your RSS feeds accordingly (all 2 of you).
Thanks for visiting, and say hi over at PAH9 when you get the chance.
Posted in Uncategorized | Leave a Comment »
Posted by stevesommer05 on November 10, 2009
In the last post cautioning about the position change for Mike Cameron, I neglected to mention that while he’ll lose 1 WAR for the position change, he’ll likely recapture some of it in runs saved on defense, as he’d be baselined against other LFers and not CFers. I believe that with the way the position adjustments are constructed it would mathematically be a wash (i.e. he’d be a +20 LFer instead of a +10 CFer). Do I completely believe he’ll be a +20 LFer (ranking at or near the top of the league)? Eh, probably not, but the 1 WAR decrease I talked about is probably a little overstated.
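The "wash" arithmetic can be sketched in a few lines. This is only an illustration: the +2.5 and -7.5 run adjustments are the ones quoted in the earlier Cameron post, while the 10 runs per win conversion and the function name are my assumptions.

```python
# Positional adjustments in runs per full season, as quoted in the earlier post.
POSITION_ADJ = {"CF": 2.5, "LF": -7.5}
RUNS_PER_WIN = 10.0  # assumption: a common rule of thumb, not stated in the post


def fielding_value(defense_runs, position):
    """Defensive runs versus positional peers plus the positional adjustment."""
    return defense_runs + POSITION_ADJ[position]


as_cf = fielding_value(10.0, "CF")  # a +10 center fielder
as_lf = fielding_value(20.0, "LF")  # the same player graded as a +20 left fielder
print(as_cf, as_lf, (as_cf - as_lf) / RUNS_PER_WIN)
```

If he only grades out as, say, a +15 LFer, the same function shows a 5 run (roughly half a win) loss rather than a full win.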
Posted in Uncategorized | Leave a Comment »
Posted by stevesommer05 on November 8, 2009
For those that have been beating the Mike Cameron drum as a left field option, remember that when you quote his WAR value, it’ll decrease by one when you consider the position adjustment from CF to LF (from ~+2.5 runs to ~-7.5 runs). Clearly some of that would be mitigated if he plays CF vs. LHP, and I’m not saying he wouldn’t be an ok option at the right price, just be wary if you think he can post ~4 WAR moving forward.
I’m still working up the similarity score stuff. I have RHP pretty well done and just have to put it into a coherent post. I’m also in the process of finishing up the regressed UZR stuff for the rest of the positions. Whenever I get them both done I’ll also make the spreadsheets available in Google Docs or something like that.
Posted in Uncategorized | Leave a Comment »
Posted by stevesommer05 on November 2, 2009
If you caught my previews of the first two NLDS games you saw that I did a couple of heat graphs (aka heat maps). I had what I thought was a decent idea for another version of that graph. The attempt is to capture a pitch’s effectiveness by movement. Without further ado, we have the whiff rate on Carp’s curveball.
The vertical axis is vertical movement in inches, and the horizontal axis is horizontal movement. The picture is from the catcher’s perspective. The basic takeaway is that the more straight down the pitch broke, the higher the percentage of whiffs he got. The chart doesn’t break out batter handedness, and I also removed some peripheral data points where the small sample size skewed the chart. Anyway, what do y’all think? Is there some value here, or is it simply a pretty picture for the sake of pretty pictures?
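For anyone who wants to build a similar chart, here is a rough sketch of the binning behind it. The 2-inch cell size, the minimum sample of 10, and the choice to divide whiffs by all pitches in a cell (rather than by swings) are my assumptions, not necessarily what went into the picture above.

```python
import math


def whiff_grid(pitches, cell=2.0, min_n=10):
    """Whiff rate per movement cell.

    pitches: iterable of (horizontal_in, vertical_in, is_whiff) tuples.
    Cells with fewer than min_n pitches are dropped so that tiny samples
    do not dominate the color scale.
    """
    counts, whiffs = {}, {}
    for x, z, is_whiff in pitches:
        key = (math.floor(x / cell), math.floor(z / cell))
        counts[key] = counts.get(key, 0) + 1
        whiffs[key] = whiffs.get(key, 0) + (1 if is_whiff else 0)
    return {k: whiffs[k] / counts[k] for k in counts if counts[k] >= min_n}
```

Each key is a (column, row) cell index, so the result can be handed to whatever plotting tool you like to color the cells.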
Posted in Graphs, Pitch F/X | Leave a Comment »
Posted by stevesommer05 on November 1, 2009
The guys over at Cardinals GM asked whether the Cards should look into re-signing Joel Piniero. After some initial research into the topic I think I’ve come up with some parameters for what would make a profitable deal for the Cardinals. This quick-look analysis will be fairly simplified to only include the next year and not years 2 and 3 of a potential deal (in other words, not exactly looking at reality). First, some projections for two of the key players in the discussion, Joel and John Smoltz. For these “projections” I just used a 5/4/3 weighted average of the last 3 years of xFIP.
| Pitcher | xFIP | IP | WAR |
| --- | --- | --- | --- |
| Piniero | 4.08 | 164 | ~2.5 |
| Smoltz | 3.53 | 94 | ~1.8 |
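The 5/4/3 weighting mentioned above is easy to sketch. The xFIP inputs below are made-up numbers for illustration only, and I am assuming the heaviest weight goes on the most recent season.

```python
def weighted_xfip(seasons, weights=(5, 4, 3)):
    """Weighted average of xFIP values, most recent season first.

    If fewer than three seasons are supplied, only the matching
    leading weights are used.
    """
    pairs = list(zip(seasons, weights))
    total_weight = sum(w for _, w in pairs)
    return sum(x * w for x, w in pairs) / total_weight


# Hypothetical xFIP values (most recent first), not any real pitcher's line.
print(round(weighted_xfip([3.50, 4.00, 4.50]), 2))
```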
While these projections are not as robust as those that will come out in the near future from various sites, they are a place to start the conversation. We could easily take Piniero’s projection and conclude that he’d be worth ~11M next year on the open market; however, why should we pay market value when we have other needs to address (i.e. left field)? With that in mind let’s look at two scenarios that could play out. Option 1 would be to sign Smoltz to be the 4th starter and spend the rest on LF. Option 2 would be to sign Piniero and spend the rest on LF. For the sake of this analysis we’ll assume Smoltz will sign for the same 5.5M he did last year. We’ll also assume the Cards have 25M to spend between these two positions (any number would suffice here, 25 seemed nice and round). Under those working assumptions, and tinkering a little with the IP on the projections, I can generate the following WAR chart:

The 200 Projected Piniero line takes the Piniero projection and pushes it to 200 innings. The 3.5 WAR line assumes less regression for Piniero, with performance a little closer to this year’s but still nowhere near as good. The numbers in front of Smoltz are the numbers of innings assumed for Smoltz. Clearly, the higher the line, the more WAR that combination produces (it assumes paying market value for a LFer with the remaining dollars). So if you take a leap of faith that Smoltz can pitch 130 innings, then projected Piniero is only a better value at 6.5M or less. If we assume less regression for Piniero, then he is the better value up until ~11M. Anyway, just a little more info for the conversation. I’ll save the charts and maybe update them with ZiPS and CHONE at a later date.
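The logic behind the lines in the chart can be sketched as follows. Two things here are assumptions on my part: the market rate of about 4.4M per WAR (backed out from ~2.5 WAR being worth ~11M) and scaling WAR linearly with innings. Because the exact inputs behind the chart are not spelled out, the break-even figures this prints will land near, but not exactly on, the ones quoted above.

```python
BUDGET_M = 25.0          # from the post: millions to spend on the 4th starter plus LF
DOLLARS_PER_WAR_M = 4.4  # assumption: implied by ~2.5 WAR being worth ~11M


def scale_war(war, ip, new_ip):
    """Scale a WAR projection linearly with innings (a simplifying assumption)."""
    return war * new_ip / ip


def combined_war(pitcher_war, pitcher_cost_m):
    """Pitcher WAR plus the WAR bought in left field with the leftover budget."""
    leftover = max(BUDGET_M - pitcher_cost_m, 0.0)
    return pitcher_war + leftover / DOLLARS_PER_WAR_M


def break_even_cost(pitcher_war, rival_total_war):
    """Salary at which signing this pitcher matches the rival combination."""
    return BUDGET_M - (rival_total_war - pitcher_war) * DOLLARS_PER_WAR_M


smoltz_130 = combined_war(scale_war(1.8, 94, 130), 5.5)
print(f"Smoltz (130 IP) + LF: {smoltz_130:.2f} WAR")
for label, war in [("Projected Piniero", 2.5), ("3.5 WAR Piniero", 3.5)]:
    print(f"{label}: break-even at {break_even_cost(war, smoltz_130):.1f}M")
```

Swapping in a different dollars-per-WAR figure or a different innings assumption for Smoltz moves the break-even points, which is really the whole point of the chart.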
Posted in Uncategorized | 1 Comment »
Posted by stevesommer05 on October 29, 2009
After reading some discussions over at The Book blog about UZR and regression to scouting reports, I thought it would be a good idea to use the fans’ scouting reports as a regressing factor for UZR.
My methodology was as follows: I binned players into groups based on their positional ranking within the scouting reports, and then calculated the weighted average of the UZR/150s of the players within each bin. The following table shows the results using the data from 2007, 2008, and 2009. (Quick edit: the table below is for SS only, sorry for any confusion.)
| Rank | Avg UZR/150 |
| --- | --- |
| 1-10 | 5.6 |
| 11-20 | 2.5 |
| 21-30 | -1.5 |
| 31-40 | -3.3 |
| 41-50 | -3.9 |
| 51+ | -9.1 |
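The binning step can be sketched like this. I am assuming the weighted average is weighted by defensive games; the function names and the input layout are just for illustration.

```python
from collections import defaultdict


def rank_bin(rank, width=10, top=50):
    """Map a positional rank to a bin label such as '1-10' or '51+'."""
    if rank > top:
        return f"{top + 1}+"
    low = (rank - 1) // width * width + 1
    return f"{low}-{low + width - 1}"


def bin_averages(player_seasons):
    """Games-weighted mean UZR/150 per rank bin.

    player_seasons: iterable of (rank, uzr_per_150, defensive_games).
    """
    sums = defaultdict(float)
    games = defaultdict(float)
    for rank, uzr150, dg in player_seasons:
        label = rank_bin(rank)
        sums[label] += uzr150 * dg
        games[label] += dg
    return {label: sums[label] / games[label] for label in sums if games[label] > 0}
```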
At this point my methodology diverges, as I wasn’t sure which method I like better. Method 1 is to regress each individual season’s data based on the player’s rank that season to get a new seasonal UZR, and then weight across the 3 years of data. Method 2 is to weight across the three years of data and then regress using the most recent fans’ scouting report ranking (in this case the interim 2009 results).
Method 1 is clearly sensitive to the ebb and flow of the fans, and is also a little more dependent on those rankings since the UZRs being regressed have a smaller number of defensive games associated with them. Method 2 does not create “single season” stats as some people would probably like, and it only uses the most recent fans’ ranking. Overall I think I prefer Method 2, but could be swayed either way. The following table lists the top 10 shortstops ranked by Method 2 (I really need a better name).
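Here is a hedged sketch of the difference between the two orderings. The bin means are the SS values from the table above; the regression weight games / (games + K), the constant K = 150, and the use of defensive games as the multi-year weights are all assumptions for illustration, not the exact formula behind the numbers in this post.

```python
# SS bin means from the table in this post.
BIN_MEANS = {"1-10": 5.6, "11-20": 2.5, "21-30": -1.5,
             "31-40": -3.3, "41-50": -3.9, "51+": -9.1}
K_GAMES = 150.0  # hypothetical regression constant, not from the post


def regress(uzr150, games, bin_label):
    """Pull an observed UZR/150 toward its rank bin's mean; more games, less pull."""
    w = games / (games + K_GAMES)
    return w * uzr150 + (1 - w) * BIN_MEANS[bin_label]


def games_weighted(values, games):
    """Average of values weighted by defensive games."""
    return sum(v * g for v, g in zip(values, games)) / sum(games)


def method_1(seasons):
    """Regress each season on that season's rank bin, then weight across years.

    seasons: list of (uzr150, games, bin_label), most recent first.
    """
    regressed = [regress(u, g, b) for u, g, b in seasons]
    return games_weighted(regressed, [g for _, g, _ in seasons])


def method_2(seasons):
    """Weight across years first, then regress once on the most recent rank bin."""
    games = [g for _, g, _ in seasons]
    multi_year = games_weighted([u for u, _, _ in seasons], games)
    return regress(multi_year, sum(games), seasons[0][2])
```

With three identical +12 seasons in the top bin, Method 1 regresses each 150-game season halfway to the bin mean, while Method 2 regresses the pooled 450 games only a quarter of the way, so under these assumptions Method 1 leans harder on the fan rankings.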
| Rank | Name | 3-year UZR | Method 1 | Method 2 |
| --- | --- | --- | --- | --- |
| 1 | Omar Vizquel | 18.4 | 10.1 | 11.8 |
| 2 | Jack Wilson | 11.1 | 8.0 | 9.3 |
| 3 | Brendan Ryan | 11.4 | 7.0 | 8.4 |
| 4 | Cesar Izturis | 9.0 | 6.1 | 7.7 |
| 5 | J.J. Hardy | 9.2 | 5.8 | 7.2 |
| 6 | Elvis Andrus | 8.3 | 7.1 | 6.8 |
| 7 | Adam Everett | 13.2 | 4.9 | 6.6 |
| 8 | Erick Aybar | 7.3 | 6.3 | 6.5 |
| 9 | Jimmy Rollins | 6.6 | 5.9 | 6.3 |
| 10 | Paul Janish | 11.9 | 6.9 | 5.8 |
And the bottom 10:
| Rank | Name | 3-year UZR | Method 1 | Method 2 |
| --- | --- | --- | --- | --- |
| 43 | Hanley Ramirez | -4.9 | -3.3 | -3.9 |
| 44 | Stephen Drew | -5.2 | -2.6 | -4.1 |
| 45 | Ramon Vazquez | -7.8 | -4.5 | -4.3 |
| 46 | Alex Cora | -5.3 | -4.3 | -4.4 |
| 47 | Luis Rodriguez | -7.9 | -4.8 | -5.0 |
| 48 | Juan Castro | -16.6 | -2.6 | -5.2 |
| 49 | Julio Lugo | -9.3 | -5.3 | -6.8 |
| 50 | Khalil Greene | -10.4 | -1.5 | -8.0 |
| 51 | Brendan Harris | -8.3 | -7.9 | -8.7 |
| 52 | Yuniesky Betancourt | -12.3 | -9.6 | -11.4 |
A couple of quick caveats: if you read the comments on the above linked thread, I noted that the defensive games figures at FanGraphs look a little messed up. Those going back to normal would likely change these results. Also, I didn’t do a great job of searching the blogosphere, so if this has been done before, I apologize for presenting it as a new methodology.
As far as data sources: UZR via FanGraphs and the fans’ scouting report via tangotiger. As always, comments or suggestions are appreciated.
Posted in General Sabermetrics | 6 Comments »
Posted by stevesommer05 on October 27, 2009
A few days ago in his 10@10, Derrick Goold talked about velocity and the Cardinals bullpen. I thought I’d take a look at velocity vs. effectiveness for relievers, and the following table is what I came up with:
| Velocity (mph) | value/100 | total value | FIP | ERA | K/9 | % FB Thrown |
| --- | --- | --- | --- | --- | --- | --- |
| <88 | 0.529 | 2.58 | 3.71 | 3.32 | 7.28 | 50% |
| 88-90 | 0.156 | 0.68 | 4.15 | 3.53 | 7.28 | 53% |
| 90-92 | 0.408 | 2.68 | 3.93 | 3.29 | 8.09 | 54% |
| 92-94 | 0.391 | 2.87 | 3.92 | 3.71 | 7.96 | 66% |
| 94-96 | 0.358 | 2.94 | 3.93 | 3.59 | 8.20 | 68% |
| 96+ | 0.718 | 5.82 | 2.86 | 3.44 | 10.20 | 69% |
Data is from FanGraphs and compiles the totals from all of the “qualified” relievers.
A couple of quick bullet points:
- Velocity is the reliever’s average fastball velocity, not on an individual pitch basis
- Value numbers are in runs above average
- I prefer to use the value/100 as it removes the playing time element. It’s value per 100 pitches thrown.
- It appears that once you get below 96 mph there is not a whole lot of differentiation other than in the share of fastballs thrown (not surprising)
- Even the more global metrics (FIP, ERA) don’t have a lot of differentiation between the groups
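For anyone who wants to redo the grouping, here is a rough sketch. The bin edges follow the table, but how ties at the edges are assigned (e.g. exactly 90.0 mph going into 90-92) and the choice to pool runs and pitches within a bin before dividing are my assumptions; the input layout is hypothetical.

```python
# Upper bin edges in mph, matching the table's labels.
BINS = [(88, "<88"), (90, "88-90"), (92, "90-92"), (94, "92-94"), (96, "94-96")]


def velocity_bin(avg_fb_mph):
    """Label for a reliever's average fastball velocity."""
    for upper, label in BINS:
        if avg_fb_mph < upper:
            return label
    return "96+"


def value_per_100_by_bin(relievers):
    """Runs above average per 100 pitches, pooled within each velocity bin.

    relievers: iterable of (avg_fb_mph, runs_above_average, pitches).
    """
    runs, pitches = {}, {}
    for mph, raa, n in relievers:
        label = velocity_bin(mph)
        runs[label] = runs.get(label, 0.0) + raa
        pitches[label] = pitches.get(label, 0) + n
    return {b: 100 * runs[b] / pitches[b] for b in runs if pitches[b]}
```

Pooling before dividing weights each reliever by how much he actually threw, which is one reasonable reading of "value per 100 pitches" for a group.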
Posted in General Sabermetrics | Leave a Comment »
Posted by stevesommer05 on October 25, 2009
A while back I went down the path of looking at similarity scores between pitches for different pitchers, but had not really followed up since. Since they came up in a thread over at BtB, I thought it would be a good idea to revisit them now. My basic methodology was the same as the one Josh Kalk used for pitchers, only removing the percentage thrown term. In addition I also ran a set of scores where, along with the three “physical” traits, I added whiff rate and GB%.
For my initial cut I only took pitches that I had over 150 instances of in the database, in the hope that it wouldn’t skew the whiff and GB% distributions. In other words, if Joe Pitcher only threw 100 curveballs over the last 2 years, his curveball was not included for comparison. I’m not sure what I want to make the cutoff, or if I want one, but for this cut it was 150 instances. The primary drawback is that a cutoff that high will eliminate a lot of the pitches where a comparison would prove most useful: those from pitchers that are relatively new to the big leagues. In retrospect I think I want to lower that bar substantially for “physical” only scores and keep it slightly higher when introducing results.
Anyway, on to results! I only ran the numbers on fastballs (both 2 and 4 seam) since I wanted to get a feel for how the methodology was going to work and what kind of numbers to expect. The first table is the most similar fastballs based solely on their physical traits (movement and velocity):
| Pitcher 1 | Pitcher 2 | Score |
| --- | --- | --- |
| Joakim Soria | T.J. Beam | 0.008822813 |
| Jim Johnson | Jorge Julio | 0.016173611 |
| Armando Galarraga | Trevor Cahill | 0.029398104 |
| Josh Geer | Dan Giese | 0.029431471 |
| Tommy Hunter | Brendan Donnelly | 0.030507301 |
| Brandon Lyon | Jason Berken | 0.030858617 |
| Dan Wheeler | Kris Benson | 0.030926282 |
| Brandon McCarthy | James Parr | 0.031952073 |
| Jesse Crain | Andrew Bailey | 0.03290595 |
| Steve Trachsel | Keith Foulke | 0.033642835 |
So what does the number in the score column represent? It is the sum of the differences between percentiles (kept as decimals, so the 90th percentile = 0.9) for each component, which means a lower score is a closer match. Clearly that number doesn’t look pretty, and I’m racking my brain to come up with a better way to present the number… any thoughts would be appreciated.
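A minimal sketch of that score, under a few assumptions: I take the differences as absolute values, compute percentiles as a simple midpoint rank within the pool of pitches being compared, and use hypothetical field names (`velo`, `pfx_x`, `pfx_z`) for the three physical traits.

```python
def percentile_rank(value, population):
    """Share of the population below value, counting ties as half (0.0 to 1.0)."""
    below = sum(1 for v in population if v < value)
    equal = sum(1 for v in population if v == value)
    return (below + 0.5 * equal) / len(population)


def similarity(pitch_a, pitch_b, pool, components=("velo", "pfx_x", "pfx_z")):
    """Sum over components of the absolute difference in percentile rank."""
    score = 0.0
    for c in components:
        values = [p[c] for p in pool]
        score += abs(percentile_rank(pitch_a[c], values)
                     - percentile_rank(pitch_b[c], values))
    return score


# Hypothetical fastballs: velocity in mph, movement in inches.
pool = [
    {"velo": 90.0, "pfx_x": -5.0, "pfx_z": 8.0},
    {"velo": 92.0, "pfx_x": -6.0, "pfx_z": 9.0},
    {"velo": 94.0, "pfx_x": -3.0, "pfx_z": 10.0},
    {"velo": 96.0, "pfx_x": -4.0, "pfx_z": 11.0},
]
print(similarity(pool[0], pool[1], pool))
```

Adding whiff rate and GB% is just a matter of passing two more component names, which also makes it obvious that every component carries equal weight.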
Moving on, the next table includes whiff and GB% to go along with the physical traits.
| Pitcher 1 | Pitcher 2 | Score |
| --- | --- | --- |
| Miguel Batista | Luis Mendoza | 0.088735311 |
| Joe Blanton | Leo Rosales | 0.105227103 |
| Sean Green | Joe Smith | 0.106274147 |
| Kyle Lohse | Adam Wainwright | 0.107327243 |
| Chris Sampson | Shawn Camp | 0.108540848 |
| Braden Looper | Carlos Silva | 0.111769335 |
| Max Scherzer | Josh Roenicke | 0.118732513 |
| Francisco Cordero | Chris Resop | 0.122583619 |
| Leo Nunez | Jorge Julio | 0.127953011 |
| Pedro Martinez | Matt Herges | 0.136471458 |
For this set of scores all 5 components are equally weighted, which may or may not be valid. I’d like to put a little more thought into whether I’d like to weight them differently. Anyway, that’s the first cut at getting some of this information out there. The to-do list with these is long. I’d like to re-run the fastball physical numbers with a broader net, run all the other pitch types, and look at some of the top comps for various pitches that have good reputations/results (e.g. Mariano’s cutter, Brandon Webb’s sinker, Adam Wainwright’s curve).
Posted in Pitch F/X | 4 Comments »
Posted by stevesommer05 on October 22, 2009
In preparation for the debacle that was game 2 of the NLDS I did a quick survey of how Cardinal hitters hit high velocity LHP fastballs. With that data in hand I went down the path of looking at all hitters against all high velocity fastballs (for the purpose of this look, >94 mph). For the study I only looked at players that had put 50 balls in play over the time frame I was looking at (2008 to mid-September 2009). First, the list of the best SLGCON:
| SLG Rank | Player | SLGCON |
| --- | --- | --- |
| 1 | Adam Dunn | 0.984 |
| 2 | J.D. Drew | 0.962 |
| 3 | Ryan Howard | 0.896 |
| 4 | Prince Fielder | 0.893 |
| 5 | Chase Utley | 0.820 |
| 6 | Joey Votto | 0.808 |
| 7 | Nick Swisher | 0.806 |
| 8 | B.J. Upton | 0.779 |
| 9 | Carlos Quentin | 0.774 |
| 10 | Torii Hunter | 0.771 |
And the worst:
| SLG Rank | Player | SLGCON |
| --- | --- | --- |
| 177 | Bobby Crosby | 0.193 |
| 176 | Juan Rivera | 0.241 |
| 175 | Kenji Johjima | 0.250 |
| 174 | Cesar Izturis | 0.260 |
| 173 | David Eckstein | 0.278 |
| 172 | Jason Kendall | 0.278 |
| 171 | Jeremy Hermida | 0.280 |
| 170 | Mark Kotsay | 0.291 |
| 169 | Yadier Molina | 0.292 |
| 168 | Magglio Ordonez | 0.300 |
And the notable Cardinals:
| SLG Rank | Player | SLGCON |
| --- | --- | --- |
| 23 | Ryan Ludwick | 0.707 |
| 24 | Albert Pujols | 0.700 |
| 30 | Matt Holliday | 0.659 |
| 41 | Mark DeRosa | 0.616 |
| 67 | Skip Schumaker | 0.556 |
| 82 | Troy Glaus | 0.518 |
| 169 | Yadier Molina | 0.292 |
| NA | Colby Rasmus | 0.556 |
| NA | Rick Ankiel | 0.388 |
Just thought these might be interesting. For a point of reference, the MLB average over the sample was 0.510. A more interesting look might be to do run values and expected run values, and I hope to have those incorporated into my database soon.
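For reference, the ranking above boils down to something like the sketch below. I am treating SLGCON as total bases divided by balls put in play (home runs included) and assuming the input has already been filtered to fastballs over 94 mph; the outcome labels and input layout are hypothetical.

```python
MIN_BIP = 50  # from the post: minimum balls in play to be ranked

TOTAL_BASES = {"out": 0, "single": 1, "double": 2, "triple": 3, "home_run": 4}


def slgcon_table(batted_balls, min_bip=MIN_BIP):
    """Rank hitters by slugging on contact.

    batted_balls: iterable of (batter, outcome) pairs, one per ball in play.
    Returns (batter, slgcon) rows sorted best to worst, dropping hitters
    below the min_bip cutoff.
    """
    tb, bip = {}, {}
    for batter, outcome in batted_balls:
        tb[batter] = tb.get(batter, 0) + TOTAL_BASES[outcome]
        bip[batter] = bip.get(batter, 0) + 1
    rows = [(b, tb[b] / bip[b]) for b in bip if bip[b] >= min_bip]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```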
Posted in General Sabermetrics, Pitch F/X | Leave a Comment »
Posted by stevesommer05 on October 8, 2009
This post was written before the game finished last night, so hopefully it gives some insight into how we’ll try and take a 2-0 lead. The opposition in game 2 will be Clayton Kershaw. First, as usual, a summary table
Read the rest of this entry »
Posted in Pitch F/X | Leave a Comment »