Minor League Splits Redux Launched

I’ve launched yet another half-complete web application relating to baseball sabermetrics! ML Splits is a database of minor league baseball players (batters only for now) that shows their “splits” (performance against LHP and RHP) as well as park effects and major league equivalencies. The data is taken from Jeff Sackmann’s old minorleaguesplits.com site where he made the CSVs available for import as open source.
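As a rough illustration of what a platoon split boils down to (this is not the site's actual code, which is PHP and MySQL), here is a minimal Python sketch. The log format, the function name `platoon_splits`, and the numbers are all made up for the example; the real CSVs may be laid out quite differently.

```python
from collections import defaultdict

# Hypothetical game log: (pitcher_hand, at_bats, hits). Numbers are invented.
GAME_LOG = [
    ("L", 4, 2),
    ("R", 5, 1),
    ("R", 3, 1),
    ("L", 2, 0),
]


def platoon_splits(log):
    """Return batting average keyed by pitcher hand ("L" or "R")."""
    totals = defaultdict(lambda: [0, 0])  # hand -> [at_bats, hits]
    for hand, at_bats, hits in log:
        totals[hand][0] += at_bats
        totals[hand][1] += hits
    # Guard against a hand with zero at-bats to avoid dividing by zero.
    return {
        hand: round(hits / at_bats, 3) if at_bats else None
        for hand, (at_bats, hits) in totals.items()
    }


splits = platoon_splits(GAME_LOG)
print(splits)
```

With samples this tiny the averages swing wildly on a single hit, which is the usual caveat with platoon splits.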

ML Splits

I leaned on jQuery for the front end, as I’m getting more and more comfortable with using it for front-facing web applications. It’s mostly just dynamic div tagging and toggle() calls to keep the screen clear of distractions and make it easy to see which stats are really important. I wrote the entire thing in PHP 5.3.x, MySQL 5, CodeIgniter 2.0, and jQuery.


EDIT: Pitchers are up as of April 4th.

10 thoughts on “Minor League Splits Redux Launched”

  1. Will you consider adding the other ways to sort splits that Jeff had at minorleaguesplits such as by month or something like that? Being able to sort out how someone progressed was really helpful. Thanks for taking the time to get back up as much as you did. It sucked when minorleaguesplits was taken down.

  2. I might add the other splits in the future. Currently I see very little value in these small sample sizes and think it clutters everything up, but if I get enough requests / have some time to do it, I will. Right now you can barely tell anything from the small platoon splits! 🙂

    Thanks for commenting.

  3. Thanks for taking the time to put it all up and make it available. And for taking time to respond to my question. I don’t disagree that month by month splits are small sample sizes, but I ask because I was reviewing Dustin Ackley’s season and I know he had a brutal first couple months and was trying to recall his turnaround point (I think June), and see what his numbers were from when everything clicked for him through the end of the season. So while it would be small samples for each month individually, looking at the 4 month period could be of some use. Anyway, it’s all gravy after what you’ve already put up- so thanks again!

  4. Oh- And will there be updates on it like Jeff used to do? He had top prospects performances (box scores) from the day before, as well as each game for each minor league team and their box scores. Just curious…

  5. Craig:

    I’m re-spidering all the data (see the site for a sneak preview) from MLBAM, which will take quite a long time. I’m also going on vacation for the next week, so don’t expect much to get done on the site.

    After I get all the data, I have to run scripts to store it in a data warehouse and create extract-transform-load (ETL) scripts to maintain this warehouse periodically. And the way Jeff stored his data won’t be the way I store mine.

    Basically, this is a huge undertaking. I’m looking at about 60-80 more labor hours before it’s even ready for beta testing. I’d realistically hope to have a forward-updating site with old park factors and MLEs (don’t get me started on deriving/implementing these) by June 2011. But no guarantees, as a handful of MLB teams have been interested in my work and it’s possible that this stuff will get pulled offline. And there’s always the possibility of me losing interest in it since no one is paying me to do it.

  6. That only seems fair. Thanks for all the time spent, I for one appreciate it. And thanks for the responses.

  7. Can we get it so you can search by just last name? For example, a search for Wilson would return a list of Wilsons.

  8. I see your point about monthly splits being too small, and I can live without that if necessary.

    What I would like to see is Home/Road splits. That is very important, I think, in the minors because many home parks are screwy. Take Norwich’s Dodd Stadium, which is an extreme pitcher’s park. Some give up a lot of homers (was that Lancaster?), others cause a lot of strikeouts (Mayo of MLB.com studied that and found that San Jose’s home park caused a spike in strikeouts for its players, and one player blamed that on the bad background back then). It is important to know whether a player’s performance is skewed good or bad by his home park, especially since conditions at these parks could mess with hitting (like hard surfaces, irregular bounces, and so forth).
