Monday, February 23, 2009

MLB Season Simulator

The baseball season is right around the corner. Given the MLB schedule, a log5 framework, and user-defined team strengths, the linked tool runs a Monte Carlo analysis and gives odds on division championships, wild card winners, and season win totals.
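For readers who want to see the idea outside of Excel, here is a minimal Python sketch of a log5-driven Monte Carlo season. It is not the workbook's actual formulas: the three-team league, the strengths, the schedule, and the function names are all made up for illustration, and home-field advantage is ignored.

```python
import random

def log5(p_a, p_b):
    """log5 estimate of the chance that team A beats team B,
    given each team's true-talent winning percentage."""
    return p_a * (1 - p_b) / (p_a * (1 - p_b) + p_b * (1 - p_a))

def simulate_season(schedule, strength, rng):
    """Play every game in the schedule once and return each team's win total."""
    wins = {team: 0 for team in strength}
    for team_a, team_b in schedule:
        if rng.random() < log5(strength[team_a], strength[team_b]):
            wins[team_a] += 1
        else:
            wins[team_b] += 1
    return wins

# Hypothetical league (assumption): strengths are true-talent winning percentages.
strength = {"A": 0.600, "B": 0.500, "C": 0.400}
schedule = [("A", "B"), ("A", "C"), ("B", "C")] * 54  # 162 games in total

rng = random.Random(2009)  # fixed seed so reruns match
n_sims = 1000
total_wins = {team: 0 for team in strength}
for _ in range(n_sims):
    for team, w in simulate_season(schedule, strength, rng).items():
        total_wins[team] += w

for team in strength:
    print(team, "average wins:", round(total_wins[team] / n_sims, 1))
```

Against a .500 opponent, log5 hands a team back its own winning percentage, which is a handy sanity check on the formula.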

Consider this to be the play-at-home version of Baseball Prospectus's Postseason Odds or James Holzhauer's Playoff Odds.

MLB Season Simulation
Original author- Erich
Revision date- 20090222

8 comments:

SportsGuy said...

Will this sheet work in OpenOffice's Calc?

xlssports said...

Unfortunately, I do not use OpenOffice, so I am not sure. My guess is the formulas will transfer, but the macro will not.
I will not be able to install OpenOffice and check any time soon, but I would love to hear from anyone who tries it.

xlssports said...

Via email, VegasWatch asked:
"I'm assuming the stdev column shifts around a team's true talent from sim to sim? What have you been putting that as?"

My reply:
That's right. In the past, people have complained that strong teams can show a 100% chance to make the playoffs. Incorporating a stdev allows for basic variability in team performance, kind of like PECOTA's 10th, 50th, and 90th percentile performance.
As for a recommended value, I'm just the Excel guy. I leave it up to users to come up with a suggestion and hopefully some guidelines on which teams have a higher stdev vs. a lower one (factors may include Stars & Scrubs, team age, key injury risk, player turnover, trade deadline flexibility).
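To illustrate what the stdev column does, here is a rough Python sketch in which each simulated season first redraws every team's true talent around its projected strength. The normal distribution, the clipping bounds, and the names are assumptions made for the example, not a description of the workbook's formulas.

```python
import random

def draw_talent(mean_strength, stdev, rng):
    """Draw one simulated season's true talent for each team."""
    talent = {}
    for team, mean in mean_strength.items():
        value = rng.gauss(mean, stdev[team])          # normal draw (assumption)
        talent[team] = min(max(value, 0.05), 0.95)    # clip to a sane range (assumption)
    return talent

# Hypothetical inputs: one steady team, one high-variance team.
mean_strength = {"Steady": 0.580, "Volatile": 0.580}
stdev = {"Steady": 0.01, "Volatile": 0.04}

rng = random.Random(30)
n_sims = 10000
below_500 = {team: 0 for team in mean_strength}
for _ in range(n_sims):
    for team, t in draw_talent(mean_strength, stdev, rng).items():
        if t < 0.500:
            below_500[team] += 1

for team in mean_strength:
    print(team, "share of sims with sub-.500 talent:", below_500[team] / n_sims)
```

The drawn talent would then feed the game-by-game odds for that one simulated season, so a team with a larger stdev spends more of its simulations as a noticeably weaker (or stronger) club.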

j holz said...

This software looks really useful. Any interest in making it usable in mid-season, using each team's current record as an additional input? Or could this already be accomplished by setting the results of finished games to 0 or 1 on the Engine page?

xlssports said...

j holz,
You are correct: the random numbers can be hardcoded to record actual results, though this too can be automated...

I currently have similar models for the NBA and NCAAB that update given game results and team performance (PF/PA).
Once the MLB season starts, I'll likely add the same functionality to this worksheet and would appreciate any pointers to a good source of data.
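For anyone curious how the hardcoding idea works mechanically, here is a small Python sketch: games with a recorded result are taken as given, and only the unplayed games are simulated. The game list, the team names, and the 1/0/None encoding are assumptions for the example and do not reflect the layout of the Engine page.

```python
import random

def log5(p_a, p_b):
    """log5 estimate of the chance that team A beats team B."""
    return p_a * (1 - p_b) / (p_a * (1 - p_b) + p_b * (1 - p_a))

def simulate_with_results(games, strength, rng):
    """Each game is (team_a, team_b, result): 1 if team_a won, 0 if team_b won,
    None if the game has not been played yet and should be simulated."""
    wins = {team: 0 for team in strength}
    for team_a, team_b, result in games:
        if result is None:
            a_won = rng.random() < log5(strength[team_a], strength[team_b])
        else:
            a_won = result == 1
        wins[team_a if a_won else team_b] += 1
    return wins

# Hypothetical mid-season state: three games played, two still to come.
strength = {"A": 0.550, "B": 0.450}
games = [("A", "B", 1), ("A", "B", 0), ("A", "B", 1),
         ("A", "B", None), ("A", "B", None)]

print(simulate_with_results(games, strength, random.Random(0)))
```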

j holz said...

Also, adding playoffs to the simulation would be neat. If the Orioles can only advance to the postseason when they overperform and get a little lucky, it'd be interesting to see how that affects their hopes of winning a playoff series.

The software seems to be running very well; keep up the good work.

xlssports said...

I had previously released a baseball playoff simulator here.

The main reason I haven't incorporated it into this model deals with tiebreakers.

Right now, if 2 teams tie for a division or wild card, each gets credit for 1/2 of a crown. This is great for the season simulation, but bad for setting up a playoff tree.
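The split-credit rule is easy to show in a few lines of Python. This is only a sketch of the idea with made-up team names and win totals; it generalizes the 1/2 rule to 1/n for an n-way tie, which is an assumption about how the sheet would treat three or more tied teams.

```python
def crown_credit(wins):
    """Split one division crown evenly among the teams tied for most wins."""
    best = max(wins.values())
    leaders = [team for team, w in wins.items() if w == best]
    return {team: (1 / len(leaders) if team in leaders else 0.0) for team in wins}

# Hypothetical final standings from one simulated season.
print(crown_credit({"A": 92, "B": 92, "C": 85}))  # {'A': 0.5, 'B': 0.5, 'C': 0.0}
```

Summed over many simulations, the credits still add up to one crown per division per season, which is why the rule works for odds but leaves no single team to slot into a playoff bracket.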

The tiebreaker formulas are difficult, and even if feasible, they would likely add a lot of formula overhead, increasing the run time for a minimal perceived increase in utility. (I have some hangups with the accuracy of the playoff simulation in the first place, such as the lack of adjustments for playoff-built squads.)

I am quite interested in your own projections, and if you have any advice on how, mechanically, I should add such functionality, please drop me an email.

j holz said...

I think you're right about the tiebreakers not providing much additional utility in a preseason simulation. If you do add mid-season functionality to the spreadsheet, it might then be important to adjust for tiebreakers.

Unfortunately, I'm a handicapper rather than a programmer, so I can't really offer any advice on how to add functions to the spreadsheet without increasing the run-time.

It's a very nice simulator, but I do think the default standard deviation is set too low; something between .03 and .04 would better reflect the accuracy of today's best projections. Of course, the user can just do this himself.

One more user-friendly feature might be an input to adjust for the AL/NL talent gap. Right now I'm accounting for this by setting the average AL team to a run differential of +40 and the average NL team at -35, which seems to be working fairly well.
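A quick check on those numbers, sketched in Python: with the 2009 league sizes (14 AL teams, 16 NL teams), a +40 average in the AL and a -35 average in the NL net out to zero runs across MLB, as they must. The helper function and the sample teams are hypothetical, and the sketch assumes each team starts from a league-neutral run differential.

```python
AL_TEAMS, NL_TEAMS = 14, 16      # 2009 league sizes
AL_OFFSET, NL_OFFSET = 40, -35   # average run differentials from the comment above

def apply_league_offset(neutral_run_diff, league):
    """Shift each team's league-neutral run differential by its league's offset."""
    offsets = {"AL": AL_OFFSET, "NL": NL_OFFSET}
    return {team: rd + offsets[league[team]] for team, rd in neutral_run_diff.items()}

net = AL_TEAMS * AL_OFFSET + NL_TEAMS * NL_OFFSET
print(net)  # 0

# Hypothetical teams that project as equals before the league adjustment.
print(apply_league_offset({"AL club": 20, "NL club": 20},
                          {"AL club": "AL", "NL club": "NL"}))
```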