Model Bug Found - New Version Released
This week @Elmi_R09 brought to our attention a discrepancy in the probabilities between historic years and the 2018 predictions. We considered blocking and deleting him, but decided that, in keeping with the transparency that chaRlie is built on, we would have to address the issue. After a deep dive into the code, we discovered that the model had been “leaking”. This means that the model had indirect access to the actual result through a created feature and was, therefore, overperforming in its historical predictions. The 2018 predictions were not significantly affected, of course, as the results are not out yet, but it did mean that the model was not as well optimised as the historical results indicated.
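For anyone wondering what a "leak" looks like in practice, here is a minimal sketch. It is not the chaRlie code and not the actual feature that caused our bug: the data is synthetic, and the names (`disposals`, `votes`, `leaky_share`) are made up for illustration. The idea is simply that a feature built from the result itself will track that result almost perfectly in historical data, while an honest feature will not.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


random.seed(0)

# Synthetic "historical" data (an assumption for illustration only):
# disposals are known before the count, votes are the result we predict.
disposals = [random.uniform(10, 40) for _ in range(200)]
votes = [max(0.0, 0.1 * d - 1 + random.gauss(0, 1)) for d in disposals]

# Hypothetical leaky feature: each player's share of all votes.
# It is computed FROM the result, so it cannot exist before the count.
total_votes = sum(votes)
leaky_share = [v / total_votes for v in votes]

corr_clean = pearson(disposals, votes)    # honest, noisy relationship
corr_leaky = pearson(leaky_share, votes)  # essentially perfect

print(f"disposals vs votes:   {corr_clean:.3f}")
print(f"leaky_share vs votes: {corr_leaky:.3f}")
```

A created feature that correlates almost perfectly with the result in past seasons, but cannot be computed for the current season without knowing the result, is a strong hint that something is leaking.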
After many hours of debugging, testing and some slight improvements, we are confident that the model is ready to go again. The downside is that the results are slightly different to what you may have seen before, but overall nothing too significant. Tom Mitchell is still projected to win. In fact, he has been given a significant boost: he was the biggest mover, as the new model rewards high disposals more than the old one. Even though we don’t give any direct punting advice on this blog, we know that people do use it for that purpose, and we apologise if the updates to the model change anything. If it helps, the model that was previously released was still a mathematically valid output. It just wasn’t as good as this one. On the flip side, as you will see, the historical results now accurately reflect how the model actually went (about 5-8% worse overall, but still 30% better than last year).
Thanks again to @Elmi_R09 for pointing out this error. Obviously, by posting all of the historical results we leave the model open to interrogation, and we think that finding this bug has improved the current model, which is the overall aim. That doesn't mean we didn't consider deleting the entire website. However, we feel slightly better after hearing master predictor Nate Silver talk about his own model bugs here.
Here is the link to the model. Enjoy!
P.S. My personal opinion is that it’s very unlikely Tom Mitchell will get 41 votes. We just don’t think anyone has played a season like this before, and therefore the model is freaking out. It will be interesting to see if the umpires reward the sheer volume of disposals with votes, but somewhere in the low to mid thirties seems more reasonable.
P.P.S. Patrick Cripps still can't win.