What Would a Perfect Predictor Score?

Interesting write-up, Net Prophet. I don’t see anything wrong with your analysis. I’ve been doing some work on this myself and have convinced myself that the distribution of the Log Loss should be roughly normal. You’re adding up 63 (or, in the case of the first round, 32) independent random variables, each with finite variance. That should be enough for the central limit theorem to kick in. You found the mean in your blog post, but there is also a variance. If you treat the games as independent (and if I’ve done my math right), the variance of the perfect prediction’s Kaggle score is

variance = (1/n^2) Sum[ p_i*(1 - p_i)*(Log[p_i/(1 - p_i)])^2, {i,1,n}]

Using the probabilities you cite (and taking special care of the case p_i=1 for the…
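
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the mean and variance of a perfect predictor's average log loss, assuming independent games as above. The function name `perfect_logloss_stats` and the example probabilities are hypothetical and are not taken from the original post. Treating a game with p_i = 0 or p_i = 1 as contributing zero loss and zero variance is also an assumption here (it is the limiting value of both terms), since the comment's own handling of that case is cut off.

```python
import math


def perfect_logloss_stats(probs):
    """Return (mean, variance) of the average log loss for a predictor
    whose stated probabilities equal the true win probabilities.

    Assumes the games are independent. `probs` holds one true win
    probability per game.
    """
    n = len(probs)
    mean_total = 0.0
    var_total = 0.0
    for p in probs:
        if p <= 0.0 or p >= 1.0:
            # Assumption: a certain outcome adds no loss and no variance,
            # which is the limit of both terms as p -> 0 or p -> 1.
            continue
        # Expected log loss of one game (the binary entropy of p).
        mean_total += -(p * math.log(p) + (1 - p) * math.log(1 - p))
        # Variance of one game's log loss: p(1-p) * log(p/(1-p))^2.
        var_total += p * (1 - p) * math.log(p / (1 - p)) ** 2
    # The score is the average over n games, so the mean scales by 1/n
    # and the variance by 1/n^2.
    return mean_total / n, var_total / n ** 2


# Illustrative probabilities only; replace with the ones from the post.
example_probs = [0.5, 0.9, 1.0, 0.75]
mean, variance = perfect_logloss_stats(example_probs)
print(f"mean = {mean:.4f}, std dev = {math.sqrt(variance):.4f}")
```

If the normal approximation holds, roughly 95% of perfect-predictor scores should land within about two standard deviations of the mean.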

