Effect of PassengerId

I have been playing around with feature engineering and parameter tuning to get my random forest entry above .78 (just achieved). Unfortunately, part of doing that involved using PassengerId as a feature. Now, I am not one to sniff at any data freely provided to me, but logically I don’t see why it should be as effective as it is, unless there is a bias in the data we were provided or in the way it was originally collected (if PassengerId is a historical feature). Or is it merely an artifact of the public test set that will vanish once we are scored on the private leaderboard? Has anyone put some effort into researching this? I will upload my Kernel in the next few days as I am new to this and…
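One rough way to probe the "bias in the data" question, separate from any particular model, is a permutation test on the ID ordering. The sketch below is only an illustration, not what my entry does: the helper names `bin_rate_spread` and `id_signal_pvalue` are made up for this post, and the choice of 10 contiguous ID bins is an arbitrary assumption. It sorts rows by ID, measures how much the positive-label rate varies from bin to bin, and compares that against the same statistic with the labels shuffled.

```python
import random
import statistics


def bin_rate_spread(ids, labels, n_bins=10):
    """Population variance of the positive-label rate across contiguous ID bins."""
    # Order the labels by ID, then cut them into n_bins roughly equal chunks.
    ordered = [label for _, label in sorted(zip(ids, labels))]
    n = len(ordered)
    rates = []
    for b in range(n_bins):
        chunk = ordered[b * n // n_bins:(b + 1) * n // n_bins]
        if chunk:  # skip empty chunks when there are fewer rows than bins
            rates.append(sum(chunk) / len(chunk))
    return statistics.pvariance(rates)


def id_signal_pvalue(ids, labels, n_bins=10, n_perm=999, seed=0):
    """Permutation p-value: how often shuffled labels vary across ID bins
    at least as much as the real labels do."""
    rng = random.Random(seed)
    observed = bin_rate_spread(ids, labels, n_bins)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if bin_rate_spread(ids, shuffled, n_bins) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic stand-in data (NOT the competition data): labels drawn
    # independently of the ID, so no ordering effect is built in.
    demo_rng = random.Random(42)
    demo_ids = list(range(1, 201))
    demo_labels = [demo_rng.randint(0, 1) for _ in demo_ids]
    print("p-value on synthetic data:", id_signal_pvalue(demo_ids, demo_labels))
```

With the real training file, `ids` would be the PassengerId column and `labels` the Survived column. A small p-value would suggest the rows are ordered in a way that carries information about the label. A large one would point more toward the gain being noise or specific to the public test split, though this test alone cannot tell those two apart.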

