Image Preprocessing – Segmentation

The mxnet and keras tutorials simply crop the image to 64×64, with no special centering of the heart. So I wanted to ask whether people ranking high on the leaderboard are preprocessing the images to center the heart so that training a network gets easier, and if so, how? Any thoughts or findings on spatial transformer networks?
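
To make the question concrete, here is a minimal sketch of the difference I have in mind, not the tutorials' actual code. The names `crop_around` and `center_crop` are hypothetical, and the heart location passed as `center` is assumed to come from some separate localization step that is not shown here.

```python
import numpy as np


def crop_around(image, center, size=64):
    """Crop a size x size window from a 2-D image, centered as closely
    as possible on `center` given as (row, col).

    The window is shifted inward when the requested center is too close
    to a border, so the output always has shape (size, size).
    """
    height, width = image.shape
    if height < size or width < size:
        raise ValueError("image is smaller than the crop size")
    row, col = center
    half = size // 2
    # Clamp the top-left corner so the window stays inside the image.
    top = min(max(row - half, 0), height - size)
    left = min(max(col - half, 0), width - size)
    return image[top:top + size, left:left + size]


def center_crop(image, size=64):
    """Plain crop around the image midpoint, ignoring where the heart is."""
    height, width = image.shape
    return crop_around(image, (height // 2, width // 2), size)
```

With something like this, `center_crop(frame)` is the "no special centering" baseline, while `crop_around(frame, heart_rc)` would be the heart-centered variant, where `heart_rc` is whatever estimate of the heart position your preprocessing produces.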

