Random Data Splitting and Cross-validation with Amazon Machine Learning

You can now make your Amazon Machine Learning (Amazon ML) model evaluations more accurate by using a random splitting strategy, which trains and evaluates ML models on random subsets of the input data records. Random splitting may be the best strategy for making your evaluation data representative of your training data, which in turn makes the model evaluation more reliable. You can choose your splitting strategy through the Amazon ML console or API, and you receive alerts when the training and evaluation data are not similar, so that you can select a different data splitting strategy for the next model iteration. You can also now create more accurate evaluations of your models by using cross-validation on your data. Cross-validation is particularly valuable if you are invoking many…
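
To make the two ideas concrete, here is a minimal sketch in plain Python of a seeded random train/evaluation split and a k-fold cross-validation split. It does not call the Amazon ML console or API; the function names (`random_split`, `k_fold_splits`), the 70/30 ratio, the fold count, and the seed are all assumptions chosen for illustration.

```python
import random


def random_split(records, train_percent=70, seed=42):
    """Shuffle a copy of the records, then cut it into train and evaluation subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]


def k_fold_splits(records, k=4, seed=42):
    """Yield (train, evaluation) pairs so that each record is evaluated exactly once."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled records into k folds of near-equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        evaluation = folds[i]
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, evaluation


# Hypothetical input: 100 record identifiers standing in for real data rows.
records = list(range(100))
train, evaluation = random_split(records)
folds = list(k_fold_splits(records, k=4))
```

With a random split, both subsets are drawn from the same shuffled pool, so ordering effects in the source file (for example, records sorted by date or label) do not end up concentrated in one subset. With k-fold cross-validation, the model would be trained and evaluated once per fold and the k evaluation results averaged.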

