Enhance Machine Learning with Standardizing, Binning, Reducing

Editor’s Note: This is the second in a four-part series on improving analytics output with feature engineering. Visit Data Informed all this week for the subsequent entries in the series.

In Part 1 of this series, we gave a broad overview of some popular feature-engineering tricks. In this part, we will look at the first three tricks in detail.

Standardizing Numerical Variables

Standardization is a popular pre-processing step in data preparation. It brings all the variables onto the same scale so that your machine-learning algorithm gives them equal importance rather than favoring the variables that happen to have the largest numeric range.

Let’s consider an example with K-means clustering, a popular data mining and unsupervised-learning technique. We will work with publicly available wine data…
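To make the effect of scale concrete, here is a minimal sketch of z-score standardization, one common way to standardize, using only the Python standard library. The numbers are made-up toy values and not the wine data mentioned above, and the names `standardize`, `small_scale`, and `large_scale` are illustrative assumptions rather than anything from the original article. The sketch compares the Euclidean distance between two rows before and after scaling, since K-means relies on such distances.

```python
import math
import statistics


def standardize(column):
    """Return z-scores for a column: (x - mean) / standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # population standard deviation
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up toy data (NOT the wine data): two features on very different scales.
small_scale = [0.2, 0.4, 0.6, 0.8]
large_scale = [1000.0, 3000.0, 2000.0, 4000.0]

raw_rows = list(zip(small_scale, large_scale))
scaled_rows = list(zip(standardize(small_scale), standardize(large_scale)))

# Distance between the first two rows, before and after standardization.
raw_distance = euclidean(raw_rows[0], raw_rows[1])
scaled_distance = euclidean(scaled_rows[0], scaled_rows[1])

print("raw distance:", raw_distance)
print("scaled distance:", scaled_distance)
```

On the raw values the distance is almost entirely determined by the large-scale feature, so a distance-based method such as K-means would effectively ignore the small-scale one. After standardization each column has mean 0 and standard deviation 1, and both features contribute on comparable terms.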

