Apache Spark Machine Learning Tutorial

Editor’s Note: Don’t miss our new free on-demand training course about how to create data pipeline applications using Apache Spark – learn more here.

Decision trees are widely used for the machine learning tasks of classification and regression. In this blog post, I’ll help you get started using decision trees in Apache Spark’s MLlib machine learning library for classification.

Overview of ML Algorithms

In general, machine learning may be broken down into two classes of algorithms: supervised and unsupervised. Supervised algorithms use labeled data in which both the input and output are provided to the algorithm. Unsupervised algorithms do not have the outputs in advance; they are left to make sense of the data without labels.

Three Categories of Techniques for Machine Learning

Three common categories of machine learning techniques are Classification,…
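
To make the supervised case described above a little more concrete, here is a minimal sketch of a one-level decision tree (a "decision stump") for classification, written in plain Python. It is not Spark MLlib's API and not the article's own code: the training data, the function names (`fit_stump`, `predict`, `majority_label`), and the error-count split criterion are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical labeled data: (feature value, label). A supervised
# algorithm is given both the input and the expected output.
training_data = [(1.0, "no"), (2.0, "no"), (3.0, "no"),
                 (6.0, "yes"), (7.0, "yes"), (8.0, "yes")]


def majority_label(labels):
    """Return the most common label, or None for an empty list."""
    return Counter(labels).most_common(1)[0][0] if labels else None


def fit_stump(data):
    """Pick the threshold that misclassifies the fewest training rows."""
    values = sorted({x for x, _ in data})
    # Candidate thresholds are midpoints between adjacent feature values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for threshold in candidates:
        left = majority_label([y for x, y in data if x <= threshold])
        right = majority_label([y for x, y in data if x > threshold])
        errors = sum(1 for x, y in data
                     if (left if x <= threshold else right) != y)
        if best is None or errors < best[0]:
            best = (errors, threshold, left, right)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]  # (threshold, label_if_below, label_if_above)


def predict(stump, x):
    """Classify x using the single learned split."""
    threshold, left, right = stump
    return left if x <= threshold else right


stump = fit_stump(training_data)
print(stump, predict(stump, 2.5), predict(stump, 7.5))
```

A full decision tree applies the same idea recursively, splitting each side again until the rows are well separated. Practical implementations typically score splits with an impurity measure such as Gini or entropy rather than a raw error count.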

