Machine Learning using DB2 for z/OS data and Spark Part 2

This is Part 2 of a blog series on machine learning using DB2 for z/OS data and the Spark machine learning feature. In Part 1 (https://www.ibm.com/developerworks/community/blogs/e429a8a2-b27f-48f3-aa73-ca13d5b69759/entry/Machine_Learning_using_DB2_for_z_OS_data_and_Spark_Part_1?lang=en), we used VectorAssembler to create the features that serve as input to our model. In Part 2, we will use an R formula instead. Spark's RFormula selects the columns specified by an R model formula. See https://spark.apache.org/docs/2.0.2/ml-features.html#rformula for details.

If you have not done so, please read Part 1 for the background, prerequisites, and general steps. The process for using RFormula is basically the same as the one described in Part 1; I will call out the differences and additional steps.

Add the following import statement in Step 2 d):

import org.apache.spark.ml.feature.RFormula

Replace Step 3 b) and c) with the following:

//use R formula
val formula = new RFormula().setFormula("drugLabel ~ AGE +…
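As a rough illustration of how RFormula builds the feature and label columns, here is a minimal, self-contained sketch. It is not the code from this post: the formula in the excerpt is truncated, so the SEX column, the tiny in-memory DataFrame, the numeric drugLabel values, and the local SparkSession are all assumptions made for the example. In the real workflow the DataFrame would come from DB2 for z/OS as described in Part 1.

```scala
import org.apache.spark.ml.feature.RFormula
import org.apache.spark.sql.SparkSession

// Local session for illustration only (assumption; Part 1 describes the real setup)
val spark = SparkSession.builder()
  .appName("RFormulaSketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical stand-in data; the real rows would be read from DB2 for z/OS
val df = Seq(
  (23, "F", 0.0),
  (47, "M", 1.0),
  (61, "F", 2.0)
).toDF("AGE", "SEX", "drugLabel")

// Label on the left of '~', predictor columns on the right.
// SEX is a made-up second predictor; the original formula is truncated after AGE.
val formula = new RFormula()
  .setFormula("drugLabel ~ AGE + SEX")
  .setFeaturesCol("features")
  .setLabelCol("label")

// fit() inspects the schema (string columns get one-hot encoded),
// transform() appends the "features" vector and "label" columns
val output = formula.fit(df).transform(df)
output.select("features", "label").show(false)

val outCols = output.columns
spark.stop()
```

The resulting "features" and "label" columns play the same role as the VectorAssembler output in Part 1, so the rest of the pipeline should be able to consume them in the same way.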

