Using Apache Spark to Analyze Large Neuroimaging Datasets

This article was written by Sergul Aydore, Ph.D., and Syed Ashrafulla, Ph.D. Both received their Ph.D.s in Electrical Engineering from the University of Southern California in 2014, where they applied signal processing to neuroimaging data. They continue to use machine learning on brain imaging data as a pastime and share their knowledge with the community. They constantly challenge each other as good buddies and like to call themselves “signal learners.” The views expressed in this article are those of the authors and not of their employers or Domino Data Lab.

In this post we describe how we used PySpark, through Domino’s data science platform, to analyze dominant components in high-dimensional neuroimaging data. We will demonstrate how to perform Principal Components Analysis (PCA) on a dataset large enough that standard single-computer…

