Review: Databricks makes big data dreams come true

For those of you just tuning in, Spark, an open source cluster computing framework, was originally developed by Matei Zaharia at U.C. Berkeley’s AMPLab in 2009, and later open-sourced and donated to the Apache Foundation. Part of the motivation for creating Spark is that MapReduce only allows a single pass through the data, while machine learning (ML) and graphing algorithms generally need to perform multiple passes.Spark is billed as a “fast and general engine for large-scale data processing,” with a tagline of “Lightning-fast cluster computing.” In the world of big data, Spark has been attracting attention and investment because it provides a powerful in-memory data-processing component within Hadoop that deals with both real-time and batch events. In addition to Databricks, Spark has been embraced by the likes of IBM, Microsoft,…


Link to Full Article: Review: Databricks makes big data dreams come true