Spark 2.0 aims to make streaming a first-class citizen

Spark has generated huge excitement in the big data community. It has provided a path for making the speed, or velocity, layer of big data a reality, smoothed over some of the gotchas of MapReduce, and expanded the palette of computations beyond MapReduce, all through a set of common interfaces. But issues remain with memory management, coordination with YARN, and leveling the playing field for R and Python developers. With the new release, Spark flattens the Lambda architecture, adds some tweaks for using R, and takes performance up a notch. Yes, Spark’s entering puberty. Spark 2.0, announced last week following a couple of months of tech previews, generated few surprises. But the highlight was a new Structured Streaming API that brings interactive SQL queries to the real-time…

