Five things you need to know about Hadoop v. Apache Spark

They’re sometimes viewed as competitors in the big-data space, but the growing consensus is that they’re better together.

IDG News Service, Dec 11, 2015 12:01 PM

Listen in on any conversation about big data, and you’ll probably hear mention of Hadoop or Apache Spark. Here’s a brief look at what they do and how they compare.

1: They do different things. Hadoop and Apache Spark are both big-data frameworks, but they don’t really serve the same purposes. Hadoop is essentially a distributed data infrastructure: It distributes massive data collections across multiple nodes within a cluster of commodity servers, which means you don’t need to buy and maintain expensive custom hardware. It also indexes and keeps track of that data, enabling…

