Sunday 25 September 2016

Spark Cluster

Apache Spark is a cluster computing engine for large-scale data processing at lightning speed.

It runs programs up to 100 times faster than Hadoop MapReduce when the data fits in memory.

Apache Spark applications run on a cluster as independent sets of processes, coordinated by the SparkContext object in your driver program.
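
To make this concrete, here is a minimal sketch of a Scala driver program that creates a SparkContext and runs a trivial job; the application name and the counting logic are placeholders for illustration, not part of any real deployment.

    import org.apache.spark.{SparkConf, SparkContext}

    object SimpleApp {
      def main(args: Array[String]): Unit = {
        // The driver program builds a SparkConf and a SparkContext;
        // the context coordinates the executor processes that do the actual work.
        val conf = new SparkConf()
          .setAppName("SimpleApp")   // placeholder application name
          .setMaster("local[*]")     // run locally, using all available cores
        val sc = new SparkContext(conf)

        // A trivial job: count the elements of a distributed collection.
        val data = sc.parallelize(1 to 1000)
        println(s"Count: ${data.count()}")

        sc.stop()
      }
    }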

To run on a cluster, the SparkContext can connect to several types of cluster managers: Spark's own standalone cluster manager, Apache Mesos, or Hadoop YARN.
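
The cluster manager is selected through the master URL passed to the SparkConf. A rough sketch of the common forms, with placeholder host names and ports:

    import org.apache.spark.SparkConf

    // The master URL tells the SparkContext which cluster manager to use.
    // Host names and ports below are placeholders for illustration.
    val standalone = new SparkConf().setMaster("spark://master-host:7077") // Spark standalone
    val mesos      = new SparkConf().setMaster("mesos://mesos-host:5050")  // Apache Mesos
    val onYarn     = new SparkConf().setMaster("yarn")                     // Hadoop YARN
    val local      = new SparkConf().setMaster("local[4]")                 // local mode, 4 threads

In practice the master URL is usually not hard-coded; it is supplied at launch time through the --master option of spark-submit, so the same application can run on any of these cluster managers unchanged.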