Real World Spark 2 - ScalaIDE Spark Core 2 Developer

Build a Vagrant box, then walk through Spark 2 Core code using sbt and ScalaIDE. Spark is the modern cluster computation engine.

Ratings 4.00 / 5.00

What You Will Learn!

Run a single command on your desktop, go for a coffee, and come back to a running distributed environment for cluster deployment
Code in Scala against Spark: transformations, actions and Spark monitoring
Debug Spark Code within ScalaIDE

Description

Note: This course builds on the "Real World Vagrant - Build an Apache Spark Development Env! - Toyin Akin" course. If you do not already have a Spark + ScalaIDE environment installed (within a VM or directly on your machine), take that course first.

Scala IDE provides advanced editing and debugging support for the development of pure Scala and mixed Scala-Java applications.

It now comes with a shiny Scala debugger, semantic highlighting, a more reliable JUnit test finder, an ecosystem of related plugins, and much more.

The Scala debugger supports stepping through closures and a Scala-aware display of debugging information.

Spark Monitoring and Instrumentation

While creating RDDs, performing transformations and executing actions, you will be working heavily within the monitoring view of the Web UI.

Every SparkContext launches a web UI, by default on port 4040, that displays useful information about the application. This includes:

A list of scheduler stages and tasks
A summary of RDD sizes and memory usage
Environmental information
Information about the running executors
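
As a rough illustration of the workflow described above, the following minimal sketch creates an RDD, applies two transformations, runs one action, and prints the address of the web UI. The object name, the helper method, the sample words and the local[2] master are all assumptions made for this example; they are not taken from the course material.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object, not part of the course code.
object WordLengths {

  // Transformations (map, filter) are lazy; collect() is the action that
  // actually schedules a job, which then shows up in the web UI.
  def longWordLengths(sc: SparkContext, words: Seq[String]): Array[(String, Int)] = {
    val lengths = sc.parallelize(words)
      .map(w => (w, w.length))            // transformation: pair each word with its length
      .filter { case (_, n) => n > 3 }    // transformation: keep words longer than 3 characters
    lengths.collect()                     // action: run the job and bring results to the driver
  }

  def main(args: Array[String]): Unit = {
    // Assumption: run locally with two threads instead of on a real cluster.
    val conf = new SparkConf().setAppName("word-lengths-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      longWordLengths(sc, Seq("spark", "scala", "sbt", "vagrant")).foreach(println)
      // uiWebUrl is None when the UI is disabled; by default it points at port 4040.
      println(s"Web UI: ${sc.uiWebUrl.getOrElse("disabled")}")
    } finally {
      sc.stop()
    }
  }
}
```

The web UI only lives as long as the SparkContext, so to inspect the stages and tasks in a browser you would need to keep the application running, for example by pausing at a breakpoint in the ScalaIDE debugger before sc.stop() is reached.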

Why Apache Spark ...

Apache Spark runs programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk. Apache Spark has an advanced DAG (directed acyclic graph) execution engine that supports acyclic data flow and in-memory computing. Apache Spark offers over 80 high-level operators that make it easy to build parallel apps, and you can use it interactively from the Scala, Python and R shells. Apache Spark can combine SQL, streaming, and complex analytics.

Apache Spark powers a stack of libraries including SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming. You can combine these libraries seamlessly in the same application.
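
To give a feel for what combining libraries in one application can look like, here is a small sketch that counts words with the core RDD API and then queries the result through Spark SQL. It assumes the spark-sql module is on the classpath; the object name, helper method, view name and sample data are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical example object, not part of the course code.
object CoreWithSql {

  // Count words with the RDD API, then filter the counts with a SQL query.
  def repeatedWords(spark: SparkSession, words: Seq[String]): DataFrame = {
    import spark.implicits._ // enables toDF on RDDs of tuples

    val counts = spark.sparkContext
      .parallelize(words)
      .map(w => (w, 1))
      .reduceByKey(_ + _)                 // core RDD operator: sum the counts per word

    // Expose the RDD result to Spark SQL under an assumed view name.
    counts.toDF("word", "total").createOrReplaceTempView("word_counts")
    spark.sql("SELECT word, total FROM word_counts WHERE total > 1")
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with two threads, as in a development VM.
    val spark = SparkSession.builder()
      .appName("core-with-sql-sketch")
      .master("local[2]")
      .getOrCreate()
    try {
      repeatedWords(spark, Seq("spark", "scala", "spark", "sbt")).show()
    } finally {
      spark.stop()
    }
  }
}
```

Both steps share one SparkSession and its underlying SparkContext, so the RDD job and the SQL query appear under the same application in the web UI.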

Who Should Attend!

Software engineers who want to expand their skills into the world of distributed computing
Developers who want to write and test code in Scala against Spark
