Spark

Apache Spark is an open-source cluster computing framework. It was originally developed at the University of California, Berkeley's AMPLab in 2009 and later donated to the Apache Software Foundation. It is part of a broader set of open-source tools, along with Apache Hadoop, that are widely used in today’s analytics community.

Advantages

Lightning-fast processing – Spark enables applications in Hadoop clusters to run up to 100x faster than Hadoop MapReduce in memory, and up to 10x faster even when running on disk.
Support for sophisticated analytics – Spark supports SQL queries, streaming data, machine learning, and complex analytics such as graph algorithms. Users can combine all these capabilities in a single workflow.
Real-time Stream Processing
Ability to integrate with Hadoop and existing Hadoop data
Active and expanding community

Disadvantages

Data arriving out of time order is a problem for batch-based processing
Batch length restricts window-based analytics – in practice data is often of poor quality: some records may be missing, and streams can arrive with data out of time order
Per-server performance is limited by current stream processing standards – Spark relies on scaling out across a large number of servers to gain overall system performance
Writing stream processing operations from scratch is not easy – Spark Streaming offers only a limited library of stream functions
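
The first two points can be illustrated without Spark itself. The following is a minimal pure-Python sketch, not Spark Streaming code: the 10-second batch interval and the sample records are made up for illustration. It groups the same records once by arrival time, as a batch-based engine would, and once by the time the event actually happened.

```python
from collections import defaultdict

BATCH_SECONDS = 10  # hypothetical batch interval, chosen for illustration

# (event_time, arrival_time, value) tuples; made-up sample data.
records = [
    (1, 2, "a"),
    (4, 5, "b"),
    (12, 12, "c"),
    (8, 13, "d"),   # out of order: happened at t=8 but arrived at t=13
    (15, 16, "e"),
]

def bucket(records, key_index, width):
    """Group record values into fixed-width buckets keyed by bucket start time."""
    buckets = defaultdict(list)
    for rec in records:
        start = (rec[key_index] // width) * width
        buckets[start].append(rec[2])
    return dict(buckets)

# Grouping by arrival time mimics batch-based processing.
by_arrival = bucket(records, key_index=1, width=BATCH_SECONDS)
# Grouping by event time is what the analysis usually wants.
by_event = bucket(records, key_index=0, width=BATCH_SECONDS)

print("by arrival:", by_arrival)
print("by event:  ", by_event)
```

Record "d" ends up in the second batch when grouped by arrival time, although it belongs to the first 10 seconds by event time. In Spark Streaming's batch-based model, window and slide durations must also be multiples of the batch interval, which is why the batch length limits the windows that can be expressed.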

Components

Types of cluster managers:

– Standalone: a simple cluster manager that makes it easy to set up a cluster
– Apache Mesos: a general cluster manager that can run service applications
– Hadoop YARN: the resource manager in Hadoop 2.0
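
Each cluster manager is selected through the master URL given to Spark. The helper below is a small illustrative sketch: the function name `master_url` and the host name are hypothetical, while the URL schemes and default ports (7077 for standalone, 5050 for Mesos) are the commonly documented ones and should be checked against the Spark version in use.

```python
def master_url(manager, host=None, port=None):
    """Return a Spark master URL string for the given cluster manager.

    Hypothetical helper for illustration; not part of Spark.
    """
    if manager == "standalone":
        # 7077 is the usual default port of a standalone master.
        return f"spark://{host}:{port or 7077}"
    if manager == "mesos":
        # 5050 is the usual default port of a Mesos master.
        return f"mesos://{host}:{port or 5050}"
    if manager == "yarn":
        # The cluster location is read from the Hadoop configuration,
        # so no host or port appears in the URL.
        return "yarn"
    raise ValueError(f"unknown cluster manager: {manager}")

# "master.example.com" is a placeholder host name.
print(master_url("standalone", "master.example.com"))
print(master_url("mesos", "master.example.com"))
print(master_url("yarn"))
```

The resulting string is what would be passed as the `--master` option of `spark-submit` or set as the `spark.master` property.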

Shipping code to the cluster – dynamically adding new files to be sent to executors
Monitoring – offers information about running executors and tasks
Job scheduling – Spark gives control over resource allocation both across applications and within a single application
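
As a rough sketch of how shipping files and scheduling settings come together at submission time, the snippet below assembles a `spark-submit` command line in plain Python. The function name, the application file `job.py` and the data file `lookup.csv` are hypothetical; `--master`, `--files` and `--conf`, and the properties `spark.scheduler.mode` and `spark.dynamicAllocation.enabled`, are standard Spark options, though the right values depend on the cluster.

```python
import shlex

def spark_submit_command(app, master, files=(), conf=None):
    """Build the argument list for a spark-submit invocation (illustrative helper)."""
    args = ["spark-submit", "--master", master]
    if files:
        # --files ships the listed files to the executors.
        args += ["--files", ",".join(files)]
    # Sort the properties so the generated command is deterministic.
    for key, value in sorted((conf or {}).items()):
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args

cmd = spark_submit_command(
    app="job.py",                 # hypothetical application
    master="yarn",
    files=["lookup.csv"],         # hypothetical file sent to executors
    conf={
        # Fair sharing between jobs inside one application.
        "spark.scheduler.mode": "FAIR",
        # Lets an application give back idle executors to other applications.
        "spark.dynamicAllocation.enabled": "true",
    },
)
print(shlex.join(cmd))
```

The command is only built and printed here, not executed, so the sketch can be tried without a Spark installation.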

Development tools

IntelliJ
Eclipse
