Spark Fundamentals

A solid understanding of, and experience with, the core tools of any field promotes excellence and innovation. Apache Spark, a general engine for large-scale data processing, is one such tool in the big data realm. This learning path covers the fundamentals of Spark's design and its everyday application.


ABOUT THIS LEARNING PATH

Have you ever left a report running overnight, only to come back to your computer in the morning and find it still running? When the heat is on and you have a deadline, that is a sign that something is not working. As data sets grow larger, you need to be fluent in the right tools to meet your commitments. This learning path is your opportunity to learn about Spark from industry leaders. It provides hands-on exercises and projects to build your confidence with this tool set.

Come along and start your journey toward earning the following badges: Spark – Level 1 and Spark – Level 2.

AUDIENCE:

Data Engineers, Data Scientists

LEARNING PATH LEVEL:

Beginner

2 BADGES

5 COURSES

Spark Fundamentals Courses

Spark Fundamentals I

About the course
Ignite your interest in Spark with an introduction to the core concepts that make this general-purpose processing engine an essential tool for working with big data.

Spark Fundamentals II

About the course
Building on your foundational knowledge of Spark, take this opportunity to move your skills to the next level. With a focus on Spark's Resilient Distributed Dataset (RDD) operations, this course exposes you to concepts that are critical to your success in this field.

Spark MLlib

About the course
Spark provides a machine learning library known as MLlib. MLlib provides machine learning algorithms for tasks such as classification, regression, clustering, and collaborative filtering. It also provides tools for featurization, pipelines, and persistence, along with utilities for linear algebra, statistics, and data handling.

Exploring Spark's GraphX

About the course
Spark provides a graph-parallel computation library in GraphX. Graph-parallel computation is a paradigm in which your data is represented as vertices and edges. GraphX provides a set of fundamental operators, in addition to a growing collection of algorithms and builders, to simplify graph analytics tasks.

Analyzing Big Data in R using Apache Spark

About the course
Apache Spark is a popular cluster computing framework used for large-scale data analysis. SparkR provides a distributed data frame API that enables structured data processing with a syntax familiar to R users.

Course Badges

Earn these badges along the way.

Spark – Level 1

Earn by completing this course.

Spark – Level 2

Earn by completing the Spark Fundamentals learning path.

Each learning path is designed so that every course builds on the concepts covered in the courses before it. We recommend completing the courses in the order outlined in this learning path to get the most out of your investment of time. If you like what you see here, come and discover our other learning paths and browse our course catalog.