Apache Spark and Scala Certification Training Course

Apache Spark and Scala Certification Training Course by eMexo Technologies Training Institute Bangalore

Level: Beginner
Created by eMexo Technologies Training Institute | Last updated: Fri, 08-Apr-2022 | Language: English


Free videos and free material for the Apache Spark and Scala Certification Training Course, uploaded by eMexo Technologies Training Institute.

Syllabus / What will I learn?

Course Content

Introduction to Big Data Hadoop and Spark

 What is Big Data?

 Big Data Customer Scenarios

 Limitations and Solutions of Existing Data Analytics Architecture with Uber Use Case

 How Hadoop Solves the Big Data Problem

 What is Hadoop?

 Hadoop’s Key Characteristics

 Hadoop Ecosystem and HDFS

 Hadoop Core Components

 Rack Awareness and Block Replication

 YARN and its Advantage

 Hadoop Cluster and its Architecture

 Hadoop: Different Cluster Modes

 Hadoop Terminal Commands

 Big Data Analytics with Batch & Real-time Processing

 Why Is Spark Needed?

 What is Spark?

 How Spark Differs from Other Frameworks

 Spark at Yahoo!

Introduction to Scala for Apache Spark

 What is Scala?

 Why Scala for Spark?

 Scala in other Frameworks

 Introduction to Scala REPL

 Basic Scala Operations

 Variable Types in Scala

 Control Structures in Scala

 Foreach loop, Functions and Procedures

 Collections in Scala: Array, ArrayBuffer, Map, Tuples, Lists, and more

 Hands-on

Functional Programming and OOP Concepts in Scala

 Functional Programming

 Higher Order Functions

 Anonymous Functions

 Class in Scala

 Getters and Setters

 Custom Getters and Setters

 Properties with only Getters

 Auxiliary Constructor and Primary Constructor

 Singletons

 Extending a Class

 Overriding Methods

 Traits as Interfaces and Layered Traits

 Hands-on

Deep Dive into Apache Spark Framework

 Spark’s Place in Hadoop Ecosystem

 Spark Components & its Architecture

 Spark Deployment Modes

 Introduction to Spark Shell

 Writing your first Spark Job Using SBT

 Submitting Spark Job

 Spark Web UI

 Data Ingestion using Sqoop

 Hands-on

Playing with Spark RDDs

 Challenges in Existing Computing Methods

 Probable Solution & How RDD Solves the Problem

 What is an RDD, Its Operations, Transformations & Actions

 Data Loading and Saving Through RDDs

 Key-Value Pair RDDs

 Other Pair RDDs, Two Pair RDDs

 RDD Lineage

 RDD Persistence

 WordCount Program Using RDD Concepts

 RDD Partitioning & How It Helps Achieve Parallelization

 Passing Functions to Spark

 Hands-on

DataFrames and Spark SQL

 Need for Spark SQL

 What is Spark SQL?

 Spark SQL Architecture

 SQL Context in Spark SQL

 User Defined Functions

 Data Frames & Datasets

 Interoperating with RDDs

 JSON and Parquet File Formats

 Loading Data through Different Sources

 Spark – Hive Integration

 Hands-on

Machine Learning using Spark MLlib

 Why Machine Learning?

 What is Machine Learning?

 Where Is Machine Learning Used?

 Face Detection: Use Case

 Different Types of Machine Learning Techniques

 Introduction to MLlib

 Features of MLlib and MLlib Tools

 Various ML algorithms supported by MLlib

Deep Dive into Spark MLlib

 Supervised Learning – Linear Regression, Logistic Regression, Decision Tree, Random Forest

 Unsupervised Learning – K-Means Clustering & How It Works with MLlib

 Analysis on US Election Data using MLlib (K-Means)

 Hands-on

Understanding Apache Kafka and Apache Flume

 Need for Kafka

 What is Kafka?

 Core Concepts of Kafka

 Kafka Architecture

 Where is Kafka Used?

 Understanding the Components of Kafka Cluster

 Configuring Kafka Cluster

 Kafka Producer and Consumer Java API

 Need for Apache Flume

 What is Apache Flume?

 Basic Flume Architecture

 Flume Sources

 Flume Sinks

 Flume Channels

 Flume Configuration

 Integrating Apache Flume and Apache Kafka

 Hands-on

Apache Spark Streaming - Processing Multiple Batches

 Drawbacks in Existing Computing Methods

 Why Is Streaming Necessary?

 What is Spark Streaming?

 Spark Streaming Features

 Spark Streaming Workflow

 How Uber Uses Streaming Data

 Streaming Context & DStreams

 Transformations on DStreams

 Windowed Operators and Why They Are Useful

 Important Windowed Operators

 Slice, Window and ReduceByWindow Operators

 Stateful Operators

Apache Spark Streaming - Data Sources

 Apache Spark Streaming: Data Sources

 Streaming Data Source Overview

 Apache Flume and Apache Kafka Data Sources

 Example: Using a Kafka Direct Data Source

 Perform Twitter Sentiment Analysis Using Spark Streaming

 Hands-on



Description

eMexo Technologies' Apache Spark training in Electronic City, Bangalore takes you from the fundamentals of Apache Spark to advanced topics, with the aim of preparing you to develop real-time Apache Spark applications. The major topics covered in this Apache Spark course syllabus are: an introduction to the Spark shell and the training environment; an introduction to Spark DataFrames and Spark SQL; an introduction to RDDs; lazy evaluation; data sources (reading from Parquet, S3, Cassandra, HDFS, and your local file system); programming with accumulators and broadcast variables; advanced programming with RDDs (understanding the shuffle phase, partitioning, etc.); visualization (matplotlib, gg_plot, dashboards, and exploration and visualization in notebooks); an introduction to Spark Streaming; and an introduction to MLlib and GraphX. Each topic is covered in a practical way, with examples.


Material price: Free

1:1 Online Training Fee: 10000 /-