Big Data Hadoop

Big Data Hadoop training provided by Technogeeks Training Institute in Pune, Aundh

Level: Beginner
Created by Technogeeks Training Institute staff. Last updated Wed, 13-Apr-2022. Language: English


Free Big Data Hadoop videos and material uploaded by Technogeeks Training Institute. This page covers the updated Big Data Hadoop syllabus, lecture notes, videos, MCQs, previous question papers, and training for this course. If material has not been uploaded, check another subject.

Syllabus / What will I learn?

MODULE 1 - INTRODUCTION TO HADOOP

Hadoop Demo

What is Big Data

When data becomes Big Data

The 3 Vs of Big Data

Introduction to Hadoop Ecosystem

Why Hadoop, when existing tools and technologies have been on the market for decades?

The two categories of Hadoop projects: new projects built on Hadoop

Clients requesting a POC and migration of existing tools and technologies to Hadoop

How an open-source tool (Hadoop) can run jobs in less time than other tools on the market

Hadoop Processing Framework (Map Reduce) / YARN

Alternatives to Map Reduce

Why NoSQL is in more demand nowadays

Distributed warehouse for DFS

Most in-demand tools that run on top of the Hadoop Ecosystem for specific requirements in specific scenarios

Data import/export tools

MODULE 2 - HADOOP SETUP, INSTALLATION AND PIG BASICS

Hadoop installation

Introduction to Hadoop FS and Processing Environment’s UIs

How to read and write files

Basic Unix commands for Hadoop

Hadoop’s FS shell

Hadoop’s releases

Hadoop’s daemons

MODULE 3 - HIVE BASIC, HIVE ADVANCED

Hive Introduction

Hive Advanced

Partitioning

Bucketing

External Tables

Complex Use cases in Hive

Hive Advanced Assignment

Real-time scenarios of Hive
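The difference between the partitioning and bucketing topics above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Hive itself: Hive stores each partition as a subdirectory named `column=value` and assigns rows to a fixed number of buckets by hashing the bucketing column. The function name `layout`, the `bucket_N` file names, the sample rows, and the simple modulo hash are all hypothetical choices made for this sketch.

```python
from collections import defaultdict

def layout(rows, table_path, partition_col, bucket_col, num_buckets):
    """Group rows the way a partitioned and bucketed table would be laid out.

    Partitioning: one directory per distinct value of partition_col.
    Bucketing: a fixed number of files per directory, chosen by hashing
    bucket_col (here a toy hash: integer value modulo num_buckets).
    """
    files = defaultdict(list)
    for row in rows:
        directory = f"{table_path}/{partition_col}={row[partition_col]}"
        bucket = row[bucket_col] % num_buckets  # toy hash, assumes integer keys
        files[f"{directory}/bucket_{bucket}"].append(row)
    return dict(files)

if __name__ == "__main__":
    # Hypothetical sample data
    rows = [
        {"user_id": 1, "country": "IN"},
        {"user_id": 2, "country": "IN"},
        {"user_id": 3, "country": "US"},
        {"user_id": 5, "country": "IN"},
    ]
    for path, members in sorted(layout(rows, "/warehouse/users", "country", "user_id", 2).items()):
        print(path, [r["user_id"] for r in members])
```

A query that filters on the partition column only needs to read one directory, while bucketing keeps the number of files fixed regardless of how many distinct key values exist.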

MODULE 4 - MAP REDUCE BASICS, POC (PROOF OF CONCEPT)

How Map Reduce works as Processing Framework

End to End execution flow of Map Reduce job

Different tasks in Map Reduce job

Why Reducer is optional while Mapper is mandatory?

Introduction to Combiner

Introduction to Partitioner

Programming languages for Map Reduce

Why Java is preferred for Map Reduce programming

POC based on Pig, Hive, HDFS, MR
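The tasks listed in this module (mapper, combiner, partitioner, reducer) can be sketched as a word count in plain Python. This is a single-process illustration of the execution flow, not Hadoop's Java API: the function names, the sample input, and the toy byte-sum hash in `partitioner` are assumptions made for the sketch.

```python
from collections import defaultdict

def mapper(line):
    """Map task: emit (word, 1) for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def combiner(pairs):
    """Combiner: pre-aggregate one mapper's output locally to reduce shuffle traffic."""
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())

def partitioner(key, num_reducers):
    """Partitioner: pick the reducer for a key (deterministic toy hash, not Hadoop's)."""
    return sum(key.encode("utf-8")) % num_reducers

def reducer(key, values):
    """Reduce task: sum all partial counts for one key."""
    return key, sum(values)

def run_job(lines, num_reducers=2):
    """Run map -> combine -> partition/shuffle -> reduce in one process."""
    # Each input line stands in for one input split handled by one mapper.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for line in lines:
        for key, value in combiner(mapper(line)):
            partitions[partitioner(key, num_reducers)][key].append(value)
    result = {}
    for part in partitions:
        for key in sorted(part):  # keys reach each reducer in sorted order
            word, total = reducer(key, part[key])
            result[word] = total
    return result

if __name__ == "__main__":
    print(run_job(["big data hadoop", "hadoop map reduce", "big data"]))
```

If the job only filtered or transformed records, the output of `mapper` could be written out directly, which is why a reducer is optional while a mapper is not.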

MODULE 5 - MAP-REDUCE ADVANCED, HBASE BASICS

How to work on Map Reduce in real-time

Map Reduce complex scenarios

Drawbacks of Hadoop

Why Hadoop can’t be used for real-time processing

MODULE 6 - ZOOKEEPER, SQOOP, QUICK REVISION OF PREVIOUS CLASSES

Introduction to Zookeeper

How Zookeeper helps in Hadoop Ecosystem

How to load data from relational storage into Hadoop

Sqoop basics

Sqoop practical implementation

Quick revision of previous classes to fill gaps in understanding and correct misunderstandings

MODULE 7 - FLUME, OOZIE, HADOOP RELEASES, INTRODUCTION TO YARN

How to load data into Hadoop that comes from a web server or other storage without a fixed schema

How to load unstructured and semi-structured data into Hadoop

Introduction to Flume

Hands-on with Flume

How to load Twitter data into HDFS using Hadoop

Introduction to Oozie

What kind of jobs can be scheduled using Oozie

How to schedule time-based jobs

Hadoop releases

Where to get Hadoop and other components for installation

Introduction to YARN

Significance of YARN

MODULE 8 - INTRODUCTION TO HUE, DIFFERENT VENDORS IN THE MARKET, MAJOR PROJECT DISCUSSION

Introduction to Hue

How Hue is used in real-time

Real-time Hadoop usage

Real-time cluster introduction

Hadoop Release 1 vs Hadoop Release 2 in real-time

Hadoop real-time project

Major POC based on the combination of several tools of Hadoop Ecosystem

Datasets for practice purpose

MODULE 9 - SPARK AND PYTHON

Introduction to Spark

Introduction to Python

PySpark concepts

Advantages of Spark over Hadoop

Is Spark a replacement for Hadoop?

How Spark is faster than Hadoop

Spark RDD

Spark Transformation and Actions

Spark SQL

Datasets and Data Frames

Real-time scenario examples where Spark is preferred over Hadoop

How Spark can process complex data sets in less time

In-Memory Processing Framework for Analytics

MODULE 10 - HADOOP IN CLOUD COMPUTING: AWS

Introduction to Cloud Computing

On-premises vs cloud setup

Major cloud providers for Big Data

What is EMR

HDFS vs S3

Overview and working of AWS Glue jobs

AWS Glue

AWS Redshift

AWS Athena

 



Description
Need online training or an explanation for this course?
1:1 online training / explanation fee: 10000 /- per month

For 1:1 online training, contact the instructor for a demo.



Material price: Free
