Introduction to Big Data & Hadoop
- Hadoop Introduction
- Big Data usage & benefits
- Hadoop Architecture
- MapReduce 1.0 & 2.0
- Hadoop cluster setup
- File read / write operations
Preparing Hadoop Server Operating System (Linux) with Networking
- Host-name, IP address / Network configuration, Persistent Rules
- Setup Data-nodes hosts
Preparing client Data-node machines
- Host-name, IP address / Network configuration, Persistent Rules
- Testing connectivity between cluster hosts
Setup SSH passwordless login for Hadoop Cluster
- SSH
- id_rsa.pub
- authorized_keys
Install & Configure HTTP web server for Hadoop server
- FQDNs
- Testing HTTP server
- Setting up Virtual hosts
Configure repositories for Cloudera Distribution of Hadoop & Cloudera Manager
- Accessing the Cloudera archive
- createrepo, RPMS
- Repolist
Setup Hadoop server configuration
- Security-Enhanced Linux (SELinux)
- Network time protocol daemon
- Swappiness
- Linux firewall
- File handling limits
Installing Cloudera Hadoop
- Creating Yum Repository
- .repo files
- Host inspectors
- Block Size
- Mount directories for DataNodes, NameNode, Secondary NameNode
Exploring HDFS
- HDFS logs directories
- Observing charts
- Setup root user
- dfs.block.size, dfs.data.dir, and other properties
- property
- Fault tolerance
- Checksum
- Working with HDFS Block size
- HDFS replication
- Observing Namenode UI
- Block pool
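
As a rough illustration of the block size and replication topics above, the sketch below computes how many blocks a file occupies and how much raw storage its replicas consume. The 128 MiB block size and replication factor of 3 are assumed values for the block size and replication properties, and the function names are hypothetical, chosen only for this example.

```python
# Assumed cluster settings; real values come from hdfs-site.xml.
BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size in bytes (128 MiB)
REPLICATION = 3                 # assumed replication factor


def block_count(file_size, block_size=BLOCK_SIZE):
    """Number of blocks needed to hold file_size bytes (ceiling division)."""
    return (file_size + block_size - 1) // block_size


def raw_storage(file_size, replication=REPLICATION):
    """Bytes consumed across the cluster; the last block is not padded."""
    return file_size * replication


one_gib = 1024 ** 3
print(block_count(one_gib))   # blocks for a 1 GiB file
print(raw_storage(one_gib))   # raw bytes stored for that file
```

Note that a file slightly larger than one block still needs two blocks, but the second block only occupies as much disk space as the data it holds.
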
Working with Metadata
- FSImage
- Edit logs
- core-site.xml
- hdfs-site.xml
- Secondary Namenode
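
The relationship between the FSImage, the edit log, and the Secondary Namenode checkpoint can be pictured with a toy model: the FSImage is a snapshot of the namespace, the edit log is a list of changes made since that snapshot, and a checkpoint replays the changes onto the snapshot. The sketch below is only a conceptual illustration; the operation names and the dictionary layout are hypothetical and do not reflect the real on-disk formats.

```python
def apply_edits(fsimage, edits):
    """Return a new namespace snapshot with the logged operations applied."""
    namespace = dict(fsimage)  # copy so the previous image stays untouched
    for op, path, *args in edits:
        if op == "create":
            namespace[path] = args[0]            # args[0]: metadata for the path
        elif op == "delete":
            namespace.pop(path, None)
        elif op == "rename":
            namespace[args[0]] = namespace.pop(path)  # args[0]: new path
        else:
            raise ValueError(f"unknown operation: {op}")
    return namespace


# Hypothetical snapshot and edit log.
fsimage = {"/data/a.txt": {"replication": 3}}
edits = [
    ("create", "/data/b.txt", {"replication": 3}),
    ("rename", "/data/a.txt", "/data/c.txt"),
    ("delete", "/data/b.txt"),
]

checkpoint = apply_edits(fsimage, edits)
print(checkpoint)
```

After a checkpoint, the merged snapshot replaces the old image and the replayed edits are no longer needed, which keeps the edit log from growing without bound.
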
Hadoop 1.0-Namenode Recovery & Secondary Namenode
- dfsadmin
- Safe mode
- saveNamespace
- fsimage / edits log
- Namenode recovery process after a server crash
- Data node registrations
Setup Hadoop 2.0-Hadoop High Availability (HA) Name Nodes & Zookeeper
- Cloudera Manager
- Setup Zookeeper services using Cloudera Manager
- Setup Journal Nodes using Cloudera Manager
- Active & Standby Name Nodes
- Understanding Zookeeper & Journal Nodes
- High availability configurations (core-site.xml / hdfs-site.xml)
- Understanding CDH Certification
HDFS Commands
- File system commands (fs, jar, fsck)
- fs: appendToFile
- put, copyToLocal
- chown, chgrp
- df, du
- moveFromLocal, moveToLocal
- stat, etc.
Understanding Map Reduce 1.0 Flow
- Map Reduce 1.0 architecture
- Heartbeats
- Job Tracker
- Task Tracker
- Speculative Execution
Setup YARN-Map Reduce 2.0 and Resource Manager & Node Managers
- Add YARN service
- YARN Architecture
- Application Masters
- Resource Managers
- Containers
- Cluster Metrics
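
To make the idea of containers more concrete, the sketch below estimates an upper bound on how many equally sized containers fit on one Node Manager, given the memory and virtual cores the node offers to YARN. It is a simplification that assumes both memory and vcores are enforced; actual placement depends on the scheduler and its configuration. The function name and the sample numbers are hypothetical.

```python
def containers_per_node(node_memory_mb, container_memory_mb,
                        node_vcores, container_vcores=1):
    """Upper bound on same-sized containers that fit on a single node."""
    by_memory = node_memory_mb // container_memory_mb
    by_vcores = node_vcores // container_vcores
    # The scarcer resource limits how many containers can run.
    return min(by_memory, by_vcores)


# Hypothetical node: 8 GiB of memory and 4 vcores offered to YARN.
print(containers_per_node(8192, 1024, 4))
```
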