Hadoop Developer

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to draw insights from structured and unstructured data. In our Data Science program, expert faculty train students to extract data points and share insights from such data across a broad range of application domains.

Hadoop Developer Course content

Module Topic Details Status
1    Setting up VM and Hadoop
    Set up Cloudera VM
    Install JDK
    Installation steps for a Hadoop single-node cluster
    Install VMware / VirtualBox
    Status: Completed

2     Introduction to Big Data
    Big Data landscape
    Course content
    Session details and feedback process
    Status: Completed

3    Hadoop Architecture, Networking and Cluster (HDFS & MapReduce)
    NameNode
    DataNode
    Secondary NameNode
    Rack Awareness
    Replication & Re-replication
    HDFS Read & Write
    Status: Completed

4    Linux & HDFS Commands
    Basic Linux Commands
    HDFS Commands
    Status: Completed

5     Working Session: Local File System & HDFS Commands

6    MapReduce 1 (MR v1)
    Understanding MapReduce
    JobTracker and TaskTracker
    Architecture of MapReduce
    Data flow of MapReduce
    Hadoop Writable types and Java data types
    Map function & Reduce function
    How MapReduce works
    Anatomy of a MapReduce job
    Submission & initialization of a MapReduce job
    Monitoring & progress of a MapReduce job
    Difference between a block and an input split
    Role of the RecordReader, Shuffler, and Sorter
    File input formats
    How to check the logs of all the nodes (NN, DN, TT, JT, SNN)
    Setting up the Eclipse development environment
    Creating MapReduce projects
    Configuring the Hadoop API in the Eclipse IDE
    Life cycle of a job
    Identity Reducer
    Status: Completed

7    Working Session: Program
    MapReduce program flow with word count (Status: Completed)
    Cricket match average score program (Status: Completed)
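
    The program flow practiced in this session can be previewed with a small sketch. It is a plain Python stand-in for the map, shuffle/sort, and reduce steps, not the Hadoop Java API used in class. The function names and the sample records (including the "player, runs" layout of the cricket data) are assumptions made only for illustration.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Tiny in-memory imitation of the MapReduce flow (illustration only)."""
    groups = defaultdict(list)
    # Map phase: each input record yields zero or more (key, value) pairs.
    for record in records:
        for key, value in mapper(record):
            # Shuffle: collect every value that shares a key.
            groups[key].append(value)
    # Sort keys, then run the reducer once per key (reduce phase).
    return {key: reducer(key, groups[key]) for key in sorted(groups)}


def word_count_mapper(line):
    # Emit (word, 1) for every word in the line.
    for word in line.lower().split():
        yield word, 1


def sum_reducer(key, values):
    # Add up the 1s emitted for this word.
    return sum(values)


def score_mapper(record):
    # Hypothetical record layout: "player, runs".
    player, runs = record.split(",")
    yield player.strip(), int(runs)


def average_reducer(key, values):
    # Average of all scores seen for this player.
    return sum(values) / len(values)


# Hypothetical sample inputs.
lines = ["Hadoop stores data in HDFS", "MapReduce processes data in Hadoop"]
word_counts = map_reduce(lines, word_count_mapper, sum_reducer)

scores = ["player_a, 40", "player_b, 10", "player_a, 60"]
averages = map_reduce(scores, score_mapper, average_reducer)

print(word_counts)
print(averages)
```

    Only the mapper and reducer differ between the two programs. The grouping step in the middle stays the same, which is the part the Hadoop framework handles in a real job.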

8    Assessment
    Hadoop MCQ

9    Apache Sqoop
    Installation of Sqoop
    Introduction to Sqoop & architecture
    Import data from RDBMS to HDFS
    Handling incremental loads using Sqoop
    Hands-on exercise

10     Working Session: Sqoop Commands

11    Sqoop Assignment

12    Apache Hive
    Apache Hive Introduction & History
    End-to-end workflow (Hive architecture)
    Data types in Hive
    Apache Hive table
    Types of tables in Hive (External & Internal)
    Partitions (Static & Dynamic)
    Types of insertion (Single & Multi Table)
    CTAS & CVAS concept
    Bucketing
    File formats (RCFILE, TEXTFILE, ORCFILE, SEQUENCEFILE)

13    Working Session: Hive Practice

14    Hive Assignment

15    Apache Pig
    Pig introduction
    Architecture
    Commands

16    Working Session: Apache Pig Practice

17    YARN (MapReduce v2) - Hadoop 2.x
    Introduction to YARN
    Architecture of YARN

18    ZooKeeper
    Role of ZooKeeper
    JournalNode
    Use of ZooKeeper

19    Apache HBase
    HBase introduction
    HBase commands
    How to view table data
    How to insert, update, and delete data

20    Apache Oozie
    Oozie introduction
    Components
    How to schedule a job
    What is a workflow
    What is a coordinator
    What is a bundle

21    Hue
    Introduction to Hue
    How to run an ETL process in Hue (Sqoop, Hive, Pig, Oozie)

22    Course Closure: Mock Interview
    Ending the course
    Mock interview on the concepts covered