Introduction to Hadoop Programming

Course:  HADOOPIP
Duration:  5 Days
Level:  I
Course Summary

This seminar uses lectures, hands-on workshops and case studies to illustrate the use of the Hadoop framework in a Big Data analytics environment. Topics include the role of Big Data; an overview of Hadoop and a comparison to other systems; the role of MapReduce and data analysis with it, including the Map and Reduce stages; the payload, the Mapper and the various nodes (Master, Data and Slave); the design and concepts of the Hadoop Distributed File System (HDFS); the HDFS command line and Java interface; reading and writing HDFS data, data flows and parallel copies; YARN application flows and scheduling; Hadoop I/O with data integrity, compression and serialization; MapReduce execution flow and unit testing; and Hadoop cluster setup, configuration and administration.

Additional topics that may be added include Flume, Pig, Sqoop, Hive and Parquet.

Topics Covered In This Course
  • Understand the role of Big Data and use of Hadoop
  • Illustrate history of Hadoop and comparison to other systems
  • Define use of MapReduce and analysis capacities
  • Demonstrate design of HDFS and major concepts (Blocks, Caching, Federation, etc.)
  • Illustrate command line and use of basic commands
  • Define the Java interface for reading, writing, deleting and querying data
  • Demonstrate the role of YARN in splitting resource management and job scheduling/monitoring into separate daemons
  • Depict role of ResourceManager and the NodeManager to form the data-computation framework
  • Illustrate Hadoop I/O with data integrity, use of compression tools and data serialization
  • Demonstrate the MapReduce configuration API and define development environment
  • Create unit tests for MapReduce using MRUnit
  • Illustrate MapReduce execution locally and cluster-based
  • Depict MapReduce workflows and decomposition into MapReduce jobs
  • Illustrate MapReduce flow through submission, assignment and execution, and demonstrate fallback when failures occur within the environment
  • Demonstrate MapReduce data types for input and output types
  • Understand features of MapReduce for counters, joins, sorting and distribution
  • Depict process for setting up a Hadoop Cluster via sizing and topology
  • Demonstrate Hadoop Cluster installation, formatting and daemon startup
  • Illustrate use of Kerberos for Hadoop security and use of delegation tokens
  • Understand Hadoop administration techniques for HDFS (persistence, logging, etc.), monitoring (log files and metrics) and basic maintenance
  • Optional additions: Flume, Pig, Sqoop, Hive and Parquet

What You Can Expect

This course is designed to provide the skills required to understand data analytics and use the Hadoop Framework.

Who Should Take This Course

This course is designed to provide the skills required to understand data analytics and use the Hadoop Framework.

Recommended Prerequisites

Prior exposure to Java development and database concepts is strongly suggested.

Training Style

60% Hands-on/40% Lecture

Related Courses
Code     Course Title                       Duration  Level
JAVAF    Fundamentals of Java Development   5 Days    I
DWHFUN   Data Warehousing Fundamentals      3 Days    I

Every student attending a Verhoef Training class will receive a certificate good for $100 toward their next public class taken within a year.

You can also buy "Verhoef Vouchers" to get a discounted rate for a single student in any of our public or web-based classes. Contact your account manager or our sales office for details.

Schedule For This Course
There are currently no public sessions scheduled for this course. We can schedule a private class for your organization just a couple of weeks from now. Or we can let you know the next time we do schedule a public session.
Can't find the course you want?
Call us at 800.533.3893, or
email us at [email protected]