Prerequisites
To apply for the Hadoop Training in India, you should meet the following prerequisites:
- Knowledge of at least one programming language, such as Java, Python, or R, in order to work with big data analytics tools.
- Basic knowledge of databases and SQL, in order to retrieve and manipulate data.
- Familiarity with basic statistics (for example, regression and distributions) and with mathematical topics such as linear algebra and calculus.
Course Curriculum
Module 1: Introduction to Big Data and Hadoop
- 1.1 Introduction to Big Data and Hadoop
- 1.2 Introduction to Big Data
- 1.3 Big Data Analytics
- 1.4 What is Big Data
- 1.5 Four Vs of Big Data
- 1.6 Case Study: Royal Bank of Scotland
- 1.7 Challenges of Traditional System
- 1.8 Distributed Systems
- 1.9 Introduction to Hadoop
- 1.10 Components of Hadoop Ecosystem
- 1.11 Commercial Hadoop Distributions
Module 2: Hadoop Architecture, Distributed Storage (HDFS) and YARN
- 2.1 Introduction to Hadoop Architecture, Distributed Storage (HDFS), and YARN
- 2.2 What Is HDFS
- 2.3 Need for HDFS
- 2.4 Regular File System vs HDFS
- 2.5 Characteristics of HDFS
- 2.6 HDFS Architecture and Components
- 2.7 High Availability Cluster Implementations
- 2.8 HDFS Component File System Namespace
- 2.9 Data Block Split
- 2.10 Data Replication Topology
- 2.11 HDFS Command Line
- 2.12 YARN Introduction
- 2.13 YARN Use Case
- 2.14 YARN and Its Architecture
- 2.15 Resource Manager
- 2.16 How Resource Manager Operates
- 2.17 Application Master
- 2.18 How YARN Runs an Application
- 2.19 Tools for YARN Developers
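The block-splitting and replication topics above (2.9 and 2.10) can be previewed without a cluster. The sketch below is a plain-Python illustration of the idea only, not HDFS itself; the DataNode names and round-robin placement are invented for the example (real HDFS placement is rack-aware).

```python
# A framework-free sketch of how HDFS splits a file into fixed-size blocks
# and assigns each block to multiple DataNodes for replication.
# Node names and the round-robin placement are illustrative, not how a
# real NameNode chooses targets.

BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the common HDFS default
REPLICATION_FACTOR = 3          # the default HDFS replication factor

def split_into_blocks(file_size_bytes, block_size=BLOCK_SIZE):
    """Return the sizes of the blocks a file of the given size occupies."""
    blocks = []
    remaining = file_size_bytes
    while remaining > 0:
        blocks.append(min(block_size, remaining))
        remaining -= block_size
    return blocks

def place_replicas(num_blocks, datanodes, replication=REPLICATION_FACTOR):
    """Assign each block to `replication` distinct DataNodes, round-robin."""
    placement = {}
    for b in range(num_blocks):
        placement[b] = [datanodes[(b + r) % len(datanodes)]
                        for r in range(replication)]
    return placement

# A 300 MB file becomes two full 128 MB blocks plus one 44 MB remainder.
blocks = split_into_blocks(300 * 1024 * 1024)
print([b // (1024 * 1024) for b in blocks])  # [128, 128, 44]
```

The remainder block shows why HDFS suits large files: only the last block of a file is smaller than the block size.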
Module 3: Data Ingestion into Big Data Systems and ETL
- 3.1 Introduction to Data Ingestion into Big Data Systems and ETL
- 3.2 Overview of Data Ingestion
- 3.3 Apache Sqoop
- 3.4 Sqoop and Its Uses
- 3.5 Sqoop Processing
- 3.6 Sqoop Import Process
- 3.7 Sqoop Connectors
- 3.8 Apache Flume
- 3.9 Flume Model
- 3.10 Scalability in Flume
- 3.11 Components in Flume’s Architecture
- 3.12 Configuring Flume Components
- 3.13 Apache Kafka
- 3.14 Aggregating User Activity Using Kafka
- 3.15 Kafka Data Model
- 3.16 Partitions
- 3.17 Apache Kafka Architecture
- 3.18 Producer Side API Example
- 3.19 Consumer Side API
- 3.20 Consumer Side API Example
- 3.21 Kafka Connect
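Topics 3.15 and 3.16 cover how Kafka spreads a topic's records across partitions. A minimal sketch of the key-to-partition idea, in plain Python: real Kafka producers use murmur2 hashing, so `hashlib.md5` here is only an illustrative stand-in, and the partition count is made up.

```python
# A sketch of how a Kafka producer maps record keys to partitions.
# Records with the same key land in the same partition, which is what
# preserves per-key ordering in Kafka. md5 stands in for Kafka's murmur2.

import hashlib

NUM_PARTITIONS = 4  # illustrative topic layout

def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Deterministically map a record key to a partition number."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

events = ["user-1", "user-2", "user-1", "user-3"]
partitions = [partition_for(k) for k in events]
# Both "user-1" events map to the same partition.
```

Because partitioning is deterministic per key, consumers reading a partition see each key's events in the order they were produced.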
Module 4: Distributed Processing – MapReduce Framework and Pig
- 4.1 Introduction to Distributed Processing – MapReduce Framework and Pig
- 4.2 Distributed Processing in MapReduce
- 4.3 Word Count Example
- 4.4 Map Execution Phases
- 4.5 Map Execution in a Distributed Two-Node Environment
- 4.6 MapReduce Jobs
- 4.7 Hadoop MapReduce Job Work Interaction
- 4.8 Setting Up the Environment for MapReduce Development
- 4.9 Set of Classes
- 4.10 Creating a New Project
- 4.11 Advanced MapReduce
- 4.12 Data Types in Hadoop
- 4.13 OutputFormats in MapReduce
- 4.14 Using Distributed Cache
- 4.15 Joins in MapReduce
- 4.16 Replicated Join
- 4.17 Introduction to Pig
- 4.18 Components of Pig
- 4.19 Pig Data Model
- 4.20 Pig Interactive Modes
- 4.21 Pig Operations
- 4.22 Various Relations Performed by Developers
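The word-count example from topic 4.3 can be sketched in plain Python to make the map, shuffle, and reduce phases concrete without a Hadoop cluster; the input lines are invented for illustration.

```python
# Word count as map -> shuffle -> reduce, the classic MapReduce example,
# in plain Python (no Hadoop).

from collections import defaultdict

def map_phase(lines):
    """Mapper: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    """Shuffle: group values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reducer: sum the counts emitted for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}

lines = ["the quick brown fox", "the lazy dog", "The fox"]
counts = reduce_phase(shuffle_phase(map_phase(lines)))
print(counts["the"], counts["fox"])  # 3 2
```

In real Hadoop the mapper and reducer run as separate tasks on different nodes, and the shuffle moves data between them over the network; the logic, however, is exactly this.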
Module 5: Apache Hive
- 5.1 Introduction to Apache Hive
- 5.2 Hive SQL over Hadoop MapReduce
- 5.3 Hive Architecture
- 5.4 Interfaces to Run Hive Queries
- 5.5 Running Beeline from Command Line
- 5.6 Hive Metastore
- 5.7 Hive DDL and DML
- 5.8 Creating New Table
- 5.9 Data Types
- 5.10 Validation of Data
- 5.11 File Format Types
- 5.12 Data Serialization
- 5.13 Hive Table and Avro Schema
- 5.14 Hive Optimization: Partitioning, Bucketing, and Sampling
- 5.15 Non-Partitioned Table
- 5.16 Data Insertion
- 5.17 Dynamic Partitioning in Hive
- 5.18 Bucketing
- 5.19 What Do Buckets Do
- 5.20 Hive Analytics UDF and UDAF
- 5.21 Other Functions of Hive
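What buckets do (topics 5.18 and 5.19) comes down to one rule: a row goes to bucket `hash(column) mod num_buckets`. The sketch below illustrates that rule in plain Python; the column, values, and bucket count are invented, and Python's `hash` stands in for Hive's own hash function.

```python
# A sketch of Hive bucketing: rows are assigned to a fixed number of
# buckets by hashing the bucketing column, so joins and sampling can
# read a predictable subset of files.

NUM_BUCKETS = 4  # illustrative; set with CLUSTERED BY ... INTO n BUCKETS in Hive

def bucket_for(value, num_buckets=NUM_BUCKETS):
    """Hive assigns a row to hash(bucketing_column) mod num_buckets."""
    return hash(value) % num_buckets

rows = [("alice", 31), ("bob", 25), ("carol", 40), ("alice", 31)]
buckets = {}
for user_id, age in rows:
    buckets.setdefault(bucket_for(user_id), []).append((user_id, age))

# Identical keys always share a bucket -- the property that lets Hive
# perform bucketed map-side joins and table sampling.
```

Because equal keys always hash to the same bucket, two tables bucketed the same way on the join key can be joined bucket-by-bucket.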
Module 6: NoSQL Databases – HBase
- 6.1 Introduction to NoSQL Databases – HBase
- 6.2 NoSQL Introduction
- 6.3 HBase Overview
- 6.4 HBase Architecture
- 6.5 Data Model
- 6.6 Connecting to HBase
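The HBase data model from topic 6.5 is essentially a sorted, versioned map: row key, then column family, then column qualifier, then timestamped values. A toy in-memory sketch of that shape, with an invented table and column names (this is not the HBase client API):

```python
# A sketch of the HBase data model:
# row key -> column family -> column qualifier -> versions (timestamp, value).
# A default Get returns the newest version of a cell.

from collections import defaultdict

class SketchHBaseTable:
    def __init__(self):
        # row -> family -> qualifier -> list of (timestamp, value), newest first
        self.rows = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))

    def put(self, row, family, qualifier, value, timestamp):
        cell = self.rows[row][family][qualifier]
        cell.append((timestamp, value))
        cell.sort(reverse=True)  # keep the newest version first

    def get(self, row, family, qualifier):
        """Return the latest value for a cell, like a default HBase Get."""
        versions = self.rows[row][family][qualifier]
        return versions[0][1] if versions else None

table = SketchHBaseTable()
table.put("user#42", "info", "name", "Ada", timestamp=1)
table.put("user#42", "info", "name", "Ada L.", timestamp=2)
print(table.get("user#42", "info", "name"))  # Ada L.
```

Rows in real HBase are also kept sorted by row key across RegionServers, which is what makes range scans by key efficient.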
Module 7: Basics of Functional Programming and Scala
- 7.1 Introduction to the Basics of Functional Programming and Scala
- 7.2 Introduction to Scala
- 7.3 Functional Programming
- 7.4 Programming with Scala
- 7.5 Type Inference Classes Objects and Functions in Scala
- 7.6 Collections
- 7.7 Types of Collections
- 7.8 Scala REPL
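The course teaches these ideas in Scala, but the functional core of the module (higher-order functions over immutable collections) can be previewed in Python, which offers the same map/filter/fold trio that Scala collections expose as `.map`, `.filter`, and `.foldLeft`:

```python
# Higher-order functions over a collection: the functional style this
# module develops in Scala, shown here in Python for a quick preview.

from functools import reduce

numbers = [1, 2, 3, 4, 5]

squares = list(map(lambda n: n * n, numbers))        # Scala: numbers.map(n => n * n)
evens = list(filter(lambda n: n % 2 == 0, numbers))  # Scala: numbers.filter(_ % 2 == 0)
total = reduce(lambda acc, n: acc + n, numbers, 0)   # Scala: numbers.foldLeft(0)(_ + _)

print(squares)  # [1, 4, 9, 16, 25]
print(evens)    # [2, 4]
print(total)    # 15
```

The key habit in both languages is the same: describe *what* to compute by passing functions to collection operations, rather than mutating state in loops.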
Module 8: Apache Spark Next-Generation Big Data Framework
- 8.1 Introduction to Apache Spark Next-Generation Big Data Framework
- 8.2 History of Spark
- 8.3 Limitations of MapReduce in Hadoop
- 8.4 Introduction to Apache Spark
- 8.5 Components of Spark
- 8.6 Application of In-Memory Processing
- 8.7 Hadoop Ecosystem vs Spark
- 8.8 Advantages of Spark
- 8.9 Spark Architecture
- 8.10 Spark Cluster in Real World
Module 9: Spark Core Processing RDD
- 9.1 Processing RDD
- 9.2 Introduction to Spark RDD
- 9.3 RDD in Spark
- 9.4 Creating Spark RDD
- 9.5 Pair RDD
- 9.6 RDD Operations
- 9.7 Demo: Spark Transformation Detailed Exploration Using Scala Examples
- 9.8 Demo: Spark Action Detailed Exploration Using Scala
- 9.9 Caching and Persistence
- 9.10 Storage Levels
- 9.11 Lineage and DAG
- 9.12 Need for DAG
- 9.13 Debugging in Spark
- 9.14 Partitioning in Spark
- 9.15 Scheduling in Spark
- 9.16 Shuffling in Spark
- 9.17 Sort Shuffle
- 9.18 Aggregating Data with Pair RDD
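The defining RDD behaviour covered above (topics 9.2 to 9.6) is laziness: transformations like `map` and `filter` only record a lineage, and nothing runs until an action such as `collect` or `count`. The sketch below imitates that contract in plain Python; the class mirrors a few Spark method names but is not Spark itself.

```python
# A framework-free sketch of RDD laziness: transformations build a new
# RDD describing the computation; actions force the whole lineage to run.

class SketchRDD:
    def __init__(self, compute):
        self._compute = compute  # a no-arg function producing an iterator

    @staticmethod
    def parallelize(data):
        return SketchRDD(lambda: iter(list(data)))

    # --- transformations: return a new RDD, nothing executes yet ---
    def map(self, f):
        return SketchRDD(lambda: (f(x) for x in self._compute()))

    def filter(self, pred):
        return SketchRDD(lambda: (x for x in self._compute() if pred(x)))

    # --- actions: trigger evaluation of the lineage ---
    def collect(self):
        return list(self._compute())

    def count(self):
        return sum(1 for _ in self._compute())

rdd = SketchRDD.parallelize(range(10))
result = rdd.map(lambda x: x * 2).filter(lambda x: x > 10).collect()
print(result)  # [12, 14, 16, 18]
```

Note that each action replays the chain from the source, which is exactly why real Spark offers `cache()`/`persist()` (topics 9.9 and 9.10) to keep an intermediate result in memory.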
Module 10: Spark SQL – Processing DataFrames
- 10.1 Introduction to Spark SQL – Processing DataFrames
- 10.2 Spark SQL Introduction
- 10.3 Spark SQL Architecture
- 10.4 DataFrames
- 10.5 Demo: Handling Various Data Formats
- 10.6 Demo: Implement Various DataFrame Operations
- 10.7 Demo: UDF and UDAF
- 10.8 Interoperating with RDDs
- 10.9 Demo: Process DataFrame Using SQL Query
- 10.10 RDD vs DataFrame vs Dataset
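The DataFrame operations demoed in this module (select, filter, grouped aggregation) can be illustrated over rows modelled as dictionaries, without Spark; the column names and data below are invented for the example.

```python
# A sketch of DataFrame-style operations -- select, where, groupBy + avg --
# over plain Python rows. Spark runs the same logical plan distributed
# and optimized by Catalyst; the semantics are what this shows.

from collections import defaultdict

rows = [
    {"dept": "eng", "name": "Ada",   "salary": 120},
    {"dept": "eng", "name": "Alan",  "salary": 110},
    {"dept": "ops", "name": "Grace", "salary": 105},
]

def select(data, *cols):
    """Keep only the named columns of each row."""
    return [{c: r[c] for c in cols} for r in data]

def where(data, pred):
    """Keep only the rows matching the predicate."""
    return [r for r in data if pred(r)]

def group_avg(data, key, value):
    """Group rows by `key` and average `value` within each group."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in data:
        sums[r[key]] += r[value]
        counts[r[key]] += 1
    return {k: sums[k] / counts[k] for k in sums}

high_earners = where(rows, lambda r: r["salary"] > 100)
print(group_avg(high_earners, "dept", "salary"))  # {'eng': 115.0, 'ops': 105.0}
```

The named-column structure is the point of topic 10.10: unlike a raw RDD of objects, a DataFrame's schema lets the engine optimize queries before running them.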
Module 11: Stream Processing Frameworks and Spark Streaming
- 11.1 Introduction to Stream Processing Frameworks and Spark Streaming
- 11.2 Overview of Streaming
- 11.3 Real-Time Processing of Big Data
- 11.4 Data Processing Architectures
- 11.5 Spark Streaming
- 11.6 Introduction to DStreams
- 11.7 Transformations on DStreams
- 11.8 Design Patterns for Using ForeachRDD
- 11.9 State Operations
- 11.10 Windowing Operations
- 11.11 Join Operations – Stream-Dataset Join
- 11.12 Streaming Sources
- 11.13 Structured Spark Streaming
- 11.14 Use Case Banking Transactions
- 11.15 Structured Streaming Architecture Model and Its Components
- 11.16 Output Sinks
- 11.17 Structured Streaming APIs
- 11.18 Constructing Columns in Structured Streaming
- 11.19 Windowed Operations on Event-Time
- 11.20 Use Cases
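The windowing idea behind topics 11.10 and 11.19 reduces to aligning each event's timestamp to the start of its window and aggregating per window. A plain-Python sketch of tumbling-window counting, with invented timestamps and window size (real Spark Streaming additionally handles batching, state, and late data):

```python
# A sketch of tumbling-window counting: each event carries a timestamp,
# and we count events per fixed-size window of event time.

from collections import Counter

WINDOW_SECONDS = 10  # illustrative window size

def window_start(timestamp, window=WINDOW_SECONDS):
    """Align a timestamp to the start of its window."""
    return (timestamp // window) * window

def count_per_window(events):
    """events: iterable of (timestamp_seconds, payload) -> {window_start: count}."""
    counts = Counter()
    for ts, _payload in events:
        counts[window_start(ts)] += 1
    return dict(counts)

events = [(1, "a"), (4, "b"), (9, "c"), (12, "d"), (25, "e")]
print(count_per_window(events))  # {0: 3, 10: 1, 20: 1}
```

A sliding window is the same computation with each event contributing to every window that overlaps its timestamp, rather than exactly one.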
Module 12: Spark MLlib – Modeling Big Data with Spark
- 12.1 Introduction to Spark MLlib Modeling Big Data with Spark
- 12.2 Role of Data Scientist and Data Analyst in Big Data
- 12.3 Analytics in Spark
- 12.4 Machine Learning
- 12.5 Supervised Learning
- 12.6 Demo: Classification of Linear SVM
- 12.7 Demo: Linear Regression with Real-World Case Studies
- 12.8 Unsupervised Learning
- 12.9 Demo: Unsupervised Clustering K-Means
- 12.10 Reinforcement Learning
- 12.11 Semi-Supervised Learning
- 12.12 Overview of MLlib
- 12.13 MLlib Pipelines
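The k-means clustering demoed in topic 12.9 is short enough to sketch in plain Python on one-dimensional points; the data, starting centroids, and iteration count below are invented for illustration, and MLlib's version runs the same iteration distributed over an RDD or DataFrame.

```python
# Lloyd's algorithm for k-means on 1-D points: repeatedly assign each
# point to its nearest centroid, then move each centroid to the mean
# of its assigned points.

def kmeans_1d(points, centroids, iterations=10):
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: nearest centroid for each point.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid becomes the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

points = [1.0, 1.5, 2.0, 10.0, 10.5, 11.0]
print(kmeans_1d(points, centroids=[0.0, 5.0]))  # [1.5, 10.5]
```

This is unsupervised learning in miniature: no labels are given, and the two centroids settle at the centers of the two obvious groups in the data.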
Module 13: Spark GraphX
- 13.1 Introduction to Spark GraphX
- 13.2 Introduction to Graph
- 13.3 GraphX in Spark
- 13.4 Graph Operators
- 13.5 Join Operators
- 13.6 Graph Parallel System
- 13.7 Algorithms in Spark
- 13.8 Pregel API
- 13.9 Use Case of GraphX
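The Pregel model from topic 13.8 is "think like a vertex": in each superstep every vertex exchanges messages with its neighbours and updates its own state, and the computation stops when nothing changes. A plain-Python sketch, with an invented graph, where each vertex learns the maximum value in its connected component:

```python
# A sketch of Pregel-style vertex-centric computation: supersteps of
# message passing until no vertex state changes. GraphX's Pregel API
# runs the same pattern distributed over a partitioned graph.

def pregel_max(vertices, edges):
    """vertices: {id: value}; edges: (src, dst) pairs, treated as undirected.
    Returns each vertex's final value: the max in its connected component."""
    values = dict(vertices)
    neighbours = {v: set() for v in vertices}
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    changed = True
    while changed:  # one superstep per iteration
        changed = False
        # Each vertex sends its current value to its neighbours...
        inbox = {v: [values[n] for n in neighbours[v]] for v in values}
        # ...and keeps the largest value it has seen so far.
        for v, messages in inbox.items():
            best = max([values[v]] + messages)
            if best != values[v]:
                values[v] = best
                changed = True
    return values

vertices = {1: 3, 2: 6, 3: 2, 4: 1}
edges = [(1, 2), (2, 3), (3, 4)]
print(pregel_max(vertices, edges))  # {1: 6, 2: 6, 3: 6, 4: 6}
```

Algorithms like PageRank and connected components (topic 13.7) follow this same superstep pattern, differing only in the message contents and the vertex update rule.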