Developer Training for Hadoop and Spark

Who Should Attend?

DEVELOPERS and ENGINEERS with programming experience

Duration: 4 Days

Participants who complete this course will be able to:
  • Distribute, store, and process data in a Hadoop cluster
  • Write, configure, and deploy Spark applications on a cluster
  • Use the Spark shell for interactive data analysis
  • Process and query structured data using Spark SQL
  • Use Spark Streaming to process a live data stream

Module 1

  • Intro to Apache Hadoop & the Hadoop Ecosystem
  • Apache Hadoop File Storage
  • Data Processing on an Apache Hadoop Cluster
  • Importing Relational Data with Apache Sqoop
  • Apache Spark Basics
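A taste of what Module 1 covers: HDFS stores each large file as a sequence of fixed-size blocks distributed across the cluster. The sketch below illustrates that block model in plain Python (no cluster required); the function name is illustrative, not Hadoop's API.

```python
def split_into_blocks(data: bytes, block_size: int) -> list:
    # HDFS stores a large file as a sequence of fixed-size blocks
    # (128 MB by default); each block is replicated across DataNodes.
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

blocks = split_into_blocks(b"abcdefghij", 4)
# blocks == [b"abcd", b"efgh", b"ij"]
```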

Module 2

  • Working with RDDs
  • Aggregating Data with Pair RDDs
  • Writing and Running Apache Spark Applications
  • Configuring Apache Spark Applications
  • Parallel Processing in Apache Spark
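A taste of what Module 2 covers: the classic word-count job aggregates data with pair RDDs. The sketch below mimics the flatMap / reduceByKey flow in plain Python (no cluster required); the helper names echo Spark's operations but are illustrative, not Spark's API.

```python
from collections import defaultdict
from functools import reduce

def flat_map(func, records):
    # Mimics RDD.flatMap: apply func to each record and flatten the results.
    return [item for record in records for item in func(record)]

def reduce_by_key(func, pairs):
    # Mimics pair-RDD reduceByKey: combine all values that share a key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: reduce(func, values) for key, values in grouped.items()}

lines = ["to be or", "not to be"]
words = flat_map(str.split, lines)
counts = reduce_by_key(lambda a, b: a + b, [(w, 1) for w in words])
# counts == {"to": 2, "be": 2, "or": 1, "not": 1}
```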

Module 3

  • RDD Persistence
  • Common Patterns in Apache Spark Data Processing
  • Data Frames and Apache Spark SQL
  • Message Processing with Apache Kafka
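A taste of what Module 3 covers: Spark SQL lets you query structured data declaratively. The plain-Python sketch below shows the kind of row filtering and column selection a DataFrame query expresses; the data and query are illustrative examples, not from the course material.

```python
rows = [
    {"name": "ana", "age": 34},
    {"name": "bo", "age": 19},
    {"name": "cy", "age": 27},
]

# Roughly what Spark SQL would compute for:
#   SELECT name FROM people WHERE age >= 21
adults = [row["name"] for row in rows if row["age"] >= 21]
# adults == ["ana", "cy"]
```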

Module 4

  • Capturing Data with Apache Flume
  • Integrating Apache Flume and Apache Kafka
  • Apache Spark Streaming: Introduction to DStreams
  • Apache Spark Streaming: Processing Multiple Batches
  • Apache Spark Streaming: Data Sources
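A taste of what Module 4 covers: a DStream processes a live stream as a series of small batches. The sketch below illustrates that micro-batch model in plain Python, using a fixed record count in place of Spark's wall-clock batch interval; it is an illustration of the idea, not Spark's API.

```python
def micro_batches(stream, batch_size):
    # A DStream chops a live stream into small batches; here the batch
    # boundary is a fixed record count rather than a time interval.
    batch = []
    for event in stream:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the final partial batch

batches = list(micro_batches(range(5), 2))
# batches == [[0, 1], [2, 3], [4]]
```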

Register Now

Submit your registration details if you are interested in joining this course.