Talend Big Data Advanced – Spark Batch

Who Should Attend?

Anyone who has completed Talend Big Data Basics and wants to use Talend Studio to interact with Big Data systems.

Duration: 1 Day
Training Date
  • 17 August 2020 (KL)

Prerequisite: completion of Talend Big Data Basics, using Talend Studio to interact with Big Data systems.

  • Create a Big Data batch Job using the Spark framework
  • Copy data from a local file to HDFS
  • Copy data from MySQL to HDFS
  • Create a Hive table and copy data from HDFS to it
  • Import tweets to HDFS
  • Join, sort, and aggregate data
  • Use caches for faster processing
  • Query data from a Hive table using Hive QL
  • Query data from Spark datasets using Spark SQL
  1. Introduction to Spark
    • Monitoring the Hadoop cluster
    • Setting up the development environment
    • Understanding the basics of Spark
    • Analyzing customer data
  2. Sentiment Analysis Use Case
    • Monitoring the Hadoop cluster
    • Setting up the development environment
    • Loading tweets into HDFS
    • Processing tweets with Spark
    • Scheduling job execution
  3. Download Analysis Use Case
    • Setting up the development environment
    • Loading customers to Hive
    • Download analysis
    • Using Spark SQL to query data
