Talend Big Data Advanced – Machine Learning

Who Should Attend?


Duration: 1 Day
Training Date
  • 17 June 2020 (KL)
  • 12 October 2020 (KL)

This course builds on Talend Big Data Basics and uses Talend Studio to industrialize machine learning. Participants learn to:

  • Connect to a Hadoop cluster from a Talend Job
  • Use context variables and metadata
  • Read and write files in HDFS in a Big Data Batch Job
  • Configure a Big Data Batch Job to use the Spark framework
  • Create and test recommendation models
  • Create and test classification models
  • Use a machine learning algorithm to deduplicate data

1. SMS Classification Use Case

  • Monitoring the Hadoop cluster
  • Exploring an SMS classification use case: decision trees
  • Connecting to your cluster
  • Creating an SMS classification model
  • Testing the SMS classification model

2. Movie Recommendation Use Case

  • Exploring a movie recommendation use case:
    alternating least squares
  • Building a movie recommendation model
  • Testing the movie recommendation model

3. Irises Classification Use Case

  • Exploring an Iris flower classification
    use case: Naïve Bayes classifier
  • Building an iris classification model
  • Testing the iris classification model

4. Child-Care Deduplication Use Case

  • Exploring a child care use case and
    dataset: matching
  • Setting up the environment
  • Pairing data
  • Building a matching model
  • Using the matching model
  • Merging groups of duplicates

Register Now

Contact us if you are interested in joining this course.

