Introduction to Hadoop and Radoop
- 20 – 21 July 2020 (Bangkok)
- 2 – 3 November 2020 (KL)

- Understand and explore Hadoop infrastructure and ecosystem
- Explore the Hadoop core components HDFS and YARN
- Use relational data stores with Hadoop
- Understand large scale data processing frameworks
- Connect to a Hadoop cluster using RapidMiner Radoop
- Integrate tasks and analyses into RapidMiner processes
Part A: Introduction to Hadoop
- Introduction to Big Data Hadoop
- Hadoop Overview and History
- Exploring Hadoop Ecosystem
- Introduction to Cloudera Manager and CDH
- Introduction to Hadoop Core Components: HDFS and YARN
- HDFS Distributed File System
- Understanding HDFS Architecture and Components
- Hands-on: basic HDFS shell commands
- YARN Resource Management
- Understanding YARN Architecture and Components
- Exploring Hadoop file formats (which file format is better?)
- Using relational data stores with Hadoop
- Introduction to Hive (Hive Architecture and how it works)
- Overview of Hive supported file formats and Hive partition
- Hands-on: HiveQL through the Hue interface and the Hive client
- Integrating MySQL with Hive using Sqoop
- Use Sqoop to import data from MySQL to HDFS/Hive
- Use Sqoop to export data from Hadoop to MySQL
- Large Scale Data Processing Frameworks
- Introduction to Hadoop MapReduce
- What it is and how it works
- Practical MapReduce example
- Introduction to Apache Spark
- Overview of Spark Core Concepts and Architecture
- Spark Clusters and the Resource Management System
- Spark Application
- Spark Driver and Executor
- Introduction to Spark data structures: the Resilient Distributed Dataset (RDD) and RDD operations
- Hands-on: simple Spark jobs
Part B: Introduction to Radoop
- Introduction to Radoop
- Hadoop Integration with RapidMiner: Radoop
- Introduction to the Radoop GUI
- Connecting to a Hadoop Cluster
- Data Exploration
- Browsing Tables
- Viewing Statistics and High-Level Information
- Data Extraction and Loading
- Formulation of Queries
- Pushing Data into Hadoop
- Clustering
- Integration of In-cluster Analyses into RapidMiner Processes
- Modeling Algorithms
- Natural Aggregation
- In-memory Training, In-Hadoop Scoring
- Beyond Natural Aggregation
- Chunking
- Voting
- In-Hadoop Modeling