Text, Web and Social Media Mining

Who Should Attend?

Data scientists involved in text and web mining

Duration: 3 Days
Date: 30 September – 2 October 2020

This course is an introduction to knowledge discovery from unstructured data such as text documents, web content, and social media content. It focuses on the necessary preprocessing steps and on the most successful methods for automatic text classification and clustering, including Naïve Bayes and Support Vector Machines (SVM). Upon completion of this course, participants will have a solid understanding of typical text mining workflows and be able to identify techniques for processing unstructured data, apply different statistical text-processing methods, and perform content classification and clustering.

  • Identify techniques for processing unstructured data
  • Transform textual data into a structured format
  • Apply different statistical text-processing methods
  • Perform text classification and text clustering
  • Work on popular tasks like sentiment analysis or opinion mining
  1. Overview
    • Business Scenario
    • Analytics Taxonomy & Hierarchy
    • CRISP-DM & Data Mining in the Enterprise
  2. EDA: Exploratory Data Analysis
    • Loading Data
    • Quick Summary Statistics
  3. Data Preparation
    • Basic Data ETL (Extract, Transform & Load)
  4. Predictive Modeling Algorithms
    • K-Nearest Neighbour
  5. Model Construction and Evaluation
    • Machine Learning Theory: Bias, Variance, Overfitting & Underfitting
    • Split and Cross Validation
    • Applying Models 
    • Evaluation Methods & Performance Criteria
  6. Loading of Texts
    • Loading from Flat Files
    • Loading from Data Sets
    • Loading from Web Sources (e.g. URL Crawling, Twitter)
  7. Concepts
    • Text Processing
    • Documents
    • Tokens
  8. Visualization
    • Visualizing Documents and Tokens
    • Multi-Dimensional Visualizations
  9. Handling Unstructured Data
    • Preprocessing of Textual Data
    • Tokenizing
    • Stemming
    • Filtering of Tokens
    • Term Frequencies
    • Document Frequencies
    • TF-IDF
  10. Advanced Modeling
    • Support Vector Machines
    • Naïve Bayes
    • Text Clustering
  11. Web Mining
    • Crawling the Web
    • Extracting Information from Web Sites
    • Transforming Web Sites to Documents
    • Retrieving Structured Web Data
    • Data ETL and Pre-processing for Web Sourced Data
    • Enriching Data via Web Services
    • Using Third Party Web Mining Extensions
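
As an illustration of the preprocessing steps listed under "Handling Unstructured Data" (tokenizing, filtering of tokens, term frequencies, document frequencies, TF-IDF), here is a minimal sketch in plain Python. The toy corpus, the stop-word list, and the function names are assumptions made for this example and are not taken from the course material. Stemming is left out, and the weighting shown (relative term frequency times log(N / df)) is only one of several common TF-IDF variants.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; not taken from the course material.
DOCS = [
    "The service was great and the staff were friendly",
    "The delivery was late and the service was poor",
    "Great product, friendly staff",
]

# Small illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "was", "were", "and"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def filter_tokens(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf(docs):
    """Return one {token: weight} dict per document.

    tf  = token count / number of tokens in the document
    idf = log(number of documents / document frequency of the token)
    """
    tokenized = [filter_tokens(tokenize(d)) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token occurs.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    weights = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks)
        weights.append(
            {tok: (c / total) * math.log(n_docs / df[tok]) for tok, c in counts.items()}
        )
    return weights


if __name__ == "__main__":
    for doc, w in zip(DOCS, tf_idf(DOCS)):
        top = sorted(w.items(), key=lambda kv: kv[1], reverse=True)[:3]
        print(doc, "->", top)
```

Tokens that occur in only one document (such as "delivery") receive a higher weight than tokens shared across documents (such as "service"), which is the property that makes TF-IDF vectors useful as input for classifiers such as Naïve Bayes or SVM.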
