Comprehensive Machine Learning with Python

Recorded content: 10 hours in total
Duration: 3 months (50 hours)
LIVE sessions: 4 workshops
Hands-on learning: with practice modules
Certificate: with license

Overview

Skillsmetrix's Comprehensive Machine Learning (ML) with Python training course, offered as private, onsite, or online training, builds on our Comprehensive Data Science with Python class and teaches attendees how to write machine learning applications in Python.

Objective

  • Understand machine learning as a useful tool for predictive models
  • Implement data preprocessing for an ML workflow
  • Know when to reach for machine learning as a tool
  • Understand the difference between supervised and unsupervised tasks
  • Implement several classification algorithms
  • Evaluate model performance using a variety of metrics
  • Compare models across a workflow
  • Implement regression algorithm variations
  • Understand clustering approaches to data
  • Interpret labels generated from clustering
  • Transform unstructured text data into structured data
  • Understand text-specific data preparation
  • Visualize frequency data from text sources
  • Perform topic modeling on a collection of documents
  • Use labeled text to perform document classification

Outline

  • Anaconda Computing Environment
  • Importing and Manipulating Data with Pandas
  • Exploratory Data Analysis with Pandas and Seaborn
  • NumPy ndarrays versus Pandas DataFrames
  • Machine Learning Theory
  • Data Pre-processing
  • Missing Data
  • Dummy Coding
  • Standardization
  • Data Validation Strategies
  • Supervised Versus Unsupervised Learning
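
To give a flavor of the pre-processing topics listed above (missing data, dummy coding, standardization), here is a minimal sketch using pandas and scikit-learn. The toy data and column names are hypothetical and are not taken from the course labs; mean imputation is just one of several possible strategies.

```python
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: one numeric column with a gap, one categorical column.
df = pd.DataFrame({
    "age": [25.0, None, 35.0, 45.0],
    "city": ["Pune", "Mumbai", "Pune", "Delhi"],
})

# Missing data: fill the gap with the column mean.
imputer = SimpleImputer(strategy="mean")
df["age"] = imputer.fit_transform(df[["age"]]).ravel()

# Dummy coding: one indicator column per category,
# dropping the first category so it acts as the baseline.
encoded = pd.get_dummies(df, columns=["city"], drop_first=True)

# Standardization: rescale age to zero mean and unit variance.
scaler = StandardScaler()
encoded["age"] = scaler.fit_transform(encoded[["age"]]).ravel()

print(encoded)
```

In a real workflow the imputer and scaler would typically be fitted on training data only and then applied to validation data, which is where the data validation strategies in the list come in.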

  • Understanding the linear model
  • Describing model fit
  • Adding complexity to the model
  • Explaining the relationship between model inputs and the outcome
  • Making predictions from the model

  • Linear Regression
  • Penalized Linear Regression
  • Stochastic Gradient Descent
  • Decision Tree Regressor
  • Random Forest Regression
  • Gradient Boosting Regressor
  • Scoring New Data Sets
  • Cross Validation
  • Variance-Bias Tradeoff
  • Feature Importance
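
As an illustration of how the regression and cross-validation topics above fit together, the following sketch compares an ordinary linear regression with a penalized (ridge) variant using 5-fold cross validation in scikit-learn. The synthetic data set, the choice of ridge as the penalty, and the `alpha` value are assumptions made for the example, not course material.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

# Synthetic regression problem (hypothetical stand-in for a real data set).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),  # alpha chosen arbitrarily for illustration
}

# Score each model with 5-fold cross validation, using R^2 as the metric.
mean_r2 = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    mean_r2[name] = scores.mean()

for name, value in mean_r2.items():
    print(f"{name}: mean R^2 = {value:.3f}")
```

The same loop could be extended with tree-based regressors such as `RandomForestRegressor` to compare models across a workflow.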

  • Logistic Regression
  • LASSO
  • Support Vector Machine
  • Random Forest
  • Ensemble Methods
  • Feature Importance
  • Scoring New Data Sets
  • Cross Validation

  • Preparing Data for Ingestion
  • K-Means Clustering
  • Visualizing Clusters
  • Comparison of Clustering Methods
  • Agglomerative Clustering and DBSCAN
  • Evaluating Cluster Performance with Silhouette Scores
  • Scaling
  • Mean Shift, Affinity Propagation and Birch
  • Scaling Clustering with Mini-Batch Approaches
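
The clustering topics above can be sketched in a few lines: scale the features, fit K-Means, and evaluate the result with a silhouette score. The two synthetic blobs below are a hypothetical example, and the number of clusters is assumed to be known in advance, which is rarely the case with real data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Two well-separated synthetic blobs (hypothetical data).
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=5.0, scale=0.5, size=(50, 2))

# Scaling: put every feature on a comparable scale before clustering.
X = StandardScaler().fit_transform(np.vstack([blob_a, blob_b]))

# K-Means with two clusters; n_init restarts guard against poor initializations.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(X)

# Silhouette score ranges from -1 to 1; higher means tighter, better-separated clusters.
score = silhouette_score(X, labels)
print(f"silhouette score: {score:.2f}")
```

For larger data sets, scikit-learn's `MiniBatchKMeans` offers the same interface with a mini-batch fitting strategy.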

  • Understanding average versus conditional treatment effects
  • Estimating conditional average treatment effects for a sample
  • Summarizing and Interpreting

  • Generative AI fundamentals
    • Intro to H2O
    • Launching the cluster, checking status
    • Data import and manipulation in H2O
    • Fitting models in H2O
    • Generalized Linear Models
    • Naïve Bayes
    • Random Forest
    • Gradient Boosting Machine (GBM)
    • Ensemble model building
    • AutoML
    • Data preparation
    • Leaderboards
    • Methods for explaining modeling output

  • Transforming Raw Text Data into a Corpus of Documents
  • Identifying Methods for Representing Text Data
  • Transformations of Text Data
  • Summarizing a Corpus into a TF-IDF Matrix
  • Visualizing Word Frequencies

  • Installing and Accessing Sample Text Corpora
  • Tokenizing Text
  • Cleaning/Processing Tokens
  • Segmentation
  • Tagging and Categorizing Tokens
  • Stopwords
  • Vectorization Schemes for Representing Text
  • Parts-of-Speech (POS) Tagging
  • Sentiment Analysis
  • Topic Modeling with Latent Semantic Analysis

  • Unsupervised Machine Learning and Text Data
  • Topic Modeling via Clustering
  • Supervised Machine Learning Applications in NLP

Training Materials

All Machine Learning with Python students receive courseware covering the topics in the class.

Software Requirements

• Windows, Mac, or Linux with at least 8 GB RAM

• A current version of Anaconda for Python 3.x

• Related lab files that Skillsmetrix will provide

Why Online Bootcamps

Develop skills for real career growth

Cutting-edge curriculum designed with guidance from industry and academia to develop job-ready skills

Learn by working on real-world problems

Capstone projects involving real-world data sets, with virtual labs for hands-on learning

Learn from experts active in their field, not out-of-touch trainers

Leading practitioners who bring current best practices and case studies to sessions that fit into your work schedule

Structured guidance ensuring learning never stops

24x7 learning support from mentors and a community of like-minded peers to resolve any conceptual doubts