Applied Machine Learning using Python and Apache Spark

Recorded content
Of Total 10 Hrs.
Duration
3 Months (50 hours)
LIVE sessions
4 Workshops
Hands-On Learning
With Practice Modules
Certificate
With License

Overview

This Applied Machine Learning using Python and Apache Spark training teaches attendees Machine Learning (ML) concepts, terminology, and usage. Students learn how to perform and scale ML tasks using Python libraries (including NumPy, Pandas, Matplotlib, and Scikit-learn) on the Apache Spark platform.

Objective

  • Gain a basic understanding of Machine Learning
  • Understand the differences between supervised and unsupervised learning
  • Understand how to use Python libraries to explore, clean, and prepare data
  • Describe the role of ML and where it fits into IT strategies
  • Explain the technical and business drivers that result from using Machine Learning
  • Understand techniques like classification, clustering, and regression
  • Discuss how to identify which techniques should be applied for a specific use case
  • Understand popular Machine Learning offerings, including Amazon Machine Learning, TensorFlow, Azure Machine Learning, Google Cloud, Spark MLlib, Python, R, and more
  • Install and set up Anaconda
  • Use Jupyter Notebooks
  • Understand the popular Machine Learning algorithms, including linear regression, decision tree, logistic regression, K-nearest neighbor, K-means clustering, and more
  • Use Python libraries like NumPy, Pandas, Matplotlib, and Scikit-learn
  • Understand the Apache Spark processing framework and its distributed architecture
  • Compare Machine Learning using Python versus Apache Spark
  • Use Databricks cloud with Apache Spark MLlib

Outline

  • History and background of Machine Learning
  • Compare traditional programming to Machine Learning
  • Supervised and unsupervised learning overview
    • Classification
    • Clustering
    • Regression

  • Machine Learning offerings in the industry
  • Install and set up Anaconda
  • Descriptive statistics
  • Jupyter Notebooks

  • NumPy
  • Pandas
  • Matplotlib

  • Scikit-learn

  • Linear regression
  • Naive Bayes
  • Decision tree
  • Random forest
  • Logistic regression
  • Support vector machine
  • K-nearest neighbor
  • K-means clustering

  • Spark libraries

  • Generative AI fundamentals
    • Confusion Matrix
    • ROC Curve, Area Under Curve (AUC)
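
As a rough illustration of the last two outline topics, the sketch below computes a binary confusion matrix and the area under the ROC curve in plain Python. It is not taken from the courseware: the function names `confusion_counts` and `roc_auc` and the sample labels and scores are made up for this example. Scikit-learn provides equivalents in `sklearn.metrics.confusion_matrix` and `sklearn.metrics.roc_auc_score`.

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels encoded as 0 and 1."""
    tn = fp = fn = tp = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores, chosen only for illustration.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

# Turn scores into hard predictions with an assumed threshold of 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(confusion_counts(y_true, y_pred))
print(roc_auc(y_true, scores))
```

The confusion matrix depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.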

Training Materials

All Machine Learning training students receive comprehensive courseware.

Software Requirements

• Windows, Mac, or Linux with at least 8 GB RAM

o Most class activities will create Spark code and visualizations in a browser-based notebook environment. The class also details how to export these notebooks and how to run code outside of this environment.

• A current version of Anaconda for Python 3.x

• Related lab files that Skillsmetrix will provide

• Internet access

Why Online Bootcamps

Develop skills for real career growth

Cutting-edge curriculum designed with guidance from industry and academia to develop job-ready skills

Learn by working on real-world problems

Capstone projects involving real-world data sets with virtual labs for hands-on learning

Learn from experts active in their field, not out-of-touch trainers

Leading practitioners who bring current best practices and case studies to sessions that fit into your work schedule.

Structured guidance ensuring learning never stops

24x7 learning support from mentors and a community of like-minded peers to resolve any conceptual questions