Overview
Development environments might not have the same requirements as production environments. Moving data science and machine learning projects from idea to production requires state-of-the-art skills. You need to architect and implement your projects for scale and operational efficiency. Data science is an interdisciplinary field that combines domain knowledge with mathematics, statistics, data visualization, and programming skills.
The Practical Data Science Specialization brings together these disciplines using purpose-built ML tools in the AWS cloud. It helps you develop the practical skills to effectively deploy your data science projects and overcome challenges at each step of the ML workflow using Amazon SageMaker.
This Specialization is designed for data-focused developers, scientists, and analysts familiar with the Python and SQL programming languages who want to learn how to build, train, and deploy scalable, end-to-end ML pipelines - both automated and human-in-the-loop - in the AWS cloud.
Each of the 10 weeks features a comprehensive lab developed specifically for this Specialization that provides hands-on experience with state-of-the-art algorithms for natural language processing (NLP) and natural language understanding (NLU), including BERT and FastText using Amazon SageMaker.
Syllabus
Course 1: Analyze Datasets and Train ML Models using AutoML
- Offered by Amazon Web Services and DeepLearning.AI. In the first course of the Practical Data Science Specialization, you will learn ...
Course 2: Build, Train, and Deploy ML Pipelines using BERT
- Offered by Amazon Web Services and DeepLearning.AI. In the second course of the Practical Data Science Specialization, you will learn to ...
Course 3: Optimize ML Models and Deploy Human-in-the-Loop Pipelines
- Offered by Amazon Web Services and DeepLearning.AI. In the third course of the Practical Data Science Specialization, you will learn a ...
Courses
In the first course of the Practical Data Science Specialization, you will learn foundational concepts for exploratory data analysis (EDA), automated machine learning (AutoML), and text classification algorithms. With Amazon SageMaker Clarify and Amazon SageMaker Data Wrangler, you will analyze a dataset for statistical bias, transform the dataset into machine-readable features, and select the most important features to train a multi-class text classifier. You will then perform automated machine learning (AutoML) to automatically train, tune, and deploy the best text-classification algorithm for the given dataset using Amazon SageMaker Autopilot. Next, you will work with Amazon SageMaker BlazingText, a highly optimized and scalable implementation of the popular FastText algorithm, to train a text classifier with very little code.
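As a rough illustration of what analyzing a dataset for statistical bias can mean, the sketch below computes a simple class-imbalance score between two groups in plain Python. It does not call SageMaker Clarify or Data Wrangler; the function name, the category labels, and the sample counts are all hypothetical and chosen only for the example.

```python
from collections import Counter


def class_imbalance(facet_values, facet_a, facet_d):
    """Normalized difference in sample counts between two facet values.

    Returns a value in [-1, 1]; 0 means both facets are equally represented.
    """
    counts = Counter(facet_values)
    n_a, n_d = counts[facet_a], counts[facet_d]
    if n_a + n_d == 0:
        raise ValueError("neither facet value appears in the data")
    return (n_a - n_d) / (n_a + n_d)


# Hypothetical product categories attached to a set of text reviews.
categories = ["Dresses"] * 6 + ["Blouses"] * 2

ci = class_imbalance(categories, "Dresses", "Blouses")
print(f"class imbalance: {ci:.2f}")
```

A score far from zero suggests one group dominates the data, which is one reason a classifier trained on it might perform unevenly across groups.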
Practical data science is geared towards handling massive datasets that do not fit on your local hardware and may originate from multiple sources. One of the biggest benefits of developing and running data science projects in the cloud is the agility and elasticity it offers to scale up and out at minimal cost.
The Practical Data Science Specialization helps you develop the practical skills to effectively deploy your data science projects and overcome challenges at each step of the ML workflow using Amazon SageMaker. This Specialization is designed for data-focused developers, scientists, and analysts familiar with the Python and SQL programming languages who want to learn how to build, train, and deploy scalable, end-to-end ML pipelines - both automated and human-in-the-loop - in the AWS cloud.
In the second course of the Practical Data Science Specialization, you will learn to automate a natural language processing task by building an end-to-end machine learning pipeline using Hugging Face’s highly optimized implementation of the state-of-the-art BERT algorithm with Amazon SageMaker Pipelines. Your pipeline will first transform the dataset into BERT-readable features and store the features in the Amazon SageMaker Feature Store. It will then fine-tune a text classification model on the dataset using a Hugging Face pre-trained model, which has learned to understand human language from millions of Wikipedia documents. Finally, your pipeline will evaluate the model’s accuracy and deploy the model only if the accuracy exceeds a given threshold.
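The final step, deploying only when accuracy exceeds a threshold, can be sketched in plain Python as below. This is only an illustration of the gating logic, not the SageMaker Pipelines API; the function names, the sample labels, and the 0.7 threshold are assumptions made for the example.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def should_deploy(predictions, labels, threshold):
    """Gate: approve deployment only if accuracy is strictly above the threshold."""
    return accuracy(predictions, labels) > threshold


# Hypothetical sentiment labels (-1, 0, 1) from a held-out evaluation set.
labels = [1, 0, -1, 1]
predictions = [1, 0, 0, 1]

score = accuracy(predictions, labels)
deploy = should_deploy(predictions, labels, threshold=0.7)
print(f"accuracy={score:.2f}, deploy={deploy}")
```

Keeping the gate as a separate function makes the deployment decision easy to test without training a model.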
In the third course of the Practical Data Science Specialization, you will learn a series of performance-improvement and cost-reduction techniques to automatically tune model accuracy, compare prediction performance, and generate new training data with human intelligence. After tuning your text classifier using Amazon SageMaker Hyper-parameter Tuning (HPT), you will deploy two model candidates into an A/B test to compare their real-time prediction performance and automatically scale the winning model using Amazon SageMaker Hosting. Lastly, you will set up a human-in-the-loop pipeline to fix misclassified predictions and generate new training data using Amazon Augmented AI and Amazon SageMaker Ground Truth.
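To make the A/B test idea concrete, the sketch below simulates splitting traffic between two model candidates and picking the one with the higher observed accuracy. It is a local simulation only and does not use SageMaker Hosting; the variant names, traffic weights, and per-variant accuracies are hypothetical values invented for the example.

```python
import random
from collections import defaultdict


def route(weights, rng):
    """Pick a variant name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def pick_winner(outcomes):
    """Return the variant with the highest fraction of correct predictions.

    `outcomes` maps a variant name to a list of booleans (correct or not).
    """
    rates = {name: sum(hits) / len(hits) for name, hits in outcomes.items() if hits}
    if not rates:
        raise ValueError("no variant received any traffic")
    return max(rates, key=rates.get), rates


rng = random.Random(0)  # fixed seed so the simulation is repeatable
weights = {"variant_a": 0.5, "variant_b": 0.5}
# Assumed chance of a correct prediction per variant, for simulation only.
simulated_accuracy = {"variant_a": 0.70, "variant_b": 0.90}

outcomes = defaultdict(list)
for _ in range(2000):
    variant = route(weights, rng)
    outcomes[variant].append(rng.random() < simulated_accuracy[variant])

winner, rates = pick_winner(outcomes)
print(winner, rates)
```

In a real test the outcomes would come from live predictions checked against ground truth, and the traffic weights could then be shifted toward the winning variant.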
Taught by
Antje Barth, Chris Fregly, Shelbee Eigenbrode and Sireesha Muppala
Reviews
5.0 rating, based on 1 Class Central review
Implementing, building, and deploying a machine learning pipeline is complex and involves many time-consuming tasks. AWS with SageMaker and related tools allows you to automate many of these parts, making you more efficient as a data scientist. All intermediate artefacts in the pipeline are available for manual tailoring via the AWS Management Console, the CLI, or the SDK. This gives you plenty of choice over which steps to fully automate by default, where to include your own scripts, and where to intervene manually.
Data scientist resources are scarce. Use them optimally and automate the tedious tasks where possible, so you can make faster progress, save money, and focus on the interesting parts of the business.