SlideShare une entreprise Scribd logo
1  sur  24
Amazon SageMaker Autopilot
Giuseppe Angelo Porcelli
Principal, Specialist Solutions Architect – Machine Learning
Build models automatically with Amazon SageMaker
© 2020, Amazon Web Services, Inc. or its Affiliates.
Agenda
• Amazon SageMaker Overview
• Amazon SageMaker Autopilot
• Demo
• Automatic Model Tuning in Amazon SageMaker
• Resources to get you started
© 2020, Amazon Web Services, Inc. or its Affiliates.
THE ML WORKFLOW IS ITERATIVE AND COMPLEX
Collect and prepare
training data
Choose or bring your
own ML algorithm
Set up and manage
environments for training
Train, debug, and
tune models
Manage
training runs
Deploy model
in production
Monitor
models
Validate
predictions
Scale and manage the
production environment
© 2020, Amazon Web Services, Inc. or its Affiliates.
AMAZON SAGEMAKER
SageMaker
Ground Truth
Fully managed
data labeling
SageMaker
Studio Notebooks
One-click notebooks
with elastic compute
One-click
Training
Makes it easy to
run training jobs
Automatic
Model Tuning
One-click hyperparameter
optimization
One-click
Deployment
Supports real-time,
batch and multi-model
Model
Monitor
Automatically detect
concept drift
SageMaker
Processing
Built-in Python,
BYO R/Spark
Built-in & bring-
your-own algorithms
Supervised and
unsupervised algorithms
SageMaker
Experiments
Capture, organize, and
compare every step
SageMaker
Neo
Train once,
deploy anywhere
AWS
Marketplace
Pre-built algorithms
and models
SageMaker
Debugger
Debug
training runs
Inf1/Amazon
Elastic Inference
High performance
at lowest cost
Managed
Spot training
Reduce training
cost by 90%
Amazon
Augmented AI
Add human review
of model predictions
SageMaker
Studio
SageMaker
Autopilot
Automatically build
and train models
© 2020, Amazon Web Services, Inc. or its Affiliates.
AutoML
• AutoML aims at automating the process of building a model
• Problem identification: looking at the data set, what class of problem are we trying to
solve?
• Algorithm selection: which algorithm is best suited to solve the problem?
• Data preprocessing: how should data be prepared for best results?
• Hyperparameter tuning: what is the optimal set of training parameters?
• Black box vs. white box
• Black box: the best model only
à Hard to understand the model, impossible to reproduce it manually
• White box: the best model, other candidates, full source code for preprocessing and
training
à See how the model was built, and keep tweaking for extra performance
© 2020, Amazon Web Services, Inc. or its Affiliates.
AutoML with SageMaker Autopilot
• SageMaker Autopilot covers all steps
• Problem identification: looking at the data set, what class of problem are we trying to solve?
• Algorithm selection: which algorithm is best suited to solve the problem?
• Data preprocessing: how should data be prepared for best results?
• Hyperparameter tuning: what is the optimal set of training parameters?
• Autopilot is white box AutoML
• You can understand how the model was built, and you can keep tweaking
• Supported algorithms: regression and classification
• Linear Learner
• Factorization Machines
• KNN
• XGBoost
© 2020, Amazon Web Services, Inc. or its Affiliates.
AutoML with SageMaker Autopilot
1. Upload the unprocessed dataset to S3
2. Configure the AutoML job
• Location of dataset
• Completion criteria
3. Launch the job
4. View the list of candidates and the autogenerated notebooks
5. Deploy the best candidate to a real-time endpoint, or use batch transform
© 2020, Amazon Web Services, Inc. or its Affiliates.
AutoML with SageMaker Autopilot
Dataset,
target attribute
Models
Notebooks
Metrics
© 2020, Amazon Web Services, Inc. or its Affiliates.
SageMaker Autopilot from Studio
Only S3 location and target variable required
Optional control points:
• dry-run vs complete mode
• setting problem type
• security settings
API/SDK level control points:
• number of candidate models to build
• maximum time to take
• model evaluation metric (accuracy, F1, RMSE)
© 2020, Amazon Web Services, Inc. or its Affiliates.
SageMaker Autopilot whitebox AutoML
Dataset Exploration Notebook
• Dataset statistics: row-wise and column-
wise
• Suggested remedies for common data set
problems
© 2020, Amazon Web Services, Inc. or its Affiliates.
SageMaker Autopilot whitebox AutoML
Model candidate notebook (fully runnable)
• data transformers
• featurization techniques applied
• override points
• algorithms considered
• evaluation metric
• hyper-parameter ranges
• model search strategy
• instances used
© 2020, Amazon Web Services, Inc. or its Affiliates.
Recent Enhancements
• Supporting AUC as an optimization metric: you can now
direct Autopilot to use AUC for tuning in binary classification problems,
resulting in higher performing models
• Confidence scores on predictions: for classification problems, inference
now returns the probability scores of each class
• Minimum dataset size: reduced from 1000 to 500 rows
© 2020, Amazon Web Services, Inc. or its Affiliates.
Demo
© 2020, Amazon Web Services, Inc. or its Affiliates.
Resources on Amazon SageMaker Autopilot
Documentation
https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-automate-model-development.html
https://sagemaker.readthedocs.io/en/stable/automl.html
Notebooks
https://github.com/awslabs/amazon-sagemaker-examples/tree/master/autopilot
Blog posts
https://aws.amazon.com/blogs/aws/amazon-sagemaker-autopilot-fully-managed-automatic-machine-learning/
Tutorials
https://aws.amazon.com/getting-started/hands-on/create-machine-learning-model-automatically-sagemaker-
autopilot/
© 2020, Amazon Web Services, Inc. or its Affiliates.
Automatic model tuning
in Amazon SageMaker
© 2020, Amazon Web Services, Inc. or its Affiliates.
Algorithms require many hyperparameters
Neural Networks
Number of layers
Hidden layer width
Learning rate
Embedding
dimensions
Dropout
…
XGBoost
Tree depth
Max leaf nodes
Gamma
Eta
Lambda
Alpha
…
Which values
should I pick?
Which ones
are the most
influential?
How many
combinations
should I try?
© 2020, Amazon Web Services, Inc. or its Affiliates.
Setting hyperparameters in Amazon SageMaker
• Built-in algorithms
• Python parameters set on the relevant estimator (KMeans, LinearLearner, etc.)
xgb.set_hyperparameters(max_depth=5, eta=0.2, gamma=4)
• Built-in frameworks
• hyperparameters parameter passed to the relevant estimator (TensorFlow, MXNet, etc.)
• This must be a Python dictionary
tf_estimator = TensorFlow(…, hyperparameters={'epochs’: 1, ‘lr’: ‘0.01’})
• Your code must be able to accept them as command-line arguments (script mode)
• Bring your own container
• hyperparameters parameter passed to the Estimator
• This must be Python dictionary
• It’s automatically copied inside the container: /opt/ml/input/config/hyperparameters.json
© 2020, Amazon Web Services, Inc. or its Affiliates.
Finding the optimal set of hyperparameters
• Manual Search: ”I know what I’m doing”
• Grid Search: “X marks the spot”
• Typically training hundreds of models
• Slow and expensive
• Random Search: “Spray and pray”
« Random Search for Hyper-Parameter Optimization », Bergstra & Bengio, 2012
• Works better and faster than Grid Search
• But… but… but… it’s random!
• Hyperparameter Optimization: use ML to predict hyperparameters
• Training fewer models
• Gaussian Process Regression and Bayesian Optimization
• https://docs.aws.amazon.com/en_pv/sagemaker/latest/dg/automatic-model-tuning-how-it-works.html
© 2020, Amazon Web Services, Inc. or its Affiliates.
Automatic Model Tuning in Amazon SageMaker
1. Define an Estimator the normal way
2. Define the metric to tune on
• Pre-defined metrics for built-in algorithms and frameworks
• Or anything present in the training log, provided that you pass a regular expression for it
3. Define parameter ranges to explore
• Type: categorical (avoid if possible), integer, continuous (aka floating point)
• Range of values
• Scaling: linear, logarithmic, reverse logarithmic
4. Create an HyperparameterTuner
• Estimator, metric, parameters, total number of jobs, number of jobs in parallel
• Strategy: bayesian (default), or random search
5. Launch the tuning job with fit()
© 2020, Amazon Web Services, Inc. or its Affiliates.
Workflow
Training JobHyperparameter Tuning
Job
Tuning strategy
Objective metrics
Training Job
Training Job
Training Job
Clients
(console, notebook, IDEs, CLI)
model name
model1
model2
…
objective
metric
0.8
0.75
…
eta
0.07
0.09
…
max_depth
6
5
…
…
© 2020, Amazon Web Services, Inc. or its Affiliates.
Tips
• Use the bayesian strategy for better, faster, cheaper results
• Most customers use random search as a baseline, to check that bayesian performs
better
• Don’t run too many jobs in parallel
• This gives the bayesian strategy fewer opportunities to predict
• Instance limits!
• Don’t run too many jobs
• Bayesian typically requires 10x fewer jobs than random
• Cost vs business benefits (beware of diminishing returns)
© 2020, Amazon Web Services, Inc. or its Affiliates.
Resources on Automatic Model Tuning
Documentation
https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html
https://sagemaker.readthedocs.io/en/stable/tuner.html
Notebooks
https://github.com/awslabs/amazon-sagemaker-examples/tree/master/hyperparameter_tuning
Blog posts
https://aws.amazon.com/blogs/aws/sagemaker-automatic-model-tuning/
https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-automatic-model-tuning-produces-better-models-faster/
https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-automatic-model-tuning-now-supports-early-stopping-of-
training-jobs/
https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-automatic-model-tuning-becomes-more-efficient-with-
warm-start-of-hyperparameter-tuning-jobs/
https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-automatic-model-tuning-now-supports-random-search-and-
hyperparameter-scaling/
© 2020, Amazon Web Services, Inc. or its Affiliates.
AWS Training & Certification
https://www.aws.training Free on-demand courses to help you build new cloud skills
Curriculum: Exploring the Machine Learning Toolset
https://www.aws.training/Details/Curriculum?id=27155
Curriculum: Developing Machine Learning Applications
https://www.aws.training/Details/Curriculum?id=27243
Curriculum: Machine Learning Security
https://www.aws.training/Details/Curriculum?id=27273
Curriculum: Demystifying AI/ML/DL
https://www.aws.training/Details/Curriculum?id=27241
Video: AWS Foundations: Machine Learning Basics
https://www.aws.training/Details/Video?id=49644
Curriculum: Conversation Primer: Machine Learning
Terminology
https://www.aws.training/Details/Curriculum?id=27270
For more info on AWS T&C visit: https://aws.amazon.com/it/training/
Thanks!

Contenu connexe

Tendances

Track 4 Session 3_ 利用 AWS Step Functions 建構穩健的業務處理流程
Track 4 Session 3_ 利用 AWS Step Functions 建構穩健的業務處理流程Track 4 Session 3_ 利用 AWS Step Functions 建構穩健的業務處理流程
Track 4 Session 3_ 利用 AWS Step Functions 建構穩健的業務處理流程
Amazon Web Services
 
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptx
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptxTrack 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptx
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptx
Amazon Web Services
 
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用
Amazon Web Services
 
Track 6 Session 3_如何藉由 AWS AI 和機器學習平台搭建多功能的 AI 解決方案.pptx
Track 6 Session 3_如何藉由 AWS AI 和機器學習平台搭建多功能的 AI 解決方案.pptxTrack 6 Session 3_如何藉由 AWS AI 和機器學習平台搭建多功能的 AI 解決方案.pptx
Track 6 Session 3_如何藉由 AWS AI 和機器學習平台搭建多功能的 AI 解決方案.pptx
Amazon Web Services
 
Track 4 Session 6_ IOT01 如何透過 AWS IoT 服務建構物聯網應用
Track 4 Session 6_ IOT01 如何透過 AWS IoT 服務建構物聯網應用Track 4 Session 6_ IOT01 如何透過 AWS IoT 服務建構物聯網應用
Track 4 Session 6_ IOT01 如何透過 AWS IoT 服務建構物聯網應用
Amazon Web Services
 
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptxTrack 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
Amazon Web Services
 
Track 2 Session 3_ 日本電視直播技術革命串流平台不容忽視的技術創新.pptx
Track 2 Session 3_ 日本電視直播技術革命串流平台不容忽視的技術創新.pptxTrack 2 Session 3_ 日本電視直播技術革命串流平台不容忽視的技術創新.pptx
Track 2 Session 3_ 日本電視直播技術革命串流平台不容忽視的技術創新.pptx
Amazon Web Services
 
Track 3 Session 4_企業工作負載遷移至 AWS 的最佳實踐
Track 3 Session 4_企業工作負載遷移至 AWS 的最佳實踐Track 3 Session 4_企業工作負載遷移至 AWS 的最佳實踐
Track 3 Session 4_企業工作負載遷移至 AWS 的最佳實踐
Amazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
Amazon Web Services
 

Tendances (20)

Track 4 Session 3_ 利用 AWS Step Functions 建構穩健的業務處理流程
Track 4 Session 3_ 利用 AWS Step Functions 建構穩健的業務處理流程Track 4 Session 3_ 利用 AWS Step Functions 建構穩健的業務處理流程
Track 4 Session 3_ 利用 AWS Step Functions 建構穩健的業務處理流程
 
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptx
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptxTrack 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptx
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptx
 
ENT207-The Future of Enterprise IT.pdf
ENT207-The Future of Enterprise IT.pdfENT207-The Future of Enterprise IT.pdf
ENT207-The Future of Enterprise IT.pdf
 
AWS 微服務架構分享
AWS 微服務架構分享AWS 微服務架構分享
AWS 微服務架構分享
 
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用
 
Cloud Economics; How to Quantify the Benefits of Moving to the Cloud - Transf...
Cloud Economics; How to Quantify the Benefits of Moving to the Cloud - Transf...Cloud Economics; How to Quantify the Benefits of Moving to the Cloud - Transf...
Cloud Economics; How to Quantify the Benefits of Moving to the Cloud - Transf...
 
BDA308 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA308 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA308 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA308 Deep Dive: Log Analytics with Amazon Elasticsearch Service
 
Track 6 Session 3_如何藉由 AWS AI 和機器學習平台搭建多功能的 AI 解決方案.pptx
Track 6 Session 3_如何藉由 AWS AI 和機器學習平台搭建多功能的 AI 解決方案.pptxTrack 6 Session 3_如何藉由 AWS AI 和機器學習平台搭建多功能的 AI 解決方案.pptx
Track 6 Session 3_如何藉由 AWS AI 和機器學習平台搭建多功能的 AI 解決方案.pptx
 
Track 4 Session 6_ IOT01 如何透過 AWS IoT 服務建構物聯網應用
Track 4 Session 6_ IOT01 如何透過 AWS IoT 服務建構物聯網應用Track 4 Session 6_ IOT01 如何透過 AWS IoT 服務建構物聯網應用
Track 4 Session 6_ IOT01 如何透過 AWS IoT 服務建構物聯網應用
 
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
 
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptxTrack 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
 
Track 2 Session 3_ 日本電視直播技術革命串流平台不容忽視的技術創新.pptx
Track 2 Session 3_ 日本電視直播技術革命串流平台不容忽視的技術創新.pptxTrack 2 Session 3_ 日本電視直播技術革命串流平台不容忽視的技術創新.pptx
Track 2 Session 3_ 日本電視直播技術革命串流平台不容忽視的技術創新.pptx
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Keep Your Infrastructure Costs Low: AWS Startup Day - New York 2018
Keep Your Infrastructure Costs Low: AWS Startup Day - New York 2018Keep Your Infrastructure Costs Low: AWS Startup Day - New York 2018
Keep Your Infrastructure Costs Low: AWS Startup Day - New York 2018
 
Cloud Economics: The Financial Case for Cloud Migration
Cloud Economics: The Financial Case for Cloud MigrationCloud Economics: The Financial Case for Cloud Migration
Cloud Economics: The Financial Case for Cloud Migration
 
aws basics
aws basicsaws basics
aws basics
 
Track 3 Session 4_企業工作負載遷移至 AWS 的最佳實踐
Track 3 Session 4_企業工作負載遷移至 AWS 的最佳實踐Track 3 Session 4_企業工作負載遷移至 AWS 的最佳實踐
Track 3 Session 4_企業工作負載遷移至 AWS 的最佳實踐
 
Executing a Large-Scale Migration to AWS
Executing a Large-Scale Migration to AWSExecuting a Large-Scale Migration to AWS
Executing a Large-Scale Migration to AWS
 
Serverless on AWS: Architectural Patterns and Best Practices
Serverless on AWS: Architectural Patterns and Best PracticesServerless on AWS: Architectural Patterns and Best Practices
Serverless on AWS: Architectural Patterns and Best Practices
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 

Similaire à Costruisci modelli di Machine Learning con Amazon SageMaker Autopilot

AWS의 새로운 언어, 음성, 텍스트 처리 인공 지능 서비스, Amazon SageMaker::Sunil Mallya::AWS Summit...
AWS의 새로운 언어, 음성, 텍스트 처리 인공 지능 서비스, Amazon SageMaker::Sunil Mallya::AWS Summit...AWS의 새로운 언어, 음성, 텍스트 처리 인공 지능 서비스, Amazon SageMaker::Sunil Mallya::AWS Summit...
AWS의 새로운 언어, 음성, 텍스트 처리 인공 지능 서비스, Amazon SageMaker::Sunil Mallya::AWS Summit...
Amazon Web Services Korea
 

Similaire à Costruisci modelli di Machine Learning con Amazon SageMaker Autopilot (20)

AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
 
Building Machine Learning Models Automatically (June 2020)
Building Machine Learning Models Automatically (June 2020)Building Machine Learning Models Automatically (June 2020)
Building Machine Learning Models Automatically (June 2020)
 
Waking the Data Scientist at 2am: Detect Model Degradation on Production Mod...
Waking the Data Scientist at 2am:  Detect Model Degradation on Production Mod...Waking the Data Scientist at 2am:  Detect Model Degradation on Production Mod...
Waking the Data Scientist at 2am: Detect Model Degradation on Production Mod...
 
Build Your Recommendation Engine on AWS Today - AWS Summit Berlin 2018
Build Your Recommendation Engine on AWS Today - AWS Summit Berlin 2018Build Your Recommendation Engine on AWS Today - AWS Summit Berlin 2018
Build Your Recommendation Engine on AWS Today - AWS Summit Berlin 2018
 
Advanced Machine Learning with Amazon SageMaker
Advanced Machine Learning with Amazon SageMakerAdvanced Machine Learning with Amazon SageMaker
Advanced Machine Learning with Amazon SageMaker
 
Amazon SageMaker workshop
Amazon SageMaker workshopAmazon SageMaker workshop
Amazon SageMaker workshop
 
Machine Learning with Amazon SageMaker
Machine Learning with Amazon SageMakerMachine Learning with Amazon SageMaker
Machine Learning with Amazon SageMaker
 
Building, Training and Deploying Custom Algorithms with Amazon SageMaker
Building, Training and Deploying Custom Algorithms with Amazon SageMakerBuilding, Training and Deploying Custom Algorithms with Amazon SageMaker
Building, Training and Deploying Custom Algorithms with Amazon SageMaker
 
Train & Deploy ML Models with Amazon Sagemaker: Collision 2018
Train & Deploy ML Models with Amazon Sagemaker: Collision 2018Train & Deploy ML Models with Amazon Sagemaker: Collision 2018
Train & Deploy ML Models with Amazon Sagemaker: Collision 2018
 
AWS re:Invent 2018 - ENT321 - SageMaker Workshop
AWS re:Invent 2018 - ENT321 - SageMaker WorkshopAWS re:Invent 2018 - ENT321 - SageMaker Workshop
AWS re:Invent 2018 - ENT321 - SageMaker Workshop
 
Build, Train, and Deploy Machine Learning for the Enterprise with Amazon Sage...
Build, Train, and Deploy Machine Learning for the Enterprise with Amazon Sage...Build, Train, and Deploy Machine Learning for the Enterprise with Amazon Sage...
Build, Train, and Deploy Machine Learning for the Enterprise with Amazon Sage...
 
AWS의 새로운 언어, 음성, 텍스트 처리 인공 지능 서비스, Amazon SageMaker::Sunil Mallya::AWS Summit...
AWS의 새로운 언어, 음성, 텍스트 처리 인공 지능 서비스, Amazon SageMaker::Sunil Mallya::AWS Summit...AWS의 새로운 언어, 음성, 텍스트 처리 인공 지능 서비스, Amazon SageMaker::Sunil Mallya::AWS Summit...
AWS의 새로운 언어, 음성, 텍스트 처리 인공 지능 서비스, Amazon SageMaker::Sunil Mallya::AWS Summit...
 
Train ML Models Using Amazon SageMaker with TensorFlow - SRV336 - Chicago AWS...
Train ML Models Using Amazon SageMaker with TensorFlow - SRV336 - Chicago AWS...Train ML Models Using Amazon SageMaker with TensorFlow - SRV336 - Chicago AWS...
Train ML Models Using Amazon SageMaker with TensorFlow - SRV336 - Chicago AWS...
 
MLOPS By Amazon offered and free download
MLOPS By Amazon offered and free downloadMLOPS By Amazon offered and free download
MLOPS By Amazon offered and free download
 
Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...
Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...
Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...
 
AWS reinvent 2019 recap - Riyadh - AI And ML - Ahmed Raafat
AWS reinvent 2019 recap - Riyadh - AI And ML - Ahmed RaafatAWS reinvent 2019 recap - Riyadh - AI And ML - Ahmed Raafat
AWS reinvent 2019 recap - Riyadh - AI And ML - Ahmed Raafat
 
Demystifying Machine Learning with AWS (ACD Mumbai)
Demystifying Machine Learning with AWS (ACD Mumbai)Demystifying Machine Learning with AWS (ACD Mumbai)
Demystifying Machine Learning with AWS (ACD Mumbai)
 
Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...
Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...
Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...
 
AWS Machine Learning Week SF: End to End Model Development Using SageMaker
AWS Machine Learning Week SF: End to End Model Development Using SageMakerAWS Machine Learning Week SF: End to End Model Development Using SageMaker
AWS Machine Learning Week SF: End to End Model Development Using SageMaker
 
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...
 

Plus de Amazon Web Services

Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
Amazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
Amazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 

Plus de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 
Come costruire un'architettura Serverless nel Cloud AWS
Come costruire un'architettura Serverless nel Cloud AWSCome costruire un'architettura Serverless nel Cloud AWS
Come costruire un'architettura Serverless nel Cloud AWS
 
AWS Serverless per startup: come innovare senza preoccuparsi dei server
AWS Serverless per startup: come innovare senza preoccuparsi dei serverAWS Serverless per startup: come innovare senza preoccuparsi dei server
AWS Serverless per startup: come innovare senza preoccuparsi dei server
 
Crea dashboard interattive con Amazon QuickSight
Crea dashboard interattive con Amazon QuickSightCrea dashboard interattive con Amazon QuickSight
Crea dashboard interattive con Amazon QuickSight
 
Migra le tue file shares in cloud con FSx for Windows
Migra le tue file shares in cloud con FSx for Windows Migra le tue file shares in cloud con FSx for Windows
Migra le tue file shares in cloud con FSx for Windows
 

Costruisci modelli di Machine Learning con Amazon SageMaker Autopilot

  • 1. Amazon SageMaker Autopilot Giuseppe Angelo Porcelli Principal, Specialist Solutions Architect – Machine Learning Build models automatically with Amazon SageMaker
  • 2. © 2020, Amazon Web Services, Inc. or its Affiliates. Agenda • Amazon SageMaker Overview • Amazon SageMaker Autopilot • Demo • Automatic Model Tuning in Amazon SageMaker • Resources to get you started
  • 3. © 2020, Amazon Web Services, Inc. or its Affiliates. THE ML WORKFLOW IS ITERATIVE AND COMPLEX Collect and prepare training data Choose or bring your own ML algorithm Set up and manage environments for training Train, debug, and tune models Manage training runs Deploy model in production Monitor models Validate predictions Scale and manage the production environment
  • 4. © 2020, Amazon Web Services, Inc. or its Affiliates. AMAZON SAGEMAKER SageMaker Ground Truth Fully managed data labeling SageMaker Studio Notebooks One-click notebooks with elastic compute One-click Training Makes it easy to run training jobs Automatic Model Tuning One-click hyperparameter optimization One-click Deployment Supports real-time, batch and multi-model Model Monitor Automatically detect concept drift SageMaker Processing Built-in Python, BYO R/Spark Built-in & bring- your-own algorithms Supervised and unsupervised algorithms SageMaker Experiments Capture, organize, and compare every step SageMaker Neo Train once, deploy anywhere AWS Marketplace Pre-built algorithms and models SageMaker Debugger Debug training runs Inf1/Amazon Elastic Inference High performance at lowest cost Managed Spot training Reduce training cost by 90% Amazon Augmented AI Add human review of model predictions SageMaker Studio SageMaker Autopilot Automatically build and train models
  • 5. © 2020, Amazon Web Services, Inc. or its Affiliates. AutoML • AutoML aims at automating the process of building a model • Problem identification: looking at the data set, what class of problem are we trying to solve? • Algorithm selection: which algorithm is best suited to solve the problem? • Data preprocessing: how should data be prepared for best results? • Hyperparameter tuning: what is the optimal set of training parameters? • Black box vs. white box • Black box: the best model only à Hard to understand the model, impossible to reproduce it manually • White box: the best model, other candidates, full source code for preprocessing and training à See how the model was built, and keep tweaking for extra performance
  • 6. © 2020, Amazon Web Services, Inc. or its Affiliates. AutoML with SageMaker Autopilot • SageMaker Autopilot covers all steps • Problem identification: looking at the data set, what class of problem are we trying to solve? • Algorithm selection: which algorithm is best suited to solve the problem? • Data preprocessing: how should data be prepared for best results? • Hyperparameter tuning: what is the optimal set of training parameters? • Autopilot is white box AutoML • You can understand how the model was built, and you can keep tweaking • Supported algorithms: regression and classification • Linear Learner • Factorization Machines • KNN • XGBoost
  • 7. © 2020, Amazon Web Services, Inc. or its Affiliates. AutoML with SageMaker Autopilot 1. Upload the unprocessed dataset to S3 2. Configure the AutoML job • Location of dataset • Completion criteria 3. Launch the job 4. View the list of candidates and the autogenerated notebooks 5. Deploy the best candidate to a real-time endpoint, or use batch transform
  • 8. © 2020, Amazon Web Services, Inc. or its Affiliates. AutoML with SageMaker Autopilot Dataset, target attribute Models Notebooks Metrics
  • 9. © 2020, Amazon Web Services, Inc. or its Affiliates. SageMaker Autopilot from Studio Only S3 location and target variable required Optional control points: • dry-run vs complete mode • setting problem type • security settings API/SDK level control points: • number of candidate models to build • maximum time to take • model evaluation metric (accuracy, F1, RMSE)
  • 10. © 2020, Amazon Web Services, Inc. or its Affiliates. SageMaker Autopilot whitebox AutoML Dataset Exploration Notebook • Dataset statistics: row-wise and column- wise • Suggested remedies for common data set problems
  • 11. © 2020, Amazon Web Services, Inc. or its Affiliates. SageMaker Autopilot whitebox AutoML Model candidate notebook (fully runnable) • data transformers • featurization techniques applied • override points • algorithms considered • evaluation metric • hyper-parameter ranges • model search strategy • instances used
  • 12. © 2020, Amazon Web Services, Inc. or its Affiliates. Recent Enhancements • Supporting AUC as an optimization metric: you can now direct Autopilot to use AUC for tuning in binary classification problems, resulting in higher performing models • Confidence scores on predictions: for classification problems, inference now returns the probability scores of each class • Minimum dataset size: reduced from 1000 to 500 rows
  • 13. © 2020, Amazon Web Services, Inc. or its Affiliates. Demo
  • 14. © 2020, Amazon Web Services, Inc. or its Affiliates. Resources on Amazon SageMaker Autopilot Documentation https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-automate-model-development.html https://sagemaker.readthedocs.io/en/stable/automl.html Notebooks https://github.com/awslabs/amazon-sagemaker-examples/tree/master/autopilot Blog posts https://aws.amazon.com/blogs/aws/amazon-sagemaker-autopilot-fully-managed-automatic-machine-learning/ Tutorials https://aws.amazon.com/getting-started/hands-on/create-machine-learning-model-automatically-sagemaker- autopilot/
  • 15. © 2020, Amazon Web Services, Inc. or its Affiliates. Automatic model tuning in Amazon SageMaker
  • 16. © 2020, Amazon Web Services, Inc. or its Affiliates. Algorithms require many hyperparameters Neural Networks Number of layers Hidden layer width Learning rate Embedding dimensions Dropout … XGBoost Tree depth Max leaf nodes Gamma Eta Lambda Alpha … Which values should I pick? Which ones are the most influential? How many combinations should I try?
  • 17. © 2020, Amazon Web Services, Inc. or its Affiliates. Setting hyperparameters in Amazon SageMaker • Built-in algorithms • Python parameters set on the relevant estimator (KMeans, LinearLearner, etc.) xgb.set_hyperparameters(max_depth=5, eta=0.2, gamma=4) • Built-in frameworks • hyperparameters parameter passed to the relevant estimator (TensorFlow, MXNet, etc.) • This must be a Python dictionary tf_estimator = TensorFlow(…, hyperparameters={'epochs’: 1, ‘lr’: ‘0.01’}) • Your code must be able to accept them as command-line arguments (script mode) • Bring your own container • hyperparameters parameter passed to the Estimator • This must be Python dictionary • It’s automatically copied inside the container: /opt/ml/input/config/hyperparameters.json
  • 18. © 2020, Amazon Web Services, Inc. or its Affiliates. Finding the optimal set of hyperparameters • Manual Search: ”I know what I’m doing” • Grid Search: “X marks the spot” • Typically training hundreds of models • Slow and expensive • Random Search: “Spray and pray” « Random Search for Hyper-Parameter Optimization », Bergstra & Bengio, 2012 • Works better and faster than Grid Search • But… but… but… it’s random! • Hyperparameter Optimization: use ML to predict hyperparameters • Training fewer models • Gaussian Process Regression and Bayesian Optimization • https://docs.aws.amazon.com/en_pv/sagemaker/latest/dg/automatic-model-tuning-how-it-works.html
  • 19. © 2020, Amazon Web Services, Inc. or its Affiliates. Automatic Model Tuning in Amazon SageMaker 1. Define an Estimator the normal way 2. Define the metric to tune on • Pre-defined metrics for built-in algorithms and frameworks • Or anything present in the training log, provided that you pass a regular expression for it 3. Define parameter ranges to explore • Type: categorical (avoid if possible), integer, continuous (aka floating point) • Range of values • Scaling: linear, logarithmic, reverse logarithmic 4. Create an HyperparameterTuner • Estimator, metric, parameters, total number of jobs, number of jobs in parallel • Strategy: bayesian (default), or random search 5. Launch the tuning job with fit()
  • 20. © 2020, Amazon Web Services, Inc. or its Affiliates. Workflow Training JobHyperparameter Tuning Job Tuning strategy Objective metrics Training Job Training Job Training Job Clients (console, notebook, IDEs, CLI) model name model1 model2 … objective metric 0.8 0.75 … eta 0.07 0.09 … max_depth 6 5 … …
  • 21. © 2020, Amazon Web Services, Inc. or its Affiliates. Tips • Use the bayesian strategy for better, faster, cheaper results • Most customers use random search as a baseline, to check that bayesian performs better • Don’t run too many jobs in parallel • This gives the bayesian strategy fewer opportunities to predict • Instance limits! • Don’t run too many jobs • Bayesian typically requires 10x fewer jobs than random • Cost vs business benefits (beware of diminishing returns)
  • 22. © 2020, Amazon Web Services, Inc. or its Affiliates. Resources on Automatic Model Tuning Documentation https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html https://sagemaker.readthedocs.io/en/stable/tuner.html Notebooks https://github.com/awslabs/amazon-sagemaker-examples/tree/master/hyperparameter_tuning Blog posts https://aws.amazon.com/blogs/aws/sagemaker-automatic-model-tuning/ https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-automatic-model-tuning-produces-better-models-faster/ https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-automatic-model-tuning-now-supports-early-stopping-of- training-jobs/ https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-automatic-model-tuning-becomes-more-efficient-with- warm-start-of-hyperparameter-tuning-jobs/ https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-automatic-model-tuning-now-supports-random-search-and- hyperparameter-scaling/
  • 23. © 2020, Amazon Web Services, Inc. or its Affiliates. AWS Training & Certification https://www.aws.training Free on-demand courses to help you build new cloud skills Curriculum: Exploring the Machine Learning Toolset https://www.aws.training/Details/Curriculum?id=27155 Curriculum: Developing Machine Learning Applications https://www.aws.training/Details/Curriculum?id=27243 Curriculum: Machine Learning Security https://www.aws.training/Details/Curriculum?id=27273 Curriculum: Demystifying AI/ML/DL https://www.aws.training/Details/Curriculum?id=27241 Video: AWS Foundations: Machine Learning Basics https://www.aws.training/Details/Video?id=49644 Curriculum: Conversation Primer: Machine Learning Terminology https://www.aws.training/Details/Curriculum?id=27270 For more info on AWS T&C visit: https://aws.amazon.com/it/training/