Scale Machine Learning Deployment
Gang Tao
Data Science Project Life Cycle
Model Persistence
Python Sklearn / Spark / Tensorflow
▶ Python pickle-based code serialization
▶ sklearn.externals.joblib
▶ Spark provides an API to save a model or pipeline as a file
▶ Tensorflow provides tf.train.Saver, which persists the tensor graph
▶ It is pickle + metadata + checkpoint
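A minimal sketch of pickle-based persistence, using only the standard library. The parameter dictionary, the file name, and the helpers `save_model`, `load_model`, and `predict` are illustrative assumptions, not part of any tool named above. Recent scikit-learn releases no longer ship `sklearn.externals.joblib`; the standalone `joblib` package offers `dump` and `load` with the same save/load pattern.

```python
import os
import pickle
import tempfile


def save_model(params, path):
    """Serialize model parameters to a file with pickle."""
    with open(path, "wb") as f:
        pickle.dump(params, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_model(path):
    """Load parameters back; only do this for files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


def predict(params, x):
    """Toy linear model: dot(weights, x) + bias."""
    return sum(w * v for w, v in zip(params["weights"], x)) + params["bias"]


# Hypothetical trained parameters, standing in for a real estimator.
trained = {"weights": [0.5, 2.0], "bias": 1.0}

with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "model.pkl")
    save_model(trained, model_path)
    restored = load_model(model_path)

print(predict(restored, [2.0, 3.0]))  # prints 8.0
```

With real estimator objects, pickle stores the class by reference, so the same library code must be importable when the file is loaded.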
Issues and Limitations
▶ Models from different tools are not compatible with each other
▶ Code serialization depends on the Python version
▶ Code serialization raises potential security concerns
▶ For a tf model, the tensor names are required (need to check whether they are in the metadata)
▶ A tf model depends on any custom code that defines custom operations
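The security concern can be sketched as follows: a pickle stream may instruct the loader to call an arbitrary function. The `Payload` class and the harmless `eval("6 * 7")` call are illustrative assumptions. `RestrictedUnpickler` follows the "restricting globals" pattern from the Python pickle documentation and is only one possible mitigation.

```python
import io
import pickle


class Payload:
    """Object whose pickle stream asks the loader to call eval()."""

    def __reduce__(self):
        # (callable, args): the unpickler calls eval("6 * 7") on load.
        return (eval, ("6 * 7",))


malicious_bytes = pickle.dumps(Payload())

# Plain loads() runs the embedded call and returns its result.
result = pickle.loads(malicious_bytes)


class RestrictedUnpickler(pickle.Unpickler):
    """Refuse every global lookup, so only plain data types can load."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain containers and numbers need no global lookup, so they still load.
safe = restricted_loads(pickle.dumps({"weights": [0.5, 2.0]}))

try:
    restricted_loads(malicious_bytes)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

A model file from an untrusted source should therefore never be passed to a plain `pickle.load`.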
A simple view of model deployment
ML Deployment Challenges
▶ Support a wide range of ML modeling tools: Python, R, Tensorflow, Spark
▶ Scale up and down
▶ Performance and latency optimization
▶ Model access through an API
▶ Audit and versioning
▶ CI/CD
▶ Metrics and monitoring
▶ Optimization, A/B tests
Seldon
▶ Seldon, a London company, focuses on providing control over machine learning based on open source software
▶ Seldon Core is an open source platform for deploying machine learning models on Kubernetes
• Python/Spark/H2O/R model support
• REST and gRPC API
• Deploys an inference graph of models, routers, combiners, and transformers as microservices
• Leverages K8s to provide scale, security, monitoring, etc.
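The idea of an inference graph can be sketched in plain Python. This is not Seldon's API: `scale`, `model_a`, `model_b`, `ab_router`, and `average_combiner` are hypothetical stand-ins for the transformer, model, router, and combiner node types, which Seldon would run as separate microservices.

```python
import random
from statistics import mean


def scale(x):
    """Transformer node: preprocess the request features."""
    return [v / 10.0 for v in x]


def model_a(x):
    """Model node A (toy): sum of features."""
    return sum(x)


def model_b(x):
    """Model node B (toy): largest feature."""
    return max(x)


def ab_router(models, weights, rng):
    """Router node: pick one model per request by traffic weight."""
    return rng.choices(models, weights=weights, k=1)[0]


def average_combiner(models, x):
    """Combiner node: ensemble by averaging all model outputs."""
    return mean(m(x) for m in models)


def serve_ab(x, rng, weights=(0.5, 0.5)):
    """Graph 1: transformer -> router -> one of the models."""
    chosen = ab_router([model_a, model_b], weights, rng)
    return chosen.__name__, chosen(scale(x))


def serve_ensemble(x):
    """Graph 2: transformer -> both models -> combiner."""
    return average_combiner([model_a, model_b], scale(x))


rng = random.Random(0)
print(serve_ab([10, 20, 30], rng, weights=(1.0, 0.0)))
print(serve_ensemble([10, 20, 30]))
```

Changing the weights shifts traffic between the two models for an A/B test, while the combiner graph gives an ensemble from the same building blocks.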
Summary
Pros
▶ Seamless K8s integration
▶ Graph definition to support A/B tests and ensembling
Cons
▶ No Scala support for Spark
▶ Needs a custom image for PySpark
▶ No customization support for liveness/readiness checks because of the CRD
Clipper
▶ Clipper (Clipper.ai) is a system developed by the UC Berkeley RISE Lab.
▶ Clipper is a prediction serving system that sits between user-facing applications and a wide range of commonly used machine learning models and frameworks.
Summary
Pros
▶ Easy-to-use interactive model deployment
▶ Supports Docker and K8s
▶ Query latency objective support
▶ Model version management
• Update and rollback
Cons
▶ cloudpickle version issues
▶ Python only
▶ Few examples and documents
▶ Not friendly to AWS
• use_internal_ip does not work well
• Need to manually create a repo for the model
• Failed to pull the image from ECR
▶ Cluster creation is not stable
▶ Tensorflow failed to pickle
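Model version management with update and rollback can be sketched as a small registry. This is not Clipper's API: the `ModelRegistry` class, its methods, and the lambda models are hypothetical and only illustrate the behavior.

```python
class ModelRegistry:
    """Keeps every deployed version and tracks which one serves traffic."""

    def __init__(self):
        self._versions = {}  # version label -> predict function
        self._history = []   # activation order, newest last

    def deploy(self, version, predict_fn):
        """Update: register a new version and make it the active one."""
        self._versions[version] = predict_fn
        self._history.append(version)

    @property
    def active(self):
        return self._history[-1]

    def rollback(self):
        """Reactivate the previously active version."""
        if len(self._history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self._history.pop()
        return self.active

    def predict(self, x):
        return self._versions[self.active](x)


registry = ModelRegistry()
registry.deploy("v1", lambda x: x * 2)
registry.deploy("v2", lambda x: x * 3)
after_update = registry.predict(5)    # served by v2
rolled_back_to = registry.rollback()  # back to v1
after_rollback = registry.predict(5)  # served by v1
```

Keeping old versions registered is what makes the rollback cheap: no redeployment is needed, only a change of the active label.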
MLflow
▶ MLflow is an open source platform for managing the end-to-end machine learning lifecycle.
▶ MLflow is developed by Databricks.
Summary
Pros
▶ Flexible
▶ Easy to use with Sklearn
▶ Cloud integration to support SageMaker and Azure
Cons
▶ No K8s integration
▶ Spark/Tensorflow support is based on Python
▶ Projects would be better managed with containers
MLeap
▶ MLeap allows data scientists and engineers to deploy machine learning pipelines from Spark and Scikit-learn to a portable format and execution engine.
• A JSON-based serialization
• A runtime execution engine
• Benchmarks
▶ http://mleap-docs.combust.ml/core-concepts/transformers/support.html
MLeap Serialization
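The combination of a JSON-based serialization and a separate runtime engine can be sketched as below. This is not MLeap's bundle format: the `steps`/`op` schema, the operation names, and the helpers `export_pipeline` and `run_pipeline` are assumptions made for illustration.

```python
import json


def export_pipeline(steps):
    """Write the fitted pipeline as a human-readable JSON document."""
    return json.dumps({"steps": steps}, indent=2)


def run_pipeline(document, features):
    """Tiny runtime: execute each serialized step without the training tool."""
    values = list(features)
    for step in json.loads(document)["steps"]:
        if step["op"] == "standard_scaler":
            values = [(v - m) / s
                      for v, m, s in zip(values, step["mean"], step["std"])]
        elif step["op"] == "linear_regression":
            dot = sum(c * v for c, v in zip(step["coefficients"], values))
            values = [dot + step["intercept"]]
        else:
            # Every new transformer type needs its own runtime code.
            raise ValueError(f"unsupported op: {step['op']}")
    return values


# Hypothetical fitted parameters for a two-step pipeline.
document = export_pipeline([
    {"op": "standard_scaler", "mean": [1.0, 2.0], "std": [2.0, 4.0]},
    {"op": "linear_regression", "coefficients": [0.5, 0.25], "intercept": 1.0},
])

print(run_pipeline(document, [3.0, 10.0]))  # prints [2.0]
```

The `unsupported op` branch shows why coverage matters for this design: the runtime can only serve transformers it has explicit code for.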
Summary
Pros
▶ Portable models between Spark and Sklearn
▶ Human-readable model
▶ Easy model serving
Cons
▶ Support matrix is incomplete
▶ Extensibility
• Need to write code for each estimator/transformer
▶ Tensorflow support needs a custom build of the tf-java binding and is still experimental
Wrap up
▶ Seldon integrates tightly with K8s to support scalable model serving, and its graph function is powerful.
▶ Clipper provides good interaction, but the code is not stable enough.
▶ MLflow's model serving is simple, with fewer functions.
▶ MLeap aims to provide interoperation between different tools, which is very nice, but there is still a long way to go to support all the features.
• PMML is not covered
▶ Some other tools are not touched
• MXNet Model Server
• Oracle GraphPipe
Tool | Model Persistence | ML Tools | K8s Integration | Version | License | Implementation
Seldon Core | S2i + Pickle | Tensorflow, Sklearn, Keras, R, H2O, Nodejs, PMML | Yes | 0.3.2 | Apache | Docker + K8s CRD
Clipper | Pickle | Python, PySpark, PyTorch, Tensorflow, MXNet, Custom Container | Yes | 0.3.0 | Apache | CPP / Python
MLflow | Directory + Metadata | Python, H2O, Keras, MLeap, PyTorch, Sklearn, Spark, Tensorflow, R | No | Alpha | Apache | Python
MLeap | | Spark, Sklearn, Tensorflow | No | 0.12.0 | Apache | Scala/Java
Other findings
▶ Enabling Spark is not easy
• Versions: pyspark version, Java version
• Build the Spark image with glibc support
• "Java gateway process exited before sending its port number"
• Accessing Spark from K8s is not easy
▶ Some K8s pods stay pending with Unknown status
• kubectl delete pod {} --grace-period=0 --force
▶ Building your own ML image from python is not easy; using continuumio/miniconda may save you some time
▶ Use batch commands to clean up Docker images
• docker images | grep "something_to_search" | awk '{print $1 ":" $2}' | xargs docker rmi -f
• docker system prune

Contenu connexe

Tendances

MLflow: Infrastructure for a Complete Machine Learning Life Cycle
MLflow: Infrastructure for a Complete Machine Learning Life CycleMLflow: Infrastructure for a Complete Machine Learning Life Cycle
MLflow: Infrastructure for a Complete Machine Learning Life CycleDatabricks
 
DAIS Europe Nov. 2020 presentation on MLflow Model Serving
DAIS Europe Nov. 2020 presentation on MLflow Model ServingDAIS Europe Nov. 2020 presentation on MLflow Model Serving
DAIS Europe Nov. 2020 presentation on MLflow Model Servingamesar0
 
Hamburg Data Science Meetup - MLOps with a Feature Store
Hamburg Data Science Meetup - MLOps with a Feature StoreHamburg Data Science Meetup - MLOps with a Feature Store
Hamburg Data Science Meetup - MLOps with a Feature StoreMoritz Meister
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsDataPhoenix
 
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML PlatformHow to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML PlatformDatabricks
 
Ml ops intro session
Ml ops   intro sessionMl ops   intro session
Ml ops intro sessionAvinash Patil
 
Deep Learning for Natural Language Processing Using Apache Spark and TensorFl...
Deep Learning for Natural Language Processing Using Apache Spark and TensorFl...Deep Learning for Natural Language Processing Using Apache Spark and TensorFl...
Deep Learning for Natural Language Processing Using Apache Spark and TensorFl...Databricks
 
Mlflow with databricks
Mlflow with databricksMlflow with databricks
Mlflow with databricksLiangjun Jiang
 
“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOps“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOpsRui Quintino
 
Nasscom ml ops webinar
Nasscom ml ops webinarNasscom ml ops webinar
Nasscom ml ops webinarSameer Mahajan
 
MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestratio...
MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestratio...MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestratio...
MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestratio...Databricks
 
Automating machine learning lifecycle with kubeflow
Automating machine learning lifecycle with kubeflowAutomating machine learning lifecycle with kubeflow
Automating machine learning lifecycle with kubeflowStepan Pushkarev
 
Why is dev ops for machine learning so different - dataxdays
Why is dev ops for machine learning so different  - dataxdaysWhy is dev ops for machine learning so different  - dataxdays
Why is dev ops for machine learning so different - dataxdaysRyan Dawson
 
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowMLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowJan Kirenz
 
Productionalizing Models through CI/CD Design with MLflow
Productionalizing Models through CI/CD Design with MLflowProductionalizing Models through CI/CD Design with MLflow
Productionalizing Models through CI/CD Design with MLflowDatabricks
 
MLOps - The Assembly Line of ML
MLOps - The Assembly Line of MLMLOps - The Assembly Line of ML
MLOps - The Assembly Line of MLJordan Birdsell
 
TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform Seldon
 
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and PrometheusRobust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and PrometheusManasi Vartak
 

Tendances (20)

MLflow: Infrastructure for a Complete Machine Learning Life Cycle
MLflow: Infrastructure for a Complete Machine Learning Life CycleMLflow: Infrastructure for a Complete Machine Learning Life Cycle
MLflow: Infrastructure for a Complete Machine Learning Life Cycle
 
DAIS Europe Nov. 2020 presentation on MLflow Model Serving
DAIS Europe Nov. 2020 presentation on MLflow Model ServingDAIS Europe Nov. 2020 presentation on MLflow Model Serving
DAIS Europe Nov. 2020 presentation on MLflow Model Serving
 
Hamburg Data Science Meetup - MLOps with a Feature Store
Hamburg Data Science Meetup - MLOps with a Feature StoreHamburg Data Science Meetup - MLOps with a Feature Store
Hamburg Data Science Meetup - MLOps with a Feature Store
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOps
 
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML PlatformHow to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
 
Ml ops intro session
Ml ops   intro sessionMl ops   intro session
Ml ops intro session
 
Deep Learning for Natural Language Processing Using Apache Spark and TensorFl...
Deep Learning for Natural Language Processing Using Apache Spark and TensorFl...Deep Learning for Natural Language Processing Using Apache Spark and TensorFl...
Deep Learning for Natural Language Processing Using Apache Spark and TensorFl...
 
Mlflow with databricks
Mlflow with databricksMlflow with databricks
Mlflow with databricks
 
“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOps“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOps
 
MLOps in action
MLOps in actionMLOps in action
MLOps in action
 
Nasscom ml ops webinar
Nasscom ml ops webinarNasscom ml ops webinar
Nasscom ml ops webinar
 
MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestratio...
MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestratio...MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestratio...
MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestratio...
 
Automating machine learning lifecycle with kubeflow
Automating machine learning lifecycle with kubeflowAutomating machine learning lifecycle with kubeflow
Automating machine learning lifecycle with kubeflow
 
Why is dev ops for machine learning so different - dataxdays
Why is dev ops for machine learning so different  - dataxdaysWhy is dev ops for machine learning so different  - dataxdays
Why is dev ops for machine learning so different - dataxdays
 
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowMLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
 
Productionalizing Models through CI/CD Design with MLflow
Productionalizing Models through CI/CD Design with MLflowProductionalizing Models through CI/CD Design with MLflow
Productionalizing Models through CI/CD Design with MLflow
 
Monitoring AI with AI
Monitoring AI with AIMonitoring AI with AI
Monitoring AI with AI
 
MLOps - The Assembly Line of ML
MLOps - The Assembly Line of MLMLOps - The Assembly Line of ML
MLOps - The Assembly Line of ML
 
TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform
 
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and PrometheusRobust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
 

Similaire à Scale machine learning deployment

Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsTensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsStijn Decubber
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle
MLflow: Infrastructure for a Complete Machine Learning Life CycleMLflow: Infrastructure for a Complete Machine Learning Life Cycle
MLflow: Infrastructure for a Complete Machine Learning Life CycleDatabricks
 
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning InfrastructureML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning InfrastructureFei Chen
 
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and IstioAdvanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and IstioAnimesh Singh
 
Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...
Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...
Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...Akash Tandon
 
Managing the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOpsManaging the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOpsFatih Baltacı
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ... MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...Databricks
 
How to Choose a Deep Learning Framework
How to Choose a Deep Learning FrameworkHow to Choose a Deep Learning Framework
How to Choose a Deep Learning FrameworkNavid Kalaei
 
Running Apache Spark Jobs Using Kubernetes
Running Apache Spark Jobs Using KubernetesRunning Apache Spark Jobs Using Kubernetes
Running Apache Spark Jobs Using KubernetesDatabricks
 
AI Stack on AWS: Amazon SageMaker and Beyond
AI Stack on AWS: Amazon SageMaker and BeyondAI Stack on AWS: Amazon SageMaker and Beyond
AI Stack on AWS: Amazon SageMaker and BeyondProvectus
 
Benefits of a Homemade ML Platform
Benefits of a Homemade ML PlatformBenefits of a Homemade ML Platform
Benefits of a Homemade ML PlatformGetInData
 
Distributed Deep Learning with Keras and TensorFlow on Apache Spark
Distributed Deep Learning with Keras and TensorFlow on Apache SparkDistributed Deep Learning with Keras and TensorFlow on Apache Spark
Distributed Deep Learning with Keras and TensorFlow on Apache SparkGuglielmo Iozzia
 
MLflow with Databricks
MLflow with DatabricksMLflow with Databricks
MLflow with DatabricksLiangjun Jiang
 
Strata parallel m-ml-ops_sept_2017
Strata parallel m-ml-ops_sept_2017Strata parallel m-ml-ops_sept_2017
Strata parallel m-ml-ops_sept_2017Nisha Talagala
 
Democratizing machine learning on kubernetes
Democratizing machine learning on kubernetesDemocratizing machine learning on kubernetes
Democratizing machine learning on kubernetesDocker, Inc.
 
running Tensorflow in Production
running Tensorflow in Productionrunning Tensorflow in Production
running Tensorflow in ProductionMatthias Feys
 
AI and Spark - IBM Community AI Day
AI and Spark - IBM Community AI DayAI and Spark - IBM Community AI Day
AI and Spark - IBM Community AI DayNick Pentreath
 
Triton As NLP Model Inference Back-end
 Triton As NLP Model Inference Back-end Triton As NLP Model Inference Back-end
Triton As NLP Model Inference Back-endKo Ko
 

Similaire à Scale machine learning deployment (20)

Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsTensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle
MLflow: Infrastructure for a Complete Machine Learning Life CycleMLflow: Infrastructure for a Complete Machine Learning Life Cycle
MLflow: Infrastructure for a Complete Machine Learning Life Cycle
 
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning InfrastructureML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
 
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and IstioAdvanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
 
Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...
Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...
Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...
 
Managing the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOpsManaging the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOps
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ... MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 
How to Choose a Deep Learning Framework
How to Choose a Deep Learning FrameworkHow to Choose a Deep Learning Framework
How to Choose a Deep Learning Framework
 
Running Apache Spark Jobs Using Kubernetes
Running Apache Spark Jobs Using KubernetesRunning Apache Spark Jobs Using Kubernetes
Running Apache Spark Jobs Using Kubernetes
 
Tensorflow 2.0 and Coral Edge TPU
Tensorflow 2.0 and Coral Edge TPU Tensorflow 2.0 and Coral Edge TPU
Tensorflow 2.0 and Coral Edge TPU
 
AI Stack on AWS: Amazon SageMaker and Beyond
AI Stack on AWS: Amazon SageMaker and BeyondAI Stack on AWS: Amazon SageMaker and Beyond
AI Stack on AWS: Amazon SageMaker and Beyond
 
Benefits of a Homemade ML Platform
Benefits of a Homemade ML PlatformBenefits of a Homemade ML Platform
Benefits of a Homemade ML Platform
 
Distributed Deep Learning with Keras and TensorFlow on Apache Spark
Distributed Deep Learning with Keras and TensorFlow on Apache SparkDistributed Deep Learning with Keras and TensorFlow on Apache Spark
Distributed Deep Learning with Keras and TensorFlow on Apache Spark
 
MLflow with Databricks
MLflow with DatabricksMLflow with Databricks
MLflow with Databricks
 
Strata parallel m-ml-ops_sept_2017
Strata parallel m-ml-ops_sept_2017Strata parallel m-ml-ops_sept_2017
Strata parallel m-ml-ops_sept_2017
 
Democratizing machine learning on kubernetes
Democratizing machine learning on kubernetesDemocratizing machine learning on kubernetes
Democratizing machine learning on kubernetes
 
running Tensorflow in Production
running Tensorflow in Productionrunning Tensorflow in Production
running Tensorflow in Production
 
AI and Spark - IBM Community AI Day
AI and Spark - IBM Community AI DayAI and Spark - IBM Community AI Day
AI and Spark - IBM Community AI Day
 
Triton As NLP Model Inference Back-end
 Triton As NLP Model Inference Back-end Triton As NLP Model Inference Back-end
Triton As NLP Model Inference Back-end
 

Plus de Gang Tao

Critical thinking
Critical thinkingCritical thinking
Critical thinkingGang Tao
 
Cloud monitoring
Cloud monitoringCloud monitoring
Cloud monitoringGang Tao
 
Big Data Computing Architecture
Big Data Computing ArchitectureBig Data Computing Architecture
Big Data Computing ArchitectureGang Tao
 
Splunk Spark Integration
Splunk Spark IntegrationSplunk Spark Integration
Splunk Spark IntegrationGang Tao
 
Regression
RegressionRegression
RegressionGang Tao
 
Bayesian Classification
Bayesian ClassificationBayesian Classification
Bayesian ClassificationGang Tao
 
Quality attributes in software architecture
Quality attributes in software architectureQuality attributes in software architecture
Quality attributes in software architectureGang Tao
 
Great bychoice
Great bychoiceGreat bychoice
Great bychoiceGang Tao
 
Data Science Introduction
Data Science IntroductionData Science Introduction
Data Science IntroductionGang Tao
 
Now you see it
Now you see itNow you see it
Now you see itGang Tao
 

Plus de Gang Tao (10)

Critical thinking
Critical thinkingCritical thinking
Critical thinking
 
Cloud monitoring
Cloud monitoringCloud monitoring
Cloud monitoring
 
Big Data Computing Architecture
Big Data Computing ArchitectureBig Data Computing Architecture
Big Data Computing Architecture
 
Splunk Spark Integration
Splunk Spark IntegrationSplunk Spark Integration
Splunk Spark Integration
 
Regression
RegressionRegression
Regression
 
Bayesian Classification
Bayesian ClassificationBayesian Classification
Bayesian Classification
 
Quality attributes in software architecture
Quality attributes in software architectureQuality attributes in software architecture
Quality attributes in software architecture
 
Great bychoice
Great bychoiceGreat bychoice
Great bychoice
 
Data Science Introduction
Data Science IntroductionData Science Introduction
Data Science Introduction
 
Now you see it
Now you see itNow you see it
Now you see it
 

Dernier

Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
Vishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documentsVishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documentsSachinPawar510423
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catcherssdickerson1
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substationstephanwindworld
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)dollysharma2066
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleAlluxio, Inc.
 
computer application and construction management
computer application and construction managementcomputer application and construction management
computer application and construction managementMariconPadriquez1
 
An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...Chandu841456
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfAsst.prof M.Gokilavani
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvLewisJB
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 

Dernier (20)

Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
Vishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documentsVishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documents
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substation
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls

Scale machine learning deployment

  • 2. Data Science Project Life Cycle
  • 4. Python Sklearn / Spark / Tensorflow
      ▶ Python pickle-based code serialization
      ▶ sklearn.externals.joblib
      ▶ Spark provides an API to save a model/pipeline as a file
      ▶ Tensorflow provides tf.train.Saver, which persists the tensor graph
      ▶ It is pickle + metadata + checkpoint
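As a minimal sketch of the pickle-based persistence listed above, the snippet below saves a fitted model's state to a file and reloads it. The parameter dictionary, the `predict` helper, and the file name are hypothetical stand-ins, not taken from the slides; a real sklearn estimator would be dumped the same way.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical "trained model": just the fitted parameters of a threshold rule.
model_state = {"kind": "threshold", "threshold": 0.5}


def predict(state, values):
    # Label each value 1 if it exceeds the learned threshold, else 0.
    return [1 if v > state["threshold"] else 0 for v in values]


def save_model(state, path):
    with open(path, "wb") as f:
        pickle.dump(state, f)


def load_model(path):
    # Only unpickle files you trust: loading can execute arbitrary code.
    with open(path, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pkl"  # hypothetical file name
    save_model(model_state, model_path)
    restored = load_model(model_path)

predictions = predict(restored, [0.2, 0.7, 0.5])
```

The restored object behaves like the original only if the loading process has compatible code and library versions, which is the dependency the next slide points out.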
  • 6. Issues and Limitations
      ▶ Models from different tools are not compatible
      ▶ Code serialization depends on the Python version
      ▶ Code serialization raises potential security concerns
      ▶ For a TF model, the tensor names are required (need to check whether they are in the metadata)
      ▶ A TF model depends on the custom code that defines its custom operations
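The security concern with code serialization can be illustrated with a short sketch: a pickle payload may name any callable to run at load time. The `Exploit` class and the harmless `eval("1 + 1")` payload are illustrative assumptions; the `RestrictedUnpickler` follows the `find_class` override pattern described in the Python pickle documentation and is one possible mitigation, not something the slides prescribe.

```python
import io
import pickle


class Exploit:
    """Hypothetical malicious object: unpickling it calls eval()."""

    def __reduce__(self):
        # pickle stores "call eval with this argument" in the payload.
        return (eval, ("1 + 1",))


class RestrictedUnpickler(pickle.Unpickler):
    # Globals a payload may reference; empty means plain data only.
    ALLOWED = set()

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    return RestrictedUnpickler(io.BytesIO(data)).load()


payload = pickle.dumps(Exploit())

# A plain pickle.loads runs the embedded call.
executed_result = pickle.loads(payload)

# The restricted loader refuses the payload but still accepts plain data.
try:
    restricted_loads(payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

safe_data = restricted_loads(pickle.dumps({"weights": [0.1, 0.2]}))
```

An allow-list like this only helps for models whose state is plain data; models that pickle arbitrary classes still require trusting the source of the file.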
  • 7. A simple view of model deployment
  • 8. ML Deployment Challenges
      ▶ Enable a wide range of ML modeling tools: Python, R, Tensorflow, Spark
      ▶ Scale up and down
      ▶ Performance, latency optimization
      ▶ Accessing model, API
      ▶ Audit and versioning
      ▶ CI/CD
      ▶ Metrics and monitoring
      ▶ Optimization, A/B tests
  • 10. Seldon
      ▶ Seldon, a London company, focuses on providing control over machine learning based on open source software
      ▶ Seldon Core is an open source platform for deploying machine learning models on Kubernetes
          • Python/Spark/H2O/R model support
          • REST and gRPC API
          • Deploys an inference graph of Models/Routers/Combiners/Transformers as microservices
          • Leverages K8s to provide scale, security, monitoring, etc.
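To make the router node of an inference graph concrete, here is a rough sketch of an A/B-test router in plain Python. It assumes a router exposes a `route(features, feature_names)` method that returns the index of the child to call; that interface, the class name, and the 80/20 split are assumptions made for illustration and should be checked against the Seldon Core documentation for the version in use.

```python
import random


class ABTestRouter:
    """Hypothetical router: sends a share of traffic to child 0 (model A)
    and the remainder to child 1 (model B)."""

    def __init__(self, ratio_a=0.5, seed=None):
        if not 0.0 <= ratio_a <= 1.0:
            raise ValueError("ratio_a must be in [0, 1]")
        self.ratio_a = ratio_a
        self._rng = random.Random(seed)

    def route(self, features, feature_names):
        # Return the index of the child that should handle this request.
        return 0 if self._rng.random() < self.ratio_a else 1


router = ABTestRouter(ratio_a=0.8, seed=42)
routes = [router.route([[1.0, 2.0]], ["x1", "x2"]) for _ in range(1000)]
share_a = routes.count(0) / len(routes)
```

In a deployed graph each child index would map to a separate model microservice, so the split can be changed without touching the models themselves.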
  • 15. Summary
      Pros
      ▶ Seamless K8s integration
      ▶ Graph definition to support A/B tests and ensembling
      Cons
      ▶ No Scala support for Spark
      ▶ Needs a custom image for PySpark
      ▶ No customization support for liveness/readiness checks, due to the CRD
  • 17. Clipper
      ▶ Clipper.ai is a system developed by the UC Berkeley RISE Lab.
      ▶ Clipper is a prediction serving system that sits between user-facing applications and a wide range of commonly used machine learning models and frameworks.
  • 20. Summary
      Pros
      ▶ Easy-to-use interactive model deployment
      ▶ Supports Docker and K8s
      ▶ Query latency objective support
      ▶ Model version management
          • Update and rollback
      Cons
      ▶ Cloudpickle version issue
      ▶ Python only
      ▶ Fewer examples/documents
      ▶ Not friendly to AWS
          • use_internal_ip does not work well
          • Need to manually create a repo for the model
          • Failed to pull image from ECR
      ▶ Cluster creation is not stable
      ▶ Tensorflow failed to pickle
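The "update and rollback" idea behind model version management can be sketched with a small in-memory registry. This is an illustration of the concept only, not Clipper's implementation or API; the `ModelRegistry` class and the two lambda "models" are hypothetical.

```python
class ModelRegistry:
    """Hypothetical registry tracking which model version serves traffic."""

    def __init__(self):
        self._versions = {}  # version label -> model callable
        self._history = []   # activation order, newest last

    def deploy(self, version, model):
        # Register a new version and make it the active one.
        self._versions[version] = model
        self._history.append(version)

    def rollback(self):
        # Re-activate the previously active version.
        if len(self._history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self._history.pop()
        return self._history[-1]

    @property
    def active_version(self):
        return self._history[-1] if self._history else None

    def predict(self, x):
        if self.active_version is None:
            raise RuntimeError("no model deployed")
        return self._versions[self.active_version](x)


registry = ModelRegistry()
registry.deploy("v1", lambda x: x * 2)
registry.deploy("v2", lambda x: x * 3)
after_update = registry.predict(2)

registry.rollback()
after_rollback = registry.predict(2)
```

Keeping old versions registered is what makes rollback cheap: only the pointer to the active version changes.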
  • 22. MLflow
      ▶ MLflow is an open source platform for managing the end-to-end machine learning lifecycle.
      ▶ MLflow is developed by Databricks
  • 25. Summary
      Pros
      ▶ Flexible
      ▶ Easy to use with Sklearn
      ▶ Cloud integration to support SageMaker and Azure
      Cons
      ▶ No K8s integration
      ▶ Spark/Tensorflow support is based on Python
      ▶ Projects are better managed by container
  • 26. MLeap
  • 27. MLeap
      ▶ MLeap allows data scientists and engineers to deploy machine learning pipelines from Spark and Scikit-learn to a portable format and execution engine.
          • A JSON-based serialization
          • A runtime execution engine
          • Benchmarks
      ▶ http://mleap-docs.combust.ml/core-concepts/transformers/support.html
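A JSON-based serialization stores a model as readable parameters rather than pickled code, which is what makes it portable across runtimes. The sketch below shows the idea for a linear model; the field names and helper functions are hypothetical and do not reproduce MLeap's actual bundle format.

```python
import json


def export_linear_model(coefficients, intercept, feature_names):
    # Describe the model as plain data that any runtime can parse.
    bundle = {
        "type": "linear_regression",
        "features": list(feature_names),
        "coefficients": list(coefficients),
        "intercept": intercept,
    }
    return json.dumps(bundle, indent=2, sort_keys=True)


def load_and_predict(serialized, row):
    # A minimal "execution engine": rebuild the prediction from the JSON.
    bundle = json.loads(serialized)
    weighted = sum(
        coef * row[name]
        for coef, name in zip(bundle["coefficients"], bundle["features"])
    )
    return bundle["intercept"] + weighted


serialized = export_linear_model([2.0, -1.0], 0.5, ["x1", "x2"])
prediction = load_and_predict(serialized, {"x1": 3.0, "x2": 1.0})
```

The cost of this approach is that every estimator or transformer needs its own export and execution code, which matches the extensibility concern in the summary.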
  • 31. Summary
      Pros
      ▶ Portable model between Spark and Sklearn
      ▶ Human-readable model
      ▶ Easy model serving
      Cons
      ▶ Support matrix is incomplete
      ▶ Extensibility
          • Need to write code for each estimator/transformer
      ▶ Supporting Tensorflow requires a custom-built tf-java binding, and is still experimental
  • 33. Wrap up
      ▶ Seldon integrates tightly with K8s to scale model serving, and its graph function is powerful.
      ▶ Clipper provides good interaction, but its code is not stable enough
      ▶ MLflow's model serving is simple, with fewer functions
      ▶ MLeap aims to provide interoperation between different tools, which is very nice, but there is still a long way to go to support all the features.
          • PMML is not covered
      ▶ Some other tools are not touched
          • MXNet Model Server
          • Oracle GraphPipe
  • 34. Comparison (columns: Model Persistent | ML Tools | K8s Integration | Version | License | Implementation)
      ▶ Seldon Core: S2i + Pickle | Tensorflow, Sklearn, Keras, R, H2O, Nodejs, PMML | Yes | 0.3.2 | Apache | Docker + K8s CRD
      ▶ Clipper: Pickle | Python, PySpark, PyTorch, Tensorflow, MXNet, Custom Container | Yes | 0.3.0 | Apache | CPP / Python
      ▶ MLflow: Directory + Metadata | Python, H2O, Keras, MLeap, PyTorch, Sklearn, Spark, Tensorflow, R | No | Alpha | Apache | Python
      ▶ MLeap: (not given) | Spark, Sklearn, Tensorflow | No | 0.12.0 | Apache | Scala/Java
  • 36. Some other findings
      ▶ Enabling Spark is not easy
          • Version, PySpark version, Java version
          • Build Spark image with glibc support
          • Java gateway process exited before sending its port number
          • Accessing Spark from K8s is not easy
      ▶ Some K8s pods are pending with Unknown status
          • kubectl delete pod {} --grace-period=0 --force
      ▶ Building your own ML image from python is not easy; using continuumio/miniconda may save you some time
      ▶ Using batch commands to clean up Docker images
          • docker images | grep "something_to_search" | awk '{print $1 ":" $2}' |xargs docker rmi -f
          • docker system prune
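For readers who prefer to script the image cleanup, the grep and awk stages of the pipeline above can be mirrored in Python as a rough sketch. The helper name and the sample listing are hypothetical, and the match is a plain substring test rather than a grep regular expression.

```python
def select_image_refs(listing, pattern):
    """Return "repository:tag" for each listing line that contains pattern."""
    refs = []
    for line in listing.splitlines():
        if pattern not in line:
            continue
        fields = line.split()
        if len(fields) >= 2:
            # Same as awk '{print $1 ":" $2}': first two columns joined by ":".
            refs.append(f"{fields[0]}:{fields[1]}")
    return refs


# Hypothetical output of `docker images`, used only to exercise the helper.
sample_listing = """\
REPOSITORY          TAG       IMAGE ID       CREATED        SIZE
demo/model-server   0.1       aaaaaaaaaaaa   2 days ago     1.2GB
demo/model-server   0.2       bbbbbbbbbbbb   1 day ago      1.2GB
python              3.6       cccccccccccc   3 weeks ago    900MB
"""

refs = select_image_refs(sample_listing, "model-server")
```

The resulting list could then be passed to `docker rmi -f`, for example through `subprocess.run(["docker", "rmi", "-f", *refs])`, after reviewing it to confirm nothing needed is included.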
  • 39. References
      ▶ https://cmry.github.io/notes/serialize
      ▶ https://cmry.github.io/notes/serialize-sk
      ▶ https://github.com/hiveml/simple-ml-serving
      ▶ https://medium.com/@vikati/the-rise-of-the-model-servers-9395522b6c58
      ▶ https://qconsp.com/system/files/presentation-slides/qconsp18-deployingml-may18-npentreath.pdf
      ▶ https://www.slideshare.net/dscrankshaw/veloxampcamp5-final