Scale machine learning deployment

Scale Machine Learning Deployment
Gang Tao

Data Science Project Life Cycle

▶ Python pickle-based code serialization
▶ sklearn.externals.joblib
▶ Spark provides an API to save a model/pipeline as a file
▶ TensorFlow provides tf.train.Saver, which persists the tensor graph
▶ It is pickle + metadata + checkpoint
Python Sklearn / Spark / Tensorflow
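
A minimal sketch of the pickle-based approach from this slide, using only the standard library. `ThresholdModel` is a hypothetical stand-in for a trained estimator, not a class from any of the tools named here; with scikit-learn the same round trip is usually done through joblib's `dump` and `load`.

```python
import pickle


class ThresholdModel:
    """Hypothetical stand-in for a trained model (not a real library class)."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, values):
        # Label each value 1 if it is above the learned threshold, else 0.
        return [1 if v > self.threshold else 0 for v in values]


model = ThresholdModel(threshold=0.5)

# Serialize the whole Python object to bytes; this is what gets written to disk.
blob = pickle.dumps(model)

# Deserializing needs the ThresholdModel class to be importable, which is why
# pickled models stay tied to the code that produced them.
restored = pickle.loads(blob)
predictions = restored.predict([0.2, 0.7, 0.9])
```

The bytes only record the class name and the instance state, so the serving side must ship the same class definition alongside the file.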

▶ Models from different tools are not compatible
▶ Code serialization depends on the Python version
▶ Code serialization has potential security concerns
▶ For a TF model, the tensor names are required (need to check whether they are in the metadata)
▶ A TF model depends on any custom code that defines custom operations
Issues and Limitations
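
The security concern can be illustrated with a short, harmless sketch: a pickle file can instruct the loader to call a function instead of rebuilding an object. `Payload` is a hypothetical class made up for this example, and `os.getcwd` is used only because it is safe to call.

```python
import os
import pickle


class Payload:
    """Illustrative object whose pickled form calls a function on load."""

    def __reduce__(self):
        # pickle stores "call os.getcwd with no arguments" instead of object
        # state. A hostile file could name any importable callable here.
        return (os.getcwd, ())


untrusted_bytes = pickle.dumps(Payload())

# Loading runs the stored call; the result is not a Payload instance at all.
loaded = pickle.loads(untrusted_bytes)
ran_function = (loaded == os.getcwd())
```

This is why model files in a pickle-based format should only be loaded from sources that are trusted.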

A simple view of model deployment

▶ Enable a wide range of ML modeling tools: Python, R, TensorFlow, Spark
▶ Scale up and down
▶ Performance and latency optimization
▶ Accessing model, API
▶ Audit and versioning
▶ CI/CD
▶ Metrics and monitoring
▶ Optimization, A/B tests
ML Deployment Challenges

▶ Seldon is a London-based company focused on providing control over machine learning, based on open source software
▶ Seldon Core is an open source platform for deploying machine learning models on Kubernetes
• Python/Spark/H2O/R model support
• REST and gRPC APIs
• Deploys an inference graph of models/routers/combiners/transformers as microservices
• Leverages K8s to provide scaling, security, monitoring, etc.
Seldon

Pros
▶ Seamless K8s integration
▶ Graph definition to support A/B testing and ensembling
Cons
▶ No Scala support for Spark
▶ Needs a custom image for PySpark
▶ No customization of liveness/readiness checks, due to the CRD
Summary
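
The A/B testing that a router node in an inference graph enables can be pictured with a small sketch. This is not Seldon's API; `model_a`, `model_b`, and `ABRouter` are hypothetical names, and the 80/20 split is an arbitrary choice for illustration.

```python
import random


def model_a(features):
    # Hypothetical baseline model: sum of the features.
    return sum(features)


def model_b(features):
    # Hypothetical candidate model: mean of the features.
    return sum(features) / len(features)


class ABRouter:
    """Sends each request to one of two models according to a traffic split."""

    def __init__(self, ratio_a, seed=None):
        self.ratio_a = ratio_a
        self.rng = random.Random(seed)

    def route(self, features):
        # Draw in [0, 1): values below ratio_a go to A, the rest to B.
        if self.rng.random() < self.ratio_a:
            return "a", model_a(features)
        return "b", model_b(features)


router = ABRouter(ratio_a=0.8, seed=42)
results = [router.route([1.0, 2.0, 3.0]) for _ in range(1000)]
share_a = sum(1 for name, _ in results if name == "a") / len(results)
```

In a real deployment each branch would be a separate microservice, and the router would also record which branch served each request so the two models can be compared.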

▶ Clipper.ai is a system developed by the UC Berkeley RISE Lab.
▶ Clipper is a prediction serving system that sits between user-facing applications and a wide range of commonly used machine learning models and frameworks.
Clipper

Pros
▶ Easy-to-use interactive model deployment
▶ Supports Docker and K8s
▶ Query latency objective support
▶ Model version management
• Update and rollback
Cons
▶ cloudpickle version issues
▶ Python only
▶ Few examples and little documentation
▶ Not friendly to AWS
• use_internal_ip does not work well
• Need to manually create a repo for the model
• Failed to pull image from ECR
▶ Cluster creation is not stable
▶ TensorFlow models failed to pickle
Summary
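
The update and rollback behaviour listed under model version management can be sketched as a tiny in-memory registry. This is an illustration only, not Clipper's implementation; `ModelRegistry` and its methods are hypothetical names.

```python
class ModelRegistry:
    """Keeps every deployed version and tracks which one serves traffic."""

    def __init__(self):
        self.versions = {}   # version label -> prediction function
        self.history = []    # activation order, newest last

    def deploy(self, version, predict_fn):
        # Register a new version and make it the active one.
        self.versions[version] = predict_fn
        self.history.append(version)

    def rollback(self):
        # Drop the newest activation and fall back to the previous one.
        if len(self.history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self.history.pop()
        return self.history[-1]

    def predict(self, x):
        # Always serve with the most recently activated version.
        return self.versions[self.history[-1]](x)


registry = ModelRegistry()
registry.deploy("1", lambda x: x * 2)
registry.deploy("2", lambda x: x * 3)
after_update = registry.predict(10)    # served by version "2"
registry.rollback()
after_rollback = registry.predict(10)  # served by version "1" again
```

Keeping old versions registered is what makes rollback cheap: only the pointer to the active version changes.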

▶ MLflow is an open source platform for managing the end-to-end machine
learning lifecycle.
▶ MLflow is developed by Databricks
MLflow

Pros
▶ Flexible
▶ Easy to use with sklearn
▶ Cloud integration with SageMaker and Azure
Cons
▶ No K8s integration
▶ Spark/TensorFlow support is based on Python
▶ Projects would be better managed by containers
Summary

▶ MLeap allows data scientists and engineers to deploy machine learning pipelines from Spark and Scikit-learn to a portable format and execution engine.
• JSON-based serialization
• A runtime execution engine
• Benchmarks
▶ http://mleap-docs.combust.ml/core-concepts/transformers/support.html
MLeap
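
The idea of a JSON-based serialization paired with a runtime execution engine can be sketched as follows. The field names and the two operations are invented for this example and are not MLeap's actual bundle schema.

```python
import json

# Hypothetical pipeline description: a standard scaler followed by a linear
# model. The field names are invented for this sketch, not MLeap's schema.
pipeline = {
    "steps": [
        {"op": "scale", "mean": [1.0, 2.0], "std": [2.0, 4.0]},
        {"op": "linear", "weights": [0.5, -1.0], "intercept": 0.25},
    ]
}

# The serialized form is plain, human-readable text.
serialized = json.dumps(pipeline, indent=2)


def run_pipeline(serialized_pipeline, row):
    """Tiny runtime: executes the steps without the library that trained them."""
    values = list(row)
    for step in json.loads(serialized_pipeline)["steps"]:
        if step["op"] == "scale":
            values = [(v - m) / s
                      for v, m, s in zip(values, step["mean"], step["std"])]
        elif step["op"] == "linear":
            total = sum(w * v for w, v in zip(step["weights"], values))
            values = [total + step["intercept"]]
        else:
            # Every operation needs its own runtime code, so unknown ones fail.
            raise ValueError("unsupported op: " + step["op"])
    return values[0]


prediction = run_pipeline(serialized, [3.0, 6.0])
```

Because the runtime only understands the operations it has code for, each new estimator or transformer has to be implemented on the serving side as well.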

Pros
▶ Portable models between Spark and sklearn
▶ Human-readable model
▶ Easy model serving
Cons
▶ Support matrix is incomplete
▶ Extensibility
• Need to write code for each estimator/transformer
▶ TensorFlow support requires a custom-built tf-java binding and is still experimental
Summary

▶ Seldon integrates tightly with K8s to support scalable model serving, and its graph function is powerful.
▶ Clipper provides good interaction, but the code is not stable enough
▶ MLflow's model serving is simple, with fewer features
▶ MLeap aims to provide interoperability between different tools, which is very nice, but there is still a long way to go to support all the features.
• PMML is not covered
▶ Some other tools were not covered
• MXNet Model Server
• Oracle GraphPipe
Wrap up

| Tool | Model persistence | ML tools | K8s integration | Version | License | Implementation |
| --- | --- | --- | --- | --- | --- | --- |
| Seldon Core | S2i + Pickle | TensorFlow, sklearn, Keras, R, H2O, Node.js, PMML | Yes | 0.3.2 | Apache | Docker + K8s CRD |
| Clipper | Pickle | Python, PySpark, PyTorch, TensorFlow, MXNet, custom container | Yes | 0.3.0 | Apache | C++ / Python |
| MLflow | Directory + Metadata | Python, H2O, Keras, MLeap, PyTorch, sklearn, Spark, TensorFlow, R | No | Alpha | Apache | Python |
| MLeap | | Spark, sklearn, TensorFlow | No | 0.12.0 | Apache | Scala/Java |
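
The "Directory + Metadata" persistence style in the comparison can be pictured with a short sketch: the model artifact and a small descriptor live side by side in one folder. The file names and metadata fields below are hypothetical and do not reproduce MLflow's actual format.

```python
import json
import pickle
import tempfile
from pathlib import Path


def save_model(model, directory, flavor, version):
    """Write the model bytes plus a metadata file into one directory."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    (directory / "model.pkl").write_bytes(pickle.dumps(model))
    # Hypothetical metadata fields; real tools define their own schema.
    metadata = {"flavor": flavor, "version": version, "artifact": "model.pkl"}
    (directory / "metadata.json").write_text(json.dumps(metadata))


def load_model(directory):
    """Read the metadata first, then load the artifact it points to."""
    directory = Path(directory)
    metadata = json.loads((directory / "metadata.json").read_text())
    model = pickle.loads((directory / metadata["artifact"]).read_bytes())
    return model, metadata


with tempfile.TemporaryDirectory() as tmp:
    # A dict of coefficients stands in for a trained model.
    save_model({"weights": [0.1, 0.2]}, Path(tmp) / "my_model", "python", "1")
    loaded_model, loaded_meta = load_model(Path(tmp) / "my_model")
```

Reading the metadata before the artifact lets a serving layer decide how to load a model without opening the artifact itself.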

▶ Enabling Spark is not easy
• Version issues: PySpark version, Java version
• Need to build the Spark image with glibc support
• Error: "Java gateway process exited before sending its port number"
• Accessing Spark from K8s is not easy
▶ Some K8s pods stay pending with Unknown status
• kubectl delete pod {} --grace-period=0 --force
▶ Building your own ML image from python is not easy; using continuumio/miniconda may save you some time
▶ Use batch commands to clean up Docker images
• docker images | grep "something_to_search" | awk '{print $1 ":" $2}' | xargs docker rmi -f
• docker system prune
Some other findings

▶ https://cmry.github.io/notes/serialize
▶ https://cmry.github.io/notes/serialize-sk
▶ https://github.com/hiveml/simple-ml-serving
▶ https://medium.com/@vikati/the-rise-of-the-model-servers-9395522b6c58
▶ https://qconsp.com/system/files/presentation-slides/qconsp18-deployingml-may18-npentreath.pdf
▶ https://www.slideshare.net/dscrankshaw/veloxampcamp5-final
References
