23 April 2019
IMPROVING HOW WE DELIVER MACHINE LEARNING MODELS
WHO ARE WE?
David Tan
Developer @ ThoughtWorks
Jonathan Heng
Developer @ ThoughtWorks
THE PLAN TODAY
Why do we need to improve the ML workflow?
What are some better practices?
How can we practice these practices?
TEMPERATURE CHECK
Who has...
● Trained an ML model before?
● Deployed an ML model for fun?
● Deployed an ML model at work?
● Deployed a model using an automated CI pipeline?
WHAT’S THE PROBLEM?
OBSERVATION
One of these is not like the others
Continuous delivery practices can help us change this.
JOURNEY ON AN ML PROJECT
[Diagram: Jupyter Notebooks → ??? → Production]
IT’S NEVER JUST ML
We got 99 problems and machine learning ain’t one
Potentially unethical outcomes
How can we help people do difficult things?
Source: Machine Learning: The High Interest Credit Card of Technical Debt (Google, 2015)
HELPING PEOPLE DO DIFFICULT THINGS
Sensible defaults
● Two reference repos:
○ github.com/ThoughtWorksInc/ml-cd-starter-kit
○ github.com/ThoughtWorksInc/ml-app-template
● Better ways of working
○ 6 common problems and suggested solutions
LET’S GO
6 common problems and suggested solutions
1. WORKS ON MY MACHINE
To start, simply:
• docker build …
• docker run ...
Demo
2. NO DATA / DATA SUCKS
[Diagram labels: Business problem, ML model, Feature engineering, Data]
Mitigation measures
● Think about data access before starting project
● Collect better data with every release (more on this a few slides from now)
● “Wizard of Oz” / Fake it till we make it
○ Provide interface but without ML implementation
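One way to read the “Wizard of Oz” idea is to fix the prediction interface first and put a hand-written rule behind it, so the service can ship and start collecting data before any trained model exists. The sketch below is only an illustration: `Predictor`, `RuleBasedPredictor`, `handle_request`, the `floor_area_sqm` feature, and the pricing rule are all hypothetical names, not taken from the talk or its repos.

```python
from typing import Mapping, Protocol


class Predictor(Protocol):
    """Interface the rest of the application codes against."""

    def predict(self, features: Mapping[str, float]) -> float: ...


class RuleBasedPredictor:
    """Stand-in with no ML behind it (hypothetical rule and feature name)."""

    def __init__(self, price_per_sqm: float = 5000.0) -> None:
        self.price_per_sqm = price_per_sqm

    def predict(self, features: Mapping[str, float]) -> float:
        return features["floor_area_sqm"] * self.price_per_sqm


def handle_request(predictor: Predictor, features: Mapping[str, float]) -> dict:
    """Request handler that does not care whether the predictor is a model."""
    return {"prediction": predictor.predict(features), "inputs": dict(features)}
```

Because callers depend only on `predict`, a trained model could later replace the rule without changing `handle_request`.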
3. DEPLOYMENTS ARE COMPLICATED
Mitigation measures
● CI pipeline
● Deploy early and often
● Tracer bullet
● Bring the pain forward
3. DEPLOY EARLY AND OFTEN
[Pipeline diagram, built up over four slides. Labels: Local env, push, Source code repository, trigger, Run unit tests, Train and evaluate model, Deploy candidate model to staging, Deploy model to production, Artifact repository, feedback]
Source: Continuous Delivery (Jez Humble, Dave Farley)
4. HOW DO WE CHOOSE BETWEEN CANDIDATE MODELS?
4. CHOOSING BUILDS
Include model evaluation metrics in CI pipeline
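A simple way to put evaluation metrics into a CI pipeline is a check that fails the build when a metric misses an agreed threshold. The following is a minimal sketch under stated assumptions: the metrics are computed in plain Python, and the function name `check_metrics` and the thresholds `min_r2` and `max_rmse` are hypothetical, not values from the talk.

```python
import math
from typing import Dict, Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def check_metrics(
    y_true: Sequence[float],
    y_pred: Sequence[float],
    min_r2: float = 0.6,      # hypothetical threshold
    max_rmse: float = 50.0,   # hypothetical threshold
) -> Dict[str, float]:
    """Return the metrics, raising ValueError if a threshold is missed."""
    metrics = {"r2": r2(y_true, y_pred), "rmse": rmse(y_true, y_pred)}
    if metrics["r2"] < min_r2:
        raise ValueError(f"r2 too low: {metrics['r2']:.3f}")
    if metrics["rmse"] > max_rmse:
        raise ValueError(f"rmse too high: {metrics['rmse']:.3f}")
    return metrics
```

Called from a test in the pipeline, a raised `ValueError` fails the stage, so a candidate that regresses never reaches deployment.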
5. HOW’S THE MODEL DOING IN THE WILD?
5. OBSERVE!
Benefit #1: Feedback on production model
● Monitoring service usage
● Monitoring model output
● Monitoring model inputs
○ Could help identify training-serving skew
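Monitoring model inputs can surface training-serving skew by comparing what the model sees in production with what it saw in training. As a crude sketch only, the check below flags a feature when its mean in serving data drifts more than a few training standard deviations from the training mean. The function name `skewed_features` and the `max_shift` threshold are assumptions; a real setup would likely use a proper statistical test.

```python
import statistics
from typing import Dict, List, Sequence


def skewed_features(
    training: Dict[str, Sequence[float]],
    serving: Dict[str, Sequence[float]],
    max_shift: float = 3.0,  # hypothetical tolerance, in training std devs
) -> List[str]:
    """Names of features whose serving mean is more than `max_shift`
    training standard deviations away from the training mean."""
    flagged = []
    for name, train_values in training.items():
        mu = statistics.mean(train_values)
        sigma = statistics.pstdev(train_values)
        shift = abs(statistics.mean(serving[name]) - mu)
        # A constant training feature (sigma == 0) is flagged on any change.
        if (sigma == 0 and shift > 0) or (sigma > 0 and shift / sigma > max_shift):
            flagged.append(name)
    return flagged
```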
Benefit #2: Interpretability of predictions
Benefit #3: Closing the data collection loop
[Diagram labels: Model service, Logs, Data turking, Data / feature repository, Train & test model, Evaluate models, Deploy model. Legend: flow of data, flow of model]
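Closing the loop depends on the model service writing each prediction somewhere it can later be labelled and reused. A minimal sketch, assuming one JSON line per prediction emitted through Python's standard `logging` module; the field names and the `build_id` argument are hypothetical, and how the lines are shipped and stored is left to the log collector.

```python
import json
import logging
import time
from typing import Mapping

logger = logging.getLogger("model_service")


def prediction_record(build_id: str, inputs: Mapping[str, float], prediction: float) -> str:
    """Serialise one prediction as a single JSON line."""
    return json.dumps(
        {
            "timestamp": time.time(),
            "build_id": build_id,
            "inputs": dict(inputs),
            "prediction": prediction,
        },
        sort_keys=True,
    )


def log_prediction(build_id: str, inputs: Mapping[str, float], prediction: float) -> None:
    """Emit the record through standard logging for a log shipper to collect."""
    logger.info(prediction_record(build_id, inputs, prediction))
```

Recording the build identifier next to the inputs and output makes it possible to trace any logged prediction back to the model that produced it.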
Benefit #4: Ability to measure goodness of any model
[Pipeline diagram labels: (git push), build_and_test, evaluate_model, deploy_staging, deploy_prod, evaluate_model_w_new_data]
model = my-image:$BUILD_ID
r_2 = 0.7
rmse = 42
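Once fresh labelled data is available, any build can be scored against it, which lets a current champion be compared with candidates on data none of them were trained on. The sketch below assumes each build can be called as a plain function from input to prediction; `evaluate_builds` and the build ids are hypothetical and do not reflect how the pipeline in the talk is implemented.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

Model = Callable[[float], float]


def rmse(model: Model, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Root mean squared error of `model` on labelled examples."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def evaluate_builds(
    builds: Dict[str, Model], xs: Sequence[float], ys: Sequence[float]
) -> Tuple[str, Dict[str, float]]:
    """Score every build on the same fresh data.

    Returns the id of the build with the lowest RMSE and all scores.
    """
    scores = {build_id: rmse(model, xs, ys) for build_id, model in builds.items()}
    best = min(scores, key=scores.get)
    return best, scores
```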
5. HOW’S THE MODEL IN THE WILD?
OBSERVE!
Summing up
● Mitigation measures
○ Logging + Monitoring
● Benefits
○ Feedback on production models
○ Interpretability (how did the model decide on this particular prediction?)
○ Better data for training
○ Better (unseen) data for evaluating candidate/champion models
5. HOW’S THE MODEL IN THE WILD?
Demo: GoCD, MLflow, Kubernetes + Helm, Elasticsearch, Fluentd, Kibana, Grafana
6. HARMFUL MODELS IN PRODUCTION
● PredPol algorithm reinforces racial biases in policing data
● Recruiting tool shows bias against women
Actual news headlines
Image source: I’m an AI researcher, and here’s what scares me about AI (Rachel Thomas)
Mitigation measures
● Discuss and define what “bad” looks like in our context
● “Black mirror” retros
● Measure unfairness
○ Make fairness a measurable fitness function
● Data ethics checklist (link)
● Human-in-the-loop / appeal processes
● Ability to recover from harmful models
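To make fairness a measurable fitness function, the team first has to pick a definition. The talk does not name one, so the sketch below swaps in demographic parity (the gap in positive-prediction rates between groups) purely as an illustration. The function names and the `max_gap` tolerance are hypothetical, and other fairness definitions may suit a given context better.

```python
from collections import defaultdict
from typing import Dict, Sequence


def positive_rates(predictions: Sequence[int], groups: Sequence[str]) -> Dict[str, float]:
    """Share of positive (1) predictions within each group."""
    totals: Dict[str, int] = defaultdict(int)
    positives: Dict[str, int] = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += pred
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_gap(predictions: Sequence[int], groups: Sequence[str]) -> float:
    """Largest difference in positive-prediction rate between any two groups."""
    rates = positive_rates(predictions, groups).values()
    return max(rates) - min(rates)


def is_fair_enough(
    predictions: Sequence[int], groups: Sequence[str], max_gap: float = 0.1
) -> bool:
    """Fitness function: True when the gap is within the agreed tolerance."""
    return demographic_parity_gap(predictions, groups) <= max_gap
```

Run as a pipeline check alongside accuracy metrics, a failing `is_fair_enough` can block a build in the same way a failing unit test does.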
Demo: rollback to last good build
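The demo's own rollback mechanism is not described in the slides. As a rough sketch of the underlying idea, rolling back needs a deployment history and some record of which builds were judged good. `Build`, its `healthy` flag, and `last_good_build` below are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Build:
    build_id: str
    healthy: bool  # hypothetical flag, e.g. set after monitoring or review


def last_good_build(history: Sequence[Build], current_id: str) -> Optional[Build]:
    """Most recent healthy build deployed before `current_id`, if any.

    `history` is ordered oldest to newest. Raises ValueError when
    `current_id` is not in the history.
    """
    ids = [b.build_id for b in history]
    earlier = history[: ids.index(current_id)]
    for build in reversed(earlier):
        if build.healthy:
            return build
    return None
```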
SUMMING UP
How can we make it easier to do the right thing?
MAKE IT EASIER TO DO THE RIGHT THING
● Better ways of working
○ Environment management
○ Closing the data collection loop
○ Deploy early and often
○ Automated tracking of hyperparameters and metrics
○ Logging and monitoring
○ Do no harm
● Two reference repos:
○ github.com/ThoughtWorksInc/ml-cd-starter-kit
○ github.com/ThoughtWorksInc/ml-app-template
github.com/ThoughtWorksInc/ml-cd-starter-kit
Provision and configure cross-cutting services
● GoCD
● EFKG (Elasticsearch, Fluentd, Kibana, Grafana)
● MLflow
github.com/ThoughtWorksInc/ml-app-template
Project boilerplate template
● Unit tests
● Train model
● Test model metrics
● Dockerised setup
● Store CI pipeline as code
● Track hyperparameters and metrics of each training run on CI
● Logging (predictions, inputs, explanatory variables)
SUMMING UP
[Diagram: Notebook / playground to PROD (maybe), contrasted with a Continuous Delivery cycle of Experiment / Develop, commit and push, Test, Deploy, Monitor]
FURTHER READING
● https://www.thoughtworks.com/intelligent-empowerment
● www.continuousdelivery.com
David Tan / Jonathan Heng
davidtan+jonheng@thoughtworks.com
THANK YOU.
Improving How We Deliver Machine Learning Models (XCONF 2019)

  • 1. 23 April 2019 IMPROVING HOW WE DELIVER MACHINE LEARNING MODELS
  • 2. 2 WHO ARE WE? David Tan Developer @ ThoughtWorks Jonathan Heng Developer @ ThoughtWorks
  • 3. 3 THE PLAN TODAY Why do we need to improve the ML workflow? What are some better practices? How can we practice these practices?
  • 4. 4 TEMPERATURE CHECK Who has... ● Trained a ML model before? ● Deployed a ML model for fun? ● Deployed a ML model at work? ● Deployed a model using an automated CI pipeline?
  • 6. 6 OBSERVATION One of these is not like the others Continuous delivery practices can help us change this.
  • 7. 7 JOURNEY ON AN ML PROJECT Jupyter Notebooks Production ???
  • 8. 8 IT’S NEVER JUST ML We got 99 problems and machine learning ain’t one Potentially unethical outcomes How can we help people do difficult things? Source: Machine Learning: The High Interest Credit Card of Technical Debt (Google, 2015)
  • 9. 9 HELPING PEOPLE DO DIFFICULT THINGS Sensible defaults ● Two reference repos: ○ github.com/ThoughtWorksInc/ml-cd-starter-kit ○ github.com/ThoughtWorksInc/ml-app-template ● Better ways of working ○ 6 common problems and suggested solutions
  • 10. 10 LET’S GO 6 common problems and suggested solutions
  • 11. 1WORKS ON MY MACHINE 11
  • 12. 1212 1. WORKS ON MY MACHINE To start, simply: • docker build … • docker run ...
  • 13. 1313 1. WORKS ON MY MACHINE Demo
  • 15. Business problem ML model Feature engineering Data 1515 2. NO DATA / DATA SUCKS Mitigation measures ● Think about data access before starting project ● Collect better data with every release (more on this a few slides from now) ● “Wizard of Oz” / Fake it till we make it ○ Provide interface but without ML implementation
  • 17. 3. DEPLOYMENTS ARE COMPLICATED Mitigation measures ● CI pipeline ● Deploy early and often ● Tracer bullet ● Bring the pain forward
  • 18. 3. DEPLOY EARLY AND OFTEN Source: Continuous Delivery (Jez Humble, Dave Farley) feedback Run unit tests push Source code repository trigger Local env
  • 19. 3. DEPLOY EARLY AND OFTEN feedback Run unit tests Train and evaluate model push Source code repository trigger Local env Source: Continuous Delivery (Jez Humble, Dave Farley)
  • 20. 3. DEPLOY EARLY AND OFTEN feedback Run unit tests Deploy candidate model to staging Train and evaluate model push Source code repository trigger Local env Artifact repository Source: Continuous Delivery (Jez Humble, Dave Farley)
  • 21. 3. DEPLOY EARLY AND OFTEN Run unit tests Deploy candidate model to staging Deploy model to production Train and evaluate model push Source code repository trigger Local env Artifact repository Source: Continuous Delivery (Jez Humble, Dave Farley)
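The pipeline built up over these slides runs its stages in order, and a failure stops everything after it. A minimal sketch of that control flow follows. The stage names paraphrase the slide labels, and the `run_pipeline` helper and placeholder steps are assumptions for illustration, not how GoCD or any CI server is configured.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order; stop at the first one that reports failure.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, step in stages:
        if not step():
            break
        completed.append(name)
    return completed


# Placeholder steps: each lambda stands in for real work (hypothetical).
example_stages: List[Stage] = [
    ("run_unit_tests", lambda: True),
    ("train_and_evaluate_model", lambda: True),
    ("deploy_to_staging", lambda: False),  # simulate a failed staging deploy
    ("deploy_to_production", lambda: True),
]
```

With the simulated failure above, production is never reached, which is the point of putting staging in front of it.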
  • 22. 4. HOW DO WE CHOOSE BETWEEN CANDIDATE MODELS?
  • 23. 4. CHOOSING BUILDS Include model evaluation metrics in CI pipeline
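Including evaluation metrics in the CI pipeline could take the form of a gate that fails the build when a metric misses its bound. The sketch below is one such gate. The `passes_metric_gate` function, the metric names and the threshold values are all assumptions for illustration; real bounds depend on the project.

```python
def passes_metric_gate(metrics: dict, thresholds: dict) -> bool:
    """Return True only if every thresholded metric meets its bound.

    thresholds maps metric name -> (direction, bound), where direction is
    "min" (metric must be >= bound) or "max" (metric must be <= bound).
    """
    for name, (direction, bound) in thresholds.items():
        value = metrics[name]
        if direction == "min" and value < bound:
            return False
        if direction == "max" and value > bound:
            return False
    return True


# Assumed thresholds for illustration only.
THRESHOLDS = {"r2": ("min", 0.6), "rmse": ("max", 50.0)}
```

A CI step could call this after training and exit non-zero when it returns `False`, so a weak candidate never reaches staging.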
  • 25. 5. OBSERVE! Monitoring service usage Benefit #1: Feedback on production model
  • 26. 5. OBSERVE! Monitoring model output Benefit #1: Feedback on production model
  • 27. 5. OBSERVE! Monitoring model inputs ● Could help identify training-serving skew Benefit #1: Feedback on production model
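Monitoring model inputs for training-serving skew can start with a very simple comparison of feature statistics. The sketch below flags features whose mean at serving time sits far from the training mean. The `skewed_features` function and the three-standard-deviation default are assumptions chosen for brevity, not a method described in the talk, and a real setup would likely use a proper drift test.

```python
from statistics import mean, stdev
from typing import Dict, List


def skewed_features(
    training: Dict[str, List[float]],
    serving: Dict[str, List[float]],
    max_shift: float = 3.0,
) -> List[str]:
    """Flag features whose serving mean is far from the training mean.

    The shift is measured in training standard deviations. This is a crude
    check chosen for brevity, not a full drift test.
    """
    flagged = []
    for name, train_values in training.items():
        spread = stdev(train_values)
        if spread == 0:
            continue  # constant feature in training; skip to avoid dividing by zero
        shift = abs(mean(serving[name]) - mean(train_values)) / spread
        if shift > max_shift:
            flagged.append(name)
    return flagged
```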
  • 28. 5. OBSERVE! Benefit #2: Interpretability of predictions
  • 29. 5. OBSERVE! Benefit #3: Closing the data collection loop Data turking Train & test model Deploy model Data / feature repository data Evaluate models Flow of data Flow of model Model Service Logs
  • 30. 5. OBSERVE! Benefit #4: Ability to measure goodness of any model build_and_test deploy_staging deploy_prod evaluate_model_w_new_data (git push) evaluate_model model = my-image:$BUILD_ID r_2 = 0.7 rmse = 42
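The two metrics shown for the evaluation stage, R² and RMSE, can be computed from true and predicted values with the standard formulas. The sketch below does that in plain Python. The function signature is an assumption, and the talk does not say which library its pipeline actually uses.

```python
from math import sqrt
from typing import Dict, Sequence


def evaluate_model(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute R^2 and RMSE for a set of predictions.

    Assumes y_true is non-empty and not constant (otherwise R^2 is undefined).
    """
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {"r_2": 1 - ss_res / ss_tot, "rmse": sqrt(ss_res / n)}
```

Running the same function on freshly logged, labelled production data gives a score for any past build on data it has never seen.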
  • 31. 5. OBSERVE! Benefit #4: Ability to measure goodness of any model
  • 32. 5. HOW’S THE MODEL IN THE WILD? OBSERVE! Summing up ● Mitigation measures ○ Logging + Monitoring ● Benefits ○ Feedback on production models ○ Interpretability (how did the model decide on this particular prediction?) ○ Better data for training ○ Better (unseen) data for evaluating candidate/champion models
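The logging measure summed up here could be as small as one structured log line per prediction, which a log shipper can then forward for monitoring and later reuse as data. The sketch below assumes hypothetical field names and a logger called `model_service`; the talk does not specify a log schema.

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("model_service")


def format_prediction_log(model_version: str, features: dict, prediction: float) -> str:
    """Serialise one prediction event as a single JSON line."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "features": features,
        "prediction": prediction,
    }
    return json.dumps(event, sort_keys=True)


def log_prediction(model_version: str, features: dict, prediction: float) -> None:
    """Emit the event; a log shipper can then forward it downstream."""
    logger.info(format_prediction_log(model_version, features, prediction))
```

Logging the inputs alongside the output is what makes the later benefits possible: the same records support input monitoring and can be labelled to build new evaluation data.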
  • 33. 5. HOW’S THE MODEL IN THE WILD? Demo GoCD MLFlow Kubernetes + Helm ElasticSearch Fluentd Kibana Grafana
  • 34. 5. HOW’S THE MODEL IN THE WILD? Demo
  • 36. 6. HARMFUL MODELS IN PRODUCTION ● PredPol algorithm reinforces racial biases in policing data ● Recruiting tool shows bias against women Actual news headlines Image source: I’m an AI researcher, and here’s what scares me about AI (Rachel Thomas)
  • 37. 6. HARMFUL MODELS IN PRODUCTION Mitigation measures ● Discuss and define what “bad” looks like in our context ● “Black mirror” retros ● Measure unfairness ○ Make fairness a measurable fitness function ● Data ethics checklist (link) ● Human-in-the-loop / appeal processes ● Ability to recover from harmful models
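Making fairness a measurable fitness function means choosing a metric and a bound that a build must satisfy. The sketch below uses demographic parity difference, which is only one of several possible fairness metrics and is an assumption here; the talk does not name a specific metric. The function names and the 0.1 bound are likewise illustrative.

```python
from typing import Sequence


def demographic_parity_difference(
    predictions: Sequence[int], groups: Sequence[str]
) -> float:
    """Largest gap in positive-prediction rate between any two groups.

    predictions holds 0/1 model decisions; groups holds the group label
    for each decision. 0.0 means every group gets positives at the same rate.
    """
    totals: dict = {}
    positives: dict = {}
    for pred, group in zip(predictions, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + pred
    rates = [positives[g] / totals[g] for g in totals]
    return max(rates) - min(rates)


def is_fair_enough(predictions, groups, max_gap: float = 0.1) -> bool:
    """Fitness function: pass only if the gap stays within max_gap (assumed bound)."""
    return demographic_parity_difference(predictions, groups) <= max_gap
```

Which metric and bound are appropriate is a decision for the team and the people affected, which is why the first mitigation on the slide is to define what "bad" looks like in context.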
  • 38. 6. HARMFUL MODELS IN PRODUCTION Demo: rollback to last good build
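Rolling back requires knowing which earlier build was the last good one. The sketch below picks it from a build history. The `(build id, passed)` record shape and the `last_good_build` helper are assumptions for illustration and do not reflect how the demo's tooling stores build status.

```python
from typing import List, Optional, Tuple

Build = Tuple[str, bool]  # (build id, passed its checks?)


def last_good_build(history: List[Build], current: str) -> Optional[str]:
    """Return the most recent passing build that precedes `current`.

    history is ordered oldest to newest. Returns None if `current` is not in
    the history or no earlier build passed.
    """
    ids = [build_id for build_id, _ in history]
    if current not in ids:
        return None
    for build_id, passed in reversed(history[: ids.index(current)]):
        if passed:
            return build_id
    return None
```

The returned id would then be handed to whatever performs the redeploy; that step is left out because it depends on the deployment tooling.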
  • 39. SUMMING UP How can we make it easier to do the right thing?
  • 40. MAKE IT EASIER TO DO THE RIGHT THING ● Better ways of working ○ Environment management ○ Closing the data collection loop ○ Deploy early and often ○ Automated tracking of hyperparameters and metrics ○ Logging and monitoring ○ Do no harm ● Two reference repos: ○ github.com/ThoughtWorksInc/ml-cd-starter-kit ○ github.com/ThoughtWorksInc/ml-app-template
  • 41. Provision and configure cross-cutting services GoCD EFKG MLFlow github.com/ThoughtWorksInc/ml-cd-starter-kit github.com/ThoughtWorksInc/ml-app-template Project boilerplate template Unit tests Train model Test model metrics Dockerised setup Store CI pipeline as code Track hyperparameters and metrics of each training run on CI Logging (predictions, inputs, explanatory variables)
  • 42. SUMMING UP Notebook / playground PROD (maybe) commit and push Experiment / Develop Monitor Deploy Test Continuous Delivery
  • 44. David Tan / Jonathan Heng davidtan+jonheng@thoughtworks.com THANK YOU.