CD4ML and the challenges of testing and quality in ML systems
TensorFlow London Meetup, May 2020
Danilo Sato
@dtsato
©ThoughtWorks 2020 - @dtsato
TensorFlow London Meetup - May 28, 2020
7000+ technologists with 43 offices in 14 countries
We help clients become Modern Digital Businesses
DELIVER VALUE · MOVE FAST · THINK BIG
#1 in Agile and Continuous Delivery
100+ books written
Techniques
7. Continuous delivery for machine learning (CD4ML): TRIAL
https://www.thoughtworks.com/radar
CD4ML isn’t a technology or a tool; it is a practice and a set of principles. Quality is built into software and improvement is always possible.
But machine learning systems have unique challenges; unlike deterministic software, it is difficult, or impossible, to understand the behavior of data-driven intelligent systems. This poses a huge challenge when it comes to deploying machine learning systems in accordance with CD principles.
PRODUCTIONIZING ML IS HARD
Production systems should be:
● Reproducible
● Testable
● Auditable
● Continuously Improving
Machine Learning is:
● Non-deterministic
● Hard to test
● Hard to explain
● Hard to improve
HOW DO WE APPLY DECADES OF SOFTWARE DELIVERY EXPERIENCE TO INTELLIGENT SYSTEMS?
MANY SOURCES OF CHANGE
Data + Model + Code
● Data: Schema, Sampling over Time, Volume
● Model: Algorithms, More Training, Experiments
● Code: Business Needs, Bug Fixes, Configuration
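One way to make the three sources of change visible to a pipeline is to fingerprint each of them separately. The sketch below is only an illustration, not something shown in the talk: the function names `fingerprint` and `changed_axes`, and the idea of hashing a JSON dump, are assumptions chosen to keep the example small.

```python
import hashlib
import json


def fingerprint(data_rows, model_params, code_text):
    """Return one short hash per source of change, plus a combined release hash.

    Hypothetical helper: real pipelines would hash files or dataset versions
    rather than in-memory Python objects.
    """
    def sha(obj):
        payload = json.dumps(obj, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:12]

    parts = {
        "data": sha(data_rows),
        "model": sha(model_params),
        "code": sha(code_text),
    }
    # The release hash changes whenever any of the three parts changes.
    parts["release"] = sha(parts)
    return parts


def changed_axes(old, new):
    """List which of data / model / code differ between two fingerprints."""
    return sorted(k for k in ("data", "model", "code") if old[k] != new[k])
```

Comparing the fingerprint of the last release with the current one tells the pipeline whether new data, a new model configuration or new code triggered the run.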
“Continuous Delivery is the ability to get changes of
all types — including new features, configuration
changes, bug fixes and experiments — into
production, or into the hands of users, safely and
quickly in a sustainable way.”
- Jez Humble & Dave Farley
PRINCIPLES OF CONTINUOUS DELIVERY
→ Create a Repeatable, Reliable Process for Releasing Software
→ Automate Almost Everything
→ Build Quality In
→ Work in Small Batches
→ Keep Everything in Source Control
→ Done Means “Released”
→ Improve Continuously
TECHNICAL COMPONENTS OF CD4ML
Implementation requires lots of tools, technologies, and architecture decisions to fully automate the end-to-end process. This presentation will focus on the testing and quality aspects of CD4ML.
DOING CD4ML IS STILL A HARD PROBLEM
● DISCOVERABLE AND ACCESSIBLE DATA
● REPRODUCIBLE MODEL TRAINING
● EXPERIMENTS TRACKING
● ELASTIC INFRASTRUCTURE
● VERSION CONTROL & ARTIFACTS REPOS
● MODEL SERVING
● MODEL DEPLOYMENT
● TESTING & QUALITY
● MONITORING & OBSERVABILITY
● CD ORCHESTRATION
https://martinfowler.com/articles/cd4ml.html
“CLASSIC” SOFTWARE TEST PYRAMID
[Figure: test pyramid with Unit Tests at the base, Service Tests in the middle and UI Tests at the top; axes labelled Speed and Cost]
https://martinfowler.com/bliki/TestPyramid.html
AS SOFTWARE BECAME MORE COMPLEX
https://martinfowler.com/articles/microservice-testing
TESTING IN PRODUCTION
https://sookocheff.com/post/architecture/testing-in-production/
Data + Model + Code
??
TESTS FOR DATA
[Figure labels: Data Pipeline; Data/Feature Validation; Unit Tests (Transformations, Engineered Features)]
- Adherence to schemas
- Features can be used
- Schema versioning and compatibility
- Integration tests against (small) sample input
- Adherence to privacy controls
- On-demand quality checks
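A schema and privacy check from the list above could look roughly like the following sketch. The column names, the `EXPECTED_SCHEMA` and `FORBIDDEN_COLUMNS` constants and the `validate_rows` function are all hypothetical; they stand in for whatever schema definition and privacy rules a real project uses.

```python
# Hypothetical schema: column name -> expected Python type.
EXPECTED_SCHEMA = {"store_id": int, "date": str, "unit_sales": float}

# Hypothetical privacy control: columns that must never reach the feature set.
FORBIDDEN_COLUMNS = {"customer_email"}


def validate_rows(rows, schema=EXPECTED_SCHEMA, forbidden=FORBIDDEN_COLUMNS):
    """Return a list of human-readable violations; an empty list means the rows pass."""
    errors = []
    for i, row in enumerate(rows):
        missing = set(schema) - set(row)
        if missing:
            errors.append(f"row {i}: missing columns {sorted(missing)}")
        leaked = forbidden & set(row)
        if leaked:
            errors.append(f"row {i}: forbidden columns {sorted(leaked)}")
        for col, typ in schema.items():
            if col in row and not isinstance(row[col], typ):
                errors.append(f"row {i}: {col} should be {typ.__name__}")
    return errors
```

Run against a small sample input in the pipeline, a non-empty result fails the build before any training starts.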
TESTS FOR MODEL
[Figure labels: Unit Tests (Model Specification); Model Quality; ML Training Pipeline]
- Compare against a simple model
- Numerical stability (behaviour when NaN or infinite values appear)
- Training is reproducible (watch out for sources of non-determinism, e.g. RNG seeds, initialization order)
- Integration test
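Two of the model checks above, reproducible training and comparison against a simple model, can be sketched with the standard library alone. The tiny SGD trainer below is a stand-in for a real training job, and the names `train_linear` and `mse` are assumptions made for this illustration.

```python
import random


def train_linear(xs, ys, seed, epochs=200, lr=0.05):
    """Fit y = w * x + b with plain SGD.

    All randomness (initial weight, shuffling order) comes from one seeded
    generator, so the same seed should give exactly the same result.
    """
    rng = random.Random(seed)
    w, b = rng.uniform(-1, 1), 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            err = (w * xs[i] + b) - ys[i]
            w -= lr * err * xs[i]
            b -= lr * err
    return w, b


def mse(predict, xs, ys):
    """Mean squared error of a prediction function on (xs, ys)."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

A reproducibility test trains twice with the same seed and asserts that the weights are equal; a quality test asserts that the trained model has a lower error than a baseline that always predicts the mean of `ys`.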
Data + Model + Code
[Figure: tests across Data, Model and Code. Labels: Model Performance; Contract Tests; Model Bias and Fairness; Data Pipeline; Data/Feature Validation; Unit Tests (Transformations, Engineered Features); Unit Tests (Model Specification); Model Quality; ML Training Pipeline; UI Tests; Service Tests; Unit Tests]
- Model evaluation against different validation datasets
- Thresholds for model metrics and execution performance
- Different data slices
- Feature generation is the same for training and serving
- Model contract is adhered to in production
- When the model is exported, test that it still works
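The combination of metric thresholds and data slices can be illustrated with a small check like the one below. It is a sketch under assumptions: the record layout (`label`, `prediction` and a slice column) and the function names `accuracy_by_slice` and `failing_slices` are invented for the example, and accuracy stands in for whichever metric matters to the product.

```python
from collections import defaultdict


def accuracy_by_slice(records, slice_key):
    """Compute accuracy separately for each value of `slice_key`."""
    hits, totals = defaultdict(int), defaultdict(int)
    for r in records:
        s = r[slice_key]
        totals[s] += 1
        hits[s] += int(r["label"] == r["prediction"])
    return {s: hits[s] / totals[s] for s in totals}


def failing_slices(records, slice_key, min_accuracy):
    """Return the slices whose accuracy falls below the agreed threshold."""
    scores = accuracy_by_slice(records, slice_key)
    return {s: a for s, a in scores.items() if a < min_accuracy}
```

A model can clear the threshold on the whole validation set while one slice falls well below it, which is why the pipeline would fail on any non-empty result from `failing_slices`.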
TESTING WHERE THEY OVERLAP
[Figure: the previous test diagram, extended with End-to-End Tests, Production Monitoring and Exploratory Tests]
- Model degradation
- Training/serving skew
- Operational metrics (latency, throughput, resource usage)
- Real impact! (KPIs)
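Training/serving skew can be monitored in many ways; the sketch below uses one deliberately simple signal, the shift of a feature's mean in serving traffic measured in training standard deviations. The function names, the 0.5 threshold and the choice of statistic are assumptions for illustration, and the code assumes each training feature has at least two non-identical values.

```python
import statistics


def mean_shift(training_values, serving_values):
    """Shift of the serving mean, in units of the training standard deviation."""
    mu = statistics.mean(training_values)
    sigma = statistics.stdev(training_values)  # assumes a non-constant feature
    return abs(statistics.mean(serving_values) - mu) / sigma


def skewed_features(training, serving, max_shift=0.5):
    """Return the names of features whose serving mean drifted past `max_shift`."""
    return sorted(
        name
        for name in training
        if mean_shift(training[name], serving[name]) > max_shift
    )
```

In production this kind of check would run on a schedule against recent requests and raise an alert, or trigger retraining, when the returned list is not empty.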
“Inspection does not improve the
quality, nor guarantee quality.
Inspection is too late. The quality,
good or bad, is already in the
product.”
- W. Edwards Deming
QUESTIONS?
WORKSHOPS, PRESENTATIONS & ARTICLES
Workshops:
https://github.com/ThoughtWorksInc/cd4ml-workshop
https://github.com/ThoughtWorksInc/CD4ML-Scenarios
Articles:
https://martinfowler.com/articles/cd4ml.html
https://www.thoughtworks.com/insights/articles/intelligent-enterprise-series-cd4ml
Paper:
“The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction”, Breck et al. (Google)
THANK YOU!
Danilo Sato (dsato@thoughtworks.com)
@dtsato
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 

Dernier (20)

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 

CD4ML and the challenges of testing and quality in ML systems

  • 1. CD4ML and the challenges of testing and quality in ML systems. TensorFlow London Meetup, May 2020. Danilo Sato, @dtsato. ©ThoughtWorks 2020 - @dtsato. TensorFlow London Meetup - May 28, 2020
  • 2. 7000+ technologists with 43 offices in 14 countries. We help clients become Modern Digital Businesses. DELIVER VALUE, MOVE FAST, THINK BIG
  • 3. #1 in Agile and Continuous Delivery. 100+ books written
  • 4.
  • 5. Techniques: Continuous delivery for machine learning (CD4ML). TRIAL 7. https://www.thoughtworks.com/radar
  • 6. PRODUCTIONIZING ML IS HARD. CD4ML isn’t a technology or a tool; it is a practice and a set of principles. Quality is built into software and improvement is always possible. But machine learning systems have unique challenges; unlike deterministic software, it is difficult, or impossible, to understand the behavior of data-driven intelligent systems. This poses a huge challenge when it comes to deploying machine learning systems in accordance with CD principles. Production systems should be: ● Reproducible ● Testable ● Auditable ● Continuously Improving. HOW DO WE APPLY DECADES OF SOFTWARE DELIVERY EXPERIENCE TO INTELLIGENT SYSTEMS?
  • 7. PRODUCTIONIZING ML IS HARD. Production systems should be: ● Reproducible ● Testable ● Auditable ● Continuously Improving. Machine Learning is: ● Non-deterministic ● Hard to test ● Hard to explain ● Hard to improve. HOW DO WE APPLY DECADES OF SOFTWARE DELIVERY EXPERIENCE TO INTELLIGENT SYSTEMS?
  • 8. MANY SOURCES OF CHANGE: Data (Schema, Sampling over Time, Volume) + Model (Algorithms, More Training, Experiments) + Code (Business Needs, Bug Fixes, Configuration)
  • 9. “Continuous Delivery is the ability to get changes of all types, including new features, configuration changes, bug fixes and experiments, into production, or into the hands of users, safely and quickly in a sustainable way.” - Jez Humble & Dave Farley
  • 10. PRINCIPLES OF CONTINUOUS DELIVERY: → Create a Repeatable, Reliable Process for Releasing Software → Automate Almost Everything → Build Quality In → Work in Small Batches → Keep Everything in Source Control → Done Means “Released” → Improve Continuously
  • 11. TECHNICAL COMPONENTS OF CD4ML: discoverable and accessible data; reproducible model training; experiments tracking; elastic infrastructure; version control & artifacts repos; model serving; model deployment; testing & quality; monitoring & observability; CD orchestration. DOING CD4ML IS STILL A HARD PROBLEM: implementation requires lots of tools, technologies, and architecture decisions to fully automate the end-to-end process. This presentation will focus on the testing and quality aspects of CD4ML. https://martinfowler.com/articles/cd4ml.html
  • 12. “CLASSIC” SOFTWARE TEST PYRAMID, from top to base: UI Tests, Service Tests, Unit Tests. Axis labels: Speed, Cost. https://martinfowler.com/bliki/TestPyramid.html
  • 13. AS SOFTWARE BECAME MORE COMPLEX. https://martinfowler.com/articles/microservice-testing
  • 15. Data + Model + Code: ??
  • 16. TESTS FOR DATA. Test layers: Data Pipeline; Data/Feature Validation; Unit Tests (Transformations, Engineered Features). Checks: adherence to schemas; features can be used; schema versioning and compatibility; integration tests against (small) sample input; adherence to privacy controls; on-demand quality checks
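The "adherence to schemas" check on the slide above can be sketched as a small validation function run against a sample of pipeline input. This is a minimal illustration in plain Python, not the tooling used in the talk; the `SCHEMA` columns and the `validate_row` helper are hypothetical names invented for the example.

```python
from typing import Any

# Hypothetical schema for illustration: column name -> (expected type, nullable).
SCHEMA = {
    "user_id": (int, False),
    "age": (int, True),
    "country": (str, False),
}


def validate_row(row: dict[str, Any], schema: dict[str, tuple[type, bool]]) -> list[str]:
    """Return a list of human-readable schema violations for one row."""
    errors = []
    for column, (expected_type, nullable) in schema.items():
        if column not in row:
            errors.append(f"missing column: {column}")
            continue
        value = row[column]
        if value is None:
            if not nullable:
                errors.append(f"null in non-nullable column: {column}")
            continue
        # bool is a subclass of int in Python, so reject it explicitly
        # unless the schema really asks for bool.
        is_stray_bool = isinstance(value, bool) and expected_type is not bool
        if not isinstance(value, expected_type) or is_stray_bool:
            errors.append(
                f"wrong type for {column}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
    # Columns the schema does not know about are reported too.
    for column in sorted(row.keys() - schema.keys()):
        errors.append(f"unexpected column: {column}")
    return errors
```

In a pipeline, a function like this would typically run as an integration test against a small sample input and fail the build when any row returns a non-empty list.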
  • 17. TESTS FOR MODEL. Test layers: ML Training Pipeline; Model Quality; Unit Tests (Model Specification). Checks: compare against a simple model; numerical stability (behaviour when NaN or infinite values appear); training is reproducible (watch out for sources of non-determinism, e.g. RNG seeds, initialization order); integration test
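Three of the model checks on the slide above (reproducible training, comparison against a simple model, and behaviour on NaN or infinite values) can be sketched with a toy model. Everything here is an assumption made for illustration: the synthetic data, the one-feature linear model trained with SGD, and the mean-predictor baseline are stand-ins, not the models discussed in the talk.

```python
import math
import random


def make_data(seed: int, n: int = 200) -> list[tuple[float, float]]:
    """Synthetic data y = 3x + 1 + noise (invented for this example)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, 3.0 * x + 1.0 + rng.gauss(0.0, 0.1)))
    return data


def train(data, seed: int, epochs: int = 50, lr: float = 0.1) -> tuple[float, float]:
    """Fit y = w*x + b with SGD; all randomness comes from one seeded RNG."""
    rng = random.Random(seed)
    w, b = rng.uniform(-0.1, 0.1), 0.0
    rows = list(data)
    for _ in range(epochs):
        rng.shuffle(rows)  # shuffling order is a source of non-determinism
        for x, y in rows:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b


def predict(params: tuple[float, float], x: float) -> float:
    """Reject NaN/inf inputs instead of silently propagating them."""
    if not math.isfinite(x):
        raise ValueError("non-finite input")
    w, b = params
    return w * x + b


def mse(predictions, targets) -> float:
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def test_training_is_reproducible():
    data = make_data(seed=0)
    assert train(data, seed=42) == train(data, seed=42)


def test_model_beats_simple_baseline():
    data = make_data(seed=0)
    train_rows, holdout = data[:150], data[150:]
    params = train(train_rows, seed=42)
    targets = [y for _, y in holdout]
    # Baseline: always predict the mean label seen in training.
    mean_y = sum(y for _, y in train_rows) / len(train_rows)
    model_mse = mse([predict(params, x) for x, _ in holdout], targets)
    baseline_mse = mse([mean_y] * len(targets), targets)
    assert model_mse < 0.5 * baseline_mse


def test_non_finite_input_is_rejected():
    params = train(make_data(seed=0), seed=42)
    for bad in (float("nan"), float("inf"), float("-inf")):
        try:
            predict(params, bad)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for {bad}")
```

The "half the baseline error" margin is an arbitrary choice for the sketch; a real threshold would come from the model's own evaluation history.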
  • 18. Data + Model + Code
  • 19. TESTING WHERE THEY OVERLAP. Test layers added: Model Performance; Contract Tests; Model Bias and Fairness. Layers carried over: Data Pipeline; Data/Feature Validation; Unit Tests (Transformations, Engineered Features); Unit Tests (Model Specification); Model Quality; ML Training Pipeline; UI Tests; Service Tests; Unit Tests. Checks: model evaluation against different validation datasets; thresholds for model metrics and execution performance; different data slices; feature generation is the same for training and serving; the model contract is adhered to in production; when the model is exported, test that it still works
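The "thresholds for model metrics" and "different data slices" checks on the slide above can be combined into one test that fails when any slice falls below a minimum score. This is a sketch under stated assumptions: accuracy is used as the metric, and the record fields `label`, `prediction` and the slice key are hypothetical names chosen for the example.

```python
from collections import defaultdict


def accuracy_by_slice(examples, slice_key):
    """Accuracy per slice.

    `examples` is an iterable of dicts holding 'label', 'prediction'
    and the field named by `slice_key` (hypothetical record layout).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for example in examples:
        group = example[slice_key]
        total[group] += 1
        correct[group] += int(example["prediction"] == example["label"])
    return {group: correct[group] / total[group] for group in total}


def slices_below_threshold(examples, slice_key, min_accuracy):
    """Names of slices whose accuracy is below the threshold, sorted."""
    scores = accuracy_by_slice(examples, slice_key)
    return sorted(group for group, acc in scores.items() if acc < min_accuracy)
```

A pipeline step could assert that `slices_below_threshold(...)` returns an empty list, so a model that looks fine in aggregate but underperforms on one group does not get promoted.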
  • 20. Test layers added: End-to-End Tests; Production Monitoring; Exploratory Tests. Layers carried over: Model Performance; Contract Tests; Model Bias and Fairness; Data Pipeline; Data/Feature Validation; Unit Tests (Transformations, Engineered Features); Unit Tests (Model Specification); Model Quality; ML Training Pipeline; UI Tests; Service Tests; Unit Tests. What to monitor: model degradation; training/serving skew; operational metrics (latency, throughput, resource usage); real impact! (KPIs)
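One way to watch for the training/serving skew listed on the slide above is to compare the distribution of a feature at training time with the distribution seen in production. The sketch below uses the Population Stability Index (PSI), a technique chosen for this example and not named in the talk; the bin edges and the smoothing constant `eps` are assumptions.

```python
import bisect
import math


def histogram(values, edges):
    """Fraction of values per bin; out-of-range values go to the edge bins."""
    counts = [0] * (len(edges) - 1)
    for value in values:
        index = bisect.bisect_right(edges, value) - 1
        index = min(max(index, 0), len(counts) - 1)
        counts[index] += 1
    return [count / len(values) for count in counts]


def population_stability_index(expected, actual, edges, eps=1e-6):
    """PSI = sum over bins of (a - e) * ln(a / e).

    `expected` is the training-time sample, `actual` the serving-time one.
    Empty bins are floored at `eps` so the logarithm stays defined.
    """
    expected_frac = [max(p, eps) for p in histogram(expected, edges)]
    actual_frac = [max(p, eps) for p in histogram(actual, edges)]
    return sum(
        (a - e) * math.log(a / e) for e, a in zip(expected_frac, actual_frac)
    )
```

A monitoring job could compute this per feature on a schedule and alert above a chosen cut-off; values around 0.25 are a commonly quoted rule of thumb for a significant shift, but the right threshold depends on the feature and should be treated as a tuning decision.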
  • 21. “Inspection does not improve the quality, nor guarantee quality. Inspection is too late. The quality, good or bad, is already in the product.” - W. Edwards Deming
  • 22. QUESTIONS?
  • 24. THANK YOU! Danilo Sato (dsato@thoughtworks.com), @dtsato