SlideShare une entreprise Scribd logo
1  sur  27
Télécharger pour lire hors ligne
Testing production
streaming applications
Gyula Fora & Matyas Orhidi
© 2019 Cloudera, Inc. All rights reserved. 2
Decomposing production applications
Pipeline
SourceConnectors
SinkConnectors
Operator unit testing
© 2019 Cloudera, Inc. All rights reserved. 4
Testing Stateless Operators
Stateless testing is simple, no magic involved
Steps
1. Instantiate operator
2. (Create Mock/ListCollector)
3. Send input
4. Validate output
Map
Input 1
Input 2
Input 3
Output 1
Output 2
Output 3
© 2019 Cloudera, Inc. All rights reserved. 5
Testing Stateless Operators
© 2019 Cloudera, Inc. All rights reserved. 6
Testing Stateless Operators
© 2019 Cloudera, Inc. All rights reserved. 7
Testing Stateful Operators
Map
Input 1
Watermark
Input 2
Output 1
Output 2
Output 3
Harness
We need to use the test harness to access
state and timely functionality.
Steps
1. Create operator test harness
2. Send inputs
3. Send watermarks
4. Validate output
© 2019 Cloudera, Inc. All rights reserved. 8
Testing Stateful Operators
© 2019 Cloudera, Inc. All rights reserved. 9
Testing Stateful Operators
Pipeline / flow testing
© 2019 Cloudera, Inc. All rights reserved. 11
Simple end-to-end tests
Steps
1. Prepare test input
2. Run application
3. Collect test output
4. Validate
Pipeline
Test
Input
Test
Output
Validate
© 2019 Cloudera, Inc. All rights reserved. 12
Simple end-to-end tests
© 2019 Cloudera, Inc. All rights reserved. 13
Simple end-to-end tests
© 2019 Cloudera, Inc. All rights reserved. 14
Simple end-to-end tests
Pros
• Mimics proper behavior
• Perfect for simple apps
Cons
• Hard to cover all cases
• Useless for complex applications
• Nearly impossible to control
ordering
• Hard to test windowing
Use it to validate simple pipeline logic.
Can be replaced by proper integration tests.
© 2019 Cloudera, Inc. All rights reserved. 15
“Manual” pipeline tests
Steps
1. Create manual Sources
2. Start application
3. Send input
4. Wait for output
5. Validate output
6. Repeat 3-5
7. Stop application
PipelineInput 1 Output 1
Input 2 Output 2
Watermark Output 3
Input 3 Output 4
Check in the flink-tutorials repo!
© 2019 Cloudera, Inc. All rights reserved. 16
“Manual” pipeline tests
© 2019 Cloudera, Inc. All rights reserved. 17
“Manual” pipeline tests
© 2019 Cloudera, Inc. All rights reserved. 18
“Manual” pipeline tests
Pros
• Full flow control
• Easy to test complex pipelines
• Easy watermark control
• Can cover corner cases
Cons
• Does not test race conditions
• Still pretty easy to miss corner
cases due to ordering
Use it to validate complex, timely dataflow logic and corner cases.
Integration testing
© 2019 Cloudera, Inc. All rights reserved. 20
Integration Testing
Pipeline
SourceConnectors
SinkConnectors
Embedded/Testing
Storage
Test Input
Input
Generator
Output
Validation
© 2019 Cloudera, Inc. All rights reserved. 21
Integration Testing
© 2019 Cloudera, Inc. All rights reserved. 22
Integration Testing
© 2019 Cloudera, Inc. All rights reserved. 23
Integration Testing
Pros
• Actually tests the “real thing”
• Covers connector config
• Can be used for performance
testing
Cons
• Resource intensive
• Similar caveats as simple
end-to-end testing
• Validation can be even more
complex
Every production application should be integration tested.
Wrapping up
© 2019 Cloudera, Inc. All rights reserved. 25
Overview of testing utilities
Great utils on all levels of our application logic
1. Functions/Operators
a. Unit test directly
b. Operator Test Harnesses
2. Pipeline/Flow
a. Simple end-to-end tests
b. Manual (JobTester) flow tests
3. Complete Application
a. Proper integration tests
Some are more user friendly than others...
© 2019 Cloudera, Inc. All rights reserved. 26
Thank you!
© 2019 Cloudera, Inc. All rights reserved. 27
References
https://flink.apache.org/news/2020/02/07/a-guide-for-unit-testing-in-apache-flink.html
https://www.ververica.com/flink-forward/resources/testing-stateful-streaming-applications
https://github.com/ottogroup/flink-spector
https://github.com/knaufk/flink-testing-pyramid
https://github.com/demiourgoi/flink-check/blob/master/flink-check/README.md
https://github.com/cloudera/flink-tutorials/tree/master/flink-stateful-tutorial/src/main/java/com/cloudera/streaming/examples/flink/utils/
testing

Contenu connexe

Tendances

Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink Forward
 
Terraform and Weave GitOps: Build a Fully Automated Application Stack
Terraform and Weave GitOps: Build a Fully Automated Application StackTerraform and Weave GitOps: Build a Fully Automated Application Stack
Terraform and Weave GitOps: Build a Fully Automated Application StackWeaveworks
 
Intro to GitOps & Flux.pdf
Intro to GitOps & Flux.pdfIntro to GitOps & Flux.pdf
Intro to GitOps & Flux.pdfWeaveworks
 
Dealing with Merge Conflicts in Git
Dealing with Merge Conflicts in GitDealing with Merge Conflicts in Git
Dealing with Merge Conflicts in Gitgittower
 
Build Your Own CaaS (Container as a Service)
Build Your Own CaaS (Container as a Service)Build Your Own CaaS (Container as a Service)
Build Your Own CaaS (Container as a Service)HungWei Chiu
 
CI-CD Jenkins, GitHub Actions, Tekton
CI-CD Jenkins, GitHub Actions, Tekton CI-CD Jenkins, GitHub Actions, Tekton
CI-CD Jenkins, GitHub Actions, Tekton Araf Karsh Hamid
 
DevOps : mission [im]possible ?
DevOps : mission [im]possible ?DevOps : mission [im]possible ?
DevOps : mission [im]possible ?rfelden
 
Grafana Loki: like Prometheus, but for Logs
Grafana Loki: like Prometheus, but for LogsGrafana Loki: like Prometheus, but for Logs
Grafana Loki: like Prometheus, but for LogsMarco Pracucci
 
Microservices Architecture - Cloud Native Apps
Microservices Architecture - Cloud Native AppsMicroservices Architecture - Cloud Native Apps
Microservices Architecture - Cloud Native AppsAraf Karsh Hamid
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeFlink Forward
 
Continuous integration
Continuous integrationContinuous integration
Continuous integrationhugo lu
 
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptxGrafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptxRomanKhavronenko
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Flink Forward
 
Introduction to Team Foundation Server (TFS) Online
Introduction to Team Foundation Server (TFS) OnlineIntroduction to Team Foundation Server (TFS) Online
Introduction to Team Foundation Server (TFS) OnlineDenis Voituron
 
Intro to Kubernetes & GitOps Workshop
Intro to Kubernetes & GitOps WorkshopIntro to Kubernetes & GitOps Workshop
Intro to Kubernetes & GitOps WorkshopWeaveworks
 
OpenTelemetry 101 FTW
OpenTelemetry 101 FTWOpenTelemetry 101 FTW
OpenTelemetry 101 FTWNGINX, Inc.
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsFlink Forward
 
OpenShift 4, the smarter Kubernetes platform
OpenShift 4, the smarter Kubernetes platformOpenShift 4, the smarter Kubernetes platform
OpenShift 4, the smarter Kubernetes platformKangaroot
 
Yale Jenkins Show and Tell
Yale Jenkins Show and TellYale Jenkins Show and Tell
Yale Jenkins Show and TellE. Camden Fisher
 

Tendances (20)

Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at Pinterest
 
Terraform and Weave GitOps: Build a Fully Automated Application Stack
Terraform and Weave GitOps: Build a Fully Automated Application StackTerraform and Weave GitOps: Build a Fully Automated Application Stack
Terraform and Weave GitOps: Build a Fully Automated Application Stack
 
Intro to GitOps & Flux.pdf
Intro to GitOps & Flux.pdfIntro to GitOps & Flux.pdf
Intro to GitOps & Flux.pdf
 
Dealing with Merge Conflicts in Git
Dealing with Merge Conflicts in GitDealing with Merge Conflicts in Git
Dealing with Merge Conflicts in Git
 
Quarkus bootstrap 2020
Quarkus bootstrap 2020Quarkus bootstrap 2020
Quarkus bootstrap 2020
 
Build Your Own CaaS (Container as a Service)
Build Your Own CaaS (Container as a Service)Build Your Own CaaS (Container as a Service)
Build Your Own CaaS (Container as a Service)
 
CI-CD Jenkins, GitHub Actions, Tekton
CI-CD Jenkins, GitHub Actions, Tekton CI-CD Jenkins, GitHub Actions, Tekton
CI-CD Jenkins, GitHub Actions, Tekton
 
DevOps : mission [im]possible ?
DevOps : mission [im]possible ?DevOps : mission [im]possible ?
DevOps : mission [im]possible ?
 
Grafana Loki: like Prometheus, but for Logs
Grafana Loki: like Prometheus, but for LogsGrafana Loki: like Prometheus, but for Logs
Grafana Loki: like Prometheus, but for Logs
 
Microservices Architecture - Cloud Native Apps
Microservices Architecture - Cloud Native AppsMicroservices Architecture - Cloud Native Apps
Microservices Architecture - Cloud Native Apps
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
 
Continuous integration
Continuous integrationContinuous integration
Continuous integration
 
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptxGrafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptx
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
 
Introduction to Team Foundation Server (TFS) Online
Introduction to Team Foundation Server (TFS) OnlineIntroduction to Team Foundation Server (TFS) Online
Introduction to Team Foundation Server (TFS) Online
 
Intro to Kubernetes & GitOps Workshop
Intro to Kubernetes & GitOps WorkshopIntro to Kubernetes & GitOps Workshop
Intro to Kubernetes & GitOps Workshop
 
OpenTelemetry 101 FTW
OpenTelemetry 101 FTWOpenTelemetry 101 FTW
OpenTelemetry 101 FTW
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data Alerts
 
OpenShift 4, the smarter Kubernetes platform
OpenShift 4, the smarter Kubernetes platformOpenShift 4, the smarter Kubernetes platform
OpenShift 4, the smarter Kubernetes platform
 
Yale Jenkins Show and Tell
Yale Jenkins Show and TellYale Jenkins Show and Tell
Yale Jenkins Show and Tell
 

Similaire à Virtual Flink Forward 2020: Testing production streaming applications - Gyula Fora & Matyas Orhidi

Testing the Scalability of a Robust IoT System with Confidence
Testing the Scalability of a Robust IoT System with ConfidenceTesting the Scalability of a Robust IoT System with Confidence
Testing the Scalability of a Robust IoT System with ConfidenceHiveMQ
 
The Path to a Programmable Network
The Path to a Programmable NetworkThe Path to a Programmable Network
The Path to a Programmable NetworkMyNOG
 
CI/CD Pipeline Security: Advanced Continuous Delivery Best Practices: Securit...
CI/CD Pipeline Security: Advanced Continuous Delivery Best Practices: Securit...CI/CD Pipeline Security: Advanced Continuous Delivery Best Practices: Securit...
CI/CD Pipeline Security: Advanced Continuous Delivery Best Practices: Securit...Amazon Web Services
 
Automated Performance Testing for Desktop Applications by Ciprian Balea
Automated Performance Testing for Desktop Applications by Ciprian BaleaAutomated Performance Testing for Desktop Applications by Ciprian Balea
Automated Performance Testing for Desktop Applications by Ciprian Balea3Pillar Global
 
Viavi_TeraVM Core Emulator.pptx
Viavi_TeraVM Core Emulator.pptxViavi_TeraVM Core Emulator.pptx
Viavi_TeraVM Core Emulator.pptxmani723
 
Minikube Workshop Handout
Minikube Workshop HandoutMinikube Workshop Handout
Minikube Workshop HandoutAlfie Chen
 
Verify Your Kubernetes Clusters with Upstream e2e tests
Verify Your Kubernetes Clusters with Upstream e2e testsVerify Your Kubernetes Clusters with Upstream e2e tests
Verify Your Kubernetes Clusters with Upstream e2e testsKen'ichi Ohmichi
 
APImetrics Product Introduction
APImetrics Product IntroductionAPImetrics Product Introduction
APImetrics Product Introductionapimetrics
 
OSMC 2019 | The Telegraf Toolbelt: It Can Do That, Really? by David McKay
OSMC 2019 | The Telegraf Toolbelt: It Can Do That, Really? by David McKayOSMC 2019 | The Telegraf Toolbelt: It Can Do That, Really? by David McKay
OSMC 2019 | The Telegraf Toolbelt: It Can Do That, Really? by David McKayNETWAYS
 
Get to Green: How to Safely Refactor Legacy Code
Get to Green: How to Safely Refactor Legacy CodeGet to Green: How to Safely Refactor Legacy Code
Get to Green: How to Safely Refactor Legacy CodeGene Gotimer
 
DevOps for Mainframe: Open Source Fast Track
DevOps for Mainframe: Open Source Fast TrackDevOps for Mainframe: Open Source Fast Track
DevOps for Mainframe: Open Source Fast TrackDevOps.com
 
Enterprise Cloud with IBM & Chef (ChefConf 2013)
Enterprise Cloud with IBM & Chef (ChefConf 2013)Enterprise Cloud with IBM & Chef (ChefConf 2013)
Enterprise Cloud with IBM & Chef (ChefConf 2013)Michael Elder
 
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Databricks
 
2019 ibm io t exchange - meeting safety-related software audits
2019   ibm io t exchange - meeting safety-related software audits2019   ibm io t exchange - meeting safety-related software audits
2019 ibm io t exchange - meeting safety-related software auditsM Kevin McHugh
 
Innovating FIPS crypto validation in the Cloud - SEP321 - AWS re:Inforce 2019
Innovating FIPS crypto validation in the Cloud - SEP321 - AWS re:Inforce 2019 Innovating FIPS crypto validation in the Cloud - SEP321 - AWS re:Inforce 2019
Innovating FIPS crypto validation in the Cloud - SEP321 - AWS re:Inforce 2019 Amazon Web Services
 

Similaire à Virtual Flink Forward 2020: Testing production streaming applications - Gyula Fora & Matyas Orhidi (20)

Testing the Scalability of a Robust IoT System with Confidence
Testing the Scalability of a Robust IoT System with ConfidenceTesting the Scalability of a Robust IoT System with Confidence
Testing the Scalability of a Robust IoT System with Confidence
 
The Path to a Programmable Network
The Path to a Programmable NetworkThe Path to a Programmable Network
The Path to a Programmable Network
 
MicroShed Testing
MicroShed TestingMicroShed Testing
MicroShed Testing
 
CI/CD Pipeline Security: Advanced Continuous Delivery Best Practices: Securit...
CI/CD Pipeline Security: Advanced Continuous Delivery Best Practices: Securit...CI/CD Pipeline Security: Advanced Continuous Delivery Best Practices: Securit...
CI/CD Pipeline Security: Advanced Continuous Delivery Best Practices: Securit...
 
Automated Performance Testing for Desktop Applications by Ciprian Balea
Automated Performance Testing for Desktop Applications by Ciprian BaleaAutomated Performance Testing for Desktop Applications by Ciprian Balea
Automated Performance Testing for Desktop Applications by Ciprian Balea
 
Viavi_TeraVM Core Emulator.pptx
Viavi_TeraVM Core Emulator.pptxViavi_TeraVM Core Emulator.pptx
Viavi_TeraVM Core Emulator.pptx
 
Free OpUtils Training presentation
Free OpUtils Training presentationFree OpUtils Training presentation
Free OpUtils Training presentation
 
Minikube Workshop Handout
Minikube Workshop HandoutMinikube Workshop Handout
Minikube Workshop Handout
 
Verify Your Kubernetes Clusters with Upstream e2e tests
Verify Your Kubernetes Clusters with Upstream e2e testsVerify Your Kubernetes Clusters with Upstream e2e tests
Verify Your Kubernetes Clusters with Upstream e2e tests
 
OpUtils webinar
OpUtils webinarOpUtils webinar
OpUtils webinar
 
TechTalk: Report Bugs Like a Boss
TechTalk: Report Bugs Like a BossTechTalk: Report Bugs Like a Boss
TechTalk: Report Bugs Like a Boss
 
APImetrics Product Introduction
APImetrics Product IntroductionAPImetrics Product Introduction
APImetrics Product Introduction
 
OSMC 2019 | The Telegraf Toolbelt: It Can Do That, Really? by David McKay
OSMC 2019 | The Telegraf Toolbelt: It Can Do That, Really? by David McKayOSMC 2019 | The Telegraf Toolbelt: It Can Do That, Really? by David McKay
OSMC 2019 | The Telegraf Toolbelt: It Can Do That, Really? by David McKay
 
Get to Green: How to Safely Refactor Legacy Code
Get to Green: How to Safely Refactor Legacy CodeGet to Green: How to Safely Refactor Legacy Code
Get to Green: How to Safely Refactor Legacy Code
 
Ankur Goel Resume
Ankur Goel ResumeAnkur Goel Resume
Ankur Goel Resume
 
DevOps for Mainframe: Open Source Fast Track
DevOps for Mainframe: Open Source Fast TrackDevOps for Mainframe: Open Source Fast Track
DevOps for Mainframe: Open Source Fast Track
 
Enterprise Cloud with IBM & Chef (ChefConf 2013)
Enterprise Cloud with IBM & Chef (ChefConf 2013)Enterprise Cloud with IBM & Chef (ChefConf 2013)
Enterprise Cloud with IBM & Chef (ChefConf 2013)
 
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
 
2019 ibm io t exchange - meeting safety-related software audits
2019   ibm io t exchange - meeting safety-related software audits2019   ibm io t exchange - meeting safety-related software audits
2019 ibm io t exchange - meeting safety-related software audits
 
Innovating FIPS crypto validation in the Cloud - SEP321 - AWS re:Inforce 2019
Innovating FIPS crypto validation in the Cloud - SEP321 - AWS re:Inforce 2019 Innovating FIPS crypto validation in the Cloud - SEP321 - AWS re:Inforce 2019
Innovating FIPS crypto validation in the Cloud - SEP321 - AWS re:Inforce 2019
 

Plus de Flink Forward

Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Flink Forward
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkFlink Forward
 
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...Flink Forward
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Flink Forward
 
Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxFlink Forward
 
Apache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraApache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraFlink Forward
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkFlink Forward
 
The Current State of Table API in 2022
The Current State of Table API in 2022The Current State of Table API in 2022
The Current State of Table API in 2022Flink Forward
 
Flink SQL on Pulsar made easy
Flink SQL on Pulsar made easyFlink SQL on Pulsar made easy
Flink SQL on Pulsar made easyFlink Forward
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotFlink Forward
 
Processing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesProcessing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesFlink Forward
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Flink Forward
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergFlink Forward
 
Welcome to the Flink Community!
Welcome to the Flink Community!Welcome to the Flink Community!
Welcome to the Flink Community!Flink Forward
 
Practical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobsPractical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobsFlink Forward
 
Extending Flink SQL for stream processing use cases
Extending Flink SQL for stream processing use casesExtending Flink SQL for stream processing use cases
Extending Flink SQL for stream processing use casesFlink Forward
 
Using Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitUsing Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitFlink Forward
 
Changelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache FlinkChangelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache FlinkFlink Forward
 
Large Scale Real Time Fraudulent Web Behavior Detection
Large Scale Real Time Fraudulent Web Behavior DetectionLarge Scale Real Time Fraudulent Web Behavior Detection
Large Scale Real Time Fraudulent Web Behavior DetectionFlink Forward
 
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Flink Forward
 

Plus de Flink Forward (20)

Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
 
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
 
Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptx
 
Apache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraApache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native Era
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in Flink
 
The Current State of Table API in 2022
The Current State of Table API in 2022The Current State of Table API in 2022
The Current State of Table API in 2022
 
Flink SQL on Pulsar made easy
Flink SQL on Pulsar made easyFlink SQL on Pulsar made easy
Flink SQL on Pulsar made easy
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
 
Processing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesProcessing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial Services
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
 
Welcome to the Flink Community!
Welcome to the Flink Community!Welcome to the Flink Community!
Welcome to the Flink Community!
 
Practical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobsPractical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobs
 
Extending Flink SQL for stream processing use cases
Extending Flink SQL for stream processing use casesExtending Flink SQL for stream processing use cases
Extending Flink SQL for stream processing use cases
 
Using Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitUsing Queryable State for Fun and Profit
Using Queryable State for Fun and Profit
 
Changelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache FlinkChangelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache Flink
 
Large Scale Real Time Fraudulent Web Behavior Detection
Large Scale Real Time Fraudulent Web Behavior DetectionLarge Scale Real Time Fraudulent Web Behavior Detection
Large Scale Real Time Fraudulent Web Behavior Detection
 
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
 

Dernier

The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 

Dernier (20)

The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 

Virtual Flink Forward 2020: Testing production streaming applications - Gyula Fora & Matyas Orhidi

  • 2. © 2019 Cloudera, Inc. All rights reserved. 2 Decomposing production applications Pipeline SourceConnectors SinkConnectors
  • 4. © 2019 Cloudera, Inc. All rights reserved. 4 Testing Stateless Operators Stateless testing is simple, no magic involved Steps 1. Instantiate operator 2. (Create Mock/ListCollector) 3. Send input 4. Validate output Map Input 1 Input 2 Input 3 Output 1 Output 2 Output 3
  • 5. © 2019 Cloudera, Inc. All rights reserved. 5 Testing Stateless Operators
  • 6. © 2019 Cloudera, Inc. All rights reserved. 6 Testing Stateless Operators
  • 7. © 2019 Cloudera, Inc. All rights reserved. 7 Testing Stateful Operators Map Input 1 Watermark Input 2 Output 1 Output 2 Output 3 Harness We need to use the test harness to access state and timely functionality. Steps 1. Create operator test harness 2. Send inputs 3. Send watermarks 4. Validate output
  • 8. © 2019 Cloudera, Inc. All rights reserved. 8 Testing Stateful Operators
  • 9. © 2019 Cloudera, Inc. All rights reserved. 9 Testing Stateful Operators
  • 10. Pipeline / flow testing
  • 11. © 2019 Cloudera, Inc. All rights reserved. 11 Simple end-to-end tests Steps 1. Prepare test input 2. Run application 3. Collect test output 4. Validate Pipeline Test Input Test Output Validate
  • 12. © 2019 Cloudera, Inc. All rights reserved. 12 Simple end-to-end tests
  • 13. © 2019 Cloudera, Inc. All rights reserved. 13 Simple end-to-end tests
  • 14. © 2019 Cloudera, Inc. All rights reserved. 14 Simple end-to-end tests Pros • Mimics proper behavior • Perfect for simple apps Cons • Hard to cover all cases • Useless for complex applications • Nearly impossible to control ordering • Hard to test windowing Use it to validate simple pipeline logic. Can be replaced by proper integration tests.
  • 15. © 2019 Cloudera, Inc. All rights reserved. 15 “Manual” pipeline tests Steps 1. Create manual Sources 2. Start application 3. Send input 4. Wait for output 5. Validate output 6. Repeat 3-5 7. Stop application PipelineInput 1 Output 1 Input 2 Output 2 Watermark Output 3 Input 3 Output 4 Check in the flink-tutorials repo!
  • 16. © 2019 Cloudera, Inc. All rights reserved. 16 “Manual” pipeline tests
  • 17. © 2019 Cloudera, Inc. All rights reserved. 17 “Manual” pipeline tests
  • 18. © 2019 Cloudera, Inc. All rights reserved. 18 “Manual” pipeline tests Pros • Full flow control • Easy to test complex pipelines • Easy watermark control • Can cover corner cases Cons • Does not test race conditions • Still pretty easy to miss corner cases due to ordering Use it to validate complex, timely dataflow logic and corner cases.
  • 20. © 2019 Cloudera, Inc. All rights reserved. 20 Integration Testing Pipeline SourceConnectors SinkConnectors Embedded/Testing Storage Test Input Input Generator Output Validation
  • 21. © 2019 Cloudera, Inc. All rights reserved. 21 Integration Testing
  • 22. © 2019 Cloudera, Inc. All rights reserved. 22 Integration Testing
  • 23. © 2019 Cloudera, Inc. All rights reserved. 23 Integration Testing Pros • Actually tests the “real thing” • Covers connector config • Can be used for performance testing Cons • Resource intensive • Similar caveats as simple end-to-end testing • Validation can be even more complex Every production application should be integration tested.
  • 25. © 2019 Cloudera, Inc. All rights reserved. 25 Overview of testing utilities Great utils on all levels of our application logic 1. Functions/Operators a. Unit test directly b. Operator Test Harnesses 2. Pipeline/Flow a. Simple end-to-end tests b. Manual (JobTester) flow tests 3. Complete Application a. Proper integration tests Some are more user friendly than others...
  • 26. © 2019 Cloudera, Inc. All rights reserved. 26 Thank you!
  • 27. © 2019 Cloudera, Inc. All rights reserved. 27 References https://flink.apache.org/news/2020/02/07/a-guide-for-unit-testing-in-apache-flink.html https://www.ververica.com/flink-forward/resources/testing-stateful-streaming-applications https://github.com/ottogroup/flink-spector https://github.com/knaufk/flink-testing-pyramid https://github.com/demiourgoi/flink-check/blob/master/flink-check/README.md https://github.com/cloudera/flink-tutorials/tree/master/flink-stateful-tutorial/src/main/java/com/cloudera/streaming/examples/flink/utils/ testing