SlideShare une entreprise Scribd logo
1  sur  20
Télécharger pour lire hors ligne
B A T C H
T O
S T R E A M
D A T A P I P E L I N E
Christina Lin
Redpanda
Developer Advocate
Agenda
• Quick Intro – 10-15 mins
• HANDS-ON – 35-40 mins
• Q & A – 5-10 mins
© 2024 REDPANDA DATA
Christina Lin
Developer Advocate, Redpanda
aka. The Redpanda Lady
SOA
WebSphere
DB2
Sybase
Oracle
MQ
J2EE
EJB
DevOps
Microservice
EIP
K8s
Agile
Integration
Data
Mesh
Active MQ
Living data stack
Resilience - handle failures and scale gracefully
Elasticity – infrastructure that can scale dynamically
Decentralization - data ownership, empowering
individual teams
Performance - low latency and high throughput
Autonomy – self service, define quality, and access
Nimble - efficient data movement
Distributed -distributed data processing for cloud native
Agility – quickly respond to change in data
© 2024 REDPANDA DATA
© 2024 REDPANDA DATA
An ordinary day of Data Engineer
© 2024 REDPANDA DATA
© 2024 REDPANDA DATA
Stateless
Streaming Pipeline
Transform
format Change, masking, filtering, validating
Dispatch, Wiretap
Spilt, multiple destination
Control
reroute
Normalize/ Denormalize Enrich
Multiple ingestion
Stateful
Streaming Pipeline
Complex event processing
Time-window based processing
Enrich
Multiple ingestion
Micro batch Pipeline
Transform for large output (Dataset)
Partitioning Split workload
Analytics
batch
Pipeline
Analytics large volume (legacy)
Transform large output (Dataset, legacy)
Transport large unstructured data
Better scalability for pipelines
Batch
CSV
Every 10 mins
CSV
Right away!
CSV
Stream
Batch
pipeline
Batch Processing
Batch
pipeline
TheWorkshop overview
© 2024 REDPANDA DATA
https://bit.ly/odsc-redpanda
VM
postgres:5432 cassandra:5432
postgresload.ipynb
© 2024 REDPANDA DATA
https://bit.ly/odsc-redpanda
The Batch
cassandra:5432
VM
postgres:5432 cassandra:5432
cassandraload.ipynb
© 2024 REDPANDA DATA
https://bit.ly/odsc-redpanda
The Batch
VM
postgres:5432 cassandra:5432
spark.ipynb
© 2024 REDPANDA DATA
https://bit.ly/odsc-redpanda
The Batch
CSV
CSV
CSV
VM
postgres:5432 cassandra:5432
Map.ipynb
© 2024 REDPANDA DATA
https://bit.ly/odsc-redpanda
Dashboard View
CSV
CSV
CSV
VM
postgres:5432cassandra:5432
redpanda-0:9092
© 2024 REDPANDA DATA
https://bit.ly/osdc-redpanda
Let’s Stream
VM
postgres:5432cassandra:5432
© 2024 REDPANDA DATA
https://bit.ly/osdc-redpanda
Let’s Stream
VM
postgres:5432cassandra:5432
Config
© 2024 REDPANDA DATA
https://bit.ly/odsc-redpanda
Let’s Stream
VM
postgres:5432cassandra:5432
Job manager
Flink Data
Stream
Java JAR
© 2024 REDPANDA DATA
https://bit.ly/odsc-redpanda
Let’s Stream
VM
postgres:5432cassandra:5432
Job manager
CSV
© 2024 REDPANDA DATA
https://bit.ly/odsc-redpanda
Let’s Stream
VM
postgres:5432cassandra:5432
Job manager
CSV
© 2024 REDPANDA DATA
https://bit.ly/odsc-redpanda
Let’s Stream
CSV
CSV
VM
postgres:5432cassandra:5432
SQL Client
SQL
© 2024 REDPANDA DATA
https://bit.ly/odsc-redpanda
The Batch
CSV
CSV
CSV
Batch
CSV
Batch
pipeline
Batch
pipeline
Batch Processing
Every 10 mins
CSV
Right away!
CSV
Stream
© 2024 REDPANDA DATA
https://bit.ly/odsc-redpanda
TheWorkshop overview
© 2024 REDPANDA DATA
Stateless
Streaming Pipeline
Transform
format Change, masking, filtering, validating
Dispatch, Wiretap
Spilt, multiple destination
Control
reroute
Normalize/ Denormalize Enrich
Multiple ingestion
Stateful
Streaming Pipeline
Complex event processing
Time-window based processing
Enrich
Multiple ingestion
Better scalability for pipelines
© 2024 REDPANDA DATA
Keep Learning
Streaming - Communication
Basics of K8s networking
Connectivity
Performance
Docs
Get a peak under the hood.
https://docs.redpanda.com/
Blogs
Keep up to date with Redpanda.
https://redpanda.com/blog
Slack
Engage with our community.
https://redpanda.com/slack
Code
Check out the source.
https://github.com/redpanda-data
Redpanda University
Free, self-paced online learning
https://university.redpanda.com

Contenu connexe

Similaire à ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, PostgreSQL, Redpanda, Debezium, and Benthos to master building advanced real-time data pipelines.

Visual Mapping of Clickstream Data
Visual Mapping of Clickstream DataVisual Mapping of Clickstream Data
Visual Mapping of Clickstream Data
DataWorks Summit
 
Riverbed Granite
Riverbed GraniteRiverbed Granite
Riverbed Granite
CTI Group
 
SimplifyStreamingArchitecture
SimplifyStreamingArchitectureSimplifyStreamingArchitecture
SimplifyStreamingArchitecture
Maheedhar Gunturu
 
#3 rsp success card
#3 rsp success card#3 rsp success card
#3 rsp success card
chanwitcs
 
Banv meetup-contrail
Banv meetup-contrailBanv meetup-contrail
Banv meetup-contrail
nvirters
 

Similaire à ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, PostgreSQL, Redpanda, Debezium, and Benthos to master building advanced real-time data pipelines. (20)

Build agile and elastic data pipeline
Build agile and elastic data pipelineBuild agile and elastic data pipeline
Build agile and elastic data pipeline
 
Visual Mapping of Clickstream Data
Visual Mapping of Clickstream DataVisual Mapping of Clickstream Data
Visual Mapping of Clickstream Data
 
EsgynDB: A Big Data Engine. Simplifying Fast and Reliable Mixed Workloads
EsgynDB: A Big Data Engine. Simplifying Fast and Reliable Mixed Workloads EsgynDB: A Big Data Engine. Simplifying Fast and Reliable Mixed Workloads
EsgynDB: A Big Data Engine. Simplifying Fast and Reliable Mixed Workloads
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
 
Riverbed Granite
Riverbed GraniteRiverbed Granite
Riverbed Granite
 
Steelhead DX for Datacenter-to-Datacenter optimization
Steelhead DX for Datacenter-to-Datacenter optimizationSteelhead DX for Datacenter-to-Datacenter optimization
Steelhead DX for Datacenter-to-Datacenter optimization
 
Building scalable data with kafka and spark
Building scalable data with kafka and sparkBuilding scalable data with kafka and spark
Building scalable data with kafka and spark
 
Wasp2 - IoT and Streaming Platform
Wasp2 - IoT and Streaming PlatformWasp2 - IoT and Streaming Platform
Wasp2 - IoT and Streaming Platform
 
Riverbed Within Local Gov
Riverbed Within Local GovRiverbed Within Local Gov
Riverbed Within Local Gov
 
6WINDGate™ - High Performance Networking for Data Centers
6WINDGate™ - High Performance Networking for Data Centers6WINDGate™ - High Performance Networking for Data Centers
6WINDGate™ - High Performance Networking for Data Centers
 
SimplifyStreamingArchitecture
SimplifyStreamingArchitectureSimplifyStreamingArchitecture
SimplifyStreamingArchitecture
 
High performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyDataHigh performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyData
 
High performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyDataHigh performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyData
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid Cloud
 
Virtualized Big Data Platform at VMware Corp IT @ VMWorld 2015
Virtualized Big Data Platform at VMware Corp IT @ VMWorld 2015Virtualized Big Data Platform at VMware Corp IT @ VMWorld 2015
Virtualized Big Data Platform at VMware Corp IT @ VMWorld 2015
 
[OpenStack Day in Korea 2015] Track 2-3 - 오픈스택 클라우드에 최적화된 네트워크 가상화 '누아지(Nuage)'
[OpenStack Day in Korea 2015] Track 2-3 - 오픈스택 클라우드에 최적화된 네트워크 가상화 '누아지(Nuage)'[OpenStack Day in Korea 2015] Track 2-3 - 오픈스택 클라우드에 최적화된 네트워크 가상화 '누아지(Nuage)'
[OpenStack Day in Korea 2015] Track 2-3 - 오픈스택 클라우드에 최적화된 네트워크 가상화 '누아지(Nuage)'
 
#3 rsp success card
#3 rsp success card#3 rsp success card
#3 rsp success card
 
Navigate Data Service using AWS
Navigate Data Service using AWSNavigate Data Service using AWS
Navigate Data Service using AWS
 
Banv meetup-contrail
Banv meetup-contrailBanv meetup-contrail
Banv meetup-contrail
 

Plus de Christina Lin

Plus de Christina Lin (20)

Bangalore Meetup - Enable realtime machine learning with streaming data
Bangalore Meetup - Enable realtime machine learning with streaming dataBangalore Meetup - Enable realtime machine learning with streaming data
Bangalore Meetup - Enable realtime machine learning with streaming data
 
Kafka summit apac session
Kafka summit apac sessionKafka summit apac session
Kafka summit apac session
 
Serverless integration anatomy
Serverless integration anatomyServerless integration anatomy
Serverless integration anatomy
 
Day in the life event-driven workshop
Day in the life  event-driven workshopDay in the life  event-driven workshop
Day in the life event-driven workshop
 
Agile integration cloud native developement
Agile integration   cloud native developementAgile integration   cloud native developement
Agile integration cloud native developement
 
Dev conf .in cloud native reference architecture .advance
Dev conf .in cloud native reference architecture .advanceDev conf .in cloud native reference architecture .advance
Dev conf .in cloud native reference architecture .advance
 
Camel k Taiwan Java user group
Camel k  Taiwan Java user groupCamel k  Taiwan Java user group
Camel k Taiwan Java user group
 
Devoxxma-API centric microservices Architecture
Devoxxma-API centric microservices ArchitectureDevoxxma-API centric microservices Architecture
Devoxxma-API centric microservices Architecture
 
JBoss Fuse - Fuse workshop EAP container
JBoss Fuse - Fuse workshop EAP containerJBoss Fuse - Fuse workshop EAP container
JBoss Fuse - Fuse workshop EAP container
 
Supercharge Your Integration Services
Supercharge Your Integration Services�Supercharge Your Integration Services�
Supercharge Your Integration Services
 
Improve business process with microservice integration
Improve business process with microservice integration �Improve business process with microservice integration �
Improve business process with microservice integration
 
Integrating BPM with Fuse
Integrating BPM with FuseIntegrating BPM with Fuse
Integrating BPM with Fuse
 
Scalable Integration with JBoss Fuse
Scalable Integration with JBoss FuseScalable Integration with JBoss Fuse
Scalable Integration with JBoss Fuse
 
JBoss Fuse - Fuse workshop Error Handling
JBoss Fuse - Fuse workshop Error HandlingJBoss Fuse - Fuse workshop Error Handling
JBoss Fuse - Fuse workshop Error Handling
 
JBoss Fuse Workshop 101 part 6
JBoss Fuse Workshop 101 part 6JBoss Fuse Workshop 101 part 6
JBoss Fuse Workshop 101 part 6
 
JBoss Fuse Workshop 101 part 5
JBoss Fuse Workshop 101 part 5JBoss Fuse Workshop 101 part 5
JBoss Fuse Workshop 101 part 5
 
JBoss Fuse Workshop 101 part 4
JBoss Fuse Workshop 101 part 4JBoss Fuse Workshop 101 part 4
JBoss Fuse Workshop 101 part 4
 
JBoss Fuse Workshop 101 part 3
JBoss Fuse Workshop 101 part 3JBoss Fuse Workshop 101 part 3
JBoss Fuse Workshop 101 part 3
 
JBoss Fuse Workshop 101 part 2
JBoss Fuse Workshop 101 part 2JBoss Fuse Workshop 101 part 2
JBoss Fuse Workshop 101 part 2
 
Jboss Fuse Workshop 101 part 1
Jboss Fuse Workshop 101 part 1Jboss Fuse Workshop 101 part 1
Jboss Fuse Workshop 101 part 1
 

Dernier

Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...
Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...
Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...
Lisi Hocke
 
Abortion Clinic in Midrand [(+27832195400*)]🏥Safe Abortion Pills In Midrand |...
Abortion Clinic in Midrand [(+27832195400*)]🏥Safe Abortion Pills In Midrand |...Abortion Clinic in Midrand [(+27832195400*)]🏥Safe Abortion Pills In Midrand |...
Abortion Clinic in Midrand [(+27832195400*)]🏥Safe Abortion Pills In Midrand |...
Medical / Health Care (+971588192166) Mifepristone and Misoprostol tablets 200mg
 
Abortion Pill Prices Jane Furse ](+27832195400*)[🏥Women's Abortion Clinic in ...
Abortion Pill Prices Jane Furse ](+27832195400*)[🏥Women's Abortion Clinic in ...Abortion Pill Prices Jane Furse ](+27832195400*)[🏥Women's Abortion Clinic in ...
Abortion Pill Prices Jane Furse ](+27832195400*)[🏥Women's Abortion Clinic in ...
Medical / Health Care (+971588192166) Mifepristone and Misoprostol tablets 200mg
 

Dernier (20)

A Deep Dive into Secure Product Development Frameworks.pdf
A Deep Dive into Secure Product Development Frameworks.pdfA Deep Dive into Secure Product Development Frameworks.pdf
A Deep Dive into Secure Product Development Frameworks.pdf
 
Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...
Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...
Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...
 
Abortion Clinic in Bloemfontein [(+27832195400*)]🏥Safe Abortion Pills In Bloe...
Abortion Clinic in Bloemfontein [(+27832195400*)]🏥Safe Abortion Pills In Bloe...Abortion Clinic in Bloemfontein [(+27832195400*)]🏥Safe Abortion Pills In Bloe...
Abortion Clinic in Bloemfontein [(+27832195400*)]🏥Safe Abortion Pills In Bloe...
 
Abortion Pill Prices Germiston ](+27832195400*)[ 🏥 Women's Abortion Clinic in...
Abortion Pill Prices Germiston ](+27832195400*)[ 🏥 Women's Abortion Clinic in...Abortion Pill Prices Germiston ](+27832195400*)[ 🏥 Women's Abortion Clinic in...
Abortion Pill Prices Germiston ](+27832195400*)[ 🏥 Women's Abortion Clinic in...
 
Abortion Pill Prices Jane Furse ](+27832195400*)[ 🏥 Women's Abortion Clinic i...
Abortion Pill Prices Jane Furse ](+27832195400*)[ 🏥 Women's Abortion Clinic i...Abortion Pill Prices Jane Furse ](+27832195400*)[ 🏥 Women's Abortion Clinic i...
Abortion Pill Prices Jane Furse ](+27832195400*)[ 🏥 Women's Abortion Clinic i...
 
Abortion Clinic In Pongola ](+27832195400*)[ 🏥 Safe Abortion Pills In Pongola...
Abortion Clinic In Pongola ](+27832195400*)[ 🏥 Safe Abortion Pills In Pongola...Abortion Clinic In Pongola ](+27832195400*)[ 🏥 Safe Abortion Pills In Pongola...
Abortion Clinic In Pongola ](+27832195400*)[ 🏥 Safe Abortion Pills In Pongola...
 
Abortion Clinic In Stanger ](+27832195400*)[ 🏥 Safe Abortion Pills In Stanger...
Abortion Clinic In Stanger ](+27832195400*)[ 🏥 Safe Abortion Pills In Stanger...Abortion Clinic In Stanger ](+27832195400*)[ 🏥 Safe Abortion Pills In Stanger...
Abortion Clinic In Stanger ](+27832195400*)[ 🏥 Safe Abortion Pills In Stanger...
 
Evolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI EraEvolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI Era
 
Abortion Pill Prices Polokwane ](+27832195400*)[ 🏥 Women's Abortion Clinic in...
Abortion Pill Prices Polokwane ](+27832195400*)[ 🏥 Women's Abortion Clinic in...Abortion Pill Prices Polokwane ](+27832195400*)[ 🏥 Women's Abortion Clinic in...
Abortion Pill Prices Polokwane ](+27832195400*)[ 🏥 Women's Abortion Clinic in...
 
Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...
Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...
Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...
 
Test Automation Design Patterns_ A Comprehensive Guide.pdf
Test Automation Design Patterns_ A Comprehensive Guide.pdfTest Automation Design Patterns_ A Comprehensive Guide.pdf
Test Automation Design Patterns_ A Comprehensive Guide.pdf
 
From Theory to Practice: Utilizing SpiraPlan's REST API
From Theory to Practice: Utilizing SpiraPlan's REST APIFrom Theory to Practice: Utilizing SpiraPlan's REST API
From Theory to Practice: Utilizing SpiraPlan's REST API
 
Weeding your micro service landscape.pdf
Weeding your micro service landscape.pdfWeeding your micro service landscape.pdf
Weeding your micro service landscape.pdf
 
Effective Strategies for Wix's Scaling challenges - GeeCon
Effective Strategies for Wix's Scaling challenges - GeeConEffective Strategies for Wix's Scaling challenges - GeeCon
Effective Strategies for Wix's Scaling challenges - GeeCon
 
Abortion Clinic in Midrand [(+27832195400*)]🏥Safe Abortion Pills In Midrand |...
Abortion Clinic in Midrand [(+27832195400*)]🏥Safe Abortion Pills In Midrand |...Abortion Clinic in Midrand [(+27832195400*)]🏥Safe Abortion Pills In Midrand |...
Abortion Clinic in Midrand [(+27832195400*)]🏥Safe Abortion Pills In Midrand |...
 
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
 
[GRCPP] Introduction to concepts (C++20)
[GRCPP] Introduction to concepts (C++20)[GRCPP] Introduction to concepts (C++20)
[GRCPP] Introduction to concepts (C++20)
 
Abortion Pill Prices Jane Furse ](+27832195400*)[🏥Women's Abortion Clinic in ...
Abortion Pill Prices Jane Furse ](+27832195400*)[🏥Women's Abortion Clinic in ...Abortion Pill Prices Jane Furse ](+27832195400*)[🏥Women's Abortion Clinic in ...
Abortion Pill Prices Jane Furse ](+27832195400*)[🏥Women's Abortion Clinic in ...
 
Novo Nordisk: When Knowledge Graphs meet LLMs
Novo Nordisk: When Knowledge Graphs meet LLMsNovo Nordisk: When Knowledge Graphs meet LLMs
Novo Nordisk: When Knowledge Graphs meet LLMs
 
Navigation in flutter – how to add stack, tab, and drawer navigators to your ...
Navigation in flutter – how to add stack, tab, and drawer navigators to your ...Navigation in flutter – how to add stack, tab, and drawer navigators to your ...
Navigation in flutter – how to add stack, tab, and drawer navigators to your ...
 

ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, PostgreSQL, Redpanda, Debezium, and Benthos to master building advanced real-time data pipelines.