Akka, Spark or Kafka? Selecting The Right Streaming Engine For the Job
Check out these resources:
Dean’s book
Webinars
etc.
Fast Data Architectures for Streaming Applications
Getting Answers Now from Data Sets that Never End
By Dean Wampler, Ph.D., VP of Fast Data Engineering
lightbend.com/products/fast-data-platform
Streaming Engines in Context…
Classic Batch Architecture:
[Diagram: Hadoop. Inputs: logs (via Flume) and databases (via Sqoop). Batch jobs: MapReduce, Spark, …. YARN: Resource Manager and Node Managers (NM). Persistence: S3, HDFS, disks, SQL/NoSQL, search.]
New Streaming, “Fast Data” Architecture
(but it also supports batch)
[Diagram: the fast data architecture, with components numbered 1 to 11. Inputs: logs, sockets, REST. A Kafka cluster, with a ZooKeeper (ZK) cluster. Microservices: RP, Go, Node.js, …. Low latency: Flink, Kafka Streams, Akka Streams, Beam, …. Low latency and mini-batch: Spark Streaming. Batch: Spark, …. Persistence: S3, HDFS, disks, SQL/NoSQL, search. All running on Mesos, Kubernetes, YARN, …, in the cloud, on premise, ….]
• Why Kafka?
Before: producers (Service 1, Service 2, Service 3, log and other files, the Internet) connect directly to consumers (services): N * M links.
After: producers and consumers each connect only to Kafka: N + M links.
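The difference between N * M and N + M links can be made concrete with a little arithmetic. The following is a minimal sketch, assuming N producers and M consumers; the object and method names are hypothetical and are not part of any Kafka API.

```scala
// Hypothetical helper names, for illustration only; not from the slides or from Kafka.
object LinkCount {
  /** Before: every producer connects directly to every consumer. */
  def pointToPoint(producers: Int, consumers: Int): Int =
    producers * consumers

  /** After: every producer and every consumer connects only to the broker. */
  def viaBroker(producers: Int, consumers: Int): Int =
    producers + consumers

  def main(args: Array[String]): Unit = {
    val (n, m) = (5, 10)
    println(s"direct: ${pointToPoint(n, m)} links, via broker: ${viaBroker(n, m)} links")
  }
}
```

With 5 producers and 10 consumers, that is 50 direct links versus 15 links through the broker, and the gap widens as either side grows.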
Streaming Engines
Features to Consider
• Low latency? How low?
• High volume? How high?
• Which kinds of data processing, analytics?
• Process data in bulk or individually?
• Bulk processing of records?
• Individual processing of events?
• Preferred application architecture?
• Low latency? How low?
• Real real time? Picoseconds to microseconds
www.spacex.com/news
• < 100 microseconds?
tradinghub.co/watch-list-for-mar-26th-2015/
www.usa.philips.com/
• < 10 milliseconds?
money.cnn.com/2017/05/12/pf/credit-card-mistakes/index.html
• < 100s of milliseconds?
github.com/keen/dashboards
coursera.org/learn/machine-learning
• < 1 second to minutes
[Diagrams: ETL (logs; Kafka raw logs and parsed logs topics; Kafka Streams job) and model training (data storage, model training, model serving, other logic).]
• > 1 minute?
• Use short batch jobs
• High volume? How high?
• < 10K-100K per second?
drdobbs.com/web-development/soa-web-services-and-restful-systems/199902676
• > 1M per second?
https://store.nest.com/product/thermostat/
• Which kinds of data processing, analytics?
• SQL?
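-- Hyphenated names must be quoted (backticks in Spark SQL).
SELECT COUNT(*)
FROM `my-iot-data`
GROUP BY `zip-code`
val input = spark.readStream
  .format("parquet")
  .schema(iotSchema) // file sources need an explicit schema; iotSchema is a placeholder, not from the slide
  .load("my-iot-data")
input.groupBy("zip-code")
  .count()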
• “Dataflow”?
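import org.apache.spark.SparkContext

val sc = new SparkContext("local[*]", "Inverted Idx")
sc.textFile("data/crawl")
  .map { line =>
    // Each line is "path<TAB>text"; split it into at most two fields.
    val Array(path, text) = line.split("\t", 2)
    (path, text)
  }.flatMap {
    // Emit one (word, path) pair per word in the text.
    case (path, text) => text.split("""\W+""").map((_, path))
  }.map {
    case (w, p) => ((w, p), 1)
  }.reduceByKey {
    // Sum the counts for each (word, path) pair.
    case (n1, n2) => n1 + n2
  }
// … (the slide is cut off here, at the start of a further map step)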
• ETL?
[Diagram: ETL. Logs; Kafka (raw logs topic, parsed logs topic); Kafka Streams job.]
• Train and serve ML models?
[Diagram: data storage, model training, model serving, other logic.]
• Process data in bulk or individually?
• Individual events (i.e., complex event processing, CEP).
• Records in bulk (i.e., each datum’s identity is unimportant).
[Diagrams: for individual events, microservices in which a router actor dispatches events to service actors (Service Actor 1, Service Actor 2, …); for bulk records, the SQL query shown earlier (SELECT COUNT(*) … GROUP BY zip-code).]
• Preferred application architecture?
• Streaming library in an app?
• Distributed services running your job?
[Diagram: Mini-batch: Spark Streaming. Low latency: Flink, Kafka Streams, Akka Streams, Beam, ….]
Best of Breed Streaming Engines
[Diagram: the Kafka cluster, the streaming engines (Spark Streaming, Flink, Kafka Streams, Akka Streams, Beam, …), batch Spark, and the persistence layer from the fast data architecture.]
• Apache Beam
• (Formerly Google Dataflow)
• Define your flows; run with Flink, Spark, etc.
• Beam is defining the state of the art for streaming semantics
• Apache Flink
• Low-latency streaming
• Best Beam runner
• SQL, ML, etc.
• Apache Spark
• Best known; large community
• Batch, mini-batch, and new low-latency streaming
• SQL, ML, etc.
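The distinction between record-at-a-time and mini-batch processing can be sketched with plain Scala iterators. This is an illustration only, not Spark code: the names are hypothetical, and the batches here are cut by record count as a simplifying assumption, whereas Spark Streaming's mini-batches are cut by time interval.

```scala
// Hypothetical helpers, for illustration only; not part of the Spark API.
object MiniBatchSketch {
  /** Record-at-a-time: apply f to each record as it arrives. */
  def perRecord[A, B](records: Iterator[A])(f: A => B): Iterator[B] =
    records.map(f)

  /** Mini-batch: collect up to batchSize records, then process each batch as one unit. */
  def perBatch[A, B](records: Iterator[A], batchSize: Int)(f: Seq[A] => B): Iterator[B] =
    records.grouped(batchSize).map(batch => f(batch))

  def main(args: Array[String]): Unit = {
    println(perRecord(Iterator(1, 2, 3, 4, 5))(_ * 10).toList)
    println(perBatch(Iterator(1, 2, 3, 4, 5), batchSize = 2)(_.sum).toList)
  }
}
```

Batching amortizes per-invocation overhead across many records, at the cost of waiting for a batch to fill, which is why the mini-batch engines sit in a higher latency tier than the record-at-a-time ones.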
• Akka Streams
• Low-latency streaming
• Rich dataflow language
• Rich APIs for microservices, data sources and sinks
• Excellent for model serving
• Kafka Streams
• Read, write Kafka topics
• Stream and Table abstractions
• SQL on streams
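The relationship between the stream and table abstractions can be sketched with plain Scala collections. This is deliberately not the Kafka Streams API; the object, the `Event` alias, and both functions are hypothetical names chosen for illustration. The idea shown is that a table is what you get by replaying a stream of keyed updates, and that an aggregation over a stream yields a stream of updated results.

```scala
// Plain Scala collections standing in for a stream and a table.
// This is NOT the Kafka Streams API; all names are hypothetical.
object StreamTableSketch {
  type Event = (String, Int) // (key, new value)

  /** Table view: replay the stream, keeping only the latest value per key. */
  def toTable(stream: Seq[Event]): Map[String, Int] =
    stream.foldLeft(Map.empty[String, Int]) { case (table, (k, v)) => table + (k -> v) }

  /** Stream view of an aggregation: emit the running count for a key after each event. */
  def runningCounts(stream: Seq[Event]): Seq[(String, Int)] =
    stream
      .scanLeft(Map.empty[String, Int]) { case (counts, (k, _)) =>
        counts + (k -> (counts.getOrElse(k, 0) + 1))
      }
      .tail          // drop the initial empty state
      .zip(stream)   // pair each state with the event that produced it
      .map { case (counts, (k, _)) => k -> counts(k) }

  def main(args: Array[String]): Unit = {
    val updates = Seq(("a", 1), ("b", 5), ("a", 3))
    println(toTable(updates))
    println(runningCounts(updates))
  }
}
```

For the three updates in `main`, the table ends with the latest value per key (`a -> 3`, `b -> 5`), while the running counts form a new stream of per-key results.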
• Spark or Flink?
• Best for massive data sets
• Rich analytics
• Akka Streams or Kafka Streams?
• Best for microservice integration
• Wider flexibility
For more information on Lightbend Fast Data Platform:
lightbend.com/fast-data-platform

 
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming ApplicationsRunning Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming ApplicationsLightbend
 
Designing Events-First Microservices For A Cloud Native World
Designing Events-First Microservices For A Cloud Native WorldDesigning Events-First Microservices For A Cloud Native World
Designing Events-First Microservices For A Cloud Native WorldLightbend
 
Scala Security: Eliminate 200+ Code-Level Threats With Fortify SCA For Scala
Scala Security: Eliminate 200+ Code-Level Threats With Fortify SCA For ScalaScala Security: Eliminate 200+ Code-Level Threats With Fortify SCA For Scala
Scala Security: Eliminate 200+ Code-Level Threats With Fortify SCA For ScalaLightbend
 
How To Build, Integrate, and Deploy Real-Time Streaming Pipelines On Kubernetes
How To Build, Integrate, and Deploy Real-Time Streaming Pipelines On KubernetesHow To Build, Integrate, and Deploy Real-Time Streaming Pipelines On Kubernetes
How To Build, Integrate, and Deploy Real-Time Streaming Pipelines On KubernetesLightbend
 

Akka, Spark or Kafka? Selecting The Right Streaming Engine For the Job

  • 2. Fast Data Architectures for Streaming Applications: Getting Answers Now from Data Sets that Never End. By Dean Wampler, Ph.D., VP of Fast Data Engineering. Check out these resources: Dean’s book, webinars, etc. lightbend.com/products/fast-data-platform
  • 3. Streaming Engines in Context…
  • 10. New Streaming, “Fast Data” Architecture (but it also supports batch)
  • 11-15. [Architecture diagram, numbered callouts 1 to 11] Cluster managers: Mesos, Kubernetes, YARN, … (cloud, on premise, …). Sources: logs, sockets, REST. ZooKeeper cluster (ZK). Kafka cluster. Microservices: RP, Go, Node.js, … Low latency and mini-batch: Spark Streaming. Batch: Spark, … Low latency: Flink, Kafka Streams, Akka Streams, Beam, … Persistence: S3, HDFS, disk, SQL/NoSQL, search.
  • 16. • Why Kafka? Before: producers and consumers (Service 1, Service 2, Service 3, log and other files, Internet, other services) are wired point to point, giving N * M links. After: producers and consumers connect only to Kafka, giving N + M links.
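The link arithmetic on slide 16 can be checked with a small sketch. The object and method names are hypothetical, and the point-to-point case assumes that every producer talks to every consumer.

```scala
// Hypothetical helper, not from the slides: counts integration links
// between N producers and M consumers.
object KafkaLinks {
  // Point to point: every producer connects to every consumer.
  def direct(producers: Int, consumers: Int): Int = producers * consumers

  // Through a shared log such as Kafka: each service holds one connection.
  def viaBroker(producers: Int, consumers: Int): Int = producers + consumers

  def main(args: Array[String]): Unit = {
    val (n, m) = (4, 3)
    println(s"direct: ${direct(n, m)} links, via broker: ${viaBroker(n, m)} links")
  }
}
```

With 4 producers and 3 consumers this gives 12 direct links against 7 through the broker, and the gap widens as either side grows.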
  • 17-19. [Architecture diagram repeated from slides 11-15]
  • 22. • Low latency? How low? • High volume? How high? • Which kinds of data processing, analytics? • Process data in bulk or individually? (Bulk processing of records? Individual processing of events?) • Preferred application architecture?
  • 23. • Low latency? How low? www.spacex.com/news
  • 24. • Low latency? How low? • Real real time? pico- to microseconds www.spacex.com/news
  • 25. • Low latency? How low? • < 100 microseconds? tradinghub.co/watch-list-for-mar-26th-2015/ www.usa.philips.com/
  • 26. • Low latency? How low? • < 10 milliseconds? money.cnn.com/2017/05/12/pf/credit-card-mistakes/index.html
  • 27. • Low latency? How low? • < 100s milliseconds? github.com/keen/dashboards coursera.org/learn/machine-learning
  • 28. • Low latency? How low? • < 1 second to minutes [Diagrams: ETL (logs, Kafka raw logs topic, Kafka Streams job, Kafka parsed logs topic); model training and serving (data, storage, model training, model serving, other logic)]
  • 29. • Low latency? How low? • > 1 minute? • Use short batch jobs
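The latency tiers on slides 24 to 29 can be read as a simple lookup. The sketch below is only an illustration: the tier boundaries come from the slides, but the object name, the labels, and the choice of 1 second as the upper bound for "hundreds of milliseconds" are assumptions.

```scala
import scala.concurrent.duration._

// Hypothetical classifier for a latency budget. Boundaries follow the
// slides; names and the 1-second cutoff are assumptions.
object LatencyTiers {
  def tier(budget: FiniteDuration): String =
    if (budget < 100.micros) "under 100 microseconds"
    else if (budget < 10.millis) "under 10 milliseconds"
    else if (budget < 1.second) "hundreds of milliseconds"
    else if (budget <= 1.minute) "1 second to minutes"
    else "over 1 minute: use short batch jobs"
}
```

For example, `LatencyTiers.tier(5.millis)` falls in the "under 10 milliseconds" tier, while a 2-minute budget lands in the tier where the slides suggest short batch jobs.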
  • 30. • High volume? How high?
  • 31. • High volume? How high? • < 10K-100K per second? drdobbs.com/web-development/soa-web-services-and-restful-systems/199902676
  • 32. • High volume? How high? • > 1M per second? https://store.nest.com/product/thermostat/
  • 33. • Which kinds of data processing, analytics? • SQL?
        SELECT COUNT(*) FROM my-iot-data GROUP BY zip-code
        val input = spark.readStream.format("parquet").load("my-iot-data")
        input.groupBy("zip-code").count()
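The count-per-zip-code query on slide 33 can be tried without a Spark cluster. The sketch below uses plain Scala collections as a stand-in for Spark; the `Reading` type and the sample values are hypothetical.

```scala
// Plain Scala collections stand in for Spark here; the Reading type and
// the sample values are hypothetical.
object ZipCodeCounts {
  final case class Reading(deviceId: String, zipCode: String)

  // Same idea as SELECT COUNT(*) ... GROUP BY zip-code:
  // group the records by key, then count each group.
  def countByZip(readings: Seq[Reading]): Map[String, Int] =
    readings.groupBy(_.zipCode).map { case (zip, rs) => zip -> rs.size }

  def main(args: Array[String]): Unit = {
    val sample = Seq(
      Reading("a", "60601"),
      Reading("b", "60601"),
      Reading("c", "94105")
    )
    countByZip(sample).toSeq.sorted.foreach { case (zip, n) => println(s"$zip: $n") }
  }
}
```

On a stream the same aggregation has to be maintained incrementally as records arrive, which is the part an engine such as Spark takes care of.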
  • 34. • Which kinds of data processing, analytics? • “Dataflow”?
        import org.apache.spark.SparkContext
        val sc = new SparkContext("local[*]", "Inverted Idx")
        sc.textFile("data/crawl")
          // Each line is "path<TAB>text".
          .map { line => val Array(path, text) = line.split("\t", 2); (path, text) }
          // Emit one (word, path) pair per word occurrence.
          .flatMap { case (path, text) => text.split("""\W+""").map((_, path)) }
          .map { case (w, p) => ((w, p), 1) }
          // Count occurrences of each word in each path.
          .reduceByKey { case (n1, n2) => n1 + n2 }
          .map { …
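The inverted-index dataflow on slide 34 can be followed end to end on in-memory collections. This is a sketch, not the author's code: it swaps the RDD for a `Seq`, replaces `reduceByKey` with `groupBy` plus a size count, and the final grouping by word is an assumption, because the slide's code is cut off before that step.

```scala
// The slide's pipeline on in-memory collections instead of an RDD.
// The final grouping by word is an assumption (the slide is truncated).
object InvertedIndex {
  def build(lines: Seq[String]): Map[String, Seq[(String, Int)]] =
    lines
      .map { line =>
        // Each line is "path<TAB>text"; malformed lines throw a MatchError.
        line.split("\t", 2) match { case Array(path, text) => (path, text) }
      }
      .flatMap { case (path, text) =>
        // Emit one (word, path) pair per word occurrence.
        text.split("""\W+""").filter(_.nonEmpty).map(word => (word, path)).toSeq
      }
      .groupBy(identity) // stands in for reduceByKey on ((word, path), 1)
      .map { case ((word, path), hits) => (word, path, hits.size) }
      .toSeq
      .groupBy(_._1)
      .map { case (word, entries) =>
        word -> entries.map { case (_, path, n) => (path, n) }.sortBy(_._1)
      }
}
```

Calling `InvertedIndex.build` on two lines such as `"a.txt\thello world hello"` and `"b.txt\thello"` maps each word to the documents that contain it, together with a per-document count.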
  • 35. • Which kinds of data processing, analytics? • ETL? [Diagram: logs, Kafka (raw logs topic, parsed logs topic), Kafka Streams job]
  • 36. • Which kinds of data processing, analytics? • Train and serve ML models? [Diagram: data, storage, model training, model serving, other logic]
  • 37. • Process data in bulk or individually? • Individual events (i.e., CEP). • Records in bulk (i.e., each datum’s identity unimportant). [Diagrams: microservices sending events through a router actor to service actors (Service Actor 1, Service Actor 2, …); SELECT COUNT(*) FROM my-iot-data GROUP BY zip-code]
  • 38. • Preferred application architecture • Streaming library in an app? • Distributed services running your job? [Diagram: mini-batch (Spark Streaming); low latency (Flink, Kafka Streams, Akka Streams, Beam, …)]
  • 39. Best of Breed Streaming Engines
  • 40. [Architecture diagram: Spark, Spark Streaming, Flink, Kafka Streams, Akka Streams, Beam, the Kafka cluster, and persistence (S3, HDFS, disk, SQL/NoSQL, search)]
  • 41. • Apache Beam • (Formerly Google Dataflow) • Define your flows; run with Flink, Spark, etc. • Beam is defining the state of the art for streaming semantics
  • 43. • Apache Flink • Low-latency streaming • Best Beam runner • SQL, ML, etc.
  • 44. • Apache Spark • Best known; large community • Batch, mini-batch, and new low-latency streaming • SQL, ML, etc.
  • 45. • Akka Streams • Low-latency streaming • Rich dataflow language • Rich APIs for microservices, data sources and sinks • Excellent for model serving
  • 46. • Kafka Streams • Read, write Kafka topics • Stream and Table abstractions • SQL on streams
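The "Stream and Table abstractions" bullet on slide 46 is easier to picture with a toy model. The sketch below does not use the Kafka Streams API; it assumes a stream is an ordered sequence of keyed updates and a table is the latest value per key after replaying them. All names are hypothetical.

```scala
// Hypothetical model of the stream/table relationship, using plain
// collections rather than the Kafka Streams API.
object StreamTable {
  // A stream record is a (key, value) update.
  type Record = (String, Int)

  // Replaying the stream in order and keeping the latest value per key
  // yields the table view.
  def toTable(stream: Seq[Record]): Map[String, Int] =
    stream.foldLeft(Map.empty[String, Int]) { case (table, (key, value)) =>
      table.updated(key, value)
    }
}
```

Replaying `Seq("a" -> 1, "b" -> 2, "a" -> 3)` leaves `a` at 3 and `b` at 2: the stream keeps every update, while the table keeps only the current state.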
  • 47. • Spark or Flink? • Best for massive data sets • Rich analytics • Akka Streams or Kafka Streams • Best for microservice integration • Wider flexibility
  • 49. For more information on Lightbend Fast Data Platform: lightbend.com/fast-data-platform