SlideShare une entreprise Scribd logo
1  sur  61
1Confidential
Apache Kafka + Machine Learning
Analytic Models Applied to Real Time Stream Processing
Kai Waehner
Technology Evangelist
kontakt@kai-waehner.de
LinkedIn
@KaiWaehner
www.kai-waehner.de
2Apache Kafka and Machine Learning
Agenda
1) Machine Learning in the Real World
2) Building an Analytic Model
3) Applying an Analytic Model in Real Time
4) Online Training of Models
3Apache Kafka and Machine Learning
Agenda
1) Machine Learning in the Real World
2) Building an Analytic Model
3) Applying an Analytic Model in Real Time
4) Online Training of Models
4Apache Kafka and Machine Learning
Machine Learning
... allows computers to find hidden insights without being
explicitly programmed where to look.
5Apache Kafka and Machine Learning
Real World Examples of Machine Learning
Spam Detection
Search Results +
Product Recommendation
Picture Detection
(Friends, Locations, Products)
Your Company
The Next Disruption:
Google Beats Go Champion
6Apache Kafka and Machine Learning
Leverage Machine Learning to Analyze and Act on Critical Business Moments
Seconds Minutes Hours
Price
Optimization
Predictive
Maintenance
Fraud
Detection
Cross
Selling
Transportation
Rerouting
Customer
Service
Inventory
Management
Windows of Opportunity
7Apache Kafka and Machine Learning
How to realize
these use cases?
8Apache Kafka and Machine Learning
Big Data Analytics
Volume
(terabytes,
petabytes)
Variety
(social networks,
blog posts, logs,
sensors, etc.)
Velocity
(„real time“)
Value
9Apache Kafka and Machine Learning
Big Data Analytics for Actionable Insights
From Insight to Action
(continuously closed loop)
10Apache Kafka and Machine Learning
Streaming Platform
Big Data Analytics
Database
IoT Device
Streaming
Producer
…..
DWH
Data
Integration
C
O
N
N
E
C
T
C
O
N
N
E
C
T
Data	Lake
Model
Building
Batch
Real
Time
Stream
Processing
REST
Interface
IoT Device
Mobile App
Streaming
Consumer
C
O
N
N
E
C
T
C
O
N
N
E
C
T
BI Tool
Messaging
Web
Application
Model
Schema Registry
/ Governance
1) Data Producer
2) Analytics Platform
3) Streaming Platform
4) Data Consumer
11Apache Kafka and Machine Learning
Agenda
1) Machine Learning in the Real World
2) Building an Analytic Model
3) Applying an Analytic Model in Real Time
4) Online Training of Models
12Apache Kafka and Machine Learning
Streaming Platform
Big Data Analytics
Database
IoT Device
Streaming
Producer
…..
DWH
Data
Integration
C
O
N
N
E
C
T
C
O
N
N
E
C
T
Data	Lake
Model
Building
Batch
Real
Time
Stream
Processing
REST
Interface
IoT Device
Mobile App
Streaming
Consumer
C
O
N
N
E
C
T
C
O
N
N
E
C
T
BI Tool
Messaging
Web
Application
Model
Schema Registry
/ Governance
1) Data Producer
2) Analytics Platform
3) Streaming Platform
4) Data Consumer
13Apache Kafka and Machine Learning
Hidden Technical Debt in Machine Learning Systems
https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
Writing
source code
is not the
time-consuming
task!
!
14Apache Kafka and Machine Learning
Analytical Pipeline
1. Data Access
2. Data Preparation
3. Exploratory Data Analysis
4. Model Building
5. Model Execution
6. Model Validation
7. Deployment
15Apache Kafka and Machine Learning
Data Access
Find insights to create
added business value
by correlating
various data sources!
16Apache Kafka and Machine Learning
Data Preparation
http://www.slideshare.net/odsc/feature-engineering
Data Preparation
17Apache Kafka and Machine Learning
Exploratory Data Analysis
© Copyright 2000-2017 TIBCO Software Inc.
• Scripting
• Visual Analytics
• Machine Learning
18Apache Kafka and Machine Learning
Model Building
A model is a simplification of the truth
that helps you with decision making.
19Apache Kafka and Machine Learning
Model Execution (Coding)
Apply Model
to New Data
20Apache Kafka and Machine Learning
Model Execution (Tooling)
Apply Model
to New Data
21Apache Kafka and Machine Learning
Model Validation
https://genome.tugraz.at/proclassify/help/pages/XV.html
Cross-Validation
Procedure
22Apache Kafka and Machine Learning
Frameworks
and Tooling?
23Apache Kafka and Machine Learning
Languages, Frameworks and Tools
Many more ….
Portable Format
for Analytics (PFA)
24Apache Kafka and Machine Learning
Live Demos with Open Source Technologies
Development of Analytic Models
with R, TensorFlow, Apache Spark, H2O.ai, RapidMiner
25Apache Kafka and Machine Learning
Live Demo
Use Case:
Customer Churn Prediction
Machine Learning Algorithm:
Generalized Linear Model (GLM)
using Logistic Regression
Technology:
Open Source R
26Apache Kafka and Machine Learning
Live Demo
Use Case:
Airline Flight Delay Prediction
Machine Learning Algorithm:
Gradient Boosted Machines (GBM)
using Decision Trees
Technology:
H2O.ai
27Apache Kafka and Machine Learning
Live Demo
Use Case:
Predictive Maintenance
(Anomaly Detection in Telco Networks)
Deep Learning Algorithm:
Artificial Neural Networks (ANN)
using Autoencoders
Technology:
TensorFlow + Python API
28Apache Kafka and Machine Learning
Live Demo
Use Case:
Classification
(Prediction of Titanic Survivors)
Deep Learning Algorithm:
Recurrent Neural Networks (RNN)
Technology:
RapidMiner
29Apache Kafka and Machine Learning
Agenda
1) Machine Learning in the Real World
2) Building an Analytic Model
3) Applying an Analytic Model in Real Time
4) Online Training of Models
30Apache Kafka and Machine Learning
Analytical Pipeline
1. Data Access
2. Data Preparation
3. Exploratory Data Analysis
4. Model Building
5. Model Execution
6. Model Validation
7. Deployment
31Apache Kafka and Machine Learning
Streaming Platform
Big Data Analytics
Database
IoT Device
Streaming
Producer
…..
DWH
Data
Integration
C
O
N
N
E
C
T
C
O
N
N
E
C
T
Data	Lake
Model
Building
Batch
Real
Time
Stream
Processing
REST
Interface
IoT Device
Mobile App
Streaming
Consumer
C
O
N
N
E
C
T
C
O
N
N
E
C
T
BI Tool
Messaging
Web
Application
Model
Schema Registry
/ Governance
1) Data Producer
2) Analytics Platform
3) Streaming Platform
4) Data Consumer
32Apache Kafka and Machine Learning
Definition of Stream Processsing
Data at Rest Data in Motion
33Apache Kafka and Machine Learning
Key Concepts
34Apache Kafka and Machine Learning
Key Concepts
35Apache Kafka and Machine Learning
Key Concepts
36Apache Kafka and Machine Learning
Stream Processing
Use Cases
• Real Time Applications
• Stateful Streaming Analytics
• Stateless “Real Time ETL”
37Apache Kafka and Machine Learning
Event Processing Windows
Various Options for Windowing (Fixed, Sliding, Session, …)
38Apache Kafka and Machine Learning
How to
apply analytic models
to real time processing
without redevelopment?
39Apache Kafka and Machine Learning
Application of Analytic Models to Real Time without Redevelopment
Stream
Processing
H20.ai
R
Python
Spark ML
MATLAB
SAS
PMML
40Apache Kafka and Machine Learning
Streaming Analytics - Processing Pipeline
APIs
Adapters /
Channels
Integration
Messaging
Stream
Ingest
Transformation
Aggregation
Enrichment
Filtering
Stream
Preprocessing
Process
Management
Analytics
(Real Time)
Applications
& APIs
Analytics /
DW Reporting
Stream
Outcomes
• Contextual Rules
• Windowing
• Patterns
• Analytics
• Machine Learning
• …
Stream
Analytics
Index / SearchNormalization
Applying an Analytic Model
is just a piece of the puzzle!
41Apache Kafka and Machine Learning
Frameworks
and Tooling?
42Apache Kafka and Machine Learning
Frameworks and Products
OPEN SOURCE CLOSED SOURCE
PRODUCT
FRAMEWORK
Azure Microsoft
Stream Analytics
43Apache Kafka and Machine Learning
When to use Kafka Streams for Stream Processing?
44Apache Kafka and Machine Learning
When to use Kafka Streams for Stream Processing?
No need for a
Big Data cluster
Deploy in your
existing infrastructure
Kafka manages
scalability / fail-over
Focus on development
of business logic
in your department
45Apache Kafka and Machine Learning
Kafka Streams
Map, filter, aggregate,
apply analytic model,
„any business logic“
Input Stream
(Kafka Topic)
Kafka Cluster
Output Stream
(Kafka Topic)
Kafka Cluster
Stream Processing
Microservice
(Kafka Streams)
Deployed anywhere:
Docker, Kubernetes,
Mesos, Java App, …
46Apache Kafka and Machine Learning
A complete streaming microservices, ready for production at large-scale
Word
Count
App configuration
Define processing
(here: WordCount)
Start processing
47Apache Kafka and Machine Learning
Confluent Platform: the Free, Open-Source Streaming Platform
Open Source ExternalCommercial
Confluent Platform
Monitoring
Analytics
Custom Apps
Transformations
Real-time
Applications
…
CRM
Data Warehouse
Database
Hadoop
Data
Integration
…
Control Center
Auto-data
Balancing
Multi-Data
Center Replication
24/7 Support
Supported
Connectors
Clients
Schema
Registry
REST
Proxy
Apache Kafka
Kafka
Connect
Kafka
Streams
Kafka
Core
Database Changes Log Events loT Data Web Events …
48Apache Kafka and Machine Learning
Streaming Platform
Big Data Analytics
Database
IoT Device
Streaming
Producer
…..
DWH
Data
Integration
C
O
N
N
E
C
T
C
O
N
N
E
C
T
Data	Lake
Model
Building
Batch
Real
Time
Stream
Processing
REST
Interface
IoT Device
Mobile App
Streaming
Consumer
C
O
N
N
E
C
T
C
O
N
N
E
C
T
BI Tool
Messaging
Web
Application
Model
Schema Registry
/ Governance
1) Data Producer
2) Analytics Platform
3) Streaming Platform
4) Data Consumer
49Apache Kafka and Machine Learning
STREAMING PLATFORM
BIG DATAANALYTICS
Oracle DB
CoaP IoT
Kafka
Java Client
…..
HP Vertica
Data
Integration
F
L
U
M
E
H2O.ai,
Spark,
TensorFlow
Batch
Real
Time
Confluent
REST Proxy
MQTT IoT
iPhone App
Kafka
Go Client
C
K O
A N
F N
K E
A C
T
H
I
V
E
Grafana
Kafka
Java EE
Web App
Hadoop
C
K O
A N
F N
K E
A C
T
Confluent
Schema Registry
Kafka Streams
H2O.ai
Mesos
Kafka Streams
TensorFlow
Kubernetes
Avro
Avro
1) Data Producer
2) Analytics Platform
3) Streaming Platform
4) Data Consumer
50Apache Kafka and Machine Learning
Live Demos with Open Source Technologies
Development of Analytic Models
with Apache Kafka Messaging, Kafka Streams, Kafka Connect, Confluent Schema Registry
51Apache Kafka and Machine Learning
Live Demo
Use Case:
Airline Flight Delay Prediction
Machine Learning Algorithm:
Any! (in our example, H2O.ai GBM)
Streaming Platform:
Apache Kafka Core, Kafka Connect,
Kafka Streams, Confluent Schema Registry
52Apache Kafka and Machine Learning
H2O.ai Model + Kafka Streams
Filter
Map
1) Create H2O ML model
2) Configure Kafka Streams Application
3) Apply H2O ML model to Streaming Data
4) Start Kafka Streams App
53Apache Kafka and Machine Learning
End-to-End Stream Monitoring and Alerting
Confluent Control Center
Data Stream Monitoring and Alerting
Multi-cluster monitoring and management
Kafka Connect Configuration
• Message delivery?
• Delays?
• Where got it stuck?
• Lost messages?
• Broker issues?
• Performance?
http://docs.confluent.io/3.2.0/control-center/docs/monitoring.html
54Apache Kafka and Machine Learning
Agenda
1) Machine Learning in the Real World
2) Building an Analytic Model
3) Applying an Analytic Model in Real Time
4) Online Training of Models
55Apache Kafka and Machine Learning
Let’s improve
the analytic model
continuously…
56Apache Kafka and Machine Learning
Analytical Pipeline
1. Data Access
2. Data Preparation
3. Exploratory Data Analysis
4. Model Building
5. Model Execution
6. Model Validation
7. Deployment
Online
Training
Continuously train and improve the model with every new event
57Apache Kafka and Machine Learning
Online Model Training of Analytic Models
How to improve models?
1.Manual Update
2.Automated Batch
3.Real Time
58Apache Kafka and Machine Learning
STREAMING PLATFORM
BIG DATAANALYTICS
F
L
U
M
E
H2O.ai,
Spark,
TensorFlow
H
I
V
E
Kafka
Hadoop
Confluent
Schema Registry
Kafka Streams
H2O.ai
Mesos
Kafka Streams
TensorFlow
Kubernetes
Avro
Avro
1) Get new Input Event
via Kafka Topic
2) Improve Model in
Big Data Cluster
3) Update deployed Model
via Kafka Topic
4) Leverage
Improved Model
for new Events
59Apache Kafka and Machine Learning
Caveats for Online Model Training
• Processes and infrastructure not ready
• Validation needed before production
• Slows down the system
• Only a few ML implementations supported
• Many use cases do not need it
60Apache Kafka and Machine Learning
Key Take-Aways
Ø Insights are hidden in Historical Data on Big Data Platforms
Ø Machine Learning and Big Data Analytics find these Insights by building Analytics Models
Ø Streaming Platform uses these Models (without Redevelopment) to take Action in Real Time
61Apache Kafka and Machine Learning
Kai Waehner
Technology Evangelist
kontakt@kai-waehner.de
@KaiWaehner
www.kai-waehner.de
LinkedIn
Questions? Feedback?
Please contact me!

Contenu connexe

Tendances

Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used forAljoscha Krettek
 
Apache Kafka in Financial Services - Use Cases and Architectures
Apache Kafka in Financial Services - Use Cases and ArchitecturesApache Kafka in Financial Services - Use Cases and Architectures
Apache Kafka in Financial Services - Use Cases and ArchitecturesKai Wähner
 
Kappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKai Wähner
 
Top use cases for 2022 with Data in Motion and Apache Kafka
Top use cases for 2022 with Data in Motion and Apache KafkaTop use cases for 2022 with Data in Motion and Apache Kafka
Top use cases for 2022 with Data in Motion and Apache Kafkaconfluent
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Flink Forward
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeDatabricks
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKai Wähner
 
Design Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDesign Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDataWorks Summit
 
Data Streaming Ecosystem Management at Booking.com
Data Streaming Ecosystem Management at Booking.com Data Streaming Ecosystem Management at Booking.com
Data Streaming Ecosystem Management at Booking.com confluent
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Jean-Paul Azar
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiDatabricks
 
Deep Dive and Best Practices for Real Time Streaming Applications
Deep Dive and Best Practices for Real Time Streaming ApplicationsDeep Dive and Best Practices for Real Time Streaming Applications
Deep Dive and Best Practices for Real Time Streaming ApplicationsAmazon Web Services
 
Real-time Stream Processing with Apache Flink
Real-time Stream Processing with Apache FlinkReal-time Stream Processing with Apache Flink
Real-time Stream Processing with Apache FlinkDataWorks Summit
 
GCP for Apache Kafka® Users: Stream Ingestion and Processing
GCP for Apache Kafka® Users: Stream Ingestion and ProcessingGCP for Apache Kafka® Users: Stream Ingestion and Processing
GCP for Apache Kafka® Users: Stream Ingestion and Processingconfluent
 
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...Kai Wähner
 
Apache Kafka® Use Cases for Financial Services
Apache Kafka® Use Cases for Financial ServicesApache Kafka® Use Cases for Financial Services
Apache Kafka® Use Cases for Financial Servicesconfluent
 
Cloud-Native Apache Spark Scheduling with YuniKorn Scheduler
Cloud-Native Apache Spark Scheduling with YuniKorn SchedulerCloud-Native Apache Spark Scheduling with YuniKorn Scheduler
Cloud-Native Apache Spark Scheduling with YuniKorn SchedulerDatabricks
 
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...HostedbyConfluent
 

Tendances (20)

Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used for
 
Apache Kafka in Financial Services - Use Cases and Architectures
Apache Kafka in Financial Services - Use Cases and ArchitecturesApache Kafka in Financial Services - Use Cases and Architectures
Apache Kafka in Financial Services - Use Cases and Architectures
 
Kappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology Comparison
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Top use cases for 2022 with Data in Motion and Apache Kafka
Top use cases for 2022 with Data in Motion and Apache KafkaTop use cases for 2022 with Data in Motion and Apache Kafka
Top use cases for 2022 with Data in Motion and Apache Kafka
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid Cloud
 
Design Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDesign Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data Analytics
 
Data Streaming Ecosystem Management at Booking.com
Data Streaming Ecosystem Management at Booking.com Data Streaming Ecosystem Management at Booking.com
Data Streaming Ecosystem Management at Booking.com
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
 
Deep Dive and Best Practices for Real Time Streaming Applications
Deep Dive and Best Practices for Real Time Streaming ApplicationsDeep Dive and Best Practices for Real Time Streaming Applications
Deep Dive and Best Practices for Real Time Streaming Applications
 
Real-time Stream Processing with Apache Flink
Real-time Stream Processing with Apache FlinkReal-time Stream Processing with Apache Flink
Real-time Stream Processing with Apache Flink
 
GCP for Apache Kafka® Users: Stream Ingestion and Processing
GCP for Apache Kafka® Users: Stream Ingestion and ProcessingGCP for Apache Kafka® Users: Stream Ingestion and Processing
GCP for Apache Kafka® Users: Stream Ingestion and Processing
 
kafka
kafkakafka
kafka
 
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
 
Apache Kafka® Use Cases for Financial Services
Apache Kafka® Use Cases for Financial ServicesApache Kafka® Use Cases for Financial Services
Apache Kafka® Use Cases for Financial Services
 
Cloud-Native Apache Spark Scheduling with YuniKorn Scheduler
Cloud-Native Apache Spark Scheduling with YuniKorn SchedulerCloud-Native Apache Spark Scheduling with YuniKorn Scheduler
Cloud-Native Apache Spark Scheduling with YuniKorn Scheduler
 
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...
 

En vedette

Essential Capabilities of an IoT Cloud Platform - April 2017 AWS Online Tech ...
Essential Capabilities of an IoT Cloud Platform - April 2017 AWS Online Tech ...Essential Capabilities of an IoT Cloud Platform - April 2017 AWS Online Tech ...
Essential Capabilities of an IoT Cloud Platform - April 2017 AWS Online Tech ...Amazon Web Services
 
Get your mobile app in production in 3 months: Backend
Get your mobile app in production in 3 months: BackendGet your mobile app in production in 3 months: Backend
Get your mobile app in production in 3 months: BackendAckee
 
Apache Kafka + Apache Mesos + Kafka Streams - Highly Scalable Streaming Micro...
Apache Kafka + Apache Mesos + Kafka Streams - Highly Scalable Streaming Micro...Apache Kafka + Apache Mesos + Kafka Streams - Highly Scalable Streaming Micro...
Apache Kafka + Apache Mesos + Kafka Streams - Highly Scalable Streaming Micro...Kai Wähner
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...Spark Summit
 
Streaming platform Kafka in SK planet
Streaming platform Kafka in SK planetStreaming platform Kafka in SK planet
Streaming platform Kafka in SK planetByeongsu Kang
 
Spark machine learning & deep learning
Spark machine learning & deep learningSpark machine learning & deep learning
Spark machine learning & deep learninghoondong kim
 
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...Kai Wähner
 

En vedette (7)

Essential Capabilities of an IoT Cloud Platform - April 2017 AWS Online Tech ...
Essential Capabilities of an IoT Cloud Platform - April 2017 AWS Online Tech ...Essential Capabilities of an IoT Cloud Platform - April 2017 AWS Online Tech ...
Essential Capabilities of an IoT Cloud Platform - April 2017 AWS Online Tech ...
 
Get your mobile app in production in 3 months: Backend
Get your mobile app in production in 3 months: BackendGet your mobile app in production in 3 months: Backend
Get your mobile app in production in 3 months: Backend
 
Apache Kafka + Apache Mesos + Kafka Streams - Highly Scalable Streaming Micro...
Apache Kafka + Apache Mesos + Kafka Streams - Highly Scalable Streaming Micro...Apache Kafka + Apache Mesos + Kafka Streams - Highly Scalable Streaming Micro...
Apache Kafka + Apache Mesos + Kafka Streams - Highly Scalable Streaming Micro...
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
 
Streaming platform Kafka in SK planet
Streaming platform Kafka in SK planetStreaming platform Kafka in SK planet
Streaming platform Kafka in SK planet
 
Spark machine learning & deep learning
Spark machine learning & deep learningSpark machine learning & deep learning
Spark machine learning & deep learning
 
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
 

Similaire à Kafka ML Real-Time Analytics

Kai Wähner, Technology Evangelist at Confluent: "Development of Scalable Mac...
Kai Wähner, Technology Evangelist at Confluent: "Development of  Scalable Mac...Kai Wähner, Technology Evangelist at Confluent: "Development of  Scalable Mac...
Kai Wähner, Technology Evangelist at Confluent: "Development of Scalable Mac...Dataconomy Media
 
Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...
Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...
Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...Kai Wähner
 
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...Codemotion
 
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...Kai Wähner
 
Machine Learning Trends of 2018 combined with the Apache Kafka Ecosystem
Machine Learning Trends of 2018 combined with the Apache Kafka EcosystemMachine Learning Trends of 2018 combined with the Apache Kafka Ecosystem
Machine Learning Trends of 2018 combined with the Apache Kafka EcosystemKai Wähner
 
2019 04 seattle_meetup___kafka_machine_learning___kai_waehner
2019 04 seattle_meetup___kafka_machine_learning___kai_waehner2019 04 seattle_meetup___kafka_machine_learning___kai_waehner
2019 04 seattle_meetup___kafka_machine_learning___kai_waehnerNitin Kumar
 
Kai Waehner - Deep Learning at Extreme Scale in the Cloud with Apache Kafka a...
Kai Waehner - Deep Learning at Extreme Scale in the Cloud with Apache Kafka a...Kai Waehner - Deep Learning at Extreme Scale in the Cloud with Apache Kafka a...
Kai Waehner - Deep Learning at Extreme Scale in the Cloud with Apache Kafka a...Codemotion
 
Unleashing Apache Kafka and TensorFlow in Hybrid Cloud Architectures
Unleashing Apache Kafka and TensorFlow in Hybrid Cloud ArchitecturesUnleashing Apache Kafka and TensorFlow in Hybrid Cloud Architectures
Unleashing Apache Kafka and TensorFlow in Hybrid Cloud ArchitecturesKai Wähner
 
Introduction to Apache Kafka and why it matters - Madrid
Introduction to Apache Kafka and why it matters - MadridIntroduction to Apache Kafka and why it matters - Madrid
Introduction to Apache Kafka and why it matters - MadridPaolo Castagna
 
Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...
Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...
Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...Kai Wähner
 
Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...
Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...
Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...confluent
 
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...Kai Wähner
 
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...confluent
 
Apache Kafka as Event Streaming Platform for Microservice Architectures
Apache Kafka as Event Streaming Platform for Microservice ArchitecturesApache Kafka as Event Streaming Platform for Microservice Architectures
Apache Kafka as Event Streaming Platform for Microservice ArchitecturesKai Wähner
 
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...confluent
 
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ...
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ...Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ...
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ...confluent
 
Overview of Apache Flink: the 4G of Big Data Analytics Frameworks
Overview of Apache Flink: the 4G of Big Data Analytics FrameworksOverview of Apache Flink: the 4G of Big Data Analytics Frameworks
Overview of Apache Flink: the 4G of Big Data Analytics FrameworksDataWorks Summit/Hadoop Summit
 
Overview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Overview of Apache Fink: the 4 G of Big Data Analytics FrameworksOverview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Overview of Apache Fink: the 4 G of Big Data Analytics FrameworksSlim Baltagi
 
Overview of Apache Fink: The 4G of Big Data Analytics Frameworks
Overview of Apache Fink: The 4G of Big Data Analytics FrameworksOverview of Apache Fink: The 4G of Big Data Analytics Frameworks
Overview of Apache Fink: The 4G of Big Data Analytics FrameworksSlim Baltagi
 
Real-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQLReal-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQLSingleStore
 

Similaire à Kafka ML Real-Time Analytics (20)

Kai Wähner, Technology Evangelist at Confluent: "Development of Scalable Mac...
Kai Wähner, Technology Evangelist at Confluent: "Development of  Scalable Mac...Kai Wähner, Technology Evangelist at Confluent: "Development of  Scalable Mac...
Kai Wähner, Technology Evangelist at Confluent: "Development of Scalable Mac...
 
Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...
Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...
Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...
 
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...
 
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...
 
Machine Learning Trends of 2018 combined with the Apache Kafka Ecosystem
Machine Learning Trends of 2018 combined with the Apache Kafka EcosystemMachine Learning Trends of 2018 combined with the Apache Kafka Ecosystem
Machine Learning Trends of 2018 combined with the Apache Kafka Ecosystem
 
2019 04 seattle_meetup___kafka_machine_learning___kai_waehner
2019 04 seattle_meetup___kafka_machine_learning___kai_waehner2019 04 seattle_meetup___kafka_machine_learning___kai_waehner
2019 04 seattle_meetup___kafka_machine_learning___kai_waehner
 
Kai Waehner - Deep Learning at Extreme Scale in the Cloud with Apache Kafka a...
Kai Waehner - Deep Learning at Extreme Scale in the Cloud with Apache Kafka a...Kai Waehner - Deep Learning at Extreme Scale in the Cloud with Apache Kafka a...
Kai Waehner - Deep Learning at Extreme Scale in the Cloud with Apache Kafka a...
 
Unleashing Apache Kafka and TensorFlow in Hybrid Cloud Architectures
Unleashing Apache Kafka and TensorFlow in Hybrid Cloud ArchitecturesUnleashing Apache Kafka and TensorFlow in Hybrid Cloud Architectures
Unleashing Apache Kafka and TensorFlow in Hybrid Cloud Architectures
 
Introduction to Apache Kafka and why it matters - Madrid
Introduction to Apache Kafka and why it matters - MadridIntroduction to Apache Kafka and why it matters - Madrid
Introduction to Apache Kafka and why it matters - Madrid
 
Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...
Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...
Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...
 
Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...
Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...
Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...
 
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
 
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
 
Apache Kafka as Event Streaming Platform for Microservice Architectures
Apache Kafka as Event Streaming Platform for Microservice ArchitecturesApache Kafka as Event Streaming Platform for Microservice Architectures
Apache Kafka as Event Streaming Platform for Microservice Architectures
 
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
 
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ...
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ...Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ...
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ...
 
Overview of Apache Flink: the 4G of Big Data Analytics Frameworks
Overview of Apache Flink: the 4G of Big Data Analytics FrameworksOverview of Apache Flink: the 4G of Big Data Analytics Frameworks
Overview of Apache Flink: the 4G of Big Data Analytics Frameworks
 
Overview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Overview of Apache Fink: the 4 G of Big Data Analytics FrameworksOverview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Overview of Apache Fink: the 4 G of Big Data Analytics Frameworks
 
Overview of Apache Fink: The 4G of Big Data Analytics Frameworks
Overview of Apache Fink: The 4G of Big Data Analytics FrameworksOverview of Apache Fink: The 4G of Big Data Analytics Frameworks
Overview of Apache Fink: The 4G of Big Data Analytics Frameworks
 
Real-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQLReal-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQL
 

Plus de Kai Wähner

Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)Kai Wähner
 
Kafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping MetaverseKafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping MetaverseKai Wähner
 
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache KafkaThe Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache KafkaKai Wähner
 
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform MiddlewareApache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform MiddlewareKai Wähner
 
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?Kai Wähner
 
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureServerless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureKai Wähner
 
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...Kai Wähner
 
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity IndustryData Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity IndustryKai Wähner
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryKai Wähner
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryKai Wähner
 
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
Apache Kafka for Real-time Supply Chainin the Food and Retail IndustryApache Kafka for Real-time Supply Chainin the Food and Retail Industry
Apache Kafka for Real-time Supply Chain in the Food and Retail IndustryKai Wähner
 
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0Kai Wähner
 
Apache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and ManufacturingApache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and ManufacturingKai Wähner
 
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesEvent Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesKai Wähner
 
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...Kai Wähner
 
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...Kai Wähner
 
Apache Kafka in the Transportation and Logistics
Apache Kafka in the Transportation and LogisticsApache Kafka in the Transportation and Logistics
Apache Kafka in the Transportation and LogisticsKai Wähner
 
Apache Kafka for Cybersecurity and SIEM / SOAR Modernization
Apache Kafka for Cybersecurity and SIEM / SOAR ModernizationApache Kafka for Cybersecurity and SIEM / SOAR Modernization
Apache Kafka for Cybersecurity and SIEM / SOAR ModernizationKai Wähner
 
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....Kai Wähner
 
Serverless Kafka on AWS as Part of a Cloud-native Data Lake Architecture
Serverless Kafka on AWS as Part of a Cloud-native Data Lake ArchitectureServerless Kafka on AWS as Part of a Cloud-native Data Lake Architecture
Serverless Kafka on AWS as Part of a Cloud-native Data Lake ArchitectureKai Wähner
 

Plus de Kai Wähner (20)

Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
 
Kafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping MetaverseKafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping Metaverse
 
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache KafkaThe Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
 
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform MiddlewareApache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
 
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
 
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureServerless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
 
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
 
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity IndustryData Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare Industry
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare Industry
 
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
Apache Kafka for Real-time Supply Chainin the Food and Retail IndustryApache Kafka for Real-time Supply Chainin the Food and Retail Industry
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
 
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
 
Apache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and ManufacturingApache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and Manufacturing
 
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesEvent Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
 
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
 
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
 
Apache Kafka in the Transportation and Logistics
Apache Kafka in the Transportation and LogisticsApache Kafka in the Transportation and Logistics
Apache Kafka in the Transportation and Logistics
 
Apache Kafka for Cybersecurity and SIEM / SOAR Modernization
Apache Kafka for Cybersecurity and SIEM / SOAR ModernizationApache Kafka for Cybersecurity and SIEM / SOAR Modernization
Apache Kafka for Cybersecurity and SIEM / SOAR Modernization
 
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....
 
Serverless Kafka on AWS as Part of a Cloud-native Data Lake Architecture
Serverless Kafka on AWS as Part of a Cloud-native Data Lake ArchitectureServerless Kafka on AWS as Part of a Cloud-native Data Lake Architecture
Serverless Kafka on AWS as Part of a Cloud-native Data Lake Architecture
 

Dernier

Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 

Dernier (20)

Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 

Kafka ML Real-Time Analytics

  • 1. 1Confidential Apache Kafka + Machine Learning Analytic Models Applied to Real Time Stream Processing Kai Waehner Technology Evangelist kontakt@kai-waehner.de LinkedIn @KaiWaehner www.kai-waehner.de
  • 2. 2Apache Kafka and Machine Learning Agenda 1) Machine Learning in the Real World 2) Building an Analytic Model 3) Applying an Analytic Model in Real Time 4) Online Training of Models
  • 3. 3Apache Kafka and Machine Learning Agenda 1) Machine Learning in the Real World 2) Building an Analytic Model 3) Applying an Analytic Model in Real Time 4) Online Training of Models
  • 4. 4Apache Kafka and Machine Learning Machine Learning ... allows computers to find hidden insights without being explicitly programmed where to look.
  • 5. 5Apache Kafka and Machine Learning Real World Examples of Machine Learning Spam Detection Search Results + Product Recommendation Picture Detection (Friends, Locations, Products) Your Company The Next Disruption: Google Beats Go Champion
  • 6. 6Apache Kafka and Machine Learning Leverage Machine Learning to Analyze and Act on Critical Business Moments Seconds Minutes Hours Price Optimization Predictive Maintenance Fraud Detection Cross Selling Transportation Rerouting Customer Service Inventory Management Windows of Opportunity
  • 7. 7Apache Kafka and Machine Learning How to realize these use cases?
  • 8. 8Apache Kafka and Machine Learning Big Data Analytics Volume (terabytes, petabytes) Variety (social networks, blog posts, logs, sensors, etc.) Velocity („real time“) Value
  • 9. 9Apache Kafka and Machine Learning Big Data Analytics for Actionable Insights From Insight to Action (continuously closed loop)
  • 10. 10Apache Kafka and Machine Learning Streaming Platform Big Data Analytics Database IoT Device Streaming Producer ….. DWH Data Integration C O N N E C T C O N N E C T Data Lake Model Building Batch Real Time Stream Processing REST Interface IoT Device Mobile App Streaming Consumer C O N N E C T C O N N E C T BI Tool Messaging Web Application Model Schema Registry / Governance 1) Data Producer 2) Analytics Platform 3) Streaming Platform 4) Data Consumer
  • 11. 11Apache Kafka and Machine Learning Agenda 1) Machine Learning in the Real World 2) Building an Analytic Model 3) Applying an Analytic Model in Real Time 4) Online Training of Models
  • 12. 12Apache Kafka and Machine Learning Streaming Platform Big Data Analytics Database IoT Device Streaming Producer ….. DWH Data Integration C O N N E C T C O N N E C T Data Lake Model Building Batch Real Time Stream Processing REST Interface IoT Device Mobile App Streaming Consumer C O N N E C T C O N N E C T BI Tool Messaging Web Application Model Schema Registry / Governance 1) Data Producer 2) Analytics Platform 3) Streaming Platform 4) Data Consumer
  • 13. 13Apache Kafka and Machine Learning Hidden Technical Debt in Machine Learning Systems https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf Writing source code is not the time-consuming task! !
  • 14. 14Apache Kafka and Machine Learning Analytical Pipeline 1. Data Access 2. Data Preparation 3. Exploratory Data Analysis 4. Model Building 5. Model Execution 6. Model Validation 7. Deployment
  • 15. 15Apache Kafka and Machine Learning Data Access Find insights to create added business value by correlating various data sources!
  • 16. 16Apache Kafka and Machine Learning Data Preparation http://www.slideshare.net/odsc/feature-engineering Data Preparation
  • 17. 17Apache Kafka and Machine Learning Exploratory Data Analysis © Copyright 2000-2017 TIBCO Software Inc. • Scripting • Visual Analytics • Machine Learning
  • 18. 18Apache Kafka and Machine Learning Model Building A model is a simplification of the truth that helps you with decision making.
  • 19. 19Apache Kafka and Machine Learning Model Execution (Coding) Apply Model to New Data
  • 20. 20Apache Kafka and Machine Learning Model Execution (Tooling) Apply Model to New Data
  • 21. 21Apache Kafka and Machine Learning Model Validation https://genome.tugraz.at/proclassify/help/pages/XV.html Cross-Validation Procedure
  • 22. 22Apache Kafka and Machine Learning Frameworks and Tooling?
  • 23. 23Apache Kafka and Machine Learning Languages, Frameworks and Tools Many more …. Portable Format for Analytics (PFA)
  • 24. 24Apache Kafka and Machine Learning Live Demos with Open Source Technologies Development of Analytic Models with R, TensorFlow, Apache Spark, H2O.ai, RapidMiner
  • 25. 25Apache Kafka and Machine Learning Live Demo Use Case: Customer Churn Prediction Machine Learning Algorithm: Generalized Linear Model (GLM) using Logistic Regression Technology: Open Source R
  • 26. 26Apache Kafka and Machine Learning Live Demo Use Case: Airline Flight Delay Prediction Machine Learning Algorithm: Gradient Boosted Machines (GBM) using Decision Trees Technology: H2O.ai
  • 27. 27Apache Kafka and Machine Learning Live Demo Use Case: Predictive Maintenance (Anomaly Detection in Telco Networks) Deep Learning Algorithm: Artificial Neural Networks (ANN) using Autoencoders Technology: TensorFlow + Python API
  • 28. 28Apache Kafka and Machine Learning Live Demo Use Case: Classification (Prediction of Titanic Survivors) Deep Learning Algorithm: Recurrent Neural Networks (RNN) Technology: RapidMiner
  • 29. 29Apache Kafka and Machine Learning Agenda 1) Machine Learning in the Real World 2) Building an Analytic Model 3) Applying an Analytic Model in Real Time 4) Online Training of Models
  • 30. 30Apache Kafka and Machine Learning Analytical Pipeline 1. Data Access 2. Data Preparation 3. Exploratory Data Analysis 4. Model Building 5. Model Execution 6. Model Validation 7. Deployment
  • 31. 31Apache Kafka and Machine Learning Streaming Platform Big Data Analytics Database IoT Device Streaming Producer ….. DWH Data Integration C O N N E C T C O N N E C T Data Lake Model Building Batch Real Time Stream Processing REST Interface IoT Device Mobile App Streaming Consumer C O N N E C T C O N N E C T BI Tool Messaging Web Application Model Schema Registry / Governance 1) Data Producer 2) Analytics Platform 3) Streaming Platform 4) Data Consumer
  • 32. 32Apache Kafka and Machine Learning Definition of Stream Processsing Data at Rest Data in Motion
  • 33. 33Apache Kafka and Machine Learning Key Concepts
  • 34. 34Apache Kafka and Machine Learning Key Concepts
  • 35. 35Apache Kafka and Machine Learning Key Concepts
  • 36. 36Apache Kafka and Machine Learning Stream Processing Use Cases • Real Time Applications • Stateful Streaming Analytics • Stateless “Real Time ETL”
  • 37. 37Apache Kafka and Machine Learning Event Processing Windows Various Options for Windowing (Fixed, Sliding, Session, …)
  • 38. 38Apache Kafka and Machine Learning How to apply analytic models to real time processing without redevelopment?
  • 39. 39Apache Kafka and Machine Learning Application of Analytic Models to Real Time without Redevelopment Stream Processing H20.ai R Python Spark ML MATLAB SAS PMML
  • 40. 40Apache Kafka and Machine Learning Streaming Analytics - Processing Pipeline APIs Adapters / Channels Integration Messaging Stream Ingest Transformation Aggregation Enrichment Filtering Stream Preprocessing Process Management Analytics (Real Time) Applications & APIs Analytics / DW Reporting Stream Outcomes • Contextual Rules • Windowing • Patterns • Analytics • Machine Learning • … Stream Analytics Index / SearchNormalization Applying an Analytic Model is just a piece of the puzzle!
  • 41. 41Apache Kafka and Machine Learning Frameworks and Tooling?
  • 42. 42Apache Kafka and Machine Learning Frameworks and Products OPEN SOURCE CLOSED SOURCE PRODUCT FRAMEWORK Azure Microsoft Stream Analytics
  • 43. 43Apache Kafka and Machine Learning When to use Kafka Streams for Stream Processing?
  • 44. 44Apache Kafka and Machine Learning When to use Kafka Streams for Stream Processing? No need for a Big Data cluster Deploy in your existing infrastructure Kafka manages scalability / fail-over Focus on development of business logic in your department
  • 45. 45Apache Kafka and Machine Learning Kafka Streams Map, filter, aggregate, apply analytic model, „any business logic“ Input Stream (Kafka Topic) Kafka Cluster Output Stream (Kafka Topic) Kafka Cluster Stream Processing Microservice (Kafka Streams) Deployed anywhere: Docker, Kubernetes, Mesos, Java App, …
  • 46. 46Apache Kafka and Machine Learning A complete streaming microservices, ready for production at large-scale Word Count App configuration Define processing (here: WordCount) Start processing
  • 47. 47Apache Kafka and Machine Learning Confluent Platform: the Free, Open-Source Streaming Platform Open Source ExternalCommercial Confluent Platform Monitoring Analytics Custom Apps Transformations Real-time Applications … CRM Data Warehouse Database Hadoop Data Integration … Control Center Auto-data Balancing Multi-Data Center Replication 24/7 Support Supported Connectors Clients Schema Registry REST Proxy Apache Kafka Kafka Connect Kafka Streams Kafka Core Database Changes Log Events loT Data Web Events …
  • 48. 48Apache Kafka and Machine Learning Streaming Platform Big Data Analytics Database IoT Device Streaming Producer ….. DWH Data Integration C O N N E C T C O N N E C T Data Lake Model Building Batch Real Time Stream Processing REST Interface IoT Device Mobile App Streaming Consumer C O N N E C T C O N N E C T BI Tool Messaging Web Application Model Schema Registry / Governance 1) Data Producer 2) Analytics Platform 3) Streaming Platform 4) Data Consumer
  • 49. 49Apache Kafka and Machine Learning STREAMING PLATFORM BIG DATAANALYTICS Oracle DB CoaP IoT Kafka Java Client ….. HP Vertica Data Integration F L U M E H2O.ai, Spark, TensorFlow Batch Real Time Confluent REST Proxy MQTT IoT iPhone App Kafka Go Client C K O A N F N K E A C T H I V E Grafana Kafka Java EE Web App Hadoop C K O A N F N K E A C T Confluent Schema Registry Kafka Streams H2O.ai Mesos Kafka Streams TensorFlow Kubernetes Avro Avro 1) Data Producer 2) Analytics Platform 3) Streaming Platform 4) Data Consumer
  • 50. 50Apache Kafka and Machine Learning Live Demos with Open Source Technologies Development of Analytic Models with Apache Kafka Messaging, Kafka Streams, Kafka Connect, Confluent Schema Registry
  • 51. 51Apache Kafka and Machine Learning Live Demo Use Case: Airline Flight Delay Prediction Machine Learning Algorithm: Any! (in our example, H2O.ai GBM) Streaming Platform: Apache Kafka Core, Kafka Connect, Kafka Streams, Confluent Schema Registry
  • 52. 52Apache Kafka and Machine Learning H2O.ai Model + Kafka Streams Filter Map 1) Create H2O ML model 2) Configure Kafka Streams Application 3) Apply H2O ML model to Streaming Data 4) Start Kafka Streams App
  • 53. 53Apache Kafka and Machine Learning End-to-End Stream Monitoring and Alerting Confluent Control Center Data Stream Monitoring and Alerting Multi-cluster monitoring and management Kafka Connect Configuration • Message delivery? • Delays? • Where got it stuck? • Lost messages? • Broker issues? • Performance? http://docs.confluent.io/3.2.0/control-center/docs/monitoring.html
  • 54. 54Apache Kafka and Machine Learning Agenda 1) Machine Learning in the Real World 2) Building an Analytic Model 3) Applying an Analytic Model in Real Time 4) Online Training of Models
  • 55. 55Apache Kafka and Machine Learning Let’s improve the analytic model continuously…
  • 56. 56Apache Kafka and Machine Learning Analytical Pipeline 1. Data Access 2. Data Preparation 3. Exploratory Data Analysis 4. Model Building 5. Model Execution 6. Model Validation 7. Deployment Online Training Continuously train and improve the model with every new event
  • 57. 57Apache Kafka and Machine Learning Online Model Training of Analytic Models How to improve models? 1.Manual Update 2.Automated Batch 3.Real Time
  • 58. 58Apache Kafka and Machine Learning STREAMING PLATFORM BIG DATAANALYTICS F L U M E H2O.ai, Spark, TensorFlow H I V E Kafka Hadoop Confluent Schema Registry Kafka Streams H2O.ai Mesos Kafka Streams TensorFlow Kubernetes Avro Avro 1) Get new Input Event via Kafka Topic 2) Improve Model in Big Data Cluster 3) Update deployed Model via Kafka Topic 4) Leverage Improved Model for new Events
  • 59. 59Apache Kafka and Machine Learning Caveats for Online Model Training • Processes and infrastructure not ready • Validation needed before production • Slows down the system • Only a few ML implementations supported • Many use cases do not need it
  • 60. 60Apache Kafka and Machine Learning Key Take-Aways Ø Insights are hidden in Historical Data on Big Data Platforms Ø Machine Learning and Big Data Analytics find these Insights by building Analytics Models Ø Streaming Platform uses these Models (without Redevelopment) to take Action in Real Time
  • 61. 61Apache Kafka and Machine Learning Kai Waehner Technology Evangelist kontakt@kai-waehner.de @KaiWaehner www.kai-waehner.de LinkedIn Questions? Feedback? Please contact me!