SlideShare une entreprise Scribd logo
1  sur  40
2013 © Trivadis
BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN

WELCOME Twitter Storm:
Ereignisverarbeitung in
Echtzeit
Guido Schmutz

Java Forum Stuttgart 2013
4.7.2013
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
1
2013 © Trivadis
Trivadis ist führend bei der IT-Beratung, der Systemintegration,
dem Solution-Engineering und der Erbringung von IT-Services
mit Fokussierung auf und Technologien
im D-A-CH-Raum.
Unsere Leistungen erbringen wir aus den strategischen Geschäftsfeldern:
Trivadis Services übernimmt den korrespondierenden Betrieb
Ihrer IT Systeme.
Unser Unternehmen
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
B E T R I E B
2
2013 © Trivadis
Mit über 600 IT- und Fachexperten bei Ihnen vor Ort
3
11 Trivadis Niederlassungen mit

über 600 Mitarbeitenden
200 Service Level Agreements
Mehr als 4'000 Trainingsteilnehmer
Forschungs- und Entwicklungs-
budget: CHF 5.0 / EUR 4 Mio.
Finanziell unabhängig und

nachhaltig profitabel
Erfahrung aus mehr als 1'900
Projekten pro Jahr bei über 800
Kunden
Stand 12/2012
Hamburg
Düsseldorf
Frankfurt
Freiburg
München
Wien
Basel
ZürichBern
Lausanne
3
Stuttgart
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
3
2013 © Trivadis
Guido Schmutz
•  Working for Trivadis for more than 16 years
•  Oracle ACE Director for Fusion Middleware and SOA
•  Co-Author of different books
•  Consultant, Trainer Software Architect for Java, Oracle, SOA,
EDA, BigData und FastData
•  Member of Trivadis Architecture Board
•  Technology Manager @ Trivadis
•  More than 20 years of software development 

experience
•  Contact: guido.schmutz@trivadis.com
•  Blog: http://guidoschmutz.wordpress.com
•  Twitter: gschmutz
4.7.2013
4
Twitter Storm: Ereignisverarbeitung in Echtzeit
2013 © Trivadis
Big Data Definition (Gartner et al)
14.02.2013
Big Data 4 Sales
5
Velocity
Tera-, Peta-, Exa-, Zetta-, Yota- bytes and constantly growing
“Traditional” computing in RDBMS 

is not scalable enough. 

We search for “linear scalability”
“Only … structured information 

is not enough” – “95% of produced data in
unstructured”
Characteristics of Big Data: Its
Volume, Velocity and Variety in
combination
+ Veracity (IBM) - information uncertainty
+ Time to action ? – Big Data + Event Processing = Fast Data
2013 © Trivadis
14.02.2013
Big Data 4 Sales
6
2013 © Trivadis
Velocity
24. April 2013
Big Data und Fast Data
7
§  Velocity requirement examples:
§  Recommendation Engine
§  Predictive Analytics
§  Marketing Campaign Analysis
§  Customer Retention and Churn Analysis
§  Social Graph Analysis
§  Capital Markets Analysis
§  Risk Management
§  Rogue Trading
§  Fraud Detection
§  Retail Banking
§  Network Monitoring
§  Research and Development
2013 © Trivadis
Agenda
1.  What is Twitter Storm
2.  Main Concepts
3.  Trident
4.  BigData and FastData combined – Lambda Architecture
5.  Summary
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
8
2013 © Trivadis
What is Twitter Storm ?
A platform for doing analysis on streams of data as they come in, so you
can react to data as it happens.
•  A highly distributed real-time computation system
•  Provides general primitives to do real-time computation
•  To simplify working with queues & workers
•  scalable and fault-tolerant
•  complementary to Hadoop
Originated at Backtype, acquired by Twitter in 2011
Open Sourced late 2011
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
9
2013 © Trivadis
Hadoop vs. Storm
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
10
Storm
Real-time processing
Topologies run forever
Stateless nodes
Scalable
Guarantees no data loss
Open Source
=> Fast, reactive, real time procesing
Hadoop
Batch processing
Jobs runs to completion
Stateful nodes
Scalable
Guarantees no data loss
Open Source
=> big batch processing
=
!=
2013 © Trivadis
Idea: Simplify dealing with queue & workers
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
11
Scaling is painful – queue partitioning & worker deploy
Operational overhead – worker failures & queue backups
No guarantees on data processing
2013 © Trivadis
What does Storm provide?
§  At least once message processing
§  Fault-tolerant
§  Horizontal scalability
§  No intermediate queues
§  Less operational overhead
§  „Just works“
§  Broad set of use cases
§  Stream processing
§  Continous computation
§  Distributed RPC
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
12
2013 © Trivadis
Agenda
1.  What is Twitter Storm
2.  Main Concepts
3.  Trident
4.  BigData and FastData combined – Lambda Architecture
5.  Summary
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
13
2013 © Trivadis
Main Concepts – Shown based on Twitter example
Show the tweets related to Java Forum (use hashtag #JFS2013) and show
the count of the hashtags used
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
14
not showing #JFS2013
2013 © Trivadis
Main Concepts - Streams
Stream
•  an unbounded sequence of tuples
•  core abstraction in Storm
•  Defined with a schema that names the 

fields in the tuple
•  Value must be serializable
•  Every stream has an id
Topology
•  graph where each node is a spout or bolt
•  edges indicating which bolt subscribes 

to which streams
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
15
Source of
stream A
Source of
stream B
Subscribes: A
Emits: C
Subscribes: A
Emits: D
Subscribes: A & B
Emits: -
Subscribes: C & D
Emits: -
T T T T T T T T
2013 © Trivadis
Worker
•  executes a subset of a topology, may run one or more executors for one or
more components
Executor
•  thread spawned by worker
Task
•  performs the actual data processing
Main Concepts – Worker, Executor, Task
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
16
Worker Process
Task
Task
Task
Task
2013 © Trivadis
Main Concepts - Spout
A spout is the component which acts as the source of streams in a
topology
Generally spouts read tuples from an external source and emit them into
the topology
Spouts can emit more than one stream
Step 1: Use a Spout to subscribe to the Twitter Filter Streaming API to get
all the tweets containing the hashtag #JFS2013
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
17
Twitter
Stream
Tweets tweet
Tweet

Stream
2013 © Trivadis
Main Concepts - Spout
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
18
Twitter
Stream
Tweets tweet
Tweet

Stream
By calling this method, Storm requests that the
Spout emits tuples to the output collector
Called when a task for this
component is initialized
Used to declare streams and their schmeas. it is possible to declare
serveral streams and specifying the one to use when emiting tuples
2013 © Trivadis
Bolt is doing the processing of the message
•  filtering, functions, aggregations, joins, talking to databases….
•  Can do simple stream transformations
•  Complex transformations often require multiple steps and therefore multiple
bolts
Bolts can emit one or more streams
Step 2: Use a bolt to retrieve the hashtags for each tweet
Main Concepts - Bolt
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
19
Twitter
Stream
Tweets tweet
Hashtag

Splitter
Tweet

Stream
hashtag
2013 © Trivadis
Main Concepts - Bolt
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
20
Twitter
Stream
Tweets tweet
Hashtag

Splitter
Tweet

Stream
hashtag
Execute is called for each tuple
arriving on the input stream
Declares the output fields for the component and
is called when bolt is created
2013 © Trivadis
Main Concepts - Bolt
Step 3: Use a bolt to count the occurences of the hashtag in Redis NoSQL
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
21
Twitter
Stream
Tweets tweet
Hashtag

Splitter
Tweet

Stream
hashtag Hashtag

Counter
2013 © Trivadis
Main Concepts - Bolt
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
22
Twitter
Stream
Tweets tweet
Hashtag

Splitter
Tweet

Stream
hashtag Hashtag

Counter
Prepare is called when bolt is
created
Execute is called for each tuple
arriving on the input stream
2013 © Trivadis
Main Concepts - Topology
Step 4: Setup the topology with the necessary groupings (distribution)
Each Spout or Bolt are running N instances in parallel
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
23
Twitter
Stream
Tweets
Hashtag

Splitter
Tweet

Stream
Hashtag

Counter
Hashtag

Splitter
Hashtag

Counter
Shuffle Fields
Shuffle grouping is random grouping
Fields grouping is grouped by value, such that equal value results in equal task
All grouping replicates to all tasks
Global grouping makes all tuples go to one task
None grouping makes bolt run in the same thread as bold/spout it subscribes to
Direct grouping producer (task that emits) controls which consumer will receive
2013 © Trivadis
Main Concepts - Topology
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
24
Create Stream called „tweet-stream“
Run this bolt by 2 workers
Run this spout
by 1 worker
Subscribe to stream „tweet-stream“
using shuffle grouping
Create stream called „hashtag-splitter“
Subscribe to stream „hashtag-splitter“
using fields grouping on field „hashtag“
2013 © Trivadis
Sample – How does it work ?
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
25
Twitter
Stream
Mein Präsentation
„Twitter Storm:
Ereignisverarbeitung
in Echtzeit“ Morgen
um 15:35 am Java
Forum Stuttgart
#JFS2013 #storm
#fastdata CU there!
Hashtag

Splitter
Tweet

Stream
Hashtag

Counter
Hashtag

Splitter
Hashtag

Counter
2013 © Trivadis
Sample – How does it work ?
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
26
Twitter
Stream
Mein Präsentation
„Twitter Storm:
Ereignisverarbeitung
in Echtzeit“ Morgen
um 15:35 am Java
Forum Stuttgart
#JFS2013 #storm
#fastdata CU there!
Hashtag

Splitter
Tweet

Stream
Hashtag

Counter
Hashtag

Splitter
Hashtag

Counter
storm
fastdata
jsf2013
2013 © Trivadis
Sample – How does it work ?
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
27
Twitter
Stream
Mein Präsentation
„Twitter Storm:
Ereignisverarbeitung
in Echtzeit“ Morgen
um 15:35 am Java
Forum Stuttgart
#JFS2013 #storm
#fastdata CU there!
Hashtag

Splitter
Tweet

Stream
Hashtag

Counter
Hashtag

Splitter
Hashtag

Counter
storm
fastdata
jfs2013
INCR
jfs2013
INCR
storm
INCR
fastdata storm = 1
fastdata =1
jfs2013= 1
2013 © Trivadis
Agenda
1.  What is Twitter Storm
2.  Main Concepts
3.  Trident
4.  BigData and FastData combined – Lambda Architecture
5.  Summary
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
28
2013 © Trivadis
What is Trident?
•  High-Level abstraction on top of storm
•  Makes it easier to build topologies
•  Core data model is the stream
•  Processed as a series of batches
•  Stream is partitioned among nodes in cluster
•  5 kinds of operations in Trident
•  Operations that apply locally to each partition and cause no network
transfer
•  Repartitioning operations that don‘t change the contents
•  Aggregation operations that do network transfer
•  Operations on grouped streams
•  Merges and Joins
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
29
2013 © Trivadis
What is Trident?
Supports
•  Joins
•  Aggregations
•  Grouping
•  Functions
•  Filters
Similar to Hadoop and Pig/Cascading
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
30
2013 © Trivadis
Trident Concepts - Topology
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
31
Twitter
Stream
Tweets tweet
Hashtag

Splitter
Tweet

Stream
hashtag Hashtag

Normalizer
Persistent

Aggregate
hashtag
groupBylocal
Bolt Bolt
2013 © Trivadis
Trident Concepts - Function
•  takes in a set of input fields and emits zero or more tuples as output
•  fields of the output tuple are appended to the original input tuple in the
stream
•  If a function emits no tuples, the original input tuple is filtered out
•  Otherwise the input tuple is duplicated for each output tuple
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
32
2013 © Trivadis
Agenda
1.  What is Twitter Storm
2.  Main Concepts
3.  Trident
4.  BigData and FastData combined – Lambda Architecture
5.  Summary
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
33
2013 © Trivadis
Lambda Architecture
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
34
Immutable
data
Batch
View
Query
Data
Stream
Realtime
View
Incoming
Data
Serving Layer
Speed Layer
Batch Layer
A
B
C D
E
F
G
2013 © Trivadis
Lambda Architecture
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
35
Speed Layer
Precompute
Views
query
Source: Marz, N. & Warren, J. (2013) Big Data. Manning.
Batch Layer
Precomputed
information
All data
Incremented
information
Process stream
Incoming
Data
Batch
recompute
Realtime
increment
Serving Layer
batch view
batch view
real time view
real time view
Merge
2013 © Trivadis
Lambda Architecture
24. April 2013
Big Data und Fast Data
36
one possible product/framework mapping
Speed Layer
Precompute
Views
query
Batch Layer
Precomputed
information
All data
Incremented
information
Process stream
Incoming
Data
Batch
recompute
Realtime
increment
Serving Layer
batch view
batch view
real time view
real time view
Merge
2013 © Trivadis
Agenda
1.  What is Twitter Storm
2.  Main Concepts
3.  Trident
4.  BigData and FastData combined – Lambda Architecture
5.  Summary
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
37
2013 © Trivadis
Summary
Express real-time processing naturally
Not a Hadoop/MapReduce competitor
Supports other languages
Hard to debug
Object Serialization
Didn‘t cover
§  Reliability through guaranteed message processing
§  Distributed RPC
§  Storm cluster setup and deploy
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
38
2013 © Trivadis
Resources
Website (http://storm-project.net/)
Wiki (https://github.com/nathanmarz/storm/wiki)
Doc (https://github.com/nathanmarz/storm/wiki/Documentation)
Storm-starter project (https://github.com/nathanmarz/storm-starter)
Mailing list (http://groups.google.com/group/storm-user)
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
39
2013 © Trivadis
BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN

Thank You!
Trivadis AG
Guido Schmutz

guido.schmutz@trivadis.com
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
40

Contenu connexe

Tendances

From Events to Networks: Time Series Analysis on Scale
From Events to Networks: Time Series Analysis on ScaleFrom Events to Networks: Time Series Analysis on Scale
From Events to Networks: Time Series Analysis on ScaleDr. Mirko Kämpf
 
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...SoftServe
 
Webinar: Fighting Fraud with Graph Databases
Webinar: Fighting Fraud with Graph DatabasesWebinar: Fighting Fraud with Graph Databases
Webinar: Fighting Fraud with Graph DatabasesDataStax
 
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Databricks
 
Designing a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDesigning a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDataStax
 
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...Kai Wähner
 
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013Kai Wähner
 
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyWebinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyDataStax
 
Data Democratization at Nubank
 Data Democratization at Nubank Data Democratization at Nubank
Data Democratization at NubankDatabricks
 
Webinar - Fighting Bank Fraud with Real-time Graph Database
Webinar - Fighting Bank Fraud with Real-time Graph Database Webinar - Fighting Bank Fraud with Real-time Graph Database
Webinar - Fighting Bank Fraud with Real-time Graph Database DataStax
 
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...DataStax Academy
 
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...Mark Rittman
 
Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...Big Data Spain
 
Webinar: Customer Experience in Banking - a CTO's Perspective
Webinar: Customer Experience in Banking - a CTO's PerspectiveWebinar: Customer Experience in Banking - a CTO's Perspective
Webinar: Customer Experience in Banking - a CTO's PerspectiveDataStax
 
Extending Data Lake using the Lambda Architecture June 2015
Extending Data Lake using the Lambda Architecture June 2015Extending Data Lake using the Lambda Architecture June 2015
Extending Data Lake using the Lambda Architecture June 2015DataWorks Summit
 
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...Data Con LA
 
Data stax webinar cassandra and titandb insights into datastax graph strategy...
Data stax webinar cassandra and titandb insights into datastax graph strategy...Data stax webinar cassandra and titandb insights into datastax graph strategy...
Data stax webinar cassandra and titandb insights into datastax graph strategy...DataStax
 
Processing Twitter Stream with Oracle Event Processing (OEP)
Processing Twitter Stream with Oracle Event Processing (OEP)Processing Twitter Stream with Oracle Event Processing (OEP)
Processing Twitter Stream with Oracle Event Processing (OEP)Guido Schmutz
 

Tendances (20)

From Events to Networks: Time Series Analysis on Scale
From Events to Networks: Time Series Analysis on ScaleFrom Events to Networks: Time Series Analysis on Scale
From Events to Networks: Time Series Analysis on Scale
 
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
 
Webinar: Fighting Fraud with Graph Databases
Webinar: Fighting Fraud with Graph DatabasesWebinar: Fighting Fraud with Graph Databases
Webinar: Fighting Fraud with Graph Databases
 
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
 
Designing a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDesigning a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for Dummies
 
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
 
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
 
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyWebinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
 
Data Democratization at Nubank
 Data Democratization at Nubank Data Democratization at Nubank
Data Democratization at Nubank
 
Webinar - Fighting Bank Fraud with Real-time Graph Database
Webinar - Fighting Bank Fraud with Real-time Graph Database Webinar - Fighting Bank Fraud with Real-time Graph Database
Webinar - Fighting Bank Fraud with Real-time Graph Database
 
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
 
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
 
Smart data for a predictive bank
Smart data for a predictive bankSmart data for a predictive bank
Smart data for a predictive bank
 
Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...
 
Webinar: Customer Experience in Banking - a CTO's Perspective
Webinar: Customer Experience in Banking - a CTO's PerspectiveWebinar: Customer Experience in Banking - a CTO's Perspective
Webinar: Customer Experience in Banking - a CTO's Perspective
 
Extending Data Lake using the Lambda Architecture June 2015
Extending Data Lake using the Lambda Architecture June 2015Extending Data Lake using the Lambda Architecture June 2015
Extending Data Lake using the Lambda Architecture June 2015
 
Intuit Analytics Cloud 101
Intuit Analytics Cloud 101Intuit Analytics Cloud 101
Intuit Analytics Cloud 101
 
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
 
Data stax webinar cassandra and titandb insights into datastax graph strategy...
Data stax webinar cassandra and titandb insights into datastax graph strategy...Data stax webinar cassandra and titandb insights into datastax graph strategy...
Data stax webinar cassandra and titandb insights into datastax graph strategy...
 
Processing Twitter Stream with Oracle Event Processing (OEP)
Processing Twitter Stream with Oracle Event Processing (OEP)Processing Twitter Stream with Oracle Event Processing (OEP)
Processing Twitter Stream with Oracle Event Processing (OEP)
 

En vedette

HCatalog: Table Management for Hadoop - CHUG - 20120917
HCatalog: Table Management for Hadoop - CHUG - 20120917HCatalog: Table Management for Hadoop - CHUG - 20120917
HCatalog: Table Management for Hadoop - CHUG - 20120917Chicago Hadoop Users Group
 
Making Big Data Analytics Interactive and Real-­Time
 Making Big Data Analytics Interactive and Real-­Time Making Big Data Analytics Interactive and Real-­Time
Making Big Data Analytics Interactive and Real-­TimeSeven Nguyen
 
Monitoring MySQL with OpenTSDB
Monitoring MySQL with OpenTSDBMonitoring MySQL with OpenTSDB
Monitoring MySQL with OpenTSDBGeoffrey Anderson
 
Using Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETLUsing Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETLCloudera, Inc.
 
Deploying Apache Flume to enable low-latency analytics
Deploying Apache Flume to enable low-latency analyticsDeploying Apache Flume to enable low-latency analytics
Deploying Apache Flume to enable low-latency analyticsDataWorks Summit
 
A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013Nathan Bijnens
 
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUponHBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUponCloudera, Inc.
 
Apache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsApache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsHortonworks
 
Integration of Hive and HBase
Integration of Hive and HBaseIntegration of Hive and HBase
Integration of Hive and HBaseHortonworks
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopDataWorks Summit
 
Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014Karthik Ramasamy
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm Chandler Huang
 
Storm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationStorm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationnathanmarz
 

En vedette (16)

HCatalog: Table Management for Hadoop - CHUG - 20120917
HCatalog: Table Management for Hadoop - CHUG - 20120917HCatalog: Table Management for Hadoop - CHUG - 20120917
HCatalog: Table Management for Hadoop - CHUG - 20120917
 
Hadoop basics
Hadoop basicsHadoop basics
Hadoop basics
 
Making Big Data Analytics Interactive and Real-­Time
 Making Big Data Analytics Interactive and Real-­Time Making Big Data Analytics Interactive and Real-­Time
Making Big Data Analytics Interactive and Real-­Time
 
Monitoring MySQL with OpenTSDB
Monitoring MySQL with OpenTSDBMonitoring MySQL with OpenTSDB
Monitoring MySQL with OpenTSDB
 
Using Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETLUsing Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETL
 
Intro to Pig UDF
Intro to Pig UDFIntro to Pig UDF
Intro to Pig UDF
 
Deploying Apache Flume to enable low-latency analytics
Deploying Apache Flume to enable low-latency analyticsDeploying Apache Flume to enable low-latency analytics
Deploying Apache Flume to enable low-latency analytics
 
A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013
 
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUponHBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
 
Apache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsApache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data Applications
 
Spark and shark
Spark and sharkSpark and shark
Spark and shark
 
Integration of Hive and HBase
Integration of Hive and HBaseIntegration of Hive and HBase
Integration of Hive and HBase
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and Hadoop
 
Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm
 
Storm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationStorm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computation
 

Similaire à Twitter Storm: Ereignisverarbeitung in Echtzeit

Kafka and Storm - event processing in realtime
Kafka and Storm - event processing in realtimeKafka and Storm - event processing in realtime
Kafka and Storm - event processing in realtimeGuido Schmutz
 
JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtime
JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In RealtimeJAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtime
JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtimejazoon13
 
Processing Twitter Stream with Oracle Event Processing (OEP)
Processing Twitter Stream with Oracle Event Processing (OEP)Processing Twitter Stream with Oracle Event Processing (OEP)
Processing Twitter Stream with Oracle Event Processing (OEP)Trivadis
 
Blueprints for the analysis of social media
Blueprints for the analysis of social mediaBlueprints for the analysis of social media
Blueprints for the analysis of social mediaGuido Schmutz
 
Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Implement a Universal Data Distribution Architecture to Manage All Streaming ...Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Implement a Universal Data Distribution Architecture to Manage All Streaming ...Timothy Spann
 
Software Factory & TVD-REN the Vaadin framework of Trivadis
Software Factory & TVD-REN the Vaadin framework of TrivadisSoftware Factory & TVD-REN the Vaadin framework of Trivadis
Software Factory & TVD-REN the Vaadin framework of TrivadisClaude-Alain Glauser
 
Apache Storm vs. Spark Streaming – two Stream Processing Platforms compared
Apache Storm vs. Spark Streaming – two Stream Processing Platforms comparedApache Storm vs. Spark Streaming – two Stream Processing Platforms compared
Apache Storm vs. Spark Streaming – two Stream Processing Platforms comparedGuido Schmutz
 
Apache Storm vs. Spark Streaming - two stream processing platforms compared
Apache Storm vs. Spark Streaming - two stream processing platforms comparedApache Storm vs. Spark Streaming - two stream processing platforms compared
Apache Storm vs. Spark Streaming - two stream processing platforms comparedGuido Schmutz
 
How a Time Series Database Contributes to a Decentralized Cloud Object Storag...
How a Time Series Database Contributes to a Decentralized Cloud Object Storag...How a Time Series Database Contributes to a Decentralized Cloud Object Storag...
How a Time Series Database Contributes to a Decentralized Cloud Object Storag...InfluxData
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream ProcessingGuido Schmutz
 
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12c
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12cProcessing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12c
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12cGuido Schmutz
 
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...Impetus Technologies
 
Curated "Cloud Design Patterns" for Call Center Platforms
Curated "Cloud Design Patterns" for Call Center PlatformsCurated "Cloud Design Patterns" for Call Center Platforms
Curated "Cloud Design Patterns" for Call Center PlatformsAlejandro Rios Peña
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream ProcessingGuido Schmutz
 
Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)Brian Brazil
 
Event-Processing-und-BigData-kombiniert-guido_schmutz
Event-Processing-und-BigData-kombiniert-guido_schmutzEvent-Processing-und-BigData-kombiniert-guido_schmutz
Event-Processing-und-BigData-kombiniert-guido_schmutzTrivadis
 
S2DS London 2015 - Hadoop Real World
S2DS London 2015 - Hadoop Real WorldS2DS London 2015 - Hadoop Real World
S2DS London 2015 - Hadoop Real WorldSean Roberts
 
Multiple awr reports_parser
Multiple awr reports_parserMultiple awr reports_parser
Multiple awr reports_parserJacques Kostic
 

Similaire à Twitter Storm: Ereignisverarbeitung in Echtzeit (20)

Kafka and Storm - event processing in realtime
Kafka and Storm - event processing in realtimeKafka and Storm - event processing in realtime
Kafka and Storm - event processing in realtime
 
JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtime
JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In RealtimeJAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtime
JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtime
 
Processing Twitter Stream with Oracle Event Processing (OEP)
Processing Twitter Stream with Oracle Event Processing (OEP)Processing Twitter Stream with Oracle Event Processing (OEP)
Processing Twitter Stream with Oracle Event Processing (OEP)
 
Blueprints for the analysis of social media
Blueprints for the analysis of social mediaBlueprints for the analysis of social media
Blueprints for the analysis of social media
 
Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Implement a Universal Data Distribution Architecture to Manage All Streaming ...Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Implement a Universal Data Distribution Architecture to Manage All Streaming ...
 
Software Factory & TVD-REN the Vaadin framework of Trivadis
Software Factory & TVD-REN the Vaadin framework of TrivadisSoftware Factory & TVD-REN the Vaadin framework of Trivadis
Software Factory & TVD-REN the Vaadin framework of Trivadis
 
Apache Storm vs. Spark Streaming – two Stream Processing Platforms compared
Apache Storm vs. Spark Streaming – two Stream Processing Platforms comparedApache Storm vs. Spark Streaming – two Stream Processing Platforms compared
Apache Storm vs. Spark Streaming – two Stream Processing Platforms compared
 
Apache Storm vs. Spark Streaming - two stream processing platforms compared
Apache Storm vs. Spark Streaming - two stream processing platforms comparedApache Storm vs. Spark Streaming - two stream processing platforms compared
Apache Storm vs. Spark Streaming - two stream processing platforms compared
 
How a Time Series Database Contributes to a Decentralized Cloud Object Storag...
How a Time Series Database Contributes to a Decentralized Cloud Object Storag...How a Time Series Database Contributes to a Decentralized Cloud Object Storag...
How a Time Series Database Contributes to a Decentralized Cloud Object Storag...
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12c
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12cProcessing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12c
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12c
 
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
 
Curated "Cloud Design Patterns" for Call Center Platforms
Curated "Cloud Design Patterns" for Call Center PlatformsCurated "Cloud Design Patterns" for Call Center Platforms
Curated "Cloud Design Patterns" for Call Center Platforms
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
Monitoring in 2017 - TIAD Camp Docker
Monitoring in 2017 - TIAD Camp DockerMonitoring in 2017 - TIAD Camp Docker
Monitoring in 2017 - TIAD Camp Docker
 
Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)
 
Event-Processing-und-BigData-kombiniert-guido_schmutz
Event-Processing-und-BigData-kombiniert-guido_schmutzEvent-Processing-und-BigData-kombiniert-guido_schmutz
Event-Processing-und-BigData-kombiniert-guido_schmutz
 
S2DS London 2015 - Hadoop Real World
S2DS London 2015 - Hadoop Real WorldS2DS London 2015 - Hadoop Real World
S2DS London 2015 - Hadoop Real World
 
Multiple awr reports_parser
Multiple awr reports_parserMultiple awr reports_parser
Multiple awr reports_parser
 
Distributed Tracing
Distributed TracingDistributed Tracing
Distributed Tracing
 

Plus de Guido Schmutz

30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as Code30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as CodeGuido Schmutz
 
Event Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data ArchitectureEvent Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data ArchitectureGuido Schmutz
 
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-FormatsBig Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-FormatsGuido Schmutz
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!Guido Schmutz
 
Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?Guido Schmutz
 
Event Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data ArchitectureEvent Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data ArchitectureGuido Schmutz
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaGuido Schmutz
 
Event Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) ArchitectureEvent Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) ArchitectureGuido Schmutz
 
Building Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaBuilding Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaGuido Schmutz
 
Location Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache KafkaLocation Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache KafkaGuido Schmutz
 
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache KafkaSolutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache KafkaGuido Schmutz
 
What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?Guido Schmutz
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaGuido Schmutz
 
Location Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using KafkaLocation Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using KafkaGuido Schmutz
 
Streaming Visualisation
Streaming VisualisationStreaming Visualisation
Streaming VisualisationGuido Schmutz
 
Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?Guido Schmutz
 
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaGuido Schmutz
 
Fundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureFundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureGuido Schmutz
 
Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka Guido Schmutz
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming VisualizationGuido Schmutz
 

Plus de Guido Schmutz (20)

30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as Code30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as Code
 
Event Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data ArchitectureEvent Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data Architecture
 
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-FormatsBig Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!
 
Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?
 
Event Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data ArchitectureEvent Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data Architecture
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
 
Event Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) ArchitectureEvent Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
 
Building Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaBuilding Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache Kafka
 
Location Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache KafkaLocation Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache Kafka
 
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache KafkaSolutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafka
 
What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
 
Location Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using KafkaLocation Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using Kafka
 
Streaming Visualisation
Streaming VisualisationStreaming Visualisation
Streaming Visualisation
 
Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?
 
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
 
Fundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureFundamentals Big Data and AI Architecture
Fundamentals Big Data and AI Architecture
 
Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
 

Dernier

Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 

Dernier (20)

Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 

Twitter Storm: Ereignisverarbeitung in Echtzeit

  • 1. 2013 © Trivadis BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN
 WELCOME Twitter Storm: Ereignisverarbeitung in Echtzeit Guido Schmutz
 Java Forum Stuttgart 2013 4.7.2013 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 1
  • 2. 2013 © Trivadis Trivadis ist führend bei der IT-Beratung, der Systemintegration, dem Solution-Engineering und der Erbringung von IT-Services mit Fokussierung auf und Technologien im D-A-CH-Raum. Unsere Leistungen erbringen wir aus den strategischen Geschäftsfeldern: Trivadis Services übernimmt den korrespondierenden Betrieb Ihrer IT Systeme. Unser Unternehmen 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit B E T R I E B 2
  • 3. 2013 © Trivadis Mit über 600 IT- und Fachexperten bei Ihnen vor Ort 3 11 Trivadis Niederlassungen mit
 über 600 Mitarbeitenden 200 Service Level Agreements Mehr als 4'000 Trainingsteilnehmer Forschungs- und Entwicklungs- budget: CHF 5.0 / EUR 4 Mio. Finanziell unabhängig und
 nachhaltig profitabel Erfahrung aus mehr als 1'900 Projekten pro Jahr bei über 800 Kunden Stand 12/2012 Hamburg Düsseldorf Frankfurt Freiburg München Wien Basel ZürichBern Lausanne 3 Stuttgart 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 3
  • 4. 2013 © Trivadis Guido Schmutz •  Working for Trivadis for more than 16 years •  Oracle ACE Director for Fusion Middleware and SOA •  Co-Author of different books •  Consultant, Trainer Software Architect for Java, Oracle, SOA, EDA, BigData und FastData •  Member of Trivadis Architecture Board •  Technology Manager @ Trivadis •  More than 20 years of software development 
 experience •  Contact: guido.schmutz@trivadis.com •  Blog: http://guidoschmutz.wordpress.com •  Twitter: gschmutz 4.7.2013 4 Twitter Storm: Ereignisverarbeitung in Echtzeit
  • 5. 2013 © Trivadis Big Data Definition (Gartner et al) 14.02.2013 Big Data 4 Sales 5 Velocity Tera-, Peta-, Exa-, Zetta-, Yota- bytes and constantly growing “Traditional” computing in RDBMS 
 is not scalable enough. 
 We search for “linear scalability” “Only … structured information 
 is not enough” – “95% of produced data in unstructured” Characteristics of Big Data: Its Volume, Velocity and Variety in combination + Veracity (IBM) - information uncertainty + Time to action ? – Big Data + Event Processing = Fast Data
  • 7. 2013 © Trivadis Velocity 24. April 2013 Big Data und Fast Data 7 §  Velocity requirement examples: §  Recommendation Engine §  Predictive Analytics §  Marketing Campaign Analysis §  Customer Retention and Churn Analysis §  Social Graph Analysis §  Capital Markets Analysis §  Risk Management §  Rogue Trading §  Fraud Detection §  Retail Banking §  Network Monitoring §  Research and Development
  • 8. 2013 © Trivadis Agenda 1.  What is Twitter Storm 2.  Main Concepts 3.  Trident 4.  BigData and FastData combined – Lambda Architecture 5.  Summary 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 8
  • 9. 2013 © Trivadis What is Twitter Storm ? A platform for doing analysis on streams of data as they come in, so you can react to data as it happens. •  A highly distributed real-time computation system •  Provides general primitives to do real-time computation •  To simplify working with queues & workers •  scalable and fault-tolerant •  complementary to Hadoop Originated at Backtype, acquired by Twitter in 2011 Open Sourced late 2011 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 9
  • 10. 2013 © Trivadis Hadoop vs. Storm 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 10 Storm Real-time processing Topologies run forever Stateless nodes Scalable Guarantees no data loss Open Source => Fast, reactive, real time procesing Hadoop Batch processing Jobs runs to completion Stateful nodes Scalable Guarantees no data loss Open Source => big batch processing = !=
  • 11. 2013 © Trivadis Idea: Simplify dealing with queue & workers 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 11 Scaling is painful – queue partitioning & worker deploy Operational overhead – worker failures & queue backups No guarantees on data processing
  • 12. 2013 © Trivadis What does Storm provide? §  At least once message processing §  Fault-tolerant §  Horizontal scalability §  No intermediate queues §  Less operational overhead §  „Just works“ §  Broad set of use cases §  Stream processing §  Continous computation §  Distributed RPC 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 12
  • 13. 2013 © Trivadis Agenda 1.  What is Twitter Storm 2.  Main Concepts 3.  Trident 4.  BigData and FastData combined – Lambda Architecture 5.  Summary 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 13
  • 14. 2013 © Trivadis Main Concepts – Shown based on Twitter example Show the tweets related to Java Forum (use hashtag #JFS2013) and show the count of the hashtags used 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 14 not showing #JFS2013
  • 15. 2013 © Trivadis Main Concepts - Streams Stream •  an unbounded sequence of tuples •  core abstraction in Storm •  Defined with a schema that names the 
 fields in the tuple •  Value must be serializable •  Every stream has an id Topology •  graph where each node is a spout or bolt •  edges indicating which bolt subscribes 
 to which streams 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 15 Source of stream A Source of stream B Subscribes: A Emits: C Subscribes: A Emits: D Subscribes: A & B Emits: - Subscribes: C & D Emits: - T T T T T T T T
  • 16. 2013 © Trivadis Worker •  executes a subset of a topology, may run one or more executors for one or more components Executor •  thread spawned by worker Task •  performs the actual data processing Main Concepts – Worker, Executor, Task 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 16 Worker Process Task Task Task Task
  • 17. 2013 © Trivadis Main Concepts - Spout A spout is the component which acts as the source of streams in a topology Generally spouts read tuples from an external source and emit them into the topology Spouts can emit more than one stream Step 1: Use a Spout to subscribe to the Twitter Filter Streaming API to get all the tweets containing the hashtag #JFS2013 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 17 Twitter Stream Tweets tweet Tweet
 Stream
  • 18. 2013 © Trivadis Main Concepts - Spout 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 18 Twitter Stream Tweets tweet Tweet
 Stream By calling this method, Storm requests that the Spout emits tuples to the output collector Called when a task for this component is initialized Used to declare streams and their schmeas. it is possible to declare serveral streams and specifying the one to use when emiting tuples
  • 19. 2013 © Trivadis Bolt is doing the processing of the message •  filtering, functions, aggregations, joins, talking to databases…. •  Can do simple stream transformations •  Complex transformations often require multiple steps and therefore multiple bolts Bolts can emit one or more streams Step 2: Use a bolt to retrieve the hashtags for each tweet Main Concepts - Bolt 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 19 Twitter Stream Tweets tweet Hashtag
 Splitter Tweet
 Stream hashtag
  • 20. 2013 © Trivadis Main Concepts - Bolt 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 20 Twitter Stream Tweets tweet Hashtag
 Splitter Tweet
 Stream hashtag Execute is called for each tuple arriving on the input stream Declares the output fields for the component and is called when bolt is created
  • 21. 2013 © Trivadis Main Concepts - Bolt Step 3: Use a bolt to count the occurences of the hashtag in Redis NoSQL 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 21 Twitter Stream Tweets tweet Hashtag
 Splitter Tweet
 Stream hashtag Hashtag
 Counter
  • 22. 2013 © Trivadis Main Concepts - Bolt 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 22 Twitter Stream Tweets tweet Hashtag
 Splitter Tweet
 Stream hashtag Hashtag
 Counter Prepare is called when bolt is created Execute is called for each tuple arriving on the input stream
  • 23. 2013 © Trivadis Main Concepts - Topology Step 4: Setup the topology with the necessary groupings (distribution) Each Spout or Bolt are running N instances in parallel 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 23 Twitter Stream Tweets Hashtag
 Splitter Tweet
 Stream Hashtag
 Counter Hashtag
 Splitter Hashtag
 Counter Shuffle Fields Shuffle grouping is random grouping Fields grouping is grouped by value, such that equal value results in equal task All grouping replicates to all tasks Global grouping makes all tuples go to one task None grouping makes bolt run in the same thread as bold/spout it subscribes to Direct grouping producer (task that emits) controls which consumer will receive
  • 24. 2013 © Trivadis Main Concepts - Topology 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 24 Create Stream called „tweet-stream“ Run this bolt by 2 workers Run this spout by 1 worker Subscribe to stream „tweet-stream“ using shuffle grouping Create stream called „hashtag-splitter“ Subscribe to stream „hashtag-splitter“ using fields grouping on field „hashtag“
  • 25. 2013 © Trivadis Sample – How does it work ? 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 25 Twitter Stream Mein Präsentation „Twitter Storm: Ereignisverarbeitung in Echtzeit“ Morgen um 15:35 am Java Forum Stuttgart #JFS2013 #storm #fastdata CU there! Hashtag
 Splitter Tweet
 Stream Hashtag
 Counter Hashtag
 Splitter Hashtag
 Counter
  • 26. 2013 © Trivadis Sample – How does it work ? 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 26 Twitter Stream Mein Präsentation „Twitter Storm: Ereignisverarbeitung in Echtzeit“ Morgen um 15:35 am Java Forum Stuttgart #JFS2013 #storm #fastdata CU there! Hashtag
 Splitter Tweet
 Stream Hashtag
 Counter Hashtag
 Splitter Hashtag
 Counter storm fastdata jsf2013
  • 27. 2013 © Trivadis Sample – How does it work ? 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 27 Twitter Stream Mein Präsentation „Twitter Storm: Ereignisverarbeitung in Echtzeit“ Morgen um 15:35 am Java Forum Stuttgart #JFS2013 #storm #fastdata CU there! Hashtag
 Splitter Tweet
 Stream Hashtag
 Counter Hashtag
 Splitter Hashtag
 Counter storm fastdata jfs2013 INCR jfs2013 INCR storm INCR fastdata storm = 1 fastdata =1 jfs2013= 1
  • 28. 2013 © Trivadis Agenda 1.  What is Twitter Storm 2.  Main Concepts 3.  Trident 4.  BigData and FastData combined – Lambda Architecture 5.  Summary 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 28
  • 29. 2013 © Trivadis What is Trident? •  High-Level abstraction on top of storm •  Makes it easier to build topologies •  Core data model is the stream •  Processed as a series of batches •  Stream is partitioned among nodes in cluster •  5 kinds of operations in Trident •  Operations that apply locally to each partition and cause no network transfer •  Repartitioning operations that don‘t change the contents •  Aggregation operations that do network transfer •  Operations on grouped streams •  Merges and Joins 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 29
  • 30. 2013 © Trivadis What is Trident? Supports •  Joins •  Aggregations •  Grouping •  Functions •  Filters Similar to Hadoop and Pig/Cascading 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 30
  • 31. 2013 © Trivadis Trident Concepts - Topology 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 31 Twitter Stream Tweets tweet Hashtag
 Splitter Tweet
 Stream hashtag Hashtag
 Normalizer Persistent
 Aggregate hashtag groupBylocal Bolt Bolt
  • 32. 2013 © Trivadis Trident Concepts - Function •  takes in a set of input fields and emits zero or more tuples as output •  fields of the output tuple are appended to the original input tuple in the stream •  If a function emits no tuples, the original input tuple is filtered out •  Otherwise the input tuple is duplicated for each output tuple 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 32
  • 33. 2013 © Trivadis Agenda 1.  What is Twitter Storm 2.  Main Concepts 3.  Trident 4.  BigData and FastData combined – Lambda Architecture 5.  Summary 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 33
  • 34. 2013 © Trivadis Lambda Architecture 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 34 Immutable data Batch View Query Data Stream Realtime View Incoming Data Serving Layer Speed Layer Batch Layer A B C D E F G
  • 35. 2013 © Trivadis Lambda Architecture 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 35 Speed Layer Precompute Views query Source: Marz, N. & Warren, J. (2013) Big Data. Manning. Batch Layer Precomputed information All data Incremented information Process stream Incoming Data Batch recompute Realtime increment Serving Layer batch view batch view real time view real time view Merge
  • 36. 2013 © Trivadis Lambda Architecture 24. April 2013 Big Data und Fast Data 36 one possible product/framework mapping Speed Layer Precompute Views query Batch Layer Precomputed information All data Incremented information Process stream Incoming Data Batch recompute Realtime increment Serving Layer batch view batch view real time view real time view Merge
  • 37. 2013 © Trivadis Agenda 1.  What is Twitter Storm 2.  Main Concepts 3.  Trident 4.  BigData and FastData combined – Lambda Architecture 5.  Summary 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 37
  • 38. 2013 © Trivadis Summary Express real-time processing naturally Not a Hadoop/MapReduce competitor Supports other languages Hard to debug Object Serialization Didn‘t cover §  Reliability through guaranteed message processing §  Distributed RPC §  Storm cluster setup and deploy 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 38
  • 39. 2013 © Trivadis Resources Website (http://storm-project.net/) Wiki (https://github.com/nathanmarz/storm/wiki) Doc (https://github.com/nathanmarz/storm/wiki/Documentation) Storm-starter project (https://github.com/nathanmarz/storm-starter) Mailing list (http://groups.google.com/group/storm-user) 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 39
  • 40. 2013 © Trivadis BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN
 Thank You! Trivadis AG Guido Schmutz
 guido.schmutz@trivadis.com 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 40