Scalable complex event processing on samza @UBER

Scalable Complex Event Processing On
Samza @Uber
Shuyi Chen
Uber Technologies Inc.

● 6 continents, 70 countries, 400+ cities
● Transportation as reliable as running water, everywhere,
for everyone
Uber

Outline
● Motivation
● Architecture
● Limitations
● Challenges

Thousands of Kafka topics from different services

We can extract a lot of useful information from this
rich set of logs in real-time!

Multiple logins from the same IP within a short
interval

Partner accepted a trip
→ partner calls rider through the Uber APP
→ rider cancels the trip

Partners reject the second pickup of a UberPOOL
trip

Multiple logins from the same IP within a short
interval
Window Aggregation

Partner accepted a trip
→ partner calls rider through the Uber APP
→ rider cancels the trip
Pattern detection

Partners reject the second pickup of a UberPOOL
trip
Filter

Can we use declarative semantics to specify these
stream processing logics?

Complex event processing
● Combines data from multiple sources to infer events or patterns that suggest
more complicated circumstances
● CEP is used across many industries for various use cases, including:
○ Finance: Trade analysis, fraud detection
○ Airlines: Operations monitoring
○ Healthcare: Claims processing, patient monitoring
○ Energy and Telecommunications: Outage detection
● CEP uses declarative rule/query language to specify event processing logic

Siddhi: Complex event processing engine
● Lightweight, extensible, open source, released as a Java library
● Features supported
○ Filter
○ Join
○ Aggregation
○ Group by
○ Window
○ Pattern processing
○ Sequence processing
○ Event tables
○ Event-time processing
○ Declarative query language: SiddhiQL

How Siddhi works
● Specify processing logic declaratively with SiddhiQL

How Siddhi works
● Query is parsed at runtime into an execution plan runtime
● As events flow in, the execution plan runtime process events inside the CEP
engine according the query logic

How can we make it scalable at Uber scale?

Samza
● A distributed stream processing framework
○ Scalable
○ Built-in State management
○ Built-in fault tolerant
○ At-least-once message processing
● Good support from our data infra team

How can we make the stream processing output
useful?

Actions
● Generalize a set of common action templates to make it easy for services and
human to harness the power of realtime stream processing
● Currently we support
○ Make an RPC call
○ Invoke a Webhook endpoint
○ Index to ElasticSearch
○ Write Cassandra
○ Kafka
○ Statsd
○ Chat service
○ Email
○ Push notification

Actions
Real-time Scalable Complex Event Processing

Preprocessor
● Enrich raw Kafka events with business information

Shuffler
● Re-shuffle events
● Prefiltering for predicate pushdown

Complex event processor
● Parse Siddhi queries into execution plan runtime
● Process events in Siddhi execution plan runtime
● Checkpoint state regularly to ensure recovery upon crash/restart using
RocksDB

Action processor
● Execute actions upon the complex event output
● Support various kinds of actions for easy integration
● Implement configurable and finite action retry mechanism using RocksDB

No stream processing logic is hard-coded in the data
pipeline

REST API backend
● All queries, actions, shuffling logics and pre-filtering logics are stored
externally in Cassandra
● RESTFUL API for CRUD operations
● Data pipeline automatically reload the data upon update w/o job restart
○ fast data exploration
○ Realtime feedback loop
○ incremental DAG construction
● Decouple processing logic from the data pipeline

Unified management and monitoring
● Every use case
○ share the same data pipeline architecture
○ Use queries and actions to describe its processing logic
● A single monitoring template can be reused across different use cases

Applications
● Real-time fraud detection
● Real-time anomaly detection
● Real-time marketing campaign
● Real-time promotion
● Real-time monitoring
● Real-time feedback system
● Real-time analytics
● Real-time visualizations
● And etc.

Not a general purpose stream processing system

No dynamic topology
● The DAG is not dynamic
● Can not shuffle arbitrary number of times
● Ideally, we can chain multiple copies of the data pipeline to build arbitrary
DAG
○ Large DAG can be difficult to manage and monitor
○ Samza use Kafka as intermediate message queue between jobs, wide DAGs cause large load
on Kafka
○ Out of 40+ use cases we run in production, none requires it.

Out-of-order event handling
● Not a big concern
○ Events of the same rider/partner are usually seconds aparts
● K-slack extension in Siddhi for out-of-order event processing

Job deployment
● Samza job creation is semi-automated
○ Auto-generate standard job properties
○ JVM memory tuning
○ Samza parameter tuning, e.g. container count
● Integrate with in-house cluster job management system to simplify
start/restart/stop/upgrade of Samza jobs

Predicate pushdown
● Allow prefiltering of streams in shuffle stage
● Need manual configuration through Web UI
● In the future, we can automate this by query analysis

Broadcast stream
● We need broadcast stream to broadcast updates in storage backend to the
data pipeline
● No broadcast stream in Samza 0.9.1
● Override SystemStreamPartitionGrouper
● Samza 0.10.0 added broadcast support (SAMZA-676)

Unbalanced task workload
● Shufflers ingest multiple topics with different partition counts
● Default task partition assignment does not scale
● Override SystemStreamPartitionGrouper to balance the partitions across all
tasks

Large checkpointing state
● Samza use Kafka to log state changes
● Kafka message size limit to 1 MB by default
● Solution: we build logics to slice state into smaller pieces and checkpoint
them into Rocksdb

Synchronous checkpointing
● If state is large, time to checkpoint can be long
● Samza uses single-threaded model, unsafe to do it asynchronously
● Ongoing work on multi-thread support in Samza (SAMZA-863)

Exactly once state processing?
● Can not commit state and offset atomically
● No exactly once state processing

Debugging
● Need to inspect multiple logs to diagnose Samza job problems
○ Application master log
○ Multiple container logs
○ Log size is huge
○ Container logs are difficult to locate after job failure
● Sometimes, Samza job get stuck at launch, and no log can be found
○ YARN problem
○ Binary downloading problem

Upgrading Samza jobs
● Upgrade Samza jobs require a full restart, and can take minutes due to
○ Offset checkpointing topic too large → set retention to hours
○ Changelog topic too large → set retention or enable compaction in Kafka or host affinity
(SAMZA-617)
● To minimize the interruption during upgrade, it would be nice to have
○ Rolling restart
○ Per container restart

Our solution: non-interrupted handoff
● For critical jobs, we use replication during upgrade
○ Start a shadow job
○ Upgrade shadow
○ Switch primary and shadow
○ Upgrade primary
○ Switch back
● Downside: require 2x capacity during upgrade

Manage complicated DAG
● Samza uses Kafka as message queue for intermediate processing output
○ This enables sharing of shuffler or preprocessor output among multiple downstream Samza
jobs
○ Increase resource efficiency
● This gradually results in a large and complicated DAG
○ Complicated dependencies between jobs
○ Jobs closer to the sources of the DAG becoming more and more critical
● In practice, we isolate DAGs by logical groups

Scalable complex event processing on samza @UBER

Recommandé

Recommandé

Contenu connexe

Tendances

Tendances (20)

En vedette

En vedette (20)

Similaire à Scalable complex event processing on samza @UBER

Similaire à Scalable complex event processing on samza @UBER (20)

Dernier

Dernier (20)

Scalable complex event processing on samza @UBER