6. Asynchrony × partial failure is hard²
Replication
Replay
Today:
Consistency criteria for fault-tolerant distributed systems
Blazes: analysis and enforcement
7. This talk is all setup
Frame of mind:
1. Dataflow: a model of distributed computation
2. Anomalies: what can go wrong?
3. Remediation strategies
1. Component properties
2. Delivery mechanisms
Framework:
Blazes – coordination analysis and synthesis
8. Little boxes: the dataflow model
Generalization of distributed services
Components interact via asynchronous calls (streams)
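A minimal sketch of the dataflow picture above: each component is a coroutine, and each stream is a queue between components. The component names (`source`, `upper`, `collect`) and the `None` end marker are assumptions made for this illustration, not part of Blazes.

```python
import asyncio

async def source(out, items):
    # Emit each item onto the output stream, then a hypothetical end marker.
    for item in items:
        await out.put(item)
    await out.put(None)

async def upper(inp, out):
    # A stateless component: transform each message and pass it on.
    while (item := await inp.get()) is not None:
        await out.put(item.upper())
    await out.put(None)

async def collect(inp):
    # A sink that records whatever arrives on its input stream.
    seen = []
    while (item := await inp.get()) is not None:
        seen.append(item)
    return seen

async def main():
    a, b = asyncio.Queue(), asyncio.Queue()
    _, _, result = await asyncio.gather(
        source(a, ["x", "y"]), upper(a, b), collect(b)
    )
    return result

result = asyncio.run(main())
```

With a single queue per edge the messages here arrive in order; in a real deployment the streams may reorder or replay messages, which is where the anomalies come from.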
43. Component properties
• Convergence
– Component replicas receiving the same messages reach the same state
– Rules out divergence
• Confluence
– Output streams have deterministic contents
– Rules out all stream anomalies
Confluent ⇒ convergent
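One way to see the difference is to run a component over every possible delivery order and count the distinct outcomes. The sketch below is illustrative only; `union_component` and `last_writer_component` are hypothetical components chosen for the example.

```python
from itertools import permutations

def union_component(messages):
    # Accumulates a set, so the result does not depend on arrival order.
    state = set()
    for m in messages:
        state.add(m)
    return frozenset(state)

def last_writer_component(messages):
    # Keeps only the most recent message, so arrival order matters.
    state = None
    for m in messages:
        state = m
    return state

def outcomes(component, messages):
    # Try every delivery order and collect the distinct results.
    return {component(order) for order in permutations(messages)}

msgs = ["a", "b", "c"]
confluent_results = outcomes(union_component, msgs)
order_sensitive_results = outcomes(last_writer_component, msgs)
```

The set-union component yields a single outcome across all orders, while the last-writer component yields one outcome per possible final message.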
54. Ordering – global coordination
[Figure: data source, client]
The first principle of successful scalability
is to batter the consistency mechanisms down to a minimum.
– James Hamilton
55. Preventing the anomalies
1. Understand component semantics
(And disallow certain compositions)
2. Constrain message delivery orders
1. Ordering
2. Barriers and sealing
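A minimal sketch of the ordering option, assuming a single hypothetical sequencer that stamps every message with a global sequence number. Replicas of an order-sensitive component then apply messages in stamp order, whatever order they arrive in.

```python
import random

def sequence(messages):
    # Hypothetical sequencer: pair each message with a global sequence number.
    return list(enumerate(messages))

class Replica:
    def __init__(self):
        self.value = None

    def deliver_all(self, stamped):
        # Apply messages in sequence-number order, not arrival order.
        for _, msg in sorted(stamped):
            self.value = msg

stamped = sequence(["set a", "set b", "set c"])
final_values = []
for seed in range(3):
    arrival = stamped[:]
    random.Random(seed).shuffle(arrival)  # simulate a different arrival order
    replica = Replica()
    replica.deliver_all(arrival)
    final_values.append(replica.value)
```

Every replica ends in the same state, but only because all messages passed through one global ordering point.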
56. Barriers – local coordination
[Figure: data source, order-sensitive component, client; deterministic outputs]
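A barrier can be sketched as a component that buffers its input and produces output only once it learns the input is complete. The `END` marker and the `BarrierComponent` class are assumptions for this illustration.

```python
import random

END = object()  # hypothetical marker meaning "no more input will arrive"

class BarrierComponent:
    def __init__(self):
        self._buffer = []
        self.outputs = []

    def receive(self, message):
        if message is END:
            # Input is complete: process it in a canonical order.
            self.outputs.append("".join(sorted(self._buffer)))
            self._buffer.clear()
        else:
            self._buffer.append(message)

def run(messages, seed):
    shuffled = list(messages)
    random.Random(seed).shuffle(shuffled)  # simulate arbitrary delivery order
    component = BarrierComponent()
    for m in shuffled:
        component.receive(m)
    component.receive(END)
    return component.outputs

results = {tuple(run("dcba", seed)) for seed in range(20)}
```

The coordination is local: only this component waits, and only for its own input to finish.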
58. Sealing – continuous barriers
Do partitions of (infinite) input streams “end”?
Can components produce deterministic results given “complete” input partitions?
Sealing: partition barriers for infinite streams
59. Sealing – continuous barriers
Finite partitions of infinite inputs are common
…in distributed systems
– Sessions
– Transactions
– Epochs / views
…and applications
– Auctions
– Chats
– Shopping carts
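Sealing can be sketched as a barrier per partition: the stream never ends, but each partition (here, a shopping cart) does. The `SealedAggregator` class and its method names are hypothetical, chosen only for this example.

```python
from collections import defaultdict

class SealedAggregator:
    def __init__(self):
        self._open = defaultdict(list)  # partitions still receiving input
        self.results = {}               # partitions that have been sealed

    def item(self, partition, value):
        if partition in self.results:
            raise ValueError(f"partition {partition!r} is already sealed")
        self._open[partition].append(value)

    def seal(self, partition):
        # The partition is complete, so its total is now deterministic.
        self.results[partition] = sum(self._open.pop(partition, []))

agg = SealedAggregator()
agg.item("cart-1", 5)
agg.item("cart-2", 7)
agg.item("cart-1", 3)
agg.seal("cart-1")      # cart-1 can be finalized while cart-2 stays open
agg.item("cart-2", 2)
agg.seal("cart-2")
```

Each cart's total is emitted as soon as that cart is sealed, without waiting for any other partition.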
62. Grey boxes
Example: pub/sub
x = publish
y = subscribe
z = deliver
[Figure: inputs x and y, output z; deterministic but unordered]
Severity  Label    Confluent  Stateless
1         CR       X          X
2         CW       X
3         OR_gate             X
4         OW_gate
x->z : CW
y->z : CWT
63. Grey boxes
Example: key/value store
x = put; y = get;
z = response
[Figure: inputs x and y, output z; deterministic but unordered]
Severity  Label    Confluent  Stateless
1         CR       X          X
2         CW       X
3         OR_gate             X
4         OW_gate
x->z : OW_key
y->z : ORT
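The label table can be read as a lookup from annotation to properties. The sketch below encodes the four base labels and flags the paths that are not confluent. It covers only the x->z paths from the two examples and ignores subscripts; the data layout and function name are assumptions for illustration and not the Blazes annotation format.

```python
from typing import NamedTuple

class LabelInfo(NamedTuple):
    severity: int
    confluent: bool
    stateless: bool

# Base labels from the table, with subscripts dropped for this sketch.
LABELS = {
    "CR": LabelInfo(1, True, True),
    "CW": LabelInfo(2, True, False),
    "OR": LabelInfo(3, False, True),
    "OW": LabelInfo(4, False, False),
}

# (component, input, output) -> base label; x->z paths only.
ANNOTATIONS = {
    ("pubsub", "publish", "deliver"): "CW",
    ("kvs", "put", "response"): "OW",
}

def order_sensitive_paths(annotations):
    # Paths whose label is not confluent may need ordering or sealing.
    return sorted(
        path for path, label in annotations.items()
        if not LABELS[label].confluent
    )

flagged = order_sensitive_paths(ANNOTATIONS)
```

Under these assumptions only the key/value store's put path is flagged, which matches the higher severity of its label.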
80. The Blazes frame of mind:
• Asynchronous dataflow model
• Focus on consistency of data in motion
– Component semantics
– Delivery mechanisms and costs
• Automatic, minimal coordination