2. Riak K/V
• Distributed Key-Value Store
• Based on Amazon’s Dynamo
• HTTP and Binary (Protocol Buffers) APIs
• Data access by {Bucket, Key}
• JavaScript Map/Reduce
• Link Walking
• Pluggable Storage (Bitcask, InnoDB, ...)
3. High-Level Dynamo
• Decentralized (no “master” nodes)
• Homogeneous (all nodes can do anything)
• Vector clocks (no reliance on physical time)
• Gossip Protocol (no global state)
• Consistent Hashing for replica placement (a local calculation for each node)
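The vector-clock bullet above can be illustrated with a minimal sketch. It assumes a clock is a plain mapping from node name to counter; the function names (`increment`, `descends`, `concurrent`, `merge`) are illustrative and are not Riak's API.

```python
from typing import Dict

VClock = Dict[str, int]  # node name -> number of updates seen from that node


def increment(clock: VClock, node: str) -> VClock:
    """Return a copy of the clock with `node`'s counter advanced by one."""
    new_clock = dict(clock)
    new_clock[node] = new_clock.get(node, 0) + 1
    return new_clock


def descends(a: VClock, b: VClock) -> bool:
    """True if `a` has seen every update that `b` has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def concurrent(a: VClock, b: VClock) -> bool:
    """True if neither clock descends from the other (conflicting siblings)."""
    return not descends(a, b) and not descends(b, a)


def merge(a: VClock, b: VClock) -> VClock:
    """Pointwise maximum: the smallest clock that descends from both inputs."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}
```

Ordering is decided purely by comparing counters, so no node needs to trust another node's wall clock.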
4. N, R, W Values
• N = number of replicas to store (on distinct nodes)
• R = number of replica responses needed for a successful read (specified per-request)
• W = number of replica responses needed for a successful write (specified per-request)
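A minimal sketch of how R and W might be checked against the replies that come back from the N replicas. The function names are hypothetical. The `quorums_overlap` check (R + W > N) is a general property of strict quorum systems and is not stated on the slide.

```python
from typing import List, Optional


def read_succeeds(replies: List[Optional[str]], r: int) -> bool:
    """`replies` has one entry per replica; None means that replica did not answer."""
    answered = sum(1 for reply in replies if reply is not None)
    return answered >= r


def write_succeeds(acks: List[bool], w: int) -> bool:
    """`acks` has one entry per replica; True means the replica acknowledged the write."""
    return sum(1 for ack in acks if ack) >= w


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """With strict quorums, R + W > N means every read set shares a replica with every write set."""
    return r + w > n
```

For example, with N = 3, choosing R = 2 and W = 2 tolerates one unresponsive replica on either path, while R = 1 and W = 1 favors latency over overlap.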
8. Harvesting a Framework
• We noticed that Riak code fell into one of two categories:
• Code specific to K/V storage
• “generic” distributed systems code
• So we split Riak into K/V and Core
9. Distributed Coordination
• Making many machines act like one
• Division of labor
• Load balancing
• State storage
• Mutual exclusion/locking
13. Virtual Nodes
• Primary actor in a Dynamo-based system
• Handles 1/num_partitions of the total load
• Implements commands dispatched from clients
• Handles handoff when nodes join/leave
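A rough sketch of how a ring might be divided into equal partitions, one virtual node per partition. It assumes a 160-bit (SHA-1 sized) hash space. The round-robin assignment is a deliberate simplification for illustration, not the claim algorithm a real system would use.

```python
from typing import Dict, List

RING_SIZE = 2 ** 160  # assumption: the ring spans the SHA-1 output space


def partition_starts(num_partitions: int) -> List[int]:
    """First hash value of each partition; each covers 1/num_partitions of the ring."""
    step = RING_SIZE // num_partitions
    return [i * step for i in range(num_partitions)]


def claim_round_robin(num_partitions: int, nodes: List[str]) -> Dict[int, str]:
    """Assign each partition (and so each virtual node) to a physical node in turn."""
    return {i: nodes[i % len(nodes)] for i in range(num_partitions)}
```

When a node joins or leaves, the assignment changes and the affected virtual nodes hand their data off to the new owners.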
14. Preference Lists
• Lists of virtual nodes obtained by hashing a request (document, session ID, etc.)
• Allows any node to compute document locations
• Central to replication in Riak
• Down nodes are filtered out and replaced with the next-best nodes in the ring
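The bullets above can be sketched as follows. This is a simplified illustration under stated assumptions: the key is hashed with SHA-1, `owners[i]` names the node that owns partition `i`, and down nodes are simply skipped in ring order. The names `key_partition` and `preference_list` are hypothetical.

```python
import hashlib
from typing import List, Optional, Set, Tuple

RING_SIZE = 2 ** 160  # assumption: SHA-1 sized hash space


def key_partition(bucket: str, key: str, num_partitions: int) -> int:
    """Hash {bucket, key} onto the ring and return the index of the partition it lands in."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    point = int.from_bytes(digest, "big")
    return point * num_partitions // RING_SIZE


def preference_list(
    bucket: str,
    key: str,
    owners: List[str],
    n: int,
    down: Optional[Set[str]] = None,
) -> List[Tuple[int, str]]:
    """Walk the ring from the key's partition, collecting up to n (partition, node) pairs."""
    down = down or set()
    num_partitions = len(owners)
    start = key_partition(bucket, key, num_partitions)
    preflist: List[Tuple[int, str]] = []
    for offset in range(num_partitions):
        index = (start + offset) % num_partitions
        if owners[index] not in down:  # filter out down nodes, fall through to the next one
            preflist.append((index, owners[index]))
        if len(preflist) == n:
            break
    return preflist
```

Because the calculation needs only the key and the ring, any node can run it locally and arrive at the same answer.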
15. Ring Event Watchers
• Notified when ring state changes due to node addition/removal
• API: ring_update(NewRing)
• Can modify ring state in an app-specific fashion
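One way to picture the watcher mechanism is as a chain of callbacks, each receiving the new ring and returning a possibly modified one. The sketch below is a Python analogy only. The `Ring` shape and the `RingEvents` class are hypothetical, and only the `ring_update` name comes from the slide.

```python
from typing import Callable, Dict, List

Ring = Dict[int, str]  # hypothetical shape: partition index -> owning node
RingWatcher = Callable[[Ring], Ring]


class RingEvents:
    """Keeps a list of watchers and passes each new ring through them in order."""

    def __init__(self) -> None:
        self._watchers: List[RingWatcher] = []

    def add_watcher(self, watcher: RingWatcher) -> None:
        self._watchers.append(watcher)

    def ring_update(self, new_ring: Ring) -> Ring:
        # Each watcher gets its own copy and may return an app-specific modification.
        for watcher in self._watchers:
            new_ring = watcher(dict(new_ring))
        return new_ring
```

A watcher that only observes returns the ring unchanged, while one that adjusts ownership returns its edited copy.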
16. Node Event Watchers
• Nodes run and advertise “services”
• API: service_update(Services)
• Active service list used to generate per-app preference lists
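A small sketch of how an active service list could narrow a preference list to nodes that actually run a given application. The `ServiceDirectory` class and `filter_preflist` helper are hypothetical. Only the `service_update` name is taken from the slide, and the per-node signature used here is an assumption.

```python
from typing import Dict, Iterable, List, Set, Tuple


class ServiceDirectory:
    """Tracks which services each node currently advertises."""

    def __init__(self) -> None:
        self._services: Dict[str, Set[str]] = {}

    def service_update(self, node: str, services: Iterable[str]) -> None:
        # Replace the node's advertised services with the latest list.
        self._services[node] = set(services)

    def nodes_running(self, service: str) -> Set[str]:
        return {node for node, running in self._services.items() if service in running}


def filter_preflist(
    preflist: List[Tuple[int, str]], eligible: Set[str]
) -> List[Tuple[int, str]]:
    """Keep only (partition, node) entries whose node runs the requested service."""
    return [(index, node) for index, node in preflist if node in eligible]
```

Two applications sharing one cluster can then compute different preference lists from the same ring.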
17. Use cases
• If distributed systems aren't your core business, outsource them!
• Providing a distribution layer on top of non-distributed systems like:
• Couch, Redis, Memcached
• Implementing your own systems.
18. Current Status and Roadmap
• Erlang-only now, but not for long (HTTP and PB APIs coming)
• Some harvesting left to do (versioned objects, ring/node handler utilities)
• Project templates: skeleton code for writing Riak Core-based systems
• Stronger consistency models (with a Paxos/ZAB-like protocol)