This document gives an overview of Apache Geode, an open-source distributed data management platform. Geode is a distributed key-value store that is highly available, low latency, consistent, and partition tolerant. Keys and values are objects, and Geode adds features such as secondary indexes, querying, transactions, and WAN replication. Partitioned data is hashed into buckets, which are created lazily on servers, and failures are handled by keeping duplicate copies of each bucket. Geode keeps copies consistent by serializing operations on a key at the primary, and it handles network partitions by allowing temporarily inconsistent reads. Redundancy is recovered by having servers copy missing buckets, and data distribution is improved by rebalancing with a greedy optimization algorithm.
4. What is Geode?
● Distributed key-value store
Client
Put (key, value)
Server
Server
Server
5. What is Geode?
● Distributed key-value store
● Highly available
Client
Put (key, value)
Server
Server
Server
6. What is Geode?
● Distributed key-value store
● Highly available
● Low Latency
Client
Put (key, value)
Server
Server
< 1ms
Whoa!
7. What is Geode?
● Distributed key-value store
● Highly available
● Low Latency
● Consistent and Partition Tolerant
Client
Put (key, value)
Server
Server
Oh, no! A network partition!
8. What is Geode?
● Two types of regions
Client
Put (A)
Replicated
Server A
Server
Server
A
A
A
9. What is Geode?
● Two types of regions
Client
Put (A)
Replicated
Server A
Server
Server
A
A
A
Partitioned
Server A
Server
Server
B
A
10. What is Geode?
● Keys and Values are Objects (Java, C++, C#, JSON)
● Has
○ Secondary Indexes & Querying
○ Continuous Queries
○ Transactions
○ Persistence
○ WAN replication
○ Event delivery
○ Parallel functions
○ ...
13. Components
Membership
Distributed Locks
Replicated Regions
Partitioned Regions
Function Execution
Serialization
Messaging
Persistence
Indexes
Querying
WAN Replication
Statistics
Partitioned Regions
- Partitioning & Routing
- High Availability
- Consistency
- Recovery and Rebalancing
14. Partitioned Regions
● A partitioned region is divided into buckets
Put (“Marie Tharp”, value)
Bucket 0
Bucket 1
Bucket 2
Bucket 3
Bucket N
hash = “Marie Tharp”.hashCode()
bucket = hash % num_buckets
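The two lines of pseudocode above can be sketched in runnable form. This is only an illustration, not Geode's implementation: `java_string_hash` is a hypothetical helper that mimics Java's `String.hashCode()`, and the bucket count of 113 is an arbitrary value chosen for the example.

```python
def java_string_hash(s: str) -> int:
    """Mimic Java's String.hashCode(): h = 31*h + code unit, wrapped to a signed 32-bit int."""
    data = s.encode("utf-16-be")
    h = 0
    for i in range(0, len(data), 2):
        unit = int.from_bytes(data[i:i + 2], "big")  # one UTF-16 code unit
        h = (31 * h + unit) & 0xFFFFFFFF
    return h - 2**32 if h >= 2**31 else h


def bucket_for(key: str, num_buckets: int) -> int:
    """Map a key to a bucket id in the range [0, num_buckets)."""
    # Python's % already returns a non-negative result for a positive divisor.
    # In Java, hashCode() can be negative, so Math.floorMod would be needed
    # to get the same effect.
    return java_string_hash(key) % num_buckets


NUM_BUCKETS = 113  # illustrative value for this sketch
bucket = bucket_for("Marie Tharp", NUM_BUCKETS)
```

Because the mapping depends only on the key and the bucket count, every client and server computes the same bucket id for the same key without any coordination.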
15. Partitioned Regions
● Buckets are mapped to servers
Server 2
Server 1
Server 3
Put (“Marie Tharp”, value)
Bucket 0
Bucket 3
Bucket N
Bucket 1
Bucket 2
hash = “Marie Tharp”.hashCode()
bucket = hash % num_buckets
17. What about?
● How does data get to a bucket?
● How does Geode handle failures?
● How does Geode ensure data is consistent?
● How are lost bucket copies replaced?
● How do we improve data distribution?
18. Placing Buckets
● How does data get to a bucket?
● How does Geode handle failures?
● How does Geode ensure data is consistent?
● How are lost bucket copies replaced?
● How do we improve data distribution?
20. Partitioned Regions - Lazy Creation
Server 2
Server 1
Client
Put
(key, value)
Hash
Function
Put in
Bucket 2
Routing
Table
(empty)
Server 3 Proxy
21. Partitioned Regions - Lazy Creation
Server 2
Server 1
Client
Put
(key, value)
Hash
Function
Routing
Table
(empty)
Server 3
Bucket 2
key=value
Proxy
Create Bucket!
22. Partitioned Regions - Lazy Discovery
Server 2
Server 1
Client
Routing
Table
(empty)
Server 3
Bucket 2
key=value
Proxy
Reply - Bucket Metadata Changed!
24. Partitioned Regions - Lazy Discovery
Server 2
Server 1
Client
Put
(key, value)
Hash
Function
Put in
Bucket
Bucket 2
key=value
Routing
Table
Bucket 2
Server 3
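The lazy creation and lazy discovery slides can be read as the following flow: the client starts with an empty routing table, its first put goes through whichever server it contacts (acting as a proxy), the bucket is created on demand, and the reply tells the client that bucket metadata changed so it can refresh its routing table. The sketch below simulates that flow in memory. The class names, the "fewest buckets" placement policy, the version counter, and the toy hash are all assumptions made for illustration, not Geode APIs.

```python
NUM_BUCKETS = 4  # illustrative; not a Geode default


class Cluster:
    """Toy stand-in for the servers; buckets are created on first use."""

    def __init__(self, server_names):
        self.buckets_on = {name: {} for name in server_names}  # server -> {bucket_id: {key: value}}
        self.owner_of = {}  # bucket_id -> server
        self.metadata_version = 0

    def handle_put(self, contact, bucket_id, key, value):
        owner = self.owner_of.get(bucket_id)
        if owner is None:
            # Lazy creation: pick the server hosting the fewest buckets (assumed policy).
            owner = min(self.buckets_on, key=lambda s: len(self.buckets_on[s]))
            self.buckets_on[owner][bucket_id] = {}
            self.owner_of[bucket_id] = owner
            self.metadata_version += 1
        self.buckets_on[owner][bucket_id][key] = value
        hops = 1 if contact == owner else 2  # 2 means the contact server acted as a proxy
        return hops, self.metadata_version


class Client:
    def __init__(self, cluster, default_server):
        self.cluster = cluster
        self.default_server = default_server
        self.routing_table = {}  # starts empty, as on the slides
        self.known_version = 0

    def put(self, key, value):
        bucket_id = sum(key.encode("utf-8")) % NUM_BUCKETS  # toy deterministic hash
        contact = self.routing_table.get(bucket_id, self.default_server)
        hops, version = self.cluster.handle_put(contact, bucket_id, key, value)
        if version != self.known_version:
            # Lazy discovery: the reply says the bucket metadata changed, so refresh.
            self.routing_table = dict(self.cluster.owner_of)
            self.known_version = version
        return hops


cluster = Cluster(["Server 1", "Server 2", "Server 3"])
client = Client(cluster, default_server="Server 3")
first_hops = client.put("key", "value")    # routing table empty: goes through the proxy
second_hops = client.put("key", "value2")  # routing table filled: goes straight to the owner
```

In this sketch the first put takes two hops and the second takes one, which is the point of the lazy approach: nothing is created or learned until an operation actually needs it.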
25. High Availability
● How does data get to a bucket?
● How does Geode handle failures?
● How does Geode ensure data is consistent?
● How are lost bucket copies replaced?
● How do we improve data distribution?
27. Partitioned Regions - High Availability
Server 2
Server 1
Client
Put
(key, value)
Hash
Function
Put in
Bucket
Routing
Table
Bucket 2
Server 3
Bucket 2
key=value
28. Partitioned Regions - High Availability
Server 2
Server 1
Client
Put
(key, value)
Hash
Function
Put in
Bucket
Routing
Table
Bucket 2
Server 3
Bucket 2
key=value
Bucket 2
key=value
29. Partitioned Regions - Failover
Server 2
Server 1
Client
Put
(key, value)
Hash
Function
Put in
Bucket
Bucket 2
key=value
Routing
Table
Bucket 2
Server 3
Bucket 2
key=value
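The high availability and failover slides show each put landing in two copies of Bucket 2, so that one copy survives the loss of a server. A minimal in-memory sketch of that idea follows. The `Bucket` class and the convention that the first host in the list is the one that serves reads are assumptions for illustration only.

```python
class Bucket:
    """Toy bucket that keeps one copy of its data per hosting server."""

    def __init__(self, bucket_id, hosts):
        self.bucket_id = bucket_id
        self.hosts = list(hosts)  # hosts[0] serves reads in this sketch
        self.copies = {host: {} for host in hosts}

    def put(self, key, value):
        # Duplication: every hosting server stores the entry.
        for host in self.hosts:
            self.copies[host][key] = value

    def get(self, key):
        return self.copies[self.hosts[0]].get(key)

    def fail(self, host):
        # Failover: drop the failed host; the next surviving copy takes over.
        self.hosts.remove(host)
        del self.copies[host]
        if not self.hosts:
            raise RuntimeError("all copies of the bucket were lost")


bucket2 = Bucket(2, ["Server 3", "Server 1"])
bucket2.put("key", "value")
bucket2.fail("Server 3")
value_after_failover = bucket2.get("key")
```

After the failure the entry is still readable from the remaining copy, but the bucket is left with a single copy until redundancy is restored.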
30. Consistency
● How does data get to a bucket?
● How does Geode handle failures?
● How does Geode ensure data is consistent?
● How are lost bucket copies replaced?
● How do we add/remove servers?
31. Consistency - Ships Passing in the Night
Server 2
Server 1
Client 1
Put (key, value1)
Bucket 2
key=value1
Server 3
Client 2
Put (key, value2)
Bucket 2
key=value2
32. Consistency - Ships Passing in the Night
Server 2
Server 1
Client 1
Put (key, value1)
Bucket 2
key=value2
Server 3
Client 2
Put (key, value2)
Bucket 2
key=value1
33. Consistency
● How does data get to a bucket?
● How does Geode handle failures?
● How does Geode ensure data is consistent?
● How are lost bucket copies replaced?
● How do we improve data distribution?
36. Consistency
Server 2
Server 1
Client 1
Put (key, value1)
Bucket 2
key=value2
Server 3
Client 2
Put (key, value2)
Bucket 2
key=value2
Operations on a key are serialized on the primary
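The "ships passing in the night" problem and its fix can be illustrated with a small simulation. The first part shows two copies diverging when they apply the same two puts in different orders. The second part routes both puts through a single primary that applies one operation at a time and forwards it to the secondary in the same order. The `PrimaryBucket` class is hypothetical, and it uses one lock per bucket for brevity, which is coarser than serializing per key.

```python
import threading


def apply_in_order(ops):
    """Apply (key, value) puts in the given order and return the resulting store."""
    store = {}
    for key, value in ops:
        store[key] = value
    return store


put1, put2 = ("key", "value1"), ("key", "value2")

# Ships passing in the night: each copy sees a different arrival order.
copy_a = apply_in_order([put1, put2])
copy_b = apply_in_order([put2, put1])  # ends up different from copy_a


class PrimaryBucket:
    """Toy primary that decides a single order and forwards it to the secondaries."""

    def __init__(self, secondaries):
        self._lock = threading.Lock()
        self.data = {}
        self.secondaries = secondaries  # list of dicts standing in for remote copies

    def put(self, key, value):
        with self._lock:  # one operation at a time, so every copy sees the same order
            self.data[key] = value
            for copy in self.secondaries:
                copy[key] = value


secondary = {}
primary = PrimaryBucket([secondary])
writers = [threading.Thread(target=primary.put, args=op) for op in (put1, put2)]
for t in writers:
    t.start()
for t in writers:
    t.join()
```

Which of the two values wins depends on thread timing, but the primary and the secondary always agree on it, which is the property the slide is after.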
37. Consistency - Lingering Operations
Server 2
Server 1
Client
Put
(key, value)
Hash
Function
Put in
Bucket
Bucket 2
key=value
Routing
Table
Bucket 2
Server 3
Bucket 2
key=value
43. Consistency - Network Partitions
Server 2
Server 1
Client 1
Put (key, value1)
Bucket 2
key=value2
Client 2
Put (key, value2)
Bucket 2
key=value2
44. Restoring Redundancy
● How does data get to a bucket?
● How does Geode handle failures?
● How does Geode ensure data is consistent?
● How are lost bucket copies replaced?
● How do we improve data distribution?
46. Partitioned Regions - Redundancy Recovery
Start
Server 4
Server 2
Bucket 2
Redundancy
Provider
Redundancy
Provider
Server 3
Redundancy
Provider
Start
Start
47. Partitioned Regions - Redundancy Recovery
Server 4
Server 2
Bucket 2
Redundancy
Provider
Redundancy
Provider
Server 3
Redundancy
Provider
Got a lock!
48. Partitioned Regions - Redundancy Recovery
Server 4
Server 2
Bucket 2
Redundancy
Provider
Redundancy
Provider
Server 3
Redundancy
Provider
Bucket 2
Make a copy!
Copy Bucket
49. Partitioned Regions - Redundancy Recovery
Server 4
Server 2
Bucket 2
Redundancy
Provider
Redundancy
Provider
Server 3
Redundancy
Provider
Nothing to Do
Bucket 2
50. Partitioned Regions - Redundancy Recovery
Nothing to Do
Server 4
Server 2
Bucket 2
Redundancy
Provider
Redundancy
Provider
Server 3
Redundancy
Provider
Bucket 2
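The redundancy recovery slides suggest this sequence: every server starts a redundancy provider, one of them gets a lock and copies the bucket, and the others then find nothing to do. The sketch below simulates that with threads. A local `threading.Lock` stands in for a distributed lock, and the function name, the required copy count of 2, and the assumption that Bucket 2 initially lives only on Server 2 are all choices made for the example.

```python
import threading

REQUIRED_COPIES = 2  # one existing copy plus one redundant copy (assumed)


def run_redundancy_provider(server, bucket_hosts, recovery_lock, log):
    """Run on one server: take the lock, then add a copy only if one is still missing."""
    with recovery_lock:
        for bucket_id, hosts in bucket_hosts.items():
            if len(hosts) < REQUIRED_COPIES and server not in hosts:
                hosts.add(server)  # "Make a copy!"
                log.append((server, bucket_id, "copied"))
            else:
                log.append((server, bucket_id, "nothing to do"))


bucket_hosts = {2: {"Server 2"}}  # Bucket 2 has lost its redundant copy
recovery_lock = threading.Lock()
log = []
providers = [
    threading.Thread(
        target=run_redundancy_provider,
        args=(name, bucket_hosts, recovery_lock, log),
    )
    for name in ("Server 2", "Server 3", "Server 4")
]
for t in providers:
    t.start()
for t in providers:
    t.join()
```

Because the check and the copy happen while holding the lock, exactly one server creates the missing copy no matter which provider runs first.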
51. Rebalancing
● How does data get to a bucket?
● How does Geode handle failures?
● How does Geode ensure data is consistent?
● How are lost bucket copies replaced?
● How do we improve data distribution?
53. Rebalancing - What are we optimizing
● Cost based optimizer
● Minimizes the variance in bytes stored on each member
● Greedy algorithm
○ Maximize the improvement in variance per byte moved
Bucket 1
Bucket 3
Bucket 2
Server 1
Bucket 1
Bucket 3
Bucket 2
Variance: 1600
Server 2
Server 3
60
0
0
54. Rebalancing - What are we optimizing
Server 3
Server 1
Server 2
● Cost based optimizer
● Minimizes the variance in bytes stored on each member
● Greedy algorithm
○ Maximize the improvement in variance per byte moved
Bucket 1
Bucket 3
Bucket 2
Variance: 1050
45
15
0
55. Rebalancing - What are we optimizing
Server 3
Server 1
Server 2
● Cost based optimizer
● Minimizes the variance in bytes stored on each member
● Greedy algorithm
○ Maximize the improvement in variance per byte moved
Bucket 1
Bucket 3
Bucket 2
Variance: 150
30
15
15
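The greedy rule on these slides, "maximize the improvement in variance per byte moved", can be sketched as a loop that tries every single-bucket move and applies the best one until no move helps. This is an illustration under stated assumptions, not Geode's rebalancer: the bucket sizes (15, 15, and 30) are guesses that fit the 60 / 45 / 30 loads shown, and "variance" is taken to mean the sum of squared deviations from the mean load. Under that reading the scores after each move come out as 1050 and 150, matching the slides; the starting 60/0/0 layout scores 2400 with this formula, so the 1600 shown on the first slide presumably comes from a different calculation.

```python
def imbalance(loads):
    """Sum of squared deviations of per-server bytes from the mean (assumed metric)."""
    mean = sum(loads.values()) / len(loads)
    return sum((x - mean) ** 2 for x in loads.values())


def rebalance(placement, sizes, servers):
    """Greedily move one bucket at a time; returns the score before each move and at the end.

    placement: bucket -> server (mutated in place), sizes: bucket -> bytes.
    """
    history = []
    while True:
        loads = {s: 0 for s in servers}
        for bucket, server in placement.items():
            loads[server] += sizes[bucket]
        current = imbalance(loads)
        history.append(current)

        best = None  # (gain per byte moved, bucket, target server)
        for bucket, source in placement.items():
            for target in servers:
                if target == source:
                    continue
                trial = dict(loads)
                trial[source] -= sizes[bucket]
                trial[target] += sizes[bucket]
                gain_per_byte = (current - imbalance(trial)) / sizes[bucket]
                if gain_per_byte > 0 and (best is None or gain_per_byte > best[0]):
                    best = (gain_per_byte, bucket, target)

        if best is None:  # no move improves the score, so stop
            return history
        placement[best[1]] = best[2]


servers = ["Server 1", "Server 2", "Server 3"]
sizes = {"Bucket 1": 15, "Bucket 2": 15, "Bucket 3": 30}  # assumed sizes
placement = {bucket: "Server 1" for bucket in sizes}      # everything starts on Server 1
history = rebalance(placement, sizes, servers)
```

Dividing the gain by the bytes moved is what makes the algorithm prefer the small buckets first: moving the 30-byte bucket would reduce the score more in one step, but at a higher transfer cost per unit of improvement.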
56. Rebalancing - what does it do?
Three Phases
1. Restore Redundancy
2. Optimize bucket distribution
3. Optimize primary distribution
Membership changes start from phase 1 again.
57. Putting it Together
● Start with the simple idea: Hashing
● Using: Laziness, Duplication, Bossiness, and Greed
● Get
○ High Availability
○ Low Latency
○ Consistency