The document discusses challenges in distributed systems including:
1) Avoiding database transactions by using logical transactions and handling replication conflicts.
2) Handling database schema changes by adding fields lazily with feature toggles and migrating data in the background.
3) Ensuring consistent reads across replicas by having a separate API and handling cross-datacenter replication lag.
4) Dealing with multiple datacenters by pinning APIs, separating read/write services, using a SQL proxy, or client routing.
Arrested by the CAP, Devoxx UK 2018
1. @aviranm
Aviran Mordo, Head of Wix Engineering
Handling Data in Distributed Systems
Arrested by the CAP
Twitter: @aviranm linkedin/aviran aviransplace.com
Logical DB transaction: saving a Wix site's data

1. Save each page as an atomic operation.
2. Finalize the transaction by sending the site header (pointers to pages).

This can generate orphaned pages, which is not a problem in practice.

[Diagram: Browser (Editor) sends "save page(s)" to the Server, which saves them to the Site Pages DB; it then sends "save header" (the list of page IDs), which the Server saves to the Site Header DB.]
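
The two steps above can be sketched as follows. This is a minimal in-memory illustration, not the Wix implementation: `InMemorySiteStore`, `save_site`, and the dictionary-backed "tables" are hypothetical names chosen for the example. It shows why an interrupted save leaves only orphaned pages behind while readers keep seeing the previous, complete site.

```python
class InMemorySiteStore:
    """Hypothetical stand-in for the pages table and the site-header table."""

    def __init__(self):
        self.pages = {}    # page_id -> page content
        self.headers = {}  # site_id -> list of page IDs

    def save_page(self, page_id, content):
        # Step 1: each page write is atomic on its own; no multi-row DB transaction.
        self.pages[page_id] = content

    def save_header(self, site_id, page_ids):
        # Step 2: one atomic write that "commits" the logical transaction.
        self.headers[site_id] = list(page_ids)

    def load_site(self, site_id):
        # Readers only follow the pointers stored in the current header.
        return [self.pages[pid] for pid in self.headers.get(site_id, [])]


def save_site(store, site_id, pages):
    """Save all pages first, then publish them by writing the header."""
    for page_id, content in pages.items():
        store.save_page(page_id, content)
    store.save_header(site_id, list(pages))
```

If the process stops after some `save_page` calls but before `save_header`, the new pages exist but nothing points to them, which matches the "orphaned pages" remark on the slide.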
Avoiding replication conflicts

[Diagram: a Pages table in MySQL in DC-1 and a Pages table in MySQL in DC-2.]

The page ID is a content-based hash:
• Immutable data
• Idempotent operation

DB conflicts can be safely ignored, as the content is identical.
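
A small sketch of the idea, under stated assumptions: the slide does not say which hash is used, so SHA-256 is an assumption here, and the two dictionaries merely stand in for the Pages tables in each data center. The function names are hypothetical.

```python
import hashlib


def page_id_for(content: str) -> str:
    """Derive the page ID from the content itself (SHA-256 is an assumption)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def save_page(table: dict, content: str) -> str:
    page_id = page_id_for(content)
    # Insert-if-absent: a row is never updated, so repeating the write
    # (or receiving the same row from the other DC) changes nothing.
    table.setdefault(page_id, content)
    return page_id


def replicate(source: dict, target: dict) -> None:
    """Toy replication: copy rows across; a key clash carries identical content."""
    for page_id, content in source.items():
        target.setdefault(page_id, content)
```

Because the key is a function of the content, two data centers that save the same page independently produce the same row, so a duplicate-key conflict during replication carries no information and can be dropped.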
Database changes

1. Add fields
   1.1. For adding metadata (non-indexed fields): use a blob field for schema flexibility (JSON works really well).
   1.2. If the fields are searchable (indexed fields): use another table and join by primary key.
2. Remove fields: stop using the field in the code. Do not make any DB schema changes.
3. Complete schema / database change: lazy migration.
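
Points 1.1 and 1.2 can be illustrated with SQLite from the Python standard library. The table and column names (`products`, `extra`, `product_sku`) are invented for this sketch, and the talk itself refers to MySQL, so treat this as an illustration of the pattern rather than the actual schema.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, extra TEXT);
    CREATE TABLE product_sku (product_id INTEGER PRIMARY KEY, sku TEXT);
    CREATE INDEX idx_product_sku ON product_sku (sku);
""")


def add_product(product_id, name, sku, **metadata):
    # 1.1: non-indexed metadata goes into the JSON blob; no ALTER TABLE needed.
    conn.execute("INSERT INTO products VALUES (?, ?, ?)",
                 (product_id, name, json.dumps(metadata)))
    # 1.2: the searchable field lives in its own table, keyed by the same primary key.
    conn.execute("INSERT INTO product_sku VALUES (?, ?)", (product_id, sku))


def find_by_sku(sku):
    row = conn.execute(
        "SELECT p.id, p.name, p.extra FROM products p "
        "JOIN product_sku s ON s.product_id = p.id WHERE s.sku = ?",
        (sku,),
    ).fetchone()
    if row is None:
        return None
    product = {"id": row[0], "name": row[1]}
    product.update(json.loads(row[2]))  # unknown keys are simply carried along
    return product
```

Removing a field under this scheme (point 2) means the code stops reading that key from the blob; the stored rows are left untouched.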
Feature toggle = code branch

Not just a Boolean; it can also be a state.

A toggle can have criteria:
• Company employees
• Specific users / groups
• Percentage of traffic
• By geo
• By language
• By user agent
• User-profile based
• Any other context…

[Diagram: when the feature toggle (FT) is open, the request runs the new code; otherwise it runs the old code.]

http://github.com/wix/petri
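
A toggle with criteria can be sketched as below. This is not Petri's API; `Context`, `FeatureToggle`, and `render` are hypothetical names, and only a few of the criteria listed above are modeled (specific users, employees, geo, percentage of traffic).

```python
import zlib
from dataclasses import dataclass


@dataclass
class Context:
    user_id: str
    is_employee: bool = False
    country: str = ""


@dataclass
class FeatureToggle:
    name: str
    employees_only: bool = False
    allowed_users: frozenset = frozenset()
    countries: frozenset = frozenset()
    percentage: int = 0  # share of traffic, 0..100

    def is_open(self, ctx: Context) -> bool:
        if ctx.user_id in self.allowed_users:
            return True
        if self.employees_only:
            return ctx.is_employee
        if self.countries and ctx.country not in self.countries:
            return False
        # Stable bucket per (toggle, user), so a user always gets the same branch.
        bucket = zlib.crc32(f"{self.name}:{ctx.user_id}".encode()) % 100
        return bucket < self.percentage


def render(ctx: Context, toggle: FeatureToggle) -> str:
    # The toggle is literally a code branch.
    return "new code" if toggle.is_open(ctx) else "old code"
```

Hashing the toggle name together with the user ID keeps each user in the same bucket across requests, so raising `percentage` only ever adds users to the new branch.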
New DB schema with data migration

1. Deploy the new schema/DB.
2. Plan a lazy migration path controlled by a feature toggle.
Lazy migration path

#1 Write to old / read from old
#2 Write to both (first old, then new) / read from old
#3 Write to both / read from new, fall back to old
#4 Write only to new / read from new, fall back to old
#5 Eagerly migrate data in the background
#6 Write and read to new; remove the migration code

Warning (while writing to both): this is a distributed transaction. Fail on a write failure to old; "ignore" a failure on new.

Point of no return (once writes go only to new): backward compatibility is a must! Your old DB is now read-only and will not change.
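
The migration path can be sketched as a store wrapper whose behavior depends on the current phase. Everything here is hypothetical scaffolding: `Phase`, `MigratingStore`, and `migrate_in_background` are invented names, plain dictionaries stand in for the old and new databases, and in practice the phase would be driven by a feature toggle rather than set by hand.

```python
from enum import IntEnum


class Phase(IntEnum):
    OLD_ONLY = 1             # write old / read old
    DUAL_WRITE_READ_OLD = 2  # write both (old first) / read old
    DUAL_WRITE_READ_NEW = 3  # write both / read new, fall back to old
    NEW_WRITE_FALLBACK = 4   # write new only / read new, fall back to old
    NEW_ONLY = 6             # write and read new


class MigratingStore:
    def __init__(self, old, new, phase):
        self.old, self.new, self.phase = old, new, phase

    def write(self, key, value):
        if self.phase <= Phase.DUAL_WRITE_READ_NEW:
            self.old[key] = value  # a failure here fails the whole request
        if self.phase >= Phase.DUAL_WRITE_READ_OLD:
            try:
                self.new[key] = value
            except Exception:
                if self.phase >= Phase.NEW_WRITE_FALLBACK:
                    raise  # new is now the source of truth
                # Phases 2 and 3: "ignore" the failure; old still holds the data.

    def read(self, key):
        if self.phase <= Phase.DUAL_WRITE_READ_OLD:
            return self.old.get(key)
        if key in self.new:
            return self.new[key]
        if self.phase < Phase.NEW_ONLY:
            return self.old.get(key)  # fallback for rows not yet migrated
        return None


def migrate_in_background(old, new):
    """Step #5: copy rows the lazy path has not touched; never overwrite new."""
    for key, value in old.items():
        new.setdefault(key, value)
```

From phase 4 onward the old store is never written again, which is why that step is the point of no return: rolling back would lose every write made since.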
Cross DC flows

[Diagram: DC-1 and DC-2 each have a Load Balancer, a Product Service, a Master DB, and a Slave DB. Each Product Service reads data from the DBs, and "Replicate" arrows link the databases.]
Master DC

Configure the master DC in the LB.
Configure API-level stickiness.

[Diagram: same layout as the cross-DC flow, with a Load Balancer, Product Service, Master DB, and Slave DB in each of DC-1 and DC-2. Calls to GetConsistentProduct(…) are directed to DC-1.]
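
API-level stickiness amounts to a routing rule in the load balancer. The sketch below is an assumption-laden illustration: `MASTER_DC`, `STICKY_APIS`, and `choose_dc` are hypothetical names, and `GetProduct` is an invented example of a call that tolerates replication lag. Only `GetConsistentProduct` comes from the slide.

```python
MASTER_DC = "DC-1"  # assumption: DC-1 is the DC configured as master in the LB

# Hypothetical set of API methods that must not observe replication lag.
STICKY_APIS = {"GetConsistentProduct"}


def choose_dc(api_method: str, local_dc: str) -> str:
    """Return the DC that should serve this call."""
    if api_method in STICKY_APIS:
        return MASTER_DC  # pinned: always served by the master DC
    return local_dc       # everything else stays local and may see stale data


# Example: a consistent read arriving in DC-2 is sent to DC-1,
# while an ordinary read is served locally.
routes = {
    "GetConsistentProduct": choose_dc("GetConsistentProduct", "DC-2"),
    "GetProduct": choose_dc("GetProduct", "DC-2"),
}
```

The trade-off is that only the calls that genuinely need consistency pay the cross-DC latency; all other traffic is still served from the nearest data center.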
Master DC

Configure the master DC in the LB.
Configure service-level stickiness.

[Diagram: in each of DC-1 and DC-2, the Product Service is split into a Product Write Service and a Product Read Service behind the Load Balancer, with a Master DB and a Slave DB linked by "Replicate" arrows. Calls to GetConsistentProduct(…) are directed to DC-1.]
Master DC

Configure the master DC in the SQL Proxy.

[Diagram: in each of DC-1 and DC-2, a Load Balancer fronts a Product Service, which reaches the Master DB and Slave DB through a SQL Proxy; "Replicate" arrows link the databases. Calls to GetConsistentProduct(…) are directed to DC-1.]