In few years, Wix grew from a small startup with traditional system architecture (based on a monolithic server) to a company that serves 100 million users. Aviran Mordo explains how Wix evolved from a monolithic system to microservices, using some interesting patterns like CQRS to build a blazing-fast, highly scalable, and highly available system.
4. @aviranm
Wix in Numbers
Over 100M users
(website builders)
Static storage is
>5PB of data
~1500 people
work at Wix
3 data centers
+ clouds (Google,
Amazon, Azure)
3B HTTP
requests/day
5. @aviranm
Lighttpd
(file serving)
Wix
(Tomcat)
MySQL
DB
❤
Built for fast development
Stateful login (Tomcat session),
Ehcache, file uploads
No consideration for performance,
scalability & testing
Intended for short-term use
Tomcat, Hibernate,
Custom web
framework
Initial Architecture
6. @aviranm
Lighttpd
(file serving)
Wix
(Tomcat)
MySQL
DB
One monolithic giant that handled
everything – stability problem
Dependency between features
Changes in unrelated areas of the
system caused deployment of the
whole system
Failure in unrelated areas will cause
system wide downtime
Tomcat, Hibernate,
Custom web
framework
Initial Architecture
7. @aviranm
💣
Large and entangled codebase
Hard feature rollout
While at the same time,
the iPad was released
We needed to enable Wix to
move fast
Poor
Development
Velocity
15. @aviranm
Concerns & SLAs
▪ Data Validation
▪ Security / Authentication
▪ Data consistency
▪ Lots of data
Edit websites
▪ High availability
▪ High performance
▪ High traffic volume
▪ Long tail (mutable)
View Sites
(created by Wix Editor)
▪ High availability
▪ High performance
▪ Lots of static files
▪ Very high traffic volume
▪ Viewport optimization
▪ Long tail (immutable)
Serve Media
17. @aviranm
Risk Scoping
BA
DC
❤ Independent build and deployment
❤ Independent scalability
❤ Independent database
💣 Remaining risk - only from interactions with other services
22. @aviranm
Find Your
Critical Path ▪ DB outage with fast recovery =
replication
▪ Data poisoning/corruption = revisions
/ backup
▪ Make the data available at all times =
data distribution to multiple locations
/ providers
Protect the
Data
23. @aviranm
Browser
200 OK Notify
Save
page(s) Editor
Server
Save
page
MySQL
Active Sites
MySQL
ArchiveDC
Replication
MySQL
Active Sites
MySQL
Archive
WixMedia
(Google)
Upload
GCS
WixMedia
(Amazon)
S3
Save flow
24. @aviranm
Browser
200 OK Notify
Save
page(s) Editor
Server
Save
page
MySQL
Active Sites
MySQL
ArchiveDC
Replication
MySQL
Active Sites
MySQL
Archive
WixMedia
(Google)
Upload
GCS
WixMedia
(Amazon)
S3
Self-Healing Process
25. @aviranm
Browser
200 OK
Save
page(s) Editor
Server
Save
page
MySQL
Active Sites
MySQL
ArchiveDC
Replication
MySQL
Active Sites
MySQL
Archive
No DB Transactions
1. Save each page (JSON) as an
atomic operation
2. Page ID is a content based hash
(immutable/idempotent)
3. Finalize transaction by sending
site header (list of pages)
Can generate orphaned pages,
not a problem in practice
1
2
3
27. @aviranm
Wix Media Platform
(WixMP)
▪ Eventual-consistent distributed file system
(5PB user media files)
▪ Dynamic media processing
▪ Multi datacenter aware
▪ Runs on commodity servers & cloud
▪ CDN cache (immutable files)
▪ Automatic fallback across DCs
Critical Path -
Serve Fast
30. @aviranm
Public Segment Roles
▪ Routing (resolve URLs)
▪ Dispatching (to a renderer)
▪ Rendering (HTML,XML,TXT)
Public
Server
HTML
Renderer
HTML
SEO
Renderer
Flash
Renderer
Sitemap
Renderer
Robots.txt
Renderer
Flash
SEO
Renderer
www.example.com
33. @aviranm
Built for Speed
▪ Minimize out-of-process hops (2 DB, 1 RPC)
▪ Lookup tables are cached in memory (using Koboshi github.com/wix/Koboshi)
▪ Denormalized data – optimize for read by primary key (MySQL)
▪ Minimize business logic
34. @aviranm
How a Page Gets Rendered
JavaScript imports
JSON data
(site header + dynamic data)
Bootstrap HTML Template
that Contains Only Data
35. @aviranm
Offload the Rendering Work
to the Browser
The average Intel Core i750 can push
up to 7 GFLOPS without overclocking
Renderer
36. @aviranm
▪ Easy to parse
in JavaScript and Java/Scala
▪ Fairly compact text format
▪ Highly compressible
(5:1 even for small payloads)
▪ Easy to fix rendering bugs and
cross browsers issues
(just deploy a new code)
Why JSON
39. @aviranm
Browser
Serving a Site
A Sunny Day
http://example.wix.com
LB
Public
Renderer
HTTP
Request HTML
CDN WixMP
Resources / Media
Failure Points!
40. @aviranm
Browser
Serving a Site
DC Lost
LB
Public
Renderer
HTTP
Request
LB
Public
Renderer
LB
Public
Renderer
Change DNS
CDN WixMP
http://example.wix.com
41. @aviranm
Serving a Site
Public Lost
CDN WixMP
Browser
LB
Public
Renderer
HTTP
Request
Public
Renderer
LB
Public
Renderer
Fallback
to 2nd DC
http://example.wix.com
42. @aviranm
Serving a Site
Living in the Browser
CDN WixMP
Browser
LB
Public
Renderer
HTTP
Request HTML
CDN WixMP
JSON / Media
CDN
JSON / Media
Fallback
Editor
Pages
Fallback
Fallback
WixMP
http://example.wix.com
43. Summary
▪ Identify concerns and SLA for different parts of the system
▪ Build redundancy in critical path (for availability)
▪ De-normalize data (for performance)
▪ Minimize out-of-process hops (for performance)
▪ Take advantage of client’s CPU power