The story of the plan that was just crazy enough to work! Learn how Booking.com failed its way to success on a multi-year journey away from single-purpose storage appliances, predatory licensing, and over-complicated networking to create a unique storage solution for its hyper-scale private-cloud environment.
2. About
● The largest part of Booking Holdings (formerly Priceline Group)
● One of the largest e-Commerce websites in the world
● The largest online accommodation website in the world
● >1.5 Million properties in 220+ countries and territories
● 1.55 Million room nights booked every 24 hours
● >15,000 employees in 198 offices in 70 countries
● 1000s of LUNs, NFS shares, and S3 buckets
● Managed by a storage team of only 4 people
(as of October 2019)
9. Attributes of SDS Hardware
● Compact Chassis
● Standard Form Factor
● Standard Power
● Standard Cooling
● Easy to Service
"Shermans and T34s"
● Extensible Design
● Off the Shelf Ordering
● Optimized Supply Chain
● Multi Purpose
● Cost Effective
MAXIMUM RE-USABILITY
11. Building a BOM
● 2U max height
● 90cm max depth
● ~500W power
● 100Gb networking
● Non-volatile memory
● NVMe and high-capacity disk
● Maximize Terabytes/Watt
● Broadest software eco-system possible
● Simple enough for remote hands to install
Thinking inside the box... (BOM = Bill of Materials)
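The requirements on this slide can be read as a checklist that any candidate chassis either passes or fails. The following is a minimal sketch of that idea, not the team's actual tooling: the `Candidate` class, its field names, and the example values are hypothetical, and the "~500W" target is treated as a hard cap purely for illustration.

```python
from dataclasses import dataclass

# Limits taken from the slide. Treating "~500W" as a hard cap is an assumption.
MAX_HEIGHT_U = 2
MAX_DEPTH_CM = 90
MAX_POWER_W = 500
MIN_NIC_GBIT = 100


@dataclass
class Candidate:
    """Hypothetical description of a server chassis under evaluation."""
    name: str
    height_u: int
    depth_cm: float
    power_w: float
    nic_gbit: int
    raw_tb: float


def violations(c: Candidate) -> list[str]:
    """Return the names of the slide's limits that the candidate exceeds."""
    problems = []
    if c.height_u > MAX_HEIGHT_U:
        problems.append("height")
    if c.depth_cm > MAX_DEPTH_CM:
        problems.append("depth")
    if c.power_w > MAX_POWER_W:
        problems.append("power")
    if c.nic_gbit < MIN_NIC_GBIT:
        problems.append("network")
    return problems


def tb_per_watt(c: Candidate) -> float:
    """Raw terabytes per watt, the figure the slide asks to maximize."""
    return c.raw_tb / c.power_w


# Made-up example values, for illustration only.
example = Candidate("example-2u", height_u=2, depth_cm=85.0,
                    power_w=480.0, nic_gbit=100, raw_tb=200.0)
print(example.name, violations(example), round(tb_per_watt(example), 3))
```

Softer requirements such as "simple enough for remote hands to install" do not reduce to a number and are left out of the sketch.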
12. The plan "crazy enough to work"
Image credit: Dutch National Archives
Image credit: U.S. Public Domain
13. The Booking BOM Gen1 "SFF" Intel
● 2 x Intel Skylake 6146 CPUs
● 12 x 32GB DDR4 RAM (384GB total)
● 12 x 16GB NVDIMM-N (192GB total)
● 1 x 100Gb NIC
● 4 - 16 x 15.36TB (245.76TB max)
● 1 x HHHL FPGA storage accelerator (optional)
● Tool-less L-bracket rails
● Color-coded C13 to C14 power-cords
● 50cm and 1m network cables in the box
● Installs in < 5mins
14. The Booking BOM Gen2 "SFF" AMD
● 1 x AMD EPYC Rome 7402P 32-core CPU
● 12 x 128GB DDR4 RAM (1.5TB total)
● 4 x 32GB NVDIMM-N (128GB total)
● 1 x 100Gb NIC
● 4 - 24 x 15.36TB U.2 NVMe (368.64TB max)
● 1 x HHHL FPGA storage accelerator (optional)
● Tool-less L-bracket rails
● Color-coded C13 to C14 power-cords
● 50cm and 1m network cables in the box
● Installs in < 5mins
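The totals quoted on the two SFF slides follow directly from the part counts, and the same numbers give a rough terabytes-per-watt comparison between generations. The sketch below is illustrative only: the `SffBom` class and its field names are hypothetical, and it assumes both generations are held to the "~500W" design target from the requirements slide, which is a target rather than a measured power draw.

```python
from dataclasses import dataclass

POWER_TARGET_W = 500  # "~500W" design target; assumed to apply to both generations


@dataclass(frozen=True)
class SffBom:
    """Hypothetical record of the part counts listed on an SFF BOM slide."""
    name: str
    dimm_count: int
    dimm_gb: int
    nvdimm_count: int
    nvdimm_gb: int
    max_drives: int
    drive_tb: float

    @property
    def ram_gb(self) -> int:
        return self.dimm_count * self.dimm_gb

    @property
    def nvdimm_total_gb(self) -> int:
        return self.nvdimm_count * self.nvdimm_gb

    @property
    def max_flash_tb(self) -> float:
        return self.max_drives * self.drive_tb

    @property
    def tb_per_watt(self) -> float:
        # Fully populated capacity divided by the assumed power target.
        return self.max_flash_tb / POWER_TARGET_W


GEN1 = SffBom("Gen1 SFF Intel", 12, 32, 12, 16, 16, 15.36)
GEN2 = SffBom("Gen2 SFF AMD", 12, 128, 4, 32, 24, 15.36)

for bom in (GEN1, GEN2):
    print(f"{bom.name}: {bom.ram_gb} GB RAM, {bom.nvdimm_total_gb} GB NVDIMM, "
          f"{bom.max_flash_tb:.2f} TB flash, {bom.tb_per_watt:.3f} TB/W")
```

Under that power assumption, the move from 16 to 24 drive bays is what raises the fully populated capacity per watt in Gen2.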
15. The Booking BOM Gen1 "LFF"
● 2 x Intel Skylake 6132 CPUs
● 12 x 32GB DDR4 RAM (384GB total)
● 12 x 16GB NVDIMM-N (192GB total)
● 1 x 100Gb NIC
● 12 x 14TB 7200rpm SATA disks (168TB total)
● 2 x 15.36TB HHHL NVMe (30.72TB total)
● Tool-less L-bracket rails
● Color-coded C13 to C14 power-cords
● 50cm and 1m network cables in the box
● Installs in < 5mins
18. Re-thinking the solution...
(12 + 4) x 14TB = 16 x 14TB = 224TB disk capacity
2 x 15.36TB = 30.72TB NVMe SSD capacity
12 x 14TB disks + 4 x 14TB disks
1 x 100Gb NIC
2 x 15.36TB NVMe
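The capacity arithmetic on this slide can be reproduced in a few lines, which also shows how much NVMe sits in front of the spinning disks. This is a sketch only: the function name `node_capacity` and the idea of reporting a flash-to-disk ratio are illustrative additions, not something the slides state.

```python
def node_capacity(disk_groups, disk_tb, nvme_count, nvme_tb):
    """Return (disk TB, NVMe TB, NVMe-to-disk ratio) for one node.

    disk_groups lists the number of disks in each group of bays,
    for example (12, 4) for the layout on this slide.
    """
    total_disk_tb = sum(disk_groups) * disk_tb
    total_nvme_tb = nvme_count * nvme_tb
    return total_disk_tb, total_nvme_tb, total_nvme_tb / total_disk_tb


disk, nvme, ratio = node_capacity((12, 4), 14, 2, 15.36)
print(f"{disk} TB disk, {nvme:.2f} TB NVMe, {ratio:.1%} flash-to-disk")
```

With the slide's figures, the two NVMe devices amount to roughly 13.7% of the disk capacity.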
19. How complex is this?
(Diagram labels: 1, 2, 3, 4)
4 cables x (2U - 1) / 1 rack = factor 4 complexity
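The complexity figure above is a simple ratio, and writing it as a function makes it easy to try other values. The slides do not define the terms beyond the formula itself, so this is a literal transcription under that caveat; the function and parameter names are hypothetical.

```python
def complexity_factor(cables: int, node_height_u: int, racks: int) -> float:
    """Literal transcription of the slide's formula:
    cables x (node height in U - 1) / racks.
    """
    if racks <= 0:
        raise ValueError("racks must be a positive integer")
    return cables * (node_height_u - 1) / racks


# The slide's own example: 4 cables, a 2U node, 1 rack.
print(complexity_factor(cables=4, node_height_u=2, racks=1))
```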
21. What we have achieved so far...
● Deployed ~100 2U storage nodes
● Eliminated dedicated storage racks
● Cut power draw by over 50%
● Gone "all in" on software-defined storage
● Eliminated storage hardware maintenance
● Switched entirely to software subscriptions
● Increased utilization while reducing costs
● Put total storage spending on a downward trend
● ...despite continued high data growth!
23. What is your Unthinkable?
Image credit: Michael Coppins - Wikimedia Commons
24. Closing thoughts
● Plans you do not want to execute can yield great outcomes
● Define your own Unthinkable when it comes to SDS
● Build a BOM and don't compromise
● Recruit allies - you will need them!
● Expect failure and work through it
● Take your time