1. Ceph at Salesforce
Sameer Tiwari - Principal Architect, Storage Cloud
stiwari@salesforce.com
@techsameer
https://www.linkedin.com/in/sameer-tiwari-1961311/
3/17/2017 - Ceph Day at San Jose
2. Data Types
Structured Customer Data: Mostly transactional data on RDBMS
Unstructured Customer Data: Immutable blobs on a home-grown distributed storage system
SAN usage across multiple use cases
Backups: Both commercial solutions and internal systems
Caching: Immutable structured blobs
Events: On HDFS (plus other systems along the way)
Logs: On HDFS (plus other systems along the way)
3. Storage Technologies Used
File Storage
NoSQL: HBase
HDFS
SAN
SDS (Software-Defined Storage) on scale-out commodity hardware
4. Uses for Ceph
Block Store
Backend for RDBMSs (maybe with BookKeeper for the journal)
Mountable cloud disks of various sizes, up to much larger than a local disk (see the sketch after this slide)
Re-mountable storage for VMs
Replace some SAN scenarios
Blob Store
General-purpose blob store
Sharing of data across users
Examples: VM/container images, core dumps, large file transfer, customer data, IoT
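A minimal sketch of the mountable-disk flow above, using the stock rbd CLI. The pool and image names are hypothetical, and --size is given in MB as on Jewel-era clients:

  # Create a thin-provisioned image; it can be far larger than any local disk.
  rbd create vm42-data --pool cloud-disks --size 102400    # 100 GiB
  # Map it on the client host; the kernel exposes it as /dev/rbdN.
  rbd map cloud-disks/vm42-data
  # Format and mount like any local block device.
  mkfs.xfs /dev/rbd0
  mount /dev/rbd0 /mnt/vm42-data
  # Re-mountable: unmount and unmap here, then map the same image on another host.
  umount /mnt/vm42-data && rbd unmap /dev/rbd0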
6. Current Status
Experimenting with multiple small test clusters (~100 nodes)
Machines generally have lots of RAM, a few SSDs, and a bunch of HDDs
Currently on a single 10G network, moving to something much bigger
Machines are spread across many racks, but in a single room (very little over-provisioning)
Testing only RBD
Simple CRUSH map modifications for creating SSD-only pools and availability zones (see the sketch after this list)
Very high magnitude of scale: multiple clusters, across multiple DCs, each multi-tenant
Operationalizing for a very different and challenging set of requirements
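A sketch of the kind of CRUSH map modification mentioned above. Pre-Luminous Ceph has no device classes, so an SSD-only pool needs its own CRUSH root; the bucket, rule, pool, and OSD names here are hypothetical:

  # Build a separate CRUSH root that holds only SSD-backed OSDs.
  ceph osd crush add-bucket ssd-root root
  ceph osd crush add-bucket node01-ssd host
  ceph osd crush move node01-ssd root=ssd-root
  ceph osd crush set osd.24 0.436 host=node01-ssd   # weight roughly = size in TB
  # Rule that picks replicas only from the SSD root, one per host.
  ceph osd crush rule create-simple ssd-rule ssd-root host
  # Pool whose placement follows the SSD-only rule.
  ceph osd pool create ssd-pool 512 512 replicated ssd-rule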
7. Performance numbers (using fio to provide the test load)
SSD-only pool with 12 machines, each with two 12-core CPUs, 128 GB RAM, and two 480 GB SSDs
Random read/write of 8K blocks, 70/30 read/write ratio
8. Performance numbers (using fio to provide the test load)
SSD-only pool with 12 machines, each with two 12-core CPUs, 128 GB RAM, and two 480 GB SSDs
Sequential write of 128K blocks
9. Performance numbers (using fio to provide the test load)
SSD-only pool with 12 machines, each with two 12-core CPUs, 128 GB RAM, and two 480 GB SSDs
Random read of 8K blocks
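For reference, fio invocations that approximate the three workloads above. The device path, queue depth, job count, and runtime are illustrative; the deck does not state the exact job parameters:

  # Slide 7: 8K random read/write, 70/30 mix
  fio --name=randrw-8k --filename=/dev/rbd0 --direct=1 --ioengine=libaio \
      --rw=randrw --rwmixread=70 --bs=8k --iodepth=32 --numjobs=4 \
      --runtime=300 --time_based --group_reporting
  # Slide 8: 128K sequential write
  fio --name=seqwrite-128k --filename=/dev/rbd0 --direct=1 --ioengine=libaio \
      --rw=write --bs=128k --iodepth=32 --runtime=300 --time_based
  # Slide 9: 8K random read
  fio --name=randread-8k --filename=/dev/rbd0 --direct=1 --ioengine=libaio \
      --rw=randread --bs=8k --iodepth=32 --runtime=300 --time_based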
10. Experiments
Pre-work: hook up metrics, logs, and alerts to Salesforce infrastructure
fio performance on a client-side mounted block device with XFS
Testing lots and lots of failure scenarios (think Chaos Monkey; see the sketch after this list)
More focus on slow devices (network, host, disk)
CRUSH map settings for heterogeneous environments (will build a tool to generate these automatically)
Set up a CI/CD pipeline
Running Ceph in a Dockerized environment with Kubernetes
Ability to patch a deployed cluster (OS, Docker, Ceph)
Going over the code, line by line
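A sketch of the failure-injection style described above: hard failures through the service manager, slow devices through traffic shaping. The OSD id and NIC name are illustrative:

  # Hard failure: stop one OSD, watch the cluster degrade and recover.
  systemctl stop ceph-osd@12
  ceph -s                       # degraded PG counts, recovery progress
  ceph osd tree | grep down
  systemctl start ceph-osd@12
  # Slow device: add latency on the storage NIC instead of killing anything.
  tc qdisc add dev eth0 root netem delay 200ms
  ceph osd perf                 # per-OSD commit/apply latency exposes the slow path
  tc qdisc del dev eth0 root netem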
11. Future
Read from any replica (inconsistent reads should help with tail latency)
Can reads search the journal? (should also help with tail latency)
Need pluggability in RGW: either extend the RGWHandler class, or use the pre_exec() hook in the RGWOp class (rgw_op.cc)
12. Challenges of Storage Services at Salesforce
Scale brings problems all its own: more hardware to fail or act funny, regular capacity adds, hardware changes
Multiple dimensions of multi-tenancy
External Customers (isolation, auth/encryption, security, perf, availability, durability, etc.)
A service supporting many, many use cases and internal platforms
Running a large number of clusters across a large number of data centers