1 million writes per sec. on 60 nodes with Cassandra and EBS
© 2015. All Rights Reserved.
1 Million Writes Per Second w/ 60 nodes
EBS and C*
Jim Plush - Sr Director of Engineering, CrowdStrike
Dennis Opacki - Sr Cloud Systems Architect
An Introduction
to CrowdStrike
We Are a Cybersecurity Technology Company
We Detect, Prevent and Respond to All Attack Types in Real Time,
Protecting Organizations from Catastrophic Breaches
We Provide Next-Generation Endpoint Protection, Threat Intelligence & Pre- & Post-IR Services
NEXT-GEN ENDPOINT
INCIDENT RESPONSE
THREAT INTEL
http://www.crowdstrike.com/introduction-to-crowdstrike-falcon-host/
CrowdStrike Scale
•  Cloud based endpoint protection
•  Single customer can generate > 2TB daily
•  500K+ Events Per Second
•  Multi PetaBytes of managed data
Truisms???
•  HTTPS is too slow to run everywhere
•  All you need is anti-virus
•  Never run Cassandra on EBS
What is EBS?
[Diagram: an EC2 instance with two EBS data volumes attached, mounted at /mnt/foo and /mnt/bar]
•  Network Mounted Hard Drive
•  Ability to snapshot data
•  Data encryption at rest & in flight
Existing EBS Assumptions
•  Jittery I/O aka: Noisy neighbors
•  Single Point of Failure in a Region
•  Cost is too damn high
•  Bad Volumes (dd and destroy)
A recent project: initial requirements
•  1PB of incoming event data from millions of devices
•  Modeled as a graph
•  1 million writes per second (burst)
•  Age data out after x days
•  95% write 5% read
We Tried
•  Cassandra + Titan
•  Sharding?
•  Neo4J
•  PostgreSQL, MySQL, SQLite
•  LevelDB/RocksDB
We have to make this work
•  Cassandra had the properties we needed
•  Time for a new approach?
http://techblog.netflix.com/2014/07/revisiting-1-million-writes-per-second.html
Number of Machines for 1PB
[Bar chart: number of machines needed for 1PB (axis 0 to 2,250), I2.xlarge vs. c4.2XL with EBS]
Yearly Cost for 1PB Cluster
[Bar chart: yearly cost in millions of $ (axis 0 to 16) for I2.xlarge on demand, I2.xlarge reserved, c4.2xl on demand and c4.2xl reserved; annotation: "With EBS"]
Initial Launch
Date Tiered Compaction
…more details by Jeff Jirsa, CrowdStrike
Cassandra Summit 2015 - DTCS
Initial Launch
•  Cassandra 2.0.12 (DSE)
•  m3.2xlarge 8 core
•  Single 4TB EBS GP2 ~10,000 IOPS
•  Default tunings
Performance was terrible
•  12 node cluster
•  ~60K writes per second RF2
•  ~10K writes per 8 core box
•  We went to the experts
Cassandra Summit 2014
Family Search asked the
same question:
Where’s the bottleneck?
https://www.youtube.com/watch?v=Qfzg7gcSK-g
IOPS Available
[Bar chart: IOPS available (axis 0 to 50,000), I2.xlarge vs. c4.2xlarge]
1.3K IOPS?
IOPS
I see you there,
but I can’t reach you!
The magic gates opened…
We hit 1 million writes per second, RF3, on 60 nodes
Testing Setup
Testing Methodology
•  Each test run
•  clean C* instances
•  old test keyspaces dropped
•  13+TBs of data loaded during read testing
•  20 C4.4XL stress writers, each with its own sequence of 1 billion keys
Cluster Topology
•  Stress nodes: 10 instances in AZ 1A, 10 instances in AZ 1B
•  C* nodes: 20 in AZ 1A, 20 in AZ 1B, 20 in AZ 1C
•  OpsCenter
•  EBS
Cassandra Stress 2.1.x
bin/cassandra-stress user duration=100000m cl=ONE profile=/home/ubuntu/summit_stress.yaml ops(insert=1) no-warmup -pop seq=1..1000000000 -mode native cql3 -node 10.10.10.XX -rate threads=1000 -errors ignore
PCSTAT - Al Tobey
http://www.datastax.com/dev/blog/compaction-improvements-in-cassandra-21
https://github.com/tobert/pcstat
Netflix Test - What is C* capable of?
Netflix Test
1+ Million Writes Per Second at RF:3 = 3+ Million Local Writes Per Second
NICE!
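The link between client writes and local (replica) writes quoted on these slides can be checked with a little arithmetic. The sketch below is only an illustration: the helper name `local_writes_per_node` is hypothetical, and it assumes writes are spread evenly across nodes, which the slides do not state.

```python
def local_writes_per_node(client_writes_per_sec, replication_factor, nodes):
    """Replica (local) writes each node absorbs, assuming an even spread."""
    # Every client write is stored replication_factor times across the cluster.
    total_local = client_writes_per_sec * replication_factor
    return total_local / nodes

# Initial launch: ~60K client writes/s at RF2 on 12 nodes
print(local_writes_per_node(60_000, 2, 12))     # 10000.0
# Tuned cluster: 1M client writes/s at RF3 on 60 nodes
print(local_writes_per_node(1_000_000, 3, 60))  # 50000.0
```

Under that assumption the initial cluster lines up with the "~10K writes per 8 core box" figure, and the tuned 60-node cluster absorbs roughly five times as many local writes per node.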
Netflix Test
No Dropped Mutations, system healthy at 1.1M after 50 mins
Netflix Test
I/O util is not pegged; commit disk = steady
Netflix Test
Low IO Wait
Netflix Test
95th Latency = Reasonable
Netflix Test - Read Fail
compression={'chunk_length_kb': '64', 'sstable_compression': 'LZ4Compressor'}
https://issues.apache.org/jira/browse/CASSANDRA-10249
https://issues.apache.org/jira/browse/CASSANDRA-8894
Data Drive Pegged :(
Reading Data
•  24 hour read test
•  over 10TBs of data in the CF
•  sustained > 350K reads per second over 24 hours
•  1M reads per second peak
•  CL ONE
•  12 C4.4XL stress boxes
Reading Data
Not Pegged :)
Reading Data
7.2ms 95th latency
Netflix Test resource usage
•  180 fewer cores (45 fewer i2.xlarge instances)
•  24 hour test (sans data transfer cost)
–  Netflix cluster/stress
•  Cost: ~$6300
•  285 i2.xlarge $0.85 per hour
–  CrowdStrike cluster/stress with EBS cost
•  Cost: ~$2600
•  60 C4.4XL $0.88 per hour
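The core and cost figures above can be roughly reproduced from the instance counts and hourly rates on the slide. This is a sketch under stated assumptions: the 4 vCPUs per i2.xlarge is not given in the deck (16 cores per C4.4XL is), and the helper `instance_cost` is hypothetical. It covers only the Cassandra instances, so it comes out below the slide totals, which are labeled "cluster/stress" and, for CrowdStrike, "with EBS cost".

```python
HOURS = 24  # length of the test run

def instance_cost(count, hourly_rate, hours=HOURS):
    """On-demand cost of `count` instances running for `hours` hours."""
    return round(count * hourly_rate * hours, 2)

netflix_nodes, crowdstrike_nodes = 285, 60
# 16 cores per C4.4XL is from the slides; 4 per i2.xlarge is an assumption.
I2_XLARGE_CORES, C4_4XL_CORES = 4, 16

fewer_cores = netflix_nodes * I2_XLARGE_CORES - crowdstrike_nodes * C4_4XL_CORES
print(fewer_cores, fewer_cores // I2_XLARGE_CORES)  # 180 45

# Cassandra instances only (no stress boxes, no EBS volumes)
print(instance_cost(netflix_nodes, 0.85))      # about 5814
print(instance_cost(crowdstrike_nodes, 0.88))  # about 1267
```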
Read Notes with EBS
•  Our test was a single 10K IOPS volume
•  More/Bigger Reads?
–  PIOPS gives you as much throughput as you need
–  RAID0 multiple EBS volumes
[Diagram: /mnt/data striped across EBS Vol1 and EBS Vol2]
What Unlocked Performance
Major Tweaks
•  Ubuntu HVM types
•  Enhanced Networking
•  now faster than PV (paravirtual)
•  Ubuntu distro tuned for cloud workloads
•  XFS Filesystem
Major Tweaks
•  Cassandra 2.1
•  Java 8
•  G1 Garbage Collector - cassandra-env
https://issues.apache.org/jira/browse/CASSANDRA-7486
Major Tweaks
•  C4.4XL 16 core, EBS Optimized
•  4TB, 10,000 IOPS EBS GP2 Encrypted Data Drive
–  160MB/s throughput
•  1TB 3000 IOPS EBS GP2 Encrypted Commit Log Drive
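The IOPS figures for the two drives are consistent with how GP2 sizes its baseline from volume size. The sketch below assumes the GP2 rules as they stood in 2015: 3 IOPS per GiB, a floor of 100, and the 10,000 IOPS per-volume cap mentioned elsewhere in this deck. The function name is hypothetical, and the sizes are rounded to 4,000 and 1,000 GiB for illustration.

```python
GP2_IOPS_PER_GIB = 3    # assumed GP2 baseline rate
GP2_MIN_IOPS = 100      # assumed GP2 floor for small volumes
GP2_MAX_IOPS = 10_000   # per-volume GP2 cap at the time of the talk

def gp2_baseline_iops(size_gib):
    """Baseline IOPS for a GP2 volume of the given size in GiB."""
    scaled = size_gib * GP2_IOPS_PER_GIB
    return max(GP2_MIN_IOPS, min(scaled, GP2_MAX_IOPS))

print(gp2_baseline_iops(4000))  # data drive: 10000 (capped)
print(gp2_baseline_iops(1000))  # commit log drive: 3000
```

Under these rules, growing a volume past roughly 3.3TB buys capacity but no additional baseline IOPS, which is where striping multiple volumes or Provisioned IOPS comes in.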
Major Tweaks
•  cassandra-env.sh
•  MAX_HEAP_SIZE=8G
•  JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
•  Lots of other minor tweaks
cassandra-env.sh
Put PID in batch mode
Mask CPU0 from the process to reduce context switching
Magic From Al Tobey
YAML Settings
•  cassandra.yaml (based on 16 core)
•  concurrent_reads: 32
•  concurrent_writes: 64
•  memtable_flush_writers: 8
•  trickle_fsync: true
•  trickle_fsync_interval_in_kb: 1000
•  native_transport_max_threads: 256
•  concurrent_compactors: 4
cassandra.yaml
We found a good portion of the CPU load was
being used for internode compression which
reduced write throughput
internode_compression: none
Lessons Learned
•  EBS was never the bottleneck during testing, GP2 is legit
•  If you’re doing batching, write to the same rowkey in the batch
•  Built-in types like list and map come at a performance penalty
•  30% hit on our writes using Map type
•  DTCS is very young (see Jeff Jirsa’s talk)
•  2.1 Stress Tool is tricky but great for modeling workloads
•  How will compression affect your read path?
Test your own!
https://github.com/CrowdStrike/cassandra-tools
It’s just python
•  launch 20 nodes in us-east-1
•  python launch.py launch --nodes=20 --config=c4-ebs-hvm --az=us-east-1a
•  bootstrap the new nodes with C*, RAID/Format disks, etc…
•  fab -u ubuntu bootstrapcass21:config=c4-highperf
•  run arbitrary commands
•  fab -u ubuntu cmd:config=c4-highperf,cmd="sudo rm -rf /mnt/cassandra/data/summit_stress"
Run custom stress profiles… multi-node support
export NODENUM=1
ubuntu@ip-10-10-10.XX:~$ python runstress.py --profile=stress10 --seednode=10.10.10.XX --threads=50

Going to run: /home/ubuntu/apache-cassandra-2.1.5/tools/bin/cassandra-stress user duration=100000m cl=ONE profile=/home/ubuntu/summit_stress.yaml ops(insert=1,simple=9) no-warmup -pop seq=1..1000000000 -mode native cql3 -node 10.10.10.XX -rate threads=50 -errors ignore

export NODENUM=2
ubuntu@ip-10-10-10.XX:~$ python runstress.py --profile=stress10 --seednode=10.10.10.XX --threads=50

Going to run: /home/ubuntu/apache-cassandra-2.1.5/tools/bin/cassandra-stress user duration=100000m cl=ONE profile=/home/ubuntu/summit_stress.yaml ops(insert=1,simple=9) no-warmup -pop seq=1000000001..2000000000 -mode native cql3 -node 10.10.10.XX -rate threads=50 -errors ignore
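The two runs differ only in their `-pop seq=` range, which appears to follow from the node number so that each stress writer owns a distinct 1-billion-key range. A minimal sketch of that mapping follows. The function `pop_seq` is hypothetical and is not taken from runstress.py, whose actual logic may differ.

```python
KEYS_PER_NODE = 1_000_000_000  # each stress writer gets its own 1B-key range

def pop_seq(node_num, keys_per_node=KEYS_PER_NODE):
    """Return a cassandra-stress seq range for a 1-based node number."""
    start = (node_num - 1) * keys_per_node + 1
    end = node_num * keys_per_node
    return f"seq={start}..{end}"

for node_num in (1, 2):
    print(node_num, pop_seq(node_num))
# 1 seq=1..1000000000
# 2 seq=1000000001..2000000000
```

Keeping the ranges disjoint means stress nodes do not overwrite each other's partitions, so the data volume grows with every writer added.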
Where are we today?
•  ~3 months on our EBS based cluster
•  Hundreds of TBs of graph data and growing in C*
•  Billions of vertices/edges
•  Changing perceptions?
Special thanks to
•  Leif Jackson
•  Marcus King
•  Alan Hannan
•  Jeff Jirsa
•  Al Tobey
•  Nick Panahi
•  J.B. Langston
•  Marcus Eriksson
•  Iian Finlayson
•  Dani Traphagen
EBS heading into 2016
4TB (10k IOPS) GP2
IO hit? Not enough to faze C*
So why the hate for EBS?
Following the Crowd – Trust Issues
•  Used instance-store image and ephemeral drives
•  Painful to stop/start instances, resize
•  Couldn’t avoid scheduled maintenance (i.e. Reboot-a-palooza)
•  Encryption required shenanigans
Guess What?
•  We still had failures
•  Now we get to rebuild from scratch
What do you mean my volume is “stuck”?
•  April 2011 – Netflix, Reddit and Quora
•  October 2012 – Reddit, Imgur, Heroku
•  August 2013 – Vine, AirBNB
EBS’s Troubled Childhood
http://techblog.netflix.com/2011/04/lessons-netflix-learned-from-aws-outage.html
•  Spread services across multiple regions
•  Test failure scenarios regularly (Chaos Monkey)
•  Make Cassandra databases more resilient by avoiding EBS
Kiss of Death
Amazon moves quickly and quietly:
•  March 2011 – New EBS GM
•  July 2012 – Provisioned IOPS
•  May 2014 – Native Encryption
•  Jun 2014 – GP2 (game changer)
•  Mar 2015 – 16TB / 10K GP2 / 20K PIOPS
Redemption
•  Prioritized EBS availability and consistency beyond features and functionality
•  Compartmentalized the control plane - broke cross-AZ dependencies for running volumes
•  Simplified workflows to favor sustained operation
•  Tested and simulated via TLA+/PlusCal - better understood corner cases
•  Dedicated a large fraction of engineering resources to reliability and performance
Redemption
Reliability
EBS Team targets 99.999% availability
exceeding expectations
CrowdStrike Today
In the past 12 months, zero EBS-related failures
•  Thousands of GP2 data volumes (~2PB data)
•  Transitioning all systems to EBS root drives
•  Moved all data stores to EBS (C*, Kafka, Elasticsearch, Postgres, etc)
Staying Safe - Architecture
•  Select a region with >2 AZs (e.g. us-east-1 or us-west-2)
•  Use EBS GP2 or PIOPS storage
•  Separate volumes for data and commit logs
Staying Safe - Ops
•  Use EBS volume monitoring
•  Pre-warm EBS volumes?
•  Schedule snapshots for consistent backups
Most Importantly
•  Challenge assumptions
•  Stay current on AWS blog
•  Talk with your peers
Thank you
@jimplush
@opacki

Contenu connexe

Tendances

Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorFlink Forward
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internalsKostas Tzoumas
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm Chandler Huang
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using KafkaKnoldus Inc.
 
How Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for PerformanceHow Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for PerformanceBrendan Gregg
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkFlink Forward
 
Kafka Streams State Stores Being Persistent
Kafka Streams State Stores Being PersistentKafka Streams State Stores Being Persistent
Kafka Streams State Stores Being Persistentconfluent
 
Storage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesStorage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesDataWorks Summit
 
Using Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitUsing Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitFlink Forward
 
A visual introduction to Apache Kafka
A visual introduction to Apache KafkaA visual introduction to Apache Kafka
A visual introduction to Apache KafkaPaul Brebner
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache SparkSamy Dindane
 
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안SANG WON PARK
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guideRyan Blue
 
Managing your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache AmbariManaging your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache AmbariDataWorks Summit
 
Hive 3 - a new horizon
Hive 3 - a new horizonHive 3 - a new horizon
Hive 3 - a new horizonThejas Nair
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013mumrah
 
Building large scale transactional data lake using apache hudi
Building large scale transactional data lake using apache hudiBuilding large scale transactional data lake using apache hudi
Building large scale transactional data lake using apache hudiBill Liu
 
Understanding Memory Management In Spark For Fun And Profit
Understanding Memory Management In Spark For Fun And ProfitUnderstanding Memory Management In Spark For Fun And Profit
Understanding Memory Management In Spark For Fun And ProfitSpark Summit
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Flink Forward
 
Using the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentUsing the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentFlink Forward
 

Tendances (20)

Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes Operator
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internals
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
 
How Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for PerformanceHow Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for Performance
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
 
Kafka Streams State Stores Being Persistent
Kafka Streams State Stores Being PersistentKafka Streams State Stores Being Persistent
Kafka Streams State Stores Being Persistent
 
Storage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesStorage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on Kubernetes
 
Using Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitUsing Queryable State for Fun and Profit
Using Queryable State for Fun and Profit
 
A visual introduction to Apache Kafka
A visual introduction to Apache KafkaA visual introduction to Apache Kafka
A visual introduction to Apache Kafka
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Managing your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache AmbariManaging your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache Ambari
 
Hive 3 - a new horizon
Hive 3 - a new horizonHive 3 - a new horizon
Hive 3 - a new horizon
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
 
Building large scale transactional data lake using apache hudi
Building large scale transactional data lake using apache hudiBuilding large scale transactional data lake using apache hudi
Building large scale transactional data lake using apache hudi
 
Understanding Memory Management In Spark For Fun And Profit
Understanding Memory Management In Spark For Fun And ProfitUnderstanding Memory Management In Spark For Fun And Profit
Understanding Memory Management In Spark For Fun And Profit
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
 
Using the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentUsing the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production Deployment
 

En vedette

(BDT323) Amazon EBS & Cassandra: 1 Million Writes Per Second
(BDT323) Amazon EBS & Cassandra: 1 Million Writes Per Second(BDT323) Amazon EBS & Cassandra: 1 Million Writes Per Second
(BDT323) Amazon EBS & Cassandra: 1 Million Writes Per SecondAmazon Web Services
 
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...DataStax
 
Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Dave Gardner
 
Deep Dive: Maximizing EC2 and EBS Performance
Deep Dive: Maximizing EC2 and EBS PerformanceDeep Dive: Maximizing EC2 and EBS Performance
Deep Dive: Maximizing EC2 and EBS PerformanceAmazon Web Services
 
How to run an Enterprise PHP Shop
How to run an Enterprise PHP ShopHow to run an Enterprise PHP Shop
How to run an Enterprise PHP ShopJim Plush
 
Event-Stream Processing with Kafka
Event-Stream Processing with KafkaEvent-Stream Processing with Kafka
Event-Stream Processing with KafkaTim Lossen
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compactionMIJIN AN
 
PostgreSQL and CockroachDB SQL
PostgreSQL and CockroachDB SQLPostgreSQL and CockroachDB SQL
PostgreSQL and CockroachDB SQLCockroachDB
 
Maximizing EC2 and Elastic Block Store Disk Performance (STG302) | AWS re:Inv...
Maximizing EC2 and Elastic Block Store Disk Performance (STG302) | AWS re:Inv...Maximizing EC2 and Elastic Block Store Disk Performance (STG302) | AWS re:Inv...
Maximizing EC2 and Elastic Block Store Disk Performance (STG302) | AWS re:Inv...Amazon Web Services
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detailMIJIN AN
 
AWS - an introduction to bursting (GP2 - T2)
AWS - an introduction to bursting (GP2 - T2)AWS - an introduction to bursting (GP2 - T2)
AWS - an introduction to bursting (GP2 - T2)Rasmus Ekman
 
AWS re:Invent 2016: Case Study: Librato's Experience Running Cassandra Using ...
AWS re:Invent 2016: Case Study: Librato's Experience Running Cassandra Using ...AWS re:Invent 2016: Case Study: Librato's Experience Running Cassandra Using ...
AWS re:Invent 2016: Case Study: Librato's Experience Running Cassandra Using ...Amazon Web Services
 
Maximizing EC2 and Elastic Block Store Disk Performance
Maximizing EC2 and Elastic Block Store Disk PerformanceMaximizing EC2 and Elastic Block Store Disk Performance
Maximizing EC2 and Elastic Block Store Disk PerformanceAmazon Web Services
 
How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)DataStax Academy
 
Cassandra Tuning - Above and Beyond (Matija Gobec, SmartCat) | Cassandra Summ...
Cassandra Tuning - Above and Beyond (Matija Gobec, SmartCat) | Cassandra Summ...Cassandra Tuning - Above and Beyond (Matija Gobec, SmartCat) | Cassandra Summ...
Cassandra Tuning - Above and Beyond (Matija Gobec, SmartCat) | Cassandra Summ...DataStax
 
Load Testing Cassandra Applications (Ben Slater, Instaclustr) | C* Summit 2016
Load Testing Cassandra Applications (Ben Slater, Instaclustr) | C* Summit 2016Load Testing Cassandra Applications (Ben Slater, Instaclustr) | C* Summit 2016
Load Testing Cassandra Applications (Ben Slater, Instaclustr) | C* Summit 2016DataStax
 
Cassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWSCassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWSDataStax Academy
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookThe Hive
 

En vedette (20)

(BDT323) Amazon EBS & Cassandra: 1 Million Writes Per Second
(BDT323) Amazon EBS & Cassandra: 1 Million Writes Per Second(BDT323) Amazon EBS & Cassandra: 1 Million Writes Per Second
(BDT323) Amazon EBS & Cassandra: 1 Million Writes Per Second
 
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
 
Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2
 
Running Cassandra in AWS
Running Cassandra in AWSRunning Cassandra in AWS
Running Cassandra in AWS
 
Deep Dive: Maximizing EC2 and EBS Performance
Deep Dive: Maximizing EC2 and EBS PerformanceDeep Dive: Maximizing EC2 and EBS Performance
Deep Dive: Maximizing EC2 and EBS Performance
 
How to run an Enterprise PHP Shop
How to run an Enterprise PHP ShopHow to run an Enterprise PHP Shop
How to run an Enterprise PHP Shop
 
Event-Stream Processing with Kafka
Event-Stream Processing with KafkaEvent-Stream Processing with Kafka
Event-Stream Processing with Kafka
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compaction
 
PostgreSQL and CockroachDB SQL
PostgreSQL and CockroachDB SQLPostgreSQL and CockroachDB SQL
PostgreSQL and CockroachDB SQL
 
Maximizing EC2 and Elastic Block Store Disk Performance (STG302) | AWS re:Inv...
Maximizing EC2 and Elastic Block Store Disk Performance (STG302) | AWS re:Inv...Maximizing EC2 and Elastic Block Store Disk Performance (STG302) | AWS re:Inv...
Maximizing EC2 and Elastic Block Store Disk Performance (STG302) | AWS re:Inv...
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detail
 
AWS - an introduction to bursting (GP2 - T2)
AWS - an introduction to bursting (GP2 - T2)AWS - an introduction to bursting (GP2 - T2)
AWS - an introduction to bursting (GP2 - T2)
 
AWS re:Invent 2016: Case Study: Librato's Experience Running Cassandra Using ...
AWS re:Invent 2016: Case Study: Librato's Experience Running Cassandra Using ...AWS re:Invent 2016: Case Study: Librato's Experience Running Cassandra Using ...
AWS re:Invent 2016: Case Study: Librato's Experience Running Cassandra Using ...
 
Maximizing EC2 and Elastic Block Store Disk Performance
Maximizing EC2 and Elastic Block Store Disk PerformanceMaximizing EC2 and Elastic Block Store Disk Performance
Maximizing EC2 and Elastic Block Store Disk Performance
 
How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)
 
Cassandra Tuning - Above and Beyond (Matija Gobec, SmartCat) | Cassandra Summ...
Cassandra Tuning - Above and Beyond (Matija Gobec, SmartCat) | Cassandra Summ...Cassandra Tuning - Above and Beyond (Matija Gobec, SmartCat) | Cassandra Summ...
Cassandra Tuning - Above and Beyond (Matija Gobec, SmartCat) | Cassandra Summ...
 
Load Testing Cassandra Applications (Ben Slater, Instaclustr) | C* Summit 2016
Load Testing Cassandra Applications (Ben Slater, Instaclustr) | C* Summit 2016Load Testing Cassandra Applications (Ben Slater, Instaclustr) | C* Summit 2016
Load Testing Cassandra Applications (Ben Slater, Instaclustr) | C* Summit 2016
 
Cassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWSCassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWS
 
MyRocks Deep Dive
MyRocks Deep DiveMyRocks Deep Dive
MyRocks Deep Dive
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
 

Similaire à 1 Million Writes per second on 60 nodes with Cassandra and EBS

Approaching hyperconvergedopenstack
Approaching hyperconvergedopenstackApproaching hyperconvergedopenstack
Approaching hyperconvergedopenstackIkuo Kumagai
 
(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014
(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014
(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014Amazon Web Services
 
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013Amazon Web Services
 
Tuning the Kernel for Varnish Cache
Tuning the Kernel for Varnish CacheTuning the Kernel for Varnish Cache
Tuning the Kernel for Varnish CachePer Buer
 
CPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performanceCPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performanceCoburn Watson
 

1 Million Writes per second on 60 nodes with Cassandra and EBS

  • 1. 1 million writes per sec. on 60 nodes with Cassandra and EBS
  • 2. 1 Million Writes Per Second w/60 nodes. EBS and C*. Jim Plush - Sr Director of Engineering, CrowdStrike. Dennis Opacki - Sr Cloud Systems Architect. © 2015. All Rights Reserved.
  • 3. An Introduction to CrowdStrike. We Are a Cybersecurity Technology Company. We Detect, Prevent And Respond To All Attack Types In Real Time, Protecting Organizations From Catastrophic Breaches. We Provide Next Generation Endpoint Protection, Threat Intelligence & Pre- and Post-IR Services. Next-Gen Endpoint / Incident Response / Threat Intel. http://www.crowdstrike.com/introduction-to-crowdstrike-falcon-host/
  • 4. CrowdStrike Scale • Cloud-based endpoint protection • Single customer can generate > 2TB daily • 500K+ Events Per Second • Multiple petabytes of managed data
  • 5. Truisms??? • HTTPS is too slow to run everywhere • All you need is anti-virus • Never run Cassandra on EBS
  • 6. What is EBS? • Network Mounted Hard Drive • Ability to snapshot data • Data encryption at rest & in flight [diagram: EC2 Instance with two EBS Data Volumes mounted at /mnt/foo and /mnt/bar]
  • 7. Existing EBS Assumptions • Jittery I/O, aka noisy neighbors • Single Point of Failure in a Region • Cost is too damn high • Bad Volumes (dd and destroy)
  • 8. A recent project: initial requirements • 1PB of incoming event data from millions of devices • Modeled as a graph • 1 million writes per second (burst) • Age data out after x days • 95% write, 5% read
  • 9. We Tried • Cassandra + Titan • Sharding? • Neo4J • PostgreSQL, MySQL, SQLite • LevelDB/RocksDB
  • 10. We have to make this work • Cassandra had the properties we needed • Time for a new approach? http://techblog.netflix.com/2014/07/revisiting-1-million-writes-per-second.html
  • 11. Number of Machines for 1PB [bar chart: I2.xlarge vs. c4.2XL with EBS; y-axis 0 to 2250 machines]
  • 12. Yearly Cost for 1PB Cluster [bar chart: I2.xlarge on demand, I2.xlarge reserved, c4.2xl on demand, c4.2xl reserved; y-axis 0 to 16, millions of $; annotation: With EBS]
  • 13. Initial Launch: Date Tiered Compaction …more details by Jeff Jirsa, CrowdStrike, Cassandra Summit 2015 - DTCS
  • 14. Initial Launch • Cassandra 2.0.12 (DSE) • m3.2xlarge 8 core • Single 4TB EBS GP2 ~10,000 IOPS • Default tunings
  • 15. Performance was terrible • 12 node cluster • ~60K writes per second RF2 • ~10K writes per 8 core box • We went to the experts
  • 16. Cassandra Summit 2014: Family Search asked the same question: Where's the bottleneck? https://www.youtube.com/watch?v=Qfzg7gcSK-g
  • 17. IOPS Available [bar chart: I2.xlarge vs. c4.2xlarge; y-axis 0 to 50000 IOPS]
  • 18. 1.3K IOPS?
  • 19. IOPS, I see you there, but I can't reach you!
  • 20.
  • 21. The magic gates opened… We hit 1 million writes per second RF3 on 60 nodes
  • 22. Testing Setup
  • 23. Testing Methodology • Each test run: clean C* instances, old test keyspaces dropped • 13+ TB of data loaded during read testing • 20 C4.4XL stress writers, each with its own 1 billion-entry sequence
  • 24. Cluster Topology • Stress Nodes: 10 Instances, AZ: 1A • Stress Nodes: 10 Instances, AZ: 1B • 20 C* Nodes, AZ: 1A • 20 C* Nodes, AZ: 1B • 20 C* Nodes, AZ: 1C • OpsCenter
  • 25. EBS
  • 26. Cassandra Stress 2.1.x: bin/cassandra-stress user duration=100000m cl=ONE profile=/home/ubuntu/summit_stress.yaml ops(insert=1) no-warmup -pop seq=1..1000000000 -mode native cql3 -node 10.10.10.XX -rate threads=1000 -errors ignore
  • 27. PCSTAT - Al Tobey http://www.datastax.com/dev/blog/compaction-improvements-in-cassandra-21 https://github.com/tobert/pcstat
  • 28. Netflix Test - What is C* capable of?
  • 29. Netflix Test • 1+ Million Writes Per second RF:3 • 3+ Million Local Writes Per second • NICE!
  • 30. Netflix Test
  • 31. Netflix Test • No Dropped Mutations, system healthy at 1.1M after 50 mins
  • 32. Netflix Test • I/O Util is not pegged • Commit Disk = Steady
  • 33. Netflix Test • Low IO Wait
  • 34. Netflix Test • 95th Latency = Reasonable
  • 35. Netflix Test - Read Fail • compression={'chunk_length_kb': '64', 'sstable_compression': 'LZ4Compressor'} • https://issues.apache.org/jira/browse/CASSANDRA-10249 • https://issues.apache.org/jira/browse/CASSANDRA-8894 • Data Drive Pegged :(
  • 36. Reading Data • 24 hour read test • over 10TB of data in the CF • sustained > 350K reads per second over 24 hours • 1M reads per sec peak • CL ONE • 12 C4.4XL stress boxes
  • 37. Reading Data
  • 38. Reading Data
  • 39. Reading Data • Not Pegged :)
  • 40. Reading Data • 7.2ms 95th latency
  • 41. Netflix Test resource usage • 180 fewer cores (45 fewer i2.xlarge instances) • 24 hour test (sans data transfer cost) • Netflix cluster/stress: Cost ~$6300, 285 i2.xlarge at $0.85 per hour • CrowdStrike cluster/stress with EBS cost: Cost ~$2600, 60 C4.4XL at $0.88 per hour
  • 42. Read Notes with EBS • Our test was a single 10K IOPS volume • More/Bigger Reads? PIOPS gives you as much throughput as you need; RAID0 multiple EBS volumes [diagram: EBS Vol1 and EBS Vol2 combined at /mnt/data]
  • 43. What Unlocked Performance
  • 44. Major Tweaks • Ubuntu HVM types • Enhanced Networking • now faster than PVM • Ubuntu distro tuned for cloud workloads • XFS Filesystem
  • 45. Major Tweaks • Cassandra 2.1 • Java 8 • G1 Garbage Collector - cassandra-env https://issues.apache.org/jira/browse/CASSANDRA-7486
  • 46. Major Tweaks • C4.4XL 16 core, EBS Optimized • 4TB, 10,000 IOPS EBS GP2 Encrypted Data Drive (160MB/s throughput) • 1TB, 3,000 IOPS EBS GP2 Encrypted Commit Log Drive
  • 47. Major Tweaks • cassandra-env.sh • MAX_HEAP_SIZE=8G • JVM_OPTS="$JVM_OPTS -XX:+UseG1GC" • Lots of other minor tweaks
  • 48. cassandra-env.sh • Put PID in batch mode • Mask CPU0 from the process to reduce context switching • Magic from Al Tobey
  • 49. YAML Settings • cassandra.yaml (based on 16 core) • concurrent_reads: 32 • concurrent_writes: 64 • memtable_flush_writers: 8 • trickle_fsync: true • trickle_fsync_interval_in_kb: 1000 • native_transport_max_threads: 256 • concurrent_compactors: 4
  • 50. cassandra.yaml • We found a good portion of the CPU load was being used for internode compression, which reduced write throughput • internode_compression: none
  • 51. Lessons Learned • EBS was never the bottleneck during testing, GP2 is legit • If you're doing batching, write to the same rowkey in the batch • Built-in types like list and map come at a performance penalty • 30% hit on our writes using Map type • DTCS is very young (see Jeff Jirsa's talk) • 2.1 Stress Tool is tricky but great for modeling workloads • How will compression affect your read path?
  • 52. Test your own! https://github.com/CrowdStrike/cassandra-tools
  • 53. It's just python • launch 20 nodes in us-east-1 • python launch.py launch --nodes=20 --config=c4-ebs-hvm --az=us-east-1a • bootstrap the new nodes with C*, RAID/Format disks, etc… • fab -u ubuntu bootstrapcass21:config=c4-highperf • run arbitrary commands • fab -u ubuntu cmd:config=c4-highperf,cmd="sudo rm -rf /mnt/cassandra/data/summit_stress"
  • 54. Run custom stress profiles… multi-node support • Node 1: export NODENUM=1; ubuntu@ip-10-10-10.XX:~$ python runstress.py --profile=stress10 --seednode=10.10.10.XX --threads=50; Going to run: /home/ubuntu/apache-cassandra-2.1.5/tools/bin/cassandra-stress user duration=100000m cl=ONE profile=/home/ubuntu/summit_stress.yaml ops(insert=1,simple=9) no-warmup -pop seq=1..1000000000 -mode native cql3 -node 10.10.10.XX -rate threads=50 -errors ignore • Node 2: export NODENUM=2; ubuntu@ip-10-10-10.XX:~$ python runstress.py --profile=stress10 --seednode=10.10.10.XX --threads=50; Going to run: /home/ubuntu/apache-cassandra-2.1.5/tools/bin/cassandra-stress user duration=100000m cl=ONE profile=/home/ubuntu/summit_stress.yaml ops(insert=1,simple=9) no-warmup -pop seq=1000000001..2000000000 -mode native cql3 -node 10.10.10.XX -rate threads=50 -errors ignore
  • 55. Where are we today? •  ~3 months on our EBS based cluster •  Hundreds of TBs of graph data and growing in C* •  Billions of vertices/edges •  Changing perceptions?
  • 56. Special thanks to • Leif Jackson • Marcus King • Alan Hannan • Jeff Jirsa • Al Tobey • Nick Panahi • J.B. Langston • Marcus Eriksson • Iian Finlayson • Dani Traphagen
  • 57. EBS heading into 2016
  • 58. 4TB (10k IOPS) GP2 • IO Hit? Not enough to faze C*
  • 59. So why the hate for EBS?
  • 60. Following the Crowd - Trust Issues • Used instance-store image and ephemeral drives • Painful to stop/start instances, resize • Couldn't avoid scheduled maintenance (i.e. Reboot-a-palooza) • Encryption required shenanigans
  • 61. Guess What? • We still had failures • Now we get to rebuild from scratch
  • 62. EBS's Troubled Childhood: What do you mean my volume is "stuck"? • April 2011 - Netflix, Reddit and Quora • October 2012 - Reddit, Imgur, Heroku • August 2013 - Vine, AirBNB
  • 63. Kiss of Death: http://techblog.netflix.com/2011/04/lessons-netflix-learned-from-aws-outage.html • Spread services across multiple regions • Test failure scenarios regularly (Chaos Monkey) • Make Cassandra databases more resilient by avoiding EBS
  • 64. Redemption: Amazon moves quickly and quietly • March 2011 - New EBS GM • July 2012 - Provisioned IOPS • May 2014 - Native Encryption • Jun 2014 - GP2 (game changer) • Mar 2015 - 16TB / 10K GP2 / 20K PIOPS
  • 65. Redemption • Prioritized EBS availability and consistency beyond features and functionality • Compartmentalized the control plane - broke cross-AZ dependencies for running volumes • Simplified workflows to favor sustained operation • Tested and simulated via TLA+/PlusCal - better understood corner cases • Dedicated a large fraction of engineering resources to reliability and performance
  • 66. Reliability • EBS Team targets 99.999% availability • exceeding expectations
  • 67. CrowdStrike Today • In past 12 months, zero EBS-related failures • Thousands of GP2 data volumes (~2PB data) • Transitioning all systems to EBS root drives • Moved all data stores to EBS (C*, Kafka, Elasticsearch, Postgres, etc)
  • 68.
  • 69. Staying Safe - Architecture • Select a region with >2 AZs (e.g. us-east-1 or us-west-2) • Use EBS GP2 or PIOPS storage • Separate volumes for data and commit logs
  • 70. Staying Safe - Ops • Use EBS volume monitoring • Pre-warm EBS volumes? • Schedule snapshots for consistent backups
  • 71. Most Importantly • Challenge assumptions • Stay current on AWS blog • Talk with your peers
  • 72.
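
The multi-node stress runs on slide 54 give each stress node its own, non-overlapping population range (NODENUM=1 covers seq=1..1000000000, NODENUM=2 covers seq=1000000001..2000000000). A minimal Python sketch of how such a range could be derived is below. It is not the actual runstress.py from the cassandra-tools repository: the function names, the defaults and the example seed address are assumptions made for illustration, and only the cassandra-stress arguments are copied from the slide.

```python
# Hypothetical sketch: derive a per-node cassandra-stress command so that
# every stress node writes a disjoint sequence range, as on slide 54.
# The helper names below are illustrative, not taken from runstress.py.

SPAN = 1_000_000_000  # entries per stress node, matching seq=1..1000000000


def stress_seq_range(node_num, span=SPAN):
    """Return the inclusive (start, end) sequence range for a 1-based node number."""
    if node_num < 1:
        raise ValueError("node_num is 1-based and must be >= 1")
    start = (node_num - 1) * span + 1
    end = node_num * span
    return start, end


def build_stress_command(node_num, seed_node, threads=50,
                         profile="/home/ubuntu/summit_stress.yaml",
                         stress_bin="cassandra-stress"):
    """Assemble the argument list shown on the slide for one stress node."""
    start, end = stress_seq_range(node_num)
    return [
        stress_bin, "user", "duration=100000m", "cl=ONE",
        f"profile={profile}", "ops(insert=1,simple=9)", "no-warmup",
        "-pop", f"seq={start}..{end}",
        "-mode", "native", "cql3",
        "-node", seed_node,
        "-rate", f"threads={threads}",
        "-errors", "ignore",
    ]


if __name__ == "__main__":
    # 10.10.10.1 is a placeholder seed address, not one from the talk.
    for node in (1, 2):
        print(" ".join(build_stress_command(node, "10.10.10.1")))
```

Keeping the ranges disjoint means that 20 stress writers never overwrite each other's partitions, so the cluster sees the full write volume as new data rather than as updates to the same keys.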