High Scalability
Basics of scale and availability
Who am I?
•   Jonathan Keebler @keebler keebler.net
•   Built video player for all CTV properties
•   Worked on news sites like CP24, CTV, TSN
•   CTO, Founder of ScribbleLive
•   Bootstrapped a high scalability startup
     – Credit card limit wasn’t that high, had to find cheap
       ways to handle the load of top tier news sites

Sample load test




17 x Windows Server 2008, 2 x Varnish, 4 x nginx, 1 x SQL Server 2008


Scalability vs Availability
• Often talked about separately
• Can’t have one without the other
• Let’s talk about the basic building blocks




Building blocks
•   Content Distribution Network (CDN)
•   Load-balancer
•   Reverse proxy
•   Caching server
•   Origin server

Basic hosting structure




Basic hosting structure


 CDN            Load-balancer  Reverse proxy  Caching server  Origin server
 Akamai         Amazon ELB     nginx          Varnish         LAMP
 CloudFront     F5                            Squid           ASP.NET
 EdgeCast       HAProxy                       aiCache         node.js




Basic hosting structure
 CDN            Load-balancer  Reverse proxy  Caching server  Origin server
 + Monitoring   + Monitoring   + Monitoring   + Monitoring    + Monitoring
 Akamai         Amazon ELB     nginx          Varnish         LAMP
 CloudFront     F5                            Squid           ASP.NET
 EdgeCast       HAProxy                       aiCache         node.js




Monitor or die
• If you aren’t monitoring your stack, you
  have NO IDEA what’s going on
• Pingdom/WatchMouse/Gomez not enough
  – Don’t help you when you’re trying to figure out
    what’s going wrong
  – You need actionable metrics

Monitor or die
• Outside monitoring e.g. Pingdom, Gomez
   – DNS problems, localized problem, SLA
• Inside monitoring e.g. New Relic, CloudWatch,
  Server Density
   – High latency, CPU spikes, memory crunch,
     peek-a-boo servers, rogue processes, SQL
     queries per second, SQL wait time, SQL locks,
     disk usage, disk IO performance, page file
     usage, network traffic, requests per second
New Relic
• Dashboard




Alerting
• Don’t send alerts to your email
  – Try getting any work done with notifications
    coming in every second
• PagerDuty
• Don’t overdo it = alert fatigue


Basic hosting structure
• Now back to our servers...




Load-balancers
• Bandwidth limits on dedicated boxes
  harder to work around
• F5s are great boxes, but have lousy live
  reporting = can get into trouble quickly
• Adding/removing servers sucks
• DNS load-balancing sucks for everyone
nginx
• Fantastic at handling massive number of
  requests (low CPU, low memory)
• Easy to configure and change on-the-fly
• Gzip, modify headers, host names
• Proxy with error intercept
• No query string or IF-statement* support

Varnish
• Caching server but so much more
• Fantastic at handling massive number of
  requests (low CPU, low memory)
• Easy to configure and change on-the-fly
• Protect your origin servers
• Deals with errors from origin servers

Origin servers
• Whatever tweaks you make will never help
  enough
   – e.g. If your disk IO is becoming a problem, it’s
     already too late to save you
• Keep them stock so you don’t blow your mind,
  easier to deploy
• Handle any query string hacking in Varnish
Databases
• No silver bullet
• Two options:
   – Shard (split your data between servers)
   – Cluster (many boxes working together as one)
• Shards commonly used today
   – Lots of work on code level, no incremental IDs
• Clusters have a single point of failure
   – Try upgrading one and tell me they don’t
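
A minimal sketch of what "lots of work on code level, no incremental IDs" can look like in application code. The shard names are hypothetical, and hashing with a modulo is only one of several routing schemes; it is used here because it is short, not because the talk prescribes it.

```python
import hashlib
import uuid

# Hypothetical shard names; in practice these would be connection strings.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]


def shard_for(key: str) -> str:
    """Route a key to a shard using a stable hash.

    Python's built-in hash() is salted per process, so a digest is used
    to make every web server pick the same shard for the same key.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


def new_id() -> str:
    """Generate an ID without a shared auto-increment counter."""
    return uuid.uuid4().hex
```

Note that plain modulo routing moves most keys when the number of shards changes, which is part of why sharding is easier to put off than to redo.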

Discussion
• What stack do you use?
• What database do you use?
• SQL vs NoSQL




High Scalability
Content Distribution Networks
Basics
• Worldwide network of DNS load-balanced
  reverse proxies
• Not magic
• Can achieve 99% offload if you do it right
• Have to understand your requests

Market leaders
• Akamai: market leader, $$$, most options, yearly
  contracts, pay for GB + request headers
• CloudFront: built on AWS, cheaper, pay-as-you-go,
  fewer features, new features coming quickly,
  GB + pay-per-request
• EdgeCast (pay-as-you-go through GoGrid),
  CloudFlare (optimizer, security, easy!)

Tiered distribution
• More points-of-presence (POPs) = less caching if
  your traffic is global
• Need to put a layer of servers between POPs
  and your servers
• Sophisticated setups throttle requests
   – if 100 come in at same time, only 1 gets
     through
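
The throttling idea (100 identical requests arrive together, 1 reaches the origin) can be sketched as request coalescing. This is an illustrative in-process version, not how any particular CDN implements it; the class name `Coalescer` and the `load` callback are assumptions made for the example.

```python
import threading


class Coalescer:
    """Let one caller per key fetch from the origin; the rest wait for its result."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> shared entry for the fetch in progress

    def fetch(self, key, load):
        with self._lock:
            entry = self._inflight.get(key)
            leader = entry is None
            if leader:
                entry = {"done": threading.Event(), "value": None, "error": None}
                self._inflight[key] = entry
        if leader:
            try:
                entry["value"] = load(key)  # the single request that gets through
            except Exception as exc:
                entry["error"] = exc
            finally:
                with self._lock:
                    del self._inflight[key]
                entry["done"].set()  # wake every waiting follower
        else:
            entry["done"].wait()
        if entry["error"] is not None:
            raise entry["error"]
        return entry["value"]
```

Followers share the leader's result or its error, so a failing origin is hit once per burst rather than once per request.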
Cache keys
• Need to have same query string to get cached
  result
• Some CDNs can ignore params
   – important if you need a random number on the
     query string to prevent browser caching
• Cool options: case sensitive/insensitive, cache
  differently based on cookie, headers
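
To make the cache key rules concrete, here is a small sketch of the normalization a CDN or caching server might apply before looking up a cached result. The ignored parameter names (`rand`, `_`) are hypothetical cache-busters, and the case-insensitive handling is one of the optional behaviours mentioned above, not a default.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical names of cache-busting params that should not split the cache.
IGNORED_PARAMS = {"rand", "_"}


def cache_key(url: str) -> str:
    """Build a cache key that ignores cache-buster params, param order and case."""
    parts = urlsplit(url)
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in IGNORED_PARAMS
    )
    return f"{parts.netloc.lower()}{parts.path.lower()}?{urlencode(kept)}"
```

With this rule, `http://Example.com/Feed?id=7&rand=0.123` and `http://example.com/feed?rand=0.9&id=7` map to the same key, so the random number still defeats the browser cache without defeating the CDN.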
Invalidations suck
• Trying to get CDN to drop its cache is hard
   – takes a long time to reach all POPs
   – triggers thundering herd
   – takes out all caching for a bit
• Build the ability to change query strings at the code layer
   – e.g. add a version number to JS/CSS URLs. When you
     roll out, the new URL busts the cache
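
A sketch of versioned asset URLs, assuming a single version string that is bumped on every rollout. The parameter name `v` and the constant `ASSET_VERSION` are illustrative choices, not something the talk specifies.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical: set from a build number or commit hash at deploy time.
ASSET_VERSION = "42"


def versioned(url: str, version: str = ASSET_VERSION) -> str:
    """Append (or replace) a version param so a rollout changes the URL."""
    parts = urlsplit(url)
    params = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name != "v"
    ]
    params.append(("v", version))
    return urlunsplit(parts._replace(query=urlencode(params)))
```

For example, `versioned("/static/app.js")` gives `/static/app.js?v=42`. Because the old URL is never requested again, no invalidation call to the CDN is needed.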

How long to cache for?
• As long as you need, but no longer
• Make sure you think about error case i.e.
  what if an error gets cached
  – Some CDNs let you set your own rules for that
  – Remember, invalidations suck


Thundering herds




Thundering herds
• When you roll out or have high latency, all your
  timeouts align
   – Origins get slammed at regular interval by POPs
• Random TTLs are your friend
   – Just +/- a few minutes can be a big help
   – TIP: break into C in Varnish
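
The slide's tip applies the jitter inside Varnish. As an alternative sketch, the same idea can be applied at the origin by randomizing the `max-age` it sends; the function names and the 10 minute base with 2 minutes of jitter are assumptions for illustration.

```python
import random


def jittered_ttl(base_seconds: int, jitter_seconds: int) -> int:
    """Return the base TTL plus or minus a random offset, never below 1 second."""
    offset = random.randint(-jitter_seconds, jitter_seconds)
    return max(1, base_seconds + offset)


def cache_control(base_seconds: int = 600, jitter_seconds: int = 120) -> str:
    """Build a Cache-Control header value with a randomized max-age."""
    return f"public, max-age={jittered_ttl(base_seconds, jitter_seconds)}"
```

Each response then expires at a slightly different moment, so POPs that filled their caches together stop coming back to the origin together.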


Don’t build your own*
• You will never be as smart as Akamai/Amazon
• You will never be able to bring on new servers
  fast enough to scale
• Spend your time building awesome software
• Build your own caching layer for the POPs (and
  just in case, to protect your origin servers)


Discussion
• What CDN do you use?
• War stories




High Scalability
Caching in Code
Why do I need this?
•   You can’t cache every request
•   You can’t cache POST requests
•   Protect the database!
•   The longer you can go before you have to
    shard your database, the better

What is it?
• In-process, in-memory caching
• Static variables work great
  – TIP: .NET: statics are shared by every thread in the
    AppDomain (per-thread only with [ThreadStatic]), so
    make access thread-safe
• Custom memory stores
• Whatever you want, just not the disk
Isn’t that what Memcached is for?
• Memcached is in-memory BUT so is your database
  – Advantages of Memcached over your database:
     • Cheaper to replicate
     • Fast lookups...if your db sucks
  – Disadvantages:
     • Still has network latency, higher than db lookup (unless
       your db sucks)
     • IT’S NOT A DATABASE!

Getting started
•   Think about your data + classes
•   TTLs based on knowledge of your data
•   Random TTLs (avoid the thundering herd)
•   Use concurrent, thread-safe objects
•   Wrap your code in try-catch
     – Caching isn’t worth breaking your site for
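
A minimal sketch of an in-process cache that follows the checklist above: a thread-safe store, a TTL per entry, and try/except around the cache so a cache failure falls back to the real lookup. The class name `SafeCache` and the `load` callback are hypothetical.

```python
import threading
import time


class SafeCache:
    """In-memory cache where a cache failure never breaks the request."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = {}  # key -> (expires_at, value)

    def get_or_load(self, key, load, ttl_seconds):
        try:
            with self._lock:
                hit = self._items.get(key)
            if hit is not None and hit[0] > time.monotonic():
                return hit[1]
        except Exception:
            pass  # a broken cache read falls through to the real lookup

        value = load(key)  # errors from the real lookup are not hidden

        try:
            with self._lock:
                self._items[key] = (time.monotonic() + ttl_seconds, value)
        except Exception:
            pass  # failing to store is not worth failing the request
        return value
```

The TTL passed in would normally come from knowledge of the data, with some random jitter added so that entries filled at the same moment do not all expire together.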

Updating cache
•   Use semaphores (that Comp Sci degree is finally going to come in handy)
•   Semaphores should always unlock on their own
     – Your thread could die/timeout at any time. You don’t want to lock forever
•   Use a separate thread for the lookup. Why should one user suffer?
•   Using a datetime semaphore is usually best
     – keep a time when the next update will take place
     – 1st thread to hit that time, immediately adds some seconds to the time.
       Buys itself enough time to do lookup
     – Any blocked thread gets cached data. DON’T LOCK
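
One way to read the datetime semaphore described above, as a sketch: the "semaphore" is just a timestamp for the next update. The class name, the `grace_seconds` parameter and the use of a background thread are assumptions. The small mutex here only protects the timestamp comparison and is never held during the lookup, so readers are never blocked on the database.

```python
import threading
import time


class RefreshingValue:
    """Serve cached data while exactly one thread refreshes it."""

    def __init__(self, load, ttl_seconds, grace_seconds=10.0):
        self._load = load
        self._ttl = ttl_seconds
        self._grace = grace_seconds
        self._guard = threading.Lock()  # protects the timestamp only
        self._value = load()  # first population is synchronous
        self._next_update = time.monotonic() + ttl_seconds

    def get(self):
        now = time.monotonic()
        claimed = False
        with self._guard:
            if now >= self._next_update:
                # First thread past the deadline pushes it forward,
                # buying itself time to do the lookup.
                self._next_update = now + self._grace
                claimed = True
        if claimed:
            threading.Thread(target=self._refresh, daemon=True).start()
        return self._value  # everyone, including the claimer, gets cached data

    def _refresh(self):
        try:
            value = self._load()
        except Exception:
            return  # keep stale data; the grace period expires and someone retries
        with self._guard:
            self._value = value
            self._next_update = time.monotonic() + self._ttl
```

Because the claim is only a timestamp, it "unlocks on its own": if the refreshing thread dies, the grace period runs out and the next caller takes over.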


Populating cache for first time
• How do you prevent a thundering herd before
  the cache is populated?
• Ok, you may have to lock. But be smart about it.
• Are you sure your database can’t handle it?
• This is where other caching layers help: CDN
  throttling, Varnish throttling, Memcached,
  read-only databases
Garbage collection
• Keep counters for metrics e.g. how many hits to the cached
  object, datetime of last request for that object
• Every X something, run your garbage collection
   – Use semaphores
   – Don’t get rid of the most used objects
• You are going to collide with running code
   – try-catch is your friend
• Don’t be afraid to dump the cache and start over
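
A sketch of cache garbage collection driven by the metrics listed above (hit count and time of last request). The class name, the `max_items` limit and the eviction order are illustrative assumptions; the slide leaves the policy open.

```python
import threading
import time


class CountingCache:
    """Cache that tracks usage so cleanup can keep the most used objects."""

    def __init__(self, max_items):
        self._max = max_items
        self._lock = threading.Lock()
        self._items = {}  # key -> [value, hits, last_access]

    def put(self, key, value):
        with self._lock:
            self._items[key] = [value, 0, time.monotonic()]

    def get(self, key):
        with self._lock:
            entry = self._items.get(key)
            if entry is None:
                return None
            entry[1] += 1
            entry[2] = time.monotonic()
            return entry[0]

    def collect(self):
        """Evict the least used, then oldest, entries; return how many were dropped."""
        with self._lock:
            excess = len(self._items) - self._max
            if excess <= 0:
                return 0
            by_usage = sorted(
                self._items,
                key=lambda k: (self._items[k][1], self._items[k][2]),
            )
            for key in by_usage[:excess]:
                del self._items[key]
            return excess

    def clear(self):
        """Dump the whole cache and start over."""
        with self._lock:
            self._items.clear()
```

`collect()` would be called on a timer or every N requests, wrapped in try/except by the caller so that a failed cleanup falls back to `clear()` rather than an error page.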

Watch out for references
• If you are storing something in a cache object, you
  can save a lot of memory by passing reference to
  object
• Don’t forget about the reference
• Watch out for garbage collection trying to destroy it
• Updating cache operation might involve updating an
  existing object

The curse
• More servers = more caches = less
  efficient
• Discipline: can’t throw more servers at the
  problem
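
A rough worked example of the curse, under a worst-case assumption that is not from the talk: every server has its own in-process cache and every hot object is requested on every server within each TTL window. The function name and the numbers are illustrative only.

```python
def db_loads_per_hour(servers: int, hot_objects: int, ttl_seconds: float) -> float:
    """Estimate cache-fill lookups per hour when each server caches independently."""
    refreshes_per_hour = 3600 / ttl_seconds
    return servers * hot_objects * refreshes_per_hour
```

With 1,000 hot objects and a 60 second TTL, one server causes 60,000 lookups per hour and 17 servers cause 1,020,000. The database load from cache fills grows linearly with the number of servers, which is why adding servers does not substitute for a shared layer in front of the database.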



Totally worth it!




Requests per minute to origin servers


Totally worth it!




CPU of 1 x SQL Server 2008 database


Discussion
• What do you use to cache at a code layer?
• War stories




Thank you!
• Jonathan Keebler
• jonathan@scribblelive.com
• @keebler




High Scalability Toronto: Meetup #2
