Avoiding Full GCs with
MemStore-Local Allocation Buffers
                 Todd Lipcon
              todd@cloudera.com
Twitter: @tlipcon      #hbase IRC: tlipcon




            February 22, 2011
Outline

  Background

  HBase and GC

  A solution

  Summary
Intro / who am I?
     Been working on data stuff for a few years
     HBase, HDFS, MR committer
     Cloudera engineer since March ’09
Motivation
     HBase users want to use large heaps
         Bigger block caches make for better hit rates
         Bigger memstores make for larger and more
         efficient flushes
         Machines come with 24G-48G RAM
     But bigger heaps mean longer GC pauses
         Around 10 seconds/GB on my boxes.
         Multi-minute GC pauses wreak havoc
GC Disasters
   1. Client requests stalled
           1 minute “latency” is just as bad as unavailability
   2. ZooKeeper sessions stop pinging
           The dreaded “Juliet Pause” scenario
   3. Triggers all kinds of other nasty bugs
Yo Concurrent
   Mark-and-Sweep (CMS)!

What part of Concurrent didn’t
      you understand?
Java GC Background
     Java’s GC is generational
         Generational hypothesis: most objects either die
         young or stick around for quite a long time
         Split the heap into two “generations” - young (aka
         new) and old (aka tenured)
     Use different algorithms for the two generations
     We usually recommend -XX:+UseParNewGC
     -XX:+UseConcMarkSweepGC
         Young generation: Parallel New collector
         Old generation: Concurrent-mark-sweep
The Parallel New collector in 60 seconds
     Divide the young generation into eden,
     survivor-0, and survivor-1
     One survivor space is from-space and the other
     is to-space
     Allocate all objects in eden
     When eden fills up, stop the world and copy
     live objects from eden and from-space into
     to-space, swap from and to
         Once an object has been copied back and forth N
         times, copy it to the old generation
         N is the “Tenuring Threshold” (tunable)
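The copy-and-promote cycle above can be sketched as a toy model. This is only an illustration of the slide, not how HotSpot is implemented: the class YoungGenSketch, its fields, and the isLive predicate (standing in for real reachability) are all made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Toy model of a copying young generation. Not how HotSpot is implemented. */
final class YoungGenSketch {
    static final class Obj {
        final String name;
        int copies;                       // times copied between survivor spaces
        Obj(String name) { this.name = name; }
    }

    final int tenuringThreshold;          // the "N" from the slide
    final List<Obj> eden = new ArrayList<>();
    final List<Obj> oldGen = new ArrayList<>();
    List<Obj> fromSpace = new ArrayList<>();
    List<Obj> toSpace = new ArrayList<>();

    YoungGenSketch(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    /** All new objects start out in eden. */
    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    /** One stop-the-world young collection; isLive stands in for reachability. */
    void youngCollection(Predicate<Obj> isLive) {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(fromSpace);
        for (Obj o : candidates) {
            if (!isLive.test(o)) continue;          // dead objects are simply not copied
            if (o.copies >= tenuringThreshold) {
                oldGen.add(o);                      // copied N times already: promote
            } else {
                o.copies++;
                toSpace.add(o);                     // copy into to-space
            }
        }
        eden.clear();
        fromSpace.clear();
        List<Obj> tmp = fromSpace;                  // swap from and to
        fromSpace = toSpace;
        toSpace = tmp;
    }

    public static void main(String[] args) {
        YoungGenSketch heap = new YoungGenSketch(2);
        heap.allocate("long-lived");
        heap.allocate("short-lived");
        Predicate<Obj> isLive = o -> o.name.equals("long-lived");
        for (int gc = 1; gc <= 3; gc++) {
            heap.youngCollection(isLive);
            System.out.println("after GC " + gc + ": survivors=" + heap.fromSpace.size()
                    + " old=" + heap.oldGen.size());
        }
    }
}
```

With a threshold of 2, the long-lived object is copied twice and lands in the old generation on the third collection; the short-lived one never leaves eden.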
The CMS collector in 60 seconds
A bit simplified, sorry...
            Several phases:
               1. initial-mark (stop-the-world) - marks roots (e.g.
                  thread stacks)
               2. concurrent-mark - traverse references starting at
                  roots, marking what’s live
               3. concurrent-preclean - another pass of the same
                  (catch new objects)
               4. remark (stop-the-world) - any last changed/new
                  objects
               5. concurrent-sweep - clean up dead objects to
                  update free space tracking
            Note: dead objects free up space, but it’s not
            contiguous. We’ll come back to this later!
CMS failure modes
   1. When young generation collection happens, it
      needs space in the old gen. What if CMS is
      already in the middle of concurrent work, but
      there’s no space?
          The dreaded concurrent mode failure! Stop
          the world and collect.
          Solution: lower value of
          -XX:CMSInitiatingOccupancyFraction so
          CMS starts working earlier
   2. What if there’s space in the old generation, but
      not enough contiguous space to promote a
      large object?
          We need to compact the old generation (move all
          free space to be contiguous)
          This is also stop-the-world! Kaboom!
OK... so life sucks.

What can we do about it?
Step 1. Hypothesize
     Setting the initiating occupancy fraction low
     puts off the full GC, but it eventually happens
     no matter what
     We see “promotion failed” followed by a long
     GC pause, even when 30% of the heap is free.
     Why? Must be fragmentation!
Step 2. Measure
     Let’s make some graphs:
     -XX:PrintFLSStatistics=1
     -XX:PrintCMSStatistics=1
     -XX:+PrintGCDetails
     -XX:+PrintGCDateStamps -verbose:gc
     -Xloggc:/.../logs/gc-$(hostname).log
     FLS Statistics: verbose information about the
     state of the free space inside the old generation

         Free space - total amount of free space
         Num blocks - number of fragments it’s spread into
         Max chunk size
     parse-fls-statistics.py → R and
     ggplot2
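To make the three numbers concrete, here is a minimal sketch that computes them from a list of free block sizes. It does not parse real JVM log output; the class FreeSpaceStats, its methods, and the sample sizes are hypothetical and only illustrate why max chunk size matters more than total free space.

```java
/** The three free-list numbers named on the slide, computed from free block sizes. */
final class FreeSpaceStats {
    final long freeSpace;     // total amount of free space
    final int numBlocks;      // number of fragments it is spread into
    final long maxChunkSize;  // largest single contiguous free block

    private FreeSpaceStats(long freeSpace, int numBlocks, long maxChunkSize) {
        this.freeSpace = freeSpace;
        this.numBlocks = numBlocks;
        this.maxChunkSize = maxChunkSize;
    }

    static FreeSpaceStats of(long[] freeBlockSizes) {
        long total = 0;
        long max = 0;
        for (long size : freeBlockSizes) {
            total += size;
            max = Math.max(max, size);
        }
        return new FreeSpaceStats(total, freeBlockSizes.length, max);
    }

    /** An allocation needs one contiguous block, so only the max chunk counts. */
    boolean canFit(long size) {
        return size <= maxChunkSize;
    }

    public static void main(String[] args) {
        // Made-up free list: 10 units free in total, but no block larger than 4.
        FreeSpaceStats s = FreeSpaceStats.of(new long[] {4, 1, 3, 2});
        System.out.println("free=" + s.freeSpace + " blocks=" + s.numBlocks
                + " max=" + s.maxChunkSize + " fits 5? " + s.canFit(5));
    }
}
```

In the sample, 10 units are free but a 5-unit object still cannot be placed, which is the "plenty of free heap, promotion failed anyway" pattern.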
3 YCSB workloads, graphed
Workload 1
Insert-only
Workload 2
Read-only with cache churn
Workload 3
Read-only with no cache churn




          So boring I didn’t make a graph!
          All allocations are short lived → stay in young
          gen
Recap
What have we learned?

             Fragmentation is what causes long GC pauses
             Write load seems to cause fragmentation
             Read load (LRU cache churn) isn’t nearly so
             bad (at least for my test workloads)
Taking a step back
Why does write load cause fragmentation?

          Imagine we have 5 regions, A through E
          We take writes in the following order into an
          empty old generation:
          ABCDEABCEDDAECBACEBCED
          Now B’s memstore fills up and flushes. We’re
          left with:
          A CDEA CEDDAEC ACE CED
          Looks like fragmentation!
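The same example as a small runnable sketch, treating each letter as one equally sized allocation and a space as a free slot. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Replays the slide's write order and shows what a flush of one region leaves behind. */
final class SwissCheeseSketch {
    /** Frees every allocation owned by the given region (shown as ' '). */
    static String flush(String heap, char region) {
        return heap.replace(region, ' ');
    }

    /** Lengths of each contiguous run of free slots, left to right. */
    static List<Integer> freeRuns(String heap) {
        List<Integer> runs = new ArrayList<>();
        int run = 0;
        for (char c : heap.toCharArray()) {
            if (c == ' ') {
                run++;
            } else if (run > 0) {
                runs.add(run);
                run = 0;
            }
        }
        if (run > 0) runs.add(run);
        return runs;
    }

    public static void main(String[] args) {
        String heap = "ABCDEABCEDDAECBACEBCED";
        String after = flush(heap, 'B');
        System.out.println("[" + after + "]");               // [A CDEA CEDDAEC ACE CED]
        System.out.println("free runs: " + freeRuns(after)); // free runs: [1, 1, 1, 1]
    }
}
```

Four slots come back, but as four separate holes of size one: nothing larger than a single write can be placed without compacting.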
Also known as swiss cheese




If every write is exactly the same size, it’s fine -
we’ll fill in those holes. But this is seldom true.
A solution
     Crucial issue is that memory allocations for a
     given memstore aren’t next to each other in
     the old generation.
     When we free an entire memstore we only get
     tiny blocks of free space
     What if we ensure that the memory for a
     memstore is made of large blocks?
     Enter the MemStore Local Allocation Buffer
     (MSLAB)
What’s an MSLAB?
    Each MemStore has an instance of
    MemStoreLAB.
    MemStoreLAB has a 2MB curChunk with
    nextFreeOffset starting at 0.
    Before inserting a KeyValue that points to
    some byte[], copy the data into curChunk
    and increment nextFreeOffset by data.length
    Insert a KeyValue pointing inside curChunk
    instead of the original data.
    If a chunk fills up, just make a new one.
    This is all lock-free, using atomic
    compare-and-swap instructions.
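A minimal sketch of the allocation path described above, assuming a simplified design. Only curChunk and nextFreeOffset come from the slide; MslabSketch, Chunk, Allocation, reserve, and copyIn are illustrative names, not the actual HBase classes, and the real code handles oversized values differently.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Simplified sketch of a MemStore-local allocation buffer. */
final class MslabSketch {
    /** Where the copied bytes live: the chunk's backing array plus an offset. */
    static final class Allocation {
        final byte[] chunk;
        final int offset;
        final int length;
        Allocation(byte[] chunk, int offset, int length) {
            this.chunk = chunk;
            this.offset = offset;
            this.length = length;
        }
    }

    static final class Chunk {
        final byte[] data;
        final AtomicInteger nextFreeOffset = new AtomicInteger(0);
        Chunk(int size) { data = new byte[size]; }

        /** Returns the reserved offset, or -1 if the chunk cannot fit size bytes. */
        int reserve(int size) {
            while (true) {
                int old = nextFreeOffset.get();
                if (old + size > data.length) return -1;
                if (nextFreeOffset.compareAndSet(old, old + size)) return old;
                // Lost a race with another writer: retry with the new offset.
            }
        }
    }

    private final int chunkSize;
    private final AtomicReference<Chunk> curChunk;

    MslabSketch(int chunkSize) {
        this.chunkSize = chunkSize;
        this.curChunk = new AtomicReference<>(new Chunk(chunkSize));
    }

    /** Copies src into the current chunk and returns where it now lives. */
    Allocation copyIn(byte[] src) {
        if (src.length > chunkSize) {
            // Assumption of this sketch: oversized values are simply rejected.
            throw new IllegalArgumentException("value larger than a chunk");
        }
        while (true) {
            Chunk c = curChunk.get();
            int offset = c.reserve(src.length);
            if (offset >= 0) {
                System.arraycopy(src, 0, c.data, offset, src.length);
                return new Allocation(c.data, offset, src.length);
            }
            // Chunk is full: try to install a fresh one. Whoever wins, retry.
            curChunk.compareAndSet(c, new Chunk(chunkSize));
        }
    }
}
```

A caller would then build its KeyValue over (chunk, offset, length) instead of the original byte[], so the original array can die young.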
How does this help?
     The original data to be inserted becomes very
     short-lived, and dies in the young generation.
     The only data in the old generation is made of
     2MB chunks
     Each chunk only belongs to one memstore.
     When we flush, we always free up 2MB chunks,
     and avoid the swiss cheese effect.
     Next time we allocate, we need exactly 2MB
     chunks again, and there will definitely be space.
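To see the effect, here is a toy layout of the earlier A-through-E write order with per-region chunks. It is a sketch under simplifying assumptions: a "chunk" is 4 slots rather than 2MB, every write takes one slot, and ChunkedLayoutSketch and its methods are made-up names.

```java
import java.util.HashMap;
import java.util.Map;

/** Lays out writes so each region fills its own chunk before taking a new one. */
final class ChunkedLayoutSketch {
    /** Returns the heap as one character per slot, labelled by owning region. */
    static String layout(String writes, int slotsPerChunk) {
        StringBuilder heap = new StringBuilder();
        // Free slots left in each region's current chunk.
        Map<Character, Integer> remaining = new HashMap<>();
        for (char region : writes.toCharArray()) {
            int left = remaining.getOrDefault(region, 0);
            if (left == 0) {
                // Current chunk is full (or absent): reserve a whole new chunk.
                for (int i = 0; i < slotsPerChunk; i++) heap.append(region);
                left = slotsPerChunk;
            }
            remaining.put(region, left - 1);
        }
        return heap.toString();
    }

    /** Length of the longest contiguous run of free (' ') slots. */
    static int largestFreeRun(String heap) {
        int best = 0;
        int run = 0;
        for (char c : heap.toCharArray()) {
            run = (c == ' ') ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String heap = layout("ABCDEABCEDDAECBACEBCED", 4);
        String afterFlush = heap.replace('B', ' ');   // B's memstore flushes
        System.out.println("[" + heap + "]");
        System.out.println("[" + afterFlush + "]");
        System.out.println("largest free run: " + largestFreeRun(afterFlush));
    }
}
```

Flushing B now frees one hole of exactly one chunk, which is the size the next allocation asks for. The cost is visible too: the layout uses 28 slots for 22 writes, because partly filled chunks hold unused space.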
Does it work?
It works!



    Have seen basically zero full
    GCs with MSLAB enabled,
     after days of load testing
Summary
    Most long GC pauses are caused by fragmentation
    in the old generation.
    The CMS collector doesn’t compact, so the
    only way it can fight fragmentation is to pause.
    The MSLAB moves all MemStore allocations
    into contiguous 2MB chunks in the old
    generation.
    No more full GC pauses!
How to try it
   1. Upgrade to HBase 0.90.1 (included in
      CDH3b4)
   2. Set hbase.hregion.memstore.mslab.enabled to
      true
          Also tunable:
          hbase.hregion.memstore.mslab.chunksize
          (in bytes, default 2M)

          hbase.hregion.memstore.mslab.max.allocation
          (in bytes, default 256K)
   3. Report back your results!
Future work
     Flat 2MB chunk per region → 2GB RAM
     minimum usage for 1000 regions
     incrementColumnValue currently bypasses
     MSLAB for subtle reasons
     We’re doing an extra memory copy into
     MSLAB chunk - we can optimize this out
     Maybe we can relax
     CMSInitiatingOccupancyFraction back up
     a bit?
So I don’t forget...
Corporate shill time




    Cloudera offering HBase training on March 10th.

    15 percent off with hbase meetup code.
todd@cloudera.com
  Twitter: @tlipcon
#hbase IRC: tlipcon

   P.S. we’re hiring!

Contenu connexe

Tendances

HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseenissoz
 
Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path HBaseCon
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveDataWorks Summit
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013mumrah
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Cloudera, Inc.
 
[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)NAVER D2
 
HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance ImprovementBiju Nair
 
Apache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data ProcessingApache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data ProcessingDataWorks Summit
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm Chandler Huang
 
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3DataWorks Summit
 
Facebook Messages & HBase
Facebook Messages & HBaseFacebook Messages & HBase
Facebook Messages & HBase强 王
 
LLAP: long-lived execution in Hive
LLAP: long-lived execution in HiveLLAP: long-lived execution in Hive
LLAP: long-lived execution in HiveDataWorks Summit
 
Apache phoenix: Past, Present and Future of SQL over HBAse
Apache phoenix: Past, Present and Future of SQL over HBAseApache phoenix: Past, Present and Future of SQL over HBAse
Apache phoenix: Past, Present and Future of SQL over HBAseenissoz
 
HBase Sizing Guide
HBase Sizing GuideHBase Sizing Guide
HBase Sizing Guidelarsgeorge
 
How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...DataWorks Summit/Hadoop Summit
 
Hive+Tez: A performance deep dive
Hive+Tez: A performance deep diveHive+Tez: A performance deep dive
Hive+Tez: A performance deep divet3rmin4t0r
 
Kafka replication apachecon_2013
Kafka replication apachecon_2013Kafka replication apachecon_2013
Kafka replication apachecon_2013Jun Rao
 

Tendances (20)

HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
 
Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
 
HBase Low Latency
HBase Low LatencyHBase Low Latency
HBase Low Latency
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive


 
[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)
 
HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance Improvement
 
Apache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data ProcessingApache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data Processing
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm
 
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
 
HBase Accelerated: In-Memory Flush and Compaction
HBase Accelerated: In-Memory Flush and CompactionHBase Accelerated: In-Memory Flush and Compaction
HBase Accelerated: In-Memory Flush and Compaction
 
Facebook Messages & HBase
Facebook Messages & HBaseFacebook Messages & HBase
Facebook Messages & HBase
 
LLAP: long-lived execution in Hive
LLAP: long-lived execution in HiveLLAP: long-lived execution in Hive
LLAP: long-lived execution in Hive
 
Apache phoenix: Past, Present and Future of SQL over HBAse
Apache phoenix: Past, Present and Future of SQL over HBAseApache phoenix: Past, Present and Future of SQL over HBAse
Apache phoenix: Past, Present and Future of SQL over HBAse
 
HBase Sizing Guide
HBase Sizing GuideHBase Sizing Guide
HBase Sizing Guide
 
How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...
 
Hive+Tez: A performance deep dive
Hive+Tez: A performance deep diveHive+Tez: A performance deep dive
Hive+Tez: A performance deep dive
 
Kafka replication apachecon_2013
Kafka replication apachecon_2013Kafka replication apachecon_2013
Kafka replication apachecon_2013
 
Intro to HBase
Intro to HBaseIntro to HBase
Intro to HBase
 

En vedette

Chicago Data Summit: Apache HBase: An Introduction
Chicago Data Summit: Apache HBase: An IntroductionChicago Data Summit: Apache HBase: An Introduction
Chicago Data Summit: Apache HBase: An IntroductionCloudera, Inc.
 
Hourglass: a Library for Incremental Processing on Hadoop
Hourglass: a Library for Incremental Processing on HadoopHourglass: a Library for Incremental Processing on Hadoop
Hourglass: a Library for Incremental Processing on HadoopMatthew Hayes
 
Hw09 Practical HBase Getting The Most From Your H Base Install
Hw09   Practical HBase  Getting The Most From Your H Base InstallHw09   Practical HBase  Getting The Most From Your H Base Install
Hw09 Practical HBase Getting The Most From Your H Base InstallCloudera, Inc.
 
Hbase运维碎碎念
Hbase运维碎碎念Hbase运维碎碎念
Hbase运维碎碎念haiyuan ning
 
Hourglass: a Library for Incremental Processing on Hadoop
Hourglass: a Library for Incremental Processing on HadoopHourglass: a Library for Incremental Processing on Hadoop
Hourglass: a Library for Incremental Processing on HadoopMatthew Hayes
 
Apache Mesos at Twitter (Texas LinuxFest 2014)
Apache Mesos at Twitter (Texas LinuxFest 2014)Apache Mesos at Twitter (Texas LinuxFest 2014)
Apache Mesos at Twitter (Texas LinuxFest 2014)Chris Aniszczyk
 
Keynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! ScaleKeynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! ScaleHBaseCon
 
HBase Read High Availability Using Timeline Consistent Region Replicas
HBase  Read High Availability Using Timeline Consistent Region ReplicasHBase  Read High Availability Using Timeline Consistent Region Replicas
HBase Read High Availability Using Timeline Consistent Region Replicasenissoz
 
HBase Blockcache 101
HBase Blockcache 101HBase Blockcache 101
HBase Blockcache 101Nick Dimiduk
 
Overview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringOverview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringTuri, Inc.
 
Nail the First 60 Seconds of Your Presentation
Nail the First 60 Seconds of Your PresentationNail the First 60 Seconds of Your Presentation
Nail the First 60 Seconds of Your PresentationBruce Kasanoff
 
The Growth Hacker Wake Up Call
The Growth Hacker Wake Up CallThe Growth Hacker Wake Up Call
The Growth Hacker Wake Up CallRyan Holiday
 
16 Unique & Innovative Ways to Market your Business
16 Unique & Innovative Ways to Market your Business16 Unique & Innovative Ways to Market your Business
16 Unique & Innovative Ways to Market your BusinessNicoleElmore.com
 
PSFK Future of Work Report 2013
PSFK Future of Work Report 2013PSFK Future of Work Report 2013
PSFK Future of Work Report 2013PSFK
 
5 Secrets to Killer Lead Generation Using SlideShare
5 Secrets to Killer Lead Generation Using SlideShare5 Secrets to Killer Lead Generation Using SlideShare
5 Secrets to Killer Lead Generation Using SlideShareEugene Cheng
 
The Evolution of Film Editing
The Evolution of Film EditingThe Evolution of Film Editing
The Evolution of Film EditingAdobe
 

En vedette (20)

Chicago Data Summit: Apache HBase: An Introduction
Chicago Data Summit: Apache HBase: An IntroductionChicago Data Summit: Apache HBase: An Introduction
Chicago Data Summit: Apache HBase: An Introduction
 
Hourglass: a Library for Incremental Processing on Hadoop
Hourglass: a Library for Incremental Processing on HadoopHourglass: a Library for Incremental Processing on Hadoop
Hourglass: a Library for Incremental Processing on Hadoop
 
Hw09 Practical HBase Getting The Most From Your H Base Install
Hw09   Practical HBase  Getting The Most From Your H Base InstallHw09   Practical HBase  Getting The Most From Your H Base Install
Hw09 Practical HBase Getting The Most From Your H Base Install
 
Hbase运维碎碎念
Hbase运维碎碎念Hbase运维碎碎念
Hbase运维碎碎念
 
Hourglass: a Library for Incremental Processing on Hadoop
Hourglass: a Library for Incremental Processing on HadoopHourglass: a Library for Incremental Processing on Hadoop
Hourglass: a Library for Incremental Processing on Hadoop
 
Apache Mesos at Twitter (Texas LinuxFest 2014)
Apache Mesos at Twitter (Texas LinuxFest 2014)Apache Mesos at Twitter (Texas LinuxFest 2014)
Apache Mesos at Twitter (Texas LinuxFest 2014)
 
Keynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! ScaleKeynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! Scale
 
HBase Read High Availability Using Timeline Consistent Region Replicas
HBase  Read High Availability Using Timeline Consistent Region ReplicasHBase  Read High Availability Using Timeline Consistent Region Replicas
HBase Read High Availability Using Timeline Consistent Region Replicas
 
HBase Blockcache 101
HBase Blockcache 101HBase Blockcache 101
HBase Blockcache 101
 
Overview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringOverview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature Engineering
 
Nail the First 60 Seconds of Your Presentation
Nail the First 60 Seconds of Your PresentationNail the First 60 Seconds of Your Presentation
Nail the First 60 Seconds of Your Presentation
 
The Growth Hacker Wake Up Call
The Growth Hacker Wake Up CallThe Growth Hacker Wake Up Call
The Growth Hacker Wake Up Call
 
16 Unique & Innovative Ways to Market your Business
16 Unique & Innovative Ways to Market your Business16 Unique & Innovative Ways to Market your Business
16 Unique & Innovative Ways to Market your Business
 
Slide Wars- The Force Sleeps
Slide Wars- The Force SleepsSlide Wars- The Force Sleeps
Slide Wars- The Force Sleeps
 
PSFK Future of Work Report 2013
PSFK Future of Work Report 2013PSFK Future of Work Report 2013
PSFK Future of Work Report 2013
 
5 Secrets to Killer Lead Generation Using SlideShare
5 Secrets to Killer Lead Generation Using SlideShare5 Secrets to Killer Lead Generation Using SlideShare
5 Secrets to Killer Lead Generation Using SlideShare
 
The Evolution of Film Editing
The Evolution of Film EditingThe Evolution of Film Editing
The Evolution of Film Editing
 
The Impala Cookbook
The Impala CookbookThe Impala Cookbook
The Impala Cookbook
 
99 Facts on the Future of Business
99 Facts on the Future of Business99 Facts on the Future of Business
99 Facts on the Future of Business
 
Profits before People
Profits before PeopleProfits before People
Profits before People
 

Similaire à HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers

The JVM is your friend
The JVM is your friendThe JVM is your friend
The JVM is your friendKai Koenig
 
Trigger maxl from fdmee
Trigger maxl from fdmeeTrigger maxl from fdmee
Trigger maxl from fdmeeBernard Ash
 
A quick view about Java Virtual Machine
A quick view about Java Virtual MachineA quick view about Java Virtual Machine
A quick view about Java Virtual MachineJoão Santana
 
Hug Hbase Presentation.
Hug Hbase Presentation.Hug Hbase Presentation.
Hug Hbase Presentation.Jack Levin
 
Clustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And AvailabilityClustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And AvailabilityConSanFrancisco123
 
[Jbcn 2016] Garbage Collectors WTF!?
[Jbcn 2016] Garbage Collectors WTF!?[Jbcn 2016] Garbage Collectors WTF!?
[Jbcn 2016] Garbage Collectors WTF!?Alonso Torres
 
Solr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the UglySolr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the UglySematext Group, Inc.
 
Solr on Docker: the Good, the Bad, and the Ugly - Radu Gheorghe, Sematext Gro...
Solr on Docker: the Good, the Bad, and the Ugly - Radu Gheorghe, Sematext Gro...Solr on Docker: the Good, the Bad, and the Ugly - Radu Gheorghe, Sematext Gro...
Solr on Docker: the Good, the Bad, and the Ugly - Radu Gheorghe, Sematext Gro...Lucidworks
 
006 performance tuningandclusteradmin
006 performance tuningandclusteradmin006 performance tuningandclusteradmin
006 performance tuningandclusteradminScott Miao
 
Low pause GC in HotSpot
Low pause GC in HotSpotLow pause GC in HotSpot
Low pause GC in HotSpotjClarity
 
Lessons learnt on a 2000-core cluster
Lessons learnt on a 2000-core clusterLessons learnt on a 2000-core cluster
Lessons learnt on a 2000-core clusterEugene Kirpichov
 
Optimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsOptimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at Localyticsandrew311
 
Taming Go's Memory Usage — and Avoiding a Rust Rewrite
Taming Go's Memory Usage — and Avoiding a Rust RewriteTaming Go's Memory Usage — and Avoiding a Rust Rewrite
Taming Go's Memory Usage — and Avoiding a Rust RewriteScyllaDB
 
HBase: Extreme Makeover
HBase: Extreme MakeoverHBase: Extreme Makeover
HBase: Extreme MakeoverHBaseCon
 
2009 Eclipse Con
2009 Eclipse Con2009 Eclipse Con
2009 Eclipse Conguest29922
 
CMF: a pain in the F @ PHPDay 05-14-2011
CMF: a pain in the F @ PHPDay 05-14-2011CMF: a pain in the F @ PHPDay 05-14-2011
CMF: a pain in the F @ PHPDay 05-14-2011Alessandro Nadalin
 
Cassandra Anti-Patterns
Cassandra Anti-PatternsCassandra Anti-Patterns
Cassandra Anti-PatternsMatthew Dennis
 

Similaire à HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers (20)

JVM Magic
JVM MagicJVM Magic
JVM Magic
 
Jvm is-your-friend
Jvm is-your-friendJvm is-your-friend
Jvm is-your-friend
 
The JVM is your friend
The JVM is your friendThe JVM is your friend
The JVM is your friend
 
Trigger maxl from fdmee
Trigger maxl from fdmeeTrigger maxl from fdmee
Trigger maxl from fdmee
 
A quick view about Java Virtual Machine
A quick view about Java Virtual MachineA quick view about Java Virtual Machine
A quick view about Java Virtual Machine
 
Hug Hbase Presentation.
Hug Hbase Presentation.Hug Hbase Presentation.
Hug Hbase Presentation.
 
Clustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And AvailabilityClustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And Availability
 
[Jbcn 2016] Garbage Collectors WTF!?
[Jbcn 2016] Garbage Collectors WTF!?[Jbcn 2016] Garbage Collectors WTF!?
[Jbcn 2016] Garbage Collectors WTF!?
 
Solr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the UglySolr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the Ugly
 
Solr on Docker: the Good, the Bad, and the Ugly - Radu Gheorghe, Sematext Gro...
Solr on Docker: the Good, the Bad, and the Ugly - Radu Gheorghe, Sematext Gro...Solr on Docker: the Good, the Bad, and the Ugly - Radu Gheorghe, Sematext Gro...
Solr on Docker: the Good, the Bad, and the Ugly - Radu Gheorghe, Sematext Gro...
 
006 performance tuningandclusteradmin
006 performance tuningandclusteradmin006 performance tuningandclusteradmin
006 performance tuningandclusteradmin
 
Low pause GC in HotSpot
Low pause GC in HotSpotLow pause GC in HotSpot
Low pause GC in HotSpot
 
Lessons learnt on a 2000-core cluster
Lessons learnt on a 2000-core clusterLessons learnt on a 2000-core cluster
Lessons learnt on a 2000-core cluster
 
Optimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsOptimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at Localytics
 
Taming Go's Memory Usage — and Avoiding a Rust Rewrite
Taming Go's Memory Usage — and Avoiding a Rust RewriteTaming Go's Memory Usage — and Avoiding a Rust Rewrite
Taming Go's Memory Usage — and Avoiding a Rust Rewrite
 
HBase: Extreme Makeover
HBase: Extreme MakeoverHBase: Extreme Makeover
HBase: Extreme Makeover
 
2009 Eclipse Con
2009 Eclipse Con2009 Eclipse Con
2009 Eclipse Con
 
CMF: a pain in the F @ PHPDay 05-14-2011
CMF: a pain in the F @ PHPDay 05-14-2011CMF: a pain in the F @ PHPDay 05-14-2011
CMF: a pain in the F @ PHPDay 05-14-2011
 
Java8 bench gc
Java8 bench gcJava8 bench gc
Java8 bench gc
 
Cassandra Anti-Patterns
Cassandra Anti-PatternsCassandra Anti-Patterns
Cassandra Anti-Patterns
 

Plus de Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers

  • 1. Avoiding Full GCs with MemStore-Local Allocation Buffers Todd Lipcon todd@cloudera.com Twitter: @tlipcon #hbase IRC: tlipcon February 22, 2011
  • 2. Outline Background HBase and GC A solution Summary
  • 3. Intro / who am I? Been working on data stuff for a few years HBase, HDFS, MR committer Cloudera engineer since March ’09
  • 4. Motivation
        HBase users want to use large heaps
            Bigger block caches make for better hit rates
            Bigger memstores make for larger and more efficient flushes
            Machines come with 24G-48G RAM
        But bigger heaps mean longer GC pauses
            Around 10 seconds/GB on my boxes.
            Multi-minute GC pauses wreak havoc
  • 5. GC Disasters
        1. Client requests stalled
            1 minute “latency” is just as bad as unavailability
        2. ZooKeeper sessions stop pinging
            The dreaded “Juliet Pause” scenario
        3. Triggers all kinds of other nasty bugs
  • 6. Yo Concurrent Mark-and-Sweep (CMS)! What part of Concurrent didn’t you understand?
  • 7. Java GC Background
        Java’s GC is generational
            Generational hypothesis: most objects either die young or stick around for quite a long time
            Split the heap into two “generations” - young (aka new) and old (aka tenured)
        Use different algorithms for the two generations
        We usually recommend -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
            Young generation: Parallel New collector
            Old generation: Concurrent-mark-sweep
  • 8. The Parallel New collector in 60 seconds
        Divide the young generation into eden, survivor-0, and survivor-1
        One survivor space is from-space and the other is to-space
        Allocate all objects in eden
        When eden fills up, stop the world and copy live objects from eden and from-space into to-space, swap from and to
            Once an object has been copied back and forth N times, copy it to the old generation
            N is the “Tenuring Threshold” (tunable)
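The copying scheme on this slide can be sketched as a toy model. This is only an illustration under simplifying assumptions: liveness is a boolean flag rather than reachability from roots, spaces have no size limit, and the names (`YoungGenSketch`, `Obj`, `minorGc`) are hypothetical, not HotSpot internals.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the young generation; it does not reflect how HotSpot is implemented.
class YoungGenSketch {
    static final class Obj {
        final String name;
        int copies = 0;       // times this object has been copied into a survivor space
        boolean live = true;  // stands in for "still reachable from a root"
        Obj(String name) { this.name = name; }
    }

    final int tenuringThreshold;            // the "N" from the slide
    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    YoungGenSketch(int tenuringThreshold) { this.tenuringThreshold = tenuringThreshold; }

    // All new objects start in eden.
    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    // One stop-the-world young collection.
    void minorGc() {
        List<Obj> scanned = new ArrayList<>(eden);
        scanned.addAll(from);
        for (Obj o : scanned) {
            if (!o.live) continue;                 // dead objects are simply not copied
            if (o.copies >= tenuringThreshold) {
                old.add(o);                        // copied N times already: promote
            } else {
                o.copies++;
                to.add(o);                         // copy into to-space
            }
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from; from = to; to = tmp; // swap from and to
    }

    public static void main(String[] args) {
        YoungGenSketch heap = new YoungGenSketch(2);
        heap.allocate("cached-block");
        Obj temp = heap.allocate("rpc-buffer");
        temp.live = false;                         // short-lived: dies before the first GC
        for (int i = 0; i < 3; i++) heap.minorGc();
        System.out.println("old gen holds " + heap.old.size() + " object(s)");
    }
}
```

With a threshold of 2, the long-lived object is copied twice between survivor spaces and promoted on the third collection, while the short-lived one never leaves eden.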
  • 9. The CMS collector in 60 seconds
        A bit simplified, sorry... Several phases:
            1. initial-mark (stop-the-world) - marks roots (e.g. thread stacks)
            2. concurrent-mark - traverse references starting at roots, marking what’s live
            3. concurrent-preclean - another pass of the same (catch new objects)
            4. remark (stop-the-world) - any last changed/new objects
            5. concurrent-sweep - clean up dead objects to update free space tracking
        Note: dead objects free up space, but it’s not contiguous. We’ll come back to this later!
  • 10. CMS failure modes
        1. When young generation collection happens, it needs space in the old gen. What if CMS is already in the middle of concurrent work, but there’s no space?
            The dreaded concurrent mode failure! Stop the world and collect.
            Solution: lower value of -XX:CMSInitiatingOccupancyFraction so CMS starts working earlier
        2. What if there’s space in the old generation, but not enough contiguous space to promote a large object?
            We need to compact the old generation (move all free space to be contiguous)
            This is also stop-the-world! Kaboom!
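The second failure mode comes down to the difference between total free space and the largest contiguous free block. A minimal sketch of that check, assuming the old generation's free space is given as a list of fragment sizes (the class and method names are hypothetical, and the real CMS free-list allocator is considerably more involved):

```java
import java.util.Arrays;
import java.util.List;

// Toy check for whether a promotion fits; not the actual CMS allocator logic.
class PromotionSketch {
    enum Outcome { FITS, NEEDS_COMPACTION, OUT_OF_SPACE }

    /** freeBlocks holds the sizes (in bytes) of the free fragments in the old generation. */
    static Outcome tryPromote(List<Long> freeBlocks, long objectSize) {
        long total = 0;
        long largest = 0;
        for (long block : freeBlocks) {
            total += block;
            largest = Math.max(largest, block);
        }
        if (largest >= objectSize) return Outcome.FITS;            // one fragment is big enough
        if (total >= objectSize) return Outcome.NEEDS_COMPACTION;  // enough space, but scattered
        return Outcome.OUT_OF_SPACE;
    }

    public static void main(String[] args) {
        // 2048 bytes free in total, but no single fragment larger than 768 bytes.
        List<Long> fragmented = Arrays.asList(512L, 256L, 768L, 512L);
        System.out.println(tryPromote(fragmented, 1024L));
    }
}
```

The `NEEDS_COMPACTION` case is the one that forces a stop-the-world compaction even though plenty of heap is nominally free.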
  • 11. OK... so life sucks. What can we do about it?
  • 12. Step 1. Hypothesize
        Setting the initiating occupancy fraction low puts off GC, but it eventually happens no matter what
        We see “promotion failed” followed by a long GC pause, even when 30% of the heap is free.
        Why? Must be fragmentation!
  • 13. Step 2. Measure
        Let’s make some graphs:
            -XX:PrintFLSStatistics=1 -XX:PrintCMSStatistics=1
            -XX:+PrintGCDetails -XX:+PrintGCDateStamps -verbose:gc
            -Xloggc:/.../logs/gc-$(hostname).log
        FLS Statistics: verbose information about the state of the free space inside the old generation
            Free space - total amount of free space
            Num blocks - number of fragments it’s spread into
            Max chunk size
        parse-fls-statistics.py → R and ggplot2
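The parsing script itself is not shown in the slides. As a rough illustration of the idea, the sketch below pulls the three metrics out of GC log text. It assumes the free-list statistics appear as lines labelled `Total Free Space:`, `Max   Chunk Size:` and `Number of Blocks:` followed by an integer; check your JVM's actual log output before relying on this. The class and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of extracting free-list statistics from GC log lines (log layout is an assumption).
class FlsStatsSketch {
    static final class Sample {
        long totalFree = -1;
        long maxChunk = -1;
        long numBlocks = -1;
    }

    private static final Pattern FREE = Pattern.compile("Total Free Space:\\s*(\\d+)");
    private static final Pattern MAX = Pattern.compile("Max\\s+Chunk\\s+Size:\\s*(\\d+)");
    private static final Pattern BLOCKS = Pattern.compile("Number of Blocks:\\s*(\\d+)");

    static List<Sample> parse(List<String> lines) {
        List<Sample> samples = new ArrayList<>();
        Sample cur = null;
        for (String line : lines) {
            Matcher m;
            if ((m = FREE.matcher(line)).find()) {
                cur = new Sample();                  // a "Total Free Space" line starts a new sample
                cur.totalFree = Long.parseLong(m.group(1));
                samples.add(cur);
            } else if (cur != null && (m = MAX.matcher(line)).find()) {
                cur.maxChunk = Long.parseLong(m.group(1));
            } else if (cur != null && (m = BLOCKS.matcher(line)).find()) {
                cur.numBlocks = Long.parseLong(m.group(1));
            }
        }
        return samples;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        log.add("Total Free Space: 40115394");
        log.add("Max   Chunk Size: 38808526");
        log.add("Number of Blocks: 1360");
        // Emit CSV rows that a plotting tool could consume.
        System.out.println("total_free,max_chunk,num_blocks");
        for (Sample s : parse(log)) {
            System.out.println(s.totalFree + "," + s.maxChunk + "," + s.numBlocks);
        }
    }
}
```

A falling max chunk size alongside a rising block count, while total free space stays roughly flat, is the signature of fragmentation that the graphs are meant to expose.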
  • 14. 3 YCSB workloads, graphed
  • 17. Workload 3
        Read-only with no cache churn
        So boring I didn’t make a graph!
        All allocations are short-lived → stay in young gen
  • 18. Recap
        What have we learned?
        Fragmentation is what causes long GC pauses
        Write load seems to cause fragmentation
        Read load (LRU cache churn) isn’t nearly so bad (at least for my test workloads)
  • 28. Taking a step back
        Why does write load cause fragmentation?
        Imagine we have 5 regions, A through E
        We take writes in the following order into an empty old generation:
            ABCDEABCEDDAECBACEBCED
        Now B’s memstore fills up and flushes. We’re left with:
            A CDEA CEDDAEC ACE CED
        Looks like fragmentation!
  • 29. Also known as swiss cheese
        If every write is exactly the same size, it’s fine - we’ll fill in those holes.
        But this is seldom true.
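The five-region example can be replayed in a few lines. This sketch treats the old generation as a string in which each character is one equally sized write and a space is a freed slot; the helper names are hypothetical and the model ignores real object sizes.

```java
// Replays the A-through-E write sequence and measures the holes left by flushing one region.
class SwissCheeseSketch {
    /** Frees every slot owned by the flushed region by replacing it with a space. */
    static String flush(String heap, char region) {
        return heap.replace(region, ' ');
    }

    /** Total number of free slots. */
    static int freeSlots(String heap) {
        int n = 0;
        for (char c : heap.toCharArray()) {
            if (c == ' ') n++;
        }
        return n;
    }

    /** Length of the longest run of adjacent free slots. */
    static int largestHole(String heap) {
        int best = 0;
        int run = 0;
        for (char c : heap.toCharArray()) {
            run = (c == ' ') ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String heap = flush("ABCDEABCEDDAECBACEBCED", 'B');
        System.out.println("[" + heap + "]");
        System.out.println("free slots:   " + freeSlots(heap));
        System.out.println("largest hole: " + largestHole(heap));
    }
}
```

Flushing B frees four slots, but no two of them are adjacent, so a write even twice the size of the originals cannot be placed without compaction.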
  • 30. A solution
        The crucial issue is that memory allocations for a given memstore aren’t next to each other in the old generation.
        When we free an entire memstore we only get tiny blocks of free space
        What if we ensure that the memory for a memstore is made of large blocks?
        Enter the MemStore Local Allocation Buffer (MSLAB)
  • 31. What’s an MSLAB?
        Each MemStore has an instance of MemStoreLAB.
        MemStoreLAB has a 2MB curChunk with nextFreeOffset starting at 0.
        Before inserting a KeyValue that points to some byte[], copy the data into curChunk and increment nextFreeOffset by data.length
        Insert a KeyValue pointing inside curChunk instead of the original data.
        If a chunk fills up, just make a new one.
        This is all lock-free, using atomic compare-and-swap instructions.
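The mechanism on this slide can be sketched with `AtomicInteger` and `AtomicReference`. This is a simplified illustration, not the HBase source: `curChunk` and `nextFreeOffset` are the names from the slide, while `MslabSketch`, `Chunk`, `Allocation`, `reserve` and `copyIn` are hypothetical. The sketch also assumes that values above the 256K max-allocation default are left out of the buffer, which is how that limit is understood to work.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Simplified sketch of a MemStore-local allocation buffer; not the HBase implementation.
class MslabSketch {
    static final int CHUNK_SIZE = 2 * 1024 * 1024; // 2MB chunk, as on the slide
    static final int MAX_ALLOC = 256 * 1024;       // assumed cutoff: larger values are not copied

    /** One fixed-size chunk with a bump pointer. */
    static final class Chunk {
        final byte[] data = new byte[CHUNK_SIZE];
        final AtomicInteger nextFreeOffset = new AtomicInteger(0);

        /** Reserves size bytes; returns the start offset, or -1 if the chunk is full. */
        int reserve(int size) {
            while (true) {
                int oldOffset = nextFreeOffset.get();
                if (oldOffset + size > data.length) return -1;
                // CAS: only one thread can move the pointer from oldOffset.
                if (nextFreeOffset.compareAndSet(oldOffset, oldOffset + size)) return oldOffset;
            }
        }
    }

    /** Where a copied value lives inside a chunk. */
    static final class Allocation {
        final Chunk chunk;
        final int offset;
        final int length;
        Allocation(Chunk chunk, int offset, int length) {
            this.chunk = chunk;
            this.offset = offset;
            this.length = length;
        }
    }

    private final AtomicReference<Chunk> curChunk = new AtomicReference<>(new Chunk());

    /** Copies value into the current chunk; returns null if the value is too large to buffer. */
    Allocation copyIn(byte[] value) {
        if (value.length > MAX_ALLOC) return null; // caller keeps using the original array
        while (true) {
            Chunk c = curChunk.get();
            int offset = c.reserve(value.length);
            if (offset >= 0) {
                System.arraycopy(value, 0, c.data, offset, value.length);
                return new Allocation(c, offset, value.length);
            }
            // Chunk is full: try to install a fresh one. If another thread already
            // replaced it, the CAS fails and the loop retries with that thread's chunk.
            curChunk.compareAndSet(c, new Chunk());
        }
    }

    public static void main(String[] args) {
        MslabSketch lab = new MslabSketch();
        Allocation first = lab.copyIn("row1/value".getBytes());
        Allocation second = lab.copyIn("row2/value".getBytes());
        System.out.println("first at offset " + first.offset + ", second at offset " + second.offset);
    }
}
```

A thread that loses the race to install a new chunk allocates a 2MB array it never uses; that waste is accepted here to keep the sketch short.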
  • 32. How does this help?
        The original data to be inserted becomes very short-lived, and dies in the young generation.
        The only data in the old generation is made of 2MB chunks
        Each chunk only belongs to one memstore.
        When we flush, we always free up 2MB chunks, and avoid the swiss cheese effect.
        Next time we allocate, we need exactly 2MB chunks again, and there will definitely be space.
  • 34. It works! Have seen basically zero full GCs with MSLAB enabled, after days of load testing
  • 35. Summary
        Most GC pauses are caused by fragmentation in the old generation.
        The CMS collector doesn’t compact, so the only way it can fight fragmentation is to pause.
        The MSLAB moves all MemStore allocations into contiguous 2MB chunks in the old generation.
        No more GC pauses!
  • 36. How to try it
        1. Upgrade to HBase 0.90.1 (included in CDH3b4)
        2. Set hbase.hregion.memstore.mslab.enabled to true
            Also tunable:
            hbase.hregion.memstore.mslab.chunksize (in bytes, default 2M)
            hbase.hregion.memstore.mslab.max.allocation (in bytes, default 256K)
        3. Report back your results!
  • 37. Future work
        Flat 2MB chunk per region → 2GB RAM minimum usage for 1000 regions
        incrementColumnValue currently bypasses MSLAB for subtle reasons
        We’re doing an extra memory copy into MSLAB chunk - we can optimize this out
        Maybe we can relax CMSInitiatingOccupancyFraction back up a bit?
  • 38. So I don’t forget... Corporate shill time Cloudera offering HBase training on March 10th. 15 percent off with hbase meetup code.
  • 39. todd@cloudera.com Twitter: @tlipcon #hbase IRC: tlipcon P.S. we’re hiring!