1
Java In-Process Caching
Performance, Progress and Pitfalls
Tuesday, May 21, 2019
19th Software Performance Meetup, Munich
Jens Wilke
2
Links - Disclaimer - Copyright
talk slides and diagram source data
https://github.com/cruftex/talk-java-in-process-caching-performance-progress-pitfalls
used benchmarks
https://github.com/cache2k/cache2k-benchmark
disclaimer
No guarantees at all. Do not sue me for any mistake,
instead, send a pull request and correct it!
Copyright: Creative Commons Attribution
CC BY 4.0
3
About Me
● Performance Fan(atic)
● Java Hacker since 1998
● Author of cache2k
● 70+ answered questions on
StackOverflow about Caching
● JCache / JSR107 Contributor
Jens Wilke
@cruftex
cruftex.net
4
Up Next
Java In-Process Caching What and Why
5
Example 1: Geolocation Lookup
48° 4' 28" N 11° 40' 17" E
6
Example 2: Date Formatting
7
Expensive Operations per Web Request
How often is an operation executed per web
request or user interaction?
● Less than once:
e.g. initialization on startup
● Exactly once:
e.g. fetch data and resolve geolocation
● More than once:
e.g. render a time or date
[Diagram: scale of executions per request (0, 1x, 10x) with the three cases marked]
8
Reduce Expensive Operations
Cache:
Fewer executions per web request or user interaction
[Diagram: scale of executions per request (0, 1, 10x); caching moves operations towards fewer executions]
9
(Java) Caching
Technical:
● temporary data storage to serve requests faster
● reduce expensive operations at the cost of storage
● a tool to tune the space-time tradeoff
Benefits:
● lower latency and improve UX
● if not for great UX, let's save computing costs!
10
Java In-Process Caching
Technical:
● temporary data storage to serve requests faster
● reduce expensive operations at the cost of heap memory
● keep data as close to the CPU as possible
● a tool to tune the space-time tradeoff
Benefits:
● lower latency much more and improve UX
● if not for great UX, let's save more computing costs!
11
Constructing a Java In-Process Cache
Interface:
The interface of a cache is similar
(sometimes identical) to a Java Map:
cache.put(key, value);
value = cache.get(key);
Implementation:
● A hash table
● An eviction strategy
– to limit the used memory
– but keep data that is "hot"
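A minimal sketch of how the put/get interface is typically used, applied to the date formatting example from earlier. The class name `CacheAsideExample`, the `formatCalls` counter and the use of a plain `ConcurrentHashMap` as a stand-in for a real cache are assumptions for illustration; there is no eviction here, and the counter is only meaningful in this single-threaded demo.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CacheAsideExample {
    // Stand-in for a real cache: a concurrent map without any eviction.
    private final Map<LocalDate, String> cache = new ConcurrentHashMap<>();
    private final DateTimeFormatter formatter = DateTimeFormatter.ISO_LOCAL_DATE;
    // Counts how often the expensive operation actually runs (demo only, not thread-safe).
    int formatCalls = 0;

    public String format(LocalDate date) {
        String value = cache.get(date);       // value = cache.get(key)
        if (value == null) {                  // miss: do the expensive work once
            formatCalls++;
            value = formatter.format(date);
            cache.put(date, value);           // cache.put(key, value)
        }
        return value;
    }

    public static void main(String[] args) {
        CacheAsideExample example = new CacheAsideExample();
        LocalDate date = LocalDate.of(2019, 5, 21);
        for (int i = 0; i < 10; i++) {
            example.format(date);
        }
        // prints "2019-05-21 formatted 1 time(s)"
        System.out.println(example.format(date) + " formatted " + example.formatCalls + " time(s)");
    }
}
```

Without an eviction strategy the map grows without bound, which is why a cache needs more than a hash table.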
12
Up Next
Benchmark a Simple Cache
13
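// Excerpt from the benchmark class; BenchmarkCache, getFactory(), PATTERN_COUNT
// and ThreadState are defined elsewhere in the benchmark sources.
@Param({"100000"})
public int entryCount = 100 * 1000;
BenchmarkCache<Integer, Integer> cache;
Integer[] ints;

@Setup
public void setup() throws Exception {
    // (1) create the cache via a wrapper that adapts to different implementations
    cache = getFactory().create(entryCount);
    // (2) pre-compute random Integer keys in the range [0, entryCount)
    ints = new Integer[PATTERN_COUNT];
    RandomGenerator generator =
        new XorShift1024StarRandomGenerator(1802);
    for (int i = 0; i < PATTERN_COUNT; i++) {
        ints[i] = generator.nextInt(entryCount);
    }
    // (3) fill the cache once; this is not part of the measurement
    for (int i = 0; i < entryCount; i++) {
        cache.put(i, i);
    }
}

@Benchmark @BenchmarkMode(Mode.Throughput)
public long read(ThreadState threadState) {
    // (4) one cache read with a pre-computed random key
    int idx = (int) (threadState.index++ % PATTERN_COUNT);
    return cache.get(ints[idx]);
}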
Read Only JMH Benchmark
1) create a cache, via a wrapper to adapt to different implementations
2) create an array with random Integer objects; the value range does not exceed the entry count
3) fill the cache once, not part of the benchmark
4) the benchmark operation does one cache read with a random key
The benchmark only reads from a cache that is filled with data initially.
No eviction takes place, so we can compare the read throughput with a
(concurrent) hash table.
14
Benchmark Parameters
Hardware:
● CPU: Intel(R) Xeon(R) CPU E3-1240 v5 @ 3.50GHz, 4 physical cores
● Benchmarks are run with different numbers of cores via Linux CPU hotplugging
● Oracle JVM 1.8.0-131, JMH 1.18
● Ubuntu 14.04, 4.4.0-137-generic
Cache versions:
● Google Guava Cache, Version 26
● Caffeine, Version 2.6.2
● cache2k, Version 1.2.0.Final
● EHCache, Version 3.6.1
JMH parameters:
● 2 forks, 2 warmup iterations, 3 measurement iterations, 15 second iteration time
● => 6 measurement iterations
15
Results
[Chart: read throughput in ops/s (Y axis, 0 to 180M) by number of threads (X axis, labeled threads-size-hitRate)]
● CHM: ConcurrentHashMap
● SLHM: Simple Java implementation with LRU via LinkedHashMap
● Guava: Google Guava Cache
● PLHM: Simple Java implementation with LRU via LinkedHashMap and segmentation
Especially when multi-threaded, the ConcurrentHashMap is much faster than a cache.
16
Up Next
LRU = Least Recently Used
17
LRU List Operations
cache operation | list operation | list (head to tail)
1. put (1, x) | insert new head | 1
2. put (2, x) | insert new head | 2 1
3. get (1) | move to front | 1 2
4. put (3, x) | insert new head | 3 1 2
5. put (4, x) | remove tail (key 2), insert new head | 4 3 1
Doubly linked list with three entries
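The five operations on this slide can be replayed with a `LinkedHashMap` in access order, as a rough sketch. The class name `LruListDemo` and the capacity of three are assumptions taken from the slide's example; `LinkedHashMap` iterates from eldest to newest, so the key list is reversed to read head to tail.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LruListDemo {
    /** Replays the five operations with capacity 3; returns the keys from head (most recent) to tail. */
    static List<Integer> replay() {
        final int capacity = 3;
        // accessOrder = true: get() moves the entry to the most recently used end
        Map<Integer, String> lru = new LinkedHashMap<Integer, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                return size() > capacity;
            }
        };
        lru.put(1, "x");  // insert new head
        lru.put(2, "x");  // insert new head
        lru.get(1);       // move to front
        lru.put(3, "x");  // insert new head
        lru.put(4, "x");  // insert new head, remove tail (key 2)
        List<Integer> headToTail = new ArrayList<>(lru.keySet()); // eldest first
        Collections.reverse(headToTail);
        return headToTail;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints [4, 3, 1]
    }
}
```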
18
LRU Properties
Cool...
● Simple and smart algorithm for eviction (or replacement)
● Everybody knows it from CS, "eviction = LRU"
...but:
● List operations need synchronization
● A cache read means rewriting references in 4 objects, most likely touching 4 different CPU cache lines
● A read operation (happens often!) is more expensive than an eviction (happens not so often!)
● LRU is not scan resistant; scans wipe out the working set in the cache
● Infrequently accessed objects take a long time to be evicted
19
LRU Alternatives?
We look for:
● Reduce CPU cycles for the read operation
● Do more costly operations later when we need to evict
● Also take frequency into account, keeping more frequently accessed objects longer
Lots of research:
Overview at Wikipedia:Page_replacement_algorithm
20
Up Next
Clock / Clock-Pro Eviction
21
Clock
[Diagram: cache entries arranged in a circle, each with a reference bit (0 or 1), and a clock hand]
● Each cache entry has a reference bit which indicates whether the entry was accessed
● Access: sets the reference bit
● Eviction, scan at clock hand:
– Not referenced? Evict!
– Referenced? Clear the reference bit and move to the next entry
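A minimal sketch of the Clock scheme described on this slide, assuming a fixed capacity of at least one entry and single-threaded use. The class `ClockCache` and its field names are hypothetical and not taken from any of the benchmarked libraries; new entries start with a cleared reference bit, which is one possible choice.

```java
import java.util.HashMap;
import java.util.Map;

public class ClockCache<K, V> {
    private final Object[] keys;
    private final Object[] values;
    private final boolean[] referenced;              // one reference bit per slot
    private final Map<K, Integer> slotOfKey = new HashMap<>();
    private int hand = 0;                            // position of the clock hand
    private int size = 0;

    public ClockCache(int capacity) {                // assumes capacity >= 1
        keys = new Object[capacity];
        values = new Object[capacity];
        referenced = new boolean[capacity];
    }

    @SuppressWarnings("unchecked")
    public V get(K key) {
        Integer slot = slotOfKey.get(key);
        if (slot == null) {
            return null;
        }
        referenced[slot] = true;                     // access: only sets the reference bit
        return (V) values[slot];
    }

    public void put(K key, V value) {
        Integer slot = slotOfKey.get(key);
        if (slot == null) {
            // use a free slot while there is one, otherwise evict at the clock hand
            slot = size < keys.length ? size++ : evict();
            keys[slot] = key;
            slotOfKey.put(key, slot);
            referenced[slot] = false;
        }
        values[slot] = value;
    }

    private int evict() {
        // referenced entries get a second chance: clear the bit and move on
        while (referenced[hand]) {
            referenced[hand] = false;
            hand = (hand + 1) % keys.length;
        }
        int victim = hand;                           // not referenced: evict
        slotOfKey.remove(keys[victim]);
        hand = (hand + 1) % keys.length;
        return victim;
    }

    public boolean containsKey(K key) {
        return slotOfKey.containsKey(key);
    }

    public static void main(String[] args) {
        ClockCache<Integer, String> cache = new ClockCache<>(3);
        cache.put(1, "a");
        cache.put(2, "b");
        cache.put(3, "c");
        cache.get(1);          // sets the reference bit of key 1
        cache.put(4, "d");     // hand skips key 1 and evicts key 2
        System.out.println(cache.containsKey(1) + " " + cache.containsKey(2)); // prints "true false"
    }
}
```

Note that `get` touches a single bit and never relinks a list, which is the point of the scheme: the bookkeeping cost is deferred to eviction time.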
22
Clock-Pro
● Extra clock for hot data
● History of recently evicted keys
● cache2k: uses a reference counter instead of a reference bit
[Diagram: three clocks labeled hot, cold and history; entries carry reference counters]
Faster:
cache access is tracked by setting a bit or incrementing a counter
23
Up Next
Will it Blend?
(more Benchmarks...)
24
Results
● Google Guava Cache and EHCache3 are slow in comparison to the ConcurrentHashMap
● Caffeine is faster, if there are sufficient CPU cores/threads
● cache2k is fastest, at about half the speed of the ConcurrentHashMap
[Chart: read throughput in ops/s (0 to 180M) for 1 to 4 threads, 100K entries, 100% hit rate (X axis labeled threads-size-hitRate); series: cache2k, Caffeine, Guava, EhCache3, CHM]
25
Up Next
What about eviction efficiency?
26
Benchmarking Eviction Quality
● Collect access sequences (traces)
● Replay the access sequence on a cache and count
hits and misses
More information about the traces in the blog article:
https://cruftex.net/2016/05/09/Java-Caching-Benchmarks-2016-Part-2.html
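Replaying a trace and counting hits can be sketched as follows. This is only an illustration using LRU via `LinkedHashMap`, not the benchmark code from the repository; the class name `TraceReplay`, the method `lruHitRate` and the tiny hand-made trace are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TraceReplay {
    /** Replays the trace against an LRU cache of the given size and returns the hit rate in percent. */
    static double lruHitRate(int[] trace, final int cacheSize) {
        Map<Integer, Boolean> cache = new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                return size() > cacheSize;
            }
        };
        long hits = 0;
        for (int key : trace) {
            if (cache.get(key) != null) {
                hits++;                          // hit: get() also refreshes the LRU position
            } else {
                cache.put(key, Boolean.TRUE);    // miss: insert, possibly evicting the eldest entry
            }
        }
        return trace.length == 0 ? 0.0 : 100.0 * hits / trace.length;
    }

    public static void main(String[] args) {
        int[] trace = {1, 2, 1, 3, 1, 2, 4, 1};
        System.out.println(lruHitRate(trace, 2)); // prints 25.0
        System.out.println(lruHitRate(trace, 3)); // prints 50.0
    }
}
```

Running the same trace against different cache sizes and eviction policies yields the kind of hit rate comparison shown in the following charts.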
27
Zipf10k
[Chart: hit rate in percent (0 to 100) at cache sizes 500, 2000 and 8000; series: OPT, LRU, CLOCK, EHCache3, Guava, Caffeine, cache2k, RAND]
28
OrmAccessBusytime
[Chart: hit rate in percent (0 to 100) at cache sizes 1250, 2500 and 5000; series: OPT, LRU, CLOCK, EHCache3, Guava, Caffeine, cache2k, RAND]
29
UMassWebsearch1
[Chart: hit rate in percent (0 to 60) at cache sizes 100000, 200000 and 300000; series: OPT, LRU, CLOCK, EHCache3, Guava, Caffeine, cache2k, RAND]
30
Results
● Eviction improvements in Caffeine and cache2k
● Varying results depending on access sequences / workloads
31
But… Isn‘t Clock O(n)?!
Academic objection:
Time for eviction grows linearly with
the cache size
Yes, in theory…
32
cache2k and Clock-Pro
Improved algorithm:
● Uses a counter instead of a reference bit
● Heuristics to reduce intensive scanning if not much is gained
Battle tested:
● A test with a random sequence at 80% hit rate results in the following average entry scan counts:
– 100K entries: 6.00398
– 1M entries: 6.00463
– 10M entries: 6.00482
=> Little increase, but practically irrelevant.
33
Technical Overview I
Feature | Guava | Caffeine | EHCache3 | cache2k
Latest version | 26 | 2.6.2 | 3.6.1 | 1.2.0.Final
JDK compatibility | 8+ | 8+ | 8+ | 6+
sun.misc.Unsafe | - | X | X | -
Hash implementation | own | JVM CHM | old CHM | own
Single object per entry | - | - | - | X
Treeification of collisions | - | X | - | -
Metrics for hash collisions | - | - | - | X
Key mutation detection | - | - | - | X
34
Technical Overview II
Feature | Guava | Caffeine | EHCache3 | cache2k
Eviction algorithm | Q + LRU | Q + W-TinyLFU | "Scan8" | Clock-Pro+
Lock-free cache hit | Lock free | Lock + wait free | Lock free | Lock + wait free
Limit by count | X | X | X | X
Limit by memory size | - | - | X | -
Weigher | X | X | - | X
JCache / JSR107 | - | X | X | X
Separate API jar | - | - | - | X
35
Try cache2k?
● Open Source, Apache 2 License
● On Maven Central
● Info and User Guide at: https://cache2k.org
● JCache support
● Compatible with Android or pure Java
● Runs with Hibernate, Spring, DataNucleus
● Compatible with Java 8, 9, 10, 11, 12, … because it uses no sun.misc.Unsafe magic
36
Summary
● LRU is simple but outdated
● Caffeine and cache2k use modern eviction algorithms and have
(mostly) better eviction efficiency than LRU
● Caffeine likes to have cores
● EHCache3 likes to have memory
● cache2k optimizes on a fast/fastest access path for a „cache hit“ while
having reasonable eviction efficiency
● Modern hardware needs modern algorithms
● Faster caches allow more fine-grained caching
37
Keep Tuning! Questions?
Jens Wilke
@cruftex
cruftex.net
38
Up Next
Appendix / Backup Slides
39
Simple Cache – Part I
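import java.util.LinkedHashMap;
import java.util.Map;

public class LinkedHashMapCache<K,V>
    extends LinkedHashMap<K,V> {
    private final int cacheSize;
    public LinkedHashMapCache(int cacheSize) {
        // accessOrder = true: entries are ordered from least to most recently accessed (LRU)
        super(16, 0.75F, true);
        this.cacheSize = cacheSize;
    }
    // Called after each insertion; evict the eldest entry once the limit is exceeded.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > cacheSize;
    }
}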
40
Simple Cache – Part II – Thread Safety
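public class SynchronizedLinkedHashMapCache<K,V> {
    private final LinkedHashMapCache<K,V> backingMap;
    public SynchronizedLinkedHashMapCache(int cacheSize) {
        backingMap = new LinkedHashMapCache<>(cacheSize);
    }
    public void put(K key, V value) {
        synchronized (backingMap) {
            backingMap.put(key, value);
        }
    }
    // get() needs the lock too: in access order, a read relinks the list
    public V get(K key) {
        synchronized (backingMap) {
            return backingMap.get(key);
        }
    }
}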
41
Simple Cache – Part III –
Partitioning/Segmentation
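public class PartitionedLinkedHashMapCache<K,V> {
    private static final int PARTS = 4;
    private static final int MASK = PARTS - 1; // PARTS must be a power of two
    @SuppressWarnings("unchecked")
    private final LinkedHashMapCache<K, V>[] backingMaps =
        new LinkedHashMapCache[PARTS];
    public PartitionedLinkedHashMapCache(int cacheSize) {
        // assumption: each partition gets an equal share of the total capacity
        for (int i = 0; i < PARTS; i++) {
            backingMaps[i] = new LinkedHashMapCache<>(cacheSize / PARTS);
        }
    }
    public void put(K key, V value) {
        // the low bits of the hash code select the partition
        LinkedHashMapCache<K, V> backingMap = backingMaps[key.hashCode() & MASK];
        synchronized (backingMap) {
            backingMap.put(key, value);
        }
    }
    public V get(K key) {
        LinkedHashMapCache<K, V> backingMap = backingMaps[key.hashCode() & MASK];
        synchronized (backingMap) {
            return backingMap.get(key);
        }
    }
}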

Contenu connexe

Tendances

Mastering java in containers - MadridJUG
Mastering java in containers - MadridJUGMastering java in containers - MadridJUG
Mastering java in containers - MadridJUGJorge Morales
 
GitLab PostgresMortem: Lessons Learned
GitLab PostgresMortem: Lessons LearnedGitLab PostgresMortem: Lessons Learned
GitLab PostgresMortem: Lessons LearnedAlexey Lesovsky
 
Monitoring with Prometheus
Monitoring with PrometheusMonitoring with Prometheus
Monitoring with PrometheusShiao-An Yuan
 
Become a GC Hero
Become a GC HeroBecome a GC Hero
Become a GC HeroTier1app
 
Become a Garbage Collection Hero
Become a Garbage Collection HeroBecome a Garbage Collection Hero
Become a Garbage Collection HeroTier1app
 
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015PostgreSQL-Consulting
 
Troubleshooting PostgreSQL Streaming Replication
Troubleshooting PostgreSQL Streaming ReplicationTroubleshooting PostgreSQL Streaming Replication
Troubleshooting PostgreSQL Streaming ReplicationAlexey Lesovsky
 
Lets crash-applications
Lets crash-applicationsLets crash-applications
Lets crash-applicationsTier1 app
 
Advanced Postgres Monitoring
Advanced Postgres MonitoringAdvanced Postgres Monitoring
Advanced Postgres MonitoringDenish Patel
 
Lets crash-applications
Lets crash-applicationsLets crash-applications
Lets crash-applicationsTier1 app
 
Мастер-класс "Логическая репликация и Avito" / Константин Евтеев, Михаил Тюр...
Мастер-класс "Логическая репликация и Avito" / Константин Евтеев,  Михаил Тюр...Мастер-класс "Логическая репликация и Avito" / Константин Евтеев,  Михаил Тюр...
Мастер-класс "Логическая репликация и Avito" / Константин Евтеев, Михаил Тюр...Ontico
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Alexey Lesovsky
 
16 artifacts to capture when there is a production problem
16 artifacts to capture when there is a production problem16 artifacts to capture when there is a production problem
16 artifacts to capture when there is a production problemTier1 app
 
collectd & PostgreSQL
collectd & PostgreSQLcollectd & PostgreSQL
collectd & PostgreSQLMark Wong
 
Backing up Wikipedia Databases
Backing up Wikipedia DatabasesBacking up Wikipedia Databases
Backing up Wikipedia DatabasesJaime Crespo
 
Go Programming Patterns
Go Programming PatternsGo Programming Patterns
Go Programming PatternsHao Chen
 
PostgreSQL Streaming Replication Cheatsheet
PostgreSQL Streaming Replication CheatsheetPostgreSQL Streaming Replication Cheatsheet
PostgreSQL Streaming Replication CheatsheetAlexey Lesovsky
 

Tendances (20)

Mastering java in containers - MadridJUG
Mastering java in containers - MadridJUGMastering java in containers - MadridJUG
Mastering java in containers - MadridJUG
 
GitLab PostgresMortem: Lessons Learned
GitLab PostgresMortem: Lessons LearnedGitLab PostgresMortem: Lessons Learned
GitLab PostgresMortem: Lessons Learned
 
Monitoring with Prometheus
Monitoring with PrometheusMonitoring with Prometheus
Monitoring with Prometheus
 
Pgcenter overview
Pgcenter overviewPgcenter overview
Pgcenter overview
 
Become a GC Hero
Become a GC HeroBecome a GC Hero
Become a GC Hero
 
Become a Garbage Collection Hero
Become a Garbage Collection HeroBecome a Garbage Collection Hero
Become a Garbage Collection Hero
 
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
 
Troubleshooting PostgreSQL Streaming Replication
Troubleshooting PostgreSQL Streaming ReplicationTroubleshooting PostgreSQL Streaming Replication
Troubleshooting PostgreSQL Streaming Replication
 
Lets crash-applications
Lets crash-applicationsLets crash-applications
Lets crash-applications
 
Advanced Postgres Monitoring
Advanced Postgres MonitoringAdvanced Postgres Monitoring
Advanced Postgres Monitoring
 
Lets crash-applications
Lets crash-applicationsLets crash-applications
Lets crash-applications
 
PostgreSQL Replication Tutorial
PostgreSQL Replication TutorialPostgreSQL Replication Tutorial
PostgreSQL Replication Tutorial
 
Мастер-класс "Логическая репликация и Avito" / Константин Евтеев, Михаил Тюр...
Мастер-класс "Логическая репликация и Avito" / Константин Евтеев,  Михаил Тюр...Мастер-класс "Логическая репликация и Avito" / Константин Евтеев,  Михаил Тюр...
Мастер-класс "Логическая репликация и Avito" / Константин Евтеев, Михаил Тюр...
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.
 
16 artifacts to capture when there is a production problem
16 artifacts to capture when there is a production problem16 artifacts to capture when there is a production problem
16 artifacts to capture when there is a production problem
 
collectd & PostgreSQL
collectd & PostgreSQLcollectd & PostgreSQL
collectd & PostgreSQL
 
Backing up Wikipedia Databases
Backing up Wikipedia DatabasesBacking up Wikipedia Databases
Backing up Wikipedia Databases
 
Go Programming Patterns
Go Programming PatternsGo Programming Patterns
Go Programming Patterns
 
Tools for Metaspace
Tools for MetaspaceTools for Metaspace
Tools for Metaspace
 
PostgreSQL Streaming Replication Cheatsheet
PostgreSQL Streaming Replication CheatsheetPostgreSQL Streaming Replication Cheatsheet
PostgreSQL Streaming Replication Cheatsheet
 

Similaire à Java In-Process Caching - Performance, Progress and Pitfalls

cache2k, Java Caching, Turbo Charged, FOSDEM 2015
cache2k, Java Caching, Turbo Charged, FOSDEM 2015cache2k, Java Caching, Turbo Charged, FOSDEM 2015
cache2k, Java Caching, Turbo Charged, FOSDEM 2015cruftex
 
Distributed caching and computing v3.7
Distributed caching and computing v3.7Distributed caching and computing v3.7
Distributed caching and computing v3.7Rahul Gupta
 
Think Distributed: The Hazelcast Way
Think Distributed: The Hazelcast WayThink Distributed: The Hazelcast Way
Think Distributed: The Hazelcast WayRahul Gupta
 
Distributed caching-computing v3.8
Distributed caching-computing v3.8Distributed caching-computing v3.8
Distributed caching-computing v3.8Rahul Gupta
 
Less and faster – Cache tips for WordPress developers
Less and faster – Cache tips for WordPress developersLess and faster – Cache tips for WordPress developers
Less and faster – Cache tips for WordPress developersSeravo
 
Java и Linux — особенности эксплуатации / Алексей Рагозин (Дойче Банк)
Java и Linux — особенности эксплуатации / Алексей Рагозин (Дойче Банк)Java и Linux — особенности эксплуатации / Алексей Рагозин (Дойче Банк)
Java и Linux — особенности эксплуатации / Алексей Рагозин (Дойче Банк)Ontico
 
Oracle Database In-Memory Option in Action
Oracle Database In-Memory Option in ActionOracle Database In-Memory Option in Action
Oracle Database In-Memory Option in ActionTanel Poder
 
In Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry OsborneIn Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry OsborneEnkitec
 
Caching and tuning fun for high scalability
Caching and tuning fun for high scalabilityCaching and tuning fun for high scalability
Caching and tuning fun for high scalabilityWim Godden
 
Java on Linux for devs and ops
Java on Linux for devs and opsJava on Linux for devs and ops
Java on Linux for devs and opsaragozin
 
Solr Troubleshooting - TreeMap approach
Solr Troubleshooting - TreeMap approachSolr Troubleshooting - TreeMap approach
Solr Troubleshooting - TreeMap approachAlexandre Rafalovitch
 
Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...
Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...
Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...Lucidworks
 
cache concepts and varnish-cache
cache concepts and varnish-cachecache concepts and varnish-cache
cache concepts and varnish-cacheMarc Cortinas Val
 
Software Frameworks for Deep Learning (D1L7 2017 UPC Deep Learning for Comput...
Software Frameworks for Deep Learning (D1L7 2017 UPC Deep Learning for Comput...Software Frameworks for Deep Learning (D1L7 2017 UPC Deep Learning for Comput...
Software Frameworks for Deep Learning (D1L7 2017 UPC Deep Learning for Comput...Universitat Politècnica de Catalunya
 
Clug 2011 March web server optimisation
Clug 2011 March  web server optimisationClug 2011 March  web server optimisation
Clug 2011 March web server optimisationgrooverdan
 
Deep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instancesDeep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instancesAmazon Web Services
 

Similaire à Java In-Process Caching - Performance, Progress and Pitfalls (20)

cache2k, Java Caching, Turbo Charged, FOSDEM 2015
cache2k, Java Caching, Turbo Charged, FOSDEM 2015cache2k, Java Caching, Turbo Charged, FOSDEM 2015
cache2k, Java Caching, Turbo Charged, FOSDEM 2015
 
Distributed caching and computing v3.7
Distributed caching and computing v3.7Distributed caching and computing v3.7
Distributed caching and computing v3.7
 
Think Distributed: The Hazelcast Way
Think Distributed: The Hazelcast WayThink Distributed: The Hazelcast Way
Think Distributed: The Hazelcast Way
 
Distributed caching-computing v3.8
Distributed caching-computing v3.8Distributed caching-computing v3.8
Distributed caching-computing v3.8
 
JCache Using JCache
JCache Using JCacheJCache Using JCache
JCache Using JCache
 
Cache is King
Cache is KingCache is King
Cache is King
 
Less and faster – Cache tips for WordPress developers
Less and faster – Cache tips for WordPress developersLess and faster – Cache tips for WordPress developers
Less and faster – Cache tips for WordPress developers
 
Java и Linux — особенности эксплуатации / Алексей Рагозин (Дойче Банк)
Java и Linux — особенности эксплуатации / Алексей Рагозин (Дойче Банк)Java и Linux — особенности эксплуатации / Алексей Рагозин (Дойче Банк)
Java и Linux — особенности эксплуатации / Алексей Рагозин (Дойче Банк)
 
Oracle Database In-Memory Option in Action
Oracle Database In-Memory Option in ActionOracle Database In-Memory Option in Action
Oracle Database In-Memory Option in Action
 
In Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry OsborneIn Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry Osborne
 
Caching and tuning fun for high scalability
Caching and tuning fun for high scalabilityCaching and tuning fun for high scalability
Caching and tuning fun for high scalability
 
Java on Linux for devs and ops
Java on Linux for devs and opsJava on Linux for devs and ops
Java on Linux for devs and ops
 
Solr Troubleshooting - TreeMap approach
Solr Troubleshooting - TreeMap approachSolr Troubleshooting - TreeMap approach
Solr Troubleshooting - TreeMap approach
 
Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...
Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...
Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...
 
cache concepts and varnish-cache
cache concepts and varnish-cachecache concepts and varnish-cache
cache concepts and varnish-cache
 
Java Memory Model
Java Memory ModelJava Memory Model
Java Memory Model
 
Software Frameworks for Deep Learning (D1L7 2017 UPC Deep Learning for Comput...
Software Frameworks for Deep Learning (D1L7 2017 UPC Deep Learning for Comput...Software Frameworks for Deep Learning (D1L7 2017 UPC Deep Learning for Comput...
Software Frameworks for Deep Learning (D1L7 2017 UPC Deep Learning for Comput...
 
Clug 2011 March web server optimisation
Clug 2011 March  web server optimisationClug 2011 March  web server optimisation
Clug 2011 March web server optimisation
 
Deep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instancesDeep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instances
 
Modern processors
Modern processorsModern processors
Modern processors
 

Dernier

Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 

Dernier (20)

Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
Java In-Process Caching - Performance, Progress and Pitfalls

  • 1. 1 Java In-Process Caching Performance, Progress and Pitfalls Tuesday, May 21, 2019 19th Software Performance Meetup, Munich Jens Wilke
  • 2. 2 Links - Disclaimer - Copyright talk slides and diagram source data https://github.com/cruftex/talk-java-in-process-caching-performance-progress-pitfalls used benchmarks https://github.com/cache2k/cache2k-benchmark disclaimer No guarantees at all. Do not sue me for any mistake, instead, send a pull request and correct it! Copyright: Creative Commons Attribution CC BY 4.0
  • 3. 3 About Me ● Performance Fan(atic) ● Java Hacker since 1998 ● Author of cache2k ● 70+ answered questions on StackOverflow about Caching ● JCache / JSR107 Contributor Jens Wilke @cruftex cruftex.net
  • 4. 4 Up Next Java In-Process Caching What and Why
  • 5. 5 Example 1: Geolocation Lookup 48° 4' 28" N 11° 40' 17" E
  • 6. 6 Example 2: Date Formatting
  • 7. 7 Expensive Operations per Web Request How often is an operation executed per web request or user interaction? ● Less than once: e.g. initialization on startup ● Exactly once: e.g. fetch data and resolve geolocation ● More than once: e.g. render a time or date [Diagram: scale from 0 through 1x to 10x executions per request, with the three cases marked X]
  • 8. 8 Reduce Expensive Operations Cache: fewer executions per web request or user interaction [Diagram: scale from 0 through 1 to 10x executions per request, with three operations marked X]
  • 9. 9 (Java) Caching Technical: ● temporary data storage to serve requests faster ● reduce expensive operations at the cost of storage ● A tool to tune the space-time tradeoff Benefits: ● Lower latency and improve UX ● If not because of great UX, let's save computing costs!
  • 10. 10 Java In Process Caching Technical: ● temporary data storage to serve requests faster ● reduce expensive operations at the cost of heap memory ● keep data as close to the CPU as possible ● A tool to tune the space-time tradeoff Benefits: ● Lower latency even further and improve UX ● If not because of great UX, let's save more computing costs!
  • 11. 11 Constructing a Java In Process Cache Interface: the interface of a cache is similar (sometimes identical) to a Java Map: cache.put(key, value); value = cache.get(key); Implementation: ● A hash table ● An eviction strategy – to limit the used memory – but keep data that is "hot"
  • 12. 12 Up Next Benchmark a Simple Cache
  • 13. 13 Read Only JMH Benchmark
        @Param({"100000"})
        public int entryCount = 100 * 1000;
        BenchmarkCache<Integer, Integer> cache;
        Integer[] ints;

        @Setup
        public void setup() throws Exception {
          // (1) create a cache, via a wrapper to adapt to different implementations
          cache = getFactory().create(entryCount);
          // (2) create an array with random integer objects;
          //     the value range does not exceed the entry count
          ints = new Integer[PATTERN_COUNT];
          RandomGenerator generator = new XorShift1024StarRandomGenerator(1802);
          for (int i = 0; i < PATTERN_COUNT; i++) {
            ints[i] = generator.nextInt(entryCount);
          }
          // (3) fill the cache once; this is not part of the benchmark
          for (int i = 0; i < entryCount; i++) {
            cache.put(i, i);
          }
        }

        // (4) the benchmark operation does one cache read with a random key
        @Benchmark
        @BenchmarkMode(Mode.Throughput)
        public long read(ThreadState threadState) {
          int idx = (int) (threadState.index++ % PATTERN_COUNT);
          return cache.get(ints[idx]);
        }
    The benchmark only reads from a cache that is filled with data initially. No eviction takes place, so we can compare the read throughput with a (concurrent) hash table.
  • 14. 14 Benchmark Parameters Hardware: ● CPU: Intel(R) Xeon(R) CPU E3-1240 v5 @ 3.50GHz, 4 physical cores ● Benchmarks are done with different numbers of cores by Linux CPU hotplugging ● Oracle JVM 1.8.0-131, JMH 1.18 ● Ubuntu 14.04, 4.4.0-137-generic Cache versions: ● Google Guava Cache, Version 26 ● Caffeine, Version 2.6.2 ● cache2k, Version 1.2.0.Final ● EHCache, Version 3.6.1 JMH parameters: ● 2 forks, 2 warmup iterations, 3 measurement iterations, 15 seconds per iteration ● => 6 measurement iterations
  • 15. 15 Results [Chart: throughput in operations/s (Y axis, 0 to 180M ops/s) against number of threads (X axis, labelled threads-size-hitRate). Series: CHM = ConcurrentHashMap; SLHM = simple Java implementation with LRU via LinkedHashMap; Guava = Google Guava Cache; PLHM = simple Java implementation with LRU via LinkedHashMap and segmentation] Especially when multi-threaded, the ConcurrentHashMap is much faster than a cache.
  • 16. 16 Up Next LRU = Least Recently Used
  • 17. 17 LRU List Operations (doubly linked list with three entries)
        cache operation    list operation                              list after operation (head ... tail)
        1. put(1, x)       insert new head                             1
        2. put(2, x)       insert new head                             2 1
        3. get(1)          move to front                               1 2
        4. put(3, x)       insert new head                             3 1 2
        5. put(4, x)       remove tail (key 2), insert new head        4 3 1
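The five operations on slide 17 can be replayed with an access-ordered `java.util.LinkedHashMap`. This is only a minimal sketch: the class name `LruWalkthrough` and its helper methods are made up for illustration, and the capacity of three entries is taken from the slide.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: replays the slide's five operations on an LRU map.
class LruWalkthrough {

    // Hypothetical helper: an access-ordered LinkedHashMap that drops the
    // least recently used entry once more than `capacity` entries exist.
    static <K, V> LinkedHashMap<K, V> lruMap(final int capacity) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    // LinkedHashMap iterates from eldest to newest, so reverse the keys
    // to get the head (most recently used) to tail order of the slide.
    static List<Integer> headToTail(LinkedHashMap<Integer, String> map) {
        List<Integer> keys = new ArrayList<>(map.keySet());
        Collections.reverse(keys);
        return keys;
    }

    static List<Integer> replaySlide() {
        LinkedHashMap<Integer, String> cache = lruMap(3);
        cache.put(1, "x"); // list: 1
        cache.put(2, "x"); // list: 2 1
        cache.get(1);      // move to front, list: 1 2
        cache.put(3, "x"); // list: 3 1 2
        cache.put(4, "x"); // evicts tail (key 2), list: 4 3 1
        return headToTail(cache);
    }

    public static void main(String[] args) {
        System.out.println(replaySlide()); // prints [4, 3, 1]
    }
}
```

Note that even `get` modifies the linked list here, which is why every read of such a cache needs synchronization when used from several threads.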
  • 18. 18 LRU Properties Cool... ● Simple and smart algorithm for eviction (or replacement) ● Everybody knows it from CS, "eviction = LRU" ...but: ● List operations need synchronization ● A cache read means rewriting references in 4 objects, most likely touching 4 different CPU cache lines ● A read operation (happens often!) is more expensive than an eviction (happens not so often!) ● LRU is not scan resistant; scans wipe out the working set in the cache ● Infrequently accessed objects take a long time to be evicted
  • 19. 19 LRU Alternatives? We look for: ● Reduce CPU cycles for the read operation ● Do more costly operations later when we need to evict ● Also take frequency into account, keeping more frequently accessed objects longer Lots of research; overview at Wikipedia: Page_replacement_algorithm
  • 20. 20 Up Next Clock / Clock-Pro Eviction
  • 21. 21 Clock ● Each cache entry has a reference bit which indicates whether the entry was accessed ● Access: sets reference bit ● Eviction, scan at clock hand: – Not referenced? Evict! – Referenced? Clear reference bit and move to the next
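The Clock rules on slide 21 can be sketched in a few lines of Java. This is a simplified, single-threaded illustration and not the cache2k implementation: the class `ClockCache`, the fixed array used as the ring, and the choice that new entries start with the reference bit cleared are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal single-threaded sketch of Clock eviction (assumed design, see lead-in).
class ClockCache<K, V> {

    private static final class Entry<K, V> {
        final K key;
        V value;
        boolean referenced; // the reference bit
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private final Map<K, Entry<K, V>> index = new HashMap<>();
    private final Entry<K, V>[] ring; // entries arranged in a circle
    private int hand;                 // position of the clock hand
    private int size;

    @SuppressWarnings("unchecked")
    ClockCache(int capacity) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be >= 1");
        ring = (Entry<K, V>[]) new Entry[capacity];
    }

    // Access: a hit only sets the reference bit, no list is rearranged.
    V get(K key) {
        Entry<K, V> e = index.get(key);
        if (e == null) return null;
        e.referenced = true;
        return e.value;
    }

    void put(K key, V value) {
        Entry<K, V> e = index.get(key);
        if (e != null) { e.value = value; e.referenced = true; return; }
        int slot = size < ring.length ? size++ : evict();
        ring[slot] = new Entry<>(key, value); // assumption: starts unreferenced
        index.put(key, ring[slot]);
    }

    // Eviction: scan at the clock hand until an unreferenced entry is found.
    private int evict() {
        while (true) {
            int slot = hand;
            Entry<K, V> candidate = ring[slot];
            hand = (hand + 1) % ring.length;
            if (candidate.referenced) {
                candidate.referenced = false; // clear the bit and move on
            } else {
                index.remove(candidate.key);  // not referenced: evict
                return slot;
            }
        }
    }

    boolean containsKey(K key) { return index.containsKey(key); }

    public static void main(String[] args) {
        ClockCache<Integer, String> cache = new ClockCache<>(3);
        cache.put(1, "a"); cache.put(2, "b"); cache.put(3, "c");
        cache.get(1);       // sets the reference bit of key 1
        cache.put(4, "d");  // hand skips key 1 and evicts key 2
        System.out.println(cache.containsKey(1) + " " + cache.containsKey(2));
    }
}
```

The point of the design is visible in `get`: a cache hit writes a single flag, and the more expensive scanning work is deferred to `evict`.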
  • 22. 22 Clock-Pro ● Extra clock for hot data ● History of recently evicted keys ● cache2k: Use reference counter instead of reference bit Faster: cache access is tracked by setting a bit or incrementing a counter [Diagram labels: hot, cold, history]
  • 23. 23 Up Next Will it Blend? (more Benchmarks...)
  • 24. 24 Results ● Google Guava Cache and EHCache3 are slow in comparison to the ConcurrentHashMap ● Caffeine is faster, if there are sufficient CPU cores/threads ● cache2k is fastest, at about half the speed of the ConcurrentHashMap [Chart: ops/s (Y axis, 0 to 180M) against threads-size-hitRate (X axis: 1-100K-100, 2-100K-100, 3-100K-100, 4-100K-100). Series: cache2k, Caffeine, Guava, EhCache3, CHM]
  • 25. 25 Up Next What about eviction efficiency?
  • 26. 26 Benchmarking Eviction Quality ● Collect access sequences (traces) ● Replay the access sequence on a cache and count hits and misses More information about the traces in the blog article: https://cruftex.net/2016/05/09/Java-Caching-Benchmarks-2016-Part-2.html
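The replay idea on slide 26 can be illustrated with a small Java sketch. It is not taken from the cache2k-benchmark project: the class `TraceReplay`, the tiny hand-written traces and the LRU cache built on `LinkedHashMap` are assumptions for this example. A real run would read a recorded trace and plug in the cache under test.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch: replay an access trace on an LRU cache and count hits.
class TraceReplay {

    // Replays `trace` on an LRU cache holding at most `capacity` keys and
    // returns the number of hits; every miss inserts the key.
    static int countHits(int[] trace, final int capacity) {
        Map<Integer, Boolean> cache =
            new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                    return size() > capacity;
                }
            };
        int hits = 0;
        for (int key : trace) {
            if (cache.get(key) != null) {
                hits++;                       // hit: get() also refreshes recency
            } else {
                cache.put(key, Boolean.TRUE); // miss: load the key into the cache
            }
        }
        return hits;
    }

    static double hitRate(int[] trace, int capacity) {
        return trace.length == 0 ? 0.0 : (double) countHits(trace, capacity) / trace.length;
    }

    public static void main(String[] args) {
        int[] mixed = {1, 2, 1, 3, 1, 4, 2};
        int[] loop = {1, 2, 3, 1, 2, 3};
        System.out.println(countHits(mixed, 2)); // prints 2
        System.out.println(countHits(loop, 2));  // prints 0
    }
}
```

The second trace is a loop over three keys with room for only two: LRU always evicts the key that is needed next, so it never scores a hit. Workloads like this are where the choice of eviction algorithm shows up in the hit rate.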
  • 30. 30 Results ● Eviction improvements in Caffeine and cache2k ● Varying results depending on access sequences / workloads
  • 31. 31 But… Isn't Clock O(n)?! Academic objection: Time for eviction grows linearly with the cache size. Yes, in theory…
  • 32. 32 cache2k and Clock-Pro Improved algorithm: ● Uses counter instead of reference bit ● Heuristics to reduce intensive scanning, if not much is gained Battle tested: ● A test with a random sequence at 80% hit rate results in the following average entry scan counts: – 100K entries: 6.00398 – 1M entries: 6.00463 – 10M entries: 6.00482 => Little increase, but practically irrelevant.
  • 33. 33 Technical Overview I
                                      Guava   Caffeine   EHCache3   cache2k
        Latest Version                26      2.6.2      3.6.1      1.2.0.Final
        JDK compatibility             8+      8+         8+         6+
        sun.misc.Unsafe               -       X          X          -
        Hash implementation           own     JVM CHM    old CHM    own
        Single object per entry       -       -          -          X
        Treeifications of collisions  -       X          -          -
        Metrics for hash collisions   -       -          -          X
        Key mutation detection        -       -          -          X
  • 34. 34 Technical Overview II
                               Guava       Caffeine           EHCache3    cache2k
        Eviction Algorithm     Q + LRU     Q + W-TinyLFU      "Scan8"     Clock-Pro+
        Lock Free Cache Hit    Lock free   Lock + Wait free   Lock free   Lock + Wait free
        Limit by count         X           X                  X           X
        Limit by memory size   -           -                  X           -
        Weigher                X           X                  -           X
        JCache / JSR107        -           X                  X           X
        Separate API jar       -           -                  -           X
  • 35. 35 Try cache2k? ● Open Source, Apache 2 Licence ● On Maven Central ● Info and User Guide at: https://cache2k.org ● JCache support ● Compatible with Android or pure Java ● Runs with Hibernate, Spring, DataNucleus ● Compatible with Java 8, 9, 10, 11, 12, … because of no sun.misc.Unsafe magic
  • 36. 36 Summary ● LRU is simple but outdated ● Caffeine and cache2k use modern eviction algorithms and have (mostly) better eviction efficiency than LRU ● Caffeine likes to have cores ● EHCache3 likes to have memory ● cache2k optimizes for a fast/fastest access path for a "cache hit" while having reasonable eviction efficiency ● Modern hardware needs modern algorithms ● Faster caches allow more fine-grained caching
  • 37. 37 Keep Tuning! Questions? Jens Wilke @cruftex cruftex.net
  • 38. 38 Up Next Appendix / Backup Slides
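  • 39. 39 Simple Cache – Part I
        import java.util.LinkedHashMap;
        import java.util.Map;

        public class LinkedHashMapCache<K,V> extends LinkedHashMap<K,V> {
          private final int cacheSize;
          public LinkedHashMapCache(int cacheSize) {
            super(16, 0.75F, true); // true = access order, which yields LRU
            this.cacheSize = cacheSize;
          }
          @Override
          protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // called after the new entry is inserted, so evict only
            // once the limit is exceeded
            return size() > cacheSize;
          }
        }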
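  • 40. 40 Simple Cache – Part II – Thread Safety
        public class SynchronizedLinkedHashMapCache<K,V> {
          private final LinkedHashMapCache<K,V> backingMap;
          // constructor added so that the final field is initialized
          public SynchronizedLinkedHashMapCache(int cacheSize) {
            backingMap = new LinkedHashMapCache<>(cacheSize);
          }
          public void put(K key, V value) {
            synchronized (backingMap) { backingMap.put(key, value); }
          }
          // get() changes the access order, so it needs the lock as well
          public V get(K key) {
            synchronized (backingMap) { return backingMap.get(key); }
          }
        }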
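  • 41. 41 Simple Cache – Part III – Partitioning/Segmentation
        public class PartitionedLinkedHashMapCache<K,V> {
          private static final int PARTS = 4;
          private static final int MASK = 3; // PARTS - 1, PARTS is a power of two
          @SuppressWarnings("unchecked")
          private final LinkedHashMapCache<K, V>[] backingMaps = new LinkedHashMapCache[PARTS];
          // constructor added so that the partitions exist; splitting the
          // capacity evenly across the partitions is an assumption
          public PartitionedLinkedHashMapCache(int cacheSize) {
            for (int i = 0; i < PARTS; i++) {
              backingMaps[i] = new LinkedHashMapCache<>(cacheSize / PARTS);
            }
          }
          public void put(K key, V value) {
            LinkedHashMapCache<K, V> backingMap = backingMaps[key.hashCode() & MASK];
            synchronized (backingMap) { backingMap.put(key, value); }
          }
          public V get(K key) {
            LinkedHashMapCache<K, V> backingMap = backingMaps[key.hashCode() & MASK];
            synchronized (backingMap) { return backingMap.get(key); }
          }
        }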