Benchmarking L2 Caches with JMH

L2C BENCHMARKS,
OR HOW I LEARNED
TO STOP WORRYING
AND LOVE JMH
ANDRES ALMIRAY
@AALMIRAY
ANDRESALMIRAY.COM

@aalmiray
GET THE CODE
https://github.com/aalmiray/caching-benchmarks
http://bit.ly/2BgF4fM

THE SETTING (1)
A customer had a working JEE application
with a mix of several technologies and
desired to consolidate everything under
one banner: JBoss.
The application relied on EclipseLink as a
JPA provider.

THE SETTING (2)
L2C caching was enabled (super easy with
EclipseLink) resulting in fast response
times when loading medium to big sized
datasets.
Average response time with a typical
dataset was in the 2..4 seconds range.

THE PROBLEM
Once migration to Hibernate occurred the
response times went down the drain.
Average response time was in the 40..80
seconds range.
Setting up L2C for Hibernate required more
work than EclipseLink.

MEASUREMENTS
Response times were clocked both manually
and by digging through log files, which
led to:
- Human error
- Inaccurate measurements
- Unrepeatable conditions

GOAL
Could L2C & Hibernate be configured in
such a way that performance is similar to
(or better than) what EclipseLink
provides?
Is there a JBoss friendly alternative?

THE CANDIDATES
Infinispan (JBoss)
Hazelcast
EhCache
Caffeine

WHAT WE NEEDED
- Accurate measurements.
- Ensure equal/fair test conditions as
much as possible.
- Repeatable executions.

WRITE A JUNIT TEST
What better way is there to exercise some
code than running it through a test case?
Scenario:
- Load the data using EntityManager #1
which warms up the cache.
- Run another query with EntityManager
#2 which should hit the cache.
- Profit!

TEST CONDITIONS
- Chosen database was H2 in embedded
mode.
- Dataset consisted of 6000 entities.
- Simple parent-child relationship between
2 entities.
- All queries were read-only.
- 50 iterations.

TEST CANDIDATES
- EclipseLink would be the control.
- Hibernate without caching would serve
as a base for all other Hibernate
measurements.
- All caching providers must use default
configuration.
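
The slides do not show how each provider was wired in. As a rough sketch only, the Hibernate settings that switch on L2C can be collected in a plain map. The property keys are standard Hibernate/JPA setting names; the class `L2cSettings`, the method `forProvider`, the choice to enable the query cache, and the EhCache region factory class name (Hibernate 5.x era) are assumptions for illustration, not taken from the talk.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects the properties that enable Hibernate's L2C.
class L2cSettings {
    // regionFactoryClass selects the cache provider; its value depends on
    // the provider and the Hibernate version in use (assumption).
    static Map<String, String> forProvider(String regionFactoryClass) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("hibernate.cache.use_second_level_cache", "true");
        // Assumption: query results should be cached too.
        props.put("hibernate.cache.use_query_cache", "true");
        props.put("hibernate.cache.region.factory_class", regionFactoryClass);
        // Only entities marked @Cacheable take part in the shared cache.
        props.put("javax.persistence.sharedCache.mode", "ENABLE_SELECTIVE");
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> props =
            forProvider("org.hibernate.cache.ehcache.EhCacheRegionFactory");
        props.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Such a map could be handed to `Persistence.createEntityManagerFactory(unitName, props)` or written as `<property>` entries in persistence.xml.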

ITERATIONS
@Test
public void first_test() throws Exception {
    int entityCount = ENTITY_COUNT_SMALL;
    setupDataset(entityCount);
    // One list of measurements per iteration
    List<List<Measurement>> measurements = new ArrayList<>();
    for (int i = 0; i < ITERATION_COUNT; i++) {
        measurements.add(executeBenchmark(entityCount, i));
    }
    writeReports(measurements, getTestName());
}

ITERATIONS
private List<Measurement> executeBenchmark(int entityCount, int iteration) throws Exception {
    List<Measurement> measurements = synchronizedList(new ArrayList<>());
    EntityManagerFactory emf = Persistence.createEntityManagerFactory(
        getTestName() + "-cachedPersistenceUnit");
    // First EntityManager loads the data, which warms up the L2 cache
    EntityManager em1 = emf.createEntityManager();
    measurements.add(executeQueryOn(em1, entityCount));
    em1.close();
    // Second EntityManager starts with an empty L1 cache and should hit the L2 cache
    EntityManager em2 = emf.createEntityManager();
    measurements.add(executeQueryOn(em2, entityCount));
    // Release resources so iterations do not leak managers or factories
    em2.close();
    emf.close();
    return measurements;
}
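
The slides do not show `Measurement` or `executeQueryOn`. A minimal sketch of what such a timing wrapper might look like follows. Everything in it is hypothetical: the class `TimingSketch`, the fields of `Measurement`, and the method `time` are invented for illustration, and the JPA query is replaced by a `Supplier` so the sketch runs without a database.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

class TimingSketch {
    // Hypothetical stand-in for the Measurement type used on the slides.
    static final class Measurement {
        final String label;
        final double millis;
        final int resultSize;

        Measurement(String label, double millis, int resultSize) {
            this.label = label;
            this.millis = millis;
            this.resultSize = resultSize;
        }
    }

    // Times a single call with System.nanoTime() and reports milliseconds.
    static Measurement time(String label, Supplier<? extends List<?>> query) {
        long begin = System.nanoTime();
        List<?> result = query.get();
        long elapsed = System.nanoTime() - begin;
        return new Measurement(label, elapsed / 1_000_000.0, result.size());
    }

    public static void main(String[] args) {
        // Fake "query" that returns 6000 rows, mirroring the dataset size.
        Supplier<List<String>> fakeQuery =
            () -> new ArrayList<>(Collections.nCopies(6000, "entity"));
        Measurement m = time("load", fakeQuery);
        System.out.printf("%s: %d rows in %.6f ms%n", m.label, m.resultSize, m.millis);
    }
}
```

A single `nanoTime` delta like this includes JIT and GC noise, which is one reason the numbers from this approach should be read as rough indications.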

THE NUMBERS
Setup              Load L2C with EM1   Hit L2C with EM2
EclipseLink                13.497332           4.771606
Hibernate                  15.133310          13.236914
Hbm + Infinispan           29.761332           6.313264
Hbm + Hazelcast            24.369891           4.157640
Hbm + EhCache              38.078855           5.565752
Hbm + Caffeine             18.489389           4.596457
Times measured in milliseconds

CONCLUSIONS
- Warming up the cache is expensive
- Once warmed up, the cached Hibernate
setups are roughly as fast as EclipseLink.
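
To make "as fast as" concrete, the second-query times from THE NUMBERS can be divided by the EclipseLink control. The sketch below only redoes that arithmetic on the published figures; the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HitRatios {
    // "Hit L2C with EM2" times in milliseconds, copied from the slide.
    static Map<String, Double> hitTimes() {
        Map<String, Double> times = new LinkedHashMap<>();
        times.put("EclipseLink", 4.771606);
        times.put("Hibernate", 13.236914);
        times.put("Hbm + Infinispan", 6.313264);
        times.put("Hbm + Hazelcast", 4.157640);
        times.put("Hbm + EhCache", 5.565752);
        times.put("Hbm + Caffeine", 4.596457);
        return times;
    }

    // Ratio of a setup's hit time to the control; below 1.0 means faster.
    static double relativeToControl(String setup) {
        Map<String, Double> times = hitTimes();
        return times.get(setup) / times.get("EclipseLink");
    }

    public static void main(String[] args) {
        for (String setup : hitTimes().keySet()) {
            System.out.printf("%-18s %.2fx%n", setup, relativeToControl(setup));
        }
    }
}
```

On these figures Hazelcast and Caffeine land slightly below the control, Infinispan and EhCache within about a third above it, and uncached Hibernate at nearly 2.8 times the control.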

MORE PROBLEMS
- In a real world scenario concurrent
queries would be executed. Our queries
are executed serially.
- Dataset is small enough to fit in the
cache. We did not test what happens when
cache evictions occur, which is also
likely in the real world.

TEST CONDITIONS
- Additional operations once L2C is
warmed up:
  - Hit L1C
  - Clear L1C -> L1C loaded from L2C
  - Clear L1C & L2C -> loads L2C -> L1C
  - Hit L1C one more time
- Same operations but in concurrent
mode.
- Bigger dataset +60K entities

TEST CODE
- Concurrent tests rely on
ExecutorService, CountDownLatch, and
other tricks.
- Infrastructure code (measurements)
entangled with operation code (what we
want to measure).
- Not showing it due to ugliness.
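
Since the original concurrent test code is not shown, here is a minimal sketch of the general ExecutorService plus CountDownLatch pattern, not the talk's actual code. One latch waits until every worker is ready, a second latch releases them together, and each worker times its own run. `ConcurrentTimingSketch` and `runConcurrently` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConcurrentTimingSketch {
    // Runs `threads` copies of `operation`, released at the same moment,
    // and returns each copy's elapsed time in milliseconds.
    static List<Double> runConcurrently(int threads, Runnable operation) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch ready = new CountDownLatch(threads);
        CountDownLatch start = new CountDownLatch(1);
        try {
            List<Future<Double>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                Callable<Double> task = () -> {
                    ready.countDown();   // signal that this worker is up
                    start.await();       // block until all workers are released
                    long begin = System.nanoTime();
                    operation.run();
                    return (System.nanoTime() - begin) / 1_000_000.0;
                };
                futures.add(pool.submit(task));
            }
            ready.await();       // every worker is parked on `start`
            start.countDown();   // release them together
            List<Double> millis = new ArrayList<>();
            for (Future<Double> future : futures) {
                millis.add(future.get());
            }
            return millis;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("benchmark run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for a cached query.
        List<Double> times = runConcurrently(4,
            () -> new ArrayList<>(Collections.nCopies(60_000, "entity")));
        times.forEach(t -> System.out.printf("%.6f ms%n", t));
    }
}
```

Even this small version mixes measurement plumbing with the operation under test, which is the entanglement the slide describes.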

CONCLUSIONS
- Both Hazelcast and Caffeine perform
pretty well, in some cases even better
than EclipseLink.
- Infinispan comes in third.
- EhCache + default configuration + cache
eviction => pain.

MORE PROBLEMS
- Test code is quickly becoming
unmanageable and unreadable.
- What about GC pauses and other JVM
events that may affect measurements?
- Not taking advantage of JIT-compiled code.
- Have we really achieved the fastest
setup?
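
The JIT concern is usually addressed by running the operation several times unmeasured before any timing starts. A hand-rolled sketch of that idea follows; `WarmupSketch` and `averageMillis` are hypothetical names, and a loop like this still leaves GC pauses and dead-code elimination unaddressed, so it illustrates the principle rather than a trustworthy harness.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.function.Supplier;

class WarmupSketch {
    // Runs `operation` `warmups` times without timing it, then returns the
    // average elapsed milliseconds over `measured` timed runs.
    static double averageMillis(int warmups, int measured, Supplier<?> operation) {
        if (measured <= 0) {
            throw new IllegalArgumentException("measured must be positive");
        }
        Object sink = null;
        for (int i = 0; i < warmups; i++) {
            sink = operation.get();   // unmeasured: gives the JIT time to compile
        }
        long totalNanos = 0;
        for (int i = 0; i < measured; i++) {
            long begin = System.nanoTime();
            sink = operation.get();
            totalNanos += System.nanoTime() - begin;
        }
        // Touch the last result so the calls are not trivially discardable.
        if (sink == null) {
            System.out.println("operation returned null");
        }
        return totalNanos / 1_000_000.0 / measured;
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for a cached query.
        Supplier<Object> workload =
            () -> new ArrayList<>(Collections.nCopies(6000, "entity"));
        System.out.printf("cold: %.6f ms%n", averageMillis(0, 1, workload));
        System.out.printf("warm: %.6f ms%n", averageMillis(50, 50, workload));
    }
}
```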

JMH TO THE RESCUE
- JMH can be used for all kinds of
benchmarks, not just the tiny, micro,
pico ones.
- Decided to benchmark the simplest case
only, that is, load L2C and hit it once per
iteration.

JMH ADVANTAGES
- Concurrent tests supported out of the
box.
- Benchmarks can be parameterized.
- Benchmarks can have warm-up
iterations.

CONCLUSIONS
- Both Hazelcast and Caffeine perform
pretty well, again.
- Infinispan comes in third.
- EhCache + big datasets => we have to
talk

AND THE
WINNER IS . . .

DATA SAYS
- The winner is Hazelcast given
  - Small and big datasets
- However
  - The database server was not production
    quality.
  - Production queries require mutation of
    values, not just read-only.
  - Customer would like a JBoss solution.

WHAT WE DID
NEXT WILL
SHOCK YOU

INFINISPAN + JBOSS EAP
- We tested Infinispan on a production
quality environment.
- Infinispan turned out to be much better;
it even smoked EclipseLink on most
queries (20% faster on average).

PRAGMATIC PERFORMANCE
Pragmatic: dealing with things sensibly and
realistically in a way that is based on
practical rather than theoretical
considerations.
QCon São Paulo 2015 Keynote by Gil Tene
http://qconsp.com/sp2015/system/files/keynotes-slides/Qcon_Sao_Paulo_Keynote_March2015.pdf

WHAT WE DID IN THE
BENCHMARK
- Stripped down the code to its bare
minimum.
- Used a non-production database.
- Used non-production hardware.
- We basically cheated by removing
“weight” and only succeeded in obtaining
an upper bound.

TESTS VS JMH
Tests can exercise scenarios that are more
difficult to set up using benchmarks.
JMH provides a lot of infrastructure
“almost for free”.
Numbers obtained by both tests and
benchmarks serve as a guideline.

PARTING THOUGHTS
- Beware of preconceptions and
confirmation bias.
- Be mindful of the shape of the input
data.
- Always mind the system’s environment.
- Approach results skeptically.
- Iterate, iterate, iterate.

“If we have data, let’s look at data. If all we
have are opinions, let’s go with mine.”
Jim Barksdale

“Measure, don’t guess!”®
Kirk Pepperdine

HTTP://ANDRESALMIRAY.COM/NEWSLETTER
HTTP://ANDRESALMIRAY.COM/EDITORIAL

JCP Executive Committee Associate Seat
Committer
Committer
JSR377 Specification Lead

THANK YOU!
ANDRES ALMIRAY
@AALMIRAY
ANDRESALMIRAY.COM
