6. Performance
Amount of useful work accomplished by a
computer system compared to the time and
resources used
7. Scalability
Capability of a system to increase the amount of
useful work as resources and load are added to
the system
8. Scalability
• A system that is fast with 10 users might
not be with 1,000 - it doesn’t scale
• Designing for scalability usually costs some
raw performance
12. Scalability is about
parallelizing
• Parallel decomposition allows division of
work
• Parallelizing might mean more total work
• There’s almost always a serial portion of
the computation
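That serial portion caps the achievable speedup. Amdahl's law (the standard result, not named on the slide) makes the limit explicit:

```latex
S(N) = \frac{1}{(1 - p) + p/N}
```

where p is the fraction of the work that can be parallelized and N is the number of processors. As N grows, S(N) approaches 1/(1 - p): with p = 0.9, no number of cores yields more than a 10x speedup.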
15. Vertical Scalability
Scale Up
• Bigger, meaner machines
- More cores (and more powerful)
- More memory
- Faster local storage
• Limited
- Technical constraints
- Cost - big machines get disproportionately
expensive
16. Shared State
• Need to use those cores
• Java - shared-state concurrency
- Mutable state protected with locks
- Hard to get right
- Most developers don’t have experience
writing multithreaded code
17. This is what it looks like
public static synchronized SomeObject getInstance() {
    return instance;
}

public SomeObject doConcurrentThingy() {
    synchronized (this) {
        // ...
    }
    return ...;
}
18. Single vs Multi-threaded
• Single-threaded
- No scheduling cost
- No synchronization cost
• Multi-threaded
- Context Switching (high cost)
- Memory Synchronization (memory barriers)
- Blocking
19. Lock Contention
Little’s Law
The average number of customers in a stable
system is equal to their average arrival rate
multiplied by their average time in the system
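In formula form (the standard statement, not on the slide):

```latex
L = \lambda W
```

where L is the average number of customers in the system, λ their average arrival rate and W their average time in the system. Applied to a lock: the number of threads queued on it grows with how often the lock is requested times how long it is held - which is exactly what the next techniques attack.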
20. Reducing Contention
• Reduce lock duration
• Reduce the frequency with which locks are
requested (lock striping)
• Replace exclusive locks with other mechanisms
- Concurrent Collections
- ReadWriteLocks
- Atomic Variables
- Immutable Objects
21. Concurrent Collections
• Use lock striping
• Includes putIfAbsent() and replace()
methods
• ConcurrentHashMap has 16 separate locks by
default
• Don’t reinvent the wheel
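A minimal sketch of the atomic check-then-insert these collections give you without an external lock (class and key names are illustrative):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ConcurrentMapExample {
    // putIfAbsent only stores the value if no mapping exists yet,
    // atomically - no synchronized block needed around check + put.
    static final ConcurrentMap<String, Integer> firstSeen =
            new ConcurrentHashMap<String, Integer>();

    static int firstValueFor(String key, int value) {
        Integer previous = firstSeen.putIfAbsent(key, value);
        return previous != null ? previous : value;
    }

    public static void main(String[] args) {
        System.out.println(firstValueFor("home", 1)); // first writer wins
        System.out.println(firstValueFor("home", 2)); // still the first value
    }
}
```

Had this been a plain HashMap guarded by checking get() and then calling put(), two threads could both see "absent" and both write; putIfAbsent closes that race.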
22. ReadWriteLocks
• Pair of locks
• Read lock can be held by multiple
threads if there are no writers
• Write lock is exclusive
• Good improvement if the object has few
writers
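A small read-mostly counter guarded by a ReentrantReadWriteLock (the class name is illustrative):

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadMostlyCounter {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private int value;

    // Many reader threads may hold the read lock at the same time,
    // as long as no writer holds the write lock.
    public int get() {
        lock.readLock().lock();
        try {
            return value;
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock is exclusive: it waits for all readers to leave.
    public void increment() {
        lock.writeLock().lock();
        try {
            value++;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```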
23. Atomic Variables
• Allow check-then-act operations to be
performed atomically
• Without locks - use low-level CPU
instructions
• It’s volatile on steroids (visibility +
atomicity)
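A sketch using java.util.concurrent.atomic (names are illustrative):

```java
import java.util.concurrent.atomic.AtomicLong;

public class AtomicCounter {
    private final AtomicLong count = new AtomicLong();

    // Lock-free increment: the JVM implements this with a CPU
    // compare-and-swap loop, not a monitor.
    public long increment() {
        return count.incrementAndGet();
    }

    // Explicit check-then-act: reset to zero only if nobody changed
    // the value in between. Returns false if the CAS lost the race.
    public boolean resetIfEquals(long expected) {
        return count.compareAndSet(expected, 0);
    }
}
```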
24. Immutable Objects
• Immutability makes concurrency simple - thread-
safety guaranteed
• An immutable object:
- is a final class
- has only private final fields
- is constructed completely by its constructor
- has no state-changing methods
- copies internal mutable objects when
receiving or returning them
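A sketch following those rules (the Position class and its fields are invented for illustration):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// final class, private final fields, fully built in the constructor,
// no setters, internal mutable state copied in and wrapped on the way out.
public final class Position {
    private final String symbol;
    private final List<Long> fills;

    public Position(String symbol, List<Long> fills) {
        this.symbol = symbol;
        this.fills = new ArrayList<Long>(fills); // defensive copy in
    }

    public String getSymbol() {
        return symbol;
    }

    public List<Long> getFills() {
        return Collections.unmodifiableList(fills); // read-only view out
    }
}
```

Because no thread can ever observe this object changing, it can be shared freely with no locks at all.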
25. JVM issues
• Caching is useful - storing stuff in memory
• Larger JVM heap size means longer garbage
collection times
• Not acceptable to have long pauses
• Solutions
- Cap heap size at 2 GB-4 GB
- Multiple JVMs per machine
- Better garbage collectors: G1 might help
26. Scaling Up: Other
Approaches
• Change the paradigm
- Actors (Erlang and Scala)
- Dataflow programming (GParallelizer)
- Software Transactional Memory
(Pastrami)
- Functional languages, such as Clojure
27. Scaling Up: Other
Approaches
• Dedicated JVM-friendly hardware
- Azul Systems is amazing
- Hundreds of cores
- Enormous heap sizes with negligible gc
pauses
- Hardware Transactional Memory (HTM) included
- Built-in lock elision mechanism
30. Horizontal Scalability
Scale Out
• Big machines are expensive - 1 x 32 core
normally much more expensive than 4 x
8 core
• Increase throughput by adding more
machines
• Distributed Systems research revisited -
not new
42. Challenges
• How do we route requests to servers?
• How do we distribute data between servers?
• How do we handle failures?
• How do we keep our cache consistent?
• How do we handle load peaks?
44. Technique #1: Partitioning
• Each server handles a subset of data
• Improves scalability by parallelizing
• Requires predictable routing
• Introduces problems with locality
• Move work to where the data is!
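A sketch of predictable hash-based routing (the Router class and server names are invented; production systems typically use consistent hashing instead, so that adding a server does not remap every key):

```java
public class Router {
    private final String[] servers;

    public Router(String... servers) {
        this.servers = servers;
    }

    // Predictable routing: the same key always lands on the same
    // server, so the data - and the work on that data - stays local
    // to one node. Masking the hash keeps the bucket non-negative.
    public String serverFor(String key) {
        int bucket = (key.hashCode() & 0x7fffffff) % servers.length;
        return servers[bucket];
    }
}
```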
48. Technique #3: Messaging
• Use message passing, queues and pub/sub
models - JMS
• Improves reliability with little extra effort
• Helps deal with peaks
- The queue keeps filling
- If it gets too big, extra requests are
rejected
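The JMS APIs need a provider to run, so here is a single-JVM sketch of the same idea with a bounded java.util.concurrent queue (class name is illustrative): producers absorb peaks into the queue, and work is rejected only once it is full.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class RequestBuffer {
    private final BlockingQueue<String> queue;

    public RequestBuffer(int capacity) {
        queue = new ArrayBlockingQueue<String>(capacity);
    }

    // offer() returns false instead of blocking when the peak
    // exceeds capacity - the "extra requests are rejected" case.
    public boolean submit(String request) {
        return queue.offer(request);
    }

    // A consumer thread polls here and processes at its own pace.
    public String take() throws InterruptedException {
        return queue.take();
    }
}
```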
49. Solution #1: Denormalize DB
• Faster queries
• Additional work to generate tables
• Less space efficiency
• Harder to maintain consistency
50. Solution #2: Non-SQL
Database
• Why not remove the relational part
altogether?
• Bad for complex queries
• Berkeley DB is a prime example
51. Solution #3: Distributed
Key/Value Stores
• Highly scalable - used in the largest websites in the
world, based on Amazon’s Dynamo and Google’s
BigTable
• Mostly open source
• Partitioned
• Replicated
• Versioned
• No single point of failure (SPOF)
• Voldemort (LinkedIn), Cassandra (Facebook) and HBase
are written in Java
62. Solution #4:
MapReduce
• Google’s programming model to split work,
process it in parallel and reduce to an answer
• Used for offline processing of large
amounts of data
• Hadoop is used everywhere! Other options
such as GridGain exist
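A single-JVM word-count sketch of the map and reduce phases (illustrative only; a framework like Hadoop runs these phases across many machines and shuffles intermediate pairs between them):

```java
import java.util.HashMap;
import java.util.Map;

public class WordCount {
    // map: emit (word, 1) for each word in each line.
    // reduce: sum the 1s for each distinct word.
    public static Map<String, Integer> count(String... lines) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String line : lines) {                  // map phase
            for (String word : line.split("\\s+")) {
                Integer old = counts.get(word);      // reduce phase
                counts.put(word, old == null ? 1 : old + 1);
            }
        }
        return counts;
    }
}
```

The point of the model is that the map calls are independent (parallelize freely) and the per-word reductions are independent (parallelize by key).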
63. Solution #5: Data Grid
• Data (and computations)
• In-memory - low response times
• Database back-end (SQL or not)
• Partitioned - operations on data executed in
specific partition
• Replicated - handles failover automatically
• Transactional
64. Solution #5: Data Grid
• It’s a distributed cache + computational
engine
• Can be used as a cache with JPA and the like
• Oracle Coherence is very good.
• Terracotta, GridGain, GemFire, GigaSpaces,
Velocity (Microsoft) and WebSphere
eXtreme Scale (IBM)
65. Retrospective
• You need to scale up and out
• Write code thinking of hundreds of cores
• Relational might not be the way to go
• Cache whenever you can
• Be aware of data locality
66. Q & A
Thanks for listening!
Ruben Badaró
http://www.zonaj.org