Passive benchmarking of docker LXC and KVM using OpenStack, hosted in SoftLayer. These results provide initial insight into why LXC as a technology choice offers benefits over traditional VMs, and seek to answer the typical first LXC question -- "why would I consider Linux Containers over VMs?" -- from a performance perspective.
Results here provide insight as to:
- Cloudy ops times (start, stop, reboot) using OpenStack.
- Guest micro benchmark performance (I/O, network, memory, CPU).
- Guest micro benchmark performance of MySQL: OLTP read, OLTP read / write, and indexed insertion.
- Compute node resource consumption; VM / Container density factors.
- Lessons learned during benchmarking.
The tests here were performed using OpenStack Rally to drive the OpenStack cloudy tests, and various other Linux tools to test guest performance at a "micro level". The nova-docker virt driver was used in the Cloud scenario to realize VMs as docker LXC containers, and was compared against the nova libvirt driver for KVM.
Please read the disclaimers in the presentation, as this is only intended to be the "tip of the iceberg".
KVM and docker LXC Benchmarking with OpenStack
1. Passive Benchmarking with docker LXC, KVM & OpenStack
Hosted @ SoftLayer
Boden Russell (brussell@us.ibm.com)
IBM Global Technology Services
Advanced Cloud Solutions & Innovation
V2.0
2. FAQ
How is this version (v2.0) different from the initial benchmarks?
– See the revision history within this document.
Are there any artifacts associated with the test?
– Yes; see my github repo: https://github.com/bodenr/cloudy-docker-kvm-bench
Do these results imply an LXC based technology replaces the need for traditional
hypervisors?
– In my opinion, traditional VMs will become the "edge case" moving forward for use cases
currently based on Linux-flavored VMs. However, I believe there will still be cases for
traditional VMs, some of which are detailed in the LXC Realization presentation.
Are these results scientific?
– No. Disclaimers have been attached to any documentation related to these tests to
indicate such. These tests are meant to be a set of “litmus” tests to gain an initial
understanding of how LXC compares to traditional hypervisors specifically in the Cloud
space.
Do you welcome comments / feedback on the test?
– Yes; the goal of these tests is to educate the community on LXC based technologies vs
traditional hypervisors. As such, the tests are disclosed in full and hence open to
feedback of any kind.
3. FAQ Continued
Should I act on these results?
– I believe the results provide enough information to generate interest. I expect any
organization, group, or individual considering action as a result will perform their own
validation to assert that the technology choice is beneficial for their consumption prior to
adoption.
Is further / deeper testing and investigation warranted?
– Absolutely. These tests should be conducted in a more active manner to understand the
root causes of any differences. Additional tests and variations are also needed, including:
various KVM disk cache modes, skinny VM images (i.e. JeOS), impacts of database settings,
docker storage drivers, etc.
Is this a direct measurement of the hypervisor (KVM) or LXC engine (docker)?
– No; many factors play into the results. For example, the compute node runs the nova virt
driver, which obviously differs in implementation between nova libvirt-kvm and
nova-docker. Thus its implementation *may* have an impact on the compute node
metrics and performance.
4. Revision History
Revision | Overview of changes
V1.0 | - Initial document release
V2.0 | - All tests were re-run using a single docker image throughout the tests (see my Dockerfile).
- As pointed out by an astute reader, the 15 VM serial "packing" test reflects VM boot overhead rather than steady state; this version clarifies those claims.
- A new Cloudy test was added to better understand steady-state CPU.
- Rather than presenting direct claims of density, raw data and graphs are presented to let readers draw their own conclusions.
- Additional "in the guest" tests were performed, including blogbench.
5. Why Linux Containers (LXC)
Fast
– Runtime performance near bare metal speeds
– Management operations (run, stop, start, etc.) in seconds / milliseconds
Agile
– VM-like agility – it’s still “virtualization”
– Seamlessly “migrate” between virtual and bare metal environments
Flexible
– Containerize a “system”
– Containerize “application(s)”
Lightweight
– Just enough Operating System (JeOS)
– Minimal per container penalty
Inexpensive
– Open source – free – lower TCO
– Supported out of the box on a modern Linux kernel
Ecosystem
– Growing in popularity
– Vibrant community & numerous 3rd party apps
6. Hypervisors vs. Linux Containers
[Diagram: three stacks side by side. Type 1 Hypervisor: hardware → hypervisor → virtual machines, each with its own operating system, bins / libs, and apps. Type 2 Hypervisor: hardware → operating system → hypervisor → virtual machines, each with its own operating system, bins / libs, and apps. Linux Containers: hardware → operating system → containers, each with bins / libs and apps.]
Containers share the OS kernel of the host and thus are lightweight; however, each container must use that same OS kernel.
Containers are isolated, but share the OS and, where appropriate, libs / bins.
8. Docker in OpenStack
Havana
– Nova virt driver which integrates with docker REST API on backend
– Glance translator to integrate docker images with Glance
Icehouse
– Heat plugin for docker
Both options are still under development
[Diagram: the nova-docker virt driver and the docker heat plugin (DockerInc::Docker::Container resource).]
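For illustration, booting a docker-backed instance through nova-docker looks the same to the OpenStack user as any other boot. A minimal sketch using python-novaclient (the credentials, image name, and flavor here are hypothetical placeholders):

    # Minimal sketch using python-novaclient (v1_1 API, circa Havana / Icehouse).
    # Credentials, image name, and flavor are hypothetical placeholders.
    from novaclient.v1_1 import client

    nova = client.Client("admin", "password", "demo",
                         "http://controller:5000/v2.0")

    # With the nova-docker virt driver the glance image is a docker image;
    # the boot call itself is unchanged from the KVM case.
    image = nova.images.find(name="guillermo/mysql")
    flavor = nova.flavors.find(name="m1.medium")
    server = nova.servers.create("mysql-container", image, flavor)
    print(server.status)  # poll until this reaches ACTIVE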
9. About This Benchmark
Use case perspective
– As an OpenStack Cloud user I want an Ubuntu based VM with MySQL… Why would I choose
docker LXC vs a traditional hypervisor?
OpenStack “Cloudy” perspective
– LXC vs. traditional VM from a Cloudy (OpenStack) perspective
– VM operational times (boot, start, stop, snapshot)
– Compute node resource usage (per VM penalty); density factor
Guest runtime perspective
– CPU, memory, file I/O, MySQL OLTP, etc.
Why KVM?
– Exceptional performance
DISCLAIMERS
The tests herein are semi-active litmus tests – no in-depth tuning,
analysis, etc. More active testing is warranted. These results do not
necessarily reflect your workload or exact performance, nor are they
guaranteed to be statistically sound.
10. Benchmark Environment Topology @ SoftLayer
[Diagram: two identical two-node topologies hosted at SoftLayer, one per scenario. In each, a controller node runs rally alongside the OpenStack control services (glance api / reg, nova api / cond / etc, keystone, cinder api / sch / vol), and a compute node runs dstat for metrics collection next to the virtualization backend – docker LXC in scenario 1, KVM in scenario 2.]
11. Benchmark Specs
Spec | Controller Node (4 CPU x 8G RAM) | Compute Node (16 CPU x 96G RAM)
Environment | Bare metal @ SoftLayer | Bare metal @ SoftLayer
Motherboard | SuperMicro X8SIE-F (Intel Xeon quad-core, single proc, SATA) | SuperMicro X8DTU-F_R2 (Intel Xeon hex-core, dual proc)
CPU | Intel Xeon-Lynnfield 3470 quad-core [2.93GHz] | (Intel Xeon-Westmere 5620 quad-core [2.4GHz]) x 2
Memory | (Kingston 4GB DDR3 2Rx8 [4GB]) x 2 | (Kingston 16GB DDR3 2Rx4 [16GB]) x 6
HDD (local) | Western Digital Caviar RE3 WD5002ABYS [500GB]; SATA II | Western Digital Caviar RE4 WD5003ABYX [500GB]; SATA II
NIC | eth0 / eth1 @ 100 Mbps | eth0 / eth1 @ 100 Mbps
Operating system | Ubuntu 12.04 LTS 64-bit | Ubuntu 12.04 LTS 64-bit
Kernel | 3.5.0-48-generic | 3.8.0-38-generic
IO scheduler | deadline | deadline
Hypervisor tested | NA | KVM 1.0 + virtio + KSM (memory deduplication); docker 0.10.0 + go1.2.1 + commit dc9c28f + AUFS
OpenStack | Trunk master via devstack | Trunk master via devstack; libvirt KVM nova driver / nova-docker virt driver
OpenStack benchmark client | OpenStack project rally | NA
Metrics collection | NA | dstat
Guest benchmark drivers | NA | sysbench 0.4.12; mbw 1.1.1-2; iibench (py); netperf 2.5.0-1; blogbench 1.1; cpu_bench.py
VM image | NA | Scenario 1 (KVM): official Ubuntu 12.04 image + MySQL, snapshotted and exported to qcow2 [1080 MB]; Scenario 2 (docker): guillermo/mysql [381.5 MB]
12. Test Descriptions: Cloudy Benchmarks
OpenStack Cloudy Benchmarks
Benchmark | Benchmark Driver | Description
Serial VM boot (15 VMs) | OpenStack Rally | Boot VM from image; wait for ACTIVE state; repeat the above a total of 15 times; delete VMs
Compute node steady-state VM packing | cpu_bench.py | Boot 15 VMs in async fashion; sleep for 5 minutes (wait for steady state); delete all 15 VMs in async fashion
VM reboot (5 VMs rebooted 5 times each) | OpenStack Rally | Boot VM from image; wait for ACTIVE state; soft reboot VM 5 times; delete VM; repeat the above a total of 5 times
VM snapshot (1 VM, 1 snapshot) | OpenStack Rally | Boot VM from image; wait for ACTIVE state; snapshot VM to glance image; delete VM
13. Test Descriptions: Guest Benchmarks
Guest Runtime Benchmarks
Benchmark | Benchmark Driver | Description
CPU performance | sysbench from within the guest | Clear memory cache; run sysbench cpu test; repeat a total of 5 times; average results over the 5 runs
OLTP (MySQL) performance | sysbench from within the guest | Clear memory cache; run sysbench OLTP test; repeat a total of 5 times; average results over the 5 runs
MySQL indexed insertion | iibench (indexed insertion benchmark) | Clear memory cache; run iibench for a total of 1M inserts, printing stats at 100K intervals; collect data over 5 runs & average
File I/O performance | sysbench from within the guest (synchronous I/O) | Clear memory cache; run sysbench file I/O test; repeat a total of 5 times; average results over the 5 runs
Memory performance | mbw from within the guest | Clear memory cache; run mbw with an array size of 1000 MiB, each test 10 times; collect the average over the 10 runs per test
Network performance | netperf | Run netperf server on controller; from guest run netperf client in IPv4 mode; repeat the test 5 times; average results
Application type performance | blogbench | Clear memory cache; run blogbench for 5 minutes; repeat 5 times; average read / write scores
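Each guest benchmark above follows the same harness pattern: clear the page cache, run the tool, repeat, and average. A minimal sketch of that pattern (the helper names are hypothetical; the real drivers are in the github repo):

    # Minimal sketch of the repeated guest-benchmark harness:
    # clear caches, run the tool N times, average the wall-clock timings.
    import subprocess
    import time

    def drop_caches():
        subprocess.check_call(["sync"])
        with open("/proc/sys/vm/drop_caches", "w") as f:  # requires root
            f.write("3")

    def run_benchmark(cmd, runs=5):
        timings = []
        for _ in range(runs):
            drop_caches()
            start = time.time()
            subprocess.check_call(cmd)
            timings.append(time.time() - start)
        return sum(timings) / len(timings)

    # usage: run_benchmark([...]) with the per-test command lines
    # shown on the following slides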
14. STEADY STATE VM PACKING
OpenStack Cloudy Benchmark
5/11/2014 14Document v2.0
15. Cloudy Performance: Steady State Packing
Benchmark scenario overview
– Pre-cache VM image on compute node prior to test
– Boot 15 VMs asynchronously in succession
– Wait for 5 minutes (to achieve steady state on the compute node)
– Delete all 15 VMs asynchronously in succession
Benchmark driver
– cpu_bench.py (a minimal sketch of this flow follows below)
High level goals
– Understand compute node characteristics under steady-state conditions with 15 packed / active VMs
[Chart: benchmark visualization – active VMs over time: ramp to 15, 5 minute steady state, then delete back to 0.]
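An illustrative reconstruction of this flow (the real driver is cpu_bench.py in the github repo; client setup, image, and flavor are hypothetical, as in the earlier sketch):

    # Illustrative reconstruction of the steady-state packing flow.
    import time
    from novaclient.v1_1 import client

    nova = client.Client("admin", "password", "demo",
                         "http://controller:5000/v2.0")
    image = nova.images.find(name="guillermo/mysql")
    flavor = nova.flavors.find(name="m1.medium")

    # boot 15 VMs asynchronously (create() returns before ACTIVE)
    servers = [nova.servers.create("pack-%d" % i, image, flavor)
               for i in range(15)]

    time.sleep(5 * 60)  # let the compute node reach steady state

    for s in servers:   # delete all 15 asynchronously
        s.delete()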
23. Cloudy Performance: Serial VM Boot
Benchmark scenario overview
– Pre-cache VM image on compute node prior to test
– Boot VM
– Wait for VM to become ACTIVE
– Repeat the above steps for a total of 15 VMs
– Delete all VMs
Benchmark driver
– OpenStack Rally
High level goals
– Understand compute node characteristics under
sustained VM boots
[Chart: benchmark visualization – active VMs over time, stepping up by one per serial boot to 15.]
24. Cloudy Performance: Serial VM Boot
[Chart: average server boot time – docker 3.53 seconds vs KVM 5.78 seconds.]
32. SERIAL VM SOFT REBOOT
OpenStack Cloudy Benchmark
33. Cloudy Performance: Serial VM Reboot
Benchmark scenario overview
– Pre-cache VM image on compute node prior to test
– Boot a VM & wait for it to become ACTIVE
– Soft reboot the VM and wait for it to become ACTIVE
• Repeat reboot a total of 5 times
– Delete VM
– Repeat the above for a total of 5 VMs
Benchmark driver
– OpenStack Rally
High level goals
– Understand compute node characteristics under sustained VM reboots
[Chart: benchmark visualization – active VMs over time across the 5 VM boot / 5x soft reboot / delete cycles.]
34. Cloudy Performance: Serial VM Reboot
[Chart: average server soft reboot time – docker 2.58 seconds vs KVM 124.43 seconds.]
35. Cloudy Performance: Serial VM Reboot
[Chart: average server delete time – docker 3.57 seconds vs KVM 3.48 seconds.]
39. SNAPSHOT VM TO IMAGE
OpenStack Cloudy Benchmark
40. Cloudy Performance: Snapshot VM To Image
Benchmark scenario overview
– Pre-cache VM image on compute node prior to test
– Boot a VM
– Wait for it to become ACTIVE
– Snapshot the VM
– Wait for image to become ACTIVE
– Delete VM
Benchmark driver
– OpenStack Rally
High level goals
– Understand cloudy ops times from a user perspective
41. Cloudy Performance: Snapshot VM To Image
[Chart: average snapshot server time – docker 36.89 seconds vs KVM 48.02 seconds.]
46. Configuring Docker Container for 2CPU x 4G RAM
Configuring docker LXC for 2CPU x 4G RAM
– Pin container to 2 CPUs / mems
• Create a cpuset cgroup
• Set cpuset.mems to 0,1
• Set cpuset.cpus to 0,1
• Add the container root proc to tasks
– Limit container memory to 4G
• Create a memory cgroup
• Set memory.limit_in_bytes to 4G
• Add the container root proc to tasks
– Limit blkio
• Create a blkio cgroup
• Add the container root process of the LXC to tasks
• Leave blkio.weight at its default of 500
(A minimal sketch of these steps follows below.)
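A minimal sketch of those steps against the cgroup v1 filesystem as mounted on Ubuntu 12.04 (the group name and container root PID are hypothetical; docker normally manages these cgroups itself):

    # Minimal sketch of the cgroup steps above (cgroup v1 filesystem).
    # Group name and container root PID are hypothetical placeholders.
    import os

    ROOT = "/sys/fs/cgroup"
    GROUP = "lxc-bench"
    container_pid = "12345"  # root process of the container

    def write(path, value):
        with open(path, "w") as f:
            f.write(str(value))

    # cpuset: pin the container to cpus 0,1 and mems 0,1
    cpuset = os.path.join(ROOT, "cpuset", GROUP)
    os.makedirs(cpuset)
    write(os.path.join(cpuset, "cpuset.cpus"), "0,1")
    write(os.path.join(cpuset, "cpuset.mems"), "0,1")
    write(os.path.join(cpuset, "tasks"), container_pid)

    # memory: cap the container at 4G
    memg = os.path.join(ROOT, "memory", GROUP)
    os.makedirs(memg)
    write(os.path.join(memg, "memory.limit_in_bytes"), 4 * 1024 ** 3)
    write(os.path.join(memg, "tasks"), container_pid)

    # blkio: proportional weight left at the default of 500
    blkio = os.path.join(ROOT, "blkio", GROUP)
    os.makedirs(blkio)
    write(os.path.join(blkio, "blkio.weight"), 500)
    write(os.path.join(blkio, "tasks"), container_pid)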
47. Guest Performance: CPU
Linux sysbench 0.4.12 cpu test
Calculate prime numbers up to 20000
2 threads
Instance size
– 4G RAM
– 2 CPU cores
– 20G disk
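The corresponding sysbench invocation (0.4.x flag syntax), run via the harness pattern sketched under the guest test descriptions:

    # sysbench 0.4.12 CPU test: primes up to 20000 on 2 threads
    import subprocess

    subprocess.check_call(["sysbench", "--test=cpu", "--cpu-max-prime=20000",
                           "--num-threads=2", "run"])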
48. Guest Performance: CPU
[Chart: seconds to calculate primes up to 20000 – bare metal 15.26, docker 15.22, KVM 15.13.]
49. Guest Performance: Memory
Linux mbw 1.1.1-2
Instance size
– 2 CPU
– 4G memory
Execution options
– 10 runs; average
– 1000 MiB
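The corresponding mbw invocation; -n 10 repeats each test 10 times, and the trailing argument is the array size in MiB:

    # mbw memory bandwidth test: 10 runs per test over a 1000 MiB array
    import subprocess

    subprocess.check_call(["mbw", "-n", "10", "1000"])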
57. Guest Performance: MySQL OLTP
Linux sysbench 0.4.12 oltp test
– Table size of 2,000,000
– MySQL 5.5 (installed on Ubuntu 12.04 LTS with apt-get)
– 60 second iterations
– Default MySQL cnf settings
Variations
– Number of threads
– Transactional random read & transactional random read / write
Instance size
– 4G RAM
– 2 CPU cores
– 20G disk
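A sketch of the read-only variation in sysbench 0.4.x flag syntax (the MySQL credentials are placeholders; drop --oltp-read-only=on for the read / write variation, and vary --num-threads per the test matrix):

    # sysbench 0.4.12 OLTP test against local MySQL: 2M-row table, 60s run.
    import subprocess

    base = ["sysbench", "--test=oltp", "--oltp-table-size=2000000",
            "--mysql-user=root", "--mysql-password=secret",
            "--max-time=60", "--max-requests=0", "--num-threads=2"]

    subprocess.check_call(base + ["prepare"])                     # load the table
    subprocess.check_call(base + ["--oltp-read-only=on", "run"])  # read-only OLTP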
65. Cloud Management Impacts on LXC
[Chart: docker container boot time – docker CLI 0.17 seconds vs nova-docker virt driver 3.53 seconds.]
Cloud management overhead often caps the true ops performance of LXC
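The docker CLI number above can be reproduced with a simple wall-clock measurement around docker run (the image name is illustrative):

    # time a detached container boot straight through the docker CLI
    import subprocess
    import time

    start = time.time()
    subprocess.check_call(["docker", "run", "-d", "guillermo/mysql"])
    print("boot took %.2fs" % (time.time() - start))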
66. Ubuntu MySQL Image Size
[Chart: Ubuntu + MySQL image size – docker 381.5 MB vs KVM 1080 MB.]
Out of the box JeOS images for docker are lightweight
67. Other Observations
Micro “synthetic” benchmarks do not reflect macro “application” performance
– Always benchmark your “real” workload
Nova-docker virt driver still under development
– Great start, but additional features needed for parity (python anyone?)
– Additions to the nova-docker driver could change Cloudy performance
Docker LXC is still under development
– Docker has not yet released v1.0 for production readiness
KVM images can be made skinnier, but this requires additional effort
Increased density / oversubscription imposes additional complexity
– Techniques are needed to handle resource consumption surges that exceed capacity