Performance comparison of
Distributed File Systems
GlusterFS
XtreemFS
FhgFS

Marian Marinov
CEO of 1H Ltd.
What have I tested?
➢ GlusterFS: http://glusterfs.org
➢ XtreemFS: http://www.xtreemfs.org/
➢ FhgFS (Fraunhofer): http://www.fhgfs.com/cms/
➢ Tahoe-LAFS: http://tahoe-lafs.org/
➢ PlasmaFS: http://blog.camlcity.org/blog/plasma4.html
What will be compared?
➢ Ease of install and configuration
➢ Sequential write and read (large file)
➢ Sequential write and read (many small files of the same size)
➢ Copy from local to distributed
➢ Copy from distributed to local
➢ Copy from distributed to distributed
➢ Creating many random file sizes (real cases)
➢ Creating many links (cp -al)
Why only 1Gbit/s?
➢ It is considered commodity
➢ 6-7 years ago it was considered high performance
➢ Some projects have started around that time
➢ And last, I only had 1Gbit/s switches available for the tests
Let's get the theory first
1Gbit/s has ~950Mbit/s usable bandwidth
(Wikipedia - Ethernet frame)
which is 118.75 MBytes/s of usable throughput
iperf tests: 512Mbit/s -> 65MByte/s
There are many 1Gbit/s adapters that can not go beyond 70k pps
iperf tests: 938Mbit/s -> 117MByte/s
hping3 TCP pps tests:
- 50096 PPS (75MBytes/s)
- 62964 PPS (94MBytes/s)
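The slides do not show the exact iperf and hping3 invocations. A minimal sketch of how such throughput and packets-per-second numbers are typically gathered (node2 and the port are placeholders, not values from the talk):
on the receiving node:   # iperf -s
on the sending node:     # iperf -c node2 -t 30
pps test from the sender: # hping3 -S -p 5001 --flood node2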
Verify what the hardware can deliver locally
# echo 3 > /proc/sys/vm/drop_caches
# time dd if=/dev/zero of=test1 bs=XX count=1000
# time dd if=test1 of=/dev/null bs=XX
bs=1M   Local write 141MB/s                     Local read 228MB/s (real 0m4.605s)
bs=100K Local write 141MB/s                     Local read 226MB/s (real 0m4.596s)
bs=1K   Local write 126MB/s (real 0m8.354s)     Local read 220MB/s (real 0m4.770s)
(write times also recorded: real 0m7.493s and real 0m7.639s)

* most distributed filesystems write with the speed of the slowest member node
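Putting the commands above together with the block sizes and counts used later in the benchmarks, one pass of the local baseline looks roughly like this (about 1GB written and read, page cache dropped in between):
# echo 3 > /proc/sys/vm/drop_caches
# time dd if=/dev/zero of=test1 bs=1M count=1000
# echo 3 > /proc/sys/vm/drop_caches
# time dd if=test1 of=/dev/null bs=1M
(repeated with bs=100K count=10000 and bs=1K count=1000000)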
Linux Kernel Tuning
sysctl:
net.core.netdev_max_backlog=2000 (default 1000)
Congestion control - selective acknowledgments:
net.ipv4.tcp_sack=0
net.ipv4.tcp_dsack=0
(both enabled by default)
Linux Kernel Tuning
TCP memory optimizations
net.ipv4.tcp_mem=41460 42484 82920 (min / pressure / max)
net.ipv4.tcp_rmem=8192 87380 6291456 (min / default / max)
net.ipv4.tcp_wmem=8192 87380 6291456 (min / default / max)
Double the tcp memory
Linux Kernel Tuning
➢ net.ipv4.tcp_syncookies=0 (default 1)
➢ net.ipv4.tcp_timestamps=0 (default 1)
➢ net.ipv4.tcp_app_win=40 (default 31)
➢ net.ipv4.tcp_early_retrans=1 (default 2)
* For more information - Documentation/networking/ip-sysctl.txt
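One way to apply and persist all of the sysctl settings from the last three slides is a drop-in file; the values are copied from the slides, the file name is just an example:
# cat > /etc/sysctl.d/90-dfs-tuning.conf <<'EOF'
net.core.netdev_max_backlog = 2000
net.ipv4.tcp_sack = 0
net.ipv4.tcp_dsack = 0
net.ipv4.tcp_mem = 41460 42484 82920
net.ipv4.tcp_rmem = 8192 87380 6291456
net.ipv4.tcp_wmem = 8192 87380 6291456
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_app_win = 40
net.ipv4.tcp_early_retrans = 1
EOF
# sysctl -p /etc/sysctl.d/90-dfs-tuning.conf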
More tuning :)
Ethernet Tuning
➢ TSO (TCP segmentation offload)
➢ GSO (generic segmentation offload)
➢ GRO/LRO (Generic/Large receive offload)
➢ TX/RX checksumming
➢ ethtool -K ethX tx on rx on tso on gro on lro on
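Before forcing the offloads on, it is worth checking what the driver already enables and supports; ethtool's lowercase -k queries the current offload state (ethX is a placeholder for the actual interface):
# ethtool -k ethX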
GlusterFS setup
1. gluster peer probe nodeX
2. gluster volume create NAME replica/stripe 2
node1:/path/to/storage node2:/path/to/storage
3. gluster volume start NAME
4. mount -t glusterfs nodeX:/NAME /mnt
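A concrete 2-node replica sketch of the steps above (hostnames, volume name and brick paths are placeholders), with the usual status commands to verify the volume:
# gluster peer probe node2
# gluster volume create testvol replica 2 node1:/data/brick node2:/data/brick
# gluster volume start testvol
# gluster peer status
# gluster volume info testvol
# mount -t glusterfs node1:/testvol /mnt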
XtreemFS setup
1. Configure and start the directory server(s)
2. Configure and start the metadata server(s)
3. Configure and start the storage server(s)
4. mkfs.xtreemfs localhost/myVolume
5. mount.xtreemfs localhost/myVolume /some/local/path
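With the stock XtreemFS packages, steps 1-3 usually amount to starting the three services through their init scripts after editing the configs under /etc/xos/xtreemfs/ (script names and config location are assumptions about the packaging used):
# /etc/init.d/xtreemfs-dir start
# /etc/init.d/xtreemfs-mrc start
# /etc/init.d/xtreemfs-osd start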
FhgFS setup
1. Configure /etc/fhgfs/fhgfs-*
2. /etc/init.d/fhgfs-client rebuild
3. Start daemons fhgfs-mgmtd fhgfs-meta fhgfs-storage
fhgfs-admon fhgfs-helperd
4. Configure the local client on all machines
5. Start the local client fhgfs-client
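On the server nodes, step 3 translates into starting the listed daemons via their init scripts (which daemons run on which host depends on the chosen layout); a minimal sketch:
# /etc/init.d/fhgfs-mgmtd start
# /etc/init.d/fhgfs-meta start
# /etc/init.d/fhgfs-storage start
# /etc/init.d/fhgfs-helperd start
and on every client, after the module rebuild:
# /etc/init.d/fhgfs-client start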
Tahoe-LAFS setup
➢ Download
➢ python setup.py build
➢ export PATH="$PATH:$(pwd)/bin"
➢ Install sshfs
➢ Setup ssh rsa key
Tahoe-LAFS setup
➢ mkdir /storage/tahoe
➢ cd /storage/tahoe && tahoe create-introducer .
➢ tahoe start .
➢ cat /storage/tahoe/private/introducer.furl
➢ mkdir /storage/tahoe-storage
➢ cd /storage/tahoe-storage && tahoe create-node .
➢ Add the introducer.furl to tahoe.cfg
➢ Add [sftpd] section to tahoe.cfg
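The slide does not show the [sftpd] contents; a sketch based on the Tahoe-LAFS SFTP frontend documentation (port and key paths are assumptions) would be:
[sftpd]
enabled = true
port = tcp:8022:interface=0.0.0.0
host_pubkey_file = private/ssh_host_rsa_key.pub
host_privkey_file = private/ssh_host_rsa_key
accounts.file = private/accounts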
Tahoe-LAFS setup
➢ Configure the shares
➢ shares.needed = 2
➢ shares.happy = 2
➢ shares.total = 2

➢ Add accounts to the accounts file
# This is a password line, (username, password, cap)
alice password
URI:DIR2:ioej8xmzrwilg772gzj4fhdg7a:wtiizszzz2rgmczv4wl6bqvbv33ag4kvbr6prz3u6w3geixa6m6a
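Since Tahoe-LAFS has no native kernel client, the benchmarks presumably reached the volume through sshfs over the SFTP frontend; a sketch, with the user, port and mount point as assumptions:
# sshfs -p 8022 alice@node1:/ /mnt/tahoe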
Statistics
Sequential write
GlusterFS / XtreemFS / FhgFS
dd if=/dev/zero of=test1 bs=1M count=1000
dd if=/dev/zero of=test1 bs=100K count=10000
dd if=/dev/zero of=test1 bs=1K count=1000000
[Bar chart: throughput in MBytes/s per filesystem for bs=1K, 100K and 1M;
values read off the chart: 1.7, 13.7, 43.53, 59.83, 106.3, 112.6, 342, 358 and 467 MB/s]
* higher is better
Sequential read
GlusterFS / XtreemFS / FhgFS
dd if=/mnt/test1 of=/dev/zero bs=XX
[Bar chart: throughput in MBytes/s per filesystem for bs=1K, 100K and 1M;
values read off the chart: 74.6, 105, 105.6, 179.6, 181.3, 185.3, 209, 214.6 and 225 MB/s]
* higher is better
Sequential write (local to cluster)
GlusterFS / XtreemFS / FhgFS / Tahoe-LAFS
dd if=/tmp/test1 of=/mnt/test1 bs=XX
[Bar chart: throughput in MBytes/s for bs=1K, 100K and 1M;
values read off the chart: 5.41, 11.36, 43.7, 57.96, 70.3, 76.7, 87.26, 93.7 and 96.33 MB/s]
* higher is better
Sequential read (cluster to local)
GlusterFS / XtreemFS / FhgFS
dd if=/mnt/test1 of=/tmp/test1 bs=XX
[Bar chart: throughput in MBytes/s for bs=1K, 100K and 1M;
values read off the chart: 66.1, 67.13, 72.56, 74.83, 77.5, 82.56, 83.76 and 85.4 MB/s]
* higher is better
Sequential read/write (cluster to cluster)
GlusterFS / XtreemFS / FhgFS
dd if=/mnt/test1 of=/mnt/test2 bs=XX
[Bar chart: throughput in MBytes/s for bs=1K, 100K and 1M;
values read off the chart: 11.8, 36, 40.7, 59.6, 62.7, 93.73, 94.4 and 103.96 MB/s]
* higher is better
Joomla tests (local to cluster)
# for i in {1..100}; do time cp -a /tmp/joomla /mnt/joomla$i; done
Test data: 28MB, 6384 inodes
GlusterFS / XtreemFS / FhgFS
[Bar chart: copy time in seconds per filesystem;
values read off the chart: 19.26, 31.42 and 62.83 seconds]
* lower is better
Joomla tests (cluster to local)
# for i in {1..100}; do time cp -a /mnt/joomla /tmp/joomla$i; done
Test data: 28MB, 6384 inodes
GlusterFS / XtreemFS / FhgFS
[Bar chart: copy time in seconds per filesystem;
values read off the chart: 19.26, 39.7 and 200.73 seconds]
* lower is better
Joomla tests (cluster to cluster)
# for i in {1..100}; do time cp -a joomla joomla$i; done
# for i in {1..100}; do time cp -al joomla joomla$i; done
Test data: 28MB, 6384 inodes
GlusterFS / XtreemFS / FhgFS
[Bar chart: copy and link times in seconds per filesystem;
values read off the chart: 22.53, 51.31, 76.44, 89.52, 113.46 and 265.02 seconds]
* lower is better
Conclusion
➢ Distributed FS for large file storage - FhgFS
➢ General purpose distributed FS - GlusterFS
QUESTIONS?
Marian Marinov
<mm@1h.com>
http://www.1h.com
http://hydra.azilian.net
irc.freenode.net hackman
ICQ: 7556201
Jabber: hackman@jabber.org
