Published in TANET 2018.
The reliability of a distributed file system (DFS)
is a key factor in satisfying the service level agreement
(SLA) of cloud services. Open source DFSs are being
adopted by more and more cloud services,
but reliability evaluations of them are lacking. To
address this issue, we report on the reliability
characteristics of MogileFS, one of the major
open source DFS implementations. We propose
improved schemes for MogileFS that correspond
to the failure patterns we identified through observation.
Evaluations based on a continuous-time Markov chain
(CTMC) are also provided. This study contributes to
the current understanding of DFS reliability in
practice.
2. Speaker
● CHTTL Cloud Lab. / mogilefs-moji team member
● Maintainer / contributor to several storage related projects
● http://www.github.com/hrchu
3. Distributed File System
A.K.A.
- RESTful storage, object storage, distributed storage system, software-defined storage...
Characteristics
- Accessible: flat namespace / files accessed via a RESTful API, SNIA CDMI, or another protocol
- Scalable/Available: files are written to multiple drives spread throughout servers and data centers
- Reliable/Durable: ensuring data replication and integrity across the cluster
Typical Implementations
- AWS S3 / Azure blob storage / Google cloud storage
- MogileFS / OpenStack Swift / Ceph radosgw
4. How reliable is your DFS setup?
● Hard to measure with simple testing strategies
● Most existing work addresses theoretical and conceptual frameworks
5. We investigated open source DFSs [1] and found that...
1. Points of concern and characteristics vary among the DFSs we surveyed
2. Reliability risks exist across DFS implementations:
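Case study
● Implementation and version: MogileFS 2.57+
● Storage nodes: 147
● Hard disk drives: 1,547
● Storage capacity: ~3 PB
● Number of files: 396,043,825
● Read/write ratio: 0.34
● Data center layout: two data centers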
Study Result
[1] "A Practical Study of Distributed Storage Reliability" (分散式儲存可靠度的實務性研究), 電信研究期刊, vol. 47, no. 1
6. Regarding MogileFS (Responding to reviewers)
One of the widely adopted DFSs today
- Implemented by Brad Fitzpatrick (1999)
- Simple architecture design
- Users: KKBOX, Dreamwidth Studios, Six Apart...
Serves as the case study in this work
[2] "MogileFS: A Simple and Reliable Storage Solution" (MogileFS 簡約可靠的儲存方案), TWJUG 2016
7. What we proposed in this work
Based on insights from previous work, we propose two methods to address the issue:
1. Simple Majority Write
2. Routined Deep FSCK
Outcome
● Code: available on GitHub
● Reliability evaluation: introduced in the following sections
8. Reliability Analysis
Goal
● Mean time to data loss (MTTDL)
○ a classical metric in reliability studies
● Approximate Reliability
○ needed by our customers and my boss
○ also the key factor of the service level agreement (SLA)
Modeling
● Continuous-time Markov chain (CTMC)
○ number of replicas as state
○ disks fail independently
○ failure process - Poisson process with rate λ
○ repair time - exponentially distributed with mean 1/μ
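The modeling assumptions above can be turned into a small calculation. The sketch below is not the deck's own derivation: it assumes a chain in which state i means i replicas exist, a replica is lost at rate i·λ, one replica is restored at rate μ while below the target count, and state 0 (data loss) is absorbing. The function names `mttdl` and `approx_reliability` are hypothetical, and the reliability formula uses the common approximation that the time to data loss is roughly exponential.

```python
import math


def mttdl(n_replicas: int, lam: float, mu: float) -> float:
    """Expected time from n_replicas healthy copies down to zero copies.

    Assumed chain (an illustration, not taken from the slides):
    in state i, a replica is lost at rate i * lam and one replica is
    restored at rate mu while i < n_replicas. State 0 is absorbing.
    """
    # d holds D_i = T_i - T_{i-1}: the expected time to first drop from
    # i replicas to i - 1. In the top state no repair is pending, so
    # D_n = 1 / (n * lam).
    d = 1.0 / (n_replicas * lam)
    total = d
    for i in range(n_replicas - 1, 0, -1):
        # Balance equation in state i: i * lam * D_i = 1 + mu * D_{i+1}
        d = (1.0 + mu * d) / (i * lam)
        total += d
    # T_n is the sum of all D_i because T_0 = 0.
    return total


def approx_reliability(t: float, mttdl_value: float) -> float:
    """Approximate P(no data loss by time t), assuming an exponential time to loss."""
    return math.exp(-t / mttdl_value)


if __name__ == "__main__":
    # Hypothetical rates per hour: one failure per 3 years, 24 h repair.
    lam, mu = 1.0 / (3 * 365 * 24), 1.0 / 24
    hours = mttdl(3, lam, mu)
    print("MTTDL (years):", hours / (365 * 24))
    print("1-year reliability:", approx_reliability(365 * 24, hours))
```

For two replicas this reduces to the familiar mirrored-pair result (3λ + μ) / (2λ²), which is a quick way to sanity-check the recursion.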
13. 2. Routined Deep FSCK
AS-IS: check host/disk regularly
TO-BE: check each file length and checksum
Extended from the previous model
● c: rate at which file damage is detected
● state 1’: one replica is broken but the damage has not yet been detected
14. 2. Routined Deep FSCK - MTTDL
(1) Mean time in each state
(2) P(1->2), P(1->0)
15. 2. Routined Deep FSCK - MTTDL
(3) Possible ways to walk
(4) Time of walk patterns
(5) MTTDL
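Steps (1) to (5) can be illustrated with a short calculation. The transition structure below is an assumption, since the slide formulas are not reproduced here: with two replicas, state 2 moves to 1’ at rate 2λ (silent damage), 1’ moves to 1 at rate c (damage detected) or to 0 at rate λ, and 1 moves back to 2 at rate μ (repair) or to 0 at rate λ. The function name `mttdl_deep_fsck` is hypothetical.

```python
def mttdl_deep_fsck(lam: float, mu: float, c: float) -> float:
    """MTTDL for an assumed 4-state chain: 2 -> 1' -> 1 -> 2, with loss from 1' or 1."""
    # (1) Mean holding time in each transient state.
    t2 = 1.0 / (2.0 * lam)   # state 2: either replica may be damaged
    t1u = 1.0 / (c + lam)    # state 1': detection races a second failure
    t1 = 1.0 / (mu + lam)    # state 1: repair races a second failure

    # (2) Jump probabilities out of the degraded states.
    p_detect = c / (c + lam)     # P(1' -> 1)
    p_repair = mu / (mu + lam)   # P(1 -> 2)

    # (3) Every walk is a number of full cycles 2 -> 1' -> 1 -> 2
    #     followed by one cycle that ends in state 0.
    p_cycle = p_detect * p_repair

    # (4) Expected time spent per cycle attempt; state 1 is only
    #     visited when the damage was detected first.
    attempt_time = t2 + t1u + p_detect * t1

    # (5) The number of attempts is geometric with success (data loss)
    #     probability 1 - p_cycle, so the mean total time is:
    return attempt_time / (1.0 - p_cycle)
```

Under these assumptions a larger detection rate c shortens the time spent in the undetected state 1’ and raises the MTTDL, which is the effect a deeper, more frequent FSCK is meant to have.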
20. Limitation and future work
● Assuming that disks fail independently is unrealistic in the real world
○ powered by more magical math?
○ simulation such as the Monte Carlo method
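● The evaluation is limited because reliability data is hard to obtain in the short term, so the results could not be validated further
○ Strengthen data collection from the storage cluster
○ Collect operational data from clusters running other implementations, e.g., OpenStack Swift and Ceph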
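As a starting point for the simulation direction mentioned above, here is a minimal Monte Carlo sketch. It still assumes independent, exponential failures and repairs (failure rate i·λ with i replicas, repair rate μ below the target count), so it only cross-checks an analytic MTTDL; correlated or non-exponential failures would need to replace the sampling step. The function name `simulate_mttdl` and its parameters are hypothetical.

```python
import random


def simulate_mttdl(n_replicas: int, lam: float, mu: float,
                   trials: int = 20000, seed: int = 0) -> float:
    """Estimate the mean time to data loss by simulating the replica count."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = 0.0
    for _ in range(trials):
        replicas, clock = n_replicas, 0.0
        while replicas > 0:
            fail_rate = replicas * lam
            # Repair only runs while at least one replica is missing.
            repair_rate = mu if replicas < n_replicas else 0.0
            rate = fail_rate + repair_rate
            # Time until the next event, whichever kind it is.
            clock += rng.expovariate(rate)
            # Choose the event in proportion to its rate.
            if rng.random() < fail_rate / rate:
                replicas -= 1
            else:
                replicas += 1
        total += clock
    return total / trials
```

With two replicas and λ = μ, the estimate should land close to the closed-form value (3λ + μ) / (2λ²); the gap shrinks as `trials` grows.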