TANET 2018 - Insights into the reliability of open-source distributed file system (DFS)

A Reliability Study of Open-Source Distributed Storage Implementations
Chu, Hua-Rong @ TANET 2018

Speaker
● CHTTL Cloud Lab. / mogilefs-moji team member
● Maintainer / contributor to several storage-related projects
● http://www.github.com/hrchu

Distributed File System
A.K.A.
- RESTful storage, object storage, distributed storage system, software-defined storage...
Characteristics
- Accessible: flat namespace / access files via RESTful API, SNIA CDMI, or other protocols
- Scalable/Available: files are written to multiple drives spread throughout servers and data centers
- Reliable/Durable: ensuring data replication and integrity across the cluster
Typical Implementations
- AWS S3 / Azure Blob Storage / Google Cloud Storage
- MogileFS / OpenStack Swift / Ceph radosgw

How reliable is your DFS setup?
● Hard to measure with simple testing strategies
● Most existing work addresses theoretical and conceptual frameworks

We investigated open-source DFSs [1] and found that...
1. Points of concern and characteristics vary among the DFSs we surveyed
2. Reliability risks exist in DFS implementations:
Case study
● Implementation and version: MogileFS 2.57+
● Number of storage nodes: 147
● Number of hard disk drives: 1,547
● Storage capacity: ~3 PB
● Number of files: 396,043,825
● Read/write ratio: 0.34
● Data center layout: two data centers
Study Result
[1] 分散式儲存可靠度的實務性研究 (A Practical Study of Distributed Storage Reliability), 電信研究期刊, vol. 47, no. 1

Regarding MogileFS (Responding to reviewers)
One of the widely adopted DFSs today
- Implemented by Brad Fitzpatrick (1999)
- Simple architecture design
- Users: KKBOX, Dreamwidth Studios, Six Apart...
Serves as the case study in this work
[2] MogileFS 簡約可靠的儲存方案 (MogileFS: A Simple and Reliable Storage Solution), TWJUG 2016

What we proposed in this work
Based on insights from previous work, we propose two methods to address the issue
1. Simple Majority Write
2. Routined Deep FSCK
Outcome
● Code: available on GitHub
● Reliability evaluation: introduced in the following sections

Reliability Analysis
Goal
● Mean time to data loss (MTTDL)
○ a classical metric in reliability studies
● Approximate Reliability
○ needed by our customers and my boss
○ also a key factor in service level agreements (SLAs)
Modeling
● Continuous-time Markov chain (CTMC)
○ number of replicas as state
○ disks fail independently
○ failure process - Poisson process with rate λ
○ repair time - exponentially distributed with mean 1/μ
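
The modeling assumptions above can be sketched numerically. This is a minimal sketch, not the authors' code. It assumes a birth-death chain in which state k is the number of live replicas, each replica fails at rate lam, a single repair process with rate mu runs whenever fewer than n_replicas copies exist, and state 0 means data loss. The function name and the single-repair assumption are illustrative.

```python
def mttdl_from_each_state(n_replicas, lam, mu):
    """Expected time to reach state 0 (data loss) from each state 0..n_replicas.

    First-step analysis gives, for D_k = T_k - T_(k-1):
        k * lam * D_k = 1 + repair_k * D_(k+1)
    where repair_k is mu below the replica ceiling and 0 at the ceiling.
    """
    diffs = [0.0] * (n_replicas + 1)
    d_next = 0.0
    # Walk from the top state down, because D_n has no repair term.
    for k in range(n_replicas, 0, -1):
        repair = mu if k < n_replicas else 0.0  # assumption: no repair at the ceiling
        diffs[k] = (1.0 + repair * d_next) / (k * lam)
        d_next = diffs[k]
    # Accumulate the differences to recover T_1, T_2, ...
    times = [0.0]
    for k in range(1, n_replicas + 1):
        times.append(times[-1] + diffs[k])
    return times


if __name__ == "__main__":
    # Toy rates chosen for readability, not taken from the study.
    print(mttdl_from_each_state(2, lam=0.5, mu=1.0))
```

For two replicas this recursion reduces to T_1 = (2λ + μ) / (2λ²) and T_2 = (3λ + μ) / (2λ²) under the stated assumptions.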

1. Simple Majority Write
[Figure: AS-IS vs. TO-BE]

1. Simple Majority Write - MTTDL
(1) Mean time in state 1 = 1/(λ + μ)
(2) P(1->2) = μ/(λ + μ), P(1->0) = λ/(λ + μ)
(3)
(4)
(5) MTTDL from 1 =

1. Simple Majority Write - MTTDL
(4)
(5) MTTDL from 1 =

1. Simple Majority Write - Reliability
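Assume a mean replica lifetime of 500,000 hours and a replica creation time of 1 hour
● Reliability over five years of retention: 99.99996%
● Reliability over ten years of retention: 99.99990%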
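
A rough way to turn an MTTDL into a retention reliability is the exponential approximation R(t) ≈ exp(-t / MTTDL). The sketch below rests on several assumptions that the slide does not state: a two-replica ceiling, the closed forms that first-step analysis gives for that chain, an AS-IS write path that leaves a new file in state 1 and a TO-BE path that leaves it in state 2, and 8,760 hours per year. It is therefore not claimed to reproduce the figures above exactly.

```python
import math

HOURS_PER_YEAR = 8760  # assumption: non-leap year


def mttdl_two_replicas(lam, mu, start):
    """MTTDL in hours for a chain capped at two replicas (assumed model)."""
    if start == 1:
        return (2 * lam + mu) / (2 * lam ** 2)
    if start == 2:
        return (3 * lam + mu) / (2 * lam ** 2)
    raise ValueError("start must be 1 or 2")


def reliability(years, mttdl_hours):
    """Exponential approximation of the probability of no data loss."""
    return math.exp(-years * HOURS_PER_YEAR / mttdl_hours)


if __name__ == "__main__":
    lam = 1 / 500_000  # mean replica lifetime of 500,000 hours
    mu = 1 / 1         # mean replica creation time of 1 hour
    as_is = mttdl_two_replicas(lam, mu, start=1)
    to_be = mttdl_two_replicas(lam, mu, start=2)
    print("MTTDL gain (hours):", to_be - as_is)
    for years in (5, 10):
        print(years, "years:", reliability(years, as_is), reliability(years, to_be))
```

Under these assumptions the gap between the two starting states is 1/(2λ), which depends on the replica lifetime but not on the repair rate.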

2. Routined Deep FSCK
AS-IS: check host/disk regularly
TO-BE: check each file length and checksum
Extended from the previous model
● c: file damage detection rate (coverage)
● state 1’: one replica is broken and the damage is undetected
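
The TO-BE check of each file's length and checksum can be illustrated as follows. This is a minimal sketch, not MogileFS code: the function name deep_check, its parameters, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import os


def deep_check(path, expected_length, expected_sha256, chunk_size=1 << 20):
    """Return True only if the file's length and SHA-256 digest both match."""
    # The cheap check comes first: a wrong length means the replica is damaged.
    if os.path.getsize(path) != expected_length:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large replicas do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256
```

A host or disk liveness check would pass for a replica with silently corrupted content, whereas this kind of per-file check would flag it.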

2. Routined Deep FSCK - MTTDL
(1) Mean time in each state:
(2) P(1->2), P(1->0) =

(3) Possible ways to walk:
(4) Time of walk patterns:
(5) MTTDL:

Extreme cases
(6) c=1, MTTDL=
(7) c=0, MTTDL=
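
One way to see how the coverage c enters the MTTDL is the sketch below. It is an assumed model, not necessarily the one on the slides: the chain starts with two replicas, a failure in state 2 is detected with probability c (moving to state 1, where repair runs at rate mu) or goes undetected with probability 1 - c (moving to state 1', where no repair happens and the last replica fails at rate lam).

```python
def mttdl_with_coverage(lam, mu, c):
    """MTTDL from state 2 under the assumed imperfect-coverage model."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("c must be within [0, 1]")
    # State 1': damage unnoticed, so the only exit is losing the last replica.
    t_undetected = 1.0 / lam
    # State 1: T1 = a + b * T2, from the mean holding time and P(1->2).
    a = 1.0 / (lam + mu)
    b = mu / (lam + mu)
    # State 2: T2 = 1/(2*lam) + c * T1 + (1 - c) * T1', solved for T2.
    return (1.0 / (2 * lam) + c * a + (1.0 - c) * t_undetected) / (1.0 - c * b)


if __name__ == "__main__":
    lam, mu = 1 / 500_000, 1.0  # same lifetime and repair time as assumed earlier
    for c in (0.0, 0.72, 0.81, 1.0):
        print(c, mttdl_with_coverage(lam, mu, c))
```

Under these assumptions c = 1 gives (3λ + μ) / (2λ²), the full-coverage two-replica result, and c = 0 gives 3 / (2λ), where repair never helps.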

2. Routined Deep FSCK - Reliability
Assume coverage rates are:
● AS-IS: 72%
● TO-BE: 81%
Reliability Difference:
● Files retained for five years: 0.78%
● Files retained for ten years: 1.5%

Contributions
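● Presents the reliability characteristics of an open-source distributed storage implementation, along with an evaluation of tuning options
○ Discloses reliability data from an open-source distributed storage implementation running in production
○ Gives DFS developers a method for evaluating reliability mechanisms
○ In practice, DFS developers and users should prioritize integrity check coverage and shorter repair time (Amdahl's Law)
● Factors ranked by impact on reliability: integrity check coverage > repair time > sync write
○ Compared with shortening file repair time, Simple Majority Write yields only a limited improvement
○ Its improvement depends only on the mean replica lifetime, not on the repair time
○ Routine FSCK raises anomaly detection coverage and improves reliability significantly more than shortening repair time does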

Limitation and future work
● Assuming that disks fail independently is unrealistic in the real world
○ powered by more magical math?
○ simulation such as the Monte Carlo method
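● The evaluation is limited because reliability data is hard to obtain in the short term, so further validation was not possible
○ Strengthen data collection from storage clusters
○ Collect operational data from clusters running other implementations, e.g., OpenStack Swift and Ceph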
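
As a starting point for the Monte Carlo direction mentioned above, the sketch below simulates the independent-failure chain directly. It is a baseline under assumed parameters (two-replica ceiling, single repair process, toy rates), and all names are illustrative. Correlated failures would have to be added to the sampling step.

```python
import random


def simulate_time_to_loss(lam, mu, rng, start=2, n_max=2):
    """Sample one trajectory of the replica-count chain until state 0."""
    elapsed = 0.0
    replicas = start
    while replicas > 0:
        fail_rate = replicas * lam
        repair_rate = mu if replicas < n_max else 0.0
        total_rate = fail_rate + repair_rate
        # The holding time in a CTMC state is exponential with the total exit rate.
        elapsed += rng.expovariate(total_rate)
        # The next state is chosen in proportion to the competing rates.
        if rng.random() < fail_rate / total_rate:
            replicas -= 1
        else:
            replicas += 1
    return elapsed


def estimate_mttdl(lam, mu, trials, seed=0):
    """Average the simulated loss times over independent trials."""
    rng = random.Random(seed)
    total = sum(simulate_time_to_loss(lam, mu, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # Toy rates keep each trajectory short; realistic rates need far more steps.
    print(estimate_mttdl(lam=0.5, mu=1.0, trials=20000))
```

With these toy rates the estimate should land close to the analytical value (3λ + μ) / (2λ²) = 5 for the same two-replica chain.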
