2. What makes Hadoop different?
Not much
EXCEPT
• Tera- to Peta-bytes of data
• Commodity hardware
• Highly distributed
• Many different services
3. What needs protection?
Data Sets: Data & meta-data about your data (Hive)
Applications: System applications (JT, NN, Region Servers, etc) and user applications
Configuration: Knobs and configurations necessary to run applications
4. We will focus on….
Data Sets
but not because the others aren’t important..
Existing systems & processes can help
manage Apps & Configuration (to some
extent)
5. Classes of Problems to Plan For
Hardware Failures
• Data corruption on disk
• Disk/Node crash
• Rack failure
User/Application Error
• Accidental or malicious data deletion
• Corrupted data writes
Site Failures
• Permanent site loss – fire, ice, etc
• Temporary site loss – Network, Power, etc (more common)
6. Business goals must drive solutions
RPOs (recovery point objectives) and RTOs (recovery time objectives) are awesome…
But plan for what you care about – how much is this data worth?
Failure mode         Risk     Cost
Disk failure         High     Low
Node failure         High     Low
Rack failure         Medium   Medium
Accidental deletes   Medium   Medium
Site loss            Low      High
8. Hardware failures – Data Corruption
Data corruption on disk
• Checksum metadata for each block is stored with the file
• If checksums do not match, the name node discards the block and replaces it with a fresh copy
• Name node can write metadata to multiple
copies for safety – write to different file
systems and make backups
9. Hardware Failures - Crashes
Disk/Node crash
• Synchronous replication saves the day – the first two replicas are always on different hosts
• Hardware failure detected by heartbeat loss
• Name node HA for meta-data
• HDFS automatically re-replicates blocks
without enough replicas through periodic
process
10. Hardware Failures – Rack failure
Rack failure
• Configure at least 3 replicas and provide rack information (topology.node.switch.mapping.impl or topology.script.file.name)
• 3rd replica always in a different rack
• The 3rd replica is important – it gives you a safety window between a failure and its detection
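A minimal rack-mapping sketch (hostnames, IP ranges and rack names below are made up); point topology.script.file.name at an executable along these lines:

  #!/bin/bash
  # Hypothetical rack-mapping script: HDFS passes one or more data node
  # hostnames/IPs as arguments and expects one rack path printed per argument.
  for host in "$@"; do
    case "$host" in
      10.1.1.*|node1*) echo "/dc1/rack1" ;;
      10.1.2.*|node2*) echo "/dc1/rack2" ;;
      *)               echo "/default-rack" ;;
    esac
  done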
11. Don’t forget metadata
• Your data is defined by Hive metadata
• But this is easy! Back up the metastore database with your usual SQL backup process
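For example, with a MySQL-backed metastore (host, database name and credentials below are placeholders), a routine dump does the job:

  # Nightly dump of the Hive metastore database (assumes MySQL;
  # adapt for PostgreSQL/Oracle-backed metastores)
  mysqldump --single-transaction -h metastore-db.example.com -u hive -p \
    metastore > hive_metastore_$(date +%Y%m%d).sql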
12. Cool.. Basic hardware is under control
Not quite
• Employ monitoring to track node health
• Examine data node block scanner reports (http://datanode:50075/blockScannerReport)
• hadoop fsck is your friend
Of course, your friendly neighborhood Hadoop vendor
has tools – Cloudera Manager health checks FTW!
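For reference, the manual checks look roughly like this:

  # Report overall HDFS health, including corrupt and under-replicated blocks
  hadoop fsck / -files -blocks -locations

  # List only the files that currently have corrupt blocks
  hadoop fsck / -list-corruptfileblocks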
13. Phew.. Past the easy stuff
One more small detail…
Upgrades for HDFS should be treated with care
On-disk layout changes are risky!
• Save name node meta-data offsite
• Test upgrade on smaller cluster before pushing out
• Data layout upgrades support roll-back but be safe
• Make backups of all (or at least the important) data to a remote location before the upgrade!
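A rough sketch of getting the name node metadata itself offsite before an upgrade (the backup path is a placeholder; -fetchImage assumes a reasonably recent Hadoop 2 release):

  # Checkpoint the namespace while in safe mode
  hdfs dfsadmin -safemode enter
  hdfs dfsadmin -saveNamespace

  # Pull the latest fsimage down to a local directory you can ship offsite
  hdfs dfsadmin -fetchImage /backups/namenode/$(date +%Y%m%d)

  hdfs dfsadmin -safemode leave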
14. Application or user errors
Apply the principle of least privilege
Permissions scope
• Users only have access to data they must have access to
Quota management
• Name quota: limits the number of files rooted at a dir
• Space quota: limits the bytes of files rooted at a dir
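Both quotas are set per directory, for example (path and limits are illustrative):

  # Name quota: at most 100,000 files and directories rooted at /user/etl
  hdfs dfsadmin -setQuota 100000 /user/etl

  # Space quota: at most 10 TB rooted at /user/etl (counted after replication)
  hdfs dfsadmin -setSpaceQuota 10t /user/etl

  # Check current usage against the quotas
  hadoop fs -count -q /user/etl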
15. Protecting against accidental deletes
HDFS Trash
When enabled, deleted files are moved into trash instead of being removed immediately
Enable by setting fs.trash.interval to the desired retention interval
Keep in mind:
• Trash only applies to deletes through the fs shell – programmatic deletes will not use it
• .Trash is a per-user directory used for restores
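A small sketch of the workflow (interval, user and paths are illustrative); fs.trash.interval lives in core-site.xml and is expressed in minutes:

  # core-site.xml: fs.trash.interval = 1440  -> keep deleted files for 24 hours

  # With trash enabled, a shell delete moves the file into .Trash
  hadoop fs -rm /user/alice/reports/q3.csv

  # Restore = move it back out of the per-user trash directory
  hadoop fs -mv /user/alice/.Trash/Current/user/alice/reports/q3.csv \
    /user/alice/reports/q3.csv

  # Note: FileSystem.delete() from application code bypasses trash entirely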
17. HDFS Snapshots
What are snapshots?
Snapshots represent state of the system at a point
in time
Often implemented using copy-on-write semantics
• In HDFS, append-only fs means only deletes have
to be managed
• Many of the problems with COW are gone!
18. HDFS Snapshots – coming to a distro near you
Community is hard at work on HDFS snapshots
Expect availability in major distros within the year
Some implementation details – NameNode
snapshotting:
• Very fast snapping capability
• Consistency guarantees
• Restores need to perform data copy
• .snapshot directories for access to individual files
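Once snapshots reach your distro, usage should look roughly like this (paths and snapshot names are placeholders):

  # Mark a directory as snapshottable, then snapshot it
  hdfs dfsadmin -allowSnapshot /data/warehouse
  hdfs dfs -createSnapshot /data/warehouse nightly-20130601

  # Individual files are reachable under the read-only .snapshot directory
  hdfs dfs -ls /data/warehouse/.snapshot/nightly-20130601

  # Restores copy data back out – they are not in-place
  hdfs dfs -cp /data/warehouse/.snapshot/nightly-20130601/part-00000 /data/warehouse/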
19. What can HDFS Snapshots do for you?
• Handles user/application data corruption
• Handles accidental deletes
• Can also be used for Test/Dev purposes!
20. HBase snapshots
Oh hello, HBase!
Very similar construct to HDFS snapshots
COW model
• Fast snaps
• Consistent snapshots
• Restores still need a copy
(hey, at least we are consistent)
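From the HBase shell the flow looks roughly like this (table and snapshot names are placeholders):

  # Run these from the HBase shell (started with: hbase shell)
  snapshot 'orders', 'orders-snap-20130601'
  list_snapshots

  # Clone into a new table without touching the original
  clone_snapshot 'orders-snap-20130601', 'orders_restored'

  # Restoring over the live table requires disabling it first
  disable 'orders'
  restore_snapshot 'orders-snap-20130601'
  enable 'orders'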
21. Hive metadata
The recurring theme of data + meta-data
Ideally, metadata backed up in the same flow as the
core data
Consistency of data and metadata is really
important
22. Management of snapshots
Space considerations:
• % of cluster for snapshots
• Number of snapshots
• Alerting on space issues
Scheduling backups:
• Time based
• Workflow based
23. Great… Are we done?
Don’t forget Roger Duronio (the UBS sysadmin convicted of planting a logic bomb)!
Principle of least privilege still matters…
25. Teeing vs Copying
Teeing: send data during the ingest phase to both the production and replica clusters
• Time delay is minimal between clusters
• Bandwidth required could be larger
• Requires re-processing data on both sides
• No consistency between sites

Copying: data is copied from production to the replica as a separate step after processing
• Consistent data between both sites
• Process once only
• Time delay for RPO objectives to do incremental copy
• More bandwidth needed
26. Recommendations?
Scenario dependent
But
Generally prefer copying over teeing
27. How to replicate – per service
HDFS
• Teeing: Flume and Sqoop support teeing
• Copying: DistCP for copying

HBase
• Teeing: Application level teeing
• Copying: HBase replication

Hive
• Teeing: NA
• Copying: Database import/export*

* Database import/export isn’t the full story
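The copying column maps to familiar tooling, roughly like this (cluster hostnames, paths and table names are placeholders):

  # HDFS: bulk copy between clusters with DistCp
  hadoop distcp -update hdfs://prod-nn:8020/data/events hdfs://dr-nn:8020/data/events

  # Hive: export table data + metadata, ship the export directory across,
  # then import on the remote cluster
  hive -e "EXPORT TABLE events TO '/staging/events_export';"
  hadoop distcp hdfs://prod-nn:8020/staging/events_export hdfs://dr-nn:8020/staging
  hive -e "IMPORT TABLE events FROM '/staging/events_export';"   # run on the DR cluster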
28. Hive metadata
The recurring theme of data + meta-data
Ideally, metadata backed up in the same flow as the
core data
Consistency of data and metadata is really
important
29. Key considerations for large data movement
• Is your data compressed?
– None of the systems support compression on the wire natively
– WAN accelerators can help but cost $$
• Do you know your bandwidth needs?
– Initial data load
– Daily ingest rate – Maintain historical information
• Do you know your network security setup?
– Data nodes & Region Servers talk to each other across clusters – they need network connectivity between sites
• Have you configured security appropriately?
– Kerberos support for cross-realm trust is challenging
• What about cross-version copying?
– You can’t always keep both clusters on the same version – and copying across versions is not trivial
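One common workaround (hostnames and ports are illustrative) is to run DistCp on the destination cluster and read the source over the version-agnostic HFTP or WebHDFS interfaces:

  # Run DistCp on the *destination* cluster, reading the source via HFTP
  hadoop distcp hftp://old-nn:50070/data/events hdfs://new-nn:8020/data/events

  # Or via WebHDFS, where both clusters support it
  hadoop distcp webhdfs://old-nn:50070/data/events hdfs://new-nn:8020/data/events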
30. Management of replications
Scheduling replication jobs
• Time based
• Workflow based – Kicked off from Oozie script?
Prioritization
• Keep replications in a separate scheduler group and
dedicate capacity to replication jobs
• Don’t schedule more map tasks than the available network bandwidth between sites can handle
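For example (queue name, mapper count and bandwidth cap are illustrative), a replication job can be pinned to its own queue and throttled:

  # Dedicated scheduler queue/pool, bounded mapper count, and (with DistCp v2)
  # a per-mapper bandwidth cap in MB/s
  hadoop distcp -Dmapreduce.job.queuename=replication \
    -m 20 -bandwidth 10 \
    hdfs://prod-nn:8020/data/events hdfs://dr-nn:8020/data/events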
31. Secondary configuration and usage
Hardware considerations
• Denser disk configurations acceptable on remote site
depending on workload goals – 4 TB disks vs 2 TB disks, etc
• Fewer nodes are typical – consider replicating only critical data. Be careful playing with replication factors (see the sketch after this list)
Usage considerations
• Physical partitioning means a great place for ad-hoc
analytics
• Production workloads continue to run on core cluster but
ad-hoc analytics on replica cluster
• For HBase, all clusters can be used for data serving!
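If you do lower the replication factor for less-critical data on the replica cluster (path and factor below are illustrative), make it an explicit, per-path decision and verify afterwards:

  # Keep only 2 replicas of ad-hoc / less-critical data on the DR cluster
  hadoop fs -setrep -R 2 /data/adhoc

  # Confirm nothing is left under-replicated, missing or corrupt
  hadoop fsck /data/adhoc | grep -i -E 'under|missing|corrupt'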
32. What about external systems?
• Backing up to external systems is a one-way street with large data volumes
• You can’t do useful processing on the other side
• The cost of Hadoop storage is fairly low, especially if you can drive work on it
33. Summary
• It can be done!
• Lots of gotchas and details to track in the process
• We haven’t even talked about applications and
configuration!
• Failure workflows are important too – testing,
testing, testing
34. Cloudera Enterprise BDR
[Product diagram: Cloudera Enterprise – the Cloudera Manager Disaster Recovery module (select, configure, synchronize, monitor) on top of CDH, offering HDFS distributed replication (high-performance replication using MapReduce) and Hive Metastore replication (disaster recovery for metadata)]
Editor's notes
Data movement is expensive. Hardware is more likely to fail. There are more complex interactions in a distributed environment. Each service requires different hand-holding.
Keep in mind that configuration may not even make sense to replicate – the remote side may have different configuration options.
Data is split into blocks (default 128 MB). Blocks are replicated (default: 3 times). HDFS is rack aware.
Cloudera Manager helps with replication by managing versions as well.
Cross-version management, improved distcp, Hive export/import with updates, simple UI.