Cloudera training: secure your Cloudera cluster

The first and possibly most important task you perform when you deploy your Cloudera cluster is securing it. Get it wrong and you may inadvertently and unknowingly have introduced a risk to the business. Getting it right eventually leaves you looking back at wasted efforts and false starts. So how do you get it right first time?

Cloudera training: secure your Cloudera cluster

  1. Cloudera training: secure your Cloudera cluster. © Cloudera, Inc. All rights reserved.
  2. The demand for skills is high and Hadoop is the future. Customers cannot afford to move slowly in staffing their Big Data projects. Customers are building plans to ensure projects are staffed with skilled employees and supported by a qualified services provider. [Chart: Job Trends from Indeed.com] [Poll: "What are you most concerned about when it comes to your readiness for big data and Hadoop?" Cloudera MDP webinar poll results, July 2016]
  3. Why Cloudera training? Aligned to best practices and the pace of change
     1. Broadest range of courses: learning paths for Developer, Admin, Analyst
     2. Most experienced instructors: more than 40,000 trained since 2009
     3. Leader in certification: over 12,000 accredited Cloudera professionals
     4. Trusted source for training: 100,000+ people have attended online courses
     5. State of the art curriculum: courses updated as Hadoop evolves
     6. Widest geographic coverage: most classes offered, in 50 cities worldwide plus online
     7. Most relevant platform & community: CDH deployed more than all other distributions combined
     8. Depth of training material: hands-on labs and VMs support live instruction
     9. Ongoing learning: video tutorials and e-learning complement training
     10. Commitment to big data education: university partnerships to teach Hadoop in colleges
  4. Creating leaders in the field: training enables Big Data solutions and innovation
     • 94% would recommend or highly recommend Cloudera training to friends or colleagues
     • 66% draw on lessons from Cloudera training on at least a monthly basis
     • 40% develop new apps or perform business-critical analyses as a result of training alone
     • 88% indicate Cloudera training provided the Hadoop expertise their roles require
     Sources: Cloudera Past Public Training Participant Study, December 2012. Cloudera Customer Satisfaction Study, January 2013.
  5. What is available from Cloudera University?
     • Private training: course delivered at a location of the customer's choice to an internal audience
     • Public training: courses regularly scheduled around the globe; schedule available on the web
     • Virtual training: live training accessed via the internet; available for public and private courses
     • OnDemand training: pre-recorded lectures with the same content and exercises as the live training options
     • Certification: rigorously developed and meaningful bodies of knowledge
  6. Suggested Cloudera University curricula
     Developers
     • Python/Scala Training
     • Developer for Spark and Hadoop
     • CCA: Spark and Hadoop Developer
     • Spark ML & Kafka modules
     • Topic-specific training (Search, HBase)
     • Hands-on practice
     • CCP: Data Engineer
     Administrators
     • Cloudera Administration training
     • CCA: Administrator
     • Cloudera Security OnDemand
     Data Analysts/Data Scientists
     • Data Analyst: Using Hive, Pig & Impala
     • CCA: Data Analyst
     • Cloudera Data Science
  7. Security for Hadoop. Carlo Lazzaris | Technical Instructor
  8. Security Webinar Agenda
     1. The need for Hadoop security: hacker news and legal regulations
     2. Cloudera security implementation: five levels of security
     3. How to secure your Cloudera cluster: Cloudera documentation, Cloudera Professional Services, Cloudera OnDemand security course
  9. The need for Hadoop security
  10. Unguarded data stores are the victims
  11. Regulatory Compliance: organizations can be fined up to 4% of annual global turnover or €20 million, whichever is greater, for breaching GDPR
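The fine ceiling on the Regulatory Compliance slide can be worked through as simple arithmetic. The sketch below assumes the higher-tier GDPR rule (the greater of 4% of annual global turnover and €20 million); the function name `gdpr_max_fine` is hypothetical and the figures are illustrative only.

```python
def gdpr_max_fine(annual_global_turnover_eur: int) -> float:
    """Return the assumed higher-tier GDPR fine ceiling in euros.

    Assumption: the ceiling is the greater of 4% of annual global
    turnover and a flat EUR 20 million.
    """
    four_percent = annual_global_turnover_eur * 4 / 100
    return max(four_percent, 20_000_000)


# A company with EUR 100 million turnover: 4% is EUR 4 million,
# so the flat EUR 20 million ceiling applies.
print(gdpr_max_fine(100_000_000))

# A company with EUR 2 billion turnover: 4% is EUR 80 million,
# which exceeds the flat ceiling.
print(gdpr_max_fine(2_000_000_000))
```

For smaller organizations the flat amount dominates, which is why the slide quotes both numbers.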
  12. Cloudera security implementation
  13. Cloudera Enterprise (CDH): the modern platform for machine learning and analytics, optimized for the cloud. [Platform diagram labels: Extensible Services; Core Services; Data Engineering; Operational Database; Analytic Database; Data Science; Data Catalog; Ingest & Replication; Security; Governance; Workload Management; Storage Services: S3, ADLS, HDFS, Kudu]
  14. Open platform services: built for multi-function analytics | optimized for cloud
      • Unified security: protects sensitive data with consistent controls, even for transient and recurring workloads
      • Consistent governance: enables secure self-service access to all relevant data and increases compliance
      • Easy workload management: increases user productivity and boosts job predictability
      • Flexible ingest and replication: aggregates a single copy of all data, provides disaster recovery, and eases migration
      • Shared catalog: defines and preserves structure and business context of data for new applications and partner solutions
  15. Cloudera Enterprise-Grade Security and Governance
      • Identity: validate users by membership in the enterprise directory. Technical concepts: authentication, user/group mapping
      • Access: defining what users and applications can do with data. Technical concepts: permissions, authorization
      • Data Protection: shielding data in the cluster from unauthorized visibility. Technical concepts: encryption at rest and in motion
      • Visibility: reporting on where data came from and how it's being used. Technical concepts: auditing, lineage
      Components shown: Cloudera Manager, Apache Sentry, Cloudera Navigator, Navigator Encrypt & Key Trustee
  16. Cloudera Certified Technology Partners. [Diagram labels: Data Sources; Data Ingest; Process, Refine & Prep; Data Discovery; Advanced Analytics; Connected Machines/Data Sources; Other Data Sources]
  17. Certification ensures that a product integrates securely
      • Authentication: authenticate via Kerberos or LDAP
      • Authorization: handle Apache Sentry with Hive, Impala, Search, HDFS
      • Encryption: support HDFS transport encryption and at-rest encryption; support SSL/TLS connection encryption
  18. Vulnerability Response and Process: vulnerability reports (upstream, internal, external), then fix, then publish
  19. Cluster Security Levels
  20. Cloudera Enterprise: the modern platform for machine learning and analytics, optimized for the cloud
  21. Enterprise Encryption Performance
  22. Disclaimer: This talk serves as a general guideline for security implementation on Hadoop. The actual implementation procedures and scope of implementation vary on a case-by-case basis, and should be assessed by Cloudera's Professional Services team or certified Cloudera SI Partners.
  23. Non-secure (#0): Data Free for All
  24. [Architecture diagram labels: Firewall; Active Directory/KDC; Hadoop cluster; Cloudera Manager; Gateway node; Cloudera Worker nodes; Datacenter; Applications]
  25. 4 modes of Identity Management: 1. Simple Authentication 2. Kerberos 3. LDAP 4. SAML. File group ownership: • AD integration • SSSD or Centrify. Consideration in large enterprises. [Diagram labels: via SSSD, via]
  26. Simple Authentication: detect the user. [Diagram labels: Firewall; Active Directory; Master; Worker; Worker; Worker; Cloudera Manager; Master (SSSD/Centrify)]
  27. Simple authentication = no authentication
  28. Minimal Security (#1): Reduce Risk Exposure
  29. How it works: Authentication
      • Web UIs: LDAP and SAML authentication options
      • SQL access: LDAP/AD and Kerberos authentication options
      • Command lines: Kerberos authentication; automation provided by Cloudera Manager to leverage Active Directory (AD)
      Flow: the user authenticates to AD or the KDC (for example, user ssmith with a password); the authenticated user gets a Kerberos ticket; the ticket grants access to services, e.g. Impala
  30. Kerberos: strong authentication
      • KDC (Key Distribution Center): MIT or Active Directory (more common)
      • A principal such as user@EXAMPLE.COM consists of a primary (user) and a realm (EXAMPLE.COM)
      • The user obtains credentials from the KDC for realm EXAMPLE.COM; Hadoop maps user@EXAMPLE.COM to the short name user
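The primary/realm split on the Kerberos slide can be illustrated with a small parser. This is a minimal sketch, not Hadoop's actual mapping logic: the names `Principal`, `parse_principal`, and `short_name` are hypothetical, the optional instance component (`primary/instance@REALM`) is standard Kerberos naming that the slide does not show, and the short-name rule assumed here simply strips the realm when it matches a default realm.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: str
    instance: Optional[str]
    realm: str


def parse_principal(name: str) -> Principal:
    """Split 'primary[/instance]@REALM' into its components."""
    local, sep, realm = name.rpartition("@")
    if not sep or not local or not realm:
        raise ValueError(f"not a fully qualified principal: {name!r}")
    primary, _, instance = local.partition("/")
    return Principal(primary, instance or None, realm)


def short_name(name: str, default_realm: str) -> str:
    """Assumed rule: drop the realm if it is the cluster's default realm."""
    principal = parse_principal(name)
    if principal.realm != default_realm:
        raise ValueError(f"no mapping rule for realm {principal.realm!r}")
    return principal.primary


print(parse_principal("user@EXAMPLE.COM"))
print(short_name("user@EXAMPLE.COM", default_realm="EXAMPLE.COM"))
```

Real clusters configure this mapping through rules rather than code, and principals from other realms need explicit rules, which is why the sketch refuses them instead of guessing.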
  31. Kerberos: considerations in large corporations
      • Time synchronization
      • CM Kerberos Wizard: configure AD to create a Kerberos principal for the CM server, and to delegate to CM the ability to create and manage Kerberos principals
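Time synchronization matters because Kerberos rejects requests whose timestamps differ too much from the KDC's clock. The sketch below assumes the common default tolerance of five minutes (used by both MIT Kerberos and Active Directory unless reconfigured); `within_skew` is a hypothetical helper, not part of any Kerberos library.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the default clock-skew tolerance is 5 minutes.
MAX_SKEW = timedelta(minutes=5)


def within_skew(host_time: datetime, kdc_time: datetime,
                max_skew: timedelta = MAX_SKEW) -> bool:
    """Return True if the host clock is close enough to the KDC clock."""
    return abs(host_time - kdc_time) <= max_skew


kdc_now = datetime(2018, 1, 16, 12, 0, 0, tzinfo=timezone.utc)
print(within_skew(kdc_now + timedelta(minutes=2), kdc_now))  # inside tolerance
print(within_skew(kdc_now - timedelta(minutes=7), kdc_now))  # outside tolerance
```

In practice every cluster host and the KDC would be kept in step with a time service such as NTP rather than checked by hand.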
  33. Kerberos Authentication (* LDAP over SSL)
  34. Authorization/Access Control
      • Access control lists (ACLs): HDFS file ACLs, YARN job submission, HBase ACLs, Oozie ACLs
      • Sentry-managed (RBAC): Hive, Impala
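The role-based model named on the Authorization slide can be sketched as three lookups: users belong to groups, groups are granted roles, and roles hold privileges. This is a simplified illustration, not Sentry's implementation; the group, role, and table names are made up, and the sketch ignores Sentry's object hierarchy and wildcard privileges.

```python
from typing import Dict, Set, Tuple

Privilege = Tuple[str, str]  # (action, object), e.g. ("SELECT", "sales.orders")

# Hypothetical sample data for illustration only.
GROUPS_OF_USER: Dict[str, Set[str]] = {"ssmith": {"analysts"}}
ROLES_OF_GROUP: Dict[str, Set[str]] = {"analysts": {"analyst_role"}}
PRIVILEGES_OF_ROLE: Dict[str, Set[Privilege]] = {
    "analyst_role": {("SELECT", "sales.orders")},
}


def is_allowed(user: str, action: str, obj: str) -> bool:
    """Check whether any role reachable through the user's groups grants the privilege."""
    for group in GROUPS_OF_USER.get(user, set()):
        for role in ROLES_OF_GROUP.get(group, set()):
            if (action, obj) in PRIVILEGES_OF_ROLE.get(role, set()):
                return True
    return False


print(is_allowed("ssmith", "SELECT", "sales.orders"))
print(is_allowed("ssmith", "INSERT", "sales.orders"))
```

Granting privileges to roles rather than to individual users keeps the policy small: onboarding a new analyst means adding them to a directory group, with no change to the privilege tables.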
  35. Auditing
  36. Backup/Disaster Recovery: Cloudera Backup/Disaster Recovery (BDR)
      • A high-performance data replicator
      • Copies incremental data on the source cluster at specified schedules
      • Supports Kerberos, data encryption, and HDFS replication to cloud
  37. Kerberized BDR Best Practice: Cloudera BDR replicates between the Production cluster (realm PROD.EXAMPLE.COM, with its own KDC) and the DR cluster (realm DR.EXAMPLE.COM, with its own KDC), connected by a cross-realm trust
  38. More Security (#2): Managed, Secure, Protected
  39. Data In-Motion Encryption
      • RPC encryption
      • Data transport encryption: supports AES CTR, up to 256-bit key length
      • HTTP TLS/SSL encryption: no self-signed certificates in production
      [Diagram: application, master, and worker nodes connected by RPC encryption, transport encryption, and TLS/SSL]
  40. Data At-Rest Encryption
      • Transparent encryption
      • Supports any Hadoop application
      • Encryption zone:
          $ hadoop key create mykey
          $ hadoop fs -mkdir /zone
          $ hdfs crypto -createZone -keyName mykey -path /zone
      [Diagram: directory tree showing /, /tmp, /zone, foo, and bar, with the encryption zone marked]
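The three commands on the Data At-Rest Encryption slide form an ordered sequence: create the key, create an empty directory, then turn that directory into an encryption zone. The sketch below only builds and prints the command lines so the order can be reviewed before anything is run; `encryption_zone_commands` is a hypothetical helper, and the final `hdfs crypto -listZones` verification step is an addition not shown on the slide.

```python
import shlex
from typing import List


def encryption_zone_commands(key_name: str, path: str) -> List[List[str]]:
    """Return the argv lists for creating an HDFS encryption zone, in order."""
    return [
        # 1. Create the encryption zone key in the KMS.
        ["hadoop", "key", "create", key_name],
        # 2. Create the directory; it must be empty when the zone is created.
        ["hadoop", "fs", "-mkdir", path],
        # 3. Mark the directory as an encryption zone using that key.
        ["hdfs", "crypto", "-createZone", "-keyName", key_name, "-path", path],
        # 4. (Addition) List zones to confirm the result.
        ["hdfs", "crypto", "-listZones"],
    ]


# Dry run: print the commands rather than executing them.
for argv in encryption_zone_commands("mykey", "/zone"):
    print("$", shlex.join(argv))
```

On a real cluster each argv list could be passed to `subprocess.run(argv, check=True)`. Which account may run each step depends on the cluster's key and HDFS administration setup, so treat the order, not the permissions, as the point of this sketch.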
  41. Key Management Server Deployment (non-prod)
      [Diagram labels: HDFS NameNode; Client; Java Keystore KMS; Keystore file]
      Separation of duties:
      • Encryption Zone Key (EZK) is stored in the KMS server
      • HDFS superuser cannot decrypt files
  42. Key Management Server/Key Trustee Server Deployment. [Diagram labels: HDFS NameNode; Client; Key Trustee KMS; Key Trustee KMS; (or more); Firewall; Key Trustee Server (Active); Key Trustee Server (Passive); synchronization]
  43. KMS+KTS+HSM Deployment. [Diagram labels: HDFS NameNode; Client; HSM KMS; HSM KMS; (or more); Firewall; Key Trustee Server (Active); Key Trustee Server (Passive); synchronization; Key HSM; Key HSM; HSM; HSM]
  44. Troubleshooting: Encryption Performance Anomaly
      • Configuration
      • AES-NI hardware acceleration
      • OpenSSL library
      • Entropy
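Two items on the troubleshooting list, AES-NI and entropy, can be checked directly on a Linux host. The sketch below is one way to do that under stated assumptions: it looks for the `aes` CPU flag in `/proc/cpuinfo` (the x86 Linux convention) and reads the kernel's entropy estimate from `/proc/sys/kernel/random/entropy_avail`. The helper names are hypothetical, and on other platforms the files are simply skipped.

```python
from pathlib import Path
from typing import Optional


def cpu_has_aes(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in /proc/cpuinfo output lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, flags = line.partition(":")
            if "aes" in flags.split():
                return True
    return False


def read_text_or_none(path: str) -> Optional[str]:
    """Read a file, returning None if it does not exist or is unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


cpuinfo = read_text_or_none("/proc/cpuinfo")
if cpuinfo is not None:
    print("AES-NI flag present:", cpu_has_aes(cpuinfo))

entropy = read_text_or_none("/proc/sys/kernel/random/entropy_avail")
if entropy is not None:
    print("Available entropy estimate:", entropy.strip())
```

The OpenSSL item is not covered here; the `hadoop checknative` command is the usual way to see whether Hadoop loaded its native OpenSSL support.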
  45. Fine-Grained Access Control with Apache Sentry
  46. Most Security (#3): Secure Data Vault
  47. Level 3: Secure Data Vault
      • All data, both data-at-rest and data-in-transit, is encrypted
      • Key management system is fault-tolerant
      • Auditing mechanisms comply with industry, government, and regulatory standards (PCI, HIPAA, NIST, for example)
      • Auditing extends from EDH to the other systems that integrate with it
      • Cluster administrators are well-trained
      • Security procedures have been certified by an expert
      • Cluster can pass technical review
  48. Data Redaction
      • Personally identifiable information: PCI-DSS and HIPAA best practices followed
      • Passwords: stored in credential files, not in configuration
      • Logs and queries: redacted through Cloudera Manager
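Redacting logs and queries usually comes down to pattern-based replacement before text is written out. The sketch below illustrates the idea with a single regular expression for 16-digit card numbers; the pattern, the replacement string, and the `redact` function are illustrative assumptions, not the rules Cloudera Manager ships with.

```python
import re

# Assumed pattern: four groups of four digits, optionally separated
# by a space or a hyphen.
CARD_PATTERN = re.compile(r"\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b")


def redact(line: str) -> str:
    """Replace anything that looks like a card number with a fixed mask."""
    return CARD_PATTERN.sub("XXXX-XXXX-XXXX-XXXX", line)


query = "SELECT * FROM payments WHERE card = '4111-1111-1111-1111'"
print(redact(query))
```

A production rule set would cover more data types and be tested against real log samples, since an overly broad pattern hides useful diagnostics and an overly narrow one leaks data.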
  49. Full Encryption
      • Encrypt data spills: MapReduce, Impala, Hive, Flume
      • OS-level encryption: Navigator Encrypt
  50. How to secure your Cloudera cluster
  51. Cloudera Documentation
  52. Cloudera Professional Services security engagement
      • Review security requirements and provide an overview of data security policies
      • Audit architecture and current systems for security policies and best practices
      • Custom-tailor a security reference architecture
      • Optimize OS and Java to take advantage of hardware-based crypto-acceleration
      • Install and configure Kerberos with MIT Kerberos KDC or Active Directory
      • Install and configure Sentry and Cloudera Navigator (license required)
      • Install and configure Navigator Encrypt and Key Trustee with an HSM root of trust
      • Review fine-grained permissions on sample data using Sentry
      • Review audit and lineage on sample data using Navigator
      • Use Cloudera Manager and Hue to review security integration for users
      • Enable and configure HDFS encryption
      https://www.cloudera.com/more/services-and-support/professional-services/security-integration-pilot.html
  53. Cloudera online OnDemand security course
      • Online self-paced training course: https://ondemand.cloudera.com
      • Launch planned for mid-February 2018
      • An estimated 3 days' worth of content, covering Cloudera security levels 1 and 2
      • Currently about 375 slides, with 9 detailed chapters and 16 instructor demonstrations:
        1. Security Overview
        2. Security Architecture
        3. Host Security
        4. Encrypting Data in Motion
        5. Authentication
        6. Authorization
        7. Encrypting Data at Rest
        8. Auditing
        9. Additional Considerations: Data Governance
  54. OnDemand security course: instructor-guided demos
      1. Potential attack vectors
      2. Securing the cluster hosts
      3. Generating and managing keys for TLS
      4. Configuring Cloudera Manager for TLS
      5. Encrypting data in motion
      6. Hadoop default authentication
      7. Kerberizing a cluster with MIT Kerberos
      8. Kerberizing a cluster with Active Directory
      9. Configuring authorization with Cloudera Manager
      10. Controlling access to YARN
      11. Controlling access to HDFS
      12. Controlling access to tables
      13. Enabling HDFS encryption
      14. Protecting local data with NavEncrypt
      15. Using Navigator for auditing
      16. Reassessing cluster security
  55. OnDemand security course disclaimer
      • Disclaimers are needed because of the variety and permutations of security setups
      • Versions used: CDH 5.12 and CentOS 7.2
      THIS IS REALLY IMPORTANT: The examples in this course are based on CM/CDH 5.12, running in a cloud-based deployment on a cluster using the CentOS 7.2 operating system. Given the almost limitless permutations of possible configurations, including different versions of CDH, Cloudera Manager, operating systems, directory servers, Kerberos servers, web browsers, and other tools, as well as variations in policies, laws, and practices that affect each organization differently, it's impossible for a training course to cover all aspects of security. This course is meant to provide a background that will help you to understand many important concepts and techniques, but is not intended as a replacement for the relevant documentation or a consulting engagement with an expert who can provide advice based on your specific requirements.
  56. OnDemand security course scenario
      • Many of our demonstrations are based on a hypothetical scenario
      • However, the concepts should apply to nearly any organization
      • Loudacre Mobile is a fast-growing wireless carrier
      • Employees serving in a variety of roles
      • Data ingested from many sources, in many formats
      • Data processed by many tools
  57. OnDemand security course environment
  58. Comprehensive demonstration cluster
  59. Sample chapter structure: Encrypting Data in Motion
      • Encryption Fundamentals
      • Certificates
      • Key Management
        Instructor-Led Demonstration: Generating and Managing Keys for TLS
      • Configuring Cloudera Manager for TLS
        Instructor-Led Demonstration: Configuring Cloudera Manager for TLS
      • Encrypting Hadoop's Data in Motion
        Instructor-Led Demonstration: Encrypting Hadoop's Data in Motion
      • Essential Points
  60. Register your interest for the OnDemand security course: peter.rizvi@cloudera.com
  61. Thank you