© Cloudera, Inc. All rights reserved.
Migrating Analytics and ML to the Cloud
Sushant Rao
Cloud Product Marketing @ Cloudera
Ron Abellera
Azure Global Black Belt @ Microsoft Azure
Poll Question 1: Where are you in your journey to the Cloud?
● Just started researching options in Cloud
● Starting to test different products / services in Cloud
● Have some deployments and looking to expand in Cloud
● Critical mass in the Cloud
Why Cloud?
CLOUD BENEFITS
• Agility
○ Speed of making changes to meet business / technical needs
• Scalable & Elastic
○ Scale up and down quickly
• Reliable
○ Multiple options to ensure infrastructure / services are available
○ Tenant isolation ensures different workloads don’t conflict with each other
• Other
○ Pay-as-you-go charges only for consumption (but not necessarily cheaper)
○ Self-service enables users to do their work without contacting IT / Data platform team
But ...
CLOUD CHALLENGES
• Multiple copies of data & Disjointed services
○ Different services have their own copies and may not work together
• On-premises integration
○ Data gravity is on-prem, so cloud needs to complement current data platform
• Cloud Lock-in
○ Open source prevented lock-in for on-prem. What about cloud?
• Shadow IT
○ Individual business units may set up their own cloud deployments, without the architecture, security, and/or governance of the on-prem deployment
• Cheaper?
○ On-prem can be more than 2x cheaper than cloud
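Whether cloud is cheaper depends largely on utilization: pay-as-you-go bills only for hours used, while a fixed-price node costs the same whether busy or idle. The sketch below works through that arithmetic. All rates are hypothetical placeholders, not Cloudera or cloud-provider prices, and the function names are illustrative only.

```python
HOURS_PER_MONTH = 730  # approximate average number of hours in a month


def monthly_cost_pay_as_you_go(nodes: int, hourly_rate: float, utilization: float) -> float:
    """Monthly cost when nodes are billed only for the hours they run.

    utilization is the fraction of the month the nodes are running (0.0 to 1.0).
    """
    return nodes * hourly_rate * HOURS_PER_MONTH * utilization


def monthly_cost_fixed(nodes: int, monthly_rate: float) -> float:
    """Monthly cost when each node has a flat price regardless of use."""
    return nodes * monthly_rate


def break_even_utilization(hourly_rate: float, monthly_rate: float) -> float:
    """Utilization above which the fixed-price option becomes the cheaper one."""
    return monthly_rate / (hourly_rate * HOURS_PER_MONTH)


# Hypothetical example: 10 nodes, 1.00 per node-hour versus 365.00 per node-month.
elastic = monthly_cost_pay_as_you_go(nodes=10, hourly_rate=1.0, utilization=0.25)
fixed = monthly_cost_fixed(nodes=10, monthly_rate=365.0)
threshold = break_even_utilization(hourly_rate=1.0, monthly_rate=365.0)
```

With these made-up rates, intermittent workloads that run a quarter of the time favor pay-as-you-go, while clusters that are busy more than half the time favor the fixed price.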
Common Use Cases for Cloud
CORPORATE DIRECTIVE
• C-level has decided to utilize the cloud more
• Running out of data center space, looking for more agility / flexibility
DISASTER RECOVERY
• Back up all data to the cloud, without a second “physical” location
• Save the time and expense of setting up a physical DR site
ELASTIC WORKLOADS
• Separate environment for new production workloads or for intermittent, ad-hoc workloads
• Takes too long to acquire and set up on-prem infrastructure
SANDBOX
• Environment to test queries and algorithms
• Doesn’t impact the production cluster as data analysts and engineers test
Cloudera’s Solution for Data Analytics / Engineering in Cloud
• The modern platform for machine learning and analytics
○ Numerous functions for all types of jobs and queries
• with multiple deployment options
○ On-premises, public cloud (including multi-cloud), and hybrid
• and one shared data experience
○ Framework for consistent security, governance, and metadata management across applications and deployments
The Modern Platform for Machine Learning & Analytics
DATA ENGINEERING: DATA PROCESSING
• Cost efficient
• Reliable
• Scalable
• Based on Spark, MapReduce, Hive & Pig
• Supported by Workload Analytics
DATA WAREHOUSE: FAST BI & SQL
• Flexibility
• Elastic scale
• Go beyond SQL
• Based on Impala & Hive
• SQL dev environment
• Supported by Workload Analytics
DATA SCIENCE: MACHINE LEARNING
• Fast dev to production
• Secure self-serve
• Based on Python, R, and Spark
• ML dev environment (CDSW)
OPERATIONAL DATABASE: ONLINE & REAL-TIME
• High throughput, low latency
• Strongly consistent
• Based on HBase, Kudu & Spark Streaming
Cloudera’s Vision for AI and Machine Learning
Modern Enterprise Platform, Tools, and Expert Guidance to help you Unlock Business Value with ML / AI
• Agile platform to build, train, and deploy scalable ML applications
• Enterprise data science tools to accelerate team productivity
• Expert guidance, services & training to fast track value & scale
With Multiple Deployment Options
[Diagram: deployment options]
• Traditional infrastructure (combined storage and compute)
• Cloud infrastructure (decoupled storage and compute) via Cloudera Altus (IaaS)
• Cloud infrastructure (decoupled storage and compute) via Cloudera Altus Services (PaaS): Data Engineering, Data Warehouse
• Workloads: Operational Database, Data Engineering, Data Warehouse, Data Science
• Infrastructure services
Cloudera Enterprise Data Platform
Benefits for IT infra & ops
• Central control and security
• Focus on curating not firefighting
Benefits for users
• Value from single source of truth
• Bring the best tools for each job
[Diagram: platform layers]
• WORKLOADS: Data Science, Data Warehouse, Operational Database, Data Engineering, 3rd Party Services
• COMMON SERVICES: Security, Governance, Lifecycle Management
• CONTROL PLANE
• DATA CATALOG
• STORAGE: HDFS, Kudu, Public Cloud Object Storage (S3, ADLS, etc), Private Cloud Object Storage
Journey to the Cloud from On-Prem
[Diagram: persistent Cloudera cluster on premises. Compute: Data Engineering, Analytics, Data Science. Data context: Security, Metadata, Governance. Storage: HDFS.]
Current State
● Multiple workloads and services run in a single cluster
● Data Context (security, metadata, governance) in single cluster
Goals in Journey to the Cloud
● Get to Cloud with minimal impact and change
● Replicate security groups and permissions in the Cloud
● May require multiple stages to get there
● First step may vary depending on goals
● Need to determine how data will be replicated to the Cloud
Start by Replicating Data to Public Cloud via BDR
[Diagram: BDR replicates HDFS data from on-premises storage to HDFS in the public cloud, where a persistent Cloudera cluster runs in the customer cloud (AWS, Azure, GCP, etc). Compute: Hive, Impala, Spark. Data context: Sentry, HMS, Navigator. Storage: HDFS.]
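The replication step can be pictured as repeatedly copying only what is new or changed since the last run. The sketch below is a conceptual illustration of that idea, not a description of how BDR is implemented: the `FileInfo` type, the `plan_replication` function, and the example paths are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class FileInfo(NamedTuple):
    size: int   # file size in bytes
    mtime: int  # last-modified time, seconds since the epoch


def plan_replication(source: Dict[str, FileInfo], target: Dict[str, FileInfo]) -> List[str]:
    """Return the source paths that are missing or stale on the target, sorted."""
    to_copy = []
    for path, info in source.items():
        existing = target.get(path)
        # Copy when the file is absent, differs in size, or is older on the target.
        if existing is None or existing.size != info.size or existing.mtime < info.mtime:
            to_copy.append(path)
    return sorted(to_copy)


# Hypothetical listings of an on-prem directory and its cloud copy.
on_prem = {
    "/warehouse/a": FileInfo(size=10, mtime=100),
    "/warehouse/b": FileInfo(size=20, mtime=200),
    "/warehouse/c": FileInfo(size=5, mtime=50),
}
in_cloud = {
    "/warehouse/a": FileInfo(size=10, mtime=100),
    "/warehouse/b": FileInfo(size=20, mtime=150),
}
plan = plan_replication(on_prem, in_cloud)
```

Here `/warehouse/a` is already up to date, so only the stale and missing files end up in the plan.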
Journey to the Cloud - Step 1
1 - LIFT AND SHIFT
[Diagram: persistent Cloudera cluster in the customer cloud. Compute: Data Engineering, Analytics, Data Science. Data context: Security, Metadata, Governance. Storage labels: HDFS, cloud object store.]
Journey to the Cloud - Step 2
1 - LIFT AND SHIFT
2 - OBJECT STORAGE
[Diagram: two persistent Cloudera clusters in the customer cloud, one per stage, each with compute (Data Engineering, Analytics, Data Science) and data context (Security, Metadata, Governance). Storage labels: HDFS, cloud object store.]
Journey to the Cloud - Step 3
1 - LIFT AND SHIFT
2 - OBJECT STORAGE
3 - CLOUD NATIVE ARCHITECTURES
[Diagram: stages 1 and 2 as in the previous steps. Stage 3 shows, in the customer cloud, a persistent Cloudera cluster (Director) with compute and data context, transient Altus clusters providing compute for Data Engineering and for Analytics, and a cloud object store labelled storage and data context. The Cloudera Altus control plane runs in the Cloudera cloud.]
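What distinguishes a transient cluster from a persistent one is its lifecycle: it is created for a job and torn down afterwards, while the data stays in the object store. A minimal sketch of that pattern follows. It does not call any Altus or cloud API: `transient_cluster`, `run_job`, the `events` log, and the bucket URIs are hypothetical stand-ins.

```python
from contextlib import contextmanager
from typing import Iterator, List

events: List[str] = []  # stand-in for calls to a real provisioning service


@contextmanager
def transient_cluster(name: str, workers: int) -> Iterator[str]:
    """Create a cluster for the duration of a with-block, then terminate it."""
    events.append(f"create {name} ({workers} workers)")
    try:
        yield name
    finally:
        # Always tear down, even if the job fails, so compute stops accruing cost.
        events.append(f"terminate {name}")


def run_job(cluster: str, input_uri: str, output_uri: str) -> None:
    """Pretend to run a batch job that reads from and writes to object storage."""
    events.append(f"run on {cluster}: {input_uri} -> {output_uri}")


# Input and output live in the object store, so they outlive the cluster.
with transient_cluster("etl-nightly", workers=4) as cluster:
    run_job(cluster, "s3a://example-bucket/raw/", "s3a://example-bucket/curated/")
```

Because storage is decoupled from compute, nothing needs to be copied off the cluster before it is terminated.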
Customer Examples
Many Cloudera customers (Global 5K) use public cloud
• Online retailer
○ Over 2,000 nodes with ~2PB of data in cloud running in an active-active configuration
○ Transforming data with Spark and then analyzing with Apache Hive
• German chain of coffee retailers and cafés
○ 30+ nodes with 50TB of data in cloud
○ Modern Cloudera platform with an Impala data warehouse
• Global information company
○ 70+ nodes in cloud across Microsoft Azure and AWS
○ Replaced Netezza with Hadoop and is leveraging both Impala and Spark for analytics
Cloudera is using cloud as well
Security Use Case
Altus-based solution saved more than 50% in cost compared to the initial implementation
Cloudera Altus
Key Differentiators
• Multi-function: Unified platform for data engineering, data warehouse, and data science
• Multi-cloud: Option for on-premises, public cloud (including multi-cloud), and hybrid
• SDX: Integrated shared data experience across multi-function clusters
Pick the Right Altus Component for Your Needs
Depending on workload and service level
Altus Data Engineering (PaaS)
• Service offering for batch oriented Data Engineering jobs on data in object stores (ADLS, others)
• Usage based pricing
• Runs Apache Spark, Apache Hive and MapReduce jobs
• Provides Workload Analytics to troubleshoot and optimize job performance
Altus Data Warehouse (PaaS)
• Service offering for cloud native data warehouse use cases
• Usage based pricing
• Runs Apache Impala on data stored in object stores (ADLS, others)
• Exposes endpoint to connect BI Tools for visualization
• Offers built-in SQL Editor for ad-hoc data exploration
Altus (IaaS)
• EDH for public cloud which gives customers full cluster control
• Self-managed cloud infrastructure
• Usage or node based pricing
• Full breadth of CDH services available (Apache Kafka, Apache Spark Streaming, CDSW, etc)
• Supports deployments on 5 public cloud platforms
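The choice among the three offerings can be summarized as a small lookup. The sketch below encodes only what this slide states (which engines each PaaS service runs, and that the IaaS option covers the full breadth of CDH services). The function name and the rule that unlisted engines fall through to IaaS are illustrative assumptions, not Cloudera guidance.

```python
# Engines each PaaS offering is described as running on this slide.
PAAS_BY_ENGINE = {
    "spark": "Altus Data Engineering (PaaS)",
    "hive": "Altus Data Engineering (PaaS)",
    "mapreduce": "Altus Data Engineering (PaaS)",
    "impala": "Altus Data Warehouse (PaaS)",
}

IAAS = "Altus (IaaS)"


def pick_altus_component(engine: str, needs_full_cluster_control: bool = False) -> str:
    """Suggest an Altus offering for an engine name such as 'Spark' or 'Impala'."""
    if needs_full_cluster_control:
        # Only the IaaS option is described as giving full cluster control.
        return IAAS
    # Assumption: anything the PaaS services do not list (Kafka, CDSW, ...) goes to IaaS.
    return PAAS_BY_ENGINE.get(engine.strip().lower(), IAAS)
```

For example, `pick_altus_component("Impala")` suggests the data warehouse service, while `pick_altus_component("Kafka")` falls through to the IaaS option.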
Azure Update
Cosmos
Microsoft’s internal data lake
• A data lake for all teams @Microsoft
• Tools approachable by any developer
• Batch, Interactive, Streaming, ML
• Used across Office, Xbox, Azure, Windows, Ads, Bing, Skype, …
By the numbers
• Exabytes of data
• 100Ks of Physical Servers
• Millions of Interactive Queries
• Huge Streaming Pipelines
• 100Ks of Batch Jobs
• 10K+ Developers
Microsoft’s Big Data Service
Azure Data Lake
A data lake for everyone
• The next version of Cosmos
• Fully aligned with Hadoop ecosystem and standards, with full support for Hadoop tools and engines as well as unique Microsoft capabilities
• Migration from Cosmos to ADL is already underway
• External customers on the same service as internal customers
Azure Data Lake Overview
• Ingest all data regardless of requirements
• Store all data in native format without schema definition
• Do analysis using analytic engines like Hadoop
[Diagram labels: Interactive queries, Batch queries, Machine Learning, Data warehouse, Real-time analytics, Devices]
[Diagram: Azure Data Lake architecture. Labels: Windows Azure Blob Storage; Cloudera (Spark, MapReduce, Impala); Azure Key Vault; Azure Active Directory; Azure Data Lake Store in-cluster services; U-SQL; ADL Analytics; Ingestion Service; ADLS Gateway Service; Cosmos API; HDFS++ API; Scope; YARN; ADLS Micro Services; ADL local tier; Azure VMs; Azure remote storage tier.]
ADLS Gen 2
• Preview announced June 2018
• Allows all storage regions to have HDFS API
• Soon available for Cloudera implementations
Azure Data Lake Storage Gen2 Key Features
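In practice, an HDFS-compatible API means Hadoop tools address the store by URI instead of through a separate client. The sketch below builds such a URI, assuming the addressing convention of the Hadoop ABFS connector for ADLS Gen2 (`abfs://` or `abfss://`, container, storage account, path). The helper name and the example container and account names are hypothetical.

```python
def abfs_uri(container: str, account: str, path: str = "", secure: bool = True) -> str:
    """Build an ADLS Gen2 URI in the form used by the Hadoop ABFS connector.

    secure=True selects the TLS scheme (abfss); secure=False selects abfs.
    """
    scheme = "abfss" if secure else "abfs"
    # Strip leading slashes so the path joins cleanly after the host part.
    return f"{scheme}://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


# Hypothetical container and storage account names.
events_uri = abfs_uri("data", "exampleaccount", "/raw/events")
```

A URI built this way is what would be handed to an engine such as Spark or Hive in place of an `hdfs://` path, provided the cluster has the connector configured.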
Demo
Poll Question 2: How do you want to use the Cloud?
• Migrating existing workloads from your on-prem cluster to Azure
• Deploying new data analytics / engineering jobs in Cloud (PaaS / SaaS)
• Interested in both of the above
• Not sure
Cloud Data Analytics / Engineering with Cloudera
ADVANTAGES
• Same solution as on-premises and multi-cloud
• Eliminate data copies
• Single security framework with universally shared metadata
• Easy to track data lineage
• Unified services
BUSINESS VALUE
• Lower risk of data breach
• Analysts more productive on jobs
• Self-service (no shadow IT) and more productive
• IT more strategic, less admin time
• Deployment choices and no lock-in
Ready to try the Cloud?
$10K of free Azure credits!
• Cloudera and Microsoft will offer $10,000 in FREE Azure credits for qualifying opportunities
• To be applied to Azure subscription
• Must be consumed in 60 days
• Must be a Cloudera product running on Microsoft Azure
• Must be tied to a single customer entity for PoC or pilot deployment
• Limited time offer
• Contact azureoffer@cloudera.com
THANK YOU
Appendix
Cloudera Pricing / Acquisition
Acquisition Options
● Pay-as-you-go usage-based pricing
● Node-based license subscription
● Free 30-day trial
● Pre-pay of cloud credits
● Free version that can be deployed in the cloud
Pricing - https://www.cloudera.com/products/pricing.html

More Related Content

What's hot

Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18Cloudera, Inc.
 
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Cloudera, Inc.
 
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18Cloudera, Inc.
 
How to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data PlatformHow to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data PlatformCloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
PaaS or Fail: Rule the Cloud with Altus
PaaS or Fail: Rule the Cloud with AltusPaaS or Fail: Rule the Cloud with Altus
PaaS or Fail: Rule the Cloud with AltusCloudera, Inc.
 
Big data journey to the cloud maz chaudhri 5.30.18
Big data journey to the cloud   maz chaudhri 5.30.18Big data journey to the cloud   maz chaudhri 5.30.18
Big data journey to the cloud maz chaudhri 5.30.18Cloudera, Inc.
 
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Big data journey to the cloud rohit pujari 5.30.18
Big data journey to the cloud   rohit pujari 5.30.18Big data journey to the cloud   rohit pujari 5.30.18
Big data journey to the cloud rohit pujari 5.30.18Cloudera, Inc.
 
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud WorldPart 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud WorldCloudera, Inc.
 
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...Cloudera, Inc.
 
Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Building a Data Hub that Empowers Customer Insight (Technical Workshop)Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Building a Data Hub that Empowers Customer Insight (Technical Workshop)Cloudera, Inc.
 
How Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsHow Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsCloudera, Inc.
 
How komatsu is driving operational efficiencies using io t and machine learni...
How komatsu is driving operational efficiencies using io t and machine learni...How komatsu is driving operational efficiencies using io t and machine learni...
How komatsu is driving operational efficiencies using io t and machine learni...Cloudera, Inc.
 
The Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine LearningThe Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine LearningCloudera, Inc.
 
Making Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the EnterpriseMaking Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the EnterpriseCloudera, Inc.
 

What's hot (20)

Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18
 
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18
 
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
 
How to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data PlatformHow to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
PaaS or Fail: Rule the Cloud with Altus
PaaS or Fail: Rule the Cloud with AltusPaaS or Fail: Rule the Cloud with Altus
PaaS or Fail: Rule the Cloud with Altus
 
Big data journey to the cloud maz chaudhri 5.30.18
Big data journey to the cloud   maz chaudhri 5.30.18Big data journey to the cloud   maz chaudhri 5.30.18
Big data journey to the cloud maz chaudhri 5.30.18
 
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made Easy
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Big data journey to the cloud rohit pujari 5.30.18
Big data journey to the cloud   rohit pujari 5.30.18Big data journey to the cloud   rohit pujari 5.30.18
Big data journey to the cloud rohit pujari 5.30.18
 
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud WorldPart 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
 
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
 
Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Building a Data Hub that Empowers Customer Insight (Technical Workshop)Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Building a Data Hub that Empowers Customer Insight (Technical Workshop)
 
How Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsHow Data Drives Business at Choice Hotels
How Data Drives Business at Choice Hotels
 
How komatsu is driving operational efficiencies using io t and machine learni...
How komatsu is driving operational efficiencies using io t and machine learni...How komatsu is driving operational efficiencies using io t and machine learni...
How komatsu is driving operational efficiencies using io t and machine learni...
 
The Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine LearningThe Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine Learning
 
Making Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the EnterpriseMaking Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the Enterprise
 

Similar to Leveraging the cloud for analytics and machine learning 1.29.19

A deep dive into running data analytic workloads in the cloud
A deep dive into running data analytic workloads in the cloudA deep dive into running data analytic workloads in the cloud
A deep dive into running data analytic workloads in the cloudCloudera, Inc.
 
Big data journey to the cloud 5.30.18 asher bartch
Big data journey to the cloud 5.30.18   asher bartchBig data journey to the cloud 5.30.18   asher bartch
Big data journey to the cloud 5.30.18 asher bartchCloudera, Inc.
 
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...Cloudera, Inc.
 
Cloudera Director: Unlock the Full Potential of Hadoop in the Cloud
Cloudera Director: Unlock the Full Potential of Hadoop in the CloudCloudera Director: Unlock the Full Potential of Hadoop in the Cloud
Cloudera Director: Unlock the Full Potential of Hadoop in the CloudCloudera, Inc.
 
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud Stefan Lipp
 
Cloudera Altus: Big Data in der Cloud einfach gemacht
Cloudera Altus: Big Data in der Cloud einfach gemachtCloudera Altus: Big Data in der Cloud einfach gemacht
Cloudera Altus: Big Data in der Cloud einfach gemachtCloudera, Inc.
 
Cloudera GoDataFest Deploying Cloudera in the Cloud
Cloudera GoDataFest Deploying Cloudera in the CloudCloudera GoDataFest Deploying Cloudera in the Cloud
Cloudera GoDataFest Deploying Cloudera in the CloudGoDataDriven
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...Cloudera, Inc.
 
Five Tips for Running Cloudera on AWS
Five Tips for Running Cloudera on AWSFive Tips for Running Cloudera on AWS
Five Tips for Running Cloudera on AWSCloudera, Inc.
 
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)Cloudera, Inc.
 
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadCloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadDataWorks Summit
 
High-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache ImpalaHigh-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache ImpalaCloudera, Inc.
 
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the CloudPart 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the CloudCloudera, Inc.
 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 
Cloudera, Inc.
 
Altair Leveraging Disruptive Cloud Technologies
Altair Leveraging Disruptive Cloud TechnologiesAltair Leveraging Disruptive Cloud Technologies
Altair Leveraging Disruptive Cloud TechnologiesAltair
 
Self-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft AzureSelf-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft AzureCloudera, Inc.
 
Introduction to cloud computing
Introduction to cloud computingIntroduction to cloud computing
Introduction to cloud computingPUBLEAD (R)
 
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera
Cloudera, Inc.
 
Oracle Cloud : Big Data Use Cases and Architecture
Oracle Cloud : Big Data Use Cases and ArchitectureOracle Cloud : Big Data Use Cases and Architecture
Oracle Cloud : Big Data Use Cases and ArchitectureRiccardo Romani
 

Similar to Leveraging the cloud for analytics and machine learning 1.29.19 (20)

A deep dive into running data analytic workloads in the cloud
A deep dive into running data analytic workloads in the cloudA deep dive into running data analytic workloads in the cloud
A deep dive into running data analytic workloads in the cloud
 
Big data journey to the cloud 5.30.18 asher bartch
Big data journey to the cloud 5.30.18   asher bartchBig data journey to the cloud 5.30.18   asher bartch
Big data journey to the cloud 5.30.18 asher bartch
 
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
 
Cloudera Director: Unlock the Full Potential of Hadoop in the Cloud
Cloudera Director: Unlock the Full Potential of Hadoop in the CloudCloudera Director: Unlock the Full Potential of Hadoop in the Cloud
Cloudera Director: Unlock the Full Potential of Hadoop in the Cloud
 
Hybrid is the New Normal
Hybrid is the New NormalHybrid is the New Normal
Hybrid is the New Normal
 
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
 
Cloudera Altus: Big Data in der Cloud einfach gemacht
Cloudera Altus: Big Data in der Cloud einfach gemachtCloudera Altus: Big Data in der Cloud einfach gemacht
Cloudera Altus: Big Data in der Cloud einfach gemacht
 
Cloudera GoDataFest Deploying Cloudera in the Cloud
Cloudera GoDataFest Deploying Cloudera in the CloudCloudera GoDataFest Deploying Cloudera in the Cloud
Cloudera GoDataFest Deploying Cloudera in the Cloud
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Leveraging the cloud for analytics and machine learning 1.29.19

  • 1. © Cloudera, Inc. All rights reserved. Migrating Analytics and ML to the Cloud Sushant Rao Cloud Product Marketing @ Cloudera Ron Abellera Azure Global Black Belt @ Microsoft Azure
  • 2. Poll Question 1: Where are you in your journey to the Cloud? ● Just started researching options in Cloud ● Starting to test different products / services in Cloud ● Have some deployments and looking to expand in Cloud ● Critical mass in the Cloud
  • 3. Why Cloud? CLOUD BENEFITS CLOUD PROBLEMS • Agility ○ Speed of making changes to meet business / technical needs • Scalable & Elastic ○ Scale up and down quickly • Reliable ○ Multiple options to ensure infrastructure / services are available ○ Tenant isolation ensures different workloads don’t conflict with each other • Other ○ Pay-as-you-go charges only for consumption (but not necessarily cheaper) ○ Self-service enables users to do their work without contacting IT / Data platform team
  • 4. But ... CLOUD PROBLEMS / CLOUD CHALLENGES • Multiple copies of data & Disjointed services ○ Different services have their own copies and may not work together • On-premises integration ○ Data gravity is on-prem, so cloud needs to complement current data platform • Cloud Lock-in ○ Open source prevented lock-in for on-prem. What about cloud? • Shadow IT ○ Individual business units may set up their own cloud deployments, without the architecture, security, and/or governance of the on-prem deployment • Cheaper? ○ On-prem can be more than 2x cheaper than cloud
  • 5. Common Use Cases for Cloud. CORPORATE DIRECTIVE: • C-level has decided to utilize the cloud more • Running out of data center space, looking for more agility / flexibility
  • 6. Common Use Cases for Cloud. CORPORATE DIRECTIVE: • C-level has decided to utilize the cloud more • Running out of data center space, looking for more agility / flexibility. DISASTER RECOVERY: • Back up all data to the cloud, without a second “physical” location • Save time and expense of setting up a physical DR site
  • 7. Common Use Cases for Cloud. CORPORATE DIRECTIVE: • C-level has decided to utilize the cloud more • Running out of data center space, looking for more agility / flexibility. ELASTIC WORKLOADS: • Separate environment for new production workloads or for intermittent, ad-hoc workloads • Takes too long to acquire and set up on-prem infrastructure. DISASTER RECOVERY: • Back up all data to the cloud, without a second “physical” location • Save time and expense of setting up a physical DR site
  • 8. Common Use Cases for Cloud. CORPORATE DIRECTIVE: • C-level has decided to utilize the cloud more • Running out of data center space, looking for more agility / flexibility. SANDBOX: • Environment to test queries and algorithms • Doesn’t impact production cluster as data analysts and engineers test. ELASTIC WORKLOADS: • Separate environment for new production workloads or for intermittent, ad-hoc workloads • Takes too long to acquire and set up on-prem infrastructure. DISASTER RECOVERY: • Back up all data to the cloud, without a second “physical” location • Save time and expense of setting up a physical DR site
  • 9. Cloudera’s Solution for Data Analytics / Engineering in Cloud • The modern platform for machine learning and analytics ○ Numerous functions for all types of jobs and queries • with multiple deployment options ○ On-premises, Public cloud (including multi-cloud), and Hybrid • and one shared data experience ○ Framework for consistent security, governance, and metadata management across applications and deployments
  • 10. The Modern Platform for Machine Learning & Analytics. DATA ENGINEERING (DATA PROCESSING): • Cost efficient • Reliable • Scalable • Based on Spark, MapReduce, Hive & Pig • Supported by Workload Analytics. DATA WAREHOUSE (FAST BI & SQL): • Flexibility • Elastic scale • Go beyond SQL • Based on Impala & Hive • SQL dev environment • Supported by Workload Analytics. DATA SCIENCE (MACHINE LEARNING): • Fast dev to production • Secure self-serve • Based on Python, R, and Spark • ML dev environment (CDSW). OPERATIONAL DATABASE (ONLINE & REAL-TIME): • High throughput, low latency • Strongly consistent • Based on HBase, Kudu & Spark Streaming
  • 11. Cloudera’s Vision for AI and Machine Learning: Modern Enterprise Platform, Tools, and Expert Guidance to help you Unlock Business Value with ML / AI • Agile platform to build, train, and deploy scalable ML applications • Enterprise data science tools to accelerate team productivity • Expert guidance, services & training to fast track value & scale
  • 12. With Multiple Deployment Options Via Cloudera Altus (IaaS) INFRASTRUCTURE SERVICES OPERATIONAL DATABASE DATA ENGINEERING DATA WAREHOUSE DATA SCIENCE DATA ENGINEERING DATA WAREHOUSE Via Cloudera Altus Services (PaaS) Traditional Infrastructure (combined storage and compute) Cloud Infrastructure (decoupled storage and compute) Cloud Infrastructure (decoupled storage and compute)
  • 13. Cloudera Enterprise Data Platform. Benefits for IT infra & ops: • Central control and security • Focus on curating not firefighting. Benefits for users: • Value from single source of truth • Bring the best tools for each job. WORKLOADS DATA SCIENCE DATA WAREHOUSE OPERATIONAL DATABASE DATA ENGINEERING 3RD PARTY SERVICES COMMON SERVICES SECURITY GOVERNANCE LIFECYCLE MANAGEMENT CONTROL PLANE DATA CATALOG STORAGE HDFS Public Cloud Object Storage (S3, ADLS, etc) KUDU Private Cloud Object Storage
  • 14. Journey to the Cloud from On-Prem. CLOUDERA CLUSTER (PERSISTENT) COMPUTE DATA CONTEXT Data Engineering Analytics Data Science Security Metadata Governance STORAGE HDFS ON PREMISES. Current State: ● Multiple workloads and services run in a single cluster ● Data Context (security, metadata, governance) in single cluster. Goals in Journey to the Cloud: ● Get to Cloud with minimal impact and change ● Replicate security groups and permissions in the Cloud ● May require multiple stages to get there ● First step may vary depending on goals ● Need to determine how data will be replicated to the Cloud
  • 15. CUSTOMER CLOUD (AWS, Azure, GCP, etc) Start by Replicating Data to Public Cloud via BDR ON PREMISES STORAGE HDFS PUBLIC CLOUD HDFS CLOUDERA CLUSTER (PERSISTENT) COMPUTE DATA CONTEXT Hive Impala Spark Sentry HMS STORAGE HDFS Navigator BDR
  • 16. CUSTOMER CLOUD Journey to the Cloud - Step 1 CLOUDERA CLUSTER (PERSISTENT) COMPUTE DATA CONTEXT Data Engineering Analytics Data Science Security Metadata Governance STORAGE CLOUD OBJECT STORE 1 - LIFT AND SHIFT HDFS
  • 17. CUSTOMER CLOUD CUSTOMER CLOUD Journey to the Cloud - Step 2 CLOUDERA CLUSTER (PERSISTENT) COMPUTE DATA CONTEXT Data Engineering Analytics Data Science Security Metadata Governance STORAGE CLOUD OBJECT STORE 1 - LIFT AND SHIFT CLOUDERA CLUSTER (PERSISTENT) COMPUTE DATA CONTEXT Data Engineering Analytics Data Science Security Metadata Governance STORAGE CLOUD OBJECT STORE 2 - OBJECT STORAGE HDFS
  • 18. CUSTOMER CLOUD CLOUDERA CLUSTER (PERSISTENT) COMPUTE DATA CONTEXT Data Engineering Analytics Data Science Security Metadata Governance STORAGE CLOUD OBJECT STORE CUSTOMER CLOUD Journey to the Cloud - Step 3 CLOUDERA CLUSTER (PERSISTENT) COMPUTE DATA CONTEXT Data Engineering Analytics Data Science Security Metadata Governance STORAGE CLOUD OBJECT STORE 1 - LIFT AND SHIFT 2 - OBJECT STORAGE HDFS CLOUDERA CLUSTERS (TRANSIENT – ALTUS) COMPUTE Data Engineering CUSTOMER CLOUD CLOUDERA CLOUD CLOUDERA ALTUS CONTROL PLANE STORAGE CLOUD OBJECT STORE DATA CONTEXT CLOUDERA CLUSTER (PERSISTENT – DIRECTOR) COMPUTE DATA CONTEXT CLOUDERA CLUSTERS (TRANSIENT – ALTUS) COMPUTE Analytics 3 - CLOUD NATIVE ARCHITECTURES
  • 19. Customer Examples: Many Cloudera customers (Global 5K) used public cloud • Online retailer ○ Over 2,000 nodes with ~2PB of data in cloud running in an active-active configuration ○ Transforming data with Spark and then analyzing with Apache Hive • German chain of coffee retailers and cafés ○ 30+ nodes with 50TB of data in cloud ○ Modern Cloudera platform with an Impala data warehouse • Global information company ○ 70+ nodes in cloud across Microsoft Azure and AWS ○ Replaced Netezza with Hadoop and leveraging both Impala and Spark for analytics
  • 20. Cloudera is using cloud as well. Security Use Case: Altus-based solution saved more than 50% in cost compared to initial implementation
  • 21. Cloudera Altus Key Differentiators • Multi-function: Unified platform for data engineering, data warehouse, and data science • Multi-cloud: Option for on-premises, Public cloud (including multi-cloud), and Hybrid • SDX: Integrated shared data experience across multi-function clusters
  • 22. Pick the Right Altus Component for Your Needs, depending on workload and service level. Altus Data Engineering (PaaS): • Service offering for batch-oriented Data Engineering jobs on data in object stores (ADLS, others) • Usage-based pricing • Runs Apache Spark, Apache Hive and MapReduce jobs • Provides Workload Analytics to troubleshoot and optimize job performance. Altus Data Warehouse (PaaS): • Service offering for cloud-native data warehouse use cases • Usage-based pricing • Runs Apache Impala on data stored in object stores (ADLS, others) • Exposes endpoint to connect BI Tools for visualization • Offers built-in SQL Editor for ad-hoc data exploration. Altus (IaaS): • EDH for public cloud which gives customers full cluster control • Self-managed cloud infrastructure • Usage or node-based pricing • Full breadth of CDH services available (Apache Kafka, Apache Spark Streaming, CDSW, etc) • Supports deployments on 5 public cloud platforms
  • 23. Azure Update
  • 24. Cosmos Microsoft’s internal data lake • A data lake for all teams @Microsoft • Tools approachable by any developer • Batch, Interactive, Streaming, ML • Used across Office, Xbox, Azure, Windows, Ads, Bing, Skype, … By the numbers • Exabytes of data • 100Ks of Physical Servers • Millions of Interactive Queries • Huge Streaming Pipelines • 100Ks of Batch Jobs • 10K+ Developers Microsoft’s Big Data Service Azure Data Lake A data lake for everyone • The next version of Cosmos • Fully aligned with Hadoop ecosystem and standards, with full support for Hadoop tools and engines as well as unique Microsoft capabilities • Migration from Cosmos to ADL is already underway • External customers on the same service as internal customers
  • 25. • Ingest all data regardless of requirements • Store all data in native format without schema definition • Do analysis using analytic engines like Hadoop. Diagram labels: Interactive queries, Batch queries, Machine Learning, Data warehouse, Real-time analytics, Devices
  • 26. Azure Data Lake Overview Windows Azure Blob Storage Spark MapReduce Impala Cloudera Azure Key Vault Azure Active Dir Azure Data Lake Store – in-cluster services U-SQL ADL Analytics … Ingestion Service ADLS Gateway Service Cosmos API HDFS++ API HDFS++ API Scope YARN ADLS Micro Services ADL local tier Azure VMs Azure remote storage tier
  • 27. ADLS Gen 2 • Preview announced June 2018 • Allows all storage regions to have HDFS API • Soon available for Cloudera implementations
  • 28. Azure Data Lake Storage Gen2 Key Features
  • 29. Demo
  • 30. Poll Question 2: How do you want to use the Cloud? • Migrating existing workloads from your on-prem cluster to Azure • Deploying new data analytics / engineering jobs in Cloud (PaaS / SaaS) • Interested in both of the above • Not sure
  • 31. Cloud Data Analytics / Engineering with Cloudera. ADVANTAGES: • Same solution as on-premises and multi-cloud • Eliminate data copies • Single security framework with universally shared metadata • Easy to track data lineage • Unified services. BUSINESS VALUE: • Lower risk of data breach • Analysts more productive on jobs • Self-service (no shadow IT) and more productive • IT more strategic, less admin time • Deployment choices and no lock-in
  • 32. Ready to try the Cloud? $10K of free Azure credits! • Cloudera and Microsoft will offer $10,000 in FREE Azure credits for qualifying opportunities • To be applied to Azure subscription • Must be consumed in 60 days • Must be a Cloudera product running on Microsoft Azure • Must be tied to a single customer entity for PoC or pilot deployment • Limited time offer • Contact azureoffer@cloudera.com
  • 34. Appendix
  • 35. Cloudera Pricing / Acquisition. Acquisition Options: ● Pay-as-you-go usage-based pricing ● Node-based license subscription ● Free 30-day trial ● Pre-pay of cloud credits ● Free version that can be deployed in the cloud. Pricing - https://www.cloudera.com/products/pricing.html
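
The ADLS Gen2 slide (27) says the service exposes an HDFS API. The slides show no paths or code, so the following is only an illustrative sketch of what that tends to mean in practice: Hadoop-ecosystem engines address ADLS Gen2 data through the `abfs://` / `abfss://` URI scheme of the Hadoop ABFS driver. The helper name `abfs_uri` and the container and account names are hypothetical, chosen for this example.

```python
def abfs_uri(container: str, account: str, path: str = "", secure: bool = True) -> str:
    """Build an ADLS Gen2 URI in the form used by the Hadoop ABFS driver.

    Form: abfs[s]://<container>@<account>.dfs.core.windows.net/<path>
    The function and its argument names are illustrative, not from the slides.
    """
    # "abfss" is the TLS-secured variant of the scheme.
    scheme = "abfss" if secure else "abfs"
    # Drop any leading slash so the result has exactly one after the host.
    clean_path = path.lstrip("/")
    return f"{scheme}://{container}@{account}.dfs.core.windows.net/{clean_path}"


# Hypothetical container "raw" in a hypothetical storage account "examplelake".
print(abfs_uri("raw", "examplelake", "/events/2019"))
```

A Spark, Hive, or Impala job that previously read an `hdfs://` path would, under this assumption, point at a URI of this shape instead once the cluster is configured with credentials for the storage account.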