© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
November 1, 2017 | 11:00 AM PT
Automating Big Data Technologies for Faster Time-to-Value
Today’s Presenters
David Potes, Solutions Architect, Amazon Web Services
Minesh Patel, Technical Director, Qubole
Seth Myers, Senior Data Scientist, Demandbase
Today’s Agenda
1. An overview of AWS and AWS Marketplace, with an emphasis on
AWS data lake solutions and Qubole
2. Overview of the Qubole solutions featured in our story
3. Challenges faced by Demandbase
4. The Demandbase success story with AWS and Qubole
5. Q&A/Discussion
Learning Objectives
1. How to dramatically reduce management complexities for analytics
operations
2. How to reduce the costs of processing and analyzing data in a data
lake on AWS
3. How to operate at the scale and efficiency of a large enterprise,
with a small data team
Introduction to Data Lake
Concepts
Unlocking Data
Most companies and organizations are embarking on
ambitious innovation initiatives to unlock their data.
The data already exists but goes unused or is locked away
from complementary data sets in isolated data silos.
Enter Data Lake Architectures
A data lake is a new and increasingly
popular architecture for storing and analyzing
massive volumes of heterogeneous
types of data.
Benefits of a Data Lake – All Data in One Place
Store and analyze all of your data,
from all of your sources, in one
centralized location.
“Why is the data distributed in
many locations? Where is the
single source of truth?”
Benefits of a Data Lake – Quick Ingest
Quickly ingest data
without needing to force it into a
pre-defined schema.
“How can I collect data quickly
from various sources and store
it efficiently?”
Benefits of a Data Lake – Storage vs Compute
Separating your storage and compute
allows you to scale each component as
required.
“How can I scale up with the
volume of data being generated?”
Benefits of a Data Lake – Schema on Read
“Is there a way I can apply multiple
analytics and processing frameworks
to the same data?”
A Data Lake enables ad-hoc
analysis by applying schemas
on read, not write.
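As a rough illustration of schema on read, the sketch below stores raw events once as untyped JSON lines and lets two readers apply different schemas at query time. The field names and the `read_with_schema` helper are hypothetical and are not part of any AWS or Qubole API.

```python
import json

# Raw events land in the lake as-is; no schema is enforced at write time.
RAW_LINES = [
    '{"user": "a1", "page": "/pricing", "ms": "120"}',
    '{"user": "b2", "page": "/docs", "ms": "45", "ref": "ad"}',
]

def read_with_schema(lines, schema):
    """Parse each raw line and project/cast it through the given schema.

    `schema` maps field name -> cast function; missing fields become None.
    """
    rows = []
    for line in lines:
        record = json.loads(line)
        rows.append({
            name: cast(record[name]) if name in record else None
            for name, cast in schema.items()
        })
    return rows

# Two consumers read the same bytes with different schemas.
latency_view = read_with_schema(RAW_LINES, {"page": str, "ms": int})
referral_view = read_with_schema(RAW_LINES, {"user": str, "ref": str})
```

Neither view required rewriting the stored data; each reader decides which fields matter and how to type them.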
AWS Approach to Data Lake
Amazon S3 is the Data Lake
Designed Benefits of an Amazon S3 Data Lake
Fixed Cluster Data Lake
• Limited to the single tool contained on the cluster (e.g., Hadoop, a data warehouse, or Cassandra), while use cases and ecosystem tools change rapidly
• Expensive to add nodes to add storage capacity
• Expensive to replicate data against node loss
• Complexity in scaling local storage capacity
• Long refresh cycles to add additional storage equipment
Amazon S3 Data Lake
• Decouple storage and compute by making Amazon S3 object-based storage, not a fixed tool cluster, the data lake
• Flexibility to use any and all tools in the ecosystem: the right tool for the job
• Future-proof your architecture: as new use cases and new tools emerge, you can plug and play the current best of breed
Why Amazon S3 for Data Lake?
Durable
• Designed for 11 9s of durability
Available
• Designed for 99.99% availability
High performance
• Multiple upload
• Range GET
Scalable
• Store as much as you need
• Scale storage and compute independently
• No minimum usage commitments
Integrated
• Amazon EMR
• Amazon Redshift
• Amazon DynamoDB
Easy to use
• Simple REST API
• AWS SDKs
• Read-after-create consistency
• Event notification
• Lifecycle policies
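Range GET lets a reader fetch part of an object, so several workers can pull disjoint byte ranges in parallel. The sketch below only computes the HTTP `Range` header values for a given object size; the `byte_ranges` helper and the chunk size are illustrative assumptions, and issuing the actual requests is left to whichever client you use.

```python
def byte_ranges(object_size, chunk_size):
    """Return HTTP Range header values covering [0, object_size) in chunks.

    Range ends are inclusive, per the HTTP Range header syntax.
    """
    if object_size < 0 or chunk_size <= 0:
        raise ValueError("object_size must be >= 0 and chunk_size > 0")
    ranges = []
    start = 0
    while start < object_size:
        # The last chunk may be shorter than chunk_size.
        end = min(start + chunk_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges

# A 250-byte object split into 100-byte reads (hypothetical sizes).
headers = byte_ranges(object_size=250, chunk_size=100)
```

Each header value could then be attached to its own GET request, one per worker.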
Automating Complex Tasks
Qubole makes Big Data technologies swift and simple
About Qubole
One of the largest cloud-agnostic Big Data as a Service
companies
Founded by the pioneers of “big data” at Facebook and the
creators of Apache Hive
Poll Question #1
What is the status of your big data initiative?
The Vision
Qubole Data Service
Amazon
Autonomous Data Management
Qubole Cloud Agents
Total Cost Savings Among Qubole Customers in 2016 and 2017
• Cluster Life Cycle Management: $150M
  – Amount saved by automatically terminating a cluster when inactive
• Workload-aware Autoscaling: $121M
  – Amount saved by predictively adjusting the number of nodes to meet demand
• Spot Shopper: $40M
  – Amount saved by utilizing Spot instances
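The first two savings categories can be pictured with a toy policy: shut a cluster down once it has been idle for a while, and size the cluster from the work that is waiting. This is a simplified, reactive sketch and not Qubole's actual (predictive) algorithm; the idle timeout, the tasks-per-node figure, the node bounds, and the function names are all assumptions made for illustration.

```python
import math

def should_terminate(idle_seconds, idle_timeout_seconds=1800):
    """Cluster life cycle rule: terminate once idle for the full timeout."""
    return idle_seconds >= idle_timeout_seconds

def target_nodes(pending_tasks, tasks_per_node, min_nodes=1, max_nodes=60):
    """Workload-aware sizing: enough nodes for pending work, within bounds."""
    needed = math.ceil(pending_tasks / tasks_per_node)
    return max(min_nodes, min(max_nodes, needed))

# Example: 250 queued tasks at 10 tasks per node calls for 25 nodes.
nodes_for_backlog = target_nodes(pending_tasks=250, tasks_per_node=10)
```

The savings come from the gap between a fixed, always-on cluster and what such rules would actually keep running.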
Architectural Diagram
Poll Question #2
What big data technology are you using or evaluating?
Why Qubole?
Demandbase Automates With
Qubole
Demandbase provides more value for its B2B marketing customers
by automating Big Data and Machine Learning operations.
Who is Demandbase?
Demandbase is a B2B marketing automation company that leverages
artificial intelligence to automate all aspects of the advertising, selling,
and marketing process.
The Challenge
• Many factors determine which accounts a business should target
  • Do they have a need/budget for the product?
  • Are they currently in-market for the product?
  • Do they have decision makers ready to buy?
• These insights must come from many different types of big datasets
• Demandbase’s previous account identification tool took multiple days to run
• Our clients could not iterate on or modify their strategies with such slow turnaround
The Data Used to Identify Accounts
• To determine an account’s need for the product
  • We have firmographic information on 14 million accounts
  • We’ve built a knowledge graph of all accounts using NLP technology that crawls 350 TB of web pages a month
• To determine if an account is in-market
  • We track 700 billion web interactions a year, each one mapped to employees across all accounts
• To identify decision makers
  • We are currently tracking over 100 million employees across all accounts
• All 14M accounts are scored; the top 5K are available to the user
• Keywords extracted from 700B web interactions
• Buyers at each account identified from 100M+ contacts
The Solution
• The user requests a new list of accounts with a button-press
• 60 EC2 servers are spun up
• A machine learning algorithm is built using Spark and MLlib
• For each of 14 million accounts:
  • Information about relevant web interactions, buyers, online content, etc. is fed into the machine learning model
  • The model scores each account
• The top 5K accounts are pushed to the web app, along with relevant info
• From button-press to new account list: 20 minutes
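The scoring step boils down to "score every account, keep the best K". A minimal single-machine sketch follows; the real pipeline runs on Spark, and the linear `score` function, the feature names, and the weights here are made up for illustration because the deck does not describe the actual model.

```python
import heapq

# Hypothetical weights; stand-ins for whatever the trained model learns.
WEIGHTS = {"fit": 0.5, "intent": 0.3, "buyers": 0.2}

def score(features):
    """Weighted sum of an account's features (stand-in for the ML model)."""
    return sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)

def top_accounts(accounts, k):
    """Return the k highest-scoring (account_id, score) pairs, best first."""
    scored = ((account_id, score(feats)) for account_id, feats in accounts.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])

# Toy data: three invented accounts instead of 14 million.
accounts = {
    "acme": {"fit": 0.9, "intent": 0.2, "buyers": 0.5},
    "globex": {"fit": 0.4, "intent": 0.9, "buyers": 0.9},
    "initech": {"fit": 0.1, "intent": 0.1, "buyers": 0.1},
}
best = top_accounts(accounts, k=2)

# Back-of-the-envelope throughput implied by the slide's numbers, assuming
# (generously) that the whole 20 minutes is spent scoring.
accounts_per_second = 14_000_000 / (20 * 60)      # about 11,667 overall
per_server_per_second = accounts_per_second / 60  # about 194 per server
```

`heapq.nlargest` avoids sorting the full list, which matters when K (5K) is tiny compared with the number of scored accounts (14 million).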
Qubole Makes This Possible
• Qubole manages all of our EC2 instances
• So far, we’ve successfully tested 20 different concurrent models (20 × 60 EC2 servers)
• Qubole keeps our costs down through dynamic bidding and heterogeneous server clusters
• Our web app calls Qubole’s easy-to-implement Play API, which spins up the EC2 instances and deploys our Spark job
• With Qubole taking care of the infrastructure, we could focus on developing the machine learning
• Qubole allowed us to build a self-serve machine-learning-as-a-service solution
Next Steps and Further Information
• Try a pre-configured production-ready Qubole deployment on AWS Data Lake:
• https://aws.amazon.com/quickstart/architecture/qubole-on-data-lake-foundation/
• Buy on AWS Marketplace:
• https://aws.amazon.com/marketplace/pp/B06XX76R24
• Learn more about Qubole:
• https://www.qubole.com/products/qds-for-aws/
• Learn more about Demandbase:
• https://www.demandbase.com/technology/
• Try AWS:
• https://aws.amazon.com/
Q & A
Thank you!

More Related Content

What's hot

Serverless DevOps to the Rescue - SRV330 - re:Invent 2017
Serverless DevOps to the Rescue - SRV330 - re:Invent 2017Serverless DevOps to the Rescue - SRV330 - re:Invent 2017
Serverless DevOps to the Rescue - SRV330 - re:Invent 2017Amazon Web Services
 
Born in the Cloud, Built like a Startup
Born in the Cloud, Built like a StartupBorn in the Cloud, Built like a Startup
Born in the Cloud, Built like a StartupAmazon Web Services
 
Building Serverless Websites with Lambda@Edge - AWS Online Tech Talks
Building Serverless Websites with Lambda@Edge - AWS Online Tech TalksBuilding Serverless Websites with Lambda@Edge - AWS Online Tech Talks
Building Serverless Websites with Lambda@Edge - AWS Online Tech TalksAmazon Web Services
 
You Don’t Need A Mobile App! Responsive Web Apps Using AWS
You Don’t Need A Mobile App! Responsive Web Apps Using AWSYou Don’t Need A Mobile App! Responsive Web Apps Using AWS
You Don’t Need A Mobile App! Responsive Web Apps Using AWSAmazon Web Services
 
Building Serverless Microservices with AWS
Building Serverless Microservices with AWSBuilding Serverless Microservices with AWS
Building Serverless Microservices with AWSDonnie Prakoso
 
規劃大規模遷移到 AWS 的最佳實踐
規劃大規模遷移到 AWS 的最佳實踐規劃大規模遷移到 AWS 的最佳實踐
規劃大規模遷移到 AWS 的最佳實踐Amazon Web Services
 
Dow Jones & Wall Street Journal's journey to manage traffic spikes while miti...
Dow Jones & Wall Street Journal's journey to manage traffic spikes while miti...Dow Jones & Wall Street Journal's journey to manage traffic spikes while miti...
Dow Jones & Wall Street Journal's journey to manage traffic spikes while miti...Amazon Web Services
 
Scaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million UsersScaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million UsersAmazon Web Services
 
Introduction to GraphQL and AWS Appsync on AWS - iOS
Introduction to GraphQL and AWS Appsync on AWS - iOSIntroduction to GraphQL and AWS Appsync on AWS - iOS
Introduction to GraphQL and AWS Appsync on AWS - iOSAmazon Web Services
 
Comparing Compute Options for Microservices - AWS Summti Sydney 2018
Comparing Compute Options for Microservices - AWS Summti Sydney 2018Comparing Compute Options for Microservices - AWS Summti Sydney 2018
Comparing Compute Options for Microservices - AWS Summti Sydney 2018Amazon Web Services
 
MBL201_Progressive Web Apps in the Real World
MBL201_Progressive Web Apps in the Real WorldMBL201_Progressive Web Apps in the Real World
MBL201_Progressive Web Apps in the Real WorldAmazon Web Services
 
Building Manageable Windows Workloads - ARC324 - re:Invent 2017
Building Manageable Windows Workloads - ARC324 - re:Invent 2017Building Manageable Windows Workloads - ARC324 - re:Invent 2017
Building Manageable Windows Workloads - ARC324 - re:Invent 2017Amazon Web Services
 
Application Performance Management on AWS
Application Performance Management on AWSApplication Performance Management on AWS
Application Performance Management on AWSAmazon Web Services
 
CON202-Getting Started with Docker and Amazon ECS
CON202-Getting Started with Docker and Amazon ECSCON202-Getting Started with Docker and Amazon ECS
CON202-Getting Started with Docker and Amazon ECSAmazon Web Services
 
Set it and Forget it: Auto Scaling Target Tracking Policies - AWS Online Tech...
Set it and Forget it: Auto Scaling Target Tracking Policies - AWS Online Tech...Set it and Forget it: Auto Scaling Target Tracking Policies - AWS Online Tech...
Set it and Forget it: Auto Scaling Target Tracking Policies - AWS Online Tech...Amazon Web Services
 
WIN401_Migrating Microsoft Applications to AWS
WIN401_Migrating Microsoft Applications to AWSWIN401_Migrating Microsoft Applications to AWS
WIN401_Migrating Microsoft Applications to AWSAmazon Web Services
 
Build a Serverless Web Application in One Day
Build a Serverless Web Application in One DayBuild a Serverless Web Application in One Day
Build a Serverless Web Application in One DayAmazon Web Services
 
How to Build Scalable Serverless Applications
How to Build Scalable Serverless ApplicationsHow to Build Scalable Serverless Applications
How to Build Scalable Serverless ApplicationsAmazon Web Services
 
WIN204-Simplifying Microsoft Architectures with AWS Services
WIN204-Simplifying Microsoft Architectures with AWS ServicesWIN204-Simplifying Microsoft Architectures with AWS Services
WIN204-Simplifying Microsoft Architectures with AWS ServicesAmazon Web Services
 

What's hot (20)

Serverless DevOps to the Rescue - SRV330 - re:Invent 2017
Serverless DevOps to the Rescue - SRV330 - re:Invent 2017Serverless DevOps to the Rescue - SRV330 - re:Invent 2017
Serverless DevOps to the Rescue - SRV330 - re:Invent 2017
 
Born in the Cloud, Built like a Startup
Born in the Cloud, Built like a StartupBorn in the Cloud, Built like a Startup
Born in the Cloud, Built like a Startup
 
Building Serverless Websites with Lambda@Edge - AWS Online Tech Talks
Building Serverless Websites with Lambda@Edge - AWS Online Tech TalksBuilding Serverless Websites with Lambda@Edge - AWS Online Tech Talks
Building Serverless Websites with Lambda@Edge - AWS Online Tech Talks
 
You Don’t Need A Mobile App! Responsive Web Apps Using AWS
You Don’t Need A Mobile App! Responsive Web Apps Using AWSYou Don’t Need A Mobile App! Responsive Web Apps Using AWS
You Don’t Need A Mobile App! Responsive Web Apps Using AWS
 
Building Serverless Microservices with AWS
Building Serverless Microservices with AWSBuilding Serverless Microservices with AWS
Building Serverless Microservices with AWS
 
規劃大規模遷移到 AWS 的最佳實踐
規劃大規模遷移到 AWS 的最佳實踐規劃大規模遷移到 AWS 的最佳實踐
規劃大規模遷移到 AWS 的最佳實踐
 
Dow Jones & Wall Street Journal's journey to manage traffic spikes while miti...
Dow Jones & Wall Street Journal's journey to manage traffic spikes while miti...Dow Jones & Wall Street Journal's journey to manage traffic spikes while miti...
Dow Jones & Wall Street Journal's journey to manage traffic spikes while miti...
 
Scaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million UsersScaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million Users
 
Introduction to GraphQL and AWS Appsync on AWS - iOS
Introduction to GraphQL and AWS Appsync on AWS - iOSIntroduction to GraphQL and AWS Appsync on AWS - iOS
Introduction to GraphQL and AWS Appsync on AWS - iOS
 
Comparing Compute Options for Microservices - AWS Summti Sydney 2018
Comparing Compute Options for Microservices - AWS Summti Sydney 2018Comparing Compute Options for Microservices - AWS Summti Sydney 2018
Comparing Compute Options for Microservices - AWS Summti Sydney 2018
 
MBL201_Progressive Web Apps in the Real World
MBL201_Progressive Web Apps in the Real WorldMBL201_Progressive Web Apps in the Real World
MBL201_Progressive Web Apps in the Real World
 
Building Manageable Windows Workloads - ARC324 - re:Invent 2017
Building Manageable Windows Workloads - ARC324 - re:Invent 2017Building Manageable Windows Workloads - ARC324 - re:Invent 2017
Building Manageable Windows Workloads - ARC324 - re:Invent 2017
 
Application Performance Management on AWS
Application Performance Management on AWSApplication Performance Management on AWS
Application Performance Management on AWS
 
SID402_An AWS Security Odyssey
SID402_An AWS Security OdysseySID402_An AWS Security Odyssey
SID402_An AWS Security Odyssey
 
CON202-Getting Started with Docker and Amazon ECS
CON202-Getting Started with Docker and Amazon ECSCON202-Getting Started with Docker and Amazon ECS
CON202-Getting Started with Docker and Amazon ECS
 
Set it and Forget it: Auto Scaling Target Tracking Policies - AWS Online Tech...
Set it and Forget it: Auto Scaling Target Tracking Policies - AWS Online Tech...Set it and Forget it: Auto Scaling Target Tracking Policies - AWS Online Tech...
Set it and Forget it: Auto Scaling Target Tracking Policies - AWS Online Tech...
 
WIN401_Migrating Microsoft Applications to AWS
WIN401_Migrating Microsoft Applications to AWSWIN401_Migrating Microsoft Applications to AWS
WIN401_Migrating Microsoft Applications to AWS
 
Build a Serverless Web Application in One Day
Build a Serverless Web Application in One DayBuild a Serverless Web Application in One Day
Build a Serverless Web Application in One Day
 
How to Build Scalable Serverless Applications
How to Build Scalable Serverless ApplicationsHow to Build Scalable Serverless Applications
How to Build Scalable Serverless Applications
 
WIN204-Simplifying Microsoft Architectures with AWS Services
WIN204-Simplifying Microsoft Architectures with AWS ServicesWIN204-Simplifying Microsoft Architectures with AWS Services
WIN204-Simplifying Microsoft Architectures with AWS Services
 

Viewers also liked

Getting Started with Amazon EC2 Container Service
Getting Started with Amazon EC2 Container ServiceGetting Started with Amazon EC2 Container Service
Getting Started with Amazon EC2 Container ServiceAmazon Web Services
 
Managing Container Images with Amazon ECR - AWS Online Tech Talks
Managing Container Images with Amazon ECR - AWS Online Tech TalksManaging Container Images with Amazon ECR - AWS Online Tech Talks
Managing Container Images with Amazon ECR - AWS Online Tech TalksAmazon Web Services
 
Tips and Tricks for Running Container Workloads on AWS
Tips and Tricks for Running Container Workloads on AWSTips and Tricks for Running Container Workloads on AWS
Tips and Tricks for Running Container Workloads on AWSAmazon Web Services
 
Automating Amazon Inspector Assessments and Findings Remediation
Automating Amazon Inspector Assessments and Findings RemediationAutomating Amazon Inspector Assessments and Findings Remediation
Automating Amazon Inspector Assessments and Findings RemediationAmazon Web Services
 
Building a Global Sales Channel with AWS Marketplace
Building a Global Sales Channel with AWS MarketplaceBuilding a Global Sales Channel with AWS Marketplace
Building a Global Sales Channel with AWS MarketplaceAmazon Web Services
 
Building on AWS: Optimizing & Delivering
Building on AWS: Optimizing & DeliveringBuilding on AWS: Optimizing & Delivering
Building on AWS: Optimizing & DeliveringAmazon Web Services
 
Introduction to Amazon Web Services
Introduction to Amazon Web ServicesIntroduction to Amazon Web Services
Introduction to Amazon Web ServicesAmazon Web Services
 
Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Machine ...
Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Machine ...Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Machine ...
Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Machine ...Amazon Web Services
 
Serverless by Example: Building a Real-Time Chat System
Serverless by Example: Building a Real-Time Chat SystemServerless by Example: Building a Real-Time Chat System
Serverless by Example: Building a Real-Time Chat SystemAmazon Web Services
 
Cloud Economics; How to Quantify the Benefits of Moving to the Cloud - Transf...
Cloud Economics; How to Quantify the Benefits of Moving to the Cloud - Transf...Cloud Economics; How to Quantify the Benefits of Moving to the Cloud - Transf...
Cloud Economics; How to Quantify the Benefits of Moving to the Cloud - Transf...Amazon Web Services
 
Delivering DevOps on AWS - Transformation Day Public Sector London 2017
Delivering DevOps on AWS - Transformation Day Public Sector London 2017Delivering DevOps on AWS - Transformation Day Public Sector London 2017
Delivering DevOps on AWS - Transformation Day Public Sector London 2017Amazon Web Services
 
Maturing your organization from DevOps to DevSecOps
Maturing your organization from DevOps to DevSecOpsMaturing your organization from DevOps to DevSecOps
Maturing your organization from DevOps to DevSecOpsAmazon Web Services
 
AWS サービスアップデートまとめ re:Invent 2017 直前編
AWS サービスアップデートまとめ re:Invent 2017 直前編AWS サービスアップデートまとめ re:Invent 2017 直前編
AWS サービスアップデートまとめ re:Invent 2017 直前編Amazon Web Services Japan
 
Deep Dive on Amazon EC2 Instances - AWS Summit Cape Town 2017
Deep Dive on Amazon EC2 Instances - AWS Summit Cape Town 2017Deep Dive on Amazon EC2 Instances - AWS Summit Cape Town 2017
Deep Dive on Amazon EC2 Instances - AWS Summit Cape Town 2017Amazon Web Services
 
Amazon EC2 and Amazon VPC Hands-on Workshop
Amazon EC2 and Amazon VPC Hands-on WorkshopAmazon EC2 and Amazon VPC Hands-on Workshop
Amazon EC2 and Amazon VPC Hands-on WorkshopAmazon Web Services
 
Build on AWS: Migrating And Platforming
Build on AWS: Migrating And PlatformingBuild on AWS: Migrating And Platforming
Build on AWS: Migrating And PlatformingAmazon Web Services
 
Deep dive on Serverless application development
Deep dive on Serverless application developmentDeep dive on Serverless application development
Deep dive on Serverless application developmentAmazon Web Services
 

Viewers also liked (20)

Getting Started with Amazon EC2 Container Service
Getting Started with Amazon EC2 Container ServiceGetting Started with Amazon EC2 Container Service
Getting Started with Amazon EC2 Container Service
 
Managing Container Images with Amazon ECR - AWS Online Tech Talks
Managing Container Images with Amazon ECR - AWS Online Tech TalksManaging Container Images with Amazon ECR - AWS Online Tech Talks
Managing Container Images with Amazon ECR - AWS Online Tech Talks
 
Tips and Tricks for Running Container Workloads on AWS
Tips and Tricks for Running Container Workloads on AWSTips and Tricks for Running Container Workloads on AWS
Tips and Tricks for Running Container Workloads on AWS
 
Keynote - Security is Coming
Keynote - Security is ComingKeynote - Security is Coming
Keynote - Security is Coming
 
Automating Amazon Inspector Assessments and Findings Remediation
Automating Amazon Inspector Assessments and Findings RemediationAutomating Amazon Inspector Assessments and Findings Remediation
Automating Amazon Inspector Assessments and Findings Remediation
 
Building a Global Sales Channel with AWS Marketplace
Building a Global Sales Channel with AWS MarketplaceBuilding a Global Sales Channel with AWS Marketplace
Building a Global Sales Channel with AWS Marketplace
 
Building on AWS: Optimizing & Delivering
Building on AWS: Optimizing & DeliveringBuilding on AWS: Optimizing & Delivering
Building on AWS: Optimizing & Delivering
 
Introduction to Amazon Web Services
Introduction to Amazon Web ServicesIntroduction to Amazon Web Services
Introduction to Amazon Web Services
 
Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Machine ...
Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Machine ...Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Machine ...
Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Machine ...
 
Serverless by Example: Building a Real-Time Chat System
Serverless by Example: Building a Real-Time Chat SystemServerless by Example: Building a Real-Time Chat System
Serverless by Example: Building a Real-Time Chat System
 
Cloud Economics; How to Quantify the Benefits of Moving to the Cloud - Transf...
Cloud Economics; How to Quantify the Benefits of Moving to the Cloud - Transf...Cloud Economics; How to Quantify the Benefits of Moving to the Cloud - Transf...
Cloud Economics; How to Quantify the Benefits of Moving to the Cloud - Transf...
 
Delivering DevOps on AWS - Transformation Day Public Sector London 2017
Delivering DevOps on AWS - Transformation Day Public Sector London 2017Delivering DevOps on AWS - Transformation Day Public Sector London 2017
Delivering DevOps on AWS - Transformation Day Public Sector London 2017
 
AWS Security Fundamentals
AWS Security FundamentalsAWS Security Fundamentals
AWS Security Fundamentals
 
Maturing your organization from DevOps to DevSecOps
Maturing your organization from DevOps to DevSecOpsMaturing your organization from DevOps to DevSecOps
Maturing your organization from DevOps to DevSecOps
 
Deep Dive on Big Data
Deep Dive on Big Data Deep Dive on Big Data
Deep Dive on Big Data
 
AWS サービスアップデートまとめ re:Invent 2017 直前編
AWS サービスアップデートまとめ re:Invent 2017 直前編AWS サービスアップデートまとめ re:Invent 2017 直前編
AWS サービスアップデートまとめ re:Invent 2017 直前編
 
Deep Dive on Amazon EC2 Instances - AWS Summit Cape Town 2017
Deep Dive on Amazon EC2 Instances - AWS Summit Cape Town 2017Deep Dive on Amazon EC2 Instances - AWS Summit Cape Town 2017
Deep Dive on Amazon EC2 Instances - AWS Summit Cape Town 2017
 
Amazon EC2 and Amazon VPC Hands-on Workshop
Amazon EC2 and Amazon VPC Hands-on WorkshopAmazon EC2 and Amazon VPC Hands-on Workshop
Amazon EC2 and Amazon VPC Hands-on Workshop
 
Build on AWS: Migrating And Platforming
Build on AWS: Migrating And PlatformingBuild on AWS: Migrating And Platforming
Build on AWS: Migrating And Platforming
 
Deep dive on Serverless application development
Deep dive on Serverless application developmentDeep dive on Serverless application development
Deep dive on Serverless application development
 

Similar to Automating Big Data Technologies for Faster Time-to-Value

TiVo: How to Scale New Products with a Data Lake on AWS and Qubole
 TiVo: How to Scale New Products with a Data Lake on AWS and Qubole TiVo: How to Scale New Products with a Data Lake on AWS and Qubole
TiVo: How to Scale New Products with a Data Lake on AWS and QuboleAmazon Web Services
 
TiVo: How to Scale New Products with a Data Lake on AWS and Qubole
 TiVo: How to Scale New Products with a Data Lake on AWS and Qubole TiVo: How to Scale New Products with a Data Lake on AWS and Qubole
TiVo: How to Scale New Products with a Data Lake on AWS and QuboleAmazon Web Services
 
Architecting an Open Data Lake for the Enterprise
Architecting an Open Data Lake for the EnterpriseArchitecting an Open Data Lake for the Enterprise
Architecting an Open Data Lake for the EnterpriseAmazon Web Services
 
100 Billion Data Points With Lambda_AWSPSSummit_Singapore
100 Billion Data Points With Lambda_AWSPSSummit_Singapore100 Billion Data Points With Lambda_AWSPSSummit_Singapore
100 Billion Data Points With Lambda_AWSPSSummit_SingaporeAmazon Web Services
 
McGraw-Hill Optimizes Analytics Workloads with Databricks
 McGraw-Hill Optimizes Analytics Workloads with Databricks McGraw-Hill Optimizes Analytics Workloads with Databricks
McGraw-Hill Optimizes Analytics Workloads with DatabricksAmazon Web Services
 
FSV305-Optimizing Payments Collections with Containers and Machine Learning
FSV305-Optimizing Payments Collections with Containers and Machine LearningFSV305-Optimizing Payments Collections with Containers and Machine Learning
FSV305-Optimizing Payments Collections with Containers and Machine LearningAmazon Web Services
 
ARC201_Scaling Up to Your First 10 Million Users
ARC201_Scaling Up to Your First 10 Million UsersARC201_Scaling Up to Your First 10 Million Users
ARC201_Scaling Up to Your First 10 Million UsersAmazon Web Services
 
GPSWKS301_Comprehensive Big Data Architecture Made Easy
GPSWKS301_Comprehensive Big Data Architecture Made EasyGPSWKS301_Comprehensive Big Data Architecture Made Easy
GPSWKS301_Comprehensive Big Data Architecture Made EasyAmazon Web Services
 
Comprehensive Big Data Analytics Architecture Made Easy - The AWS Marketplace...
Comprehensive Big Data Analytics Architecture Made Easy - The AWS Marketplace...Comprehensive Big Data Analytics Architecture Made Easy - The AWS Marketplace...
Comprehensive Big Data Analytics Architecture Made Easy - The AWS Marketplace...Amazon Web Services
 
How Amazon.com uses AWS Analytics
How Amazon.com uses AWS AnalyticsHow Amazon.com uses AWS Analytics
How Amazon.com uses AWS AnalyticsAmazon Web Services
 
How Amazon.com uses AWS Analytics
How Amazon.com uses AWS AnalyticsHow Amazon.com uses AWS Analytics
How Amazon.com uses AWS AnalyticsAmazon Web Services
 
ABD307_Deep Analytics for Global AWS Marketing Organization
ABD307_Deep Analytics for Global AWS Marketing OrganizationABD307_Deep Analytics for Global AWS Marketing Organization
ABD307_Deep Analytics for Global AWS Marketing OrganizationAmazon Web Services
 
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...Amazon Web Services
 
A Look Under the Hood – How Amazon.com Uses AWS Services for Analytics at Mas...
A Look Under the Hood – How Amazon.com Uses AWS Services for Analytics at Mas...A Look Under the Hood – How Amazon.com Uses AWS Services for Analytics at Mas...
A Look Under the Hood – How Amazon.com Uses AWS Services for Analytics at Mas...Amazon Web Services
 
AWS reInvent 2017 recap - Optimizing Costs as You Scale on AWS
AWS reInvent 2017 recap - Optimizing Costs as You Scale on AWSAWS reInvent 2017 recap - Optimizing Costs as You Scale on AWS
AWS reInvent 2017 recap - Optimizing Costs as You Scale on AWSAmazon Web Services
 
How Amazon.com Uses AWS Analytics
How Amazon.com Uses AWS AnalyticsHow Amazon.com Uses AWS Analytics
How Amazon.com Uses AWS AnalyticsAmazon Web Services
 
Design, Build, and Modernize Your Web Applications with AWS
 Design, Build, and Modernize Your Web Applications with AWS Design, Build, and Modernize Your Web Applications with AWS
Design, Build, and Modernize Your Web Applications with AWSDonnie Prakoso
 
How a Global Healthcare Company Built a Migration Factory to Quickly Move Tho...
How a Global Healthcare Company Built a Migration Factory to Quickly Move Tho...How a Global Healthcare Company Built a Migration Factory to Quickly Move Tho...
How a Global Healthcare Company Built a Migration Factory to Quickly Move Tho...Amazon Web Services
 
STG206_Big Data Data Lakes and Data Oceans
STG206_Big Data Data Lakes and Data OceansSTG206_Big Data Data Lakes and Data Oceans
STG206_Big Data Data Lakes and Data OceansAmazon Web Services
 

Automating Big Data Technologies for Faster Time-to-Value

  • 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. November 1, 2017 | 11:00 AM PT. Automating Big Data Technologies for Faster Time-to-Value
  • 2. Today’s Presenters
      • David Potes, Solutions Architect, Amazon Web Services
      • Minesh Patel, Technical Director, Qubole
      • Seth Myers, Senior Data Scientist, Demandbase
  • 3. Today’s Agenda
      1. An overview of AWS and AWS Marketplace, with an emphasis on AWS data lake solutions and Qubole
      2. Overview of the Qubole solutions featured in our story
      3. Challenges faced by Demandbase
      4. The Demandbase success story with AWS and Qubole
      5. Q&A/Discussion
  • 4. Learning Objectives
      1. How to dramatically reduce management complexities for analytics operations
      2. How to reduce the costs of processing and analyzing data in a data lake on AWS
      3. How to operate at the scale and efficiency of a large enterprise, with a small data team
  • 5. Introduction to Data Lake Concepts
  • 6. Unlocking Data
      Most companies and organizations are embarking on ambitious innovation initiatives to unlock their data. The data already exists but goes unused or is locked away from complementary data sets in isolated data silos.
  • 7. Enter Data Lake Architectures
      A data lake is a new and increasingly popular architecture for storing and analyzing massive volumes of heterogeneous data.
  • 8. Benefits of a Data Lake – All Data in One Place
      Store and analyze all of your data, from all of your sources, in one centralized location.
      “Why is the data distributed in many locations? Where is the single source of truth?”
  • 9. Benefits of a Data Lake – Quick Ingest
      Quickly ingest data without needing to force it into a pre-defined schema.
      “How can I collect data quickly from various sources and store it efficiently?”
  • 10. Benefits of a Data Lake – Storage vs Compute
      Separating your storage and compute allows you to scale each component as required.
      “How can I scale up with the volume of data being generated?”
  • 11. Benefits of a Data Lake – Schema on Read
      A data lake enables ad hoc analysis by applying schemas on read, not on write.
      “Is there a way I can apply multiple analytics and processing frameworks to the same data?”
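
The schema-on-read idea on slide 11 can be illustrated with a minimal sketch. Nothing here comes from the presentation itself: the sample events, the field names, and the `read_with_schema` helper are all hypothetical, and plain Python stands in for whichever processing framework reads the lake.

```python
import json

# Raw events stored exactly as they arrived; no schema is enforced at write time.
# (Hypothetical sample data, not from the presentation.)
RAW_EVENTS = [
    '{"user": "a", "page": "/pricing", "ms": 1200}',
    '{"user": "b", "page": "/docs", "ms": 300, "referrer": "search"}',
    '{"user": "a", "page": "/docs"}',
]


def read_with_schema(lines, schema):
    """Project each raw record onto `schema` at read time.

    `schema` maps a field name to a (cast, default) pair; fields missing
    from a record take the default, and fields not in the schema are ignored.
    """
    for line in lines:
        record = json.loads(line)
        yield {
            field: cast(record[field]) if field in record else default
            for field, (cast, default) in schema.items()
        }


# Two consumers apply different schemas to the same stored data.
PAGE_VIEW_SCHEMA = {"user": (str, None), "page": (str, None)}
LATENCY_SCHEMA = {"page": (str, None), "ms": (int, 0)}

page_views = list(read_with_schema(RAW_EVENTS, PAGE_VIEW_SCHEMA))
latencies = list(read_with_schema(RAW_EVENTS, LATENCY_SCHEMA))
```

Because the stored records are never rewritten, a third consumer could add its own schema later without touching the ingest path, which is the flexibility the slide's question is asking about.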
  • 12. AWS Approach to Data Lake
  • 13. Amazon S3 is the Data Lake
  • 14. Designed Benefits of an Amazon S3 Data Lake
      Fixed Cluster Data Lake
        • Limited to the single tool contained on the cluster (e.g. Hadoop, a data warehouse, or Cassandra), while use cases and ecosystem tools change rapidly
        • Expensive to add nodes to add storage capacity
        • Expensive to replicate data against node loss
        • Complexity in scaling local storage capacity
        • Long refresh cycles to add additional storage equipment
      Amazon S3 Data Lake
        • Decouple storage and compute by making Amazon S3 object-based storage, not a fixed tool cluster, the data lake
        • Flexibility to use any and all tools in the ecosystem: the right tool for the job
        • Future-proof your architecture: as new use cases and new tools emerge, you can plug and play the current best of breed
  • 15. Why Amazon S3 for Data Lake?
      Durable
        • Designed for 11 9s of durability
      Available
        • Designed for 99.99% availability
      High performance
        • Multiple upload
        • Range GET
      Scalable
        • Store as much as you need
        • Scale storage and compute independently
        • No minimum usage commitments
      Integrated
        • Amazon EMR
        • Amazon Redshift
        • Amazon DynamoDB
      Easy to use
        • Simple REST API
        • AWS SDKs
        • Read-after-create consistency
        • Event notification
        • Lifecycle policies
  • 16. Automating Complex Tasks
      Qubole makes Big Data technologies swift and simple
  • 17. About Qubole
      • One of the largest cloud-agnostic Big Data as a Service companies
      • Founded by the pioneers of “big data” @ Facebook and the creators of Apache Hive
  • 18. Poll Question #1
      What is the status of your big data initiative?
  • 19. The Vision
  • 20. Qubole Data Service Amazon
  • 21. Autonomous Data Management
  • 22. Qubole Cloud Agents
  • 23. Total Cost Savings Among Qubole Customers in 2016 and 2017
      • Cluster Life Cycle Management: $150M (amount saved by automatically terminating a cluster when inactive)
      • Workload-aware Autoscaling: $121M (amount saved by predictively adjusting the number of nodes to meet demand)
      • Spot Shopper: $40M (amount saved by utilizing Spot instances)
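
As a quick check of the arithmetic on slide 23, the three figures can be summed and expressed as shares. This is only a sketch: the dollar amounts are the ones stated on the slide, while the combined total and the percentages are derived here and do not appear in the presentation.

```python
# Savings figures from slide 23, in millions of US dollars.
SAVINGS_MILLIONS_USD = {
    "Cluster Life Cycle Management": 150,
    "Workload-aware Autoscaling": 121,
    "Spot Shopper": 40,
}

# Combined savings across the three categories (derived, not stated on the slide).
total_millions = sum(SAVINGS_MILLIONS_USD.values())

# Each category's share of the combined total, as a percentage to one decimal place.
shares_percent = {
    name: round(100 * amount / total_millions, 1)
    for name, amount in SAVINGS_MILLIONS_USD.items()
}

for name, amount in SAVINGS_MILLIONS_USD.items():
    print(f"{name}: ${amount}M ({shares_percent[name]}%)")
print(f"Total: ${total_millions}M")
```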
  • 24. Architectural Diagram
  • 25. Poll Question #2
      What big data technology are you using or evaluating?
  • 26. Why Qubole?
  • 27. Demandbase Automates With Qubole
      Demandbase provides more value for their B2B marketing customers by automating Big Data and Machine Learning operations.
  • 28. Who is Demandbase?
      Demandbase is a B2B marketing automation company that leverages artificial intelligence to automate all aspects of the advertising, selling, and marketing process.
  • 29. The Challenge
      • Many factors determine which accounts a business should target
        • Do they have a need/budget for the product?
        • Are they currently in-market for the product?
        • Do they have decision makers ready to buy?
      • These insights must come from many different types of big datasets
      • Demandbase’s previous account identification tool took multiple days to run
      • Our clients could not iterate or modify their strategies with such slow turn-around
  • 30. The Data Used to Identify Accounts
      • To determine an account’s need for the product
        • We have firmographic information on 14 million accounts
        • We’ve built a knowledge graph of all accounts using NLP technology that crawls 350 TB of web pages a month
      • To determine if an account is in-market
        • We track 700 billion web interactions a year, each one mapped to employees across all accounts
      • To identify decision makers
        • We are currently tracking over 100 million employees across all accounts
  • 31. [Figure: All 14M accounts are scored, top 5K available to user. Keywords extracted from 700B web interactions. Buyers at each account identified from 100M+ contacts.]
  • 32. The Solution
      • The user requests a new list of accounts with a button-press
      • 60 EC2 servers are spun up
      • A machine learning algorithm is built using Spark and MLlib
      • For each of 14 million accounts
        • Information about relevant web interactions, buyers, online content, etc. is fed into the machine learning model
        • The model scores each account
      • Top 5K accounts are pushed to the web app, along with relevant info
      • From button-press to new account list: 20 minutes
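
The score-then-rank step on slide 32 can be sketched in a few lines. This is an illustration under stated assumptions, not Demandbase's implementation: the slide says the model is built with Spark and MLlib, whereas the example below swaps in a hypothetical linear scorer in plain Python, with made-up feature names, weights, and random toy data, purely to show how every account is scored and only the top K are kept.

```python
import heapq
import random


def score_account(features, weights):
    """Hypothetical linear scorer standing in for the trained model."""
    return sum(weights[name] * value for name, value in features.items())


def top_accounts(accounts, weights, k):
    """Score every account and return the k best (account_id, score) pairs, best first."""
    scored = (
        (account_id, score_account(features, weights))
        for account_id, features in accounts
    )
    # nlargest keeps only k candidates in memory while streaming over all accounts.
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy stand-ins: 1,000 accounts instead of 14 million, top 5 instead of top 5K.
rng = random.Random(0)
WEIGHTS = {"need": 0.5, "in_market": 0.3, "buyers": 0.2}  # assumed feature names
accounts = [
    (f"account-{i}", {name: rng.random() for name in WEIGHTS})
    for i in range(1000)
]

top = top_accounts(accounts, WEIGHTS, k=5)
for account_id, score in top:
    print(f"{account_id}: {score:.3f}")
```

On a real cluster the scoring would be distributed across the workers and only the ranked results collected, but the shape of the computation (score all, keep the best few) is the same.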
  • 33. Qubole Makes This Possible
      • Qubole manages all of our EC2 instances
      • So far, we’ve successfully tested 20 different concurrent models (20 × 60 EC2 servers)
      • Qubole keeps our costs down through dynamic bidding and heterogeneous server clusters
      • Our web app calls Qubole’s easy-to-implement Play API, which spins up the EC2 instances and deploys our Spark job
      • With Qubole taking care of the infrastructure, we could focus on developing the machine learning
      • Qubole allowed us to build a self-serve machine-learning-as-a-service solution
  • 34. Next Steps and Further Information
      • Try a pre-configured, production-ready Qubole deployment on AWS Data Lake: https://aws.amazon.com/quickstart/architecture/qubole-on-data-lake-foundation/
      • Buy on AWS Marketplace: https://aws.amazon.com/marketplace/pp/B06XX76R24
      • Learn more about Qubole: https://www.qubole.com/products/qds-for-aws/
      • Learn more about Demandbase: https://www.demandbase.com/technology/
      • Try AWS: https://aws.amazon.com/
  • 35. Q & A
  • 36. Thank you!