© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Building a Modern Data Lake
Young Yang
Track 6 | Session 2
ML Specialist SA
Amazon Web Services
AWS offers a modern data platform
[Figure: from old guard data patterns (data silos) to a modern data architecture (data lakes)]
Data silos: OLTP, ERP, CRM, and LOB apps; DW silo 1 and DW silo 2, each with its own business intelligence
Data lakes: devices, web, logs, and mobile apps; open formats (CSV, ORC, Parquet, Avro) and a central catalog
Workloads on the data lake: BI + analytics, machine learning, data warehousing
What will you take away from this session?
[Figure: reference architecture with the stages Source, Ingest, Scale (Batch), Speed (Real-time), and Serving]
Source: transactions, web logs / cookies, ERP, connected devices, social media, GPS location, mobile
Connectivity: Internet, AWS Direct Connect, VPN
Ingest: API Gateway, SFTP, AWS DMS, Storage Gateway, AppSync, Amazon MQ, Kinesis, Kafka (MSK)
Scale (Batch): data lake on S3 with Lake Formation; tiers Raw, Curated, Analytics, and Digested; EMR and Glue between tiers
ML & Analytics: SageMaker, AI Services, Elasticsearch, Athena
Speed (Real-time), stream analysis: Kinesis Data Analytics, Lambda, CloudWatch, Elasticsearch, EMR (Spark Streaming), Flink on Kinesis Data Analytics
Speed (Real-time), event pipeline: Event Capture, Event Handler, Event Scoring, Event Action; services shown: Kinesis Data Analytics, Lambda, SageMaker, AI Services, Step Functions, Kinesis Data Firehose
Serving, data warehouse: Amazon Redshift
Serving, database (near-zero latency): Elasticsearch, DynamoDB, Aurora, ElastiCache, DocumentDB
Serving, BI reporting and analytics: QuickSight, Kibana, Jupyter
Serving, other: Athena Federated Query (new, preview), Fargate, EKS, ECS, API Gateway, Lambda
Consumers: data analysts, data scientists, business users, engagement platforms, automation / events
Amazon S3 is the foundation of any data lake
Multiple data input sources
Supports many unique users and teams
Storage scales on demand
Analyzed by many applications
Amazon S3 as the foundation for data lakes
Durable, available, exabyte-scalable
Secure, compliant, auditable
High performance
Low-cost storage and analytics
Broad network integration
Services shown around Amazon S3: AWS Lake Formation & AWS Glue, AWS Snowball, AWS Snowmobile, Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose, Amazon Redshift, Amazon EMR, Amazon Athena, Amazon Kinesis, Amazon Elasticsearch Service, Amazon SageMaker, Amazon Comprehend, Amazon Rekognition
AWS Lake Formation
Build a secure data lake in days
Build data lakes quickly
• Move, store, catalog, and clean your data faster
• Transform to open formats like Parquet and ORC
• ML-based deduplication and record matching
Simplify security management
• Centrally define security, governance, and auditing policies
• Enforce policies consistently across multiple services
• Integrates with IAM and KMS
Provide self-service access to data
• Build a data catalog that describes your data
• Enable analysts and data scientists to easily find relevant data
• Analyze with multiple analytics services without moving data
Tier 1 Data Lake: Raw or Ingestion
Single source of truth for raw data
Apply as few transformations as possible
Use lifecycle policies to move data to S3-IA or Glacier
Amazon S3
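The lifecycle point for the raw tier can be sketched as a small configuration builder. This is a minimal sketch, not the presenter's setup: the `raw/` prefix, the 30-day and 90-day thresholds, the rule ID, and the bucket name in the comment are all assumptions chosen for illustration. The dictionary follows the shape accepted by the S3 `PutBucketLifecycleConfiguration` API.

```python
import json


def raw_tier_lifecycle_rules(prefix="raw/", ia_after_days=30, glacier_after_days=90):
    """Build an S3 lifecycle configuration for the raw tier.

    The prefix and day thresholds are illustrative assumptions, not values
    taken from the slides.
    """
    if glacier_after_days <= ia_after_days:
        raise ValueError("Glacier transition must come after the S3-IA transition")
    return {
        "Rules": [
            {
                "ID": "raw-tier-tiering",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    # Move to S3 Standard-IA first, then to Glacier.
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


config = raw_tier_lifecycle_rules()
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured, the same dictionary could be
# applied to a (hypothetical) bucket like this:
#
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-data-lake-raw",
#       LifecycleConfiguration=config,
#   )
```

Keeping the rule as plain data makes it easy to review and version alongside the rest of the data lake configuration before it is applied.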
Tier 2 Data Lake: Curated
Unstructured to structured raw data
Annotation
Data cleansing and transformation
Standardize data encoding, formats, and types (such as time format and string encoding)
Amazon S3
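As an illustration of the cleansing step for the curated tier, the sketch below normalizes one record to a uniform string encoding, time format, and set of types. The field names (`user_id`, `event_time`, `amount`), the candidate encodings, and the list of accepted timestamp layouts are assumptions made for the example. In practice this logic would typically run inside a Glue or EMR job rather than as a standalone script.

```python
from datetime import datetime, timezone

# Timestamp layouts assumed to appear in the raw data (illustrative only).
KNOWN_TIME_FORMATS = ("%Y-%m-%dT%H:%M:%S%z", "%Y-%m-%d %H:%M:%S", "%d/%m/%Y %H:%M")


def to_text(value, encodings=("utf-8", "big5")):
    """Decode bytes to str, trying each candidate encoding in order."""
    if isinstance(value, str):
        return value
    for enc in encodings:
        try:
            return value.decode(enc)
        except UnicodeDecodeError:
            continue
    raise ValueError("value could not be decoded with the candidate encodings")


def to_utc_iso(raw):
    """Parse a timestamp in any known layout and return ISO 8601 in UTC.

    Timestamps without an offset are assumed to already be in UTC.
    """
    for fmt in KNOWN_TIME_FORMATS:
        try:
            parsed = datetime.strptime(raw, fmt)
        except ValueError:
            continue
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc).isoformat()
    raise ValueError(f"unrecognized timestamp: {raw!r}")


def normalize_record(record):
    """Return a cleaned copy with uniform encoding, time format, and types."""
    return {
        "user_id": to_text(record["user_id"]).strip(),
        "event_time": to_utc_iso(to_text(record["event_time"]).strip()),
        "amount": float(to_text(record["amount"])),
    }


raw = {"user_id": b" u-001 ", "event_time": "2020-05-17T08:30:00+08:00", "amount": "12.50"}
clean = normalize_record(raw)
print(clean)
```

Records that fail to parse raise an error instead of being silently coerced, so a job can route them to a quarantine location for inspection.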
Tier 3 Data Lake: Analytics
Use columnar formats (Parquet/ORC)
Organized into partitions
Coalesce into larger partitions over time
Optimized for analytics
Amazon S3
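The partitioning and coalescing points can be sketched with plain Python. The example assumes Hive-style `key=value` prefixes (a common convention that Athena, Glue, and EMR can read), a hypothetical `analytics/clickstream` table, and a 128 MiB target file size. None of these values come from the slides. The planner only decides which small files to merge; the actual rewrite would be done by a compaction job.

```python
from datetime import date


def partition_prefix(table, day):
    """Return a Hive-style partition prefix, e.g. table/year=2020/month=05/day=17/."""
    return f"{table}/year={day.year:04d}/month={day.month:02d}/day={day.day:02d}/"


def plan_coalescing(files, target_bytes=128 * 1024 * 1024):
    """Greedily group (name, size) pairs into batches of at most target_bytes.

    A single file larger than target_bytes forms its own batch. Each batch
    would be rewritten as one larger file by a compaction job.
    """
    batches, current, current_size = [], [], 0
    for name, size in files:
        # Start a new batch when adding this file would exceed the target.
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


prefix = partition_prefix("analytics/clickstream", date(2020, 5, 17))
small_files = [(f"{prefix}part-{i:04d}.parquet", 40 * 1024 * 1024) for i in range(7)]
batches = plan_coalescing(small_files)

print(prefix)
for batch in batches:
    print(len(batch), "files ->", batch[0], "...")
```

Fewer, larger columnar files per partition generally reduce per-object overhead for query engines, which is the motivation for coalescing over time.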
Tier 4 Data Lake: Digested (Serving Stage)
Domain-level data mart
Organized by use case
Optimized for specialized analysis
Amazon S3
Amazon Redshift
Amazon Redshift: What’s Under the Hood?
Seamless Data Lake Integration
Amazon Redshift is a fully managed data warehouse service that extends seamlessly to the data lake. It’s highly performant, scalable, resilient, easy to use, cost-effective, and secure.
Our portfolio
Broad and deep portfolio, purpose-built for builders
Business Intelligence & Machine Learning: QuickSight (visualizations), SageMaker (ML), Comprehend (NLP), Transcribe (speech-to-text), Textract (extract text), Personalize (recommendation), Forecast (forecasts), Translate (translation), CodeGuru (code reviews, new), Kendra (enterprise search, new)
Analytics: Redshift (data warehousing), EMR (Hadoop + Spark), Kinesis Data Analytics (real time), Elasticsearch Service (operational analytics), Athena (interactive analytics); new: AQUA, EMR on Outposts, UltraWarm
Databases: RDS (MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, RDS on VMware), Aurora (MySQL, PostgreSQL), DynamoDB (key value, document), ElastiCache (Redis, Memcached), Neptune (graph), Timestream (time series), QLDB (ledger database), Managed Apache Cassandra Service (wide column, new), DocumentDB (document); new: RDS Proxy, RDS on Outposts
Blockchain: Managed Blockchain, Blockchain Templates
Data Lake: S3/Glacier, Glue (ETL & data catalog), Lake Formation (data lakes)
Data Movement: Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Managed Streaming for Kafka
Data Exchange: Data Exchange (new)
Broad database and analytics services portfolio
Relational databases: Amazon Aurora, Amazon RDS
Non-relational databases: Amazon DynamoDB, Amazon DocumentDB, Amazon ElastiCache
Data warehouses: Amazon Redshift
Hadoop and Spark: Amazon EMR
Operational analytics: Amazon ES
Business intelligence: Amazon QuickSight
Real-time analytics: Amazon MSK
Open-source technologies shown: PostgreSQL, Logstash, Elasticsearch, Kibana
Complemented by AWS Partner Network (APN)
Collection & preparation Governance Visualization
Data and analytics strategic & competency partners
Learn storage with AWS Training and Certification
Resources created by the experts at AWS to help you build cloud storage skills
45+ free digital courses cover topics related to cloud storage, including:
• Amazon S3
• AWS Storage Gateway
• Amazon S3 Glacier
• Amazon Elastic File System (Amazon EFS)
• Amazon Elastic Block Store (Amazon EBS)
Classroom offerings, such as Architecting on AWS, feature AWS expert instructors and hands-on activities
Visit the storage learning path at https://aws.training/storage

Contenu connexe

Tendances

DEM06 How Demandbase Cut Its Container Costs by 79%
DEM06 How Demandbase Cut Its Container Costs by 79%DEM06 How Demandbase Cut Its Container Costs by 79%
DEM06 How Demandbase Cut Its Container Costs by 79%Amazon Web Services
 
AWS Customer Presentation - Angelbeat Princeton Seminar
AWS Customer Presentation -  Angelbeat Princeton SeminarAWS Customer Presentation -  Angelbeat Princeton Seminar
AWS Customer Presentation - Angelbeat Princeton SeminarAmazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 
Track 5 Session 2_SEC01 多重帳戶安全策略與方針.pptx
Track 5 Session 2_SEC01 多重帳戶安全策略與方針.pptxTrack 5 Session 2_SEC01 多重帳戶安全策略與方針.pptx
Track 5 Session 2_SEC01 多重帳戶安全策略與方針.pptxAmazon Web Services
 
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptx
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptxTrack 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptx
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptxAmazon Web Services
 
Track 4 Session 5_ 架構即代碼 – AWS CDK 與 CDK8S 聯手打造下一代的 K8S 應用
Track 4 Session 5_ 架構即代碼 – AWS CDK 與 CDK8S 聯手打造下一代的 K8S 應用Track 4 Session 5_ 架構即代碼 – AWS CDK 與 CDK8S 聯手打造下一代的 K8S 應用
Track 4 Session 5_ 架構即代碼 – AWS CDK 與 CDK8S 聯手打造下一代的 K8S 應用Amazon Web Services
 
Webinar: Make Your Cloud Strategy Work for 2016
Webinar: Make Your Cloud Strategy Work for 2016Webinar: Make Your Cloud Strategy Work for 2016
Webinar: Make Your Cloud Strategy Work for 2016Alexandra Sasha Tchulkova
 
Journey Through the AWS Cloud; Building Powerful Web Applications
Journey Through the AWS Cloud; Building Powerful Web ApplicationsJourney Through the AWS Cloud; Building Powerful Web Applications
Journey Through the AWS Cloud; Building Powerful Web ApplicationsAmazon Web Services
 
Track 5 Session 4_ intel 透過AWS Outposts就地佈署 on-premises 雲端環境.pptx
Track 5 Session 4_ intel 透過AWS Outposts就地佈署 on-premises 雲端環境.pptxTrack 5 Session 4_ intel 透過AWS Outposts就地佈署 on-premises 雲端環境.pptx
Track 5 Session 4_ intel 透過AWS Outposts就地佈署 on-premises 雲端環境.pptxAmazon Web Services
 
Track 5 Session 3_ 迎戰DDoS攻擊的資安最佳實踐.pptx
Track 5 Session 3_ 迎戰DDoS攻擊的資安最佳實踐.pptxTrack 5 Session 3_ 迎戰DDoS攻擊的資安最佳實踐.pptx
Track 5 Session 3_ 迎戰DDoS攻擊的資安最佳實踐.pptxAmazon Web Services
 
Track 6 Session 6_ 透過 AWS AI 服務模擬、部署機器人於產業之應用
Track 6 Session 6_ 透過 AWS AI 服務模擬、部署機器人於產業之應用Track 6 Session 6_ 透過 AWS AI 服務模擬、部署機器人於產業之應用
Track 6 Session 6_ 透過 AWS AI 服務模擬、部署機器人於產業之應用Amazon Web Services
 
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptxTrack 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptxAmazon Web Services
 
Real-time Visibility at Scale with Sumo Logic
Real-time Visibility at Scale with Sumo LogicReal-time Visibility at Scale with Sumo Logic
Real-time Visibility at Scale with Sumo LogicAmazon Web Services
 
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用Amazon Web Services
 
Using AWS Purpose-Built Databases to Modernize your Applications
Using AWS Purpose-Built Databases to Modernize your ApplicationsUsing AWS Purpose-Built Databases to Modernize your Applications
Using AWS Purpose-Built Databases to Modernize your ApplicationsAmazon Web Services
 
Track 3 Session 2_從傳統 legacy 邁向數位化與現代化架構
Track 3 Session 2_從傳統  legacy  邁向數位化與現代化架構Track 3 Session 2_從傳統  legacy  邁向數位化與現代化架構
Track 3 Session 2_從傳統 legacy 邁向數位化與現代化架構Amazon Web Services
 
Transform Your Business with VMware Cloud on AWS: Technical Overview
Transform Your Business with VMware Cloud on AWS: Technical Overview Transform Your Business with VMware Cloud on AWS: Technical Overview
Transform Your Business with VMware Cloud on AWS: Technical Overview Amazon Web Services
 
Come costruire apllicazioni "12-factor microservices" in AWS
Come costruire apllicazioni "12-factor microservices" in AWSCome costruire apllicazioni "12-factor microservices" in AWS
Come costruire apllicazioni "12-factor microservices" in AWSAmazon Web Services
 

Tendances (20)

DEM06 How Demandbase Cut Its Container Costs by 79%
DEM06 How Demandbase Cut Its Container Costs by 79%DEM06 How Demandbase Cut Its Container Costs by 79%
DEM06 How Demandbase Cut Its Container Costs by 79%
 
AWS Customer Presentation - Angelbeat Princeton Seminar
AWS Customer Presentation -  Angelbeat Princeton SeminarAWS Customer Presentation -  Angelbeat Princeton Seminar
AWS Customer Presentation - Angelbeat Princeton Seminar
 
AWS 資料數據與 IoT
AWS 資料數據與 IoTAWS 資料數據與 IoT
AWS 資料數據與 IoT
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 
Track 5 Session 2_SEC01 多重帳戶安全策略與方針.pptx
Track 5 Session 2_SEC01 多重帳戶安全策略與方針.pptxTrack 5 Session 2_SEC01 多重帳戶安全策略與方針.pptx
Track 5 Session 2_SEC01 多重帳戶安全策略與方針.pptx
 
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptx
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptxTrack 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptx
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptx
 
Track 4 Session 5_ 架構即代碼 – AWS CDK 與 CDK8S 聯手打造下一代的 K8S 應用
Track 4 Session 5_ 架構即代碼 – AWS CDK 與 CDK8S 聯手打造下一代的 K8S 應用Track 4 Session 5_ 架構即代碼 – AWS CDK 與 CDK8S 聯手打造下一代的 K8S 應用
Track 4 Session 5_ 架構即代碼 – AWS CDK 與 CDK8S 聯手打造下一代的 K8S 應用
 
Webinar: Make Your Cloud Strategy Work for 2016
Webinar: Make Your Cloud Strategy Work for 2016Webinar: Make Your Cloud Strategy Work for 2016
Webinar: Make Your Cloud Strategy Work for 2016
 
AWS 微服務架構分享
AWS 微服務架構分享AWS 微服務架構分享
AWS 微服務架構分享
 
Journey Through the AWS Cloud; Building Powerful Web Applications
Journey Through the AWS Cloud; Building Powerful Web ApplicationsJourney Through the AWS Cloud; Building Powerful Web Applications
Journey Through the AWS Cloud; Building Powerful Web Applications
 
Track 5 Session 4_ intel 透過AWS Outposts就地佈署 on-premises 雲端環境.pptx
Track 5 Session 4_ intel 透過AWS Outposts就地佈署 on-premises 雲端環境.pptxTrack 5 Session 4_ intel 透過AWS Outposts就地佈署 on-premises 雲端環境.pptx
Track 5 Session 4_ intel 透過AWS Outposts就地佈署 on-premises 雲端環境.pptx
 
Track 5 Session 3_ 迎戰DDoS攻擊的資安最佳實踐.pptx
Track 5 Session 3_ 迎戰DDoS攻擊的資安最佳實踐.pptxTrack 5 Session 3_ 迎戰DDoS攻擊的資安最佳實踐.pptx
Track 5 Session 3_ 迎戰DDoS攻擊的資安最佳實踐.pptx
 
Track 6 Session 6_ 透過 AWS AI 服務模擬、部署機器人於產業之應用
Track 6 Session 6_ 透過 AWS AI 服務模擬、部署機器人於產業之應用Track 6 Session 6_ 透過 AWS AI 服務模擬、部署機器人於產業之應用
Track 6 Session 6_ 透過 AWS AI 服務模擬、部署機器人於產業之應用
 
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptxTrack 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
 
Real-time Visibility at Scale with Sumo Logic
Real-time Visibility at Scale with Sumo LogicReal-time Visibility at Scale with Sumo Logic
Real-time Visibility at Scale with Sumo Logic
 
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用
 
Using AWS Purpose-Built Databases to Modernize your Applications
Using AWS Purpose-Built Databases to Modernize your ApplicationsUsing AWS Purpose-Built Databases to Modernize your Applications
Using AWS Purpose-Built Databases to Modernize your Applications
 
Track 3 Session 2_從傳統 legacy 邁向數位化與現代化架構
Track 3 Session 2_從傳統  legacy  邁向數位化與現代化架構Track 3 Session 2_從傳統  legacy  邁向數位化與現代化架構
Track 3 Session 2_從傳統 legacy 邁向數位化與現代化架構
 
Transform Your Business with VMware Cloud on AWS: Technical Overview
Transform Your Business with VMware Cloud on AWS: Technical Overview Transform Your Business with VMware Cloud on AWS: Technical Overview
Transform Your Business with VMware Cloud on AWS: Technical Overview
 
Come costruire apllicazioni "12-factor microservices" in AWS
Come costruire apllicazioni "12-factor microservices" in AWSCome costruire apllicazioni "12-factor microservices" in AWS
Come costruire apllicazioni "12-factor microservices" in AWS
 

Similaire à Track 6 Session 2_ 搭建現代化的資料數據湖.pptx

Modern Data Architectures for Real Time Analytics & Engagement
Modern Data Architectures for Real Time Analytics & EngagementModern Data Architectures for Real Time Analytics & Engagement
Modern Data Architectures for Real Time Analytics & EngagementAmazon Web Services
 
Modern data architectures for real time analytics and engagement
Modern data architectures for real time analytics and engagementModern data architectures for real time analytics and engagement
Modern data architectures for real time analytics and engagementAmazon Web Services
 
Building a Data Processing Pipeline on AWS - AWS Summit SG 2017
Building a Data Processing Pipeline on AWS - AWS Summit SG 2017Building a Data Processing Pipeline on AWS - AWS Summit SG 2017
Building a Data Processing Pipeline on AWS - AWS Summit SG 2017Amazon Web Services
 
Building a Data Processing Pipeline on AWS
Building a Data Processing Pipeline on AWSBuilding a Data Processing Pipeline on AWS
Building a Data Processing Pipeline on AWSAmazon Web Services
 
20141021 AWS Cloud Taekwon - Big Data on AWS
20141021 AWS Cloud Taekwon - Big Data on AWS20141021 AWS Cloud Taekwon - Big Data on AWS
20141021 AWS Cloud Taekwon - Big Data on AWSAmazon Web Services Korea
 
So You Think You're an AWS Master aka Serverless Computing
So You Think You're an AWS Master aka Serverless ComputingSo You Think You're an AWS Master aka Serverless Computing
So You Think You're an AWS Master aka Serverless ComputingAmazon Web Services
 
Semplificare l'analisi dei dati con architetture "Serverless": architetture e...
Semplificare l'analisi dei dati con architetture "Serverless": architetture e...Semplificare l'analisi dei dati con architetture "Serverless": architetture e...
Semplificare l'analisi dei dati con architetture "Serverless": architetture e...Amazon Web Services
 
Serverless Big Data Architectures: Serverless Data Analytics
Serverless Big Data Architectures: Serverless Data AnalyticsServerless Big Data Architectures: Serverless Data Analytics
Serverless Big Data Architectures: Serverless Data AnalyticsKristana Kane
 
Seminario de Cloud Computing na UFRRJ
Seminario de Cloud Computing na UFRRJSeminario de Cloud Computing na UFRRJ
Seminario de Cloud Computing na UFRRJAlex Barbosa Coqueiro
 
FSV307-Capital Markets Discovery How FINRA Runs Trade Analytics and Surveilla...
FSV307-Capital Markets Discovery How FINRA Runs Trade Analytics and Surveilla...FSV307-Capital Markets Discovery How FINRA Runs Trade Analytics and Surveilla...
FSV307-Capital Markets Discovery How FINRA Runs Trade Analytics and Surveilla...Amazon Web Services
 
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...Amazon Web Services
 
Serverless Generative AI on AWS, AWS User Groups of Florida
Serverless Generative AI on AWS, AWS User Groups of FloridaServerless Generative AI on AWS, AWS User Groups of Florida
Serverless Generative AI on AWS, AWS User Groups of FloridaCloudHesive
 
AWS Cloud Experience CA: Data Lakes & Analytics en AWS
AWS Cloud Experience CA: Data Lakes & Analytics en AWSAWS Cloud Experience CA: Data Lakes & Analytics en AWS
AWS Cloud Experience CA: Data Lakes & Analytics en AWSAmazon Web Services LATAM
 
Modern Data Architectures for Business Insights at Scale
Modern Data Architectures for Business Insights at ScaleModern Data Architectures for Business Insights at Scale
Modern Data Architectures for Business Insights at ScaleAmazon Web Services
 
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)Amazon Web Services
 
Data Con LA 2019 - Large scale streaming analytics using cloud based managed ...
Data Con LA 2019 - Large scale streaming analytics using cloud based managed ...Data Con LA 2019 - Large scale streaming analytics using cloud based managed ...
Data Con LA 2019 - Large scale streaming analytics using cloud based managed ...Data Con LA
 
데이터 기반 의사결정을 통한 비지니스 혁신 - 윤석찬 (AWS 테크에반젤리스트)
데이터 기반 의사결정을 통한 비지니스 혁신 - 윤석찬 (AWS 테크에반젤리스트)데이터 기반 의사결정을 통한 비지니스 혁신 - 윤석찬 (AWS 테크에반젤리스트)
데이터 기반 의사결정을 통한 비지니스 혁신 - 윤석찬 (AWS 테크에반젤리스트)Amazon Web Services Korea
 
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...Amazon Web Services
 
Opening Keynote - AWS Summit SG 2017
Opening Keynote - AWS Summit SG 2017Opening Keynote - AWS Summit SG 2017
Opening Keynote - AWS Summit SG 2017Amazon Web Services
 

Similaire à Track 6 Session 2_ 搭建現代化的資料數據湖.pptx (20)

Modern Data Architectures for Real Time Analytics & Engagement
Modern Data Architectures for Real Time Analytics & EngagementModern Data Architectures for Real Time Analytics & Engagement
Modern Data Architectures for Real Time Analytics & Engagement
 
Modern data architectures for real time analytics and engagement
Modern data architectures for real time analytics and engagementModern data architectures for real time analytics and engagement
Modern data architectures for real time analytics and engagement
 
Building a Data Processing Pipeline on AWS - AWS Summit SG 2017
Building a Data Processing Pipeline on AWS - AWS Summit SG 2017Building a Data Processing Pipeline on AWS - AWS Summit SG 2017
Building a Data Processing Pipeline on AWS - AWS Summit SG 2017
 
Building a Data Processing Pipeline on AWS
Building a Data Processing Pipeline on AWSBuilding a Data Processing Pipeline on AWS
Building a Data Processing Pipeline on AWS
 
20141021 AWS Cloud Taekwon - Big Data on AWS
20141021 AWS Cloud Taekwon - Big Data on AWS20141021 AWS Cloud Taekwon - Big Data on AWS
20141021 AWS Cloud Taekwon - Big Data on AWS
 
So You Think You're an AWS Master aka Serverless Computing
So You Think You're an AWS Master aka Serverless ComputingSo You Think You're an AWS Master aka Serverless Computing
So You Think You're an AWS Master aka Serverless Computing
 
Semplificare l'analisi dei dati con architetture "Serverless": architetture e...
Semplificare l'analisi dei dati con architetture "Serverless": architetture e...Semplificare l'analisi dei dati con architetture "Serverless": architetture e...
Semplificare l'analisi dei dati con architetture "Serverless": architetture e...
 
Serverless Big Data Architectures: Serverless Data Analytics
Serverless Big Data Architectures: Serverless Data AnalyticsServerless Big Data Architectures: Serverless Data Analytics
Serverless Big Data Architectures: Serverless Data Analytics
 
Seminario de Cloud Computing na UFRRJ
Seminario de Cloud Computing na UFRRJSeminario de Cloud Computing na UFRRJ
Seminario de Cloud Computing na UFRRJ
 
FSV307-Capital Markets Discovery How FINRA Runs Trade Analytics and Surveilla...
FSV307-Capital Markets Discovery How FINRA Runs Trade Analytics and Surveilla...FSV307-Capital Markets Discovery How FINRA Runs Trade Analytics and Surveilla...
FSV307-Capital Markets Discovery How FINRA Runs Trade Analytics and Surveilla...
 
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
 

Track 6 Session 2_ 搭建現代化的資料數據湖.pptx

  • 1. Building a Modern Data Lake (搭建現代化的資料數據湖). Track 6 | Session 2. Young Yang, ML Specialist SA, Amazon Web Services. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 2. AWS offers a modern data platform: from data silos (old-guard data patterns) to data lakes (modern data architecture). Data silos: OLTP, ERP, CRM, LOB apps; devices, web logs, mobile apps; DW silo 1 and DW silo 2, each with its own business intelligence. Data lakes: open formats (CSV, ORC, Parquet, Avro) and a central catalog, supporting BI + analytics, data warehousing, and machine learning.
  • 3. After this session, what will you take away?
  • 4. Reference architecture, laid out in stages: Source, Ingest, Scale (Batch), Speed (Real-time), Serving. Sources: Transactions; Web logs, cookies; ERP; Connected devices; Social media; GPS Location; Mobile. Network paths: Internet; AWS Direct Connect; VPN. Ingest: API Gateway; SFTP; AWS DMS; Storage Gateway; AppSync; Amazon MQ; Kinesis; Kafka (MSK). Data lake (Lake Formation): S3 Raw; S3 Curated; S3 Analytics; S3 Digested; EMR and Glue (shown three times). ML & Analytics: SageMaker; AI Services; Elasticsearch; Athena. Stream Analysis: Kinesis Data Analytics; Lambda; CloudWatch; Elasticsearch; EMR (Spark Stream); Flink on Kinesis Data Analytics; Kinesis Data Firehose. Event pipeline: Event Capture; Event Handler; Event Scoring; Event Action, using Kinesis Data Analytics, Lambda, SageMaker, AI Services, and Step Functions. Serving: Data Warehouse; Database; Elasticsearch; DynamoDB; Aurora; Amazon Redshift; ElastiCache; DocumentDB; QuickSight; Kibana; Jupyter; BI Reporting Analytics; Near-Zero Latency; Athena Federated Query (new, in preview); Fargate; EKS; ECS; API Gateway; Lambda. Consumers: Data analysts; Data scientists; Business users; Engagement platforms; Automation / events.
  • 7. Architecture build, first step: the stage labels (Source, Ingest, Scale (Batch), Speed (Real-time), Serving) and the sources: Transactions; Web logs, cookies; ERP; Connected devices; Social media; GPS Location; Mobile.
  • 8. Adds the network paths: Internet; AWS Direct Connect; VPN.
  • 9. Adds the ingest services: API Gateway; SFTP; AWS DMS; Storage Gateway; AppSync; Amazon MQ; Kinesis; Kafka (MSK).
  • 10. Adds the data lake (Lake Formation): S3 Raw; S3 Curated; S3 Analytics; S3 Digested; EMR and Glue (shown three times).
  • 11. Adds ML & Analytics: SageMaker; AI Services; Elasticsearch; Athena.
  • 12. Adds Stream Analysis: Kinesis Data Analytics; Lambda; CloudWatch; Elasticsearch; EMR (Spark Stream); Flink on Kinesis Data Analytics.
  • 13. Adds Kinesis Data Firehose.
  • 14. Adds the event pipeline: Event Capture; Event Handler; Event Scoring; Event Action, using Kinesis Data Analytics, Lambda, SageMaker, AI Services, and Step Functions.
  • 15. Adds the serving stores and tools: Data Warehouse; Database; Elasticsearch; DynamoDB; Aurora; Amazon Redshift; ElastiCache; DocumentDB; QuickSight; Kibana; Jupyter; BI Reporting Analytics; Near-Zero Latency.
  • 16. Adds Athena Federated Query (new, in preview).
  • 17. Adds the serving compute and API layer: Fargate; EKS; ECS; API Gateway; Lambda.
  • 18. Adds the consumers: Data analysts; Data scientists; Business users; Engagement platforms; Automation / events.
  • 19. Amazon S3 is the foundation of any data lake: multiple data input sources; supports many unique users and teams; storage scales on demand; analyzed by many applications.
  • 20. Amazon S3 as the foundation for data lakes: durable, available, exabyte-scalable; secure, compliant, auditable; high performance; low-cost storage and analytics; broad network integration. Services shown around Amazon S3: AWS Lake Formation & AWS Glue; AWS Snowball; AWS Snowmobile; Amazon Kinesis Data Streams; Amazon Kinesis Data Firehose; Amazon Redshift; Amazon EMR; Amazon Athena; Amazon Kinesis; Amazon Elasticsearch Service; Amazon SageMaker; Amazon Comprehend; Amazon Rekognition.
  • 21. AWS Lake Formation: build a secure data lake in days. Simplify security management: centrally define security, governance, and auditing policies; enforce policies consistently across multiple services; integrates with IAM and KMS. Provide self-service access to data: build a data catalog that describes your data; enable analysts and data scientists to easily find relevant data; analyze with multiple analytics services without moving data. Build data lakes quickly: move, store, catalog, and clean your data faster; transform to open formats like Parquet and ORC; ML-based deduplication and record matching.
  • 22. Amazon S3 Tier 1 Data Lake: Raw or Ingestion. Single source of truth for raw data; apply as few transformations as possible; use lifecycle policies to move data to S3-IA or Glacier.
  • 23. Amazon S3 Tier 2 Data Lake: Curated. Unstructured to structured; raw data annotation; data cleansing and transformation; standardize the data's encoding, format, and types (such as time format and string encoding).
  • 24. Amazon S3 Tier 3 Data Lake: Analytics. Use columnar formats (Parquet/ORC); organized into partitions; coalescing to larger partitions over time; optimized for analytics.
  • 25. Amazon S3 Tier 4 Data Lake: Digested (Serving Stage). Domain-level data mart; organized by use cases; optimized for specialized analysis.
  • 26. Amazon Redshift: What’s Under the Hood? Seamless data lake integration. Amazon Redshift is a fully managed data warehouse service that extends seamlessly to the data lake. It’s highly performant, scalable, resilient, easy to use, cost-effective, and secure.
  • 27. Our portfolio: broad and deep portfolio, purpose-built for builders. Data lake: S3/Glacier; Glue (ETL & Data Catalog); Lake Formation (data lakes). Data movement: Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Managed Streaming for Kafka. Data exchange: Data Exchange (NEW). Business intelligence & machine learning: QuickSight (visualizations); SageMaker (ML); Comprehend (NLP); Transcribe (speech-to-text); Textract (extract text); Personalize (recommendation); Forecast (forecasts); Translate (translation); CodeGuru (code reviews, NEW); Kendra (enterprise search, NEW). Analytics: Redshift (data warehousing); EMR (Hadoop + Spark); Kinesis Data Analytics (real time); Elasticsearch Service (operational analytics); Athena (interactive analytics); AQUA (NEW); EMR on Outposts (NEW); UltraWarm (NEW). Databases: RDS (MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, RDS on VMware); Aurora (MySQL, PostgreSQL); DynamoDB (key value, document); ElastiCache (Redis, Memcached); Neptune (graph); Timestream (time series); QLDB (ledger database); Managed Apache Cassandra Service (wide column, NEW); DocumentDB (document); RDS Proxy (NEW); RDS on Outposts (NEW). Blockchain: Managed Blockchain; Blockchain Templates.
  • 28. Broad database and analytics services portfolio. Relational databases: Amazon Aurora; Amazon RDS. Non-relational databases: Amazon DynamoDB; Amazon DocumentDB; Amazon ElastiCache. Data warehouses: Amazon Redshift. Hadoop and Spark: Amazon EMR. Operational analytics: Amazon ES. Business intelligence: Amazon QuickSight. Real-time analytics: Amazon MSK. Also shown: PostgreSQL; Logstash; Elasticsearch; Kibana.
  • 30. Recap of the full reference architecture, with the same stages and components as slide 4: Source, Ingest, Scale (Batch), Speed (Real-time), and Serving, from the data sources through the S3 data lake tiers to the consumers.
  • 31. Complemented by AWS Partner Network (APN): collection & preparation; governance; visualization.
  • 32. Data and analytics strategic & competency partners
  • 33. Learn storage with AWS Training and Certification. Resources created by the experts at AWS to help you build cloud storage skills. 45+ free digital courses cover topics related to cloud storage, including: Amazon S3; AWS Storage Gateway; Amazon S3 Glacier; Amazon Elastic File System (Amazon EFS); Amazon Elastic Block Store (Amazon EBS). Classroom offerings, such as Architecting on AWS, feature AWS expert instructors and hands-on activities. Visit the storage learning path at https://aws.training/storage
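
Slide 22 suggests lifecycle policies that move raw-tier data to S3-IA or Glacier. As a minimal sketch, the rule can be written as the configuration document that boto3's `put_bucket_lifecycle_configuration` call accepts. The `raw/` prefix, the rule ID, the bucket name, and the 30- and 90-day thresholds are illustrative assumptions, not values from the slides.

```python
import json


def raw_tier_lifecycle(prefix="raw/", ia_after_days=30, glacier_after_days=90):
    """Build an S3 lifecycle configuration for the raw tier.

    The prefix and day thresholds are hypothetical defaults; choose values
    that match how long raw data stays "warm" in your own data lake.
    """
    if glacier_after_days <= ia_after_days:
        raise ValueError("Glacier transition must come after the S3-IA transition")
    return {
        "Rules": [
            {
                "ID": "raw-tier-tiering",          # hypothetical rule name
                "Filter": {"Prefix": prefix},      # only objects under the raw tier
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


config = raw_tier_lifecycle()
print(json.dumps(config, indent=2))

# Applying it needs boto3 and AWS credentials; the bucket name is hypothetical:
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-data-lake", LifecycleConfiguration=config)
```

Building the document separately from the API call keeps the rule easy to review and test before it touches a real bucket.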

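
Slide 24 recommends organizing the analytics tier into partitions and coalescing them into larger units over time. The sketch below illustrates one possible reading of both ideas in plain Python: a Hive-style `year=/month=/day=` key layout, and a greedy plan that groups small files into larger batches for rewriting. The `analytics/` prefix, the table name, and the target size are hypothetical, and the greedy rule is an assumption rather than the method used in the session.

```python
from datetime import datetime, timezone


def partition_key(table, event_time, filename):
    """Return an S3 key using a Hive-style date partition layout."""
    return (
        f"analytics/{table}/"
        f"year={event_time:%Y}/month={event_time:%m}/day={event_time:%d}/"
        f"{filename}"
    )


def plan_coalesce(files, target_bytes):
    """Group (name, size) pairs, in order, into batches of up to target_bytes.

    A file larger than target_bytes ends up alone in its own batch.
    """
    batches, current, current_size = [], [], 0
    for name, size in files:
        # Start a new batch when adding this file would exceed the target.
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


when = datetime(2020, 5, 7, tzinfo=timezone.utc)
print(partition_key("clicks", when, "part-0000.parquet"))
print(plan_coalesce([("a", 40), ("b", 40), ("c", 40), ("d", 10)], target_bytes=100))
```

Each batch returned by `plan_coalesce` would then be rewritten as a single larger Parquet or ORC object by whatever engine runs the compaction, for example an EMR or Glue job.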
Editor's notes

  1. One of the most common challenges afflicting legacy data architectures is that data is collected but proves difficult to extract value from. For example, the data is difficult to access, costly to refine or analyze, not catalogued, or locked in proprietary formats or platforms. Traditional architectures and on-prem data warehousing pose many challenges: they can’t scale easily or on demand, with long lead times for hardware procurement and upgrades; administration carries high overhead costs; proprietary formats and siloed data make it costly and complex to access, refine, and join data from different sources; data is not catalogued and/or is of unreliable quality; cold and warm data are inseparable, which bloats costs and wastes capacity; they work against democratization by limiting how many users and how much data can be accommodated; and they inspire other legacy architecture patterns, such as retrofitting use cases to accommodate the wrong tools for the job instead of simply using the right tool for each use case. These challenges lead to dark data: data that is collected but is hard to extract insights from. Dark data is no longer tenable. The Amazon Redshift + data lake solution helps turn dark data into free data.
  2. Before you start your setup, think about where you are storing all your data. 1/ You want to store all the structured and unstructured data in a single place, and 2/ ensure that you can immediately start pushing data in from different systems. 3/ At the same time, when you discover new use cases or your business expands to newer domains, you want to plug in more applications that can start analyzing that data without having to rethink your architecture. 4/ Finally, you want to build a data lake that scales and stands the test of time, because you won’t know today all the use cases you’ll want to use your data lake for. It is difficult to know exactly which data sets are important and how they should be cleaned, enriched, and transformed to solve different business problems. All of this and more is exactly why you should choose Amazon S3 as the foundation of your data lake storage.
  3. S3: ubiquitous storage allows you to centralize datasets. It’s simple, and it has consistent behavior and predictability in operations. The native features of S3 are exactly what you want from a data lake: 11 9's of durability, high availability, and scalability; best security, compliance, and audit capabilities, with object-level controls; massively parallel and scalable; cost-effective storage classes; and a broad ecosystem for cataloging, ingesting, and gathering insights.
  4. 1/ Build data lakes quickly: With Lake Formation, you can move, store, catalog, and clean your data faster. You simply point Lake Formation at your data sources, and Lake Formation crawls those sources and moves the data into your new Amazon S3 data lake. Lake Formation also changes data into open formats like Apache Parquet and ORC for faster analytics. 2/ Enforce security policies across multiple services: In addition, you can use Lake Formation to centrally define security, governance, and auditing policies in one place, and then enforce those policies for your users across multiple services that access data stored in the data lake. This reduces the effort in configuring policies across services and provides consistent enforcement and compliance. 3/ Provide self-service access to data: Lake Formation helps you build a data catalog that describes the different data sets that are available along with which groups of users have access to each. This makes your users more productive by helping them find the right data set to analyze. By providing a central catalog of your data, Lake Formation makes it easier for your analysts and data scientists to find and access the data they need. TRANSITION: In this new age of massive data needs and capabilities, people want to consume data differently, too. The cloud has enabled such large datasets and cost-effective computing and analytics that customers are hungry for an easy way to find big, useful datasets and incorporate them into their data lakes and analytics, but that is hard today. BACKGROUND: AWS Lake Formation automates many of the steps required to set up a data lake, allowing customers to get started with just a few clicks from a single, unified dashboard. 
1/ Move, store, catalog, and clean your data faster: To get started you add connection information for the data stores you want to move data from, or point Lake Formation to data that has been moved by Kinesis, or identify data from an AWS database, and then Lake Formation will crawl those sources to identify the layout of the data. Then you train Lake Formation with ML to clean and prepare the data. To start training Lake Formation, you provide examples of what you would like your data to look like after it’s been cleaned, for example, you can train Lake Formation to dedupe locations in a commercial insurance database. This training process can be as quick as 15 minutes. Inside Amazon, this same technology is used to de-duplicate and match data records for things like movies, products, and points of interest. Then the cleaned data is written to your new Amazon S3 data lake. 2/ Enforce security policies across multiple services: From a single screen, you can set up permissions for specified users, and those permissions are implemented across security services like AWS Identity and Access Management and AWS Key Management Service, storage services like Amazon S3, and analytics services like Amazon Redshift, Amazon Athena, and Amazon EMR. Lake Formation enforces access permissions and policies by only allowing users with the right credentials to decrypt the data in the data lake. The only way you can access the data in your data lake is by authenticating to Lake Formation with a username and password or using single sign-on. If you have permission to access the data, Lake Formation will give you a temporary key that you use to decrypt and analyze the data. Lake Formation keeps your data lake secure, reduces the hassle in re-defining policies across multiple services, and provides consistent enforcement and compliance of those policies. 
3/ Gain and manage new insights: As Lake Formation adds your data, it builds a catalog, based on the data layout, that describes the content of your data lake. You can then add text labels with more detail to better describe specific datasets. Using the catalog, users can more easily search and find the data they need for their analysis based on the details you’ve added. It seems like such a simple thing, but it makes a tremendous difference in productivity when your analysts can find the right data for the analysis they are trying to perform. Of course, like everything else in Lake Formation, this catalog also enforces your security rules consistently by only showing users the data they are allowed to see. Once you’ve found the right data, Lake Formation makes it easier for your analysts and data scientists to securely extract data for analysis using tools like Athena, Redshift, EMR, SageMaker, and QuickSight across diverse data sets. (How to add a label: Select a data source within your data lake, then click “Edit metadata” and begin adding descriptors to further classify the data to make it easier for users to find and use the data they need from the data lake.) Lake Formation Use Cases: • Amgen is the world's largest independent biotechnology company. “At Amgen we've been heavy users of Amazon Redshift and Amazon EMR clusters for over three years. Setting up security and access controls for each AWS account, service, user, and data set at the level of detail that was required could be cumbersome,” said Kerby Johnson, Enterprise Data Lake Product Owner, Amgen. “AWS Lake Formation streamlines the process with a central point of control while also enabling us to manage who is using our data, and how, with more detail. AWS Lake Formation allows us to manage permissions on Amazon S3 objects like we would manage permissions on data in a database. Our users will be able to find, access, and analyze the data they need with the tools they prefer. 
This new workflow can make everyone more productive when using Amgen’s data.” • Life360 is the world's leading peace of mind service for families. The Life360 app brings families closer with smart features designed to protect and connect the people who matter most. “We wanted to use AWS Lake Formation to build our data lake for supporting location-based time-series data, and make it much easier to load data. The pre-fabricated blueprints helped get data into the data lake without our data engineering team having to write code from scratch, so they could focus on operationalizing ingest, not reinventing the wheel,” said Richard Chennault, Head of Cloud and Data Services, Life360, Inc. “With AWS Lake Formation we were able to quickly unlock data available in Amazon S3 and make it available to analyze across a broad spectrum of AWS data services. The data remains in place in Amazon S3, we can analyze it in many different ways, and we maintain full control over it.” • Accenture is a leading global professional services company, providing a broad range of services and solutions in strategy, consulting, digital, technology, and operations. “I focus on helping clients in their ‘Data on Cloud’ journey. Specific to that, we have seen that organizations are dealing with a lack of trusted data when they need to perform analytics on data coming from multiple sources,” said Namrata Maheshwary, Senior Architect for the Data Business Group, Accenture. “Data cleansing is a critical step in data analytics and can greatly impact the business outcome and decision making. The new features in AWS Lake Formation have been hugely beneficial to address the challenge of data veracity and securing access to the data lake. We found it tremendously useful to make use of the advanced machine learning techniques for data preparation to find matching records, clean, and deduplicate data from different data sources. 
This will help reduce the time, effort, and cost, while improving the quality and accuracy of the data in a customer’s data lakes.” Other Top Brands Using Lake Formation: Fender, Change Healthcare, Panasonic, Zalando, Change Healthcare, Cloudreach, Alcon, Quantiphi
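The catalog search and the database-style permissions described in the Lake Formation notes above can be sketched with boto3. This is a minimal illustration, not the presenter's code: the principal ARN, database name, and table name are hypothetical placeholders, and it assumes boto3 is installed and AWS credentials are configured when the two calling functions are used.

```python
def build_select_grant(principal_arn, database, table):
    """Build the request for a table-level SELECT grant.

    All three arguments are hypothetical values supplied by the caller.
    """
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
        "Permissions": ["SELECT"],
    }


def grant_select(principal_arn, database, table):
    """Grant SELECT on one catalog table through Lake Formation."""
    import boto3  # imported lazily so the request builder works without it

    client = boto3.client("lakeformation")
    return client.grant_permissions(
        **build_select_grant(principal_arn, database, table)
    )


def search_catalog(text):
    """Return the names of catalog tables whose metadata matches `text`."""
    import boto3

    glue = boto3.client("glue")
    response = glue.search_tables(SearchText=text)
    return [t["Name"] for t in response["TableList"]]


if __name__ == "__main__":
    # Hypothetical principal and table names, for illustration only.
    request = build_select_grant(
        "arn:aws:iam::123456789012:role/analyst", "sales_db", "orders"
    )
    print(request)
```

Keeping the request builder separate from the API call lets the grant be reviewed or logged before it is sent.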
  5. http://aws.amazon.com/redshift Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. You can start small for just $0.25 per hour with no commitments or upfront costs and scale to a petabyte or more for $1,000 per terabyte per year, less than a tenth of the cost of most other data warehousing solutions. Data transfer above 350 TB/month costs $0.05/GB = $50/TB, so pulling down about 12 TB over a year (1 TB/month) comes to $600/year. A 10 TB cluster would cost $10k/year, so transfer adds 6%, for a total of $1,060/TB/year.
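The closing arithmetic in this note can be checked with a short calculation. The quoted $600/year, 6%, and $1,060/TB/year figures line up when roughly 12 TB per year leaves the cluster; that yearly volume is an assumption inferred from the totals, and the prices are the ones quoted in the note rather than current AWS pricing.

```python
# Prices as quoted in the note (not current AWS pricing).
TRANSFER_CENTS_PER_GB = 5        # $0.05 per GB
GB_PER_TB = 1000                 # decimal terabytes, matching $50/TB
CLUSTER_USD_PER_TB_YEAR = 1000   # $1,000 per terabyte per year


def transfer_cost_per_year(tb_per_year):
    """Yearly data-transfer cost in USD for the given volume."""
    return tb_per_year * GB_PER_TB * TRANSFER_CENTS_PER_GB / 100


def total_cost_per_tb_year(cluster_tb, tb_transferred_per_year):
    """Cluster plus transfer cost, per terabyte of cluster storage per year."""
    cluster = cluster_tb * CLUSTER_USD_PER_TB_YEAR
    transfer = transfer_cost_per_year(tb_transferred_per_year)
    return (cluster + transfer) / cluster_tb


if __name__ == "__main__":
    transfer = transfer_cost_per_year(12)   # assumed 12 TB per year
    cluster = 10 * CLUSTER_USD_PER_TB_YEAR  # 10 TB cluster
    print(f"transfer: ${transfer:,.0f}/year")
    print(f"overhead: {transfer / cluster:.0%}")
    print(f"total:    ${total_cost_per_tb_year(10, 12):,.0f}/TB/year")
```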
  6. https://en.wikipedia.org/wiki/Data_curation
  9. Battle-hardened database, re-architected into a cloud-first MPP data warehouse with resilient columnar storage and robust OLAP functionality. Amazon Redshift started out as a PostgreSQL fork, but we completely rewrote the storage engine to be columnar, made it an OLAP relational data store by adding analytics functions such as window operations, and made it a massively parallel processing (MPP) system so that it scales to very large workloads. We have preserved compatibility with PostgreSQL, which is why you can use a PostgreSQL driver to connect to Redshift, but it is important to note that Redshift is an OLAP relational database, not an OLTP relational database like PostgreSQL. We then integrated Amazon Redshift with other services in the AWS ecosystem, such as VPC, KMS, and IAM for security, S3 for data lake integration and backups, EC2 for its cluster implementation, and CloudWatch for monitoring. All of this together makes up the service that we know as Amazon Redshift.
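Because Redshift keeps driver-level compatibility with PostgreSQL, a standard PostgreSQL client library can connect to it. The sketch below assumes the psycopg2 package and uses Redshift's default port, 5439; the endpoint, credentials, and the `sales` table in the window-function query are hypothetical placeholders, not values from the talk.

```python
REDSHIFT_DEFAULT_PORT = 5439

# An OLAP-style window query against a hypothetical `sales` table.
TOP_SALES_SQL = """
SELECT region,
       sale_id,
       amount,
       RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rank_in_region
FROM sales;
"""


def build_dsn(host, database, user, password, port=REDSHIFT_DEFAULT_PORT):
    """Build a libpq-style connection string for a Redshift endpoint."""
    return (
        f"host={host} port={port} dbname={database} "
        f"user={user} password={password}"
    )


def run_query(dsn, sql=TOP_SALES_SQL):
    """Run `sql` over a PostgreSQL driver connection and return all rows."""
    import psycopg2  # assumed installed; imported lazily for this call only

    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(sql)
            return cur.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical endpoint and credentials, for illustration only.
    print(build_dsn("example-cluster.example.com", "dev", "analyst", "secret"))
```

The query is analytical rather than transactional, which matches the note's point that Redshift is an OLAP store even though it accepts PostgreSQL connections.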
  10. AWS offers the broadest set of database and analytics services for customers to lift and shift their database and analytics workloads to the cloud. Customers are doing this at record levels across many different areas: 1/ Relational databases: for customers wanting to move away from self-managing Oracle, SQL Server, MySQL, PostgreSQL, and MariaDB databases, AWS offers Amazon RDS and Amazon Aurora. 2/ Non-relational databases: for customers wanting to move away from self-managed document and key-value stores such as MongoDB, Redis, and Memcached, AWS offers DynamoDB, DocumentDB, and ElastiCache. 3/ Data warehouses: customers want to move from their expensive, proprietary Teradata, Oracle, and SQL Server data warehouses to Amazon Redshift. 4/ Hadoop and Spark: customers want to move their on-premises Hadoop and Spark deployments to EMR for the cost savings and the benefits of a managed service. 5/ Operational analytics: customers want to move their on-premises Elasticsearch, Logstash, and Kibana (ELK) stacks to Amazon Elasticsearch Service for the cost savings and the benefits of a managed service. 6/ Real-time analytics: customers want to move from their Apache Kafka deployments to Amazon Managed Streaming for Apache Kafka (MSK).
  11. AWS analytics services are complemented by a number of third-party software vendors, supplementing our in-house services with solutions around data collection and preparation, governance, and business intelligence/visualization.
  12. If customers want independent help in choosing and implementing analytics solutions, AWS has a wide range of global and specialized competency partners to assist.
  13. If you’re ready to continue learning, we offer 45+ free digital courses on storage, including Backup and Restore with AWS (90 minutes) and Migrating and Tiering Storage to AWS (1 hour). You can also take the Architecting on AWS classroom training to get hands-on practice and learn directly from an instructor. Visit the storage learning path to find out how to get started.