4. Rapid expansion of data requirements

Explosion of data
Data grows 10x every 5 years, driven by network-connected smart devices

Microservices change data and analytics requirements
Microservices architecture decreases the need for "one size fits all" databases and increases the need for real-time monitoring and analytics

DevOps
The transition from IT to DevOps accelerates the rate of change
5. Data flywheel
Modernize your data infrastructure; get the most value from your data
1. Break free from legacy databases
2. Move to managed
3. Modernize your data warehouse
4. Build data-driven apps
5. Turn data to insights
6. Managing databases on-premises: Time-consuming and complex
Hardware and software installation
Database configuration, patching, and backups
Cluster setup and data replication for high availability
Capacity planning and scaling clusters for compute and storage
7. The thorns of legacy databases
Costly
Proprietary lock-in
Punitive licensing
"You've got mail": license audits
8. Break free with AWS
Performance at scale
Fully managed
Cost effective
Reliable
11. Amazon Aurora
MySQL and PostgreSQL-compatible relational database built for the cloud

Performance & scalability: 5x the throughput of standard MySQL and 3x that of standard PostgreSQL; scale out up to 15 read replicas
Availability & durability: Fault-tolerant, self-healing storage; 6 copies of data across 3 AZs; continuous backup to Amazon S3
Highly secure: Network isolation; encryption at rest and in transit
Fully managed: Managed by Amazon RDS; no server provisioning, software patching, setup, configuration, or backups on your part
12. Amazon DynamoDB
Fast and flexible key-value database service for any scale

Performance at scale: Consistent, single-digit millisecond response times at any scale; build applications with virtually unlimited throughput
Serverless architecture: No hardware provisioning, software patching, or upgrades; scales up or down automatically; continuously backs up your data
Global replication: Build global applications with fast access to local data by easily replicating tables across multiple AWS Regions
Enterprise security: Encrypts all data by default and fully integrates with AWS Identity and Access Management for robust security
13. Amazon DocumentDB
Fast, scalable, highly available MongoDB-compatible database service

Fast and scalable: 2x the throughput of other managed MongoDB services; millions of requests per second with millisecond latency
MongoDB-compatible: Same code, drivers, and tools you use with MongoDB
Simple and fully managed: Deeply integrated with AWS services
Secure and compliant
14. Amazon ElastiCache
Managed Redis- and Memcached-compatible in-memory data store

Consistent high performance: In-memory data store and cache for sub-millisecond response times
Unlimited scale: Read scaling with replicas; write and memory scaling with sharding; nondisruptive scaling
Fully managed: AWS manages all hardware and software setup, configuration, and monitoring
15. Amazon Neptune
Fast, reliable graph database built for the cloud

Fast: Queries billions of relationships with millisecond latency
Reliable: Six replicas of your data across three Availability Zones, with full backup and restore
Easy: Build powerful queries easily with Gremlin and SPARQL
Open: Supports Apache TinkerPop and W3C RDF graph models
16. Amazon Timestream
Fast, scalable, fully managed time series database

Trillions of daily events: 1,000x faster and 1/10th the cost of relational databases; collect data at the rate of millions of inserts per second (10M/second)
Time-series analytics: Adaptive query processing engine maintains steady, predictable performance; built-in functions for interpolation, smoothing, and approximation
Serverless: Automated setup, configuration, server provisioning, and software patching
17. Amazon Quantum Ledger Database (QLDB)
Fully managed ledger database: track and verify the history of all changes made to your application's data

Immutable and transparent: Append-only, immutable journal tracks the history of all changes, which cannot be deleted or modified; get full visibility into the entire data lineage
Highly scalable: Executes 2-3x as many transactions as ledgers in common blockchain frameworks
Cryptographically verifiable: All changes are cryptographically chained and verifiable
Easy to use: Flexible document model; query with a familiar SQL-like interface
18. Amazon Keyspaces (for Apache Cassandra)
Scalable, highly available, and managed Apache Cassandra-compatible database service

Highly available and secure: 99.99% availability SLA within an AWS Region; data encrypted at rest; integrated with IAM
No servers to manage: No need to provision, configure, and operate large Cassandra clusters
Apache Cassandra-compatible: Use the same Cassandra drivers and tools
Single-digit millisecond performance at scale: Automatically scale tables up and down, with virtually unlimited throughput and storage
21. Duolingo uses AWS databases to serve up
over 31 billion items for 80 language courses with
high performance and scalability
Primary database: Amazon DynamoDB
• 24,000 reads and 3,000 writes per second
• Personalize lessons for users taking 6B exercises per month
In-memory caching: Amazon ElastiCache
• Instant access to common words and phrases
Transactional data: Amazon Aurora
• Maintain user data
22. Capital One migrated its monolithic mainframe
to highly available AWS databases for
microservices-based applications
Transactional data: Amazon RDS
• State management
Analytics: Amazon Redshift
• Web logs
Consistent low latency: Amazon DynamoDB
• User data and mobile app
24. Fully managed services on AWS
Spend time innovating and building new applications, not managing infrastructure

You: Schema design; query construction; query optimization
AWS: Automatic failover; backup and recovery; isolation and security; industry compliance; push-button scaling; automated patching; advanced monitoring; routine maintenance; built-in best practices
25. Move to managed relational databases
Reduce database administrative burden
No need to re-architect existing applications
Get better performance, availability, scalability, and security
Migrate on-premises or cloud-hosted relational databases to managed services:
Amazon Aurora (MySQL, PostgreSQL)
Amazon RDS (MySQL, PostgreSQL, MariaDB, Oracle, SQL Server)
26. Move to managed non-relational databases
Reduce database administrative burden
No need to re-architect existing applications
Get better performance, availability, scalability, and security
Migrate on-premises or cloud-hosted non-relational databases to managed services
Amazon DocumentDB (MongoDB)
Amazon ElastiCache (Redis, Memcached)
27. Move to AWS services to break free from the infrastructure muck
Fully managed
Broad portfolio
Highly available and durable
Most secure, with support for compliance
28. MOVE TO MANAGED → Amazon Aurora, Amazon ElastiCache, Amazon DynamoDB
Challenge
They experienced service admin challenges with their original provider and wanted to scale the business to the next level.
Solution
They moved from self-managed MySQL to Amazon Aurora MySQL. They use Aurora as the primary transactional database, Amazon DynamoDB for personalized search, and Amazon ElastiCache as an in-memory store for sub-millisecond site rendering.
Result
"Initially, the appeal of AWS was the ease of managing and customizing the stack. It was great to be able to ramp up more servers without having to contact anyone and without having minimum usage commitments. AWS is the easy answer for any Internet business that wants to scale to the next level."
—Nathan Blecharczyk, Cofounder and CTO of Airbnb
29. See more information at:
aws.amazon.com/databases
Contact us at:
https://aws.amazon.com/contact-us/
Get started
30. Learn databases with AWS Training and Certification
25+ free digital training courses cover topics and services
related to relational and nonrelational databases
Resources created by the experts at AWS to help you build and validate database skills
Validate expertise with the AWS Certified Database – Specialty exam
The classroom offering, Planning and Designing Databases
on AWS, features AWS expert instructors and hands-on
activities
Visit the databases learning path at aws.amazon.com/training/path-databases
Let’s first talk about three major trends that impact the way you think about data.
There are three trends: 1/ there is an explosion of relevant data you could track; 2/ microservices change data and analytics requirements; and 3/ the DevOps model drives a rapid rate of change.
1/ Explosion of data
There is an explosion of data being generated. You have to track a lot of data that comes from your business applications, but most of the growth comes from network-connected smart devices, which drive both the variety and the volume of data. Every smart device produces real-time data: mobile phones, connected cars, smart homes, wearable technology, home appliances, security systems, industrial equipment, machinery, and electronics. Most new cars have built-in cellular connections, which account for one third of mobile sign-ups on cell phone networks. Applications also generate real-time data, such as purchase data from e-commerce sites, user behavior from mobile apps, and tweets and posts from social media.
By our estimates, data grows 10x every 5 years. To take advantage of all of this data, you need to be able to partner with someone who can easily harness this volume of data.
2/ Microservices change data and analytics requirements
Organizations are moving from developing monolithic applications to a microservices architecture. Microservices let organizations break down a complex problem into independent units, so developers can operate in small groups with less coordination and therefore respond more quickly and go faster. There are two implications: 1/ developers can break apps down into smaller pieces and pick the best tool to solve each problem; and 2/ it increases the need for real-time monitoring and analytics to understand, faster, what's not working between all of the different microservices.
Developers can break down their applications into smaller pieces and are not beholden to using a single database for every workload. Instead, they can pick the right database purpose-built for the job.
Analytics is not an after-the-fact activity; it has to be built into everything that you do. You need to know what's going on in your business in real time. To fuel innovation, well-run businesses now act on data quickly (whether automatically or through human intervention).
3/ Rapid rate of change driven by DevOps
As innovation accelerates and the velocity of change increases, businesses are transforming IT to the DevOps model. This model uses automated development tools to enable continuous development, deployment, and improvement of software. It emphasizes communication, collaboration, and integration between software developers and IT operations. It also introduces a fast rate of change (and change management).
This means as you think about data and designing your data platform, you need to think about the rapid rate of change that occurs through the DevOps model.
<Note, this slide has an animated sequence that ends with isolated focus on Break free from legacy databases>
Intro slide: The Data Flywheel. We have known for a number of years now that the amount of data available to us is exploding and continues to grow at an exponential rate. This is driven by the fact that data is being produced by all devices, app and system logs, and machines and apps that produce telemetry. The types of data that need to be stored are also more varied, from traditional structured data to semi-structured and unstructured, and this data needs to be captured, stored, processed, and analyzed in real time.
Organizations that want to make the most of their data can no longer use traditional on-premises approaches to store, manage, and process the data at the scale that they need. As the cloud has driven down the cost of compute and storage, and given customers the agility and elasticity to store and process data on demand, customers for the first time no longer have to worry about throwing away data they may need in the future. They can store everything they need, cost-effectively, in the cloud, and process it as and when they need, paying as they go.
Click 1: Modernize/Data value
Click 2: It starts with breaking free from old guard databases with AWS Database Freedom.
For customers running legacy databases on premises, provisioning, operating, scaling, and managing databases is tedious, time-consuming, and expensive. Customers want to spend time innovating and building new applications, not managing infrastructure.
Legacy databases are expensive in terms of license fees, maintenance, and support.
Proprietary platforms restrict innovation: customers are locked into proprietary database features, missing out on the innovation and flexibility offered by open source and cloud-native databases.
Monolithic architectures make it difficult for customers to scale and iterate to support emerging use cases.
Punitive licensing tactics threaten customers with license audits.
Performance at scale:
AWS purpose-built database solutions are designed for fast, interactive query performance at any scale. Experience 3-5x the performance vs popular alternatives with capabilities to support over 20 million requests per second.
Cost effective:
Amazon database solutions provide the security, availability, and reliability of commercial-grade databases at 1/10th the cost of commercial databases.
Fully Managed:
Our fully managed services allow you to break free from the complexities of database and data warehouse administration. Serverless capabilities automatically scale throughput up or down based on demand.
Reliable:
Amazon databases are built for business critical workloads. Build scalable, reliable, and secure enterprise applications while your data is safeguarded behind the AWS infrastructure in highly secure data centers.
1/ Instead of looking at a list of hundreds of different databases, what if we instead think about common database categories. This is a simple mental model to help us reason how builders use these different systems.
2/ Relational is a category. Many builders understand relational systems, so I won't spend a lot of time here, other than a reminder: if you have a workload where strong consistency matters, you will work with the team to define schemas. You're not really sure of every single question that will be asked of the data, but when you do ask one, it matters that you always get back a consistent answer. Relational systems are awesome for workloads that need this. This is where Aurora fits, along with RDS open source engines (PostgreSQL, MySQL, and MariaDB) and commercial engines such as SQL Server and Oracle Database.
3/ To help with this change in how builders develop applications, over the years we have built a number of purpose-built non-relational database, starting in the key-value category with DynamoDB, a database that optimizes running key-value pairs at single-digit millisecond latency and at very large scale. A DynamoDB table can have one item or a trillion items and it will perform the same. Many, many companies, like Epic with Fortnite and Lyft, use DynamoDB. And just to give you an example of the type of scale that DynamoDB supports, over the two days of Prime Day, our biggest retail event in the history of the company, DynamoDB requests from Alexa, the Amazon.com sites, and the Amazon fulfillment centers totaled 7.11 trillion, peaking at 45.4 million requests per second. And, that is only a fraction of the total capacity that DynamoDB handles on any given day. This is unusual scale.
4/ Or, in the document category, let's say that your developers want a flexible way to store and query data in the database using the same document model they use in their application code. Some refer to this as a schema-on-read system. Take Intuit as an example. Their automated compliance platform (ACP) ensures all of Intuit's AWS resources, across thousands of accounts, meet Intuit's various compliance standards. ACP tracks audit events from tens of thousands of Intuit assets, each modeled as a JSON document. As Intuit built the platform, they needed a database that could natively store, query, and index a diverse set of documents. That's why we built Amazon DocumentDB, a fully managed, MongoDB-compatible document database that is purpose-built to help developers easily work with JSON documents in their natural format, and architected to scale easily to meet Intuit's growing needs. I always pause here for a minute. Do you remember when XML was a thing? I think XML 1.0 was first established in 1998. Back then, what did commercial systems do to become an XML database? They added an XML data type. The problem is that many of the database operators didn't work with that data type. I would argue that what document databases are today is what people were trying to do with XML years ago. Amazon DocumentDB launched in January and it's off to a great start. That's why companies like Intuit, FINRA, Dow Jones, and thousands of others use it to store, retrieve, and manage JSON documents at scale.
5/ But let’s say your application can’t even stand single-digit millisecond latency. You need something even faster like microsecond latency. You would need to use an in-memory database and have a cache that can access data in microseconds. And that’s why we built ElastiCache, which is managed Redis and managed Memcached, and that’s what companies like Grab Taxi, McDonald’s, and MLBAM use to enable fast retrieval of data for real-time processing use cases such as messaging, real-time geospatial data like drive distance.
6/ Let's say that you have datasets that are really large and have a lot of interconnectedness. Take Nike as an example. They built an app on top of AWS which connects their athletes with their followers and provides personalized recommendations based on interests of more than 100 million users. Those are a lot of connections if you think about all the athletes and all the followers and all the interests. And they need fast queries over all of that connectedness. For example, running a complex query to learn the top five interests of all the followers of a certain athlete is easy to do on a graph database but unwieldly and slow on a relational database. That's why people are excited about graph databases and why we built Amazon Neptune, which is what companies like NBC Universal, Netflix, and Uber use for highly connected datasets.
7/ Driven by the rise of IoT devices, IT systems, and smart industrial machines, time-series data — data that measures how things change over time — is one of the fastest growing data types. Time-series data has specific characteristics: it typically arrives in time order, it is append-only, and queries are always over a time interval. Time-series data is not just a timestamp or a datatype that you might use in a relational database. Instead, what makes a time-series database is that, at its core, the single primary axis of the data model is time, which means you can highly optimize how the data is stored, scaled, and retrieved. For this we have Amazon Timestream, a purpose-built time-series database that is in preview today.
8/ Ledgers are typically used to record a history of economic and financial activity in an organization. Many organizations build applications with ledger-like functionality because they want to maintain an accurate history of their applications' data, for example, tracking the history of credits and debits in banking transactions, verifying the data lineage of an insurance claim, or tracing movement of an item in a supply chain network. For this we built Amazon QLDB, a fully managed ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log. BMW uses QLDB for their Digital Vehicle Passport Application, which maintains the complete and verifiable history of a vehicle, including maintenance records, tire changes, accidents, ownership, insurance, and loan records. This application will act as the single trusted authority where multiple third-party entities such as car dealerships, repair shops, banks, and insurance providers will submit data to BMW.
9/ People want the right tool for the right job, and they want the right database for whatever their workload is.
Amazon Aurora is a MySQL and PostgreSQL-compatible relational database built for the cloud, that combines the performance and availability of traditional enterprise databases with the simplicity and cost-effectiveness of open source databases. Amazon Aurora is up to five times faster than standard MySQL databases and three times faster than standard PostgreSQL databases. It provides the security, availability, and reliability of commercial databases at 1/10th the cost. Amazon Aurora is fully managed by Amazon Relational Database Service (RDS), which automates time-consuming administration tasks like hardware provisioning, database setup, patching, and backups. Amazon Aurora features a distributed, fault-tolerant, self-healing storage system that auto-scales up to 64TB per database instance. It delivers high performance and availability with up to 15 low-latency read replicas, point-in-time recovery, continuous backup to Amazon S3, and replication across three Availability Zones (AZs).
Amazon DynamoDB is a key-value and document database that delivers single-digit millisecond performance at any scale. It's a fully managed, multiregion, multimaster, durable database with built-in security, backup and restore, and in-memory caching for internet-scale applications. DynamoDB can handle more than 10 trillion requests per day and can support peaks of more than 20 million requests per second. Many of the world's fastest growing businesses such as Lyft, Airbnb, and Redfin as well as enterprises such as Samsung, Toyota, and Capital One depend on the scale and performance of DynamoDB to support their mission-critical workloads. Hundreds of thousands of AWS customers have chosen DynamoDB as their key-value and document database for mobile, web, gaming, ad tech, IoT, and other applications that need low-latency data access at any scale. Create a new table for your application and let DynamoDB handle the rest.
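The key-value access pattern behind this can be sketched in a few lines. This is a minimal in-memory model of partition-key-plus-sort-key addressing, assuming hypothetical names throughout; it is not the DynamoDB API, only an illustration of why item lookups cost the same at any table size.

```python
# Sketch of the key-value access pattern a service like DynamoDB optimizes:
# every read and write addresses one item by its full primary key
# (partition key + optional sort key), so lookup cost does not depend on
# table size. All names here are illustrative, not the DynamoDB API.

class KeyValueTable:
    def __init__(self):
        self._items = {}                      # (partition_key, sort_key) -> item

    def put_item(self, pk, sk, attributes):
        # Store a copy of the attributes together with the key fields.
        self._items[(pk, sk)] = dict(attributes, pk=pk, sk=sk)

    def get_item(self, pk, sk):
        # Single-key lookup: O(1) regardless of how many items exist.
        return self._items.get((pk, sk))

users = KeyValueTable()
users.put_item("user#42", "profile", {"name": "Ana", "lang": "es"})
users.put_item("user#42", "lesson#2024-01-01", {"score": 98})

print(users.get_item("user#42", "profile")["name"])   # Ana
```

The design choice this models is that every access pattern must be expressible as a key lookup (or a key-prefix range), which is what makes "one item or a trillion items, same performance" possible.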
Amazon DocumentDB (with MongoDB compatibility) is a fast, scalable, highly available, and fully managed document database service that supports MongoDB workloads. Amazon DocumentDB is designed from the ground up to give you the performance, scalability, and availability you need when operating mission-critical MongoDB workloads at scale. Amazon DocumentDB implements the Apache 2.0 open source MongoDB 3.6 API by emulating the responses that a MongoDB client expects from a MongoDB server, allowing you to use your existing MongoDB drivers and tools with Amazon DocumentDB. In Amazon DocumentDB, the storage and compute are decoupled, allowing each to scale independently, and you can increase the read capacity to millions of requests per second by adding up to 15 low latency read replicas in minutes, regardless of the size of your data. Amazon DocumentDB is designed for 99.99% availability and replicates six copies of your data across three AWS Availability Zones (AZs). You can use AWS Database Migration Service (DMS) for free (for six months) to easily migrate your on-premises or Amazon Elastic Compute Cloud (EC2) MongoDB databases to Amazon DocumentDB with virtually no downtime.
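The document model described above can be illustrated with a tiny schema-on-read sketch: heterogeneous JSON-like documents live in one collection and are queried by fields that only some of them carry. The `find` helper and the sample documents are assumptions for illustration, not the MongoDB or DocumentDB API.

```python
# Schema-on-read sketch: documents in one collection need not share a schema,
# and queries filter on whatever fields each document happens to have.
# Illustrative only — not the MongoDB/DocumentDB query API.

collection = [
    {"_id": 1, "type": "ec2", "account": "a1", "tags": {"env": "prod"}},
    {"_id": 2, "type": "s3",  "account": "a1"},               # no "tags" field
    {"_id": 3, "type": "ec2", "account": "a2", "tags": {"env": "dev"}},
]

def find(coll, **criteria):
    """Return documents whose top-level fields equal all criteria."""
    return [d for d in coll if all(d.get(k) == v for k, v in criteria.items())]

# Filter on a nested field that only some documents carry:
prod_ec2 = [d for d in find(collection, type="ec2")
            if d.get("tags", {}).get("env") == "prod"]
print([d["_id"] for d in prod_ec2])   # [1]
```

A real document database additionally indexes these fields so such filters do not scan the whole collection.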
Amazon ElastiCache offers fully managed Redis and Memcached. Seamlessly deploy, run, and scale popular open source compatible in-memory data stores. Build data-intensive apps or improve the performance of your existing apps by retrieving data from high throughput and low latency in-memory data stores. Amazon ElastiCache is a popular choice for Gaming, Ad-Tech, Financial Services, Healthcare, and IoT apps. Amazon ElastiCache for Redis is a blazing fast in-memory data store that provides sub-millisecond latency to power internet-scale real-time applications. Built on open-source Redis and compatible with the Redis APIs, ElastiCache for Redis works with your Redis clients and uses the open Redis data format to store your data. Your self-managed Redis applications can work seamlessly with ElastiCache for Redis without any code changes. ElastiCache for Redis combines the speed, simplicity, and versatility of open-source Redis with manageability, security, and scalability from Amazon to power the most demanding real-time applications in Gaming, Ad-Tech, E-Commerce, Healthcare, Financial Services, and IoT. Amazon ElastiCache for Memcached is a Memcached-compatible in-memory key-value store service that can be used as a cache or a data store. It delivers the performance, ease-of-use, and simplicity of Memcached. ElastiCache for Memcached is fully managed, scalable, and secure - making it an ideal candidate for use cases where frequently accessed data must be in-memory. It is a popular choice for use cases such as Web, Mobile Apps, Gaming, Ad-Tech, and E-Commerce.
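The cache-aside pattern commonly used with ElastiCache can be sketched as follows. A plain dict stands in for Redis here so the pattern is visible without a server; the key names and TTL value are illustrative assumptions.

```python
import time

# Cache-aside sketch: check the cache first, fall back to the primary
# database on a miss, then populate the cache with a TTL so subsequent
# reads are served from memory. A dict stands in for Redis/Memcached.

cache = {}                       # key -> (value, expires_at); stand-in for Redis
db = {"word:hola": "hello"}      # stand-in for the primary database
db_reads = 0                     # counts how often we fall through to the DB

def get_translation(key, ttl=60):
    global db_reads
    entry = cache.get(key)
    if entry is not None and entry[1] > time.time():
        return entry[0]                          # cache hit
    value = db[key]                              # cache miss: read the database
    db_reads += 1
    cache[key] = (value, time.time() + ttl)      # populate for next time
    return value

get_translation("word:hola")     # miss -> one database read
get_translation("word:hola")     # hit  -> served from cache
print(db_reads)                  # 1
```

The same shape applies with a real Redis client: the cache read, the fallback, and the TTL-bounded write are the whole pattern.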
Amazon Neptune is a fast, reliable, fully managed graph database service that makes it easy to build and run applications that work with highly connected datasets. The core of Amazon Neptune is a purpose-built, high-performance graph database engine optimized for storing billions of relationships and querying the graph with milliseconds latency. Amazon Neptune supports popular graph models Property Graph and W3C's RDF, and their respective query languages Apache TinkerPop Gremlin and SPARQL, allowing you to easily build queries that efficiently navigate highly connected datasets. Neptune powers graph use cases such as recommendation engines, fraud detection, knowledge graphs, drug discovery, and network security. Amazon Neptune is highly available, with read replicas, point-in-time recovery, continuous backup to Amazon S3, and replication across Availability Zones. Neptune is secure with support for HTTPS encrypted client connections and encryption at rest. Neptune is fully managed, so you no longer need to worry about database management tasks such as hardware provisioning, software patching, setup, configuration, or backups.
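The kind of traversal a graph database makes cheap, such as "top interests among all followers of an athlete," can be sketched with plain adjacency maps. All vertices, edges, and the helper below are invented for illustration; a real graph engine would express this declaratively in Gremlin or SPARQL rather than looping in application code.

```python
from collections import Counter

# Graph-traversal sketch: a 1-hop walk from an athlete to followers,
# then to their interests, aggregated. Data and names are illustrative.

follows = {                       # follower -> athletes they follow
    "u1": ["athlete_a"],
    "u2": ["athlete_a", "athlete_b"],
    "u3": ["athlete_a"],
}
interests = {                     # user -> that user's interests
    "u1": ["running", "yoga"],
    "u2": ["running"],
    "u3": ["cycling", "running"],
}

def top_interests(athlete, n=2):
    counts = Counter()
    for user, followed in follows.items():
        if athlete in followed:                  # traverse follower edge
            counts.update(interests.get(user, []))  # traverse interest edges
    return [interest for interest, _ in counts.most_common(n)]

print(top_interests("athlete_a", n=1))   # ['running']
```

In a relational schema this becomes multi-way joins whose cost grows with connectedness; a graph engine stores the edges natively, which is the point of the Nike example above.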
Amazon Timestream is a fast, scalable, fully managed time series database service for IoT and operational applications that makes it easy to store and analyze trillions of events per day at 1/10th the cost of relational databases. Driven by the rise of IoT devices, IT systems, and smart industrial machines, time-series data — data that measures how things change over time — is one of the fastest growing data types. Time-series data has specific characteristics: it typically arrives in time order, it is append-only, and queries are always over a time interval. While relational databases can store this data, they are inefficient at processing it because they lack optimizations such as storing and retrieving data by time intervals. Timestream is a purpose-built time series database that efficiently stores and processes this data by time intervals. With Timestream, you can easily store and analyze log data for DevOps, sensor data for IoT applications, and industrial telemetry data for equipment maintenance. As your data grows over time, Timestream's adaptive query processing engine understands its location and format, making your data simpler and faster to analyze. Timestream also automates rollups, retention, tiering, and compression of data, so you can manage your data at the lowest possible cost. Timestream is serverless, so there are no servers to manage. It handles time-consuming tasks such as server provisioning, software patching, setup, configuration, and data retention and tiering, freeing you to focus on building your applications.
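Why time as the primary axis helps can be shown with a small sketch: readings arrive in time order and are append-only, so a sorted array plus binary search answers interval queries directly. This is an illustration of the storage idea, with invented timestamps; it is not the Timestream API.

```python
import bisect

# Time-series storage sketch: append-only, time-ordered parallel arrays.
# Because data is sorted by time, a range query is two binary searches
# plus a slice — no scan of unrelated data. Illustrative only.

timestamps, values = [], []

def append(ts, value):
    # Enforce the append-only, time-ordered property of time-series data.
    assert not timestamps or ts >= timestamps[-1], "must append in time order"
    timestamps.append(ts)
    values.append(value)

def query_range(t_start, t_end):
    lo = bisect.bisect_left(timestamps, t_start)
    hi = bisect.bisect_right(timestamps, t_end)
    return list(zip(timestamps[lo:hi], values[lo:hi]))

for ts, v in [(100, 20.0), (110, 20.5), (120, 21.0), (130, 22.0)]:
    append(ts, v)

print(query_range(105, 125))      # [(110, 20.5), (120, 21.0)]
```

Built-in analytics such as interpolation or smoothing then operate over exactly these interval slices, which is why a time-axis-first layout pays off.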
Amazon QLDB provides a complete verifiable history of all application data changes and is built with tried and tested technology used inside Amazon for years, to solve building reliable system-of-record applications at scale.
QLDB gives you immutability. The database maintains a sequenced record of all changes to your data, written to an append-only journal, allowing companies to query and analyze the full history.
Data in QLDB is cryptographically verifiable. The service uses a cryptographic hash function (SHA-256) to generate a secure output file of your data's change history, known as a digest. The digest acts as proof of your data's change history, allowing you to look back and validate the integrity of your data changes.
QLDB is a serverless database and scales with the application. Unlike common blockchain frameworks, which require consensus, QLDB performs at low latency and higher throughput, so companies do not have to trade performance for verifiability.
Finally, by leveraging familiar SQL APIs with PartiQL and a flexible open source document data model, Amazon Ion, QLDB is easy to use.
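The cryptographic chaining described above can be sketched in a few lines. SHA-256 matches the hash function the notes mention; everything else (the genesis value, the journal entries, the `chain` helper) is an illustrative assumption, not QLDB's actual journal format.

```python
import hashlib
import json

# Hash-chain sketch: each entry's digest covers the previous digest,
# so altering any past entry changes every later digest. SHA-256 matches
# the hash the notes mention; the structure here is illustrative only.

def chain(entries):
    digest = b"\x00" * 32                       # illustrative genesis value
    digests = []
    for entry in entries:
        payload = json.dumps(entry, sort_keys=True).encode()
        digest = hashlib.sha256(digest + payload).digest()
        digests.append(digest)
    return digests

journal = [{"op": "credit", "amount": 100}, {"op": "debit", "amount": 30}]
original = chain(journal)

# Tamper with history: the final digest no longer matches.
tampered = [{"op": "credit", "amount": 999}, {"op": "debit", "amount": 30}]
print(chain(tampered)[-1] != original[-1])      # True: tampering is detectable
```

Comparing only the latest digest (QLDB's "digest" export plays this role) is enough to verify the integrity of the entire history.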
Amazon Keyspaces (for Apache Cassandra) is a scalable, highly available, and managed Apache Cassandra–compatible database service.
1/ For those that are currently running Cassandra on-premises or in the cloud, Keyspaces brings the performance, manageability, scale, and security of our fully managed database services to Cassandra workloads.
2/ Amazon Keyspaces is compatible with the Apache open source Cassandra API, enabling customers to use the same Cassandra application code, Apache 2.0 licensed drivers, and tools that they use today.
3/ Amazon Keyspaces is serverless, so customers no longer need to provision, configure, and operate large Cassandra clusters or add and remove nodes manually and rebalance partitions as database traffic scales up and down.
4/ Amazon Keyspaces provides customers with single-digit millisecond performance at any scale and can scale tables up and down automatically based on actual application traffic, with virtually unlimited throughput and storage. There is no limit on the size of the table or the number of items.
5/ Amazon Keyspaces offers both provisioned and on-demand capacity mode so you can optimize costs by specifying capacity per workload or pay for only the resources your applications use.
6/ Customers with existing Cassandra tables running on premises or on Amazon Elastic Compute Cloud (EC2) can migrate these tables to Keyspaces easily with commonly used Cassandra migration tools.
7/ Finally, Amazon Keyspaces integrates with our existing AWS services such as Amazon CloudWatch for logging and performance monitoring, AWS IAM for access management, and AWS Key Management Service for managing encryption keys used for encryption at rest.
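The partition rebalancing that point 3/ above says Keyspaces removes from your plate can be sketched with a tiny consistent-hash ring: Cassandra-style clusters place rows on nodes by hashing the partition key onto a token ring, and adding or removing nodes means re-mapping token ranges. The ring class, node names, and token function below are illustrative assumptions, not the Cassandra implementation.

```python
import hashlib
from bisect import bisect_right

# Consistent-hash ring sketch: each node owns a token; a row lives on the
# first node whose token follows the hash of its partition key. This is
# the placement scheme a managed service rebalances for you. Illustrative.

def token(key):
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

class Ring:
    def __init__(self, nodes):
        self._ring = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    def node_for(self, partition_key):
        # Walk clockwise from the key's token, wrapping at the end.
        i = bisect_right(self._tokens, token(partition_key)) % len(self._ring)
        return self._ring[i][1]

ring = Ring(["node-a", "node-b", "node-c"])
owner = ring.node_for("user#42")                 # deterministic placement
print(owner in {"node-a", "node-b", "node-c"})   # True
```

When nodes join or leave, only the token ranges adjacent to the change move, which is the operational work a serverless offering handles automatically.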
Duolingo uses Amazon DynamoDB as one of its primary database solutions. Each second, Duolingo’s DynamoDB implementation supports 24,000 reads and 3,000 writes, personalizing lessons for users taking 6 billion exercises per month. And Amazon DynamoDB provides autoscaling, which intelligently adjusts performance based on user demand—ensuring high availability and minimizing wasted costs due to over-provisioning. Duolingo also uses Amazon ElastiCache to provide instant access to common words and phrases, Amazon Aurora as the transactional database for maintaining user data, and Amazon Redshift for data analytics. With this database backbone, Duolingo teaches more language students than the entire US school system.
Capital One uses Amazon RDS to store transaction data for state management, Amazon Redshift to store web logs for analytics that need aggregations, and DynamoDB to store user data so that customers can quickly access their information with the Capital One app.
With AWS services, you don't need to worry about administration tasks such as server provisioning, patching, setup, configuration, backups, or recovery. AWS continuously monitors your clusters to keep your workloads up and running with self-healing storage and automated scaling. You focus on high-value application development tasks such as schema design and query construction and optimization, leaving AWS to take care of operational tasks on your behalf.
You never have to over- or under-provision infrastructure to accommodate application growth, intermittent spikes, and performance requirements, and you avoid the fixed capital costs of software licensing and support, hardware refreshes, and the staff needed to maintain hardware. AWS does it all for you, so you can spend your time innovating and building new applications, not managing infrastructure.
For many customers struggling to maintain their own relational databases at scale, the most straightforward solution is a move to a managed database service such as Amazon RDS or Amazon Aurora. In most cases, these customers can migrate workloads and applications to a managed service without rearchitecting their applications, and their teams can continue to leverage the same database skill sets.
The target customer for a move from self-managed to managed relational DBs:
Is self-managing DBs on-premises, in EC2, and/or in another public cloud.
Would like to reduce DB admin burden and reallocate DBA resources to app-centric work.
Does not want to rearchitect their application. Wants to continue leveraging same skill sets.
Needs a simple path to a managed service in the cloud for DB workloads.
Wants better performance, availability, scalability, and security.
Customers can lift and shift their self-managed databases like Oracle, SQL Server, MySQL, PostgreSQL, and MariaDB to Amazon RDS.
Customers looking for better performance and availability can lift and shift their MySQL and PostgreSQL databases to Amazon Aurora and get up to 5x the throughput of standard MySQL and 3x that of standard PostgreSQL.
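One way to see why lift and shift avoids rearchitecting: because Aurora is MySQL- and PostgreSQL-compatible, a migration typically changes only the connection endpoint, while drivers, SQL, and application code stay the same. A small sketch, with hypothetical hostnames:

```python
# Illustrative sketch: moving from self-managed MySQL to Aurora usually
# changes only the endpoint. All hostnames below are hypothetical.
self_managed = {
    "host": "mysql01.datacenter.example.com",
    "port": 3306,
    "user": "app",
    "database": "orders",
}

# Same settings, new endpoint (an assumed Aurora cluster endpoint):
aurora = dict(self_managed,
              host="mycluster.cluster-abc123.us-east-1.rds.amazonaws.com")

# Everything except the endpoint is unchanged:
changed = {k for k in self_managed if self_managed[k] != aurora[k]}
print(changed)  # -> {'host'}
```

The same port, user, database name, and client library carry over, which is why teams can keep their existing DB skill sets.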
Customers use non-relational databases like MongoDB and Redis as document and in-memory databases for use cases such as content management, personalization, mobile apps, and catalogs, and for real-time use cases such as caching, gaming leaderboards, and session stores. For many customers struggling to maintain their own non-relational databases at scale, the most straightforward solution is a move to a managed database service: 1/ moving self-managed MongoDB databases to Amazon DocumentDB, and 2/ moving self-managed in-memory databases like Redis and Memcached to Amazon ElastiCache. In most cases, these customers can migrate workloads and applications to a managed service without rearchitecting their applications, and their teams can continue to leverage the same database skill sets.
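The caching use case mentioned above usually follows the cache-aside pattern that ElastiCache serves: check the cache first, fall back to the database on a miss, then populate the cache for subsequent reads. A minimal sketch, with a plain dict standing in for the Redis client and illustrative keys and data:

```python
# Cache-aside sketch. A dict stands in for a Redis/ElastiCache client;
# the key names and record contents are illustrative assumptions.
cache = {}
db = {"user:42": {"name": "Ada", "streak": 7}}   # stand-in backing store

def get_user(key):
    if key in cache:                 # cache hit: fast in-memory lookup
        return cache[key]
    value = db.get(key)              # cache miss: read the source of truth
    if value is not None:
        cache[key] = value           # populate for subsequent reads
    return value

get_user("user:42")                  # miss -> reads db, fills cache
get_user("user:42")                  # hit -> served from cache
print("user:42" in cache)            # -> True
```

With a real deployment, the dict operations become Redis GET/SET calls against the ElastiCache endpoint, but the application-level pattern is the same.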
The target customer for a move from self-managed to managed non-relational DBs:
Is self-managing DBs on-premises, in EC2, and/or in another public cloud.
Would like to reduce DB admin burden and reallocate DBA resources to app-centric work.
Does not want to rearchitect their application. Wants to continue leveraging same skill sets.
Needs a simple path to a managed service in the cloud for DB workloads.
Wants better performance, availability, scalability, and security.
The solution is for customers to move to AWS managed services for databases and analytics.
Why choose AWS for databases and analytics? Here are some top-level points, which we will explore further in the following slides. AWS provides the most comprehensive portfolio of fully managed, performant and scalable, available and durable, and secure and compliant services, enabling customers to easily build applications in the cloud and to process and analyze all their data with the broadest set of data management and analytical approaches, including relational databases, nonrelational databases, data lakes, and machine learning. As a result, more organizations run their databases, data lakes, and analytics on AWS than anywhere else, with customers like Airbnb, Capital One, Verizon, Netflix, Zillow, Nasdaq, Yelp, iRobot, and FINRA trusting AWS to run their analytics workloads.
Details supporting the above claims:
Broad services portfolio
AWS offers the broadest set of databases and analytic tools and engines that analyze data using open formats and open standards. Customers can choose from 14 purpose-built database engines, including relational, key-value, document, in-memory, graph, time-series, and ledger databases. AWS’s portfolio of purpose-built databases supports diverse data models and allows you to build use-case-driven, highly scalable, distributed applications. By picking the best database to solve a specific problem or group of problems, you can break away from restrictive, one-size-fits-all monolithic databases and focus on building applications that meet the needs of your business. For analytics, customers can store data in the standards-based format of their choice, such as CSV, ORC, Grok, Avro, and Parquet, and have the flexibility to analyze the data in a variety of ways, such as data warehousing, interactive SQL queries, real-time analytics, operational analytics, and big data processing. AWS also has the most partner solutions with pre-built integrations, giving customers choice and ensuring their needs will be met for existing and future analytics use cases.
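The interactive-SQL-over-open-formats point can be sketched with Amazon Athena via boto3. The database name, query, and S3 bucket below are illustrative assumptions, and boto3 is imported lazily inside the function so the parameter sketch runs without the SDK or credentials.

```python
# Hedged sketch: an interactive SQL query over data stored in an open
# format (e.g. Parquet on S3) using Amazon Athena. Names are assumptions.
query_params = {
    "QueryString": "SELECT region, SUM(sales) FROM events GROUP BY region",
    "QueryExecutionContext": {"Database": "analytics"},        # assumed DB
    "ResultConfiguration": {"OutputLocation": "s3://example-results-bucket/"},
}

def run_query():
    """Requires AWS credentials; shown for illustration only."""
    import boto3                       # imported lazily on purpose
    athena = boto3.client("athena")
    return athena.start_query_execution(**query_params)
```

The same table could be backed by CSV, ORC, Avro, or Parquet files; only the table definition changes, not the query.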
Fully managed
With AWS database and analytics services, you don’t need to worry about administration tasks such as server provisioning, patching, setup, configuration, backups, or recovery. AWS continuously monitors your clusters to keep your workloads up and running with self-healing storage and automated scaling, so that you can focus on higher value application development.
Most performant & scalable
With AWS you get relational databases that are 3-5X faster than popular alternatives, or non-relational databases that give you microsecond to sub-millisecond latency. Start small and scale as your applications grow. You can scale your database and analytics compute and storage resources easily, often with no downtime. Because AWS database and analytics services are optimized for the data model or the type of analytics you need, your applications can scale and perform better at 1/10 the cost versus commercial databases.
Most available & durable
When running critical production systems, minimizing downtime and service interruptions is a high priority. At AWS, our managed services run on the same highly reliable infrastructure used by other AWS services, providing high availability and durability without sacrificing performance. AWS services provide multi-region and multi-Availability Zone deployments for protection against region-wide or Availability Zone outages; read replicas, so you have multiple copies of your data for scalability and disaster recovery; and continuous backups to Amazon S3 for 11 9s of durability. Availability SLAs reach 99.95% for Multi-AZ Amazon RDS instances and 99.99% for Multi-AZ Aurora clusters.
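A back-of-the-envelope calculation shows why keeping multiple copies of data drives availability up so sharply: all copies have to be unavailable at once. This is a simplified illustration that assumes independent failures, not a statement of any AWS SLA.

```python
# Simplified illustration (assumes independent replica failures):
# availability when any one of `copies` replicas is enough to serve data.
def combined_availability(single_copy: float, copies: int) -> float:
    """1 minus the probability that every replica is down at once."""
    return 1 - (1 - single_copy) ** copies

# Three independently failing replicas at 99% each:
print(round(combined_availability(0.99, 3), 6))  # -> 0.999999
```

Real-world failures are correlated (which is exactly why copies are spread across separate Availability Zones), but the multiplicative effect is the intuition behind replicating data 6 ways across 3 AZs.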
AWS has the most security capabilities to protect customers’ data. Customers can launch AWS services in a virtual network with Amazon VPC, control access and permissions with AWS Identity and Access Management (IAM), and manage authentication with Kerberos.
AWS services provide support for compliance and assurance programs such as HIPAA, PCI DSS, FedRAMP, and ISO for finance, healthcare, government, and more.
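The access-control point above is usually expressed as a least-privilege IAM policy. Here is a minimal sketch, as a Python dict, granting an application read-only access to a single DynamoDB table; the account ID and table ARN are placeholder assumptions.

```python
# Illustrative least-privilege IAM policy document: read-only access to
# one DynamoDB table. Account ID and table name are placeholder values.
read_only_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query"],
            "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/users",
        }
    ],
}
```

Scoping the `Action` list and `Resource` ARN this tightly means a compromised application credential cannot write data or touch other tables.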
Here’s an example of a customer who’s all-in on AWS. Airbnb moved away from self-managed databases to fully managed AWS databases such as Aurora, DynamoDB, and ElastiCache.
https://aws.amazon.com/solutions/case-studies/airbnb/
Image source: free stock image from Pexels.com (no license fee)
If you’re ready to continue learning, we offer free digital courses for database services.
The Databases learning path shows you how to get started.
Then, validate your experience with an industry-recognized certification in Databases.