Introducing Amazon Aurora with PostgreSQL Compatibility - AWS Online Tech Talks

Learning Objectives:
- Learn about optimizing relational databases for the cloud
- Learn about Amazon Aurora scalability and high availability
- Learn about Amazon Aurora compatibility with PostgreSQL


  1. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Aurora with PostgreSQL Compatibility. Kevin Jernigan, Senior Product Manager, Amazon Aurora PostgreSQL. January 24, 2018.
  2. Agenda
     - Why did we build Amazon Aurora?
     - PostgreSQL compatibility
     - Durability and availability architecture
     - Performance architecture
     - Performance results vs. PostgreSQL
     - Announcing Performance Insights
     - Getting data in
     - Feature roadmap
     - Questions
  3. A Brief History of Amazon RDS
     - 2006: Amazon launches Amazon Web Services (AWS) with Simple Storage Service (S3)
     - 2009: Amazon Relational Database Service (RDS) launches with support for MySQL
     - 2012: Amazon RDS adds support for Oracle and SQL Server
     - 2013: Amazon RDS adds support for PostgreSQL
     - 2014: Amazon RDS announces Amazon Aurora
     - July 2015: Amazon Aurora with MySQL compatibility goes GA
     - October 2017: Amazon Aurora with PostgreSQL compatibility goes GA
  4. Traditional relational databases are hard to scale
     - Multiple layers of functionality, all in a monolithic stack: SQL, transactions, caching, logging, storage
  5. Traditional approaches to scale databases
     Each architecture is limited by the monolithic mindset:
     - Sharding: coupled at the application layer
     - Shared nothing: coupled at the SQL layer
     - Shared disk: coupled at the caching and storage layer
  6. Reimagining the relational database
     - What if you were inventing the database today?
     - You would break apart the stack
     - You would build something that can scale out, is self-healing, and leverages distributed services
  7. A service-oriented architecture applied to the database
     - Move the logging and storage layer into a multitenant, scale-out, database-optimized storage service
     - Integrate with other AWS services like Amazon EC2, Amazon VPC, Amazon DynamoDB, Amazon SWF, and Amazon Route 53 for control and monitoring
     - Make it a managed service using Amazon RDS, which takes care of management and administrative functions
     - Diagram: SQL, transactions, and caching in the database instance; logging + storage as a separate service; Amazon S3
  8. Making Amazon Aurora better
     In 2014, we launched Amazon Aurora with MySQL compatibility. Now we have added PostgreSQL compatibility. Customers can now choose how to use Amazon's cloud-optimized relational database, with the performance and availability of commercial databases and the simplicity and cost-effectiveness of open source databases.
  9. PostgreSQL fast facts
     - Open source database
     - In active development for 20 years
     - Owned by a foundation, not a single company
     - Permissive, innovation-friendly open source license
     - High performance out of the box
     - Object-oriented and ANSI-SQL:2008 compatible
     - Most geospatial features of any open source database
     - Supports stored procedures in 12 languages (Java, Perl, Python, Ruby, Tcl, C/C++, its own Oracle-like PL/pgSQL, etc.)
     - The most Oracle-compatible open source database
     - The highest AWS Schema Conversion Tool automatic conversion rates are from Oracle to PostgreSQL
  10. PostgreSQL compatibility
     PostgreSQL 9.6 + Amazon Aurora cloud-optimized storage
     - Performance: 2x-3x higher throughput than PostgreSQL alone
     - Availability: failover time of < 30 seconds
     - Durability: 6 copies across 3 Availability Zones
     - Read Replicas: single-digit millisecond lag times on up to 15 replicas
     Fully compatible with PostgreSQL
     - Not a compatibility layer
     - Native PostgreSQL implementation
  11. Amazon Aurora with PostgreSQL Compatibility: Durability & Availability
  12. Scale-out, distributed, log-structured storage
     - Diagram: one AWS Region with three Availability Zones; a primary database node (master) and read replica / secondary nodes on top of a shared, transaction-aware storage volume
     - Storage monitoring
     - Database and instance monitoring
  13. Amazon Aurora storage engine overview
     - Data is replicated 6 times across 3 Availability Zones
     - Continuous backup to Amazon S3 (built for 11 9s durability)
     - Continuous monitoring of nodes and disks for repair
     - 10 GB segments as unit of repair or hotspot rebalance
     - Quorum system for read/write; latency tolerant
     - Quorum membership changes do not stall writes
     - Storage volume automatically grows up to 64 TB
     - Diagram: a database node writing to six storage nodes spread over AZ 1, AZ 2, and AZ 3, with storage monitoring and backup to Amazon S3
  14. Amazon Aurora storage engine fault tolerance
     What can fail?
     - Segment failures (disks)
     - Node failures (machines)
     - AZ failures (network or datacenter)
     Optimizations
     - 4 out of 6 write quorum
     - 3 out of 6 read quorum
     - Peer-to-peer replication for repairs
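
The quorum sizes on this slide can be checked with a few lines of arithmetic. The following is a minimal sketch, not Aurora's implementation: it takes the 6 copies, the 4-of-6 write quorum, and the 3-of-6 read quorum from the slides, assumes the copies are spread evenly at 2 per Availability Zone, and uses hypothetical function names.

```python
# Values taken from the slides: 6 copies across 3 AZs, 4/6 write, 3/6 read.
COPIES = 6
WRITE_QUORUM = 4
READ_QUORUM = 3
COPIES_PER_AZ = 2  # assumption: copies are spread evenly over the 3 AZs


def quorums_overlap(copies, write_quorum, read_quorum):
    """Check the two standard quorum rules.

    read + write > copies: every read set intersects every write set.
    2 * write > copies:    any two write sets intersect each other.
    """
    return read_quorum + write_quorum > copies and 2 * write_quorum > copies


def availability(failed_copies):
    """Report which operations still reach quorum after some copies fail."""
    alive = COPIES - failed_copies
    return {
        "can_write": alive >= WRITE_QUORUM,
        "can_read": alive >= READ_QUORUM,
    }


if __name__ == "__main__":
    print("quorums overlap:", quorums_overlap(COPIES, WRITE_QUORUM, READ_QUORUM))
    print("lose one AZ:", availability(COPIES_PER_AZ))
    print("lose one AZ plus one copy:", availability(COPIES_PER_AZ + 1))
```

Under these assumptions, losing a whole AZ (2 copies) still leaves 4 copies, enough to write, and losing an AZ plus one more copy leaves 3, enough to read but not to write.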
  15. Amazon Aurora Read Nodes
     Availability
     - Failing database nodes are automatically detected and replaced
     - Failing database processes are automatically detected and recycled
     - Replicas are automatically promoted to primary if needed (failover)
     - Customer specifiable failover order
     Performance
     - Customer applications can scale out read traffic across read replicas
     - Read balancing across read replicas
     - Diagram: a primary database node and read replicas across AZ 1, AZ 2, and AZ 3, with database and instance monitoring
  16. Amazon Aurora continuous backup
     - Take periodic snapshot of each segment in parallel; stream the logs to Amazon S3
     - Backup happens continuously without performance or availability impact
     - At restore, retrieve the appropriate segment snapshots and log streams to storage nodes
     - Apply log streams to segment snapshots in parallel and asynchronously
     - Diagram: segment snapshots and log records for segments 1 to 3 over time, with a recovery point
  17. Amazon Aurora instant crash recovery
     Traditional databases
     - Have to replay logs since the last checkpoint
     - Typically 5 minutes between checkpoints
     - Single-threaded in MySQL and PostgreSQL; requires a large number of disk accesses
     - Crash at T0 requires a re-application of the SQL in the log since the last checkpoint
     Amazon Aurora
     - No replay at startup because the storage system is transaction-aware
     - Underlying storage replays log records continuously, whether in recovery or not
     - Coalescing is parallel, distributed, and asynchronous
     - Crash at T0 will result in logs being applied to each segment on demand, in parallel, asynchronously
  18. Faster, more predictable failover with Amazon Aurora
     - Amazon RDS for PostgreSQL is good: failover times of ~60 seconds (database failure, failure detection, DNS propagation, recovery, app running)
     - Amazon Aurora is better: failover times < 30 seconds with a replica-aware app (database failure, failure detection, DNS propagation, recovery, app running)
     - Timeline labels: 15-20 sec and 3-10 sec
  19. Amazon Aurora with PostgreSQL Compatibility: Performance Architecture
  20. How does Amazon Aurora achieve high performance?
     - Databases are all about I/O
     - Network-attached storage is all about packets/second
     - High-throughput processing needs CPU and memory optimizations
     Do less work
     - Do fewer IOs
     - Minimize network packets
     - Offload the database engine
     Be more efficient
     - Process asynchronously
     - Reduce latency path
     - Use lock-free data structures
     - Batch operations together
  21. Write IO Traffic in Amazon RDS for PostgreSQL (RDS for PostgreSQL with Multi-AZ)
     - Diagram: primary database node in AZ 1 and standby database node in AZ 2, each on Amazon Elastic Block Store (EBS) with an EBS mirror; backup to Amazon S3; steps numbered 1 to 5
     - Types of write: WAL, data, commit log & files
     IO flow
     - Issue write to Amazon EBS; EBS issues to mirror; acknowledge when both done
     - Stage write to standby instance
     - Issue write to EBS on standby instance
     Observations
     - Steps 1, 3, 5 are sequential and synchronous
     - This amplifies both latency and jitter
     - Many types of writes for each user operation
  22. Write IO Traffic in an Amazon Aurora database node
     - Diagram: primary database node in AZ 1 and read replica / secondary nodes in AZ 2 and AZ 3; async 4/6 quorum distributed writes to Amazon Aurora storage; backup to Amazon S3
     - Types of write in the legend: WAL, data, commit log & files
     IO flow
     - Boxcar log records: fully ordered by LSN
     - Shuffle to appropriate segments: partially ordered
     - Boxcar to storage nodes and issue writes
     Observations
     - Only write WAL records; all steps asynchronous
     - No data block writes (checkpoint, cache replacement)
     - 6X more log writes, but 9X less network traffic
     - Tolerant of network and storage outlier latency
     Performance
     - 2x or better PostgreSQL Community Edition performance on write-only or mixed read-write workloads
  23. Write IO Traffic in an Amazon Aurora storage node
     - Diagram: log records flow from the primary database node into the storage node's incoming queue, then through the update queue, hot log, and data blocks; ACK back to the database node; sort/group, coalesce, point-in-time snapshot, GC, and scrub stages; peer-to-peer gossip with peer storage nodes; backup to Amazon S3
     IO flow
     ① Receive record and add to in-memory queue
     ② Persist record and acknowledge
     ③ Organize records and identify gaps in log
     ④ Gossip with peers to fill in holes
     ⑤ Coalesce log records into new data block versions
     ⑥ Periodically stage log and new block versions to Amazon S3
     ⑦ Periodically garbage collect old versions
     ⑧ Periodically validate CRC codes on blocks
     Observations
     - All steps are asynchronous
     - Only steps 1 and 2 are in foreground latency path
     - Input queue is far smaller than PostgreSQL
     - Favors latency-sensitive operations
     - Uses disk space to buffer against spikes in activity
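
Steps 3 and 4 of the storage-node flow (identify gaps in the log, then gossip with peers to fill holes) can be illustrated with a toy model. This is only a sketch under a loud assumption: log sequence numbers are modeled as consecutive integers and each node's log as a plain dictionary, which is a simplification for illustration and not a description of Aurora's actual log format or gossip protocol. The function names are hypothetical.

```python
def find_gaps(records, first_lsn, last_lsn):
    """Return the LSNs in [first_lsn, last_lsn] that this node has not received.

    Assumption for illustration: LSNs are consecutive integers.
    """
    return [lsn for lsn in range(first_lsn, last_lsn + 1) if lsn not in records]


def fill_from_peers(records, peers, first_lsn, last_lsn):
    """Return a copy of records with missing LSNs copied from the first peer that has them."""
    filled = dict(records)
    for lsn in find_gaps(filled, first_lsn, last_lsn):
        for peer in peers:
            if lsn in peer:
                filled[lsn] = peer[lsn]
                break  # one copy is enough; stop asking other peers
    return filled


if __name__ == "__main__":
    local_log = {1: "rec-1", 2: "rec-2", 5: "rec-5"}   # records 3 and 4 are missing
    peer_logs = [{3: "rec-3"}, {3: "rec-3", 4: "rec-4"}]
    print("gaps before gossip:", find_gaps(local_log, 1, 5))
    repaired = fill_from_peers(local_log, peer_logs, 1, 5)
    print("gaps after gossip:", find_gaps(repaired, 1, 5))
```

In this model a gap that no peer can supply simply stays open, so a later gossip round can retry it.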
  24. IO Traffic in Aurora Replicas
     PostgreSQL read scaling
     - PostgreSQL master: 30% read, 70% write; PostgreSQL replica: 30% new reads, 70% write
     - Physical: ship redo (WAL) to replica
     - Write workload similar on both instances
     - Independent storage (separate data volumes); single-threaded WAL apply
     Amazon Aurora read scaling
     - Aurora master: 30% read, 70% write; Aurora replica: 100% new reads
     - Physical: ship redo (WAL) from master to replica
     - Replica shares storage (shared Multi-AZ storage): no writes performed
     - Cached pages have redo applied (page cache update)
     - Advance read view when all commits seen
  25. Applications restart faster with survivable caches
     - The cache normally lives inside the database process and goes away if that database dies
     - Aurora moves the cache out of the database process
     - The cache remains warm in the event of a database restart
     - This lets the database resume fully loaded operations much faster
     - Diagram: SQL, transactions, and caching shown in three states: running, crash and restart, running
  26. Amazon Aurora with PostgreSQL Compatibility: Performance vs. PostgreSQL
  27. System Configuration
     - PostgreSQL (Single AZ, no backup): r4.16xlarge database instance on EBS volumes, driven by an r4.8xlarge client driver
     - Amazon Aurora: r4.16xlarge database instance with six storage nodes across AZ 1, AZ 2, and AZ 3 and backup to Amazon S3, driven by an r4.8xlarge client driver
     - Other labels in the diagram: 60,000 total IOPS (next to the EBS volumes); Amazon RDS, r4.16xlarge, 30,000 IOPS
  28. Amazon Aurora is Faster on PgBench
     - Running the standard pgbench benchmark, Amazon Aurora delivers 1.6x the peak throughput of PostgreSQL and 2.9x at high client counts
     - Chart: pgbench tpcb-like throughput (tps, thousands) vs. number of clients (128 to 2048), 150 GiB; PostgreSQL (Single AZ) vs. Amazon Aurora (Three AZs)
  29. Amazon Aurora is 2x-5x Faster on Sysbench
     - Running the standard sysbench benchmark, Amazon Aurora delivers >2x the absolute peak of PostgreSQL and 5x at high client counts
     - Chart: sysbench write-only, 30 GiB; writes/second (thousands) vs. number of clients (256 to 2560); PostgreSQL (Single AZ, No Backup) vs. Amazon Aurora (Three AZs, Continuous Backup); labeled ratios 2.2x and 5.3x
  30. Amazon Aurora is 4x Faster at Large Scale
     - Scales from 1.8x to 4.4x better as the database grows from 10 GiB to 100 GiB
     - Chart: sysbench write-only, writes/second (thousands) by database size
     - PostgreSQL (Single AZ, No Backup): 74 at 10 GiB, 49 at 30 GiB, 30 at 100 GiB
     - Amazon Aurora (Three AZs, Continuous Backup): 136 at 10 GiB, 134 at 30 GiB, 131 at 100 GiB
     - Labeled ratios: 1.8x, 2.8x, 4.4x
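
The ratios on this slide follow directly from the bar values. As a small worked example (the dictionary and function names are illustrative only), the sketch below divides the Aurora throughput by the PostgreSQL throughput for each database size. Because it uses the rounded bar values printed on the chart, the middle ratio comes out near 2.7x rather than the 2.8x label, which presumably reflects unrounded measurements.

```python
# Bar values read off the slide, in thousands of writes/second.
POSTGRESQL = {"10 GiB": 74, "30 GiB": 49, "100 GiB": 30}
AURORA = {"10 GiB": 136, "30 GiB": 134, "100 GiB": 131}


def speedups(baseline, candidate):
    """Return the candidate/baseline throughput ratio for each database size."""
    return {size: candidate[size] / baseline[size] for size in baseline}


if __name__ == "__main__":
    for size, ratio in speedups(POSTGRESQL, AURORA).items():
        print(f"{size}: {ratio:.2f}x")
```

The same numbers show why the gap widens: PostgreSQL throughput falls from 74 to 30 as the database grows, while Aurora stays between 131 and 136.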
  31. Amazon Aurora Improvements For GA
     - GA release improves performance on larger working sets
     - Chart: sysbench write-only, writes/second (thousands) by database size
     - Amazon Aurora Preview: 112 at 10 GiB, 92 at 30 GiB, 83 at 100 GiB
     - Amazon Aurora GA: 136 at 10 GiB, 134 at 30 GiB, 131 at 100 GiB
  32. Amazon Aurora Loads Data Faster
     - Database initialization is >2x faster than PostgreSQL using the standard pgbench benchmark
     - 86% reduction in vacuum time
     - Chart: pgbench initialization, scale 10000 (150 GiB); runtime in seconds for copy in, vacuum, and index build; PostgreSQL vs. Amazon Aurora
  33. Amazon Aurora gives >2x Lower Response Times
     - Response time under heavy write load is >2x lower than PostgreSQL, with variance reduced by 99%
     - Chart: sysbench response time (p95) in ms over 20 minutes, 30 GiB, 1024 clients; PostgreSQL (Single AZ, No Backup) vs. Amazon Aurora (Three AZs, Continuous Backup)
  34. Amazon Aurora has more Consistent Throughput
     - While running pgbench at load, throughput is 3x more consistent than PostgreSQL
     - Chart: pgbench throughput (tps) over time in minutes, 150 GiB, 1024 clients; PostgreSQL (Single AZ) vs. Amazon Aurora (Three AZs)
  35. Amazon Aurora Reduced Recovery Time Up to 97%
     - Transaction-aware storage system recovers almost instantly
     - As PostgreSQL throughput goes up, so does log size and crash recovery time: 3 GiB redo recovered in 19 seconds, 10 GiB redo recovered in 50 seconds, 30 GiB redo recovered in 123 seconds
     - Amazon Aurora has no redo: recovered in 3 seconds while maintaining significantly greater throughput
     - Chart: recovery time from crash under load, in seconds (less is better) vs. writes/second (more is better); bubble size represents the redo log that must be recovered
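
The "up to 97%" headline can be reproduced from the recovery times quoted on this slide. The sketch below is simple arithmetic on those four numbers; the names are illustrative and nothing else is assumed.

```python
# Recovery times from the slide, in seconds, keyed by redo log size in GiB.
POSTGRESQL_RECOVERY_S = {3: 19, 10: 50, 30: 123}
AURORA_RECOVERY_S = 3


def reduction_pct(baseline_s, improved_s):
    """Percentage reduction in recovery time relative to the baseline."""
    return 100.0 * (baseline_s - improved_s) / baseline_s


if __name__ == "__main__":
    for redo_gib, seconds in POSTGRESQL_RECOVERY_S.items():
        pct = reduction_pct(seconds, AURORA_RECOVERY_S)
        print(f"{redo_gib} GiB redo: {seconds}s -> {AURORA_RECOVERY_S}s ({pct:.1f}% less)")
```

The largest reduction comes from the 30 GiB case, 123 seconds down to 3 seconds, which is where the headline figure originates.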
  36. Amazon Aurora with PostgreSQL Compatibility: Performance By The Numbers
     Measurement: result
     - PgBench: up to 2.9x faster
     - Sysbench: 2x-5x faster
     - Vacuum: time reduced up to 86%
     - Response time: >2x lower, variance reduced by 99%
     - Throughput jitter: 3x more consistent
     - Throughput at scale: 4x faster
     - Recovery speed: time reduced up to 97%
  37. Amazon Aurora with PostgreSQL Compatibility: Performance Monitoring & Management
  38. First Step: Enhanced Monitoring
     - Released 2016
     - O/S metrics
     - Process & thread list
     - Up to 1 second granularity
  39. Next Step: Performance Insights
     Why database tuning?
     - Amazon RDS is all about managed databases
     - Customers want performance managed too:
       - Want an easy tool for optimizing cloud database workloads
       - May not have deep tuning expertise
       - Want a single pane of glass to achieve this
  42. Amazon Aurora with PostgreSQL Compatibility: Getting Your Data In
  43. AWS Database Migration Service
     - Start your first migration in 10 minutes or less
     - Keep your apps running during the migration
     - Replicate within, to, or from Amazon EC2 or Amazon RDS
     - Move data to the same or a different database engine
  44. Keep your apps running during the migration (AWS DMS)
     - Start a replication instance
     - Connect to source and target databases
     - Select tables, schemas, or databases
     - Let AWS DMS create tables, load data, and keep them in sync
     - Switch applications over to the target at your convenience
     - Diagram: application users and customer premises connected to AWS over the Internet or a VPN
  45. AWS Schema Conversion Tool
     The AWS Schema Conversion Tool helps automate many database schema and code conversion tasks when migrating between database engines or data warehouse engines.
     Features
     - Oracle and Microsoft SQL Server schema conversion to MySQL, Amazon Aurora, MariaDB, and PostgreSQL
     - Or convert your schema between PostgreSQL and any MySQL engine
     - Database Migration Assessment report for choosing the best target engine
     - Code browser that highlights places where manual edits are required
     - Secure connections to your databases with SSL
     - Cloud native code optimization
  46. AWS Schema Conversion Tool
     - Converts relational databases
     - Converts warehouses
  47. Amazon Aurora with PostgreSQL Compatibility: Recent Launch
  48. Amazon Aurora with PostgreSQL compatibility: 10/24/2017 launch
     High Performance
     - 2x-3x more throughput than PostgreSQL
     - Up to 64 TB of storage per instance
     - Write jitter reduction
     - Near synchronous replicas
     - Reader endpoint
     - Enhanced OS monitoring
     - Performance Insights
     Easy to Operate & Compatible
     - Push button migration
     - Auto-scaling storage
     - Continuous backup and PITR
     - Easy provisioning / patching
     - All PostgreSQL features
     - All RDS for PostgreSQL extensions
     - AWS DMS supported inbound
     High Availability
     - Failover in less than 30 seconds
     - Customer specifiable failover order
     - Up to 15 readable failover targets
     - Instant crash recovery
     - Survivable buffer cache
     - X-region snapshot copy
     Secure by Design
     - Encryption at rest (AWS KMS)
     - Encryption in transit (SSL)
     - Amazon VPC by default
     - Row Level Security
  49. The Amazon Aurora Database Family
     - Engines: PostgreSQL, MySQL
     - High performance & scale: enterprise performance, 64 TB storage
     - Available, durable: automatic failover, read replicas, 6 copies
     - Secure: encryption at rest and in transit
     - Convenient, compatible
     - Services shown in the diagram: AWS DMS, Amazon RDS, AWS IAM, KMS & VPC, Amazon S3
  50. Questions
     - Kevin Jernigan, Senior Product Manager: kmj@amazon.com
     - Amazon Aurora documentation: http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Aurora.html
     - Amazon Aurora PostgreSQL forum: https://forums.aws.amazon.com/forum.jspa?forumID=227
  51. Thank you!