AWS launched Amazon DynamoDB today.
Amazon DynamoDB is a fully managed NoSQL database service that provides extremely fast and predictable performance with seamless scalability. It enables customers to offload the administrative burdens of operating and scaling distributed databases so they don't have to worry about hardware provisioning, configuration, replication, software patching, partitioning, or cluster scaling.
See Video: http://www.youtube.com/watch?v=oz-7wJJ9HZ0
2. AWS Database Services
• Managed services designed to reduce administration, accelerate deployment and minimize cost
• Enable customers to choose the most effective data store for their requirements
• Non-Relational (NoSQL) Database
– Schema-less data store that enables fast deployment of new applications without the burden of database administration
• Relational Database
– Manage existing database applications without the effort required to provision, upgrade, back up and scale highly available instances
• In-Memory Cache
– Accelerate data retrieval performance by caching data in memory and avoiding slower disk-based systems
3. High Performance Relational Databases
• Amazon RDS configuration: Push-Button Scaling
• Improve availability: Multi-AZ
• Increase throughput: Read Replicas
• Reduce latency: ElastiCache
[Diagram: Push-Button Scaling, Multi-AZ and Read Replicas shown across two Availability Zones within one Region]
4. Relational Or Non-Relational?
Note: One type does not fit all apps. The choice depends on several factors.
Factors compared: Relational (RDS) versus NoSQL (DynamoDB)
• Application Type
– Relational (RDS): Existing database apps; business process-centric apps. Example: financial transactions, ERP apps, multi-stage approval flows
– NoSQL (DynamoDB): New Web scale applications; large number of small writes and reads. Example: Web, social, mobile apps, shopping cart, order management, user preferences
• Application Characteristics
– Relational (RDS): Relational data models, transactions; complex queries, joins and updates
– NoSQL (DynamoDB): Simple data models, transactions; range queries, simple updates
• Scaling
– Relational (RDS): Application or DBA architected (clustering, partitions, sharding)
– NoSQL (DynamoDB): Seamless, on-demand scaling per application needs
• QoS
– Relational (RDS): Performance depends on data model, indexing, query, and storage optimization; reliability and availability managed; durability managed
– NoSQL (DynamoDB): Performance automatically optimized by the system; reliability and availability managed; durability managed
• Skill Set
– Relational (RDS): Existing programming skills (SQL + programming languages)
– NoSQL (DynamoDB): Web style programming (queries managed through programming and developers)
Possible to use both relational and NoSQL in one application, depending on requirements
5. The “Big Data” Scalability Challenge
• Requirement: predictable, consistent performance
• Reality: performance degrades with scale
• Scaling effort and cost ($): hardware purchase and provisioning, data sharding, data caching, cluster management, fault management
[Chart: Performance versus Scalability]
6. Amazon DynamoDB
DynamoDB is a fully managed NoSQL database service that provides extremely fast and predictable performance with seamless scalability
• Zero Administration
• Low Latency SSDs
• Reserved Capacity
• Unlimited Potential Storage and Throughput
8. DynamoDB Highlights
• Low Latency
– SSD-based storage nodes
– Latency = single-digit milliseconds
• Massive and Seamless Scalability
– No table size or throughput limits
– Live repartitioning for changes to storage and throughput
• Predictable Performance
– Provisioned throughput model
• Durable and Available
– Consistent, disk-only writes
• Zero Administration
9. Provisioned Throughput
• Reserve the IOPS needed for each table
"ProvisionedThroughput": {"ReadsPerSecond":500,"WritesPerSecond":100}…
• Set at table creation
• Increase / decrease any time via API call
• Pay for throughput and storage (not instances)
– $0.01 per hour for every 10 units of Write Capacity
– $0.01 per hour for every 50 units of Read Capacity
– $1.00 per GB-month of Storage
– Plus standard data transfer rates into and out of DynamoDB
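The pricing bullets above can be turned into a rough cost estimate. The sketch below is illustrative only: the function names, the 730-hour month, and the proportional billing of partial capacity blocks are assumptions that do not come from the slide, and data transfer charges are left out.

```python
# Rough cost estimate from the launch prices quoted on the slide (USD).
WRITE_PRICE_PER_10_UNITS_HOUR = 0.01   # from the slide
READ_PRICE_PER_50_UNITS_HOUR = 0.01    # from the slide
STORAGE_PRICE_PER_GB_MONTH = 1.00      # from the slide
HOURS_PER_MONTH = 730                  # assumption: average month length


def hourly_throughput_cost(read_units: float, write_units: float) -> float:
    """Hourly charge for provisioned throughput.

    Assumption: partial blocks of 10 write / 50 read units are billed
    proportionally; the slide does not say how rounding works.
    """
    read_cost = read_units / 50 * READ_PRICE_PER_50_UNITS_HOUR
    write_cost = write_units / 10 * WRITE_PRICE_PER_10_UNITS_HOUR
    return read_cost + write_cost


def monthly_cost(read_units: float, write_units: float, storage_gb: float) -> float:
    """Throughput plus storage for one month, excluding data transfer."""
    throughput = hourly_throughput_cost(read_units, write_units) * HOURS_PER_MONTH
    storage = storage_gb * STORAGE_PRICE_PER_GB_MONTH
    return throughput + storage


if __name__ == "__main__":
    # The slide's example table: 500 reads/sec and 100 writes/sec.
    print(f"hourly throughput: ${hourly_throughput_cost(500, 100):.2f}")
    print(f"monthly with 25 GB stored: ${monthly_cost(500, 100, 25):.2f}")
```

For the example table on the slide (500 reads and 100 writes per second), the throughput charge under these assumptions works out to about $0.10 per hour for reads plus $0.10 per hour for writes.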
10. Reducing Risk
• Consistency
– DynamoDB writes are always consistent
– Reads are consistent, or eventually consistent (default)
• Durability
– All writes occur to disk, not memory
– A write is only acknowledged (committed) once it exists in at least two physical data centers
• Availability
– Regional service that spans multiple Availability Zones
– All data is continuously replicated to multiple AZs
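The durability rule above (acknowledge a write only once it exists in at least two data centers) can be illustrated with a toy model. This is not DynamoDB's implementation, which the slides do not describe; the function name and the data center labels are hypothetical.

```python
from typing import Mapping


def is_acknowledged(persisted_by_datacenter: Mapping[str, bool], required: int = 2) -> bool:
    """Toy model of the acknowledgement rule stated on the slide.

    `persisted_by_datacenter` maps a (hypothetical) data center name to
    whether the write has reached disk there. The write counts as committed
    only when at least `required` data centers hold it.
    """
    persisted_count = sum(1 for on_disk in persisted_by_datacenter.values() if on_disk)
    return persisted_count >= required


if __name__ == "__main__":
    # One copy on disk: not yet acknowledged.
    print(is_acknowledged({"dc-a": True, "dc-b": False, "dc-c": False}))
    # Two copies on disk: acknowledged.
    print(is_acknowledged({"dc-a": True, "dc-b": True, "dc-c": False}))
```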
11. DynamoDB and Elastic MapReduce
Seamless Integration
• Archive
– Efficient export of DynamoDB tables to S3 (as CSV files)
• Data Load
– Efficient import of exported tables from S3 back into DynamoDB
• Complex Queries
– Sophisticated, SQL-based querying of DynamoDB tables (GROUP BY, JOIN, HAVING, secondary indices, etc.)
• Complex Joins
– Ability to join live tables in DynamoDB with archived tables in S3
12. DynamoDB – Unique Value
essentials for a low-cost throughput service
• Fast, Predictable Performance
– Low latency with user-requested throughput
• Zero Administration
– Effortless scalability
– Managed service automates resource allocation, data partitioning and re-partitioning
• Always Durable
– Performance without compromise
– No reduction in durability or consistency in order to achieve throughput
13. Getting Started with DynamoDB
• Quick Start Assistance
– Simple set of APIs and code samples
– White papers and best practice guides
– Jump start training course
– Developer Guide
– Calculator
• Free Tier (per month)
– 5 writes/sec
– 10 consistent reads/sec
– 100 MB storage
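As a quick way to reason about the free tier figures above, the sketch below checks a planned workload against them. The limits are taken from the slide; the function and key names are hypothetical, and the check ignores anything the slide does not mention, such as eventually consistent reads.

```python
from typing import List

# Monthly free tier limits as listed on the slide.
FREE_TIER = {
    "writes_per_sec": 5,
    "consistent_reads_per_sec": 10,
    "storage_mb": 100,
}


def exceeded_free_tier_limits(writes_per_sec: float,
                              consistent_reads_per_sec: float,
                              storage_mb: float) -> List[str]:
    """Return the names of the free tier limits that the workload exceeds."""
    workload = {
        "writes_per_sec": writes_per_sec,
        "consistent_reads_per_sec": consistent_reads_per_sec,
        "storage_mb": storage_mb,
    }
    return [name for name, limit in FREE_TIER.items() if workload[name] > limit]


if __name__ == "__main__":
    # A small prototype that writes slightly more than the free tier allows.
    print(exceeded_free_tier_limits(writes_per_sec=6,
                                    consistent_reads_per_sec=10,
                                    storage_mb=80))
```

An empty list means the workload fits within the listed limits.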