SlideShare une entreprise Scribd logo
1  sur  52
RDBMS to MongoDB Migration
Considerations and Best Practices
Mat Keep
MongoDB Product Marketing

mat.keep@mongodb.com
@matkeep
Agenda
• Migration Roadmap
• Schema Design
• Application Integration
• Data Migration
• Operational Considerations

• Resources to Get Started

2
Strategic Priorities

Enabling New &
Enhancing Existing Apps

Faster Time to Market

3

Better Customer Experience

Lower TCO
Hitting RDBMS Limits
Data Types

Agile Development

• Unstructured data

• Iterative

• Semi-structured
data

• Short development
cycles

• Polymorphic data

• New workloads

Volume of Data

New Architectures

• Petabytes of data

• Horizontal scaling

• Trillions of records

• Commodity
servers

• Millions of queries per
second
4

• Cloud computing
Migration: Proven Benefits
Organization Migrated From

Application

edmunds.com

Oracle

Billing, online advertising, user data

Cisco

Multiple RDBMS

Craigslist

MySQL

Content management

Salesforce
Marketing Cloud

RDBMS

Social marketing, analytics

Foursquare

PostgreSQL

MTV Networks

Multiple RDBMS

Orange Digital

MySQL

Analytics, social networking

Social, mobile networking platforms
Centralized content management
Content Management

http://www.mongodb.com/customers
5
Migration Steps

6
Migration Roadmap

• Backed by Free, Online MongoDB Training
• 100k+ registrations to date
• Consulting and Support also available
7
Schema Design
On-Demand Webinar:
http://www.mongodb.com/presentations/webinar-relationaldatabases-mongodb-what-you-need-know-0
From Relational to MongoDB – What you Need to Know
8
Definitions
RDBMS
Database

Database

Table

Collection

Row

Document

Index

Index

JOIN

9

MongoDB

Embedded
Document or
Reference
Data Models: Relational to Document
MongoDB Document

Relational
{

first_name: „Paul‟,
surname: „Miller‟
city: „London‟,
location: [45.123,47.232],
cars: [
{ model: „Bentley‟,
year: 1973,
value: 100000, … },
{ model: „Rolls Royce‟,
year: 1965,
value: 330000, … }
]
}
10
Document Model Benefits
• Rich data model, natural data representation
– Embed related data in sub-documents & arrays
– Support indexes and rich queries against any element

• Data aggregated to a single structure (preJOINed)
– Programming becomes simple
– Performance can be delivered at scale

• Dynamic schema
– Data models can evolve easily
– Adapt to changes quickly: agile methodology
11
The Power of Dynamic Schema
RDBMS

MongoDB

{

_id : ObjectId("4c4ba5e5e8aabf3"),
employee_name: "Dunham, Justin",
department : "Marketing",
title : "Product Manager, Web",
report_up: "Neray, Graham",
pay_band: “C",
benefits : [
{

type :

"Health",

plan : "PPO Plus" },
{

type :

"Dental",

plan : "Standard" }
]
}

12
RDBMS: Blogging Platform

JOIN 5 tables
13
MongoDB:
Denormalized to 2 BSON Documents

Higher Performance: Data Locality
14
Defining the Data Model
Application

RDBMS Action

Create Product Record

INSERT to (n) tables
insert() to 1 document
(product description, price, with sub-documents,
manufacturer, etc.)
arrays

Display Product Record

SELECT and JOIN (n)
product tables

find() aggregated
document

Add Product Review

INSERT to “review” table,
foreign key to product
record

insert() to “review”
collection, reference to
product document

More actions…..

……

MongoDB Action

……

• Analyze data access patterns of the application
– Identify data that is accessed together, model within a
document

• Identify most common queries queries from logs
15
Modeling Relationships:
Embedding and Referencing
• Embedding
– For 1:1 or 1:Many (where “many” viewed with the
parent)
– Ownership and containment
– Document limit of 16MB, consider document growth

• Referencing
–
–
–
–
–
16

_id field is referenced in the related document
Application runs 2nd query to retrieve the data
Data duplication vs performance gain
Object referenced by many different sources
Models complex Many : Many & hierarchical structures
Referencing Publisher ID in Book
 publisher = {

_id: "oreilly",

name: "O‟Reilly Media",

founded: "1980",

location: "CA"
 }
 book = {

title: "MongoDB: The Definitive Guide",

authors: [ "Kristina Chodorow", "Mike Dirolf" ],

published_date: ISODate("2010-09-24"),

pages: 216,

language: "English",

publisher_id: "oreilly"
 }
17
Indexing in MongoDB
• MongoDB indexing will be familiar to DBAs
– B-Tree Indexes, Secondary Indexes

• Single biggest tunable performance factor
– Define indexes by identifying common queries
– Use MongoDB explain to ensure index coverage
– MongoDB profiler logs all slow queries

• Compound

• Geospatial

• Unique

• Hash

• Array

• Sparse

• TTL

• Text Search

18
Application Integration

19
MongoDB Drivers and API
Drivers
Drivers for most popular
programming languages and
frameworks

Java

Ruby

JavaScript

Implemented as methods
within API of the language, not
a separate language like SQL

Perl

Python

Haskell

IBM
MongoDB API selected as
standard for mobile app
development
20
Mapping the MongoDB API to SQL

Mapping Chart:
http://docs.mongodb.org/manual/reference/sql-comparison/
21
Application Integration
MongoDB Aggregation Framework
• Ad-hoc reporting, grouping and aggregations,
without the complexity of MapReduce
– Max, Min, Averages, Sum

• Similar functionality to SQL GROUP_BY
• Processes a stream of documents
– Original input is a collection
– Final output is a result document

• Series of operators
– Filter or transform data
– Input/output chain

• Supports single servers & shards
22
SQL to Aggregation Mapping

Mapping Chart:
http://docs.mongodb.org/manual/reference/sql-aggregation-comparison/
23
Application Integration
Advanced Analytics
• Native MapReduce in MongoDB
– Enables more complex analysis than Aggregation
Framework

• MongoDB Connector for Hadoop
– Integrates real time data from MongoDB with Hadoop
– Reads and writes directly from MongoDB, avoiding
copying TBs of data across the network
– Support for SQL-like queries from Apache Hive
– Support for MapReduce, Pig, Hadoop
Streaming, Flume

24
BI Integration

Dashboards

Reports
Reporting

Visualizations
25
Document Level Atomicity
{
first_name: „Paul‟,
surname: „Miller‟
city: „London‟,
location: [45.123,47.232],
cars: [
{ model: „Bently‟,
year: 1973,
value: 100000, … },
{ model: „Rolls Royce‟,
year: 1965,
value: 330000, … }
}
}

26

• “All or Nothing” updates

• Extends to embedded documents
and arrays
• Consistent view to application

• Transaction-like semantics for
multi-doc updates with
findandmodify() or 2PC
Maintaining Strong Consistency
• By default, all reads and writes sent
to Primary
• Reads to secondary replicas will be
eventually consistent
• Scale by sharding

• Read Preferences control how
reads are routed

27
Data Durability – Write Concerns
• Configurable per
operation
• Default is ACK by
primary

28
Data Durability – Journaling
• Guarantees write durability & crash resistance
– All operations written to journal before being applied to
the database (WAL)
– Configure writes to wait until committed to journal
before ACK to application
– Replay journal after a server crash

• Operations committed in groups, at configurable
intervals
– 2ms – 300ms

29
Migration and
Operations

30
Data Migration

31
Operations
• Monitoring, Management and Backup
• High Availability
• Scalability
• Hardware selection
– Commodity Servers: Prioritize RAM, Fast CPUs & SSD

• Security
– Access Control, Authentication, Encryption

Download the Whitepaper
MongoDB Operations Best Practices
32
MongoDB Management Service
Cloud-based suite of services for managing
MongoDB deployments
• Monitoring, with charts,
dashboards and alerts on 100+
metrics

• Backup and restore, with pointin-time recovery, support for
sharded clusters
• MMS On-Prem included with MongoDB Enterprise
(backup coming soon)
33
High Availability: Replica Sets

• Automated replication and failover
• Multi-data center support
• Improved operational simplicity (e.g., HW swaps)
• Maintenance & Disaster Recovery
34
Scalability: Auto-Sharding

• Three types of sharding: hash-based, range-based, tagaware. Application transparent
• Increase or decrease capacity as you go
• Automatic balancing
35
Summary and
Getting Started

36
Summary
• Benefits of migration are well understood
• Many successful projects
• Largest differences in data model and query
language
– MongoDB is much more suited to the way applications
are built and run today

• Many principles of RDBMS apply to MongoDB

Download the Whitepaper
http://www.mongodb.com/dl/migrate-rdbms-nosql
37
For More Information
Resource

MongoDB Downloads

mongodb.com/download

Free Online Training

education.mongodb.com

Webinars and Events

mongodb.com/events

White Papers

mongodb.com/white-papers

Case Studies

mongodb.com/customers

Presentations

mongodb.com/presentations

Documentation

docs.mongodb.org

Additional Info

38

Location

info@mongodb.com
BACKUP

40
Enable Success:
MongoDB University
Public
• 2-3 day courses

• Dev, Admin and
Essentials courses
• Worldwide

Private

Online

• Customized to your
needs

• Free, runs over 7 weeks

• On-Site

• Lectures, homework, final
exam
• 100k+ enrollments
• Private online for Enterprise
users

41
Enable Success:
MongoDB Support & Consulting
Community Resource

Commercial
Support

• Google Groups &
StackOverflow Forums

• Access to MongoDB
engineers

• MUGs, Office Hours

• Up to 24 x 7, 30 minute
response

• IRC Channels
• Docs

42

• Unlimited incidents &
hot fixes

Consulting
• Lightning consults
• Healthchecks
• Custom consults

• Dedicated TAM
Case Study
Serves variety of content and user services on
multiple platforms to 7M web and mobile users
Problem

Why MongoDB

• MySQL reached scale
ceiling – could not cope
with performance and
scalability demands

• Unrivaled performance

• Metadata management
too challenging with
relational model

• Intuitive mapping

• Hard to integrate
external data sources
43

• Simple scalability and
high availability

• Eliminated 6B+ rows of
attributes – instead
creates single document
per user / piece of content

Results
• Supports 115,000+
queries per second
• Saved £2M+ over 3 yrs.
• “Lead time for new
implementations is cut
massively”
• MongoDB is default
choice for all new
projects
Case Study
Runs social marketing suite with real-time
analytics on MongoDB
Problem
• RDBMS could not meet
speed and scale
requirements of
measuring massive
online activity
• Inability to provide realtime analytics and
aggregations
• Unpredictable peak
loads
44

Why MongoDB
• Ease of use, developer
ramp-up

Results
• Decreased app
development from months
to weeks

• Solution maturity – depth
of functionality, failover
• 30M social events per day
stored in MongoDB
• High-performance with
• 6x increase in customers
write-heavy system
supported over one year
• Queuing and logging for
easy search at app layer
Case Study
Uses MongoDB to safeguard over 6 billion images
served to millions of customers
Problem

Why MongoDB

• 6B images, 20TB of
data

• JSON-based data
model

• Brittle code base on top
of Oracle database –
hard to scale, add
features

• Agile, high
performance, scalable

• High SW and HW costs

45

• Alignment with
Shutterfly‟s servicesbased architecture

Results
• 80% cost reduction

• 900% performance
improvement
• Faster time-to-market
• Dev. cycles in weeks
vs. tens of months
Case Study
Uses MongoDB to power enterprise social
networking platform
Problem
• Complex SQL queries,
highly normalized
schema not aligned with
new data types
• Poor performance
• Lack of horizontal
scalability

46

Why MongoDB

Results

• Dynamic schemas
using JSON

• Flexibility to roll out new
social features quickly

• Ability to handle
complex data while
maintaining high
performance

• Sped up reads from 30
seconds to tens of
milliseconds

• Social network analytics
with lightweight
MapReduce

• Dramatically increased
write performance
Case Study
Stores billions of posts in myriad formats
with MongoDB
Problem

Why MongoDB

Results

• 1.5M posts per
day, different structures

• Flexible documentbased model

• Inflexible
MySQL, lengthy delays
for making changes

• Horizontal scalability
built in

• Data piling up in
production database

• Easy to use

• Automated failover
provides high
availability

• Interface in familiar
language

• Schema changes are
quick and easy

• Poor performance

47

• Initial deployment held
over 5B documents and
10TB of data
Case Study
Uses MongoDB as go-to database for all new
projects
Problem

Why MongoDB

• RDBMS had poor
performance and could
not scale

• Ease of use and
integration with systems

• Too much operational
overhead
• Needed more developer
control

48

• Small operational footprint
• Document model supports
continuous development

• Flexible licensing model

Results
• Time from release to
production reduced to
<30 minutes
• Easy to add new features
• Developers can focus on
apps instead of ops
Case Study
Stores user and location-based data in MongoDB
for social networking mobile app
Problem
• Relational architecture
could not scale
• Check-in data growth
hit single-node capacity
ceiling
• Significant work to build
custom sharding layer

49

Why MongoDB

Results

• Auto-sharding to scale
high-traffic and fastgrowing application

• Focus engineering on
building mobile app vs.
back-end

• Geo-indexing for easy
querying of locationbased data

• Scale efficiently with
limited resources

• Simple data model

• Increased developer
productivity
Case Study
MongoDB enables Gilt to roll out new revenuegenerating features faster and cheaper
Problem
• Monolithic Postgres
architecture expensive
to scale

Why MongoDB
• Dynamic schema makes it
easy to build new features
• Alignment with SOA

• Limited ability to add
new features for
different business silos

• Cost-effective, horizontal
scaling

• Spiky server loads

• Easy to use and maintain

Results
• Developers can launch
new services
faster, e.g., customized
upsell emails
• Stable, sub-ms
performance on
commodity hardware
• Reduced complexity
yields lower overhead

50
Case Study
Built custom ecommerce platform on MongoDB in 8
Months
Problem
•

Dated e-commerce site
with limited capabilities

•

Usability issues

•

SQL database did not
scale

Why MongoDB
• Multi-data center
replication and sharding
for DR and scalability
• Dynamic schema
• Fast performance (reads
and writes)

Results
• Developers, users are
empowered
• Fast time to market
• Database can meet
evolving business
needs
• Superior user
experience

51
MongoDB Features
• JSON Document Model
with Dynamic Schemas

• Full, Flexible Index Support
and Rich Queries

• Auto-Sharding for
Horizontal Scalability

• Built-In Replication for High
Availability

• Text Search, Geospatial
queries

• Advanced Security

• Aggregation Framework
and MapReduce

52

• Large Media Storage with
GridFS

Contenu connexe

Plus de Mat Keep

PayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL ClusterPayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL ClusterMat Keep
 
MySQL HA Solutions
MySQL HA SolutionsMySQL HA Solutions
MySQL HA SolutionsMat Keep
 
MySQL Cluster NoSQL Memcached API
MySQL Cluster NoSQL Memcached APIMySQL Cluster NoSQL Memcached API
MySQL Cluster NoSQL Memcached APIMat Keep
 
MySQL Cluster performance best practices
MySQL Cluster performance best practicesMySQL Cluster performance best practices
MySQL Cluster performance best practicesMat Keep
 
My sql 5.6_replwebinar_may12
My sql 5.6_replwebinar_may12My sql 5.6_replwebinar_may12
My sql 5.6_replwebinar_may12Mat Keep
 
NoSQL and MySQL webinar - best of both worlds
NoSQL and MySQL webinar - best of both worldsNoSQL and MySQL webinar - best of both worlds
NoSQL and MySQL webinar - best of both worldsMat Keep
 

Plus de Mat Keep (6)

PayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL ClusterPayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL Cluster
 
MySQL HA Solutions
MySQL HA SolutionsMySQL HA Solutions
MySQL HA Solutions
 
MySQL Cluster NoSQL Memcached API
MySQL Cluster NoSQL Memcached APIMySQL Cluster NoSQL Memcached API
MySQL Cluster NoSQL Memcached API
 
MySQL Cluster performance best practices
MySQL Cluster performance best practicesMySQL Cluster performance best practices
MySQL Cluster performance best practices
 
My sql 5.6_replwebinar_may12
My sql 5.6_replwebinar_may12My sql 5.6_replwebinar_may12
My sql 5.6_replwebinar_may12
 
NoSQL and MySQL webinar - best of both worlds
NoSQL and MySQL webinar - best of both worldsNoSQL and MySQL webinar - best of both worlds
NoSQL and MySQL webinar - best of both worlds
 

Dernier

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 

Dernier (20)

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 

Migrating from Relational Databases to MongoDB

  • 1. RDBMS to MongoDB Migration Considerations and Best Practices Mat Keep MongoDB Product Marketing mat.keep@mongodb.com @matkeep
  • 2. Agenda • Migration Roadmap • Schema Design • Application Integration • Data Migration • Operational Considerations • Resources to Get Started 2
  • 3. Strategic Priorities Enabling New & Enhancing Existing Apps Faster Time to Market 3 Better Customer Experience Lower TCO
  • 4. Hitting RDBMS Limits Data Types Agile Development • Unstructured data • Iterative • Semi-structured data • Short development cycles • Polymorphic data • New workloads Volume of Data New Architectures • Petabytes of data • Horizontal scaling • Trillions of records • Commodity servers • Millions of queries per second 4 • Cloud computing
  • 5. Migration: Proven Benefits Organization Migrated From Application edmunds.com Oracle Billing, online advertising, user data Cisco Multiple RDBMS Craigslist MySQL Content management Salesforce Marketing Cloud RDBMS Social marketing, analytics Foursquare PostgreSQL MTV Networks Multiple RDBMS Orange Digital MySQL Analytics, social networking Social, mobile networking platforms Centralized content management Content Management http://www.mongodb.com/customers 5
  • 7. Migration Roadmap • Backed by Free, Online MongoDB Training • 100k+ registrations to date • Consulting and Support also available 7
  • 10. Data Models: Relational to Document MongoDB Document Relational { first_name: „Paul‟, surname: „Miller‟ city: „London‟, location: [45.123,47.232], cars: [ { model: „Bentley‟, year: 1973, value: 100000, … }, { model: „Rolls Royce‟, year: 1965, value: 330000, … } ] } 10
  • 11. Document Model Benefits • Rich data model, natural data representation – Embed related data in sub-documents & arrays – Support indexes and rich queries against any element • Data aggregated to a single structure (preJOINed) – Programming becomes simple – Performance can be delivered at scale • Dynamic schema – Data models can evolve easily – Adapt to changes quickly: agile methodology 11
  • 12. The Power of Dynamic Schema RDBMS MongoDB { _id : ObjectId("4c4ba5e5e8aabf3"), employee_name: "Dunham, Justin", department : "Marketing", title : "Product Manager, Web", report_up: "Neray, Graham", pay_band: “C", benefits : [ { type : "Health", plan : "PPO Plus" }, { type : "Dental", plan : "Standard" } ] } 12
  • 14. MongoDB: Denormalized to 2 BSON Documents Higher Performance: Data Locality 14
  • 15. Defining the Data Model Application RDBMS Action Create Product Record INSERT to (n) tables insert() to 1 document (product description, price, with sub-documents, manufacturer, etc.) arrays Display Product Record SELECT and JOIN (n) product tables find() aggregated document Add Product Review INSERT to “review” table, foreign key to product record insert() to “review” collection, reference to product document More actions….. …… MongoDB Action …… • Analyze data access patterns of the application – Identify data that is accessed together, model within a document • Identify most common queries queries from logs 15
  • 16. Modeling Relationships: Embedding and Referencing • Embedding – For 1:1 or 1:Many (where “many” viewed with the parent) – Ownership and containment – Document limit of 16MB, consider document growth • Referencing – – – – – 16 _id field is referenced in the related document Application runs 2nd query to retrieve the data Data duplication vs performance gain Object referenced by many different sources Models complex Many : Many & hierarchical structures
  • 17. Referencing Publisher ID in Book  publisher = {  _id: "oreilly",  name: "O‟Reilly Media",  founded: "1980",  location: "CA"  }  book = {  title: "MongoDB: The Definitive Guide",  authors: [ "Kristina Chodorow", "Mike Dirolf" ],  published_date: ISODate("2010-09-24"),  pages: 216,  language: "English",  publisher_id: "oreilly"  } 17
  • 18. Indexing in MongoDB • MongoDB indexing will be familiar to DBAs – B-Tree Indexes, Secondary Indexes • Single biggest tunable performance factor – Define indexes by identifying common queries – Use MongoDB explain to ensure index coverage – MongoDB profiler logs all slow queries • Compound • Geospatial • Unique • Hash • Array • Sparse • TTL • Text Search 18
  • 20. MongoDB Drivers and API Drivers Drivers for most popular programming languages and frameworks Java Ruby JavaScript Implemented as methods within API of the language, not a separate language like SQL Perl Python Haskell IBM MongoDB API selected as standard for mobile app development 20
  • 21. Mapping the MongoDB API to SQL Mapping Chart: http://docs.mongodb.org/manual/reference/sql-comparison/ 21
  • 22. Application Integration MongoDB Aggregation Framework • Ad-hoc reporting, grouping and aggregations, without the complexity of MapReduce – Max, Min, Averages, Sum • Similar functionality to SQL GROUP_BY • Processes a stream of documents – Original input is a collection – Final output is a result document • Series of operators – Filter or transform data – Input/output chain • Supports single servers & shards 22
  • 23. SQL to Aggregation Mapping Mapping Chart: http://docs.mongodb.org/manual/reference/sql-aggregation-comparison/ 23
  • 24. Application Integration Advanced Analytics • Native MapReduce in MongoDB – Enables more complex analysis than Aggregation Framework • MongoDB Connector for Hadoop – Integrates real time data from MongoDB with Hadoop – Reads and writes directly from MongoDB, avoiding copying TBs of data across the network – Support for SQL-like queries from Apache Hive – Support for MapReduce, Pig, Hadoop Streaming, Flume 24
  • 26. Document Level Atomicity { first_name: „Paul‟, surname: „Miller‟ city: „London‟, location: [45.123,47.232], cars: [ { model: „Bently‟, year: 1973, value: 100000, … }, { model: „Rolls Royce‟, year: 1965, value: 330000, … } } } 26 • “All or Nothing” updates • Extends to embedded documents and arrays • Consistent view to application • Transaction-like semantics for multi-doc updates with findandmodify() or 2PC
  • 27. Maintaining Strong Consistency • By default, all reads and writes sent to Primary • Reads to secondary replicas will be eventually consistent • Scale by sharding • Read Preferences control how reads are routed 27
  • 28. Data Durability – Write Concerns • Configurable per operation • Default is ACK by primary 28
  • 29. Data Durability – Journaling • Guarantees write durability & crash resistance – All operations written to journal before being applied to the database (WAL) – Configure writes to wait until committed to journal before ACK to application – Replay journal after a server crash • Operations committed in groups, at configurable intervals – 2ms – 300ms 29
  • 32. Operations • Monitoring, Management and Backup • High Availability • Scalability • Hardware selection – Commodity Servers: Prioritize RAM, Fast CPUs & SSD • Security – Access Control, Authentication, Encryption Download the Whitepaper MongoDB Operations Best Practices 32
  • 33. MongoDB Management Service Cloud-based suite of services for managing MongoDB deployments • Monitoring, with charts, dashboards and alerts on 100+ metrics • Backup and restore, with pointin-time recovery, support for sharded clusters • MMS On-Prem included with MongoDB Enterprise (backup coming soon) 33
  • 34. High Availability: Replica Sets • Automated replication and failover • Multi-data center support • Improved operational simplicity (e.g., HW swaps) • Maintenance & Disaster Recovery 34
  • 35. Scalability: Auto-Sharding • Three types of sharding: hash-based, range-based, tagaware. Application transparent • Increase or decrease capacity as you go • Automatic balancing 35
  • 37. Summary • Benefits of migration are well understood • Many successful projects • Largest differences in data model and query language – MongoDB is much more suited to the way applications are built and run today • Many principles of RDBMS apply to MongoDB Download the Whitepaper http://www.mongodb.com/dl/migrate-rdbms-nosql 37
  • 38. For More Information Resource MongoDB Downloads mongodb.com/download Free Online Training education.mongodb.com Webinars and Events mongodb.com/events White Papers mongodb.com/white-papers Case Studies mongodb.com/customers Presentations mongodb.com/presentations Documentation docs.mongodb.org Additional Info 38 Location info@mongodb.com
  • 39.
  • 41. Enable Success: MongoDB University Public • 2-3 day courses • Dev, Admin and Essentials courses • Worldwide Private Online • Customized to your needs • Free, runs over 7 weeks • On-Site • Lectures, homework, final exam • 100k+ enrollments • Private online for Enterprise users 41
  • 42. Enable Success: MongoDB Support & Consulting Community Resource Commercial Support • Google Groups & StackOverflow Forums • Access to MongoDB engineers • MUGs, Office Hours • Up to 24 x 7, 30 minute response • IRC Channels • Docs 42 • Unlimited incidents & hot fixes Consulting • Lightning consults • Healthchecks • Custom consults • Dedicated TAM
  • 43. Case Study Serves variety of content and user services on multiple platforms to 7M web and mobile users Problem Why MongoDB • MySQL reached scale ceiling – could not cope with performance and scalability demands • Unrivaled performance • Metadata management too challenging with relational model • Intuitive mapping • Hard to integrate external data sources 43 • Simple scalability and high availability • Eliminated 6B+ rows of attributes – instead creates single document per user / piece of content Results • Supports 115,000+ queries per second • Saved £2M+ over 3 yrs. • “Lead time for new implementations is cut massively” • MongoDB is default choice for all new projects
  • 44. Case Study Runs social marketing suite with real-time analytics on MongoDB Problem • RDBMS could not meet speed and scale requirements of measuring massive online activity • Inability to provide realtime analytics and aggregations • Unpredictable peak loads 44 Why MongoDB • Ease of use, developer ramp-up Results • Decreased app development from months to weeks • Solution maturity – depth of functionality, failover • 30M social events per day stored in MongoDB • High-performance with • 6x increase in customers write-heavy system supported over one year • Queuing and logging for easy search at app layer
  • 45. Case Study Uses MongoDB to safeguard over 6 billion images served to millions of customers Problem Why MongoDB • 6B images, 20TB of data • JSON-based data model • Brittle code base on top of Oracle database – hard to scale, add features • Agile, high performance, scalable • High SW and HW costs 45 • Alignment with Shutterfly‟s servicesbased architecture Results • 80% cost reduction • 900% performance improvement • Faster time-to-market • Dev. cycles in weeks vs. tens of months
  • 46. Case Study Uses MongoDB to power enterprise social networking platform Problem • Complex SQL queries, highly normalized schema not aligned with new data types • Poor performance • Lack of horizontal scalability 46 Why MongoDB Results • Dynamic schemas using JSON • Flexibility to roll out new social features quickly • Ability to handle complex data while maintaining high performance • Sped up reads from 30 seconds to tens of milliseconds • Social network analytics with lightweight MapReduce • Dramatically increased write performance
  • 47. Case Study Stores billions of posts in myriad formats with MongoDB Problem Why MongoDB Results • 1.5M posts per day, different structures • Flexible documentbased model • Inflexible MySQL, lengthy delays for making changes • Horizontal scalability built in • Data piling up in production database • Easy to use • Automated failover provides high availability • Interface in familiar language • Schema changes are quick and easy • Poor performance 47 • Initial deployment held over 5B documents and 10TB of data
  • 48. Case Study Uses MongoDB as go-to database for all new projects Problem Why MongoDB • RDBMS had poor performance and could not scale • Ease of use and integration with systems • Too much operational overhead • Needed more developer control 48 • Small operational footprint • Document model supports continuous development • Flexible licensing model Results • Time from release to production reduced to <30 minutes • Easy to add new features • Developers can focus on apps instead of ops
  • 49. Case Study Stores user and location-based data in MongoDB for social networking mobile app Problem • Relational architecture could not scale • Check-in data growth hit single-node capacity ceiling • Significant work to build custom sharding layer 49 Why MongoDB Results • Auto-sharding to scale high-traffic and fastgrowing application • Focus engineering on building mobile app vs. back-end • Geo-indexing for easy querying of locationbased data • Scale efficiently with limited resources • Simple data model • Increased developer productivity
  • 50. Case Study MongoDB enables Gilt to roll out new revenuegenerating features faster and cheaper Problem • Monolithic Postgres architecture expensive to scale Why MongoDB • Dynamic schema makes it easy to build new features • Alignment with SOA • Limited ability to add new features for different business silos • Cost-effective, horizontal scaling • Spiky server loads • Easy to use and maintain Results • Developers can launch new services faster, e.g., customized upsell emails • Stable, sub-ms performance on commodity hardware • Reduced complexity yields lower overhead 50
  • 51. Case Study Built custom ecommerce platform on MongoDB in 8 Months Problem • Dated e-commerce site with limited capabilities • Usability issues • SQL database did not scale Why MongoDB • Multi-data center replication and sharding for DR and scalability • Dynamic schema • Fast performance (reads and writes) Results • Developers, users are empowered • Fast time to market • Database can meet evolving business needs • Superior user experience 51
  • 52. MongoDB Features • JSON Document Model with Dynamic Schemas • Full, Flexible Index Support and Rich Queries • Auto-Sharding for Horizontal Scalability • Built-In Replication for High Availability • Text Search, Geospatial queries • Advanced Security • Aggregation Framework and MapReduce 52 • Large Media Storage with GridFS

Notes de l'éditeur

  1. Welcome to the Relational DB to MongoDB Migration webinarMy name Mat Keep, taking through the considerations and best practices for db migrationTo set the context – we will see challenges in the way we build and run applications today, that a number of users are starting to look beyond relational databases to NoSQL as they evolve their applications – some are beyond looking and have actually done it So in this session, discuss considerations need to make as you explore potential migration from RDBMS to MongoDBAlso have Bryan Reinero on line – he is here to answer your questions – place in the webinar tool
  2. Start by talking about developing a roadmap for migration – then get into each stage of the migrationNot going to go massively deep on each topic – but rather look at key considerations you need to make as you evaluate a migration
  3. Before that, look at some of the drivers for migrationLook across the enterprise – see 4 x core strategic initiativesOrganizations concerned with enabling new apps, and enhancing existing apps – specifically to drive new revenue streams, enhancing operational processes, delivering better customer experiences:These updated applications involve handling rapid growth in data from new channels: from mobile and social apps, from sensor networks and so onAt same time, deliver application enhancements with faster time to market and lower ongoing TCO
  4. Many of these requirements causing RDBMS hit their design limitsMuch of the data being generated from new web, social or mobile apps is highly complex, very often semi-structured or unstructured, difficult to map into 2D structures of rows and columns. See what that data looks like laterData volumes and diversity of data increasingly, Development methodologies are changing, from traditional waterfall approaches where requirements are defined at the very start of the project, into agile approaches supporting iterative development and shorter release cyclesEmergence of new architectures for scaling out in the cloud – on-prem or public infrastructureAll of these requirements place massive strain on existing relational infrastructure
  5. Because of these challenges, Migration is well trodden path – see range of examples from web, media, telecoms, mobile, range of apps: ecommerce, social networking, real-time analytics, billing:Working with many of these, given insight into considerations discussing here
  6. So, how do we get started
  7. High level roadmap – divide into 5 stages, keys steps in each stageStarts with formation of project team, ensure we have all key stakeholders on board. successful projects bring together LOB, devs, DBAs Operational staff. Then get to the most important area – schema design, capture app requirements, model our data, define indexesApp integration – drivers and API we are using – models for consistency and durability, Also at this stage, start load testingPhysical migration of data, multiple options for doing thatOperational considerations, strategies in place for all things listedTo support you on your way – have free online mongodb training for devs and dbas – already had over 100k registrantsConsulting and support packages also available
  8. 1st and most critical phase of the migration is schema design – spend a little bit of time on thisAlso point you to a very recent webinar, available on-demand which explores this topic In much more depth
  9. Before getting into schema design, lets start by standardising terminology between relational and mongodb worlds
  10. lets cover comceptually the differences between the relational model and the documentrequires a change in perspective for data architects - from the relational data model that flattens data into rigid 2-dimensional tabular structures of rows and columns - to a document data model allows embedding of sub-documents and arrays. In this example of a car ownership database with 2 tables – 1 for person and another for their cars, the Relational model uses the Pers_ID field to JOIN the “Person” table with the “Car” table so that the application canreport owners of each car.Modeling the same data in MongoDB enables us to create a schema where we embed each of the cars as an array of sub-documents directly within the Person document.
  11. So what are the advantages of this approach
  12. A significant advantage of NoSQL databases is that the schema is dynamic. Collections can be created without first defining the structure of documents. Documents in a collection need not have an identical set of fields. The structure of documents can be changed adding new fields or deleting existing ones. Enables schema to be evolved much more rapidly. Do have a lot of flexibility during the migration phase to quickly iterate over schema design, and that flexibility carries on to production
  13. Lets continue the comparison between relational and document model - consider the example of a blogging platformGot 5 tables - Category, article, user, comments and tags - the application relies on the RDBMS to join five separate tables in order to build the blog entry. –
  14. In the case of MongoDB, all of the blog data is aggregated within a single document, linked with a single reference to a user document containing authors ofboth the blog and commentsFrom a performance and scalability perspective, the aggregated document can be accessed in a single call to the database, rather than having to JOIN multiple tables to respond to a query
  15. How do we go about defining the data model?The golden rule of schema design is that the data access patterns of the application should govern the design process, look specifically: The read / write ratio of database operations;The types of queries and updates performed by the database;The growth rate of documents.As a first step,capture the operations the application needs to perform on it’s data,using example of a product catalogFor illustration we show the process of creating the product record, retrieving the record and adding a review – show the steps the relational database takes for each of those tasks and look at how that same functionality is handled by MongoDB – this helps inform the schema design for our new databaseAnother important reference point is to look at logs of current database to identify most common queries, identify what data is stored togetherThese steps will help to identify the ideal document schema for the application’s data, based on the queries and operations to be performed against it.
  16. Impt design decision is deciding when to embed related data in a document or instead create a reference between separate documents. This is an application-specific consideration. There aresome general considerations that can be made to guide the decision during schema design.Embedding data via sub-documents or arrays, makes senseFor 1:1 or 1:Many (where “many” viewed with the parent)Consider the concept of ownership and containment. Using the product data example earlier, product pricing - should be embedded within the product document as it is owned by and contained within the product. If the product is deleted, the pricing becomes redundant. Makes sense to embedNeed to consider document growth – MongoDB current max document size is 16MBReferencing gives greater degree of data normalization, and can give more flexibility than embedding; however to resolve the reference, the application will issue follow-up queries, requiring additional round-trips to the server.References are generally implemented by saving the _id field of one document in the related document as a reference. The application then runs a second query to return the referenced data. Referencing is generally used when:Embedding would not provide sufficient read performance advantages to outweigh the implications of data duplication – got example;To represent more complex many-to-many relationships;To model large hierarchical data sets.
  17. Good example of referencing iswhere we we have 1: Many relationship, in this case 1 publisher to many books, as we show on this chartEmbedding books inside publisher create rapidly growing docs, because new books continually being added Embedding publisher sub-document inside of each book causes lots of data duplication,so instead we create a reference using publisher ID – in this case on o’reilly
  18. During schema design, think about how we query our data – some NoSQL databases that are little more than key/value stores, so you maybe able to ingest data quickly, but you can’t do anything other than primary key lookups – huge backward step coming from the relational worldMongoDB on the other hand has a rich query model enabled by extensive indexing. Indexes can be defined for any key or array within the document, as secondary indexesMongoDB indexing will be familiar to DBAs- B-Tree Indexes, Secondary IndexesAs with a relational DB, indexes are the single biggest tunable performance factor- Define indexes by identifying common queries- Use MongoDB explain to ensure index coverage- Use MongoDB profiler log all slow queriesListed index types on the slide, include text search and geospatialArray indexes allow you to index each element of an embedded array, ie in a document describing a product, each of the categories that the product can be classified under can be included in an array and indexed, so get a major performance boost when users are searching by those classificationsThis sort of flexibility gives MongoDB ability to run complex queries quickly
  19. Move on to App IntegrationDiscuss drivers, discuss application query requirements, and how Mongo can be configured to meet data integrity &amp; consistency requirements of the app
  20. MongoDB has idiomatic drivers for the most popular languages with over a dozen developed and supported by MongoDB and 30+ community-supported drivers. MongoDB API is implemented as methods within the API of a specific programming language, as opposed to a completely separate language like SQL. If we couple this withMongoDB’s document model and their affinity data structures used in object-oriented programming, makes integration with applications very simple.The MongoDB API is becoming a standardIBM recently announced its collaboration in developing standards for mobile app development with the API in their own DB2 database.
  21. For developers familiar with SQL it is useful to understand how core SQL statements map to the MongoDB API. Got an example on screen.The documentation includes a mapping chart to help in that transition
  22. impt to understand you’re not losing query functionality when migrating to MongoDB.Beyond powerful API and secondary indexes, MongoDB also offers extensive capabilities for analysis of data, both within the database and through integration with other tooling and frameworks –MongoDB provides the Aggregation Framework which delivers similar functionality to the SQL GROUP_BY statementGives you Ad-hoc reporting, grouping and aggregations, without the complexity of MapReduce: Max, Min, Averages, SumSupports aggregations on single servers &amp; across shards
  23. The SQL to Aggregation Mapping Chart in our docs shows a number of examples of how queries in SQL are handled in MongoDB’s aggregation framework- very simple to get started with this functionality:
  24. Move on to even more advanced analyticsMongodbprovides native support for MapReduce operations over both sharded and unsharded collections. MapReduce can give developers finer grained control over grouping keys and a variety of output options.MongoDB Connector for Hadoop – read slide
  25. In terms of reporting, A number of Business Intelligence (BI) vendors have developed connectors to integrate MongoDB as a data source with their suites, alongside traditional relational dbs. This integration provides reporting, visualisations, dash-boarding of MongoDB data
  26. RDBMS offers stong data integrity and consistency controls – look at how you maintain those things in MongoDB – key part of application integrationStart with data integrityAll write operations are atomic at a document level within MongoDB - including the updating of embedded arrays and sub-documents. By embedding related fields within a single document, users get the same integrity guarantees as a traditional RDBMS without perf overhead of having to synchronize ACID operations and maintain referential integrity across separate tables. If you need multi-doc updates, can get tx like semantics using findanymodify or in some cases, 2PCfindandmodify command allows a document to be updated atomically and returned in the same round trip. This allows users to build job queues and state machines which can be used to implement basic transaction semantics
  27. MongoDB by default is strongly consistent - all reads directed to primary servers. Also by default, any reads from secondary servers within a MongoDB replica set will be eventually consistent - much like legacy master / slave replication in a relational database. Administrators can configure the secondary replicas to handle read traffic if needed,ie for long running reports or geographic distribution of replicas across TZs.
  28. MongoDB ensures durability through write concerns - used to configure the level of guarantee the server provides in response to a write operation. By default, MongoDB drivers wait for the mongo server to acknowledge the write from the primary before commiting to the application – this is configurable on a per operation basis, ie you might wait for the write to be ack by primary + 1 replica, a majority of replicas or all replicas. Perf impact in doing thatAlternatively, can relaxed write concern, the application can send a write operation to MongoDB and then continue without waiting for a response from the database, useful for logging apps
  29. In addition, MongoDB uses WAL to guarantee write durability and crash resistance
  30. Start with how to migrate data from the relational database to MongoDBThere are multiple options.-Tried to show different options on this 1 chart1 of most common is users writing scripts that extract data in a JSON format, matching the desired schema with embedding and arrays. Then use mongoimport to read directly into the databaseETL tools commonly used.A number of migrations have involved running the existing RDBMS in parallel with the new MongoDB database, incrementally transferring production data:- As records are retrieved from the RDBMS, the application writes them back out to MongoDB in the required document schema. - Consistency checkers, for example using MD5 checksums, can be used to validate the migrated data. All newly created or updated data is written to MongoDB onlyShutterfly used this incremental approach to migrate the metadata of 6 billion images and 20TB of data from Oracle to MongoDB.
  31. The considerations discussed so far fall into the domain of the data architects, developers and DBAs. Need to ensure database performs reliably at scale, and can be proactively monitored and managedThe final set of considerations in migration planning should focus on operational issues.The MongoDB Operations Best Practices guide is the definitive reference to learn more on this key area:Will touch on a couple of areas
  32. MongoDB has a range of mgmt and monitoring such as mongostat, mongotop, database profiler can be used to profile key operational metrics– one I want to highlight is MMSCloud-based suite of services for managing MongoDB deploymentsMonitoring, with charts, dashboards and alerts on 100+ metricsBackup and restore, with point-in-time recovery, support for sharded clustersMMS On-Prem included with MongoDB Enterprise (backup coming soon)
  33. Look at HA with replica sets - Touched on these briefly earlierMongoDB maintains multiple copies of data in what we call replica sets.. Failover between replicas is fully automated, eliminating the need for administrators to intervene ManuallyMulti-data center supportAs well as handling failures, also be used to run maintenance operations using rolling restarts to avoid downtime
  34. MongoDB provides horizontal scale-out for databases using a technique called sharding, which is transparent to applications. Sharding distributes data across multiple physical partitions,allows us to scale beyond the hardware limitations of a single server, without adding complexity to the applicationData is automatically balanced across the shards to ensure even distributionNeed to ensure you pick the right shard key
  35. That was a very quick view of operations – download Ops Guide to learn moreSummary
  36. Also look at how we enables our migration teams to be successful – training is a key way to do this, via MongoDB UniSpecific courses are available for both developers and for DBAs, with options in how the courses are consumedFree, web-based classes, delivered over a 7 week period, supported by lectures, homework and forums to engage with instructors and other students – over 100k enrollments;2-day public training events held at MongoDB facilities;Private training customised to an organization’s specific requirements, delivered at their site.
  37. SLAs – as low as 30 minutes
  38. MongoDB-enabled applications include:Dealer Analytics. This mobile and wired app built on MongoDB provides dealers with real-time insight into how users are interacting with inventory on Edmunds.com.Ad Sales Management. A MongoDB-based web app provides a consolidated view on inventory of Edmunds ad placements and helps ensure clients are accurately billed for ads served.Ratings / Reviews. MongoDB’s document database was a good choice for replacing Edmunds’ CMS as a means of data storage for this domain as the domain itself fits very naturally into a document model.User Registration. MongoDB replaced a proprietary system to help Edmunds.com manage user data.
  39. Core features of MongoDB are well aligned to organisations looking to evolve their applications to support new workloads– exploring some of these as part of the migration process