SlideShare une entreprise Scribd logo
1  sur  51
Perl Engineer & Evangelist, 10gen
Mike Friedman
#MongoDBdays
Schema Design
Four Real-World Use
Cases
Single Table En
Agenda
• Why is schema design important
• 4 Real World Schemas
– Inbox
– History
– IndexedAttributes
– Multiple Identities
• Conclusions
Why is Schema Design
important?
• Largest factor for a performant system
• Schema design with MongoDB is different
• RDBMS – "What answers do I have?"
• MongoDB – "What question will I have?"
#1 - Message Inbox
Let’s get
Social
Sending Messages
?
Design Goals
• Efficiently send new messages to recipients
• Efficiently read inbox
Reading my Inbox
?
3 Approaches (there are
more)
• Fan out on Read
• Fan out on Write
• Fan out on Write with Bucketing
// Shard on "from"
db.shardCollection( "mongodbdays.inbox", { from: 1 } )
// Make sure we have an index to handle inbox reads
db.inbox.ensureIndex( { to: 1, sent: 1 } )
msg = {
from: "Joe",
to: [ "Bob", "Jane" ],
sent: new Date(),
message: "Hi!",
}
// Send a message
db.inbox.save( msg )
// Read my inbox
db.inbox.find( { to: "Joe" } ).sort( { sent: -1 } )
Fan out on read
Fan out on read – Send
Message
Shard 1 Shard 2 Shard 3
Send
Message
Fan out on read – Inbox Read
Shard 1 Shard 2 Shard 3
Read
Inbox
Considerations
• One document per message sent
• Reading an inbox means finding all messages
with my own name in the recipient field
• Requires scatter-gather on sharded cluster
• Then a lot of random IO on a shard to find
everything
// Shard on “recipient” and “sent”
db.shardCollection( "mongodbdays.inbox", { ”recipient”: 1, ”sent”: 1 } )
msg = {
from: "Joe",
to: [ "Bob", "Jane" ],
sent: new Date(),
message: "Hi!",
}
// Send a message
for ( recipient in msg.to ) {
msg.recipient = msg.to[recipient]
db.inbox.save( msg );
}
// Read my inbox
db.inbox.find( { recipient: "Joe" } ).sort( { sent: -1 } )
Fan out on write
Fan out on write – Send
Message
Shard 1 Shard 2 Shard 3
Send
Message
Fan out on write– Read Inbox
Shard 1 Shard 2 Shard 3
Read
Inbox
Considerations
• One document per recipient
• Reading my inbox is just finding all of the
messages with me as the recipient
• Can shard on recipient, so inbox reads hit one
shard
• But still lots of random IO on the shard
// Shard on “owner / sequence”
db.shardCollection( "mongodbdays.inbox", { owner: 1, sequence: 1 } )
db.shardCollection( "mongodbdays.users", { user_name: 1 } )
msg = {
from: "Joe",
to: [ "Bob", "Jane" ],
sent: new Date(),
message: "Hi!",
}
Fan out on write with buckets
// Send a message
for( recipient in msg.to) {
count = db.users.findAndModify({
query: { user_name: msg.to[recipient] },
update: { "$inc": { "msg_count": 1 } },
upsert: true,
new: true }).msg_count;
sequence = Math.floor(count / 50);
db.inbox.update({
owner: msg.to[recipient], sequence: sequence },
{ $push: { "messages": msg } },
{ upsert: true } );
}
// Read my inbox
db.inbox.find( { owner: "Joe" } ).sort ( { sequence: -1 } ).limit( 2 )
Fan out on write with buckets
Fan out on write with buckets
• Each “inbox” document is an array of messages
• Append a message onto “inbox” of recipient
• Bucket inboxes so there’s not too many
messages per document
• Can shard on recipient, so inbox reads hit one
shard
• 1 or 2 documents to read the whole inbox
Fan out on write with buckets -
Send
Shard 1 Shard 2 Shard 3
Send
Message
Fan out on write with buckets -
Read
Shard 1 Shard 2 Shard 3
Read
Inbox
#2 – History
Design Goals
• Need to retain a limited amount of history e.g.
– Hours, Days, Weeks
– May be legislative requirement (e.g. HIPPA, SOX, DPA)
• Need to query efficiently by
– match
– ranges
3 Approaches (there are
more)
• Bucket by Number of messages
• Fixed size Array
• Bucket by Date + TTL Collections
db.inbox.find()
{ owner: "Joe", sequence: 25,
messages: [
{ from: "Joe",
to: [ "Bob", "Jane" ],
sent: ISODate("2013-03-01T09:59:42.689Z"),
message: "Hi!"
},
…
] }
// Query with a date range
db.inbox.find ({owner: "friend1",
messages: {
$elemMatch: {sent:{$gte: ISODate("…") }}}})
// Remove elements based on a date
db.inbox.update({owner: "friend1" },
{ $pull: { messages: {
sent: { $gte: ISODate("…") } } } } )
Inbox – Bucket by #
messages
Considerations
• Shrinking documents, space can be reclaimed
with
– db.runCommand ( { compact: '<collection>' } )
• Removing the document after the last element in
the array as been removed
– { "_id" : …, "messages" : [ ], "owner" : "friend1",
"sequence" : 0 }
msg = {
from: "Your Boss",
to: [ "Bob" ],
sent: new Date(),
message: "CALL ME NOW!"
}
// 2.4 Introduces $each, $sort and $slice for $push
db.messages.update(
{ _id: 1 },
{ $push: { messages: { $each: [ msg ],
$sort: { sent: 1 },
$slice: -50 }
}
}
)
Maintain the latest – Fixed
Size Array
Considerations
• Need to compute the size of the array based on
retention period
// messages: one doc per user per day
db.inbox.findOne()
{
_id: 1,
to: "Joe",
sequence: ISODate("2013-02-04T00:00:00.392Z"),
messages: [ ]
}
// Auto expires data after 31536000 seconds = 1 year
db.messages.ensureIndex( { sequence: 1 },
{ expireAfterSeconds: 31536000 } )
TTL Collections
#3 – Indexed Attributes
Design Goal
• Application needs to stored a variable number of
attributes e.g.
– User defined Form
– Meta Data tags
• Queries needed
– Equality
– Range based
• Need to be efficient, regardless of the number of
attributes
2 Approaches (there are
more)
• Attributes as Embedded Document
• Attributes as Objects in an Array
db.files.insert( { _id: "local.0",
attr: { type: "text", size: 64,
created: ISODate("..." } } )
db.files.insert( { _id: "local.1",
attr: { type: "text", size: 128} } )
db.files.insert( { _id: "mongod",
attr: { type: "binary", size: 256,
created: ISODate("...") } } )
// Need to create an index for each item in the sub-document
db.files.ensureIndex( { "attr.type": 1 } )
db.files.find( { "attr.type": "text"} )
// Can perform range queries
db.files.ensureIndex( { "attr.size": 1 } )
db.files.find( { "attr.size": { $gt: 64, $lte: 16384 } } )
Attributes as a Sub-
Document
Considerations
• Each attribute needs an Index
• Each time you extend, you add an index
• Lots and lots of indexes
db.files.insert( {_id: "local.0",
attr: [ { type: "text" },
{ size: 64 },
{ created: ISODate("...") } ] } )
db.files.insert( { _id: "local.1",
attr: [ { type: "text" },
{ size: 128 } ] } )
db.files.insert( { _id: "mongod",
attr: [ { type: "binary" },
{ size: 256 },
{ created: ISODate("...") } ] } )
db.files.ensureIndex( { attr: 1 } )
Attributes as Objects in Array
Considerations
• Only one index needed on attr
• Can support range queries, etc.
• Index can be used only once per query
#4 – Multiple Identities
Design Goal
• Ability to look up by a number of different
identities e.g.
• Username
• Email address
• FB Handle
• LinkedIn URL
2 Approaches (there are
more)
• Identifiers in a single document
• Separate Identifiers from Content
db.users.findOne()
{ _id: "joe",
email: "joe@example.com,
fb: "joe.smith", // facebook
li: "joe.e.smith", // linkedin
other: {…}
}
// Shard collection by _id
db.shardCollection("mongodbdays.users", { _id: 1 } )
// Create indexes on each key
db.users.ensureIndex( { email: 1} )
db.users.ensureIndex( { fb: 1 } )
db.users.ensureIndex( { li: 1 } )
Single Document by User
Read by _id (shard key)
Shard 1 Shard 2 Shard 3
find( { _id: "joe"} )
Read by email (non-shard
key)
Shard 1 Shard 2 Shard 3
find ( { email: joe@example.com }
)
Considerations
• Lookup by shard key is routed to 1 shard
• Lookup by other identifier is scatter gathered
across all shards
• Secondary keys cannot have a unique index
// Create unique index
db.identities.ensureIndex( { identifier : 1} , { unique: true} )
// Create a document for each users document
db.identities.save(
{ identifier : { hndl: "joe" }, user: "1200-42" } )
db.identities.save(
{ identifier : { email: "joe@abc.com" }, user: "1200-42" } )
db.identities.save(
{ identifier : { li: "joe.e.smith" }, user: "1200-42" } )
// Shard collection by _id
db.shardCollection( "mydb.identities", { identifier : 1 } )
// Create unique index
db.users.ensureIndex( { _id: 1} , { unique: true} )
// Shard collection by _id
db.shardCollection( "mydb.users", { _id: 1 } )
Document per Identity
Read requires 2 reads
Shard 1 Shard 2 Shard 3
db.identities.find({"identifier" : {
"hndl" : "joe" }})
db.users.find( { _id: "1200-42"}
)
Considerations
• Lookup to Identities is a routed query
• Lookup to Users is a routed query
• Unique indexes available
Conclusion
Summary
• Multiple ways to model a domain problem
• Understand the key uses cases of your app
• Balance between ease of query vs. ease of write
• Random IO should be avoided
Perl Engineer & Evangelist, 10gen
Mike Friedman
#MongoDBdays
Thank You

Contenu connexe

Tendances

Understanding and tuning WiredTiger, the new high performance database engine...
Understanding and tuning WiredTiger, the new high performance database engine...Understanding and tuning WiredTiger, the new high performance database engine...
Understanding and tuning WiredTiger, the new high performance database engine...Ontico
 
MongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and ImplicationsMongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and ImplicationsMongoDB
 
Sizing Your MongoDB Cluster
Sizing Your MongoDB ClusterSizing Your MongoDB Cluster
Sizing Your MongoDB ClusterMongoDB
 
MongoDB Schema Design
MongoDB Schema DesignMongoDB Schema Design
MongoDB Schema DesignMongoDB
 
Webinar: MongoDB Schema Design and Performance Implications
Webinar: MongoDB Schema Design and Performance ImplicationsWebinar: MongoDB Schema Design and Performance Implications
Webinar: MongoDB Schema Design and Performance ImplicationsMongoDB
 
MongoDB Performance Tuning
MongoDB Performance TuningMongoDB Performance Tuning
MongoDB Performance TuningMongoDB
 
MongoDB Aggregation Performance
MongoDB Aggregation PerformanceMongoDB Aggregation Performance
MongoDB Aggregation PerformanceMongoDB
 
Copy of MongoDB .pptx
Copy of MongoDB .pptxCopy of MongoDB .pptx
Copy of MongoDB .pptxnehabsairam
 
Indexing and Performance Tuning
Indexing and Performance TuningIndexing and Performance Tuning
Indexing and Performance TuningMongoDB
 
High Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal CloudHigh Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal CloudMongoDB
 
A Technical Introduction to WiredTiger
A Technical Introduction to WiredTigerA Technical Introduction to WiredTiger
A Technical Introduction to WiredTigerMongoDB
 
Mongo DB schema design patterns
Mongo DB schema design patternsMongo DB schema design patterns
Mongo DB schema design patternsjoergreichert
 
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]MongoDB
 
MongoDB WiredTiger Internals: Journey To Transactions
MongoDB WiredTiger Internals: Journey To TransactionsMongoDB WiredTiger Internals: Journey To Transactions
MongoDB WiredTiger Internals: Journey To TransactionsMydbops
 
Fast querying indexing for performance (4)
Fast querying   indexing for performance (4)Fast querying   indexing for performance (4)
Fast querying indexing for performance (4)MongoDB
 

Tendances (20)

Understanding and tuning WiredTiger, the new high performance database engine...
Understanding and tuning WiredTiger, the new high performance database engine...Understanding and tuning WiredTiger, the new high performance database engine...
Understanding and tuning WiredTiger, the new high performance database engine...
 
MongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and ImplicationsMongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and Implications
 
Sizing Your MongoDB Cluster
Sizing Your MongoDB ClusterSizing Your MongoDB Cluster
Sizing Your MongoDB Cluster
 
MongoDB Schema Design
MongoDB Schema DesignMongoDB Schema Design
MongoDB Schema Design
 
Mongo db intro.pptx
Mongo db intro.pptxMongo db intro.pptx
Mongo db intro.pptx
 
Indexing
IndexingIndexing
Indexing
 
Key-Value NoSQL Database
Key-Value NoSQL DatabaseKey-Value NoSQL Database
Key-Value NoSQL Database
 
Webinar: MongoDB Schema Design and Performance Implications
Webinar: MongoDB Schema Design and Performance ImplicationsWebinar: MongoDB Schema Design and Performance Implications
Webinar: MongoDB Schema Design and Performance Implications
 
MongoDB Performance Tuning
MongoDB Performance TuningMongoDB Performance Tuning
MongoDB Performance Tuning
 
MongoDB Aggregation Performance
MongoDB Aggregation PerformanceMongoDB Aggregation Performance
MongoDB Aggregation Performance
 
Copy of MongoDB .pptx
Copy of MongoDB .pptxCopy of MongoDB .pptx
Copy of MongoDB .pptx
 
Indexing and Performance Tuning
Indexing and Performance TuningIndexing and Performance Tuning
Indexing and Performance Tuning
 
High Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal CloudHigh Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal Cloud
 
A Technical Introduction to WiredTiger
A Technical Introduction to WiredTigerA Technical Introduction to WiredTiger
A Technical Introduction to WiredTiger
 
MongoDB 101
MongoDB 101MongoDB 101
MongoDB 101
 
Mongo DB schema design patterns
Mongo DB schema design patternsMongo DB schema design patterns
Mongo DB schema design patterns
 
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]
Naver속도의, 속도에 의한, 속도를 위한 몽고DB (네이버 컨텐츠검색과 몽고DB) [Naver]
 
MongoDB WiredTiger Internals: Journey To Transactions
MongoDB WiredTiger Internals: Journey To TransactionsMongoDB WiredTiger Internals: Journey To Transactions
MongoDB WiredTiger Internals: Journey To Transactions
 
Fast querying indexing for performance (4)
Fast querying   indexing for performance (4)Fast querying   indexing for performance (4)
Fast querying indexing for performance (4)
 
Mongo DB Presentation
Mongo DB PresentationMongo DB Presentation
Mongo DB Presentation
 

En vedette

Advanced Schema Design Patterns
Advanced Schema Design Patterns Advanced Schema Design Patterns
Advanced Schema Design Patterns MongoDB
 
Building your first app with mongo db
Building your first app with mongo dbBuilding your first app with mongo db
Building your first app with mongo dbMongoDB
 
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)MongoDB
 
Agile Schema Design: An introduction to MongoDB
Agile Schema Design: An introduction to MongoDBAgile Schema Design: An introduction to MongoDB
Agile Schema Design: An introduction to MongoDBStennie Steneker
 
Business Metrics and Web Marketing
Business Metrics and Web MarketingBusiness Metrics and Web Marketing
Business Metrics and Web MarketingAlper AKBAS
 
World-Class Web Metrics by Dan Olsen
World-Class Web Metrics by Dan OlsenWorld-Class Web Metrics by Dan Olsen
World-Class Web Metrics by Dan OlsenDan Olsen
 
Web analytics 101: Web Metrics
Web analytics 101: Web MetricsWeb analytics 101: Web Metrics
Web analytics 101: Web MetricsSociety_Consulting
 
Web Metrics vs Web Behavioral Analytics and Why You Need to Know the Difference
Web Metrics vs Web Behavioral Analytics and Why You Need to Know the DifferenceWeb Metrics vs Web Behavioral Analytics and Why You Need to Know the Difference
Web Metrics vs Web Behavioral Analytics and Why You Need to Know the DifferenceAlterian
 
Schema Design with MongoDB
Schema Design with MongoDBSchema Design with MongoDB
Schema Design with MongoDBrogerbodamer
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional ModelingSunita Sahu
 
Dimensional Modeling Basic Concept with Example
Dimensional Modeling Basic Concept with ExampleDimensional Modeling Basic Concept with Example
Dimensional Modeling Basic Concept with ExampleSajjad Zaheer
 
Data Visualization and Dashboard Design
Data Visualization and Dashboard DesignData Visualization and Dashboard Design
Data Visualization and Dashboard DesignJacques Warren
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional Modelingaksrauf
 
OLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSEOLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSEZalpa Rathod
 
Data warehouse-dimensional-modeling-and-design
Data warehouse-dimensional-modeling-and-designData warehouse-dimensional-modeling-and-design
Data warehouse-dimensional-modeling-and-designSarita Kataria
 
Multi dimensional model vs (1)
Multi dimensional model vs (1)Multi dimensional model vs (1)
Multi dimensional model vs (1)JamesDempsey1
 

En vedette (17)

Advanced Schema Design Patterns
Advanced Schema Design Patterns Advanced Schema Design Patterns
Advanced Schema Design Patterns
 
Building your first app with mongo db
Building your first app with mongo dbBuilding your first app with mongo db
Building your first app with mongo db
 
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
 
Agile Schema Design: An introduction to MongoDB
Agile Schema Design: An introduction to MongoDBAgile Schema Design: An introduction to MongoDB
Agile Schema Design: An introduction to MongoDB
 
Business Metrics and Web Marketing
Business Metrics and Web MarketingBusiness Metrics and Web Marketing
Business Metrics and Web Marketing
 
World-Class Web Metrics by Dan Olsen
World-Class Web Metrics by Dan OlsenWorld-Class Web Metrics by Dan Olsen
World-Class Web Metrics by Dan Olsen
 
Web analytics 101: Web Metrics
Web analytics 101: Web MetricsWeb analytics 101: Web Metrics
Web analytics 101: Web Metrics
 
Web Metrics vs Web Behavioral Analytics and Why You Need to Know the Difference
Web Metrics vs Web Behavioral Analytics and Why You Need to Know the DifferenceWeb Metrics vs Web Behavioral Analytics and Why You Need to Know the Difference
Web Metrics vs Web Behavioral Analytics and Why You Need to Know the Difference
 
Schema Design with MongoDB
Schema Design with MongoDBSchema Design with MongoDB
Schema Design with MongoDB
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional Modeling
 
Dimensional Modeling Basic Concept with Example
Dimensional Modeling Basic Concept with ExampleDimensional Modeling Basic Concept with Example
Dimensional Modeling Basic Concept with Example
 
Data Visualization and Dashboard Design
Data Visualization and Dashboard DesignData Visualization and Dashboard Design
Data Visualization and Dashboard Design
 
Oltp vs olap
Oltp vs olapOltp vs olap
Oltp vs olap
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional Modeling
 
OLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSEOLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSE
 
Data warehouse-dimensional-modeling-and-design
Data warehouse-dimensional-modeling-and-designData warehouse-dimensional-modeling-and-design
Data warehouse-dimensional-modeling-and-design
 
Multi dimensional model vs (1)
Multi dimensional model vs (1)Multi dimensional model vs (1)
Multi dimensional model vs (1)
 

Similaire à MongoDB Schema Design: Four Real-World Examples

Choosing a Shard key
Choosing a Shard keyChoosing a Shard key
Choosing a Shard keyMongoDB
 
Data Modeling for the Real World
Data Modeling for the Real WorldData Modeling for the Real World
Data Modeling for the Real WorldMike Friedman
 
Data Modeling Examples from the Real World
Data Modeling Examples from the Real WorldData Modeling Examples from the Real World
Data Modeling Examples from the Real WorldMongoDB
 
Webinar: Data Modeling Examples in the Real World
Webinar: Data Modeling Examples in the Real WorldWebinar: Data Modeling Examples in the Real World
Webinar: Data Modeling Examples in the Real WorldMongoDB
 
MongoDB London 2013: Data Modeling Examples from the Real World presented by ...
MongoDB London 2013: Data Modeling Examples from the Real World presented by ...MongoDB London 2013: Data Modeling Examples from the Real World presented by ...
MongoDB London 2013: Data Modeling Examples from the Real World presented by ...MongoDB
 
Data Modeling Deep Dive
Data Modeling Deep DiveData Modeling Deep Dive
Data Modeling Deep DiveMongoDB
 
MongoDB San Francisco 2013: Data Modeling Examples From the Real World presen...
MongoDB San Francisco 2013: Data Modeling Examples From the Real World presen...MongoDB San Francisco 2013: Data Modeling Examples From the Real World presen...
MongoDB San Francisco 2013: Data Modeling Examples From the Real World presen...MongoDB
 
Schema Design - Real world use case
Schema Design - Real world use caseSchema Design - Real world use case
Schema Design - Real world use caseMatias Cascallares
 
10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data ModelingDATAVERSITY
 
MongoDB Strange Loop 2009
MongoDB Strange Loop 2009MongoDB Strange Loop 2009
MongoDB Strange Loop 2009Mike Dirolf
 
Managing Social Content with MongoDB
Managing Social Content with MongoDBManaging Social Content with MongoDB
Managing Social Content with MongoDBMongoDB
 
Schema Design (Mongo Austin)
Schema Design (Mongo Austin)Schema Design (Mongo Austin)
Schema Design (Mongo Austin)MongoDB
 
Mongodb intro
Mongodb introMongodb intro
Mongodb introchristkv
 
Intro to MongoDB and datamodeling
Intro to MongoDB and datamodeling Intro to MongoDB and datamodeling
Intro to MongoDB and datamodeling rogerbodamer
 
MongoDB for Coder Training (Coding Serbia 2013)
MongoDB for Coder Training (Coding Serbia 2013)MongoDB for Coder Training (Coding Serbia 2013)
MongoDB for Coder Training (Coding Serbia 2013)Uwe Printz
 
Schema design
Schema designSchema design
Schema designchristkv
 
MongoDB NYC Python
MongoDB NYC PythonMongoDB NYC Python
MongoDB NYC PythonMike Dirolf
 

Similaire à MongoDB Schema Design: Four Real-World Examples (20)

Choosing a Shard key
Choosing a Shard keyChoosing a Shard key
Choosing a Shard key
 
Data Modeling for the Real World
Data Modeling for the Real WorldData Modeling for the Real World
Data Modeling for the Real World
 
Data Modeling Examples from the Real World
Data Modeling Examples from the Real WorldData Modeling Examples from the Real World
Data Modeling Examples from the Real World
 
Webinar: Data Modeling Examples in the Real World
Webinar: Data Modeling Examples in the Real WorldWebinar: Data Modeling Examples in the Real World
Webinar: Data Modeling Examples in the Real World
 
MongoDB London 2013: Data Modeling Examples from the Real World presented by ...
MongoDB London 2013: Data Modeling Examples from the Real World presented by ...MongoDB London 2013: Data Modeling Examples from the Real World presented by ...
MongoDB London 2013: Data Modeling Examples from the Real World presented by ...
 
Data Modeling Deep Dive
Data Modeling Deep DiveData Modeling Deep Dive
Data Modeling Deep Dive
 
MongoDB San Francisco 2013: Data Modeling Examples From the Real World presen...
MongoDB San Francisco 2013: Data Modeling Examples From the Real World presen...MongoDB San Francisco 2013: Data Modeling Examples From the Real World presen...
MongoDB San Francisco 2013: Data Modeling Examples From the Real World presen...
 
Schema Design - Real world use case
Schema Design - Real world use caseSchema Design - Real world use case
Schema Design - Real world use case
 
10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling
 
MongoDB Strange Loop 2009
MongoDB Strange Loop 2009MongoDB Strange Loop 2009
MongoDB Strange Loop 2009
 
Managing Social Content with MongoDB
Managing Social Content with MongoDBManaging Social Content with MongoDB
Managing Social Content with MongoDB
 
Schema Design (Mongo Austin)
Schema Design (Mongo Austin)Schema Design (Mongo Austin)
Schema Design (Mongo Austin)
 
Mongodb intro
Mongodb introMongodb intro
Mongodb intro
 
Intro to MongoDB and datamodeling
Intro to MongoDB and datamodeling Intro to MongoDB and datamodeling
Intro to MongoDB and datamodeling
 
Full metal mongo
Full metal mongoFull metal mongo
Full metal mongo
 
MongoDB for Coder Training (Coding Serbia 2013)
MongoDB for Coder Training (Coding Serbia 2013)MongoDB for Coder Training (Coding Serbia 2013)
MongoDB for Coder Training (Coding Serbia 2013)
 
MongoDB at GUL
MongoDB at GULMongoDB at GUL
MongoDB at GUL
 
MongoDB at RuPy
MongoDB at RuPyMongoDB at RuPy
MongoDB at RuPy
 
Schema design
Schema designSchema design
Schema design
 
MongoDB NYC Python
MongoDB NYC PythonMongoDB NYC Python
MongoDB NYC Python
 

Plus de Mike Friedman

Basic Symbolic Computation in Perl
Basic Symbolic Computation in PerlBasic Symbolic Computation in Perl
Basic Symbolic Computation in PerlMike Friedman
 
Make Your Own Perl with Moops
Make Your Own Perl with MoopsMake Your Own Perl with Moops
Make Your Own Perl with MoopsMike Friedman
 
The Perl API for the Mortally Terrified (beta)
The Perl API for the Mortally Terrified (beta)The Perl API for the Mortally Terrified (beta)
The Perl API for the Mortally Terrified (beta)Mike Friedman
 
21st Century CPAN Testing: CPANci
21st Century CPAN Testing: CPANci21st Century CPAN Testing: CPANci
21st Century CPAN Testing: CPANciMike Friedman
 
CPANci: Continuous Integration for CPAN
CPANci: Continuous Integration for CPANCPANci: Continuous Integration for CPAN
CPANci: Continuous Integration for CPANMike Friedman
 
Building a MongoDB App with Perl
Building a MongoDB App with PerlBuilding a MongoDB App with Perl
Building a MongoDB App with PerlMike Friedman
 
Building Your First App with MongoDB
Building Your First App with MongoDBBuilding Your First App with MongoDB
Building Your First App with MongoDBMike Friedman
 
Building Scalable, Distributed Job Queues with Redis and Redis::Client
Building Scalable, Distributed Job Queues with Redis and Redis::ClientBuilding Scalable, Distributed Job Queues with Redis and Redis::Client
Building Scalable, Distributed Job Queues with Redis and Redis::ClientMike Friedman
 

Plus de Mike Friedman (8)

Basic Symbolic Computation in Perl
Basic Symbolic Computation in PerlBasic Symbolic Computation in Perl
Basic Symbolic Computation in Perl
 
Make Your Own Perl with Moops
Make Your Own Perl with MoopsMake Your Own Perl with Moops
Make Your Own Perl with Moops
 
The Perl API for the Mortally Terrified (beta)
The Perl API for the Mortally Terrified (beta)The Perl API for the Mortally Terrified (beta)
The Perl API for the Mortally Terrified (beta)
 
21st Century CPAN Testing: CPANci
21st Century CPAN Testing: CPANci21st Century CPAN Testing: CPANci
21st Century CPAN Testing: CPANci
 
CPANci: Continuous Integration for CPAN
CPANci: Continuous Integration for CPANCPANci: Continuous Integration for CPAN
CPANci: Continuous Integration for CPAN
 
Building a MongoDB App with Perl
Building a MongoDB App with PerlBuilding a MongoDB App with Perl
Building a MongoDB App with Perl
 
Building Your First App with MongoDB
Building Your First App with MongoDBBuilding Your First App with MongoDB
Building Your First App with MongoDB
 
Building Scalable, Distributed Job Queues with Redis and Redis::Client
Building Scalable, Distributed Job Queues with Redis and Redis::ClientBuilding Scalable, Distributed Job Queues with Redis and Redis::Client
Building Scalable, Distributed Job Queues with Redis and Redis::Client
 

Dernier

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 

Dernier (20)

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 

MongoDB Schema Design: Four Real-World Examples

  • 1. Perl Engineer & Evangelist, 10gen Mike Friedman #MongoDBdays Schema Design Four Real-World Use Cases
  • 2. Single Table En Agenda • Why is schema design important • 4 Real World Schemas – Inbox – History – IndexedAttributes – Multiple Identities • Conclusions
  • 3. Why is Schema Design important? • Largest factor for a performant system • Schema design with MongoDB is different • RDBMS – "What answers do I have?" • MongoDB – "What question will I have?"
  • 4. #1 - Message Inbox
  • 7. Design Goals • Efficiently send new messages to recipients • Efficiently read inbox
  • 9. 3 Approaches (there are more) • Fan out on Read • Fan out on Write • Fan out on Write with Bucketing
  • 10. // Shard on "from" db.shardCollection( "mongodbdays.inbox", { from: 1 } ) // Make sure we have an index to handle inbox reads db.inbox.ensureIndex( { to: 1, sent: 1 } ) msg = { from: "Joe", to: [ "Bob", "Jane" ], sent: new Date(), message: "Hi!", } // Send a message db.inbox.save( msg ) // Read my inbox db.inbox.find( { to: "Joe" } ).sort( { sent: -1 } ) Fan out on read
  • 11. Fan out on read – Send Message Shard 1 Shard 2 Shard 3 Send Message
  • 12. Fan out on read – Inbox Read Shard 1 Shard 2 Shard 3 Read Inbox
  • 13. Considerations • One document per message sent • Reading an inbox means finding all messages with my own name in the recipient field • Requires scatter-gather on sharded cluster • Then a lot of random IO on a shard to find everything
  • 14. // Shard on “recipient” and “sent” db.shardCollection( "mongodbdays.inbox", { ”recipient”: 1, ”sent”: 1 } ) msg = { from: "Joe", to: [ "Bob", "Jane" ], sent: new Date(), message: "Hi!", } // Send a message for ( recipient in msg.to ) { msg.recipient = msg.to[recipient] db.inbox.save( msg ); } // Read my inbox db.inbox.find( { recipient: "Joe" } ).sort( { sent: -1 } ) Fan out on write
  • 15. Fan out on write – Send Message Shard 1 Shard 2 Shard 3 Send Message
  • 16. Fan out on write– Read Inbox Shard 1 Shard 2 Shard 3 Read Inbox
  • 17. Considerations • One document per recipient • Reading my inbox is just finding all of the messages with me as the recipient • Can shard on recipient, so inbox reads hit one shard • But still lots of random IO on the shard
  • 18. // Shard on “owner / sequence” db.shardCollection( "mongodbdays.inbox", { owner: 1, sequence: 1 } ) db.shardCollection( "mongodbdays.users", { user_name: 1 } ) msg = { from: "Joe", to: [ "Bob", "Jane" ], sent: new Date(), message: "Hi!", } Fan out on write with buckets
  • 19. // Send a message for( recipient in msg.to) { count = db.users.findAndModify({ query: { user_name: msg.to[recipient] }, update: { "$inc": { "msg_count": 1 } }, upsert: true, new: true }).msg_count; sequence = Math.floor(count / 50); db.inbox.update({ owner: msg.to[recipient], sequence: sequence }, { $push: { "messages": msg } }, { upsert: true } ); } // Read my inbox db.inbox.find( { owner: "Joe" } ).sort ( { sequence: -1 } ).limit( 2 ) Fan out on write with buckets
  • 20. Fan out on write with buckets • Each “inbox” document is an array of messages • Append a message onto “inbox” of recipient • Bucket inboxes so there’s not too many messages per document • Can shard on recipient, so inbox reads hit one shard • 1 or 2 documents to read the whole inbox
  • 21. Fan out on write with buckets - Send Shard 1 Shard 2 Shard 3 Send Message
  • 22. Fan out on write with buckets - Read Shard 1 Shard 2 Shard 3 Read Inbox
  • 24.
  • 25. Design Goals • Need to retain a limited amount of history e.g. – Hours, Days, Weeks – May be legislative requirement (e.g. HIPPA, SOX, DPA) • Need to query efficiently by – match – ranges
  • 26. 3 Approaches (there are more) • Bucket by Number of messages • Fixed size Array • Bucket by Date + TTL Collections
  • 27. db.inbox.find() { owner: "Joe", sequence: 25, messages: [ { from: "Joe", to: [ "Bob", "Jane" ], sent: ISODate("2013-03-01T09:59:42.689Z"), message: "Hi!" }, … ] } // Query with a date range db.inbox.find ({owner: "friend1", messages: { $elemMatch: {sent:{$gte: ISODate("…") }}}}) // Remove elements based on a date db.inbox.update({owner: "friend1" }, { $pull: { messages: { sent: { $gte: ISODate("…") } } } } ) Inbox – Bucket by # messages
  • 28. Considerations • Shrinking documents, space can be reclaimed with – db.runCommand ( { compact: '<collection>' } ) • Removing the document after the last element in the array as been removed – { "_id" : …, "messages" : [ ], "owner" : "friend1", "sequence" : 0 }
  • 29. msg = { from: "Your Boss", to: [ "Bob" ], sent: new Date(), message: "CALL ME NOW!" } // 2.4 Introduces $each, $sort and $slice for $push db.messages.update( { _id: 1 }, { $push: { messages: { $each: [ msg ], $sort: { sent: 1 }, $slice: -50 } } } ) Maintain the latest – Fixed Size Array
  • 30. Considerations • Need to compute the size of the array based on retention period
  • 31. // messages: one doc per user per day db.inbox.findOne() { _id: 1, to: "Joe", sequence: ISODate("2013-02-04T00:00:00.392Z"), messages: [ ] } // Auto expires data after 31536000 seconds = 1 year db.messages.ensureIndex( { sequence: 1 }, { expireAfterSeconds: 31536000 } ) TTL Collections
  • 32. #3 – Indexed Attributes
  • 33. Design Goal • Application needs to stored a variable number of attributes e.g. – User defined Form – Meta Data tags • Queries needed – Equality – Range based • Need to be efficient, regardless of the number of attributes
  • 34. 2 Approaches (there are more) • Attributes as Embedded Document • Attributes as Objects in an Array
  • 35. db.files.insert( { _id: "local.0", attr: { type: "text", size: 64, created: ISODate("..." } } ) db.files.insert( { _id: "local.1", attr: { type: "text", size: 128} } ) db.files.insert( { _id: "mongod", attr: { type: "binary", size: 256, created: ISODate("...") } } ) // Need to create an index for each item in the sub-document db.files.ensureIndex( { "attr.type": 1 } ) db.files.find( { "attr.type": "text"} ) // Can perform range queries db.files.ensureIndex( { "attr.size": 1 } ) db.files.find( { "attr.size": { $gt: 64, $lte: 16384 } } ) Attributes as a Sub- Document
  • 36. Considerations • Each attribute needs an Index • Each time you extend, you add an index • Lots and lots of indexes
  • 37. db.files.insert( {_id: "local.0", attr: [ { type: "text" }, { size: 64 }, { created: ISODate("...") } ] } ) db.files.insert( { _id: "local.1", attr: [ { type: "text" }, { size: 128 } ] } ) db.files.insert( { _id: "mongod", attr: [ { type: "binary" }, { size: 256 }, { created: ISODate("...") } ] } ) db.files.ensureIndex( { attr: 1 } ) Attributes as Objects in Array
  • 38. Considerations • Only one index needed on attr • Can support range queries, etc. • Index can be used only once per query
  • 39. #4 – Multiple Identities
  • 40. Design Goal • Ability to look up by a number of different identities e.g. • Username • Email address • FB Handle • LinkedIn URL
  • 41. 2 Approaches (there are more) • Identifiers in a single document • Separate Identifiers from Content
  • 42. db.users.findOne() { _id: "joe", email: "joe@example.com, fb: "joe.smith", // facebook li: "joe.e.smith", // linkedin other: {…} } // Shard collection by _id db.shardCollection("mongodbdays.users", { _id: 1 } ) // Create indexes on each key db.users.ensureIndex( { email: 1} ) db.users.ensureIndex( { fb: 1 } ) db.users.ensureIndex( { li: 1 } ) Single Document by User
  • 43. Read by _id (shard key) Shard 1 Shard 2 Shard 3 find( { _id: "joe"} )
  • 44. Read by email (non-shard key) Shard 1 Shard 2 Shard 3 find ( { email: joe@example.com } )
  • 45. Considerations • Lookup by shard key is routed to 1 shard • Lookup by other identifier is scatter gathered across all shards • Secondary keys cannot have a unique index
  • 46. // Create unique index db.identities.ensureIndex( { identifier : 1} , { unique: true} ) // Create a document for each users document db.identities.save( { identifier : { hndl: "joe" }, user: "1200-42" } ) db.identities.save( { identifier : { email: "joe@abc.com" }, user: "1200-42" } ) db.identities.save( { identifier : { li: "joe.e.smith" }, user: "1200-42" } ) // Shard collection by _id db.shardCollection( "mydb.identities", { identifier : 1 } ) // Create unique index db.users.ensureIndex( { _id: 1} , { unique: true} ) // Shard collection by _id db.shardCollection( "mydb.users", { _id: 1 } ) Document per Identity
  • 47. Read requires 2 reads Shard 1 Shard 2 Shard 3 db.identities.find({"identifier" : { "hndl" : "joe" }}) db.users.find( { _id: "1200-42"} )
  • 48. Considerations • Lookup to Identities is a routed query • Lookup to Users is a routed query • Unique indexes available
  • 50. Summary • Multiple ways to model a domain problem • Understand the key uses cases of your app • Balance between ease of query vs. ease of write • Random IO should be avoided
  • 51. Perl Engineer & Evangelist, 10gen Mike Friedman #MongoDBdays Thank You