Agenda
Why is schema design important
4 Real World Schemas
  Inbox
  History
  Indexed Attributes
  Multiple Identities
Conclusions
Single Table En
#1 - Message Inbox
Let's get Social
Sending Messages
Design Goals
Efficiently send new messages to recipients
Efficiently read inbox
Reading my Inbox
[Diagram: Shard 1, Shard 2, Shard 3]
Considerations
1 document per message sent
Multiple recipients in an array key
Reading an inbox is finding all messages with my
everything
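The one-document-per-message design can be sketched in plain JavaScript. This is an in-memory illustration only, not the deck's code: the `messages` array stands in for a collection, and the field names `from`, `sent` and `message` are assumptions.

```javascript
// Fan out on read: one document per message, recipients held in an array.
// `messages` is a stand-in for a collection (assumption, in-memory only).
const messages = [];

function sendMessage(from, to, text) {
  // A single write, however many recipients there are.
  messages.push({ from: from, to: to.slice(), sent: new Date(), message: text });
}

function readInbox(user) {
  // Every stored message is checked for the user in its `to` array.
  // On a sharded cluster this kind of lookup has to ask every shard.
  return messages.filter(function (doc) {
    return doc.to.includes(user);
  });
}

sendMessage("Joe", ["Bob", "Jane"], "Hi!");
sendMessage("Bob", ["Joe"], "Hello");
console.log(readInbox("Joe").length); // 1
```

Sending is cheap (one write), while reading does the work of finding matching documents wherever they live.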
[Diagram: Shard 1, Shard 2, Shard 3]
Considerations
1 document per recipient
Reading my inbox is just finding all of the
Can shard on recipient, so inbox reads hit one shard
But still lots of random IO on the shard
A few documents to read the whole inbox
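The one-document-per-recipient design, sketched the same way. Again an in-memory illustration under assumptions: `inboxDocs` stands in for a collection, and the field names other than `to` are hypothetical.

```javascript
// Fan out on write: one document per recipient.
// `inboxDocs` is a stand-in for a collection (assumption, in-memory only).
const inboxDocs = [];

function sendToAll(from, recipients, text) {
  // One write per recipient, so sending costs more than in the array design.
  recipients.forEach(function (recipient) {
    inboxDocs.push({ to: recipient, from: from, sent: new Date(), message: text });
  });
  return recipients.length; // number of documents written
}

function readInboxFor(user) {
  // Equality match on a single `to` value. If `to` is the shard key,
  // a router can send this query to one shard.
  return inboxDocs.filter(function (doc) {
    return doc.to === user;
  });
}

console.log(sendToAll("Joe", ["Bob", "Jane"], "Hi!")); // 2
console.log(readInboxFor("Bob").length); // 1
```

The cost moves from read time to write time: more documents are written, but each inbox read targets one recipient value.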
[Diagram: Shard 1, Shard 2, Shard 3]
#2 History
Design Goals
Need to retain a limited amount of history, e.g. hours, days, weeks
May be a legislative requirement (e.g. HIPAA, SOX, DPA)
Considerations
Shrinking documents, space can be reclaimed with
db.runCommand ( { compact: '<collection>' } )
"sequence" : 0 }
Considerations
Need to compute the size of the array based on
retention period
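One way to size the array, as a rough sketch. It assumes entries arrive at a fixed interval, which the slide does not state; `historySlots` and `appendCapped` are hypothetical helper names.

```javascript
// Number of array slots needed to cover a retention period,
// assuming one entry per fixed interval (assumption).
function historySlots(retentionSeconds, intervalSeconds) {
  return Math.ceil(retentionSeconds / intervalSeconds);
}

// Append an entry and drop the oldest ones beyond the slot limit.
function appendCapped(history, entry, maxSlots) {
  history.push(entry);
  while (history.length > maxSlots) {
    history.shift();
  }
  return history;
}

// One week of hourly entries.
console.log(historySlots(7 * 24 * 60 * 60, 60 * 60)); // 168
```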
TTL Collections
// messages: one doc per user per day
db.inbox.findOne()
{
  _id: 1,
  to: "Joe",
  sequence: ISODate("2013-02-04T00:00:00.392Z"),
  messages: [ ]
}
// Auto expires data after 31536000 seconds = 1 year
db.messages.ensureIndex( { sequence: 1 }, { expireAfterSeconds: 31536000 } )
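The expiry rule behind a TTL index can be checked with a little arithmetic: a document becomes eligible for removal once its indexed date plus `expireAfterSeconds` has passed. The sketch below models only that rule in plain JavaScript. `isExpired` is a hypothetical helper, and actual removal is done by a background task on the server, so it may lag behind the eligibility time.

```javascript
// 365 days * 24 hours * 60 minutes * 60 seconds
const EXPIRE_AFTER_SECONDS = 365 * 24 * 60 * 60; // 31536000

// True once the document's `sequence` date plus the TTL has passed.
function isExpired(doc, now) {
  const expiresAt = doc.sequence.getTime() + EXPIRE_AFTER_SECONDS * 1000;
  return expiresAt <= now.getTime();
}

const doc = { _id: 1, to: "Joe", sequence: new Date("2013-02-04T00:00:00.000Z") };
console.log(isExpired(doc, new Date("2014-02-03T23:59:59.000Z"))); // false
console.log(isExpired(doc, new Date("2014-02-04T00:00:00.000Z"))); // true
```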
#3 Indexed Attributes
Design Goal
Application needs to store a variable number of attributes e.g.
Queries needed
Equality
Range based
attributes
Attributes as a Sub-Document
db.files.insert( { _id: "local.0", attr: { type: "text", size: 64, created: ISODate("2013-03-01T09:59:42.689Z") } } )
db.files.insert( { _id: "local.1", attr: { type: "text", size: 128 } } )
db.files.insert( { _id: "mongod", attr: { type: "binary", size: 256, created: ISODate("2013-04-01T18:13:42.689Z") } } )

// Need to create an index for each item in the sub-document
db.files.ensureIndex( { "attr.type": 1 } )
db.files.find( { "attr.type": "text" } )
// Can perform range queries
db.files.ensureIndex( { "attr.size": 1 } )
db.files.find( { "attr.size": { $gt: 64, $lte: 16384 } } )
Considerations
Each attribute needs an Index
Each time you extend, you add an index
Lots and lots of indexes
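To see how the index count grows with the sub-document design, the sketch below lists the index keys implied by a set of documents, one per distinct attribute. It is plain JavaScript over sample data modelled on the `files` documents; `indexKeysNeeded` is a hypothetical helper, not a MongoDB call.

```javascript
// Collect one "attr.<name>" index key for every attribute seen in any document.
function indexKeysNeeded(docs) {
  const keys = new Set();
  docs.forEach(function (doc) {
    Object.keys(doc.attr || {}).forEach(function (name) {
      keys.add("attr." + name);
    });
  });
  return Array.from(keys).sort();
}

const files = [
  { _id: "local.0", attr: { type: "text", size: 64, created: new Date("2013-03-01T09:59:42.689Z") } },
  { _id: "local.1", attr: { type: "text", size: 128 } },
  { _id: "mongod", attr: { type: "binary", size: 256, created: new Date("2013-04-01T18:13:42.689Z") } }
];

console.log(indexKeysNeeded(files)); // [ 'attr.created', 'attr.size', 'attr.type' ]
```

Adding a document with a new attribute adds another key to the list, which is the growth the slide warns about.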
Queries
// Range queries
db.files.find( { attr: { $gt: { size: 64 }, $lte: { size: 16384 } } } )
db.files.find( { attr: { $gte: { created: ISODate("2013-02-01T00:00:01.689Z") } } } )

// Multiple conditions: only the first predicate on the query can use the index,
// so ensure that this is the most selective.
// Index Intersection will allow multiple indexes, see SERVER-3071
db.files.find( { $and: [
  { attr: { $gte: { created: ISODate("2013-02-01T ") } } },
  { attr: { $gt: { size: 128 }, $lte: { size: 16384 } } }
] } )

// Each $or can use an index
db.files.find( { $or: [
  { attr: { $gte: { created: ISODate("2013-02-01T ") } } },
  { attr: { $gt: { size: 128 }, $lte: { size: 16384 } } }
] } )
#4 Multiple Identities
Design Goal
Ability to look up by a number of different identities e.g.
[Diagram: Shard 1, Shard 2, Shard 3]
Considerations
Lookup by shard key is routed to 1 shard
Lookup by other identifier is scatter gathered
// Shard collection by identifier
sh.shardCollection( "mongodbdays.identities", { identifier: 1 } )

// Create unique index
db.users.ensureIndex( { _id: 1 }, { unique: true } )

// Create a document that holds all the other user attributes
db.users.save( { _id: "1200-42", ... } )

// Shard collection by _id
sh.shardCollection( "mongodbdays.users", { _id: 1 } )
[Diagram: Shard 1, Shard 2, Shard 3]
Solution
Lookup to Identities is a routed query
Lookup to Users is a routed query
Unique indexes available
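The two-step lookup can be sketched in plain JavaScript, with two Maps standing in for the `identities` and `users` collections. This is an illustration under assumptions: the `user` field that links an identity to a user `_id`, the helper names, and the sample identifier values are all hypothetical.

```javascript
// Stand-ins for the two collections (in-memory only).
const identities = new Map(); // identifier -> { identifier, user: <users._id> }
const users = new Map();      // _id -> user document

function addUser(id, attributes, identifiers) {
  // Model the unique indexes: reject duplicates before writing anything.
  if (users.has(id)) throw new Error("duplicate user _id: " + id);
  identifiers.forEach(function (identifier) {
    if (identities.has(identifier)) throw new Error("duplicate identifier: " + identifier);
  });
  users.set(id, Object.assign({ _id: id }, attributes));
  identifiers.forEach(function (identifier) {
    identities.set(identifier, { identifier: identifier, user: id });
  });
}

function findUserByIdentifier(identifier) {
  // Step 1: keyed lookup in identities (routed by its shard key).
  const identity = identities.get(identifier);
  if (!identity) return null;
  // Step 2: keyed lookup in users by _id (routed by its shard key).
  return users.get(identity.user) || null;
}

addUser("1200-42", { name: "Joe" }, ["joe@example.com", "joe-handle"]);
console.log(findUserByIdentifier("joe-handle")._id); // 1200-42
```

Each step is a lookup by key, so neither one needs to be scatter gathered.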
Conclusion
Summary
Multiple ways to model a domain problem
Understand the key use cases of your app
Balance between ease of query vs. ease of write
Random IO should be avoided
#MongoDBdays
Thank You
Alvin Richards
@jonnyeight
Technical Director, 10gen
alvin@10gen.com
alvinonmongodb.com