Schema Chalk Talk

How do you model one-to-many, many-to-many, polymorphism, trees and queues in MongoDB? This chalk talk will cover these basic patterns and more.

Published by: Alvin John Richards on May 25, 2011
Copyright: Attribution Non-commercial

Schema Chalk Talk
Alvin Richards
alvin@10gen.com

Wednesday, May 25, 2011

Topics
Common patterns
• Single table inheritance
• One-to-Many & Many-to-Many
• Trees
• Queues

So why model data?

http://www.flickr.com/photos/42304632@N00/493639870/

A brief history of normalization
• 1970 E.F. Codd introduces 1st Normal Form (1NF)
• 1971 E.F. Codd introduces 2nd and 3rd Normal Form (2NF, 3NF)
• 1974 Codd & Boyce define Boyce/Codd Normal Form (BCNF)
• 2002 Date, Darwen, Lorentzos define 6th Normal Form (6NF)

Goals:
• Avoid anomalies when inserting, updating or deleting
• Minimize redesign when extending the schema
• Make the model informative to users
• Avoid bias towards a particular style of query

* source: Wikipedia

Relational databases make normalized data look like this

Document databases make normalized data look like this

Terminology
RDBMS            MongoDB
Table            Collection
Row(s)           JSON Document
Index            Index
Join             Embedding & Linking
Partition        Shard
Partition Key    Shard Key

DB Considerations
How can we manipulate this data?
• Dynamic Queries
• Secondary Indexes
• Atomic Updates
• Map Reduce

Access Patterns?
• Read / Write Ratio
• Types of updates
• Types of queries
• Data life-cycle

Considerations
• No Joins
• Document writes are atomic

Inheritance

Single Table Inheritance - RDBMS
shapes table

id  type    area  radius  d  length  width
1   circle  3.14  1
2   square  4             2
3   rect    10               5       2

Single Table Inheritance
> db.shapes.find()
{ _id: "1", type: "circle", area: 3.14, radius: 1 }
{ _id: "2", type: "square", area: 4, d: 2 }
{ _id: "3", type: "rect", area: 10, length: 5, width: 2 }

// find shapes where radius > 0
> db.shapes.find({radius: {$gt: 0}})

// create index
> db.shapes.ensureIndex({radius: 1})

Single Table Inheritance
Considerations
• Simple to query across sub-types
• Indexes on specialized values will be small
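
A minimal in-memory sketch of the "simple to query across sub-types" point, assuming a plain JavaScript array stands in for the shapes collection. The helper names are hypothetical and are not MongoDB API. It mirrors the `{radius: {$gt: 0}}` query: a document without the field simply does not match, so no per-type branching is needed.

```javascript
// In-memory stand-in for the shapes collection shown on the slides.
const shapes = [
  { _id: "1", type: "circle", area: 3.14, radius: 1 },
  { _id: "2", type: "square", area: 4, d: 2 },
  { _id: "3", type: "rect", area: 10, length: 5, width: 2 },
];

// Mirrors db.shapes.find({radius: {$gt: 0}}): documents that lack the
// radius field are skipped rather than raising an error.
function findByRadiusGreaterThan(docs, min) {
  return docs.filter((doc) => typeof doc.radius === "number" && doc.radius > min);
}

// A field shared by every sub-type can be queried the same way.
function findByAreaAtLeast(docs, min) {
  return docs.filter((doc) => doc.area >= min);
}

const withRadius = findByRadiusGreaterThan(shapes, 0);
const largeShapes = findByAreaAtLeast(shapes, 4);
console.log(withRadius.map((doc) => doc.type));
console.log(largeShapes.map((doc) => doc.type));
```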

One to Many
One to Many relationships can specify
• degree of association between objects
• containment
• life-cycle

One to Many
- Embedded Array / Array Keys
  - slice operator to return subset of array
  - some queries harder, e.g. find latest comments across all documents

blogs: {
    author: "Hergé",
    date: "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
    comments: [
        { author: "Kyle",
          date: "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)",
          text: "great book" }
    ]
}
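
A rough sketch of both bullets, assuming in-memory arrays in place of the blogs collection. The second post, the extra comments and the function names are hypothetical. In the shell, a projection such as `{comments: {$slice: -1}}` returns the last embedded comment of each post. Collecting the latest comments across all posts needs every array unwound and merged first, which is what makes that query harder.

```javascript
// In-memory stand-in for the blogs collection (data beyond the first
// comment is made up for illustration).
const blogs = [
  { author: "Hergé", comments: [
      { author: "Kyle", date: new Date("2010-07-24T20:51:03-07:00"), text: "great book" },
      { author: "James", date: new Date("2010-07-25T09:00:00-07:00"), text: "agreed" },
  ]},
  { author: "Alvin", comments: [
      { author: "Jim", date: new Date("2010-07-26T12:00:00-07:00"), text: "nice talk" },
  ]},
];

// Same effect as a {comments: {$slice: -n}} projection on one post:
// keep only the last n embedded comments.
function lastComments(post, n) {
  return post.comments.slice(-n);
}

// Latest comments across all posts: flatten every embedded array,
// then sort newest first and keep the top n.
function latestAcrossPosts(posts, n) {
  return posts
    .flatMap((post) => post.comments.map((c) => ({ post: post.author, ...c })))
    .sort((a, b) => b.date - a.date)
    .slice(0, n);
}

console.log(lastComments(blogs[0], 1).map((c) => c.author));
console.log(latestAcrossPosts(blogs, 2).map((c) => c.author));
```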

One to Many
- Embedded tree
  - Single document
  - Natural
  - Hard to query

blogs: {
    author: "Hergé",
    date: "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
    comments: [
        { author: "Kyle",
          date: "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)",
          text: "great book",
          replies: [ { author: "James", ... } ] }
    ]
}

One to Many
- Normalized (2 collections)
  - most flexible
  - more queries

blogs: {
    author: "Hergé",
    date: "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
    comments: [
        { comment: ObjectId("1") }
    ]
}

comments: {
    _id: ObjectId("1"),
    author: "James",
    date: "Sat Jul 24 2010 20:51:03 ..."
}

One to Many - patterns
- Embedded Array / Array Keys
- Embedded tree
- Normalized

Many - Many
Example:
- Product can be in many categories
- Category can have many products

Many - Many
products:
    { _id: ObjectId("10"),
      name: "Destination Moon",
      category_ids: [ ObjectId("20"), ObjectId("30") ] }

categories:
    { _id: ObjectId("20"),
      name: "adventure",
      product_ids: [ ObjectId("10"), ObjectId("11"), ObjectId("12") ] }

// All categories for a given product
> db.categories.find({product_ids: ObjectId("10")})

Alternative
products:
    { _id: ObjectId("10"),
      name: "Destination Moon",
      category_ids: [ ObjectId("20"), ObjectId("30") ] }

categories:
    { _id: ObjectId("20"),
      name: "adventure" }

// All products for a given category
> db.products.find({category_ids: ObjectId("20")})

// All categories for a given product
> product = db.products.findOne({_id: some_id})
> db.categories.find({_id: {$in: product.category_ids}})
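
The two lookups can be sketched in memory as follows, assuming plain arrays in place of the collections and strings in place of ObjectId values. The second product, the second category and the function names are hypothetical. Storing the ids on one side only means one direction is a single query and the other direction takes two.

```javascript
// In-memory stand-ins for the two collections; strings replace ObjectIds.
const products = [
  { _id: "10", name: "Destination Moon", category_ids: ["20", "30"] },
  { _id: "11", name: "Explorers on the Moon", category_ids: ["20"] },
];
const categories = [
  { _id: "20", name: "adventure" },
  { _id: "30", name: "space" },
];

// One query: match products whose category_ids array contains the id.
function productsInCategory(categoryId) {
  return products.filter((p) => p.category_ids.includes(categoryId));
}

// Two queries: load the product, then load the categories whose _id
// appears in its category_ids array (the $in step).
function categoriesOfProduct(productId) {
  const product = products.find((p) => p._id === productId);
  if (!product) return [];
  const wanted = new Set(product.category_ids);
  return categories.filter((c) => wanted.has(c._id));
}

console.log(productsInCategory("20").map((p) => p.name));
console.log(categoriesOfProduct("10").map((c) => c.name));
```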

Embedding versus Linking
Embedding
• Simple data structure
• Limited to 16MB
• Larger documents

• How often do you update?
• Will the document grow and grow?

Linking
• More complex data structure
• Unlimited data size
• More, smaller documents

• What are the maintenance needs?

Trees
Full Tree in Document
{ comments: [
    { author: "Kyle", text: "...",
      replies: [
          { author: "James", text: "...",
            replies: [] }
      ] }
  ] }

Pros: Single Document, Performance, Intuitive
Cons: Hard to search, Partial Results, 16MB limit
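
One way to see the "hard to search" con is the sketch below, which assumes the post has already been loaded into the application. The comment texts and the function name are hypothetical. A query on a fixed field path reaches only a known depth, so finding an author at any depth becomes a recursive walk in application code.

```javascript
// Nested comment tree in the shape used on the slide; texts are made up.
const post = {
  comments: [
    { author: "Kyle", text: "first", replies: [
        { author: "James", text: "reply", replies: [
            { author: "Kyle", text: "reply to reply", replies: [] },
        ]},
    ]},
  ],
};

// Depth-first walk that collects every comment by the given author,
// together with the depth at which it was found.
function findByAuthor(comments, author, depth = 0, found = []) {
  for (const c of comments) {
    if (c.author === author) found.push({ text: c.text, depth });
    findByAuthor(c.replies || [], author, depth + 1, found);
  }
  return found;
}

console.log(findByAuthor(post.comments, "Kyle"));
```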
   

Trees
Parent Links
- Each node is stored as a document
- Contains the id of the parent

Child Links
- Each node contains the ids of the children
- Can support graphs (multiple parents / children)
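
A minimal sketch of parent links, assuming a Map keyed by id stands in for the collection. The node ids and function names are hypothetical. Finding direct children is a single match on the parent field, while walking up to the root costs one lookup per level.

```javascript
// Parent links: each node stores the id of its parent (null for the root).
const nodes = new Map([
  ["a", { _id: "a", parent: null }],
  ["b", { _id: "b", parent: "a" }],
  ["c", { _id: "c", parent: "b" }],
  ["e", { _id: "e", parent: "a" }],
]);

// Walk up the tree one lookup at a time; returns the nearest ancestor first.
function ancestorsOf(id) {
  const path = [];
  let node = nodes.get(id);
  while (node && node.parent !== null) {
    path.push(node.parent);
    node = nodes.get(node.parent);
  }
  return path;
}

// Direct children: a single match on the parent field.
function childrenOf(id) {
  return [...nodes.values()].filter((n) => n.parent === id).map((n) => n._id);
}

console.log(ancestorsOf("c"));
console.log(childrenOf("a"));
```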

Array of Ancestors
- Store all Ancestors of a node
    { _id: "a" }
    { _id: "b", ancestors: [ "a" ], parent: "a" }
    { _id: "c", ancestors: [ "a", "b" ], parent: "b" }
    { _id: "d", ancestors: [ "a", "b" ], parent: "b" }
    { _id: "e", ancestors: [ "a" ], parent: "a" }
    { _id: "f", ancestors: [ "a", "e" ], parent: "e" }

// find all descendants of b:
> db.tree2.find({ancestors: 'b'})

// find all direct descendants of b:
> db.tree2.find({parent: 'b'})

// find all ancestors of f:
> ancestors = db.tree2.findOne({_id: 'f'}).ancestors
> db.tree2.find({_id: {$in: ancestors}})
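
The slides show the stored documents but not how the ancestors array is kept up to date. One possible approach, sketched in memory: copy the parent's array and append the parent's id when a node is inserted. The Map, the function names and the empty ancestors array on the root are assumptions made for this illustration.

```javascript
// In-memory stand-in for the tree2 collection, keyed by _id.
// Assumption: the root carries an empty ancestors array.
const tree = new Map([["a", { _id: "a", ancestors: [] }]]);

// A child's ancestors are the parent's ancestors plus the parent itself.
function addNode(id, parentId) {
  const parent = tree.get(parentId);
  if (!parent) throw new Error(`unknown parent: ${parentId}`);
  const node = { _id: id, parent: parentId, ancestors: [...parent.ancestors, parentId] };
  tree.set(id, node);
  return node;
}

// Mirrors find({ancestors: id}): match when the array contains the id.
function descendantsOf(id) {
  return [...tree.values()].filter((n) => n.ancestors.includes(id)).map((n) => n._id);
}

// Rebuild the tree from the slide.
addNode("b", "a");
addNode("c", "b");
addNode("d", "b");
addNode("e", "a");
addNode("f", "e");

console.log(tree.get("f").ancestors);
console.log(descendantsOf("b"));
```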

Trees as Paths
Store hierarchy as a path expression
- Separate each node by a delimiter, e.g. "/"
- Use text search to find parts of a tree

{ comments: [
    { author: "Kyle", text: "initial post", path: "/" },
    { author: "Jim", text: "jim's comment", path: "/jim" },
    { author: "Kyle", text: "Kyle's reply to Jim", path: "/jim/kyle" } ] }

// Find the conversations Jim was part of
> db.posts.find({"comments.path": /^\/jim/i})
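
A small sketch of prefix matching on stored paths, assuming the comments are held in a plain array. The function names are hypothetical. The pattern is anchored at the start and requires a delimiter or the end of the string after the prefix, so that "/jim" does not also match a path such as "/jimmy". The prefix is expected to be a non-root path.

```javascript
// Comments from the slide, each carrying its path.
const comments = [
  { author: "Kyle", text: "initial post", path: "/" },
  { author: "Jim", text: "jim's comment", path: "/jim" },
  { author: "Kyle", text: "Kyle's reply to Jim", path: "/jim/kyle" },
];

// Escape regex metacharacters so the prefix is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// A subtree is every comment whose path equals the prefix or continues
// past it with a "/" delimiter. Matching is case-insensitive, as on the slide.
function subtree(prefix) {
  const re = new RegExp("^" + escapeRegExp(prefix) + "(/|$)", "i");
  return comments.filter((c) => re.test(c.path));
}

console.log(subtree("/jim").map((c) => c.text));
```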

Queue
• Need to maintain order and state
• Ensure that updates to the queue are atomic

    { inprogress: false,
      priority: 1,
      ... }

// find highest priority job and mark as in-progress
job = db.jobs.findAndModify({
    query:  {inprogress: false},
    sort:   {priority: -1},
    update: {$set: {inprogress: true, started: new Date()}},
    new:    true})
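
The claim step can be illustrated in memory as follows, assuming a plain array in place of the jobs collection. The job names and the function name are hypothetical. The function selects and marks a job in one synchronous step, which stands in for the atomicity that findAndModify gives on a single document. It is not a substitute for it across processes.

```javascript
// In-memory stand-in for the jobs collection; job names are made up.
const jobs = [
  { name: "resize", inprogress: false, priority: 1 },
  { name: "email", inprogress: false, priority: 5 },
  { name: "report", inprogress: true, priority: 9 },
];

// Pick the highest-priority job that is not in progress and mark it.
// Returns the updated job (like new: true), or null when none is available.
function claimNextJob(queue, now = new Date()) {
  let best = null;
  for (const job of queue) {
    if (!job.inprogress && (best === null || job.priority > best.priority)) {
      best = job;
    }
  }
  if (best === null) return null;
  best.inprogress = true;
  best.started = now;
  return best;
}

const firstClaim = claimNextJob(jobs);
console.log(firstClaim.name, firstClaim.inprogress);
```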

download at mongodb.org

We're Hiring!
alvin@10gen.com

conferences, appearances, and meetups
http://www.10gen.com/events

Facebook: http://bit.ly/mongo>
Twitter: @mongodb
LinkedIn: http://linkd.in/joinmongo
