November 20 1
BY THE END OF THIS LESSON, YOU SHOULD KNOW:
How to model a document NoSQL database.
NOSQL DATA MODELLING VS. RELATIONAL MODELLING
NoSQL data modelling often starts from application-specific queries, in contrast to
relational modelling:
Relational modelling is typically driven by the structure of the available data. The main design theme
is “What answers do I have?”
NoSQL data modelling is typically driven by application-specific access patterns, i.e. the types of
queries to be supported. The main design theme is “What questions do I have?”
MODELLING TECHNIQUES
Referencing documents.
Embedding documents.
Denormalisation.
Heterogeneous collection.
REFERENCING DOCUMENTS
References store the relationships between data by including links or references from
one document to another.
You can reference another document using its document key. This is similar to
normalisation in a relational database.
Referencing enables document databases to cache, store and retrieve the documents
independently.
Provides better write speed/performance.
Reading may require more round trips to the server.
EXAMPLE
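A minimal sketch of referencing, using plain Python dictionaries as stand-ins for two collections. The collection names, keys and field values are illustrative assumptions, not data from the lesson. It shows that a write touches one document, while a read that needs both documents takes two lookups.

```python
# Hypothetical in-memory "collections": each maps a document key to a document.
publishers = {
    "pub1": {"_id": "pub1", "name": "Example Press", "city": "Sampletown"},
}
books = {
    "book1": {"_id": "book1", "title": "First Title", "publisher_id": "pub1"},
    "book2": {"_id": "book2", "title": "Second Title", "publisher_id": "pub1"},
}


def book_with_publisher(book_id):
    """Resolve the reference by hand: one lookup per document."""
    book = books[book_id]                         # first read
    publisher = publishers[book["publisher_id"]]  # second read, follows the key
    return {"title": book["title"], "publisher": publisher["name"]}


# Updating the publisher is a single write to a single document,
# no matter how many books reference it.
publishers["pub1"]["city"] = "Newtown"

result = book_with_publisher("book2")
print(result)
```

Against a real document database, each of the two lookups would typically be a separate round trip to the server.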
REFERENCING DOCUMENTS CAN BE BENEFICIAL…
Key document
A key document is one that is referenced by many other documents. Referencing a key document is more efficient
and less error-prone than embedding copies of it, because its data is stored and updated in one place.
EMBEDDING DOCUMENTS
Embedded documents capture relationships between data by storing related data in
a single document structure. You can embed a document in another document by
simply defining an attribute to be an embedded document.
These denormalized data models allow applications to retrieve and manipulate
related data in a single database operation. Embedding enables document
databases to cache, store and retrieve the complex document with embedded
documents as a single piece.
Eliminates the need to retrieve two separate documents and join them.
Provides better read speed/performance.
EXAMPLE
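A minimal sketch of embedding, again with a plain Python dictionary standing in for a collection. The order document, its field names and its values are assumptions made up for illustration. The line items live inside the order document, so one read returns everything needed to work with the order.

```python
import copy

# Hypothetical "orders" collection; the line items are embedded documents.
orders = {
    "order1": {
        "_id": "order1",
        "customer": "Sample Customer",
        "items": [
            {"sku": "A-1", "quantity": 2, "unit_price": 5.0},
            {"sku": "B-7", "quantity": 1, "unit_price": 12.5},
        ],
    }
}


def find_order(order_id):
    """A single read returns the order together with its embedded items."""
    return copy.deepcopy(orders[order_id])


def order_total(order):
    """Work on the embedded items without any further lookups."""
    return sum(item["quantity"] * item["unit_price"] for item in order["items"])


order = find_order("order1")
total = order_total(order)
print(total)
```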
EMBEDDING CAN BE ADVANTAGEOUS WHEN…
Two data items are often queried together.
Embedding one document in another improves query performance because all the data is
stored in a single document. In other words, embedding supports locality.
1:1 relationship.
There is no redundancy between the two documents, so embedding one in the other is
a natural and efficient way to implement their relationship. The structure is easy to query and also
guarantees consistency when data in the embedded document is updated or removed.
1-TO-1 RELATIONSHIPS: REFERENCING
If the address data is frequently retrieved with
the name information, then with referencing,
your application needs to issue multiple queries
to resolve the reference. The better data
model would be to embed the address data in
the patron data, as in the following document:
1-TO-1 RELATIONSHIPS: EMBEDDED
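The original slide figure is not available, so the following is only a sketch of what a referenced and an embedded patron document could look like. All names and values are placeholders, and `embed_address` is a hypothetical helper written for this illustration.

```python
# Referenced form: two documents, linked through patron_id.
patron = {"_id": "joe", "name": "Joe Bookreader"}
address = {"patron_id": "joe", "street": "123 Fake Street", "city": "Faketown"}


def embed_address(patron_doc, address_doc):
    """Return a new patron document with the address as an embedded sub-document.

    The patron_id link is dropped because the nesting now expresses the relationship.
    """
    embedded = {k: v for k, v in address_doc.items() if k != "patron_id"}
    return {**patron_doc, "address": embedded}


patron_embedded = embed_address(patron, address)
print(patron_embedded)
```

With the embedded form, name and address arrive in one read and no reference has to be resolved.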
MANY-TO-MANY RELATIONSHIP: EMBEDDED
MANY-TO-MANY RELATIONSHIP: REFERENCING
When using references, the growth of the
relationship determines where to store the
reference.
If the number of books per publisher is small
with limited growth, storing the book
references inside the publisher document may
sometimes be useful.
Otherwise, if the number of books per
publisher is unbounded, this data model
would lead to mutable, growing arrays.
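The two placements of the reference can be sketched as follows. The documents and the helper functions `add_book_a` and `add_book_b` are hypothetical and only illustrate which document grows when a book is added.

```python
# Option A: the publisher holds an array of book keys.
# Reasonable only while the array stays small and bounded.
publisher_a = {"_id": "pub1", "name": "Example Press", "books": ["book1", "book2"]}

# Option B: each book holds the publisher key, so the publisher never grows.
publisher_b = {"_id": "pub1", "name": "Example Press"}
books_b = [
    {"_id": "book1", "title": "First Title", "publisher_id": "pub1"},
    {"_id": "book2", "title": "Second Title", "publisher_id": "pub1"},
]


def add_book_a(publisher, book_id):
    """Option A: every new book makes the publisher document larger."""
    publisher["books"].append(book_id)


def add_book_b(books, book_id, title, publisher_id):
    """Option B: a new book is a new document; the publisher is untouched."""
    books.append({"_id": book_id, "title": title, "publisher_id": publisher_id})


add_book_a(publisher_a, "book3")
add_book_b(books_b, "book3", "Third Title", "pub1")

# Listing a publisher's books in option B is a query on the book side.
titles_b = [b["title"] for b in books_b if b["publisher_id"] == "pub1"]
print(len(publisher_a["books"]), titles_b)
```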
1-TO-MANY RELATIONSHIPS (UNBOUNDED)
MANY-TO-MANY RELATIONSHIPS
Greater volatility
RELATED DATA CHANGES WITH DIFFERING VOLATILITY
TWO DATA ITEMS ARE OFTEN QUERIED TOGETHER
ONE DATA ITEM IS DEPENDENT ON ANOTHER
Dependent on Order
1:1 RELATIONSHIP
SIMILAR VOLATILITY
NORMALISED
Query: two reads are needed.
DENORMALISED
NORMALISATION VS. DENORMALISATION
Normalised:
Requires multiple reads.
Doesn’t align with instances.
Provides faster write speed.
Denormalised:
Requires updates in multiple places.
Provides faster read speed.
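The trade-off can be made concrete with a small sketch that counts reads and writes. The speaker and session documents and all function names are assumptions invented for this illustration; dictionaries stand in for collections.

```python
# Normalised: sessions reference the speaker by key.
speakers = {"s1": {"_id": "s1", "name": "Sample Speaker"}}
sessions_norm = {
    "a": {"_id": "a", "title": "Intro", "speaker_id": "s1"},
    "b": {"_id": "b", "title": "Advanced", "speaker_id": "s1"},
}

# Denormalised: the speaker name is copied into every session.
sessions_denorm = {
    "a": {"_id": "a", "title": "Intro", "speaker_name": "Sample Speaker"},
    "b": {"_id": "b", "title": "Advanced", "speaker_name": "Sample Speaker"},
}


def speaker_name_norm(session_id):
    session = sessions_norm[session_id]             # read 1
    return speakers[session["speaker_id"]]["name"]  # read 2


def speaker_name_denorm(session_id):
    return sessions_denorm[session_id]["speaker_name"]  # single read


def rename_norm(speaker_id, new_name):
    """One write: the name lives in exactly one document."""
    speakers[speaker_id]["name"] = new_name
    return 1


def rename_denorm(old_name, new_name):
    """One write per copy: every session carrying the name must be updated."""
    written = 0
    for session in sessions_denorm.values():
        if session["speaker_name"] == old_name:
            session["speaker_name"] = new_name
            written += 1
    return written


norm_writes = rename_norm("s1", "Renamed Speaker")
denorm_writes = rename_denorm("Sample Speaker", "Renamed Speaker")
print(norm_writes, denorm_writes)
```

If one of the copies were missed during the denormalised update, the data would become inconsistent, which is the usual price of the faster read.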
KEY CONSIDERATIONS WITH DATA MODELING
Read and write operations
When designing a data model for MongoDB, it is important to know your application's access patterns. They help you understand how data will be
created and used. Based on that understanding, you are in a better position to improve your data model by applying the
design patterns that best fit your application. The main questions are:
How will your data grow and change over time?
What is the read/write ratio?
What kinds of queries will your application perform?
Are there any concurrency-related constraints you should look at?
Document growth.
Documents can grow when new fields are added, when new elements are added to their array fields, or through frequent updates. MongoDB has a
document size limit of 16 MB. With the older MMAPv1 storage engine, a document that outgrows its allocated space is moved to a new location. Such moves are generally slow and can
also fragment the file that holds the document's collection.
Atomicity
Operations that create or change data (e.g., insert, update, delete) are atomic at the level of a single document.
If fields have to be modified together, embedding all of them in a single document guarantees that they change atomically.
Multi-document transactions are available only since MongoDB 4.0 (4.2 for sharded clusters) and cost more than single-document writes, so they do not replace a data model that keeps such fields together.
HOMOGENEOUS COLLECTIONS
One collection per data type.
Speaker
Session
Room
However, this would require three different queries over three different collections.
HETEROGENEOUS COLLECTIONS
Multiple types in a single collection.
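A minimal sketch of a heterogeneous collection, assuming each document carries a `type` field and a shared `session_id` key. Both field names and all sample values are illustrative assumptions. One pass over the single collection returns the session together with its speaker and room.

```python
# Hypothetical single collection mixing three document types.
conference = [
    {"type": "session", "session_id": "x1", "title": "Intro"},
    {"type": "speaker", "session_id": "x1", "name": "Sample Speaker"},
    {"type": "room", "session_id": "x1", "name": "Room A"},
    {"type": "session", "session_id": "x2", "title": "Advanced"},
]


def find_by_session(collection, session_id):
    """One query over one collection returns documents of every type."""
    return [doc for doc in collection if doc["session_id"] == session_id]


related = find_by_session(conference, "x1")
types = sorted(doc["type"] for doc in related)
print(types)
```

The `type` field lets the application tell the documents apart after they have been fetched together.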