Humongous
Database Shop
Schemaless!
JSON (BSON) Data Format
{
"name": "Max",
"age": 29,
"address":
{
"city": "Munich"
},
"hobbies": [
{ "name": "Cooking" },
{ "name": "Sports" }
]
}
BSON Data Structure
No Schema!
id: 3 "age": 31 …
Users Collection
Relations
No / Few Relations!
Kind of…
MongoDB Ecosystem
Self-Managed / Enterprise
Atlas (Cloud)
Mobile
CloudManager / OpsManager
Serverless Query API
Serverless Functions
MongoDB Charts
Working with MongoDB
Application: Frontend (UI), Backend (Server)
Drivers: Node.js, Java
MongoDB Shell: Playground, Administration
Queries
Data: MongoDB Server, Storage Engine
A Closer Look
Data
Memory
Your Application
How To Get The Most Out Of The Course
Ask + Answer in the Q&A Section
Answering makes you a Pro!
Document & CRUD Basics
Database
Collection Collection
Document Document
Document Document
Document Document
Created Implicitly
JSON
{
"name": "Max",
"age": 29,
"isInstructor": true,
"hobbies": [
"Sports",
"Cooking"
],
"address": {
"street": "My Street 5",
"city": "Munich"
}
}
Surrounding curly braces delimit the JSON document.
An entry such as "name": "Max" is called a “Field” or “Property” of the JSON document. Multiple Fields are separated by commas.
“Fields” consist of a “Key” (or “name”) and a “Value” part. Key and Value are separated by a colon.
Values can be strings (e.g. “Max”), numbers (e.g. 29), booleans (e.g. true), arrays ([ … ]) and other documents (also called objects; { … })
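The value kinds listed above can be checked in plain JavaScript. This is a minimal sketch that parses the slide's document and labels each value; the helper name jsonValueKind is an assumption made for illustration and is not part of MongoDB or JSON.

```javascript
// Hypothetical helper: label a parsed JSON value with the kinds named on the slide.
function jsonValueKind(value) {
  if (Array.isArray(value)) return "array";
  if (value === null) return "null";
  if (typeof value === "object") return "document";
  return typeof value; // "string", "number" or "boolean"
}

const text = `{
  "name": "Max",
  "age": 29,
  "isInstructor": true,
  "hobbies": ["Sports", "Cooking"],
  "address": { "street": "My Street 5", "city": "Munich" }
}`;

// Keys and values are separated by colons, fields by commas, so JSON.parse accepts it.
const doc = JSON.parse(text);

// Map every key to the kind of its value.
const kinds = Object.fromEntries(
  Object.entries(doc).map(([key, value]) => [key, jsonValueKind(value)])
);
console.log(kinds);
```

Arrays are checked before objects because typeof reports "object" for both.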
JSON vs BSON
JSON
{
"name": "Max",
"age": 29
}
MongoDB Drivers
BSON
Binary Data
Efficient Storage
CRUD Operations & MongoDB
User
<Code>
Read
Create, Read, Update, Delete Create, Read, Update, Delete
MongoDB Server
Collection A Collection B
{ … }, { … } { … }, { … }
CRUD Operations
Create: insertOne(data, options)
Read: find(filter, options)
Update: updateOne(filter, data, options)
Delete: deleteOne(filter, options)
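To get a feel for the four call shapes without a running server, here is a small in-memory stand-in. It is only a sketch under stated assumptions: InMemoryCollection and matches are hypothetical names, filters support equality only, find() returns a plain array instead of a cursor, and the delayed field is invented for the update step.

```javascript
// Equality-only filter matching; real MongoDB filters support many more operators.
function matches(doc, filter) {
  return Object.entries(filter).every(([key, value]) => doc[key] === value);
}

// Hypothetical stand-in for a collection, NOT the MongoDB driver API.
class InMemoryCollection {
  constructor() {
    this.docs = [];
    this.nextId = 1;
  }
  insertOne(data) {
    const doc = { _id: this.nextId++, ...data }; // assign a unique _id
    this.docs.push(doc);
    return { insertedId: doc._id };
  }
  find(filter = {}) {
    return this.docs.filter((doc) => matches(doc, filter));
  }
  updateOne(filter, update) {
    const doc = this.docs.find((d) => matches(d, filter));
    if (!doc) return { matchedCount: 0 };
    Object.assign(doc, update.$set); // patch only the fields given in $set
    return { matchedCount: 1 };
  }
  deleteOne(filter) {
    const index = this.docs.findIndex((d) => matches(d, filter));
    if (index === -1) return { deletedCount: 0 };
    this.docs.splice(index, 1);
    return { deletedCount: 1 };
  }
}

const flights = new InMemoryCollection();
const created = flights.insertOne({
  departureAirport: "MUC",
  arrivalAirport: "SFO",
  aircraft: "Airbus A380",
  distance: 12000,
  intercontinental: true,
});
const found = flights.find({ departureAirport: "MUC" });
// "delayed" is a made-up field used only to show an update.
const updateResult = flights.updateOne({ _id: created.insertedId }, { $set: { delayed: true } });
const delayedAfterUpdate = flights.find({ _id: created.insertedId })[0].delayed;
const deleteResult = flights.deleteOne({ _id: created.insertedId });
const remaining = flights.find().length;
```

Against a real server the same four calls would be issued on db.collectionName in the shell or through a driver.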
Flight Schedule
Create: (Add) a Flight
Read: Display Flight Information
Update: Update Flight Information
Delete: Cancel Flight
{
"departureAirport": "MUC",
"arrivalAirport": "SFO",
"aircraft": "Airbus A380",
"distance": 12000,
"intercontinental": true
},
{ … }
Unique IDs
Up to 100 Levels of Nesting
Max. 16 MB / document
Documents
Arrays
Embedded
Array of
find()
[ { … }, { … }, … ] Cursor Object
In Database
{
"_id": "...",
"name": "Max",
"age": 29,
"job": "instructor"
}
Projection
In Application
{
"name": "Max",
"age": 29
}
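The effect of a projection can be sketched in plain JavaScript. The function below is an assumption made for illustration (it is not how the server implements projection) and covers inclusion only: fields marked with 1 are kept, and _id is kept unless it is excluded with _id: 0.

```javascript
// Hypothetical inclusion-only projection over a single document.
function project(doc, projection) {
  const result = {};
  // _id is returned unless it is explicitly excluded with _id: 0.
  if (projection._id !== 0 && "_id" in doc) result._id = doc._id;
  for (const [field, flag] of Object.entries(projection)) {
    if (field !== "_id" && flag === 1 && field in doc) {
      result[field] = doc[field];
    }
  }
  return result;
}

// The document as stored "In Database" on the slide.
const inDatabase = { _id: "...", name: "Max", age: 29, job: "instructor" };

// What reaches the application when only name and age are requested.
const inApplication = project(inDatabase, { name: 1, age: 1, _id: 0 });
console.log(inApplication);
```

In the shell the corresponding projection document would be passed as the second argument to find().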
update() vs updateOne() vs updateMany()
Use $set to patch values Use $set to patch values Use $set to patch values
Use these!
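The difference between patching with $set and replacing a whole document can be sketched as follows. applySet and replaceDocument are hypothetical helpers that only model the outcome on a plain object; they are not driver functions.

```javascript
// $set semantics: named fields are added or overwritten, everything else stays.
function applySet(doc, fields) {
  return { ...doc, ...fields };
}

// Replacement semantics: only _id survives, all other fields come from the new data.
function replaceDocument(doc, replacement) {
  return { _id: doc._id, ...replacement };
}

const user = { _id: 1, name: "Max", age: 29 };

const patched = applySet(user, { age: 30 });
const replaced = replaceDocument(user, { age: 30 });

console.log(patched);  // name is still present
console.log(replaced); // name is gone
```

This is why a patch through updateOne() or updateMany() with $set is usually the safer choice when only some values should change.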
Example #2: Patient Data
Patient
{
"firstName": "Max",
"lastName": "Schwarzmueller",
"age": 29,
"history": [
{ "disease": "cold", "treatment": … },
{ … }
]
}
Tasks
2 Update patient data of 1 patient with new age, name and history entry
3 Find all patients who are older than 30 (or a value of your choice)
Modelling Relations
Schema Validation
Schema-less Or Not?
But that does not mean that you can’t use some kind of schema!
To Schema Or Not To Schema
Very Different!
{
"title": "Book",
"price": 12.99
}
{
"name": "Bottle",
"available": true
}
Extra Data
{
"title": "Book",
"price": 12.99
}
{
"title": "Bottle",
"price": 5.99,
"available": true
}
Full Equality
{
"title": "Book",
"price": 12.99
}
{
"title": "Bottle",
"price": 5.99
}
Data Types
Text
"Max"
Boolean
true
Number
Which Data does my App need or generate? (User Information, Product Information, Orders, …): Defines the Fields you’ll need (and how they relate)
Which kind of Data or Information do I want to display? (Welcome Page: Product Names; Products Page: …): Defines which queries you’ll need
How often do I write or change my data? (Orders => Often; Product Data => Rarely): Defines whether you should optimize for easy writing
Data Schemas & Data Modelling
Customers
{
userName: 'max',
age: 29,
address: {
street: 'Second Street',
city: 'New York'
}
}
Customers
{
userName: 'max',
favBooks: [{…}, {…}]
}
Lots of data duplication!
Customers
{
userName: 'max',
favBooks: ['id1', 'id2']
}
Books
{
_id: 'id1',
name: 'Lord of the Rings 1'
}
Example #1 – Patient <-> Disease Summary
Patient A Summary A
Patient B Summary B
Patient C Summary C
Example #2 – Person <-> Car
Person A Car 1
Person B Car 2
Person C Car 2
Example #3 – Thread <-> Answers
Answer 2
Example #4 – City <-> Citizens
City A Citizen 1
City B Citizen 1
City C Citizen 1
Citizen 2
Example #5 – Customers <-> Products (Orders)
Customer A Product 1
Customer B Product 3
Customer C Product 4
Product 5
Example #6 – Books <-> Authors
Book A Author 1
Book B Author 3
Book C Author 4
Author 5
Relations - Options
{ Books
_id: 'id1',
name: 'Lord of the Rings 1'
}
db.customers.aggregate([
{ $lookup: {
from: "books",
localField: "favBooks",
foreignField: "_id",
as: "favBookData"
}
}
])
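What this $lookup stage produces can be modelled on plain arrays. The lookup function below is a hypothetical in-memory sketch, not the server's implementation: for every customer it collects the books whose _id appears in favBooks and stores them under the name given in as.

```javascript
// Hypothetical in-memory model of a $lookup with an array-valued localField.
function lookup(localDocs, foreignDocs, { localField, foreignField, as }) {
  return localDocs.map((doc) => {
    const local = doc[localField];
    const keys = Array.isArray(local) ? local : [local];
    // Keep every foreign document whose key is referenced by this local document.
    const matched = foreignDocs.filter((f) => keys.includes(f[foreignField]));
    return { ...doc, [as]: matched };
  });
}

const customers = [{ userName: "max", favBooks: ["id1", "id2"] }];
// Only the book shown on the slide exists here, so 'id2' finds no match.
const books = [{ _id: "id1", name: "Lord of the Rings 1" }];

const merged = lookup(customers, books, {
  localField: "favBooks",
  foreignField: "_id",
  as: "favBookData",
});
console.log(JSON.stringify(merged, null, 2));
```

The references in favBooks stay untouched; the joined documents are added as an extra array field.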
Example Project: A Blog
User
Server-side rendered Views, SPA, Mobile App
Create Post Edit Post Delete Post Fetch Posts Fetch Post Comment
App Server
<Your Code>
PHP .NET Node.js
MongoDB Driver
MongoDB Server
Database
Schema Validation
insertOne()
Collection Rejected!
Validation Schema
Accepted
Schema Validation
validationLevel validationAction
bypassDocumentValidation()
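The accept/reject flow can be sketched in memory. Everything here is an assumption made for illustration: the schema object imitates a tiny subset of a $jsonSchema validator (required plus bsonType for strings, numbers and booleans), the post fields title and text are invented for the blog example, and insertOne is a local function, not the driver method. With validationAction "error" an invalid document is rejected; with "warn" it is stored and the problems are only reported.

```javascript
// Hypothetical schema for blog posts, shaped like a small part of $jsonSchema.
const schema = {
  required: ["title", "text"],
  properties: {
    title: { bsonType: "string" },
    text: { bsonType: "string" },
  },
};

// Only three types are covered in this sketch.
const TYPE_CHECKS = {
  string: (v) => typeof v === "string",
  number: (v) => typeof v === "number",
  bool: (v) => typeof v === "boolean",
};

function violations(doc, rules) {
  const problems = [];
  for (const field of rules.required || []) {
    if (!(field in doc)) problems.push(`missing required field: ${field}`);
  }
  for (const [field, rule] of Object.entries(rules.properties || {})) {
    const check = TYPE_CHECKS[rule.bsonType];
    if (field in doc && check && !check(doc[field])) {
      problems.push(`${field} must be of type ${rule.bsonType}`);
    }
  }
  return problems;
}

function insertOne(collection, doc, { validationAction = "error" } = {}) {
  const problems = violations(doc, schema);
  if (problems.length > 0 && validationAction === "error") {
    return { accepted: false, problems }; // Rejected!
  }
  collection.push(doc); // Accepted (possibly with warnings)
  return { accepted: true, problems };
}

const posts = [];
const ok = insertOne(posts, { title: "First post", text: "Hello" });
const rejected = insertOne(posts, { title: 42 });
const warned = insertOne(posts, { title: 42 }, { validationAction: "warn" });
```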
Data Modelling & Structuring – Things to Consider
How much data will you save (and how big is it)?
Fixing Issues
Diving Deeper Into CREATE
Importing Documents
CREATE Documents
insertMany() db.collectionName.insertMany([
{field: "value"},
{field: "value"}])
insert() db.collectionName.insert()
Client (e.g. Shell)
e.g. insertOne()
MongoDB Server (mongod)
Storage Engine
Memory
Data on Disk
Journal (“Todos”)
{ w: 1, j: undefined }
{ w: 1, j: true }
Error Success
MongoDB CRUD Operations are Atomic on the Document Level (including Embedded
Documents)
Tasks
3 Write data for a new company with both journaling being guaranteed and not being guaranteed
Module Summary
§ You can insert documents with insertOne() (one document at a time) or insertMany() (multiple documents)
§ insert() also exists but it’s not recommended to use it anymore; it also doesn’t return the inserted ids
§ By default, when using insertMany(), inserts are ordered, which means that the inserting process stops if an error occurs
§ You can change this by switching to “unordered inserts”; your inserting process will then continue, even if errors occurred
§ In both cases, no successful inserts (before the error) will be rolled back
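The ordered and unordered behaviour summarised above can be simulated on a plain array. This is a sketch only: insertMany here is a local function that treats a duplicate _id as the error case, and the hobby ids are invented sample data.

```javascript
// Hypothetical in-memory model of insertMany() with the "ordered" option.
function insertMany(collection, docs, { ordered = true } = {}) {
  const insertedIds = [];
  const errors = [];
  for (const doc of docs) {
    if (collection.some((existing) => existing._id === doc._id)) {
      errors.push(`duplicate _id: ${doc._id}`);
      if (ordered) break; // ordered: stop at the first error
      continue;           // unordered: skip the failing document and go on
    }
    collection.push(doc); // earlier successful inserts are never rolled back
    insertedIds.push(doc._id);
  }
  return { insertedIds, errors };
}

const existing = () => [{ _id: "sports" }, { _id: "cooking" }, { _id: "cars" }];
const newDocs = [{ _id: "yoga" }, { _id: "cooking" }, { _id: "hiking" }];

const orderedTarget = existing();
const orderedResult = insertMany(orderedTarget, newDocs);

const unorderedTarget = existing();
const unorderedResult = insertMany(unorderedTarget, newDocs, { ordered: false });

console.log(orderedResult.insertedIds);   // stops before "hiking"
console.log(unorderedResult.insertedIds); // continues past the duplicate
```

In both runs "yoga" stays in the collection even though a later document failed.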
WriteConcern
Field Value
Read Update
Projection Operator $: Modify data presentation
Update Operator $inc: Modify + add additional data
Query Selectors & Projection Operators
Comparison Evaluation $
1 Import the attached data into a new database (e.g. boxOffice) and collection (e.g. movieStarts)
2 Search all movies that have a rating higher than 9.2 and a runtime lower than 100 minutes
1 Import the attached data file into a new collection (e.g. exmoviestarts) in the boxOffice database
4 Find all movies which have ratings greater than 8 but lower than 10
Understanding Cursors
find()
Client (Cursor)
MongoDB Server / Database
Request Batch #1
Request Batch #2
…
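The batch-by-batch idea behind a cursor can be sketched with a JavaScript generator. This is an assumption-laden model, not the driver's cursor: the cursor function, the batch size of 2 and the numbered sample documents are all made up for illustration.

```javascript
// Hypothetical cursor: hands out documents in batches instead of all at once.
function* cursor(docs, batchSize) {
  for (let start = 0; start < docs.length; start += batchSize) {
    // Each yield stands for one "Request Batch #n" round trip.
    yield docs.slice(start, start + batchSize);
  }
}

const allDocs = [{ n: 1 }, { n: 2 }, { n: 3 }, { n: 4 }, { n: 5 }];

const batchSizes = [];
for (const batch of cursor(allDocs, 2)) {
  batchSizes.push(batch.length);
}
console.log(batchSizes);
```

Because the generator is lazy, a client that stops after the first batch never touches the remaining documents.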
Tasks
2 Filter for any data of your choice (e.g. all data) and make sure to only include title + visitors in your result data.
3 Search for all movies that have an entry of 10 in their ratings array and return just that array entry (inside of the array) in the result data
§ You can read documents with find() and findOne()
§ find() returns a cursor which allows you to fetch data step-by-step
§ Both find() and findOne() take a filter (optional) to narrow down the set of documents they return
§ Filters can use a variety of query selectors/ operators to control which documents are retrieved
§ find() returns a cursor to allow you to efficiently retrieve data step by step (instead of fetching all the documents in one step)
§ You can use a cursor to move through the documents
§ sort(), skip() and limit() can be used to control the order, portion and quantity of the retrieved results
Projection
§ Projection allows you to control which fields are returned in your result set
§ You can include fields (field: 1) and exclude them (field: 0)
§ For arrays, special projection operators help you return the right field data
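How sort(), skip() and limit() combine can be sketched on a plain array. readPage is a hypothetical helper and the movie titles and ratings are invented sample data; the sketch sorts first, then skips, then limits.

```javascript
// Hypothetical helper: order, portion and quantity of the results.
function readPage(docs, { sortField, direction = 1, skip = 0, limit = Infinity }) {
  // direction follows the 1 / -1 convention: 1 = ascending, -1 = descending.
  const sorted = [...docs].sort((a, b) => direction * (a[sortField] - b[sortField]));
  return sorted.slice(skip, skip + limit);
}

const movies = [
  { title: "A", rating: 7.5 },
  { title: "B", rating: 9.1 },
  { title: "C", rating: 8.2 },
  { title: "D", rating: 6.0 },
];

// Best-rated first, skip the top entry, return the next two.
const page = readPage(movies, { sortField: "rating", direction: -1, skip: 1, limit: 2 });
const titles = page.map((movie) => movie.title);
console.log(titles);
```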
Understanding Document UPDATEs
Updating Fields
Updating Arrays
Operators
Read Update
Bitwise
How Operators Impact our Data
Projection Operator $elemMatch: Modify data presentation
Update Operator $rename: Modify + add additional data
Update Operators
1 Create a new collection (“sports”) and upsert two new documents into it (with these fields: “title”, “requiresTeam”)
2 Update all documents which do require a team by adding a new field with the minimum amount of players required
Replacing Documents
db.products.find({seller: "Max"})
No Index (COLLSCAN)
Products: {…}, {…}, {…}, …
Scan ALL documents, then filter
With Index (IXSCAN)
Seller Index (Ordered!): "Anna", "Chris", "Manu", "Manu", "Max", "Max"
Directly “jump” to filtered documents
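The difference between the two strategies can be made visible by counting how many entries each one has to look at. This is a sketch under stated assumptions: a sorted array with binary search stands in for the real index structure, and collScan, buildIndex and ixScan are hypothetical names.

```javascript
const products = [
  { seller: "Max" }, { seller: "Anna" }, { seller: "Manu" },
  { seller: "Chris" }, { seller: "Max" }, { seller: "Manu" },
];

// COLLSCAN: look at every document, then filter.
function collScan(docs, seller) {
  let examined = 0;
  const results = [];
  for (const doc of docs) {
    examined++;
    if (doc.seller === seller) results.push(doc);
  }
  return { results, examined };
}

// Ordered list of (key, position) pairs; a stand-in for the seller index.
function buildIndex(docs, field) {
  return docs
    .map((doc, position) => ({ key: doc[field], position }))
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// IXSCAN: binary-search the first matching key, then read only matching entries.
function ixScan(docs, index, seller) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].key < seller) lo = mid + 1;
    else hi = mid;
  }
  let examined = 0;
  const results = [];
  for (let i = lo; i < index.length && index[i].key === seller; i++) {
    examined++;
    results.push(docs[index[i].position]);
  }
  return { results, examined };
}

const sellerIndex = buildIndex(products, "seller");
const withoutIndex = collScan(products, "Max");
const withIndex = ixScan(products, sellerIndex, "Max");
console.log(withoutIndex.examined, withIndex.examined);
```

The index has to be kept in order on every insert, which is the write cost that makes too many indexes a bad idea.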
Don’t Use Too Many Indexes!
Products Collection
Products Indexes
Custom Name
Unique
Partial
Sparse
TTL
Query Diagnosis & Query Planning
explain()
Cached
Cache
Cleared after a certain amount of inserts, DB restart etc.
Clearing the Winning Plan from Cache
Stored Forever?
Text Index
Foreground: Faster
Background: Slower
Module Summary
§ Indexes allow you to retrieve data more efficiently (if used correctly) because your queries only have to look at a subset of all documents
§ You can use single-field, compound, multi-key (array) and text indexes
§ Indexes don’t come for free, they will slow down your writes
§ Use explain() to understand how MongoDB will execute your queries
§ This allows you to optimize both your queries and indexes
Index Options
Finding Places
What’s Inside This Module?
2 Pick a point and find the nearest points within a min and max distance
3 Pick an area and see which points (that are stored in your collection) it contains
5 Pick a point and find out which areas in your collection contain that point
Module Summary
Geospatial Indexes
Collection
{ $match }
{ $group }
{ $project }
Output (List of
Documents)
Pipeline Stages
$group: n:1
$project: 1:1
$unwind
§ There are plenty of available stages and operators you can choose from
§ Stages define the different steps your data is funneled through
§ Each stage receives the output of the last stage as input
§ Operators can be used inside of stages to transform, limit or re-calculate data
§ The most important stages are $match, $group, $project, $sort and $unwind; you’ll work with these a lot
§ Whilst there are some common behaviors between find() filters + projection and $match + $project, the aggregation stages generally are more flexible
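The "each stage receives the output of the last stage" idea can be sketched with ordinary functions. This is an in-memory model, not the aggregation framework: match, group, project and runPipeline are hypothetical helpers, the group stage only counts documents per key, and the persons are invented sample data.

```javascript
// Each stage is a function from a list of documents to a list of documents.
const match = (predicate) => (docs) => docs.filter(predicate);

// n:1 stage: many input documents collapse into one output document per key.
const group = (keyOf) => (docs) => {
  const counts = new Map();
  for (const doc of docs) {
    const key = keyOf(doc);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].map(([_id, total]) => ({ _id, total }));
};

// 1:1 stage: every input document is reshaped into exactly one output document.
const project = (shape) => (docs) => docs.map(shape);

// Feed the output of one stage into the next.
const runPipeline = (docs, stages) => stages.reduce((input, stage) => stage(input), docs);

const persons = [
  { name: "Anna", gender: "female", city: "Munich" },
  { name: "Max", gender: "male", city: "Munich" },
  { name: "Julie", gender: "female", city: "Berlin" },
  { name: "Maria", gender: "female", city: "Munich" },
];

const result = runPipeline(persons, [
  match((p) => p.gender === "female"),
  group((p) => p.city),
  project((g) => ({ city: g._id, totalPersons: g.total })),
]);
console.log(result);
```

Reordering the stages changes the result, which is the main difference to a single find() with filter and projection.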
Working with Numeric Data
Integers (int32): -2,147,483,648 to 2,147,483,647
Longs (int64): -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
Doubles (64bit): Decimal values are approximated
“High Precision Doubles” (128bit): Decimal values are stored with high precision (34 decimal digits)
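The limits above can be reproduced in plain JavaScript, whose regular numbers are 64bit doubles. The sketch shows that decimal values are only approximated, where the int32 range ends, and that the int64 maximum does not fit into a double. The constant names are chosen for illustration; BigInt stands in for a 64bit integer type here.

```javascript
// Doubles approximate decimal values.
const sum = 0.1 + 0.2;
console.log(sum === 0.3); // false

// int32 range: bitwise operators in JavaScript work on 32bit integers,
// so going one past the maximum wraps around to the minimum.
const INT32_MAX = 2 ** 31 - 1;        // 2,147,483,647
const wrapped = (INT32_MAX + 1) | 0;  // -2,147,483,648

// int64 maximum, written as a BigInt because a double cannot hold it exactly.
const INT64_MAX = 2n ** 63n - 1n;     // 9,223,372,036,854,775,807
const roundTripped = BigInt(Number(INT64_MAX)); // precision is lost on the way

console.log(INT32_MAX, wrapped);
console.log(INT64_MAX === roundTripped); // false
```

This is why the shell offers dedicated number types for exact integers and high precision decimals instead of relying on plain doubles.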
Authentication & Authorization
Transport Encryption
Encryption at Rest
Authentication Authorization
Resources: Shop => Products
Actions: insert()
Login with username + password
Logged in but no rights to do anything!
MongoDB Server (Auth enabled)
createUser() updateUser()
“maxschwarzmueller”
Roles
Privileges
read, readWrite
dbAdmin, userAdmin, dbOwner
readAnyDatabase, readWriteAnyDatabase, userAdminAnyDatabase, dbAdminAnyDatabase
App Encrypted!
MongoDB Driver
Encryption at Rest
Storage Encrypted!
{
email: "test@test.com",
password: "ad50dsjaflf10ur239" Encrypted/ Hashed
}
{
email: "test@test.com",
password: "…"
}
Module Summary
§ MongoDB uses a Role Based Access Control approach
§ You create users on databases and you then log in with your credentials (against those databases)
§ Users have no rights by default, you need to add roles to allow certain operations
§ Permissions that are granted by roles (“Privileges”) are only granted for the database the user was added to unless you explicitly grant access to other databases
§ You can use “AnyDatabase” roles for cross-database access
§ You can encrypt data during transportation and at rest
§ During transportation, you use TLS/ SSL to encrypt data
§ For production, you should use SSL certificates issued by a certificate authority (NOT self-signed certificates)
§ For encryption at rest, you can encrypt both the files that hold your data (made simple with “MongoDB Enterprise”) and the values inside your documents
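The role rules summarised above can be modelled with a small permission check. This is a sketch, not MongoDB's authorization code: isAllowed and ROLE_ACTIONS are hypothetical, the action lists are a simplified subset of what the built-in roles grant, and the database names shop and blog are examples.

```javascript
// Simplified subset of actions per built-in role (assumption for illustration).
const ROLE_ACTIONS = {
  read: ["find"],
  readWrite: ["find", "insert", "update", "remove"],
  readAnyDatabase: ["find"],
  readWriteAnyDatabase: ["find", "insert", "update", "remove"],
};

// A privilege applies if a role grants the action on the requested database.
function isAllowed(user, action, db) {
  return user.roles.some(({ role, db: roleDb }) => {
    const actions = ROLE_ACTIONS[role] || [];
    const coversDb = role.endsWith("AnyDatabase") || roleDb === db;
    return coversDb && actions.includes(action);
  });
}

// Users shaped like the user/roles part of a createUser() document.
const noRights = { user: "newUser", roles: [] };
const shopUser = { user: "maxschwarzmueller", roles: [{ role: "readWrite", db: "shop" }] };
const reporter = { user: "reporter", roles: [{ role: "readAnyDatabase", db: "admin" }] };

console.log(isAllowed(noRights, "find", "shop"));   // logged in but no rights
console.log(isAllowed(shopUser, "insert", "shop")); // granted on its own database
console.log(isAllowed(shopUser, "insert", "blog")); // not granted elsewhere
console.log(isAllowed(reporter, "find", "blog"));   // cross-database read
```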
Performance, Fault Tolerance & Deployment
Capped Collections
Replica Sets
Sharding
Indexes Sharding
Write
MongoDB Server
Write
Asynchronous Replication
Read
MongoDB Server
Offline Read
Read
MongoDB Server
Read Read
MongoDB Server
Client
mongos (Router)
Documents
Queries & Sharding
find()
mongos
mongod mongod
Manage Replica Sets
Encryption (Transportation & Rest)
Regular Backups
Update Software
MongoDB Atlas is a Managed Solution
localhost Atlas
mongod mongod
Manage Replica Sets
Encryption (Transportation & Rest)
Regular Backups
Update Software
Module Summary
§ Consider Capped Collections for cases where you want to clear old data automatically
§ Performance is all about having efficient queries/ operations, fitting data formats and a best-practice MongoDB server config
§ Replica sets provide fault tolerance (with automatic recovery) and improved read performance
§ Sharding allows you to scale your MongoDB server horizontally
§ Deployment is a complex matter since it involves many tasks; some of them are not even directly related to MongoDB
§ Unless you are an experienced admin (or you have one), you should consider a managed solution like MongoDB Atlas
§ Atlas is a managed service where you can configure a MongoDB environment and pay on a per-usage basis
Transactions
Fail Together
Transactions
CRUD Operations
Splitting Work between Drivers & Shell
Shell Driver
Create Indexes
MongoDB Stitch
What is Stitch?
Using Stitch
What is Stitch?
Functions
App User
Authentication
Stitch Authentication vs MongoDB Authentication
What Next?
Play Around!
Resources