2 NoSQL

NoSQL Databases
Why Do We Need NoSQL?

Multi-structured and heterogeneous
data from front-end applications
High-volume data
Need for scalability
Reliable and available database
Taxonomy of NoSQL
• Key-value
• Graph database
• Document-oriented
• Column family
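As a rough mental model only, the four families can be sketched with plain Python structures. All sample data (names, keys, fields) is hypothetical and not taken from any real product.

```python
# Hypothetical sample data; each structure mimics one NoSQL family.

# Key-value: an opaque value looked up by a single key.
key_value = {"session:42": "alice"}

# Document-oriented: a self-describing, possibly nested record.
document = {"_id": 1, "name": "alice", "tags": ["admin", "dev"]}

# Column family: row key -> column family -> columns.
column_family = {"row1": {"profile": {"name": "alice"}, "stats": {"visits": 3}}}

# Graph: nodes plus labelled edges between them.
graph = {"nodes": ["alice", "bob"], "edges": [("alice", "follows", "bob")]}


def followers_of(g, person):
    """Return everyone with a 'follows' edge pointing at person."""
    return [src for (src, rel, dst) in g["edges"]
            if rel == "follows" and dst == person]
```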
NoSQL “Not Only SQL”
(Non-Relational)
New ways of querying and architecting
your dynamic data store…
…beyond RDBMS rows & columns…
…for a different breed of scale problems…
…and more specialized and simplistic
development techniques.
NoSQL Continued
It’s more than rows in tables
It’s free of joins
It’s schema-free
It works on many processors
It uses shared-nothing commodity
computers
It supports linear scalability
It’s innovative
Key drivers for NoSQL
What Do the Giants Use?
Google – Bigtable
Proprietary, available with Google Cloud
Platform
Amazon – DynamoDB
Proprietary NoSQL database
Offered as a part of AWS
It is a fully managed cloud database and
supports both document and key-value
store models.
Case Study: Google Bigtable
Google needed to store results from the web crawlers
that extract HTML pages, images, sounds, videos, and
other media from the internet.
The resulting dataset was so large that it couldn’t fit
into a single relational database, so Google built its
own storage system.
The fundamental goal was to build a system that
would easily scale as the data increased without
forcing the company to purchase expensive hardware.
The solution was neither a full relational database nor
a filesystem, but what Google called a “distributed
storage system” that worked with structured data.
Google Bigtable
It gave Google developers a single
tabular view of the data by creating one
large table that stored all the data they
needed.
In addition, Google created a system that
allowed the hardware to be located in
any data center, anywhere in the world.
This created an environment where
developers didn’t need to worry about
the physical location of the data they
manipulated.
Amazon’s Motivation
Traditional brick-and-mortar retailers operate
in a few locations and only during
business hours.
When not open for business, they run daily
reports and perform backups and software
upgrades.
The Amazon model is different: (1) customers come from all
corners of the world and (2) they shop at all hours of the
day, every day.
Any downtime in the purchasing cycle could
result in the loss of millions of dollars. Amazon’s
systems need to be iron-clad reliable and
scalable without a loss in service.
Amazon’s Dynamo: accept an order
24 hours a day, 7 days a week
Amazon needed to create
a highly reliable web storefront
that supported transactions from around
the world
24 hours a day, 7 days a week, without
interruption.
Traditional RDBMS systems were not
able to support this business need.
The Databases So Far
Flat files – no structure, no standard
RDBMS – relational tables
OLAP / DWH – cubes
NoSQL – collections
NoSQL
Database management system
focused on
Scalability
Performance
High availability
NoSQL Continued
No joins
No complex transactions
Complexity has to be handled by the
application
Less functionality but more
performance (compared with an RDBMS)
Sharding of data
Distributes a single logical database system
across a cluster of machines
Uses range-based partitioning to distribute
documents based on a specific shard key
Automatically balances the data associated
with each shard
Can be turned on and off per collection
(table)
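A minimal sketch of range-based partitioning on a shard key, assuming fixed split points chosen by hand. The class name, the `user_id` shard key and the split values are hypothetical, and the automatic balancing mentioned above is deliberately left out.

```python
import bisect


class RangeShardedCollection:
    """Toy range-based partitioning: each shard owns a contiguous key range."""

    def __init__(self, split_points):
        # split_points [100, 200] gives 3 shards:
        # key < 100, 100 <= key < 200, key >= 200.
        self.split_points = sorted(split_points)
        self.shards = [[] for _ in range(len(self.split_points) + 1)]

    def shard_for(self, key):
        # bisect_right counts the split points <= key,
        # which is the index of the shard owning that key.
        return bisect.bisect_right(self.split_points, key)

    def insert(self, doc, shard_key="user_id"):
        self.shards[self.shard_for(doc[shard_key])].append(doc)

    def find(self, key, shard_key="user_id"):
        # Only the shard that owns the key range is consulted.
        shard = self.shards[self.shard_for(key)]
        return [d for d in shard if d[shard_key] == key]


coll = RangeShardedCollection([100, 200])
for uid in (5, 150, 200, 999):
    coll.insert({"user_id": uid})
```

Because the shard is derived from the key alone, a lookup by shard key touches one shard instead of the whole cluster.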
MongoDB
MongoDB
Document Oriented, NoSQL Database
Open Source
Developed and supported by 10gen,
founded in 2007
General Public Licence (free)
Commercial Licence
Scalable, open source, high
performance, document-oriented
database (10gen)
MongoDB Continued
Schema-less database
Written in C++
Supports APIs (drivers) in many
computer languages
JavaScript, Python, Ruby, Perl, Java,
Scala, C#, C++, Haskell, Erlang
MongoDB
Table – Collection
Row – Document
Documents in a collection may have different fields
Each row in a table must have the same fields
Compare it with Flipkart page visits
for a user
Schema Free
• MongoDB does not need any pre-defined data schema
• Every document in a collection could have different data
{name: "will",
 eyes: "blue",
 birthplace: "NY",
 aliases: ["bill", "la ciacco"],
 loc: [32.7, 63.4],
 boss: "ben"}
{name: "jeff",
 eyes: "blue",
 loc: [40.7, 73.4],
 boss: "ben"}
{name: "brendan",
 aliases: ["eldiablo"]}
{name: "matt",
 pizza: "DiGiorno",
 height: 72,
 loc: [44.6, 71.3]}
{name: "ben",
 hat: "yes"}
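To see what "schema free" means in practice, here is a small in-memory sketch using four of the documents above. The `find` helper is a hypothetical stand-in written for this example, not the MongoDB API.

```python
# Documents with different fields stored side by side in one collection.
people = [
    {"name": "jeff", "eyes": "blue", "loc": [40.7, 73.4], "boss": "ben"},
    {"name": "brendan", "aliases": ["eldiablo"]},
    {"name": "matt", "pizza": "DiGiorno", "height": 72, "loc": [44.6, 71.3]},
    {"name": "ben", "hat": "yes"},
]


def find(collection, **criteria):
    """Return documents whose fields equal every given criterion.

    Documents that lack a queried field simply do not match.
    """
    return [d for d in collection
            if all(k in d and d[k] == v for k, v in criteria.items())]


names_with_boss_ben = [d["name"] for d in find(people, boss="ben")]
```

No document had to declare a `boss` field in advance; the ones without it are skipped by the query rather than rejected at insert time.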
Flipkart Collection
{id: 1,
 page1: "page1",
 page2: "page2",
 page3: "page3"}
{id: 2,
 page1: "page1",
 page2: "page2"}
Flipkart Collection (another way)
{id: 1,
 pages: ["page1", "page2", "page3"]}
{id: 2,
 pages: ["page1", "page2"]}
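The two ways of modelling page visits can be related with a short conversion sketch. The field names `page1`, `page2`, … and `pages` are assumptions chosen for this illustration.

```python
def to_array_form(doc):
    """Collapse page1, page2, ... fields into a single 'pages' list."""
    page_keys = sorted((k for k in doc if k.startswith("page")),
                       key=lambda k: int(k[4:]))  # order by the numeric suffix
    return {"id": doc["id"], "pages": [doc[k] for k in page_keys]}


# One field per visited page: users end up with different field sets.
visits = [
    {"id": 1, "page1": "page1", "page2": "page2", "page3": "page3"},
    {"id": 2, "page1": "page1", "page2": "page2"},
]

# One array field per user: the field set is the same, only the length varies.
as_arrays = [to_array_form(d) for d in visits]
```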
CRUD operations
Create
Read
Update
Delete
Done on the collections
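The four operations can be sketched against a minimal in-memory collection. The `Collection` class and its method names are hypothetical and only illustrate the idea; a real driver exposes its own API.

```python
class Collection:
    """Minimal in-memory stand-in for a document collection."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def create(self, doc):
        # Assign an id and store a copy of the caller's dict.
        doc_id = self._next_id
        self._next_id += 1
        self._docs[doc_id] = dict(doc, _id=doc_id)
        return doc_id

    def read(self, doc_id):
        # Returns None when the id is unknown.
        return self._docs.get(doc_id)

    def update(self, doc_id, changes):
        # Merge new or changed fields; returns False if the id is unknown.
        if doc_id not in self._docs:
            return False
        self._docs[doc_id].update(changes)
        return True

    def delete(self, doc_id):
        # True if something was removed, False otherwise.
        return self._docs.pop(doc_id, None) is not None


users = Collection()
uid = users.create({"name": "will"})
users.update(uid, {"eyes": "blue"})
```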
MongoDB: Use Cases
(Project / Company specific)
Aadhaar
Aadhaar is an excellent example of a real-world use
case of MongoDB.
Aadhaar is the world’s biggest biometrics
database. It contains biometric data of over 1.2
billion residents.
Aadhaar has used MongoDB as one of its
databases to store this huge amount of data;
it was originally procured for running the database
search.
MySQL is used for storing demographic data
and MongoDB is used to store images.
Shutterfly
Internet-based photo sharing and
personal publishing company
Manages a store of more than 6
billion images with a transaction rate
of up to 10,000 operations per
second.
One of the companies that
transitioned from Oracle to MongoDB.
MetLife
MetLife is a leading global provider of
insurance, annuities and employee
benefit programs.
They serve about 90 million customers
and hold leading market positions in the
United States, Japan, Latin America,
Asia, Europe and the Middle East.
MetLife Continued
MetLife uses MongoDB for “The Wall”,
an innovative customer service application
that provides a consolidated view of MetLife
customers, including policy details and
transactions.
The Wall is designed to look and
function like Facebook and has improved
customer satisfaction and call centre
productivity.
eBay
eBay has a number of projects
running on MongoDB for search
suggestions, metadata storage, cloud
management and merchandizing
categorization.
MongoDB use cases
(Application specific)
Source: MongoDB
High Volume Data Feeds
Machine Generated Data
• More machine forms, sensors & data
• Variably structured
Securities Data
• High frequency trading
• Daily closing price
Social Media / General Public
• Multiple data sources
• Each changes its format consistently
• Student scores, ISP logs
High Volume Data Feeds
• Flexible document model can adapt to changes in sensor format
• Asynchronous writes from multiple data sources
• Write to memory with periodic disk flush
• Scale writes over multiple shards
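The "write to memory with periodic disk flush" idea can be sketched as a buffered writer. This is a simplification under stated assumptions: the flush is triggered by a write count rather than a timer, and the `disk` list merely stands in for durable storage. `BufferedWriter` and `flush_every` are hypothetical names.

```python
class BufferedWriter:
    """Accept writes into memory and flush them to 'disk' in batches."""

    def __init__(self, flush_every=3):
        self.flush_every = flush_every
        self.buffer = []   # in memory; lost if the process dies before a flush
        self.disk = []     # stand-in for durable storage

    def write(self, doc):
        # The caller returns as soon as the document is buffered.
        self.buffer.append(doc)
        if len(self.buffer) >= self.flush_every:
            self.flush()

    def flush(self):
        # Move everything buffered so far to durable storage.
        self.disk.extend(self.buffer)
        self.buffer.clear()


w = BufferedWriter(flush_every=3)
for reading in range(4):
    w.write({"sensor": "s1", "value": reading})
```

The trade-off is visible in the state after the loop: three readings are durable and one is still only in memory.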
Operational Intelligence
Ad Targeting
• Large volume of users
• Very strict latency requirements
• Sentiment analysis
Real-time dashboards
• Expose data to millions of customers
• Reports on large volumes of data
• Reports that update in real time
Social Media Monitoring
• Join the conversation
• Catered games
• Customized surveys
Operational Intelligence
• Low latency reads
• Parallelize queries across replicas and shards
• In-database aggregation
• Flexible schema adapts to changing input data
• Can use same cluster to collect, store and report on data
[Diagram: API and dashboards reading from the cluster]
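In-database aggregation means the grouping and summarising happen where the data lives instead of in the dashboard. A minimal sketch of the group-and-average step in plain Python; the event fields `page` and `ms` are hypothetical sample data.

```python
from collections import defaultdict

# Hypothetical page-timing events collected by the cluster.
events = [
    {"page": "/home", "ms": 120},
    {"page": "/home", "ms": 80},
    {"page": "/cart", "ms": 200},
]


def average_by(docs, group_field, value_field):
    """Group documents by one field and average another."""
    totals = defaultdict(lambda: [0, 0])  # group -> [sum, count]
    for d in docs:
        acc = totals[d[group_field]]
        acc[0] += d[value_field]
        acc[1] += 1
    return {g: s / n for g, (s, n) in totals.items()}


report = average_by(events, "page", "ms")
```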
Behavioural Profiles
• Rich profiles collecting multiple complex actions
• Scale out to support high throughput of activities tracked
• Dynamic schemas make it easy to
Steps: 1 See Ad, 2 See Ad, 3 Click, 4 Convert
{ cookie_id: "1234512413243",
  advertiser: {
    apple: {
      actions: [
        { impression: 'ad1', time: 123 },
        { impression: 'ad2', time: 232 },
        { click: 'ad2', time: 235 },
        { add_to_cart: 'laptop',
          sku: 'asdf23f',
          time: 254 },
        { purchase: 'laptop', time: 354 }
      ] …
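A profile like the one above grows by appending actions of different shapes to one document. A minimal sketch, assuming the nesting shown on the slide; the helper names `record_action` and `converted` are hypothetical.

```python
def record_action(profile, advertiser, action):
    """Append one action to the advertiser's list, creating it on first use."""
    adv = profile.setdefault("advertiser", {}).setdefault(advertiser, {})
    adv.setdefault("actions", []).append(action)


def converted(profile, advertiser):
    """True if any recorded action for the advertiser is a purchase."""
    actions = (profile.get("advertiser", {})
                      .get(advertiser, {})
                      .get("actions", []))
    return any("purchase" in a for a in actions)


profile = {"cookie_id": "1234512413243"}
record_action(profile, "apple", {"impression": "ad1", "time": 123})
record_action(profile, "apple", {"click": "ad2", "time": 235})
before = converted(profile, "apple")
record_action(profile, "apple", {"purchase": "laptop", "time": 354})
```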
Metadata
Product Catalogue
• Diverse product portfolio
• Complex querying and filtering
• Multi-faceted product attributes
Data analysis
• Data mining
• Call records
• Insurance claims
Biometric
• Retina scans
• Fingerprints
Metadata
• Indexing and rich query API for easy searching and sorting
• Indexing techniques that fit your data modeling
• Flexible data model for similar but different objects
db.archives.find({ "country": "Egypt" });
db.archives.find({ key: "type", value: "Artifact" });
{ type: "Artifact",
  medium: "Ceramic",
  country: "Egypt",
  year: "3000 BC"
}
{ ISBN: "00e8da9b",
  type: "Book",
  country: "Egypt",
  title: "Ancient Egypt"
}
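How an index helps a query such as the lookup by country can be sketched with a dictionary that maps field values to document positions. This is a toy illustration, not how MongoDB implements its indexes; `build_index` is a hypothetical helper.

```python
from collections import defaultdict

# The two "similar but different" documents from the slide.
archives = [
    {"type": "Artifact", "medium": "Ceramic", "country": "Egypt", "year": "3000 BC"},
    {"ISBN": "00e8da9b", "type": "Book", "country": "Egypt", "title": "Ancient Egypt"},
]


def build_index(docs, field):
    """Map each value of field to the positions of documents holding it."""
    index = defaultdict(list)
    for pos, d in enumerate(docs):
        if field in d:  # documents without the field are not indexed
            index[d[field]].append(pos)
    return index


by_country = build_index(archives, "country")
# The lookup goes through the index instead of scanning every document.
egypt = [archives[i] for i in by_country.get("Egypt", [])]
```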
Content Management
News Site
• Comments and user generated content
• Personalization of content and layout
Multi-device rendering
• Generate layout on the fly
• No need to cache static pages
Sharing
• Store large objects
• Simpler modeling of metadata
Content Management
• Geo spatial indexing for location-based searches
• Flexible data model for similar but different objects
• GridFS for large object storage
• Horizontal scalability for large data sets
{ camera: "Nikon d4",
  location: [ -122.418333, 37.775 ]
}
{ camera: "Canon 5d mkII",
  people: [ "Jim", "Carol" ],
  taken_on: ISODate("2012-03-07T18:32:35.002Z")
}
{ origin: "facebook.com/photos/xwdf23fsdf",
  license: "Creative Commons CC0",
  size: {
    dimensions: [ 124, 52 ],
    units: "pixels"
  }
}
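A location-based search can be sketched as a distance filter. This version scans every document, so it shows what a geospatial query returns rather than how a geospatial index works. It assumes the stored pair is [longitude, latitude]; the helper names and the query point are hypothetical.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two [longitude, latitude] points."""
    lng1, lat1, lng2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))  # 6371 km: mean Earth radius


photos = [
    {"camera": "Nikon d4", "location": [-122.418333, 37.775]},
    {"camera": "Canon 5d mkII", "people": ["Jim", "Carol"]},  # no location
]


def near(docs, point, max_km):
    """Documents located within max_km of point; others are skipped."""
    return [d for d in docs
            if "location" in d and haversine_km(d["location"], point) <= max_km]


nearby = near(photos, [-122.4, 37.78], 5)
```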
Application | Why MongoDB might be a good fit
Large number of objects to store | Sharding lets you split objects across multiple servers
High write / read throughput and data distribution | Sharding + replication lets you scale read and write traffic across multiple servers, multiple tenants, or data centers
Low latency access | Memory mapped storage engine caches documents in RAM, enabling in-memory operations. Data locality of documents significantly improves latency over join-based approaches
Variable data in objects | Dynamic schema and JSON data model enable flexible data storage without sparse tables or complex joins, and provide for an intuitive query language
Cloud based deployment | Sharding and replication let you work around hardware limitations in the cloud.
