NoSQL DB

Advanced Database Systems
NoSQL
Instructor: Eric Umuhoza, PhD
eumuhoza@andrew.cmu.edu
@EricUmuhoza
Agreement signing: Politecnico di Milano and Veneranda Fabbrica del Duomo di Milano
Aula Magna, Rettorato
Wednesday, 27 May 2015
NoSQL History
[Timeline 1980 to 2010]
1980  Relational Databases
Relational Databases - Advantages
 Relational databases have been successful, providing:
o Persistence
o Concurrency control
 Different users access the same information at the same time
 Transactions are used to ensure consistent interaction
 ACID properties
o Integration mechanism
 Many applications need to share information
 By getting all applications to use the database, we ensure all these
applications have consistent, updated data
o SQL, a quasi standard
o Reporting
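The role of transactions in keeping concurrent interaction consistent can be sketched with Python's built-in sqlite3 module. This is only an illustration: the accounts table, the names alice and bob, and the transfer and balances helpers are assumptions, not part of the lecture.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        # `with conn` commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed on the debit, so the credit is undone too.
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
```

Calling transfer(conn, "alice", "bob", 500) fails on the debit and leaves both balances unchanged, which is the atomicity part of ACID.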
Relational Databases - Disadvantages
Impedance mismatch between the relational model and the in-memory data structures
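A minimal sketch of the mismatch, assuming a hypothetical order object with nested line items: the application holds one nested structure, but a relational schema needs it split into rows for two tables and joined back together on every read.

```python
# One in-memory object: an order with nested line items (hypothetical fields).
order = {
    "id": 1,
    "customer": "Ann",
    "lines": [
        {"product": "pen", "qty": 2, "price": 1.5},
        {"product": "book", "qty": 1, "price": 12.0},
    ],
}

def to_rows(order):
    """Flatten the nested object into rows for two relational tables."""
    order_row = (order["id"], order["customer"])
    line_rows = [
        (order["id"], n, line["product"], line["qty"], line["price"])
        for n, line in enumerate(order["lines"], start=1)
    ]
    return order_row, line_rows

def from_rows(order_row, line_rows):
    """Rebuild the nested object; this is the join the application must redo."""
    order_id, customer = order_row
    lines = [
        {"product": p, "qty": q, "price": pr}
        for (oid, _n, p, q, pr) in sorted(line_rows)
        if oid == order_id
    ]
    return {"id": order_id, "customer": customer, "lines": lines}
```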
NoSQL History
[Timeline 1980 to 2010]
1990  Object Databases
NoSQL History
[Timeline 1980 to 2010]
1990  Relational DBMSs continued to dominate over OODBs
2000  Integration DB
o Multiple applications storing their data in a shared DB
o Improves communication
NoSQL History
 How to cope with lots of traffic generated by websites and social media
applications?
o Scale up (bigger machines, more processors, disk storage, and memory)
o Use lots of small machines in a cluster
 Relational databases are not designed to run efficiently on clusters
o Alternative route to data storage
 2007:
o The research paper on Amazon Dynamo is released
o The document database MongoDB is started
 2008:
o Cassandra project
o Voldemort
The #NoSQL Story
 A meetup
o Johan Oskarsson in SF, CA, June 2009
 A hashtag: #nosql
 BUT
Carlo Strozzi used the term NoSQL in 1998 to name his lightweight, open-
source relational database that did not expose the standard SQL
interface.
Definition
 There is no standard definition of what NoSQL means

 The term began with a workshop organized in 2009
 But there is much argument about what databases can
truly be called NoSQL
Common Characteristics
 Non-relational
o They don’t use the relational data model and
thus don’t use the SQL language
 Cluster-friendly
o They tend to be designed to run on a cluster
 Open Source
 Schema-less
o They don’t have a fixed schema
o They allow you to store any data in any record
NoSQL Originators
 Google (BigTable, LevelDB)

 LinkedIn (Voldemort)
 Facebook (Cassandra)
 Twitter (Hadoop/HBase, FlockDB, Cassandra)
 Netflix (SimpleDB, Hadoop/HBase, Cassandra)
Different Types of NoSQL
 Key-Value Store
o A key that refers to a payload (actual content /
data)
o MemcacheDB, Azure Table Storage, Redis
 Column Store
o Column data is saved together, as opposed to
row data
o Super useful for data analytics
o HBase (on Hadoop), Cassandra, Hypertable
Different Types of NoSQL
 Document / XML / Object Store

o A key (and possibly other indexes) points at a serialized object
o The DB can operate on values inside the document
o MongoDB, CouchDB, RavenDB
 Graph Store
o Nodes are stored independently, and the relationships between nodes (edges) are stored with the data
o Neo4j
Key-Value
Search by ID is usually built on top of a key-value store
[Figure: table with columns Key and Value; keys 100, 215, 325]
Key-Value
Business        Key               Value
twitter.com     tweet id          information about tweet
kayak.com       flight number     information about flight
yourbank.com    account number    information about it
amazon.com      item number       information about it
Column Storage
Row-store
+ Easy to add and modify a record
- Might read in unnecessary data
Column-store
+ Only need to read in relevant data
- Tuple writes require multiple accesses
=> Suitable for read-mostly, read-intensive, large data repositories
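The trade-off between the two layouts can be sketched with plain Python lists and dictionaries. The sample records and helper names are assumptions for illustration; a real column store adds compression and on-disk layout that this sketch ignores.

```python
# The same three records in two layouts (hypothetical sample data).
rows = [
    {"id": 1, "name": "Ann", "age": 31},
    {"id": 2, "name": "Bob", "age": 25},
    {"id": 3, "name": "Cid", "age": 40},
]

def to_columns(rows):
    """Column layout: one list per attribute, values kept in row order."""
    return {field: [r[field] for r in rows] for field in rows[0]}

columns = to_columns(rows)

def avg_age_row_store(rows):
    # Touches every whole record even though only `age` is needed.
    return sum(r["age"] for r in rows) / len(rows)

def avg_age_column_store(columns):
    # Reads only the `age` column.
    ages = columns["age"]
    return sum(ages) / len(ages)

def insert_column_store(columns, record):
    # One logical insert becomes one append per column.
    for field, value in record.items():
        columns[field].append(value)
```

The analytic query reads a single list in the column layout, while an insert has to touch every column, which mirrors the plus and minus points listed above.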
Why Document-based?
 Handles schema changes well (easy development)
 Solves the impedance mismatch problem
 Usually in JSON
 Not really schema-less
o Implicit schema to retrieve specific values
o E.g.: I want the price of an order!
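A small sketch of what "implicit schema" means in practice, assuming two hypothetical order documents with different shapes: the database accepts both, so the knowledge of where the price lives moves into the application code.

```python
# Two documents in the same collection with different shapes (hypothetical data).
orders = [
    {"_id": 1, "customer": "Ann", "price": 15.0},
    {"_id": 2, "customer": "Bob", "total": 12.0},  # older shape: no "price" field
]

def order_price(doc):
    """Read the price, coping with both shapes the application knows about."""
    if "price" in doc:
        return doc["price"]
    if "total" in doc:
        return doc["total"]
    raise KeyError(f"order {doc.get('_id')} has no price field")
```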
An Example with Relations
[Marco Brambilla, Data Design and Modeling]

Key-value Approach
Document-based Approach
Column-based Approach
From Key-based to Column/Document
Aggregate-oriented Database
 An aggregate is a collection of data that we interact with as a unit
 Aggregates form the boundaries for ACID operations with
the database
 Key-value, document, and column-family databases
can be seen as forms of aggregate-oriented DB
 Aggregates make it easier for the database to manage
data storage over clusters
 Aggregate-oriented databases work best when most data
interaction is done with the same aggregate
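As a rough illustration of an aggregate, the sketch below stores a whole order (header plus line items) as one JSON value under one key. The dictionary standing in for the store, the key format and the helper names are assumptions made for this example.

```python
import json

store = {}  # stands in for a key-value store: key -> serialized aggregate

def save_order(order):
    # The whole aggregate is written as one value under one key.
    store[f"order:{order['id']}"] = json.dumps(order)

def load_order(order_id):
    return json.loads(store[f"order:{order_id}"])

def add_line(order_id, product, qty, price):
    """Read-modify-write of a single aggregate."""
    order = load_order(order_id)
    order["lines"].append({"product": product, "qty": qty, "price": price})
    save_order(order)

save_order({"id": 7, "customer": "Ann", "lines": []})
add_line(7, "pen", 2, 1.5)
```

Because every operation touches exactly one key, the store can place each aggregate on any node of a cluster without coordinating with the others.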
 Graph databases organize data into node and edge graphs
 They work best for data that has complex relationship structures!
o What about relational databases?
o Relation doesn't mean relationship: a relation is a table of tuples
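The node-and-edge model can be sketched with a list of labelled edges and a breadth-first traversal. The FRIEND edges and the function names are hypothetical; graph databases such as Neo4j expose this kind of traversal through their own query languages.

```python
from collections import deque

# Nodes and labelled edges (hypothetical data): who is a friend of whom.
edges = [
    ("ann", "FRIEND", "bob"),
    ("bob", "FRIEND", "cid"),
    ("cid", "FRIEND", "dee"),
    ("ann", "FRIEND", "eve"),
]

def neighbours(node):
    """Follow FRIEND edges in both directions."""
    out = set()
    for src, _label, dst in edges:
        if src == node:
            out.add(dst)
        elif dst == node:
            out.add(src)
    return out

def within_hops(start, max_hops):
    """Breadth-first traversal: nodes reachable in at most max_hops edges."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue
        for nxt in neighbours(node):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return {n for n in seen if n != start}
```

A "friends of friends" question is one call, within_hops("ann", 2), whereas the relational equivalent needs one self-join per hop.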
Classification of NoSQL
Key-Value CRUD Operations
 Query operations are limited to:
o put(key,value)
o get(key)
o delete(key)
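The three operations can be sketched as a tiny in-memory class. The class name KeyValueStore and the choice to return None for a missing key are assumptions of this sketch, not the API of any particular product.

```python
class KeyValueStore:
    """In-memory sketch: the store never looks inside the value."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Creates the entry or overwrites the previous value.
        self._data[key] = value

    def get(self, key):
        # Returns None for a missing key (a design choice for this sketch).
        return self._data.get(key)

    def delete(self, key):
        # Deleting a missing key is a no-op.
        self._data.pop(key, None)

store = KeyValueStore()
store.put("tweet:100", b'{"text": "hello"}')
```

There is no way to ask for "all tweets containing hello": the value is opaque to the store, so any query other than lookup by key has to be done by the application.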
Example of NoSQL System: MongoDB
 An open source and document-oriented database

 Data is stored in JSON-like documents
 Designed for both scalability and developer agility
 Dynamic schemas
Terminology: SQL vs MongoDB
MongoDB Data Model
MongoDB Queries: Create
 CRUD (Create, Read, Update, Delete)
o Create a database: use database_name
(the database is actually created when data is first stored in it)
o Create a collection:
db.createCollection(name, options)
options: optional settings, e.g. the maximum number of documents in a capped collection
o Insert a document:
db.<collection_name>.insert({"name": "nguyen", "age": 24, "gender": "male"})
MongoDB Queries: Read

o Query [e.g. select all]
db.<collection_name>.find().pretty()
o Query with conditions
db.<collection_name>.find({"gender": "female", "age": {$lte: 20}}).pretty()
o It's pattern matching!
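To make the "pattern matching" idea concrete, here is a simplified Python sketch of how a find filter can be evaluated against documents. It is not MongoDB's implementation: the matches and find helpers are hypothetical and support only equality and four comparison operators.

```python
def matches(doc, pattern):
    """True if doc satisfies every field condition in pattern."""
    ops = {
        "$lt": lambda a, b: a < b,
        "$lte": lambda a, b: a <= b,
        "$gt": lambda a, b: a > b,
        "$gte": lambda a, b: a >= b,
    }
    for field, cond in pattern.items():
        if isinstance(cond, dict):
            # Operator form, e.g. {"age": {"$lte": 20}}.
            if field not in doc:
                return False
            if not all(ops[op](doc[field], arg) for op, arg in cond.items()):
                return False
        elif doc.get(field) != cond:
            # Plain form: the field must equal the given value.
            return False
    return True

def find(collection, pattern=None):
    """Return the documents that match the pattern (all of them if none given)."""
    return [d for d in collection if matches(d, pattern or {})]

people = [
    {"name": "nguyen", "age": 24, "gender": "male"},
    {"name": "lea", "age": 19, "gender": "female"},
    {"name": "mia", "age": 30, "gender": "female"},
]
```

With this sample data, find(people, {"gender": "female", "age": {"$lte": 20}}) returns only the document for lea, because every condition in the pattern must hold at once.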
Read – Mapping to SQL
Comparison Operators
MongoDB Queries: Update

o db.<collection_name>.update(<select_criteria>, <updated_data>)
o db.students.update({"name": "nguyen"}, {$set: {"age": 20}})
o Replace the existing document with a new one using the save method:
db.students.save({_id: ObjectId("string_id"), "name": "ben", "age": 23, "gender": "male"})
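The difference between patching a document with $set and replacing it with save can be sketched on plain Python dictionaries. The helpers update_set and save below are hypothetical stand-ins that only mimic the semantics; newer MongoDB shells offer updateOne and replaceOne for the same purposes.

```python
students = [
    {"_id": 1, "name": "nguyen", "age": 24, "gender": "male"},
    {"_id": 2, "name": "ben", "age": 22, "gender": "male"},
]

def update_set(collection, criteria, changes):
    """Like update(criteria, {$set: changes}): patch the first matching document."""
    for doc in collection:
        if all(doc.get(k) == v for k, v in criteria.items()):
            doc.update(changes)  # other fields are left untouched
            return 1
    return 0

def save(collection, new_doc):
    """Like save(): replace the document with the same _id, else insert it."""
    for i, doc in enumerate(collection):
        if doc["_id"] == new_doc["_id"]:
            collection[i] = new_doc  # fields missing from new_doc are dropped
            return
    collection.append(new_doc)
```

After save(students, {"_id": 2, "name": "ben", "age": 23}) the gender field of that document is gone, whereas update_set keeps every field it was not asked to change.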
MongoDB Queries: Delete

o Drop a database:
db.dropDatabase()
o Drop a collection:
db.<collection_name>.drop()
o Delete the documents that match a condition:
db.<collection_name>.remove({"gender": "male"})
Conclusions
 NoSQL pros
o Data modeling can be an iterative process
o Linear scaling occurs as nodes are added to the cluster
o Native integration of MapReduce frameworks and full-text search engines
o Easy and efficient storage of highly variable data
Conclusions
 NoSQL “cons”
o Implicit schema at the application level
o Applications need to check for consistency and integrity constraints
o No transactions (across multiple objects), conflict resolution must
be done by the client application
o ACID transactions are limited to just one element (row, document,
entity, etc.) in contrast with RDBMS
o 2nd generation NoSQL or NewSQL databases try to cope with this problem
o Data models and query languages are proprietary and create
vendor lock-in
o Data structure is chosen upfront, based on the queries that will be expressed
o If the queries change, the data structure must change too
