UNIT NO. 1
INTRODUCTION TO NOSQL
By Kajal Londhe.
Sinhgad College Of Science
What is covered in this presentation?
A brief history of databases
NoSQL WHY, WHAT & WHEN?
Characteristics of NoSQL databases
Aggregate data models
CAP theorem
Introduction

• Database - Organized collection of data
• DBMS - Database Management System: a software package with
computer programs that controls the creation, maintenance and
use of a database
• Databases are created to handle large quantities of information
by inputting, storing, retrieving, and managing that information.
Relational databases
• Benefits of Relational databases:
• Designed for all purposes
• ACID
• Strong consistency, concurrency, recovery
• Mathematical background (well-defined semantics)
• Standard Query language (SQL)
• Many supporting tools, e.g. reporting services and entity
frameworks
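The ACID benefit listed above can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The `accounts` table, the `CHECK` constraint and the helper names (`make_db`, `transfer`, `balances`) are assumptions made up for this example. It shows atomicity only: if the second update of a transfer fails, the first one is rolled back as well.

```python
import sqlite3

def make_db():
    """Create an in-memory database with a hypothetical accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one atomic transaction."""
    try:
        # The connection context manager commits on success and
        # rolls back every statement in the block on an exception.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint if src has too little money.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return all balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
```

Calling `transfer(conn, "alice", "bob", 500)` on a fresh database returns `False` and leaves both balances unchanged, because the credit to `bob` is undone when the debit from `alice` fails.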
SQL databases
History of NoSQL
• The term NoSQL was coined by Carlo Strozzi in 1998. He used this term
to name his open-source, lightweight database, which did not have an SQL
interface.
What is NoSQL?

• NoSQL refers to non-SQL or non-relational databases.
• It provides a mechanism for storing and retrieving data that is
modeled by means other than the tabular relations used in relational
databases. A NoSQL database generally does not store data in
relational tables. It is commonly used for big data and real-time
web applications.
• For example, companies like Twitter, Facebook and Google collect
terabytes of user data every single day.
Why NoSQL?

• Today, data is becoming easier to access and capture
through third parties such as Facebook, Google+ and others.
Personal user information, social graphs, geolocation data,
user-generated content and machine logging data are just a few
examples where the volume of data has been increasing
exponentially. Serving these workloads properly requires
processing huge amounts of data, which SQL databases were never
designed for. NoSQL databases evolved to handle this volume of
data.
When should NoSQL be used?
• When huge amounts of data need to be stored and retrieved.
• The relationships between the data you store are not that important.
• The data changes over time and is not structured.
• Support for constraints and joins is not required at the database level.
• The data is growing continuously and you need to scale the
database regularly to handle it.
Key Features of NoSQL :
1.Document-based: Some NoSQL databases, such as MongoDB, use a document-
based data model, where data is stored in semi-structured format, such as JSON or
BSON.
2.Key-value-based: Other NoSQL databases, such as Redis, use a key-value data model,
where data is stored as a collection of key-value pairs.
3.Column-based: Some NoSQL databases, such as Cassandra, use a column-based data
model, where data is organized into columns instead of rows.
4.Distributed and high availability: NoSQL databases are often designed to be highly
available and to automatically handle node failures and data replication across
multiple nodes in a database cluster.
5.Flexibility: NoSQL databases allow developers to store and retrieve data in a flexible
and dynamic manner, with support for multiple data types and changing data
structures.
6.Performance: NoSQL databases are optimized for high performance and can handle
a high volume of reads and writes, making them suitable for big data and real-time
applications.
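To make the first three data models concrete, here is a minimal sketch that stores the same record in each shape using plain Python dictionaries. It does not use any real database client. The record contents, the key format `user:u1` and the helper names are hypothetical and chosen only for illustration.

```python
import json

# Document model: one self-contained, nested record per entity.
document = {
    "_id": "u1",
    "name": "Asha",
    "emails": ["asha@example.com"],
    "address": {"city": "Pune", "pin": "411041"},
}

# Key-value model: the store only sees a key and an opaque value
# (here the same record serialized as a JSON string).
kv_store = {"user:u1": json.dumps(document)}

# Column-based model: row key -> column family -> column -> value.
column_store = {
    "u1": {
        "profile": {"name": "Asha"},
        "address": {"city": "Pune", "pin": "411041"},
    }
}

def city_from_document(doc):
    """The database can address fields inside a document directly."""
    return doc["address"]["city"]

def city_from_kv(store, key):
    """The application must fetch the whole value and decode it."""
    return json.loads(store[key])["address"]["city"]

def city_from_columns(store, row_key):
    """Only the needed column family of the row has to be read."""
    return store[row_key]["address"]["city"]
```

All three lookups return the same city. What differs is how much of the record's structure is visible to the store rather than to the application.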
Advantages of NoSQL
1.Flexibility: NoSQL databases are designed to handle unstructured or semi-structured
data, which means that they can accommodate dynamic changes to the data model.
This makes NoSQL databases a good fit for applications that need to handle changing
data requirements.
2.High availability: Automatic replication makes NoSQL databases highly
available, because in case of a failure the data can be restored from a
replica to its previous consistent state.
3.Scalability: NoSQL databases are highly scalable, which means that they can handle
large amounts of data and traffic with ease. This makes them a good fit for
applications that need to handle large amounts of data or traffic.
4.Performance: NoSQL databases are designed to handle large amounts of data and
traffic, which means that they can offer improved performance compared to
traditional relational databases.
5.Cost-effectiveness: NoSQL databases are often more cost-effective than traditional
relational databases, as they are typically less complex and do not require expensive
hardware or software.
6.High scalability: NoSQL databases use sharding for horizontal
scaling. Sharding is the partitioning of data and placing it on multiple
machines in such a way that the order of the data is preserved. Vertical
scaling means adding more resources to the existing machine, whereas
horizontal scaling means adding more machines to handle the data.
Vertical scaling is limited by the capacity of a single machine, whereas
horizontal scaling is comparatively easy to implement with NoSQL
databases. Examples of horizontally scalable databases are MongoDB,
Cassandra, etc. NoSQL can handle a huge amount of data because of this
scalability: as the data grows, a NoSQL database scales out to handle
that data in an efficient manner.
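Order-preserving sharding as described above can be sketched as range-based partitioning. The class below is a toy in-memory model, not the mechanism of any particular database. The name `RangeShardedStore` and the split points are assumptions for this example, and each "machine" is simply a Python dictionary.

```python
from bisect import bisect_right

class RangeShardedStore:
    """Toy range-sharded store: each shard owns a contiguous key range."""

    def __init__(self, split_points):
        # N sorted split points define N + 1 shards (one dict per "machine").
        self.split_points = sorted(split_points)
        self.shards = [dict() for _ in range(len(self.split_points) + 1)]

    def shard_for(self, key):
        # Keys below the first split point go to shard 0, and so on.
        return bisect_right(self.split_points, key)

    def put(self, key, value):
        self.shards[self.shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self.shard_for(key)].get(key)

    def scan(self):
        # Shards hold disjoint, ascending key ranges, so visiting them
        # in order yields all keys in globally sorted order.
        for shard in self.shards:
            for key in sorted(shard):
                yield key, shard[key]
```

With split points `["h", "p"]`, the key `"apple"` lands on shard 0, `"mango"` on shard 1 and `"zebra"` on shard 2. Adding a machine corresponds to adding a split point, although a real system would also have to move the affected keys.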
NoSQL pros/cons
Advantages :
• High scalability
• Distributed Computing
• Lower cost
• Schema flexibility, semi-structured data
• No complicated Relationships
Disadvantages :
• No standardization
• Limited query capabilities (so far)
• Eventual consistency is not intuitive to program for
Aggregate Data Models (Types of NoSQL database)
• The types of NoSQL databases, and examples of database systems that fall
into each category, are:
1.Graph Databases: Examples – Neo4j
2.Key value store: Examples – Redis
3.Column-oriented: Examples – HBase, Bigtable
4.Document-based: Examples – MongoDB, CouchDB
Key value data Model
Distribution Models:
• Aggregate-oriented databases make distribution of data easier, since the
distribution mechanism only has to move the aggregate and does not have to worry
about related data, as all the related data is contained in the aggregate. There are
two styles of distributing data:
• Sharding: Sharding distributes different data across multiple servers, so each
server acts as the single source for a subset of data.
• Replication: Replication copies data across multiple servers, so each bit of data
can be found in multiple places. Replication comes in two forms:
• Master-slave replication makes one node the authoritative copy that handles writes while
slaves synchronize with the master and may handle reads.
• Peer-to-peer replication allows writes to any node; the nodes coordinate to synchronize their
copies of the data.

• Master-slave replication reduces the chance of update conflicts, but peer-to-peer
replication avoids loading all writes onto a single server, which would create a
single point of failure. A system may use either or both techniques. For example,
the Riak database shards the data and also replicates it based on the replication
factor.
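Master-slave replication can be sketched with a small in-memory model. The class and method names (`Node`, `MasterSlaveCluster`, `sync`) are hypothetical, and the sketch assumes asynchronous replication, where slaves receive writes only when `sync` runs. Under that assumption a read from a slave can briefly return stale data, which is the eventual consistency listed among the disadvantages earlier.

```python
class Node:
    """One server holding a full copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}

class MasterSlaveCluster:
    """Toy model: all writes go to the master, reads go to the slaves."""

    def __init__(self, slave_count):
        self.master = Node("master")
        self.slaves = [Node(f"slave-{i}") for i in range(slave_count)]
        self.pending = []  # writes accepted but not yet replicated

    def write(self, key, value):
        # Only the master accepts writes, so update conflicts cannot occur.
        self.master.data[key] = value
        self.pending.append((key, value))

    def read(self, key, slave_index=0):
        # Reads are served by a slave and may lag behind the master.
        return self.slaves[slave_index].data.get(key)

    def sync(self):
        # Assumed asynchronous step: replay pending writes on every slave.
        for key, value in self.pending:
            for slave in self.slaves:
                slave.data[key] = value
        self.pending.clear()
```

After `write("k", 1)`, a call to `read("k")` returns `None` until `sync()` has run, after which every slave returns `1`. A peer-to-peer design would instead let every node accept writes and would need a conflict-resolution step in place of the single master.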
