No SQL Data Storage Tools
Contents
1 Introduction
5 Conclusion
6 Bibliography
Introduction
Many companies and web services use databases every day.
The most common type of database over the last decades is the relational
database.
A relational database is a collection of data organized in a set of related tables that are defined in advance. Each table contains one or more categories
organized in columns, and each row holds the information of a single element of the data.
Queries against the information in these databases are written in SQL (Structured Query Language),
which is based on relational algebra.
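As a minimal sketch of this model, the following Python snippet uses the standard-library sqlite3 module to define a table in advance, insert a few rows and query them with SQL. The table name users, its columns and the sample rows are hypothetical and serve only as an illustration.

```python
import sqlite3

# In-memory relational database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The table structure (columns and types) is defined before any data is added.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# Each row holds the information of one element of the data.
conn.executemany(
    "INSERT INTO users (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ana", "Madrid"), (2, "Luis", "Sevilla"), (3, "Marta", "Madrid")],
)

# A SQL query selects the rows that match a condition.
rows = conn.execute(
    "SELECT name FROM users WHERE city = ? ORDER BY name", ("Madrid",)
).fetchall()
names = [name for (name,) in rows]
```

Here names ends up holding the names of the users whose city is Madrid, in alphabetical order.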
They needed the data to be available, in the sense that all clients can
With the arrival of Web 2.0, not only companies but everyone can upload data to
the web, and the exponential growth of data has started
to cause problems in managing all the information stored in relational
databases.
Because all the operations on the database are controlled by a single
machine to ensure consistency, this machine becomes a bottleneck
when the number of simultaneous operations grows, slowing down the whole
database. A first approach to solving this problem is to scale up (that is,
vertically) the machine by buying better components, such as a faster processor,
but this is expensive and does not fully solve the problem.
Big web companies like Google, Facebook or Amazon realized that, for
them, performance and processing speed were most of the time more important than consistency, and started to develop new types of
databases to address the situation. These databases are known as NoSQL.
Easy scale-out
before they are applied, so if an error occurs we can restore the data using the log.
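The idea of recording changes in a log before they are applied, so that the data can be restored after an error, can be sketched roughly as follows. The LoggedStore class, its methods and the log file name are hypothetical and do not correspond to any particular database; a real system would also handle partial writes and log compaction.

```python
import json
import os
import tempfile


class LoggedStore:
    """Toy key-value store that logs every change before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}

    def put(self, key, value):
        # 1. Append the change to the log and force it to disk first.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # 2. Only then apply the change to the in-memory data.
        self.data[key] = value

    @classmethod
    def recover(cls, log_path):
        # Rebuild the data by replaying every logged change in order.
        store = cls(log_path)
        if os.path.exists(log_path):
            with open(log_path, encoding="utf-8") as log:
                for line in log:
                    entry = json.loads(line)
                    store.data[entry["key"]] = entry["value"]
        return store


log_file = os.path.join(tempfile.mkdtemp(), "changes.log")
store = LoggedStore(log_file)
store.put("user:1", "Ana")
store.put("user:2", "Luis")

# Simulate an error that loses the in-memory data, then restore from the log.
del store
restored = LoggedStore.recover(log_file)
```

After the simulated failure, restored.data contains both entries again because every change reached the log before it was applied.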
Partition tolerance and availability
Lack of structure
One characteristic of NoSQL databases is the lack of a predefined structure. In relational databases we had a clear tabular structure that
had to be defined before adding any information to the database; NoSQL
databases have no such requirement.
This has advantages and disadvantages:
The advantage is clear. As the system is not responsible for the structure
of the data, adding new information is much faster than in
relational databases. It is said that NoSQL databases can ingest almost
anything.
cause a loss of data, in the sense that we are free to use different
structures to add the same type of data, but when we try to retrieve that
data we can "forget" some of the structures used and recover only
part of the information.
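The trade-off described above can be illustrated with a small sketch in plain Python, where a list of dictionaries stands in for a schemaless store. The field names (phone, contact, phones) and the helper functions are assumptions made up for this example.

```python
# A schemaless "collection": a list of dictionaries with no predefined structure.
collection = []


def insert(document):
    """Store any dictionary as-is; no schema is checked."""
    collection.append(document)


# The same type of data (a person's phone number) added with different structures.
insert({"name": "Ana", "phone": "600-111-222"})
insert({"name": "Luis", "contact": {"phone": "600-333-444"}})
insert({"full_name": "Marta", "phones": ["600-555-666"]})


def phones_naive(docs):
    """Reader that only remembers the first structure."""
    return [d["phone"] for d in docs if "phone" in d]


def phones_all(docs):
    """Reader that knows every structure that was used."""
    found = []
    for d in docs:
        if "phone" in d:
            found.append(d["phone"])
        if "phone" in d.get("contact", {}):
            found.append(d["contact"]["phone"])
        found.extend(d.get("phones", []))
    return found


partial = phones_naive(collection)
complete = phones_all(collection)
```

All three inserts succeed without any schema change, but partial contains only one of the three phone numbers, while complete recovers all of them because it accounts for every structure that was used.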
Depending on the data structure that a database has, we can classify
NoSQL databases into four main groups:
3.1 Key-Value database
Apache Cassandra
Document database
MongoDB
Graph database
Neo4j
Neo4j is one of the best-known NoSQL graph databases. It is open source, written in Java and has a native graph storage and processing system.
It ensures consistency and high availability of the data.
3.4 Column-based database
Druid
NoSQL vs SQL
As we have seen throughout this technical report, NoSQL databases appeared a few years ago to solve a problem: the older relational databases were not able to cope with the amount of data that is being
generated nowadays.
A NoSQL database must basically satisfy two conditions: it must scale out and it must
not require a predefined structure. With these characteristics, such databases solve the problem they were created for, but this does not mean that relational databases
are becoming obsolete or are no longer useful.
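One common way to scale out is to spread the keys over several machines by hashing them. The following is a minimal sketch of that idea, in which each "node" is simulated by a Python dictionary; the ShardedStore class and the key names are hypothetical and do not represent how any specific NoSQL product is implemented.

```python
import hashlib


class ShardedStore:
    """Toy key-value store that spreads keys over several simulated nodes."""

    def __init__(self, node_count):
        # Each dictionary stands in for one machine of the cluster.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # A stable hash of the key decides which node stores it.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.nodes)

    def put(self, key, value):
        self.nodes[self._node_for(key)][key] = value

    def get(self, key):
        return self.nodes[self._node_for(key)].get(key)


cluster = ShardedStore(node_count=3)
for i in range(100):
    cluster.put(f"user:{i}", {"id": i})

# Number of keys held by each simulated node.
sizes = [len(node) for node in cluster.nodes]
```

Each operation touches only the node that owns the key, so no single machine has to handle every request. With this simple modulo scheme, changing the number of nodes moves most keys to a different node, which is why real systems often use techniques such as consistent hashing instead.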
We should therefore know when it is better to use a NoSQL database and when
it is better to use a relational database. A NoSQL database is the better choice
if:
The amount of data is huge.
Our data doesn't have a uniform schema.
We expect intensive use of the database.
There are a lot of relations between our data.
Conclusion
NoSQL databases are a new generation of databases created to solve the problem of handling huge amounts of data, and for this reason they are increasingly
used in big data and real-time web applications.
They lack a predefined structure, so there are many types of NoSQL database
(key-value, document-oriented, graph, etc.), each of them built
for a specific kind of work. This lack of structure also makes it hard to move
from one NoSQL provider to another.
Like any new technology, NoSQL databases are not yet as widely used as they
could be, perhaps because of the distrust they still generate, but the benefits
of using this technology correctly are great: speed when working with huge
amounts of data, easy parallelization, a scale-out structure, and so on. For all these
reasons, the use of NoSQL databases can be expected to grow in the
future.
Bibliography
Material from subject "Back-end for Big Data analysis".
https://es.wikipedia.org/wiki/NoSQL
https://en.wikipedia.org/wiki/NoSQL
http://www.acens.com/wp-content/images/2014/02/bbdd-nosql-wp-acens.pdf
http://www.genbetadev.com/bases-de-datos/bases-de-datos-nosql-elige-la-opcion-que-mejor-se-adapte-a-tus-necesidades
http://www.genbetadev.com/bases-de-datos/el-concepto-nosql-o-como-almacenar-tus-datos-en-una-base-de-datos-no-relacional
http://www.maestrosdelweb.com/nosql-como-el-futuro-de-las-bases-de-datos/
http://www.campusmvp.es/recursos/post/Fundamentos-de-bases-de-datos-NoSQL-MongoDB.aspx