Professional Documents
Culture Documents
Vol. 5, No 2, 2017.
1
www.japmnt.com
(JPMNT) Journal of Process Management – New Technologies, International
Vol. 5, No 2, 2017.
So, the attributes for trains are train which allows storing the particular objects
number, source city, destination city, with relation between them. A graph can
arrival time and departure time. Train be expressed as G = (V, E) where V is the
number is the primary key. The attributes set of vertices (nodes) and E is the set of
for pilots are employee identity, name, edges (relations). The data structure such
address, experience year. Employee as adjacency matrix, incidence matrix,
identity is the primary key. For the adjacency list and linked list are used to
relationship “assigned to”, the attributes store the graph. The Resource Description
are the primary key of train, pilot and date. Framework (RDF) is a popular
representation for graph data, and it is used
Now we can use relational tables to as a model by some graph databases even
represent the data object and the though the primary objective of RDF was
relationship with their attributes. The train not really representation of graphs [4].
table contains the above mentioned
attributes for trains, the pilot table contains In recent years, some graph databases
the above mentioned attributes for pilots such as DEX, Infinite Graph, Neo4j,
and there is one more table for the Orient DB, Titan etc. have been
relationship assigned to attributes like train developed, which are currently used in
number, employee identity and date. operations for storing data over a
traditional relational model. A database
system Big Table has been created and
1.2 Graph Database used by Google [5].
2
www.japmnt.com
(JPMNT) Journal of Process Management – New Technologies, International
Vol. 5, No 2, 2017.
In the graph model [11] shown above, the huge static graphs and has used
there are three nodes and five relationships geometry based technique to overcome
and a label. User is the label. Ruth, Billy this problem. Hong et. al. [9] used the
and Harry are the nodes and the property is graph database for subgraph matching with
name for the nodes and ‘follows’ is the set similarity. Sakr and Al-Naymat
Al [10]
relationship.
p. Here, Ruth follows Billy, have proposed efficient relational
which means that there is a connection techniques for
or processing graph queries.
between these two nodes using a directed They have used two kinds of relational
edge, and the edge has a property called mapping schemes for storing graph
‘follows”. In Neo4j, the edges are database such as Vertex-Edge
Vertex mapping
bidirectional as we can see in the model in and Edge-Edge
Edge mapping. In Vertex-Edge
Vertex
Fig. 1. Neo4j uses the cyphercyph query mapping, they have used two tables:
language for creating nodes, relationships Vertices (graph ID, Vertex ID, Vertex
with properties and also for retrieving, Label)) and Edges (graph ID, s-vertex,
s d-
searching and manipulating data. vertex and edge label) In Edge-Edge
Edge
mapping they have used only one table for
2. Related Work representing the graph. The table contains
Many researchers are working on Edge (GID, e-ID, e-label,
label, s-VID,
s sv-Label,
graph database in different fields such as Dvid, dv-Label).
computer science, chemistry, biology,
bio 3. Relational Databases vs. Graph
electronics etc. Balaur et. al. [6] have Databases
applied graph database technologies for
managing comprehensive genome scale Relational databases
database were originally
networks. In their work, they have designed in tabular structure. Modelling
visualized an acid metabolic network relationships are not quite suitable in
based on the cypher query language.
language Batra relational databases. It is so because they
[7] also compared the relational database use foreign keys to relate one information
and graph database based on some with another. In relational databases, tables
evaluation parameters such as level of are needed to be joined using foreign keys
support or maturity, security and to connect information from different
flexibility. Ching-Yung
Yung Lin [8] has tables. If we want to join many tables for
worked on graph computing and linked big querying, it may not be optimal to deal
data, and has faced challenges to visualize with.
3
www.japmnt.com
(JPMNT) Journal of Process Management – New Technologies, International
Vol. 5, No 2, 2017.
Table 1 Table 2
Employees Relationships
E1 Raj E1 R3
E2 Hirak E1 R4
E3 Bimal E2 R3
E4 John E3 R2
E3 R4
E4 R3
Here, employees and relationships are relational databases. But in our example,
shown in two tables and the relationship however the relations are undirected, so
table represents the relation between that emp_id ‘E3’ and relation_id’ R4’ are
employees. An id has been assigned to the same as emp_id ‘E4’ and relation_id
every employee in Table 1. An emp_id in “R3”.
the employees’ table is referenced as either
emp_id or a relation_id; both the emp_id Now, suppose we want to make
and relation_id columns in relationship simple queries such as to find out what
table refer ids in the Employees’ table. For Bimal’s relationships are in the relational
example, Bimal has emp_id ‘E3’ is a model. First, the query looks at the
relation to Raj and Hirak who has emp_id employees’ table, finds which emp_id is
‘E1’ and ‘E2’ respectively. assigned to Bimal, takes that emp_id to the
relationships table, finds all relation_ids
In the employees’ table, emp_id is the that are assigned to Bimal’s emp_id, and
primary key and both emp_id and then those relation_ids are returned back to
relation_id are used as composite keys in the employees’ table to find the emp_name
the relationship table. A composite key associated to those relation_ids. This type
needs both of the id’s to be a unique of a query is still possible with relational
combination of values. For example, model, but it is computationally expensive.
emp_id ‘E1’, relation_id ‘R3’ and emp_id
‘E1’, relation_id ‘R4’ are not same In contrast, for the same type of a
composite value. If directed relationships model, if we use a graph database such as
are needed to be modelled, the composite Neo4j then Neo4j gives us better result to
value such as emp_id ‘E3’ and realtion_id model the data for above relational model.
‘R4’ are completely different from the In Neo4j, the Model is the following:
composite value than emp_id ‘E4’ in the
4
www.japmnt.com
(JPMNT) Journal of Process Management – New Technologies, International
Vol. 5, No 2, 2017.
5
www.japmnt.com
(JPMNT) Journal of Process Management – New Technologies, International
Vol. 5, No 2, 2017.
In Fig. 3, for representing the Cricket Dravid, Virat etc. and also set properties
game dataset, we have created different such as cric_id and cric_name. For team
nodes for players, teams, game type and label, we have created nodes such as India,
relationships like “belongs_to” and Australia, Pakistan etc. In this model, we
“plays”. For example, we have created first have created the relationship between the
a label “Cricketer” and then different Cricketer and the Team.
nodes for that label such as Sachin,
Fig. 4: Representation of the Cricket Game Dataset for Relationship, Cricketer and
Game Type.
6
www.japmnt.com
(JPMNT) Journal of Process Management – New Technologies, International
Vol. 5, No 2, 2017.
In Fig. 4, for representing the cricket Query 3. Find the Cricket players who
game dataset with the relation between belong to Team India.
cricketerr and game, we already have
created different nodes for players, and we In Table - 3 we have shown the query
have added different nodes for game label results in milliseconds (ms) for Mysql and
such as Test, ODI and T20. We have set Neo4j. We have mentioned the number of
properties such as cric_id and game_name objects for the queries in Mysql and Neo4j
also. In this model, we have created the and also have given the time taken for
relationship between the cricketer and different queries in Mysql and Neo4j. We
game. The objective test of evaluation have calculated the time for executing the
including processing speed between Mysql above mentioned queries for 100, 300 and
and Neo4j is based upon a set of queries 400 objects. We have drawn the column
for the above schema. The Queries are as graph for the execution timeti for three
follows: queries in Mysql and Neo4j. It was found
that graph databases are faster than
Query 1. Find the names of teams. relational database with respect to time
taken for executing the complex queries.
Query 2. Find the names of the cricketers Further, we can design the complex
who have played both ODI and Test relationship model easily.
Cricket.
20
18
16
14
12
10 mysql
8 neo4j
6
4
2
0
query1 query2 query3
7
www.japmnt.com
(JPMNT) Journal of Process Management – New Technologies, International
Vol. 5, No 2, 2017.
250
200
150
mysql
100 neo4j
50
0
query1 query2 query3
400
350
300
250
200 mysql
150 neo4j
100
50
0
query1 query2 query3
5. Conclusions
REFERENCES
Both relational database and graph
database are useful to represent data. [1] Database Trends and Applications. Available
http://www.dbta.com/Articles/Columns/Notes
http://www.dbta.com/Articles/Columns/Notes-on-
However, if we compare with reference to NoSQL/Graph-Dat
Dat abases
abases-and-the-Value-They-
storing and retrieving highly connected Provide-74544.aspx,2012
data, graph databases appear to give better
results. This means that we can more [2] Surajit Medhi, Visualization of Graph Models
for Web Document in Neo4j, International Journal
easily design and retrieve the results of of Energy, Information and Communications,
queries if we use graph databases instead Vol.7, Issue 6, 2016
of using relational databases. When we
want to add a new relationship to graph [3] Gabriele Pozzani, Introduction to NoSQL,
profs.sci.univr.it/~pozzani/attachments/nosql%20
profs.sci.univr.it/~pozzani/attachments/nosql%20-
database, we do not need to restructure the
%2001%20-%20introduction.pdf,
%20introduction.pdf, April 24, 2013
database again. As we can see from our
experiment, we get better er results for [4] Jonathan Hayes, A Graph Model for RDF R
executing the predefined queries in graph (Diploma thesis), Technische Universit¨at
Darmstadt Universidad de Chile, 2004.
databases than in relational database with
respect to time. The retrieval times for [5] R. Angles and C. Gutierrez, Survey of Graph
quires give us a conclusion that graph Database Models, ACM Comput. Surv.
Surv 40(1):1–39,
database is suitable to use in commercial 2008.
purposes such as developing social [6] Irina Balaur, Alexander Mazein, Mansoor Saqi,
network, stock market, recommendation Artem Lysenko, Christopher J. Rawlings and
engine and network management. Charles Auffray, Recon2neo4j:
Recon2neo4j Applying Graph
Data Based Technology for Managing
8
www.japmnt.com
(JPMNT) Journal of Process Management – New Technologies, International
Vol. 5, No 2, 2017.
Comprehensive Genome – Scale Networks, [9] Liang Hong, Lei Zou, Xiang Lian, Philip S. Yu,
Bioinformatics, 2017, 1–3. Subgraph Matching with Set Similarity in a Large
. Graph Database, IEEE Transactions on Knowledge
[7] Shalini Batra and Charu Tyagi, Comparative and Data Engineering, 1041-4347, 2015.
Analysis of Relational and Graph Databases,
International Journal of Soft Computing and [10] Sherif Sakr and Ghazi Al-Naymat, Efficient
Engineering, Volume-2, Issue-2, 2012 Relational Techniques for Processing Graph
Queries, Journal of Computer Science and
Technology, 25(6): 1237 – 1255, 2010.
9
www.japmnt.com