
SQL MIGRATE TO JSON-LD

HARITHARAN S/O NEDUNSELIAN

SESSION 2016/2017

FACULTY OF INFORMATION SCIENCE & TECHNOLOGY

MULTIMEDIA UNIVERSITY

MARCH 2017
SQL MIGRATE TO JSON-LD

BY

HARITHARAN S/O NEDUNSELIAN

SESSION 2016/2017

THE PROJECT REPORT IS PREPARED FOR

FACULTY OF INFORMATION SCIENCE & TECHNOLOGY


MULTIMEDIA UNIVERSITY
IN PARTIAL FULFILLMENT
FOR

BACHELOR OF INFORMATION TECHNOLOGY


B.I.T. (HONS) SECURITY TECHNOLOGY

FACULTY OF INFORMATION SCIENCE & TECHNOLOGY

MULTIMEDIA UNIVERSITY

MARCH 2017
© 2015 Universiti Telekom Sdn. Bhd. ALL RIGHTS RESERVED

Copyright of this report belongs to Universiti Telekom Sdn. Bhd as qualified by


Regulation 7.2 (c) of the Multimedia University Intellectual Property and
Commercialization policy. No part of this publication may be reproduced,
stored in or introduced into a retrieval system, or transmitted in any form or by
any means (electronic, mechanical, photocopying, recording, or otherwise), or
for any purpose, without the express written permission of Universiti Telekom
Sdn. Bhd. Due acknowledgement shall always be made of the use of any
material contained in, or derived from, this report.

DECLARATION

I hereby declare that this work has been done by myself, and that no portion of the work
contained in this thesis has been submitted in support of any application for any other
degree or qualification of this or any other university or institute of learning.

____________________
Haritharan S/O Nedunselian
Faculty of Information Science & Technology
Multimedia University
Date: 13 March 2017

ACKNOWLEDGEMENT

I would like to thank everyone who contributed to the successful
completion of this project. I would like to express my gratitude to my supervisor, Dr.
Subramanian Kannan, for his invaluable advice, guidance and enormous patience
throughout the development of the research.

In addition, I would also like to take this opportunity to acknowledge my
loving parents and brothers, especially N. Kanimuthu, and my friends Divagaran,
Muhilan, Dev Abineyan, Nandtha Kumar and Vivek, who were there to support me
and help me overcome the errors I made during the development phase of this project.
I thank them for sharing their knowledge and time on this project.

ABSTRACT

Relational database management system (RDBMS) concepts are used all over
the world and have been adopted by many organizations. Two broad categories of
database are in use today: SQL and NoSQL databases. SQL databases are mainly
designed for structured data, while NoSQL databases are designed for semi-structured
and unstructured data. SQL databases were dominant in the earlier period, but
nowadays they show several drawbacks with respect to storage. In large organizations,
the massive amount of data collected from different fields creates serious storage
problems and is also costly. The relational database is not scalable, not partition
tolerant, and not a good fit for current web service technology. This has led to a
situation in which more organizations and database vendors are turning to new
technologies that attempt to overcome the disadvantages of SQL.

With the semantic web (Web 2.0), the NoSQL database was introduced as an
alternative to SQL. The goal of this research is to migrate a SQL database into a
NoSQL database and store it as JSON-LD documents. After the NoSQL database was
introduced, many organizations started to use it and appreciated its benefits. NoSQL
databases promise high scalability and availability over the storage layer. This
research also describes the tools that can be used for SQL to NoSQL migration,
compares the data types of SQL and NoSQL, and examines the storage technology
used. However, based on the CAP theorem, NoSQL provides high performance over
the storage layer but is unpredictable in terms of consistency. In this thesis, the tool
chosen for the migration process is MongoDB. The objectives of this research are
realized in the methodology, which covers SQL to NoSQL mapping and storing the
documents in JSON-LD. This thesis also discusses how JSON-LD data is supported
by the SQL to NoSQL translation and how its features compare to a SQL database.

TABLE OF CONTENTS

DECLARATION ..................................................................................................... III

ACKNOWLEDGEMENT ...................................................................................... IV

ABSTRACT ...............................................................................................................V

TABLE OF CONTENTS ........................................................................................ VI

LIST OF TABLES ................................................................................................ VIII

LIST OF FIGURES ................................................................................................. IX

LIST OF ABBREVIATIONS/ SYMBOLS ......................................................... XII

LIST OF APPENDICES ...................................................................................... XIII


CHAPTER 1 INTRODUCTION.............................................................................. 1

1.1 Overview ................................................................................................. 1


1.2 Problem Statement................................................................................... 2
1.3 Project Objectives .................................................................................... 3
1.4 Project Scope ........................................................................................... 3

CHAPTER 2 LITERATURE REVIEW ................................................................. 4

2.1 Introduction ............................................................................................. 4


2.2 What is RDB? .......................................................................................... 4
2.2.1 Relational database model ................................................................... 5
2.2.2 Relational Database Terminology ....................................................... 9
2.2.3 ACID database and distributed database ........................................... 10
2.2.4 The SQL Language............................................................................ 12
2.3 The NoSQL Language ........................................................................... 14
2.3.1 Document Store (DS) ........................................................................ 17
2.4 Key-Value Stores................................................................................... 18
2.4.1 Columnar database ............................................................................ 19
2.4.2 Column .............................................................................................. 19
2.4.3 Super Column .................................................................................... 20
2.4.4 Column Family .................................................................................. 21

2.4.5 Graph Database ................................................................................. 22
2.4.6 Cap Theorem ..................................................................................... 24
2.5 What is JSON-LD? ................................................................................ 25
2.6 Existing migrate method and tools for transform SQL to JSON-LD .... 29
2.7 Comparison of transformation method proposed by authors. ............... 36
2.8 Chapter Summary .................................................................................. 37

CHAPTER 3 METHODOLOGY ............................................................................ 39

3.1 Hardware Requirements ........................................................................ 39


3.2 Software Requirements ......................................................................... 42
3.3 Selected tools for migrating SQL to JSON-LD ..................................... 43
3.4 Justification for Selected tools for SQL to JSON-LD ........................... 46
3.5 Converting SQL to RDF using RDB2LOD and stored in JSON-LD
document ........................................................................................................... 50
3.6 Schematic diagram ................................................................................ 52

CHAPTER 4 IMPLEMENTATION AND RESULT ........................................... 54

4.1 Migrating SQL to JSON-LD using MongoDB ..................................... 54


4.2 Migrating SQL to RDF and stored in JSON-LD using RDB2LOD ...... 59
4.2.1 Standard RDB Mapping .................................................................... 59
4.2.2 Custom RDB Mapping ...................................................................... 61
4.2.3 Difference between Standard RDF Mapping and Custom RDB
Mapping ......................................................................................................... 67
4.2.4 Import the Standard RDF Mapping and Custom RDB Mapping file
into Protégé 4.3 .............................................................................................. 68

CHAPTER 5 ANALYSIS OF RESULT AND DISCUSSION ............................. 72

5.1 Discussion.............................................................................................. 74
CHAPTER 6 CONCLUSION ................................................................................ 74
REFERENCES ........................................................................................................ 75
APPENDICES.......................................................................................................... 79

LIST OF TABLES

Table 1 Comparison of transformation methods proposed by the authors
(Aswamenakul, Buranarach, & Saikaew) ............................................................. 36

LIST OF FIGURES

Figure 2:1 Relational Database Model (Ramakrishnan & Gehrke, 2000) ........... 6
Figure 2:2 A design for Primary Key ...................................................................... 7
Figure 2:3 A design for Foreign Key ....................................................................... 7
Figure 2:4 The diagrams show the process of normalization (Dharaneeswaran) 8
Figure 2:5 Show an example of ERD (Jones, 2006) .............................................. 14
Figure 2:6 Shows the difference between ACID and BASE (McPhillips, 2012) .... 16
Figure 2:7 Show RDB data model and Document data model (Kammerer &
Nimis, 2014) .............................................................................................................. 17
Figure 2:8 JSON format of Data in the form of Document Structure (Lamllari,
2013) .......................................................................................................................... 18
Figure 2:9 Shows the Key and Values (Manoj, 2014) .......................................... 18
Figure 2:10 Shows JSON document (Manoj, 2014) ............................................. 20
Figure 2:11 Shows Super Column (Manoj, 2014)................................................. 20
Figure 2:12 Shows Column Family database (davevalz, 2013) ........................... 21
Figure 2:13 Shows the difference between RDB and Columnar database (davevalz,
2013) .......................................................................................................................... 22
Figure 2:14 Shows Graph Database using Neo4j (Hunge, 2016) ........................ 23
Figure 2:15 Shows CAP Theorem (Lamllari, 2013) ............................................. 25
Figure 2:16 Shows JSON-LD Data Type (Presbrey, 2014) .................................. 27
Figure 2:17 Shows Simple Couch DB Document (Södergren & Englund, 2011)
................................................................................................................................... 33
Figure 3:1 Performance Comparison Line Graph ................................................ 40
Figure 3:2 Hardware Review ................................................................................. 41
Figure 3:3 Installed hardware review in the PC ................................................... 41
Figure 3:4 Shows the sharding process ................................................................. 45
Figure 3:5 Shows example of JSON document ..................................................... 46
Figure 3:6 Shows example of field in NoSQL ....................................................... 47
Figure 3:7 Shows example of SQL join relations and NoSQL relation. ............. 47
Figure 3:8 Shows the summarization of the SQL to NOSQL migrating process.
................................................................................................................................... 48
Figure 3:9 Shows the map reduce query process. ................................................ 48
Figure 3:10 Development Process .......................................................................... 52

Figure 3:11 Data Flow Diagram for SQL to JSON-LD ....................................... 52
Figure 4:1 Shows list of databases in MySQL Workbench and MySQL ............... 54
Figure 4:2 Show interface of MongoDB and list of the same database in MySQL
................................................................................................................................... 54
Figure 4:3 Mongify is the intermediate for migrating SQL to MongoDB ......... 55
Figure 4:4 Perform configuration between MySQL database and MongoDB .. 55
Figure 4:5 Performing testing on Mongify connection ........................................ 56
Figure 4:6 Diagram show the source code for translating SQL to MongoDB ... 56
Figure 4:7 Diagram show the translation process between SQL to MongoDB . 57
Figure 4:8 Diagram show data after performing translation into MongoDB which
stored in document database .................................................................................. 57
Figure 4:9 Perform query for specific document database ................................. 58
Figure 4:10 Diagram show document database which in JSON format ............ 58
Figure 4:11 Robomongo represents the document database in a GUI ................. 59
Figure 4:12 Standard RDB mapping approach .................................................... 60
Figure 4:13 Successfully generated standard mapping ....................................... 60
Figure 4:14 Custom relational database mapping ............................................... 61
Figure 4:15 Show the interface for manual mapping process between relational
database and Ontology ............................................................................................ 61
Figure 4:16 Configuring database for MySQL database .................................... 62
Figure 4:17 Show the manual mapping process between relational database and
Ontology ................................................................................................................... 63
Figure 4:18 Show dump to RDF datasets .............................................................. 64
Figure 4:19 Query RDF to file ................................................................................ 64
Figure 4:20 Starting the server .............................................................................. 65
Figure 4:21 Shows that the D2R server has started ............................................ 65
Figure 4:22 Show the relation between classes, property, subject predicate and
object ......................................................................................................................... 66
Figure 4:23 Shows the difference in results between standard mapping and custom
mapping .................................................................................................................... 67
Figure 4:24 Shows the difference between standard and custom mapping in
terms of SPARQL query .......................................................................................... 67
Figure 4:25 Show the SQL data into OntoGraf as a result ................................. 69
Figure 4:26 Show Ontograf .................................................................................... 69
Figure 4:27 Show the relationship between each table in the OWLViz form ... 70
Figure 4:28 Object property for SQL database.................................................... 70
Figure 4:29 Show the Entities of SQL data ........................................................... 71
Figure 5:1 Show how the data is stored in the different migration process ....... 72
Figure 5:2 OWL data is represented subject, predicate, and object in JSON
visualization.............................................................................................................. 72
Figure 5:3 RDF represent the data in the form of OntoGraf .............................. 73
Figure 5:4 RDF represent the data in the form of subject, predicate and object in
OntoGraf .................................................................................................................. 73

LIST OF ABBREVIATIONS/ SYMBOLS

RDB Relational Database


RDBMS Relational Database Management System
SQL Structured Query Language
NoSQL Not Only Structured Query Language
JSON JavaScript Object Notation
JSON-LD JavaScript Object Notation Linked Data
CAP Consistency, Availability, Partition Tolerance
W3C World Wide Web Consortium
GIS Geographic Information System
RDM Relational Database Model
DBMS Database Management System
FYP Final Year Project
ANSI American National Standards Institute
ISO International Organization for Standardization
DDL Data Definition Language
DML Data Manipulation Language
ERD Entity Relationship Diagram
BASE Basically Available, Soft State, Eventual Consistency
XML Extensible Mark-up Language
RDF Resource Description Framework
HTTP Hypertext Transfer Protocol
FTP File Transfer Protocol
TFTP Trivial File Transfer Protocol
IPV6 Internet Protocol Version 6

LIST OF APPENDICES

Appendix-A: Weekly Report .................................................................................. 79


Appendix-B: CD Cover ........................................................................................... 80
Appendix-C: Source Code ...................................................................................... 81

CHAPTER 1

INTRODUCTION

1.1 Overview

In the current period, the growth of internet and PC usage all over the world
continues to increase. In the past 10 years the number of internet users has increased
by 28.5%. Compared with earlier times, people nowadays have embraced computing:
when the personal computer was first introduced, people were wary of the technology,
but today it is widely accepted, which shows how far technology has grown to meet
people's needs. A Relational Database Management System (RDBMS) is a program
used all over the world to store data. The relational database was developed by
E. F. Codd in 1969. IBM, Oracle, and Microsoft successfully offer relational database
services worldwide, but for larger databases users need to pay more, which makes
them less cost-effective. As more people began using the internet, performance issues
emerged in the relational database. In addition, the relational database suffers
performance problems because it is not easily scalable. Many organizations therefore
looked for a solution to the performance and scalability problems.

NoSQL is the database technology that fills this gap in the database market, and
it is capable of managing data at large scale. NoSQL database systems came into use
in the Web 2.0 era. As a result, NoSQL has become the new database that competes
with the RDB. Compared to the RDB, the NoSQL database has become more popular
because it supports both structured and unstructured data, whereas the RDB supports
only structured data. Moreover, a NoSQL database system can handle huge amounts
of data by distributing it across many storage devices at low cost. NoSQL databases
favour horizontal scaling: simple operations are distributed across many servers,
records can be placed on different servers, and data is replicated and partitioned among
them. Additionally, a NoSQL database can dynamically add new fields to records.
Compared with SQL, NoSQL is popular because it supports unstructured data,
semi-structured data, and schema-less storage; being schema-less, it can span many
servers and perform tasks faster than the RDB. Giant companies such as Facebook,
Twitter, and Google have adopted NoSQL databases for their availability and
scalability. Other key features of the NoSQL database are its shared-nothing
architecture, replication, and partitioning of data across many nodes. NoSQL database
systems do not support the ACID properties, with the exception of CouchDB, although
its support is limited. NoSQL databases are instead characterized by the CAP theorem,
under which a system can guarantee only two of the three properties. In addition,
NoSQL has no standard method for query processing, but its queries are faster than
SQL's for these workloads.

The World Wide Web Consortium (W3C) plays an important role after the SQL
data is migrated to NoSQL and stored in JSON and JSON-LD documents. JSON data
is highly scalable and self-describing; like NoSQL, it is schema-less. Finally, the focus
of this research is migrating SQL data into NoSQL documents stored in JSON-LD, in
order to show how JSON-LD data is supported over the web and what challenges are
faced in translating SQL to NoSQL.
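The schema-less storage described above can be sketched in a few lines of Python (the collection, field names, and values are invented for illustration): two documents in the same NoSQL-style collection may carry entirely different fields, which a fixed SQL table schema would not allow.

```python
import json

# Two documents in the same hypothetical collection: schema-less storage
# lets each record carry its own set of fields.
user_a = {"name": "Alice", "email": "alice@example.com"}
user_b = {"name": "Bob", "twitter": "@bob", "tags": ["admin", "beta"]}

collection = [user_a, user_b]

# Each document serializes independently; no shared schema is enforced.
for doc in collection:
    print(json.dumps(doc))
```

In a SQL table, by contrast, adding the `tags` field would require an ALTER TABLE applied to every row.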

Chapter 2 discusses what is meant by a relational database and how it works. It
also introduces NoSQL databases as an alternative to SQL and briefly explains the
terminology and concepts NoSQL uses. Furthermore, it describes JSON-LD, which
NoSQL databases use to store documents. Finally, it outlines some steps that can be
used for the migration process.

1.2 Problem Statement

In the current web, SQL plays the standard role in database representation.
However, the relational database behind SQL has many drawbacks: SQL is not highly
scalable and it performs poorly across multiple databases. To solve this problem, a
transformation of SQL into NoSQL should be carried out, along with a mapping
process from the SQL database to the NoSQL database. Documents in the NoSQL
database are stored as JSON documents. By plugging in a library, the JSON is
converted into JSON-LD, which is easier to use on the web.
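The JSON-to-JSON-LD step mentioned above can be illustrated with a minimal Python sketch. The helper name `to_jsonld` and the field-to-vocabulary mapping are assumptions made for illustration (schema.org terms are used as example IRIs); a real plugin library would perform a similar context attachment.

```python
import json

def to_jsonld(doc, context):
    """Return a copy of a plain JSON document with a JSON-LD @context
    attached. This mimics, in miniature, what a JSON-LD plugin library does."""
    out = {"@context": context}
    out.update(doc)
    return out

row = {"name": "Haritharan", "faculty": "FIST"}
# Map each field name to a vocabulary term (schema.org used for illustration).
ld = to_jsonld(row, {"name": "http://schema.org/name",
                     "faculty": "http://schema.org/department"})
print(json.dumps(ld, indent=2))
```

The `@context` is what turns plain JSON keys into linked-data terms that other web applications can interpret.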

1.3 Project Objectives

The objective of this research is migrating SQL data into JSON-LD. The specific
objectives are:

1. To investigate and compare migration tools from a MySQL database to a
NoSQL MongoDB database.
2. To implement the migration using an existing method and add semantics to
the JSON-LD.
3. To compare the SQL data and the JSON-LD data to verify that the results are
identical.

1.4 Project Scope

The scope of this thesis is migrating a relational database into a NoSQL
database and transforming it into JSON-LD. Upon migrating SQL to the NoSQL
database, the next step is representation: linking, fields, documents, and collections.
After the mapping is completed, the documents are stored in JSON or JSON-LD
depending on the plugin library. The mapping file is then manipulated as a graph to
obtain an identical result.
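The table-to-collection, row-to-document, column-to-field correspondence underlying this scope can be sketched as follows; `row_to_document` is a hypothetical helper written for illustration, not part of any migration tool.

```python
def row_to_document(columns, row):
    """Map one SQL row (a tuple) plus its column names into a NoSQL-style
    document, following the table -> collection, row -> document,
    column -> field correspondence."""
    return dict(zip(columns, row))

columns = ("id", "name", "price")
rows = [(1, "keyboard", 35.0), (2, "mouse", 12.5)]

# A "collection" is then just the list of converted documents.
collection = [row_to_document(columns, r) for r in rows]
print(collection[0])
```

Real tools such as Mongify automate this mapping and additionally handle types, indexes, and joined tables.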

CHAPTER 2

LITERATURE REVIEW

2.1 Introduction

This literature review discusses current database trends and the Web 2.0-era
database technology, the NoSQL database. It then introduces JSON-LD, which
NoSQL databases use to store data. The current industry trend is that companies such
as Facebook, Twitter, and Netflix have started using NoSQL databases, so some
existing vendors have also started migrating to NoSQL. The next chapter discusses
the tools used for migration and, in depth, the properties used by NoSQL databases.
The objective of this research is to analyse the relational database, SQL, NoSQL, and
JSON-LD. With this in mind, the literature review gives brief information on how
these work, collected from recent articles, journals, and research papers.

2.2 What is RDB?

A Relational Database (RDB) is a database whose data is organized into a
collection of relations. The idea of "relational" is inspired by the mathematical concept
of a relation. Before relational concepts were implemented, several other approaches
were practised, but none were as successful as the relational database. A relational
database is a collection of one or more relations, each consisting of columns, or
attributes, and rows of data known as tuples. The relational model was developed by
Edgar Frank Codd in the early to mid-1970s. A Relational Database Management
System (RDBMS) is the software used to maintain a relational database. Relational
database development took place between 1976 and the early 1990s; the database is
made of two-dimensional arrays of rows and columns that can be recombined across
relations, which gives great flexibility in handling data. The structured data method
was introduced in 1986 by Burroughs.

According to (Elmasri & Navathe, 2009), a Geographic Information System
(GIS) stores data in two-dimensional tables and, through this process, can relate
multiple data elements of a data set. Dynamic data is used operationally; for example,
ordinary organizations, foundations, and associations use such data to store records
and also to modify and maintain them. Static data is used analytically, to track
historical databases. Inconsistent and redundant information were problems associated
with an early database model referred to as hierarchical tables, which represent
parent-child relationships; network database models are equally complicated. Both
were helpful stepping stones in developing the relational database model.

Based on (Hernandez, 2013), Dr. E. F. Codd founded the new mathematical
theory that underpins the relational database. The theory represents the relational
database model (RDM) in terms of decisions whose outcomes can be forecast, making
the RDM risk-free and predictable. The theory is widely appreciated and has
strengthened the model's standing in the information technology environment.
(Bonham-Carter, 2014) stated that structured data is a form of records and tuples in a
dataset, so that the relationships between different entities and attributes may be used
for data access and transformation.

2.2.1 Relational database model

The relational data model stores dataset information using relations, or tables.
The database management software allows storing, accessing, modifying, deleting,
and retrieving the information held on the server, and the database model is also used
by other application software. The relational database model can be identified by its
main mechanisms: the entity, tuple, attribute, and domain.

Figure 2:1 Relational Database Model (Ramakrishnan & Gehrke, 2000)

The most important data integrity features in the relational database model are
the candidate key, primary key, and foreign key; relations are the tables, tuples are the
rows, and attributes are the columns. A candidate key refers to one or more columns
that uniquely identify rows in a table. A table can have more than one candidate key,
but if there is only one, it is assigned as the primary key of the table. A primary key is
a candidate key that has no repeated values and no NULL values; it unambiguously
identifies every row in the table, so it is chiefly used for record lookup.

The primary key of a table is both a super key and a candidate key. It is possible
to have more than one choice of candidate key in a particular table; in that case, the
selection of the primary key is driven by the designer's choice or by user requirements.
A primary key is shown below in Figure 2.2. A foreign key is usually a field in one
table that refers to the primary key of another table with which the first has a
relationship. A primary key column cannot be NULL: it must contain a unique,
non-null value for each row, which is known as entity integrity. A foreign key value
must refer to an existing row in the parent table, so that no row can be added that does
not match an entry in the parent table; this is known as referential integrity. A foreign
key is shown below in Figure 2.3.

Figure 2:2 A design for Primary Key

Figure 2:3 A design for Foreign Key
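Entity integrity and referential integrity, as described above, can be demonstrated with SQLite (used here purely for illustration; the table and column names are invented). The second employee insert fails because its foreign key does not refer to an existing parent row.

```python
import sqlite3

# In-memory database; foreign-key enforcement is off by default in SQLite.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")

con.execute("CREATE TABLE department (deptID INTEGER PRIMARY KEY, name TEXT)")
con.execute("""CREATE TABLE employee (
    empID  INTEGER PRIMARY KEY,                   -- entity integrity: unique, non-null
    deptID INTEGER REFERENCES department(deptID)  -- referential integrity
)""")

con.execute("INSERT INTO department VALUES (1, 'Security Technology')")
con.execute("INSERT INTO employee VALUES (100, 1)")   # parent row exists: accepted

# A child row pointing at a non-existent parent violates referential integrity.
try:
    con.execute("INSERT INTO employee VALUES (101, 99)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

Only the valid employee row survives; the database itself enforces the parent-child rule rather than leaving it to application code.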

Another important aspect of the relational database is normalization.
Normalization is the method of decomposing a large table into smaller ones. It aims
to eliminate redundant information and duplicate records, to avoid problems when
updating, deleting, and retrieving records. During the normalization procedure,
structures are tested against normal forms and modified if any of the previously
mentioned problems are found. A normal form is a specific set of rules used to test a
table structure for such problems. The commonly used forms are First Normal Form
(1NF), Second Normal Form (2NF) and Third Normal Form (3NF).

First Normal Form (1NF): Repeating groups of data are eliminated by creating
a new table for each group of related data, and each such table must be identified by
a primary key.

Second Normal Form (2NF): The table must already satisfy 1NF; where the
primary key comprises several columns, every non-key column must depend on the
whole primary key and not on any other non-key values.

Third Normal Form (3NF): A table already in 1NF and 2NF is considered for
3NF, which requires that no non-primary-key column be determined by another
non-primary-key column. For example, in a Products table with networkID as the
primary key, discountRate should not be a column in the table if it depends on
productPrice, which is not part of the primary key.

Figure 2:4 The diagrams show the process of normalization (Dharaneeswaran)
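The 3NF example above (networkID, productPrice, discountRate) can be sketched with SQLite; the data values are invented for illustration. Because discountRate depends on productPrice rather than on the primary key, it is moved into its own table and recovered by a join.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# 3NF decomposition of the Products example: discountRate depends on
# productPrice (a non-key column), so it is moved to its own table.
con.execute("CREATE TABLE Products (networkID INTEGER PRIMARY KEY, productPrice REAL)")
con.execute("CREATE TABLE DiscountRates (productPrice REAL PRIMARY KEY, discountRate REAL)")

con.execute("INSERT INTO Products VALUES (1, 100.0)")
con.execute("INSERT INTO Products VALUES (2, 100.0)")   # same price, one rate row
con.execute("INSERT INTO DiscountRates VALUES (100.0, 0.10)")

# The rate is recovered by a join, with no duplicated rate values to update.
rows = con.execute("""SELECT p.networkID, d.discountRate
                      FROM Products p JOIN DiscountRates d
                      ON p.productPrice = d.productPrice""").fetchall()
print(rows)
```

If the rate for a price changes, it is updated in one row of DiscountRates instead of in every matching product row.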

2.2.2 Relational Database Terminology

(Hernandez, 2013) describes four categories of terms used in the relational
database: value-related, structure-related, relationship-related, and integrity-related.

Value Related Terms:

The data stored in the database is static until some modification is made to it.
A NULL value in a database represents an empty value, or a value that cannot be
evaluated by the relational database algorithms.

Structured Related Terms:

 In the relational database, the table is one of the most important structures; it
consists of entities, attributes and related information. The physical
arrangement of the table is not supposed to matter. A table usually represents
a single subject, which can be an object or an event. A field, also known as a
column, is the smallest structure in the relational database; each field should
hold a single value, and multivalued fields indicate a design problem.
 A record, also known as a row or tuple, represents one instance of the table's
subject and is identified by a unique identifier known as the primary key.
Operations on records are implemented through saved SQL queries.

Relationship-related Terms:

A relationship establishes an association between the entities of two tables.
It is the combination of the primary key in one table and the foreign key in another
that brings a relationship into existence.

Types of Relationships:

 One-to-One: A one-to-one relationship exists between two tables when each
record in the first table is related to at most one record in the second table,
and vice versa.
 One-to-Many: A one-to-many relationship exists between a pair of tables when
one record in the first table can be related to additional records in the second
table, but each record in the second table is associated with just one record in
the first table.
 Many-to-Many: A many-to-many relationship exists between a pair of tables
when one record in the first table is associated with one or more records in the
second table, and one record in the second table is related to one or more
records in the first. Establishing a direct relationship between these two tables
is difficult because it would produce a large amount of redundant data in one
or both of the tables; in practice the relationship is resolved through a third,
linking table.
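The resolution of a many-to-many relationship through a linking (junction) table can be sketched with SQLite. The Students, Courses and Enrolments tables here are invented for illustration; the junction table holds only pairs of foreign keys.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE Students (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Courses  (id INTEGER PRIMARY KEY, title TEXT);
-- Junction table: resolves many-to-many into two one-to-many relationships
CREATE TABLE Enrolments (
    student_id INTEGER REFERENCES Students(id),
    course_id  INTEGER REFERENCES Courses(id),
    PRIMARY KEY (student_id, course_id)
);
""")
cur.execute("INSERT INTO Students VALUES (1, 'Hari')")
cur.executemany("INSERT INTO Courses VALUES (?, ?)",
                [(10, "Databases"), (20, "Security")])
cur.executemany("INSERT INTO Enrolments VALUES (?, ?)", [(1, 10), (1, 20)])

# All courses for one student, recovered through the junction table
titles = [r[0] for r in cur.execute("""
    SELECT c.title FROM Courses c
    JOIN Enrolments e ON e.course_id = c.id
    WHERE e.student_id = 1
    ORDER BY c.id
""")]
print(titles)
```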

2.2.3 ACID Properties and Distributed Databases

A relational database is managed by a database management system (DBMS),
and a distributed database is likewise kept under DBMS control. In a distributed
database the devices are not all connected to a common CPU; the data may be spread
over machines with different CPUs linked by a network. Two processes are used to
distribute a database: replication and duplication. Replication uses specialized
software to propagate changes across the distributed database: after changes are made
to the source database, the replication process makes every other database identical to
the source. This can be performed one or more times, and in some configurations a
change made on a destination database causes a change on every other database,
including the source; replication therefore requires considerable computing resources.
Duplication, in contrast, is performed from a source or master database: the database
is simply copied, and changes made to the duplicated data do not affect the source
database. Without a well-defined notion of a transaction, a distributed database can
be left in a broken state; the accepted solution to this problem is the set of ACID
properties. (Yu, 2009) stated that a distributed database with the ACID properties
becomes reliable. ACID stands for Atomicity, Consistency, Isolation and Durability.

Atomicity:

 In a database management system (DBMS), atomicity ensures that a
transaction is performed on an "all or nothing" basis: if any part of the
transaction fails, the entire transaction fails and the database is left
unchanged. The system must guarantee atomicity in every case.

Consistency:

 Transactions guarantee database consistency: a transaction that completes
successfully takes the database from one consistent state to another.

Isolation:

 Every running transaction is independent of the other concurrent
transactions, and none of them sees the changes made by another incomplete
transaction.

Durability:

 Durability states that once a transaction is committed, its effects are
guaranteed to persist even in the case of subsequent failures. This means
that once clients are informed of success, the transaction will be held on to;
it cannot be undone and it survives failure.
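The atomicity property can be illustrated with SQLite, whose connection object rolls back a transaction when an exception occurs inside it. This is only an illustrative sketch with an invented accounts table, not part of the cited discussion:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance REAL)")
con.execute("INSERT INTO accounts VALUES ('A', 100.0), ('B', 0.0)")
con.commit()

def transfer(con, amount, fail=False):
    # "with con" is atomic: commits on success, rolls back on any exception
    with con:
        con.execute("UPDATE accounts SET balance = balance - ? WHERE name='A'",
                    (amount,))
        if fail:
            raise RuntimeError("simulated crash mid-transfer")
        con.execute("UPDATE accounts SET balance = balance + ? WHERE name='B'",
                    (amount,))

try:
    transfer(con, 50, fail=True)   # fails after debiting A
except RuntimeError:
    pass

# All or nothing: the partial debit was rolled back
balances = dict(con.execute("SELECT name, balance FROM accounts"))
print(balances)
```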

2.2.4 The SQL Language

Structured Query Language (SQL) is a database language recommended for
database management systems (DBMS) and used for defining, storing, retrieving and
manipulating data. (Halvorsen, 2016) described SQL as a standard language,
developed by IBM, for querying, altering and defining relational databases using
declarative statements. Even though SQL is a standard, many database systems
nowadays implement their own version of the SQL language. In this document,
Microsoft SQL Server is used as the example. There are many different database
systems and database management systems, such as Microsoft SQL Server, Oracle,
MySQL, Microsoft Access, IBM DB2 and Sybase. The name SQL was originally an
abbreviation of Structured Query Language; the official title is now Database
Language SQL. The American National Standards Institute (ANSI) officially
approved SQL as a database language, it was accepted by the International
Organization for Standardization, and the standard has evolved and continues to
do so.

According to (Yu, 2009), database management systems vary in their
conformance to Structured Query Language. It is vital to know that SQL compliance
does not guarantee portability of a database from one DBMS to another. This is
because DBMS vendors commonly include enhancements that are not required by the
SQL standard, but not ruled out by it either, and most products are not completely
compliant with the standard.

Data Definition Language (DDL) & Data Manipulation Language (DML):

The structure of SQL is defined by a Data Definition Language and a Data
Manipulation Language. (Halvorsen, 2016) stated that the most essential DDL
statements are CREATE, ALTER, RENAME and DROP: CREATE makes an object
(a table, for instance) in the database; DROP erases an object from the database,
typically irreversibly; and ALTER modifies the structure of an existing object in
different ways, for instance by adding a column to an existing table. The Data
Manipulation Language (DML) is the subset of SQL used to add, delete and update
information.
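The DDL/DML distinction can be sketched with Python's built-in sqlite3 module (the staff table and its columns are invented for illustration): CREATE and ALTER are DDL, while INSERT and UPDATE are DML.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# DDL: define and alter the structure of objects
cur.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("ALTER TABLE staff ADD COLUMN salary REAL")  # add a column

# DML: manipulate the rows inside that structure
cur.execute("INSERT INTO staff (name, salary) VALUES ('Aisha', 3500.0)")
cur.execute("UPDATE staff SET salary = 4000.0 WHERE name = 'Aisha'")
row = cur.execute("SELECT name, salary FROM staff").fetchone()
print(row)

# DDL again: DROP removes the object itself, typically irreversibly
cur.execute("DROP TABLE staff")
```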

Entity Relationship Diagrams are a major data modelling tool that helps
organize the data in a project into entities and define the relationships between those
entities. It is one of the methods established to produce a good database structure so
that the data can be stored and retrieved in the most effective manner. An Entity
Relationship Diagram (ERD) is a conceptual model of the structure of a system;
because it represents only structural features, it gives a static view of the system.

Components of an ERD:

 Entity: The entity is the main component of a table because it represents the
subject to which the data relates, and the relationships of the table are
composed based on the entity. Examples include city, student and hostel.

 Relationship: A relationship is based on the entities; it can connect one entity
to many others. For example, an employee may process many payments, or
might process none, depending on the nature of that employee's job.

 Attribute: Attributes are the properties of an entity and represent the actual
content of the table. For example, name, employee number and rate of pay are
all attributes of the entity employee, and the employee number could serve as
the primary key for an employee.

Figure 2:5 Shows an example of an ERD (Jones, 2006)

2.3 The NoSQL Language

Nowadays one of the leading technologies in the information technology (IT)
world is the relational database management system (RDBMS). The relational model
underlying the RDBMS was developed by E. F. Codd in 1970 and came to rule the
business world. Different database model concepts were implemented before it: the
hierarchical model and the network data model were introduced in the 1960s. The
hierarchical database model records the data in a tree structure, with the records in
parent-and-child relationships. The network model allows multiple parents and
children and is navigated as a graph; it originates with Charles Bachman and was
standardized in 1969 by the Conference on Data Systems Languages (CODASYL)
consortium. After the introduction of semantic Web 2.0 applications, NoSQL
databases became a more scalable and reliable solution than a relational database for
data storage. (McPhillips, 2012) stated that "Transitioning an application from a
relational environment to NoSQL environment presents numerous difficulties as far
as diagram outline and information access techniques".

The relational database does have the capacity to process and manage a large
amount of data, and taking care of large databases is obligatory for IT companies.
Nevertheless, an RDBMS is not suitable for managing very large amounts of data on
clustered computing, and to solve this kind of problem another approach, NoSQL, was
introduced to mitigate some of the issues that are not handled well by an RDBMS.
Carlo Strozzi introduced "Strozzi NoSQL", an open-source relational database, and
the term NoSQL was first used in 1998. The name NoSQL ("Not only SQL") also
reflects the combination and collaboration of many plugins and libraries through an
Application Programming Interface (API).

The NoSQL database is completely different from the relational database. A
NoSQL database can be used with newer query languages such as SPARQL and
Cypher; there is no single standard for implementing a new database, which makes
NoSQL flexible, fast and scalable. Most NoSQL systems meet the requirements of
users with more than terabytes of data. A NoSQL database also does not rely on
primary keys, foreign keys or candidate keys, and it does not store the data in tables;
the data in NoSQL is stored in documents, most commonly in JSON format.
(McPhillips, 2012) describes several key elements of NoSQL: automatic sharding,
horizontal scaling, schema-less design, high availability and simple queries.

Horizontal scaling improves the performance of a NoSQL database over the
dataset: if a new server is added to the database, the resources are automatically shared
equally among the distributed servers.

Schema-less means that, for the most part, there is an immediate similarity
between this "schema-less" style and dynamically typed languages. Constructs such
as those above are easy to represent in PHP, Python and Ruby; what NoSQL attempts
to do is make this mapping to the database natural.

High availability means the information is replicated over various servers in a
highly available design that can handle multiple server failures and support disaster
recovery. The simple query process does not support the SQL query language; instead
it provides alternative methods that are simpler than an SQL query. NoSQL works
with unstructured and semi-structured data. It also replicates asynchronously, a
procedure for copying information between databases from the primary storage;
asynchronous replication has the benefit of speed, at the expense of an increased risk
of data loss due to communication or replica system failure. (Mughees, 2014)
explained that a relational database makes use of the ACID properties, whereas
NoSQL databases use the BASE concept. BASE is an acronym for Basically
Available, Soft state and Eventually consistent. The BASE concepts are implemented
in NoSQL databases and are highly recommended there. Data in a relational database
is consistent, while NoSQL data is not guaranteed to be consistent; the BASE concept
and the ACID properties are two different concepts, and BASE is often called the
ACID alternative. The diagram below shows the differences between ACID and
BASE.

Figure 2:6 Shows the difference between ACID and BASE (McPhillips, 2012)

2.3.1 Document Store (DS)

In a NoSQL database the data is stored in semi-structured and unstructured
form, and the document store (DS) is used to hold the database information. A
relational database uses tables to store the information and SQL for querying, but a
NoSQL database does not contain any tables; it uses a document store to hold the data.
The DS does not impose any schema: it is schema-independent, which makes a
programmer's life easier and reduces integrity issues, since a field that would hold a
null value is simply not made part of the document. In a NoSQL database, every stored
document automatically or manually receives a unique identifier (UID); the UID
serves the same function as a primary key in an RDBMS. In a distributed network, the
document store is especially useful for web-based applications. Figure 2:7 (Kammerer
& Nimis, 2014) below shows the difference between an RDBMS and a DS.

Figure 2:7 Shows the RDB data model and the document data model (Kammerer &
Nimis, 2014)

Figure 2:8 JSON format of Data in the form of Document Structure (Lamllari,
2013)
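The contrast between a flat relational row and a self-describing JSON document can be sketched as follows (the fields are invented for illustration); nested and multi-valued fields need no separate join table in the document form.

```python
import json

# One relational "row": flat, its meaning fixed by the table schema
row = (1, "Hari", "FIST", "MMU")

# The same entity as a self-describing document: schema-less and nestable
doc = {
    "_id": 1,                                       # UID, like a primary key
    "name": "Hari",
    "faculty": {"code": "FIST", "university": "MMU"},  # nested structure
    "courses": ["Databases", "Security"],           # multi-valued, no join table
}

# Documents travel as JSON text and round-trip without loss
text = json.dumps(doc)
decoded = json.loads(text)
print(decoded["faculty"]["code"])
```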

2.4 Key-Value Stores

These systems use a different database model from SQL: NoSQL can use a key-
value database to store the data in a system. (Lamllari, 2013) stated that Amazon was
the first organization to use a key-value database, and that Amazon used the
DynamoDB model to implement key-value pairs.

The key-value database is the basis for many NoSQL databases. The keys in a
key-value database are generated automatically, or a user can set them manually based
on the user's requirements, but most commonly they are generated automatically. The
values are based on the user's input and are always addressed through their key. A
key-value database uses hash tables to store the data. Queries can therefore be
performed, but only from the key (source) to the value (destination); the lookup cannot
be performed in the other direction.

A key-value store also accepts any type of data, such as arrays, integers and
characters, depending on the architecture of the system. In a key-value store we can
also perform the four fundamental operations: instantiate, read, write and delete.
Figure 2:9 shows an image of a key-value database.

Figure 2:9 Shows the Key and Values (Manoj, 2014)


Based on the figure, the keys and values are shown in separate columns. The
key must be unique, and an id is generated for it automatically. The value column
shows the data entered by the user.
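The four fundamental operations of a key-value store can be sketched with a toy Python class backed by a hash table. This is an illustrative in-memory model, not a real key-value database:

```python
class KVStore:
    """Toy key-value store: the four basic operations named above."""

    def __init__(self):
        self._data = {}       # hash table, as in real key-value stores
        self._next_id = 0

    def create(self, value):  # instantiate: auto-generated unique key
        self._next_id += 1
        self._data[self._next_id] = value
        return self._next_id

    def read(self, key):      # lookup works only key -> value
        return self._data.get(key)

    def write(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)

kv = KVStore()
k = kv.create({"name": "Hari"})          # values can be of any type
kv.write(k, {"name": "Hari", "age": 23})
print(kv.read(k))
kv.delete(k)
print(kv.read(k))                         # deleted keys read back as None
```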

2.4.1 Columnar database

(Manoj, 2014) described the columnar database as one of the most common
approaches used for querying in a NoSQL database. The columnar database is also
known as a column-oriented database. The first organization to implement a columnar
database was Google; the approach, used on a very large scale for structured-data
storage, was popularized by Google Bigtable, and other organizations were inspired
by the method and started to implement it. Facebook's messaging is also backed by a
columnar database, with the functionality provided by HBase; HBase supports
processing thousands of messages. This approach has been adopted for NoSQL
databases, where queries can address the data column by column rather than scanning
every row. For example, in a relational database, if a user performs a query on a large
database it takes a long time because the query scans all rows and columns; in a
columnar NoSQL database the query touches only the relevant rows and columns,
which makes the process faster than in a traditional database. The columnar database
does not store the data in tables but in distributed models. A columnar database can
be divided into three ideas: the column, the super column and the column family.

2.4.2 Column

In a columnar database, all values are stored in column format, and the rows
are always paired as keys and values. The best-known systems using the column
approach are Google BigTable, HBase and Cassandra, with BigTable as the original.
Below is an example in JSON notation.

Figure 2:10 Shows JSON document (Manoj, 2014)

2.4.3 Super Column

NoSQL also includes the super column, an object contained by column
families. In a super column, the key-value pairs perform a mapping between one
another within the column families, so that a user can view a number of tables in a
way similar to a relational database; NoSQL does not support a view function as such,
but the effect is similar to a view on a table. The mapping process may be interrupted
by similar fields such as name, age and first name, but the process continues. A super
column reads whole columns and rows and is generated in two subsystems in memory;
a super column has the same function as a composite column in NoSQL. Columns
and super columns differ in their key-values: normally the column values are of int or
string type, but super column values are different, being produced by the mapping
process. The values of the column are also sorted into an array during the process.
Below is a simple example of a super column.

Figure 2:11 Shows Super Column (Manoj, 2014)

2.4.4 Column Family

According to Muhammad Mughees (2014), the column family database was
introduced by Google BigTable and is now widely used all over the world. The
column and super column take part in the columnar database in a manner similar to a
column family. The column family is similar to a relational database, but the concepts
are different: in a relational database the data is stored in rows, whereas in NoSQL the
data is stored in a column family database, although the same searching methods can
be performed in both. In a column family, the data items are identified logically but
are physically stored together. A column family accepts keys of arbitrary size, and it
accepts any type of data in its columns.

NoSQL data stored in a column family database can associate many columns
with a single row. For example, a user can access their Facebook privacy settings at
the same time as they access the Facebook profile of a friend. A relational database
holds the bulk of the data in rows, but in NoSQL the data is stored as rows with
multiple columns, and a further difference is that different rows need not have the
same columns.

Figure 2:12 Shows Column Family database (davevalz, 2013)

The diagram above shows an example of a column family database. The keys
illustrate the difference between the relational database and NoSQL: a key can be
arbitrarily large, and a column can itself contain further columns and rows, in effect a
"table within a table". A relational database would create another table and link it with
a primary key and foreign key, but a column family contains it all in a single column.
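A column family can be modelled as nested maps, from row key to a map of columns, where different rows carry entirely different columns. The row keys and column names below are invented for illustration:

```python
# Column family as nested maps: row key -> {column name -> value}.
# Different rows may hold entirely different columns ("table within a table").
users = {
    "user:1": {"name": "Hari", "city": "Cyberjaya"},
    "user:2": {"name": "Mei", "email": "mei@example.com", "age": 30},
}

# Reads address a row key and then individual columns, not whole rows
print(users["user:2"]["email"])

# No NULL padding: a missing column is simply absent from that row
print("email" in users["user:1"])
```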

Figure 2:13 Shows the difference between an RDB and a columnar database
(davevalz, 2013)

A huge advantage here is that related information is kept closer together, which
makes data retrieval fast. However, attempting to aggregate this kind of data becomes
difficult and requires a MapReduce job.

Much like key-value stores, the two major advantages of column stores are
horizontal scaling and the absence of a schema requirement. Even so, with column
stores, a good understanding of the tables and the key structure, defined prior to
development, helps to implement a useful data store.

2.4.5 Graph Database

A graph database is a database with a graph structure; it contains nodes, edges
and properties. A graph database consists of many entities and can have more than
one relationship between the nodes. Nodes are also known as entities and they also
carry properties; in an application, the nodes are the instances. Each relationship has
its own properties in a graph database. The graph database is similar to the relational
database, but a graph database is faster than an RDBMS at join-like traversals: the
relationship between nodes is not computed at query time but is actually persisted as
a relationship, and traversing persisted connections is faster than calculating them for
each query.

A graph database also uses shortest-path algorithms to calculate short paths,
and it is scalable. There can be different types of relationship between the nodes, and
there is no limit on the number of relationships between them. Every relationship in a
graph database starts and ends at a node, but the relationships can be of different types.
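A shortest path over persisted relationships can be sketched with a breadth-first search over an adjacency list. This is an illustrative in-memory model (with invented node names), not a real graph database:

```python
from collections import deque

# Adjacency list: edges are stored directly, so traversal follows them
# instead of computing joins at query time
graph = {
    "Alice": ["Bob"],
    "Bob":   ["Carol", "Dave"],
    "Carol": ["Dave"],
    "Dave":  [],
}

def shortest_path(graph, start, goal):
    """Breadth-first search: returns the fewest-hops path, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(shortest_path(graph, "Alice", "Dave"))  # ['Alice', 'Bob', 'Dave']
```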

However, it is easier to perform queries against a graph database than against
a relational database, and graph databases also have their own query languages. It is
easy to broadcast any data in a simple way, and easy to add a new relationship to a
node, although a modification must be applied at every affected node and relationship.
Nowadays many graph databases are available; among the popular ones are Neo4j and
OrientDB. Some graph databases support only single-depth relationships or adjacency
lists, where you cannot traverse more than one level deep along the relationships.

Figure 2:14 Shows Graph Database using Neo4j (Hunge, 2016)

2.4.6 Cap Theorem

(Mughees, 2014) explained that the CAP theorem was proposed by Eric
Brewer in the year 2000. CAP stands for consistency, availability and partition
tolerance; these are three attributes, and three promises, of a distributed system, that
is, a system made up of multiple machines and multiple nodes communicating with
one another over the network.

The CAP theorem applies to distributed systems, which usually involve more
than one machine and the networking between the machines. The ACID properties,
by contrast, apply to local storage, and the meaning of C in ACID is not the same as
the meaning of C in CAP.

C – Consistency:

 If we write to one node and then read from another node, the read will return
what we just wrote or, to be more precise, it will return the most recent write.

A – Availability

 The promise that when we talk to a node it will respond, unless that node has
failed. Availability is not promised for a failed node, but if the node has not
failed it will respond.

P – Partition Tolerance

 When the network is partitioned, whatever other promises have been made
about the system, it will still keep those promises. A network partition occurs
when messages can no longer flow from one machine to another. This might
happen when we have set up two different data centres and the wide-area
connection between the two is severed. Figure 2:15 shows the CAP theorem
classification diagram.


Figure 2:15 Shows CAP Theorem (Lamllari, 2013)

2.5 What is JSON-LD?

(Presbrey, 2014) described linked data as one of the technologies that has risen
fastest in the IT world. The semantic web is one of the most common technologies
used today; it is built on the web itself, and many changes have been made to it under
the World Wide Web Consortium. The semantic web has gone through several
evolutions: semantic web 1.0, semantic web 2.0, semantic web 3.0, semantic web 4.0
and semantic web 5.0.

 Semantic web 1.0 is based on the "read-only" web and on viewing content.

 Semantic web 2.0 is based on personal web pages, portals, wikis and widgets
powered by the web; NoSQL was introduced during this period. It is also
known as the social web.

 Semantic web 3.0 is called the intelligent semantic web because it connects
the existing data with intelligent data for smarter uses. The linked data
approach was introduced during this evolution; it enables computing
resources to do more work by reading the web pages, and it is one of the ways
to establish standards on the web. Linked data is machine-readable across the
world.

Linked data improves the web: it empowers the individuals who publish to the
web and makes sure the data can be used. Data on the web is shared with a common
vocabulary and standard and published across the IT world. JavaScript Object
Notation (JSON) is the standard format used on the web for transmitting data. A
human can easily read and change JSON code, and it is easy to implement compared
with other approaches. JSON is based on JavaScript, but it is a language-independent
format; JSON was first introduced in the Web 2.0 era. JSON is an alternative to the
Extensible Markup Language (XML). JSON is lightweight and easy to process on the
web. It is also "schema-free", which means it is easy to implement on a web server
and quick to integrate with other sources. Facebook, Google services and Twitter now
use JSON to support the web. Compared with a relational database, JSON has some
disadvantages: it belongs to the still-developing NoSQL technology, and the options
for querying are limited.

JSON also does not have a standard method for querying. Besides that, JSON
does not satisfy the ACID properties. JSON represents six types of data, which can be
divided into two groups: the primitive types, comprising string, number, Boolean and
null, and the structured types, comprising object and array.

 String: the text must use the ' " ' symbol at the start and end.
 Number: any numeric value, integer or real.
 Boolean: true or false.
 Null: an empty value.
 Array: an ordered list of valid values separated by "," characters.
 Object: a collection of key-value pairs.

Figure 2:16 Shows JSON-LD Data Type (Presbrey, 2014)
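The six JSON data types can be demonstrated with Python's standard json module (the field names below are illustrative):

```python
import json

# One value of each JSON type
sample = {
    "string": "text",            # double quotes in the JSON text itself
    "number": 3.14,              # integers and reals share one number type
    "boolean": True,             # serialized as true/false
    "null": None,                # serialized as null
    "array": [1, 2, 3],          # ordered list of values
    "object": {"key": "value"},  # nested key-value pairs
}

encoded = json.dumps(sample)
decoded = json.loads(encoded)
print(decoded["array"])
```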

On the web, HTML was created mainly for humans, to make content user-
readable; on the other side, the format intended mainly to be computer-readable is
JSON. JSON is used in back-end processing, yet it is both human-readable and
machine-readable. RDFa and HTML can be used for expressing linked data, but not
in a standard way, which is why JSON-LD was introduced. JSON-LD is human-
readable and machine-readable, lightweight linked data. It depends on the already
successful JSON format and provides a way to help JSON data interoperate at web
scale. JSON-LD is an ideal data format for programming environments, REST web
services and unstructured databases such as CouchDB and MongoDB. JSON-LD is
similar to RDF, a successful factor in the World Wide Web, and JSON-LD is used
and integrated in the semantic web. Since JSON-LD is 100% compatible with
traditional JSON, developers can keep using their existing tools and libraries, and it
works naturally with REST services.

An RDBMS does not support the XML, JSON and JSON-LD formats, but it
is possible to migrate from an RDBMS to JSON-LD through MongoDB. The RDBMS
data can be transformed into Turtle or N-Triples form and then migrated to JSON; the
translation of an RDB to Turtle is close and similar to JSON-LD. The JSON-LD
syntax changed drastically in the most recent versions and now allows data to be
serialized in a way that is often indistinguishable from conventional JSON. This is
remarkable, since JSON is normally used to serialize tree-form, parent-child models,
whereas JSON-LD uses it to serialize a graph model.

The graph model in JSON-LD is represented by subjects, objects and
properties, which are similar to RDF. An edge from a subject node must end at another
node. The IRI is the internationalized extension of the URI; it is one of the standards,
published as RFC 3987. The subject nodes should be labelled with an IRI, and this is
the first requirement of JSON-LD; however, JSON-LD supports both labelled and
unlabelled nodes, so in some cases nodes that have not fully met the requirement are
still accepted, though not in all cases. The edges must be labelled with an IRI, which
is used to refer to another document. At the present time, most developers think that
migrating from JSON to JSON-LD requires considerable effort. In fact JSON and
JSON-LD share a common method, with JSON-LD adding the extra key values
@context and @id.

(Lanthaler & Gütl, 2012) described that although JSON-LD is similar to JSON,
most developers and users think JSON-LD is more complicated. The developer or user
can still carry on with the existing tools and methods, and JSON-LD fundamentally
supports the RDF method. JSON-LD was created to solve the problem of multiple
websites contributing the same data. For example, when the "name" data value on a
first web page is similar to that on a second web page, it causes an ambiguity problem.
To solve this kind of problem, JSON-LD uses global identification through @id and
@context. The migration of JSON to JSON-LD is made simply by adding the global
identifiers, or a reference via an API, to produce the output.
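That migration step can be sketched by adding @context and @id to an ordinary JSON object. The schema.org vocabulary terms and the example IRI below are illustrative choices, not prescribed by the cited sources:

```python
import json

# Plain JSON: "name" is ambiguous across different web pages
plain = {"name": "Haritharan", "homepage": "http://example.org/hari"}

# JSON-LD: @context maps each key to a global IRI and @id names the node,
# so every consumer resolves "name" to the same vocabulary term
linked = {
    "@context": {
        "name": "http://schema.org/name",
        "homepage": {"@id": "http://schema.org/url", "@type": "@id"},
    },
    "@id": "http://example.org/hari",
    **plain,                     # the original JSON keys are kept unchanged
}

print(json.dumps(linked, indent=2))
```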

2.6 Existing Migration Methods and Tools for Transforming SQL to JSON-LD

MongoDB:

(Dzhakishev, 2014) proposed that MongoDB is one of the tools used nowadays
for migrating a relational database to NoSQL, with documents stored in JSON or
JSON-LD format. MongoDB is open-source software, published by 10gen in 2007,
and C++ was used to develop it; the service is provided for Linux, Mac OS and
Windows. MongoDB is best known as a NoSQL database. A relational database table
can be transformed into JSON document format, and this is done by MongoDB. In
MongoDB the documents are stored as Binary JSON (BSON), and BSON is limited
to a size of 16 MB for each document. MongoDB data is stored in documents instead
of the rows and columns of a table. MongoDB automatically creates an _id for each
document; the id is similar to the primary key in a relational database.

The id must be unique and cannot be the same as any other id. MongoDB uses
key-value pairs to hold the data. Moreover, MongoDB can hold the data in memory
instead of requiring a large storage device for the system, which can also save the
money spent on buying large storage devices. MongoDB also supports the sharding
concept: sharding divides a large amount of data into smaller pieces, and by splitting
the data the processing can be done quickly and scalably. Usually the data in
MongoDB is partitioned using shard keys: a particular range of the key is defined as
a shard, and any key that falls within that range is assigned to that shard. MongoDB
also supports indexing and MapReduce. The MapReduce method was proposed by
Google in 2004; MapReduce reads a large amount of data very quickly by splitting it
up and distributing it over the networked computers. The main idea of MapReduce is
to divide the work and execute it on a cluster network, which is done by two functions,
map() and reduce().
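The two functions can be sketched with a classic word-count example. This is a single-process illustration of the pattern, not a distributed implementation:

```python
from collections import defaultdict

def map_phase(documents):
    """map(): emit a (key, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def reduce_phase(pairs):
    """reduce(): sum the counts emitted for each key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)

docs = ["json linked data", "linked data on the web"]
counts = reduce_phase(map_phase(docs))
print(counts["linked"], counts["data"])  # 2 2
```

In a real cluster the map outputs would be partitioned across machines before the reduce step; here both phases run in one process to show the shape of the computation.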

One of the benefits of using MapReduce with MongoDB is that the questions
asked over the document store are no more restricted than the possibilities of the
programming language in which the MapReduce pattern is implemented. MongoDB
includes all the JSON data types: Boolean, null, array, double and object. MongoDB
also uses special commands such as findAndModify, update-if-current and the
modifier operations.

(Cattell, 2011) described the modifier commands, which are used to set values, append or remove values, and retrieve and update records. The findAndModify command performs an immediate update and returns the stored document in a single fast operation. The update-if-current command applies a change only if the current value still matches the previously read value. MongoDB also provides a replication mechanism: it offers durability and concurrency over documents and synchronizes information across numerous servers, with master-slave replication as its replication system. MongoDB reads and writes data through a master and one or more slaves in the network; the master handles writes and updates, while reads can also be served from the slaves. Finally, the MongoDB server can publish JSON and JSON-LD text over HTTP.
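The update-if-current behaviour described above can be sketched as a compare-and-swap in plain Python. The in-memory store and the field name "stock" are hypothetical stand-ins, not MongoDB's actual internals.

```python
# Hypothetical in-memory store standing in for a document.
store = {"stock": 10}

def update_if_current(store, key, expected, new_value):
    """Apply the update only if the current value equals the value
    the client last read; otherwise reject the stale write."""
    if store.get(key) == expected:
        store[key] = new_value
        return True
    return False

assert update_if_current(store, "stock", 10, 9)      # matches: applied
assert not update_if_current(store, "stock", 10, 8)  # stale read: rejected
print(store)  # {'stock': 9}
```

This is why the command helps with concurrency: a writer that read an out-of-date value cannot silently overwrite a newer one.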

Google app engine:

Google App Engine is an integral part of the Google Cloud Platform. App Engine provides everything needed to build for the cloud, including platform and infrastructure as a service as well as big data, cloud storage, and much more. A cloud application should be able to scale at a moment's notice to handle huge demand. Using App Engine, we can build our application to run on top of Google's world-class infrastructure without worrying about provisioning and managing a data center that scales to meet our demand. (Zahariev, 2009) stated that with App Engine we can build scalable cloud apps with Python, PHP, or Go using our favorite development tools in a local environment for testing and debugging before deploying to the cloud; App Engine also supports Java with Eclipse, so we can likewise develop, test, and debug locally before deployment. App Engine offers multiple options for storing data, including relational databases, scalable file storage, and a lightweight datastore. App Engine makes developers more productive by reducing the need to write boilerplate code and by managing services such as task queues, while the Users API helps build world-class applications.

Couch DB:

(Cattell, 2011) describes Apache CouchDB as one of the best-known NoSQL databases, and it is also open source. CouchDB is mostly used for the web and stores JSON or JSON-LD documents. It traditionally saves plain JSON documents, but current practice is migrating toward JSON-LD. CouchDB is written in the Erlang programming language. Users access the documents via HTTP, for example from a web browser, and queries over the documents are written in JavaScript. Many mobile and modern web applications use CouchDB. CouchDB also provides a web administration console for the administrator, along with add, delete, and modify functions on the database documents, which are stored in JSON format.

CouchDB was released in 2005 and became well known in 2008 when it became an Apache project. CouchDB is completely different from a relational database because the data is not stored in related tables; instead, each record in the data set is stored as an individual document.
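The per-document revision idea that underlies CouchDB's document model can be sketched in plain Python. The field names `_id` and `_rev` follow CouchDB's convention, but the dict-based store and the example documents are invented for illustration.

```python
# Hypothetical in-memory "database": doc id -> latest document.
db = {}

def save(db, doc_id, body):
    """Store a new revision of the document instead of silently
    overwriting it; every save increments the revision number."""
    rev = db.get(doc_id, {}).get("_rev", 0) + 1
    db[doc_id] = {"_id": doc_id, "_rev": rev, **body}
    return rev

save(db, "order-1", {"item": "book"})
save(db, "order-1", {"item": "book", "qty": 2})
print(db["order-1"])
# {'_id': 'order-1', '_rev': 2, 'item': 'book', 'qty': 2}
```

Real CouchDB revisions are hash-based strings rather than integers, but the principle is the same: each update produces a new identified revision of the document.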

CouchDB documents can be written concurrently during edit, delete, and add operations without blocking. On the client side, a user does not have to wait for a read or write during concurrent processing. Each record carries special identifiers: a DocID and a sequence ID. Every record receives a new sequence ID from the database whenever a change or modification is made, and the update is stored simultaneously with the document. Documents do not split their data into the separate rows and columns of a relational database system. When documents are committed to disk, the document fields and metadata are packed into buffers, sequentially one document after another. CouchDB also supports a compaction method: when the database file exceeds a certain amount of wasted space, the compaction process takes over, clones all the active data into a new file, and then discards the old file. The database remains online the whole time, and all updates and reads are allowed to complete successfully. The old database file is deleted only when all of the data has been copied and all clients have transitioned to the new file.
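The compaction step described above can be sketched in plain Python. Here a "file" is modelled as an append-only list of (id, revision, data) entries; the entry contents are invented for illustration and do not reflect CouchDB's on-disk format.

```python
# Hypothetical append-only file with a stale revision of "a".
old_file = [
    ("a", 1, {"v": "stale"}),
    ("b", 1, {"v": "ok"}),
    ("a", 2, {"v": "current"}),  # newer revision appended later
]

def compact(entries):
    """Copy only the latest revision of each document to a new file,
    so the old file (with its wasted space) can be discarded."""
    latest = {}
    for doc_id, rev, data in entries:
        if doc_id not in latest or rev > latest[doc_id][0]:
            latest[doc_id] = (rev, data)
    return [(doc_id, rev, data)
            for doc_id, (rev, data) in sorted(latest.items())]

new_file = compact(old_file)
print(new_file)  # [('a', 2, {'v': 'current'}), ('b', 1, {'v': 'ok'})]
```

The real process additionally keeps serving reads and writes from the old file until the copy completes, which is what keeps the database online during compaction.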

(Anderson, Lehnardt, & Slater, 2010) stated that, in terms of the CAP theorem, CouchDB favors availability and partition tolerance. In a critical session all clients can still view the database; while users perform reads and writes, availability takes priority and consistency takes a back seat. The same behaviour would be hard to implement in an RDB, but CouchDB performs the action successfully.
Finally, CouchDB is designed for lock-free concurrency, both in the conceptual model and in the actual Erlang implementation. This reduces bottleneck problems and keeps the whole system from locking up under overload. It also supports concurrency and availability: when a data center node fails, a conventional system must be scanned, repaired, and restarted, but when CouchDB's distributed system detects a failure it restarts the failed part while the system as a whole remains available.

Figure 2:17 Shows Simple Couch DB Document (Södergren & Englund, 2011)

Simple DB:

Based on (Dimovski, 2013), SimpleDB is another popular NoSQL data store, written in the Erlang programming language. It is offered as a web service developed by Amazon. It was released in 2007 and is partially paid software, billed according to the client's storage. The name SimpleDB reflects its simple operations: Select, Delete, GetAttributes, and PutAttributes. SimpleDB's document store is different from the others and simpler. With respect to the CAP theorem, (Cattell, 2011) describes SimpleDB as supporting eventual consistency, but not transactional consistency; like most of the alternative frameworks, it does asynchronous replication. Unlike key-value document stores, it uses attribute-value pairs, a concept that makes the client terminology practically equivalent to the ideas in a conventional spreadsheet table. It also supports multiple attributes per document. Select operations run within one domain and specify a conjunction of constraints on attributes. SimpleDB automatically selects and updates the attributes, and in some cases it can also perform queries automatically; however, it does not automatically distribute data across domains. SimpleDB also has a disadvantage in the predictability of its performance. Finally, SimpleDB provides a 10 GB limit per domain, so it is not suited for a daily multi-terabyte Internet crawl.

HBASE:

HBase is a popular "Not only SQL" database that stores data in a column-oriented layout. The system runs on top of the Hadoop Distributed File System (HDFS). HBase can also manage big data by parsing the data in a cloud environment. HBase applications are written in Java, much like a typical MapReduce application, and HBase can also be accessed through Apache Avro and a RESTful API. An HBase table is made up of a set of key-value pairs in table form. The tables are stored in HDFS, although HBase does not typically use MapReduce itself. Each table contains rows and columns. A table must be defined with its columns and contains a row key that acts as a primary key, which other documents can use for access. HBase columns represent an attribute of an object and are grouped together into column families.
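The row-key / column-family layout just described can be sketched with nested Python dicts. The row key, family, and column names are made up for illustration; this is not HBase's real storage format, only the logical shape of its data model.

```python
# Hypothetical HBase-style table: row key -> column family -> column -> value.
table = {}

def put(table, row_key, family, column, value):
    """Store one cell, grouping related columns under a column family."""
    table.setdefault(row_key, {}).setdefault(family, {})[column] = value

put(table, "user#1001", "info", "name", "Hari")
put(table, "user#1001", "info", "email", "hari@example.com")
put(table, "user#1001", "stats", "logins", 42)

print(table["user#1001"]["info"]["name"])  # Hari
```

Grouping columns into families matters because HBase stores and tunes each family separately; the sketch only shows the addressing scheme (row key, family, column).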

Relational Database to link of Data (RDB2LOD):

RDB2LOD is used to migrate a SQL database into RDF and store the data in JSON format; it is built on the concept of ontology. The RDB2LOD application was developed to install and run on the Java virtual machine environment, and it supports the MySQL database management system (via the native D2RQ platform driver) and SQL Server (via a driver enclosed within the framework). Moreover, relational database mapping can also be done through this mapping process: one can issue SPARQL queries through the D2R Server over JSON data, and the mapping files are saved as JSON, N-Triples, and ontology. The tools make it possible to customize the mapping file semi-automatically through the OWLtoD2RQ mapping tool, which replaces the standard vocabulary with the terms of the RDF domain. The authors then compared the views of the linked data generated from the standard RDB mapping and from the customized mapping (with the incorporation of a domain ontology). Applying the RDB2LOD approach produced better expressiveness when generating RDF triples, presenting well-defined meanings (close to natural language) for the subjects, predicates, and objects; these were obtained by customizing the RDB mapping with the incorporation of a domain ontology.

The application of the RDB2LOD approach also provides user interaction by means of graphical interfaces, eliminating the need for manual configuration and operation of the tools applied to the linked-data publishing process. Thus the customization of the mapping between RDB tables and columns and the classes and properties of an ontology, which until then was done manually, can be done in an automated manner through this approach, which reduces both the time spent and the need for technical knowledge of the mapping-file language generated by the applied tool. For databases with a large number of tables and columns, where the conventional (manual) method would be impractical, customized mapping becomes possible through this approach. Finally, after the mapping process is performed, the dataset is saved in JSON format and the JSON file is used as linked data in the ontology.

2.7 Comparison of transformation method proposed by authors.

Table 1 Comparison of transformation methods proposed by the authors


(Aswamenakul, Buranarach, & Saikaew)

2.8 Chapter Summary

In this research, different kinds of tools and types of conversion methods for migrating from RDB to JSON-LD have been discussed; the selected migration tool is MongoDB. A relational database consists of separate tables with many explicitly defined relationships, whose elements may be selectively combined as the results of queries. A relational database can be described in three kinds of terms: value-related terms, structure-related terms, and types of relationship.

Structured Query Language (SQL) is well known as the common language used for relational databases. In detail, SQL is used to query the database effectively; for example, it provides keywords such as SELECT, UPDATE, DROP, and DELETE for queries. Additionally, it contributes to the optimization and maintenance of databases in relational systems such as MySQL, Oracle, Microsoft SQL Server, and others. The SQL syntax in use is standardized, with some exceptions in particular databases. Furthermore, ACID is a set of properties that gives an RDBMS reliability, preventing stale data and guaranteeing that database transactions are processed reliably. The four ACID properties must be strictly followed when creating a database in order to govern how the data in the database is maintained.

RDBMSs have a problem in distributed environments: performance suffers and processing takes longer when terabytes of data are involved. Most of the biggest companies do not depend on an RDBMS alone because it does not offer enough scalability and high availability. The solution to this problem is a NoSQL database with documents in JSON-LD form. NoSQL represents a really broad category of databases that allow large quantities of unstructured and semi-structured data to be stored and managed; additionally, they are designed to handle a high level of reads and writes while scaling horizontally. NoSQL aims to provide scalability, performance, and high availability, and such systems can handle large amounts of data. NoSQL systems keep growing in scalability, performance, and high availability even under hardware failure, so they can handle large amounts of data with impressive performance, though they offer fewer guarantees compared to an RDB. NoSQL does not support the SQL query method but uses simpler and easier methods for the query process. SQL supports only structured data, whereas NoSQL supports both structured and unstructured data. That organizations are changing from SQL to NoSQL is not a surprising statement, because it depends on the goals of the organization, and some organizations have already adapted to such systems and started using them. On this basis, migrating SQL to NoSQL and JSON-LD is useful: it helps in understanding how the data in an RDB is stored and maintained, and how NoSQL differs.

Therefore, JSON-LD takes the main role in NoSQL databases for storing documents. JSON is used to transmit data back and forth between websites and browsers. Linked Data (LD) is a newer term for a way of publishing data on the web such that it is interconnected between different websites. In the era of Web 2.0, semantic web technology is used to link data in JSON-LD. JSON-LD is very good at being both human readable and easily processable by machines, which matters when we start to gather data from multiple websites.

CHAPTER 3

METHODOLOGY

3.1 Hardware Requirements

Hardware requirements are the physical computer resources. The hardware components in the computer system are needed to use the installed software efficiently. The hardware used in this research is a laptop running Windows 7 Ultimate. This edition was aimed at the home market segment, with features such as Windows Media Centre, Windows Aero, and multi-touch support. Since the hardware contains 8 GB of Random Access Memory (RAM), Windows 7 is preferable because it is compatible with most new programs and flexible in its usage of disk space and memory. If the hardware contained less RAM, Windows XP would be suitable because it occupies little disk space and consumes less memory compared to the others.

Windows 7 has good visual styles and animation. It can support the installation of the software that will be used for the real implementation and for XML editors as well. Besides that, the laptop is an ASUS A43SD model. Its Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz processor provides average performance, as shown below; in addition, it has a turbo speed of 3.1 GHz and a clock speed of 2.5 GHz.

Figure 3:1 Performance Comparison Line Graph

Based on the performance comparison shown above, this research would be faster and produce better results if the current processor were replaced with an Intel Core i7 4790K @ 4.00GHz. In addition, the hardware contains an NVIDIA GeForce GT 630M, a middle-class graphics card. This card ensures low power consumption and offers PureVideo HD technology for video decoding. It is also capable of supporting 3D Vision and is mainly intended for gaming purposes.

The hardware runs a 64-bit operating system, which supports the usage of MongoDB for the SQL migration method, because 64-bit systems handle large amounts of RAM (Random Access Memory) more effectively than 32-bit systems. If the hardware had more RAM and more graphics memory, then Windows 8 would be suitable as a replacement for Windows 7, but Windows 8 was still maturing in terms of compatibility. The figure below shows the Windows edition and the system contents of the hardware.

Figure 3:2 Hardware Review

No pen or touch input is offered by the hardware, either because no pointing-device driver is installed in the system or because the devices are hidden in Device Manager. The figure below shows the Device Manager tab displaying devices by type; Device Manager lists all the hardware installed in the laptop.

Figure 3:3 Installed hardware review in the PC

3.2 Software Requirements

Software is anything that can be stored electronically and enables the PC to work. Software requirements are the appropriate tools needed to complete the research activities. The appropriate tools identified for this research are:

MYSQL:

 MySQL is an open source relational database management system (RDBMS) that supports SQL. MySQL is developed by Oracle Corporation and is the central database component of the LAMP stack. Finally, it is also used by high-profile, large-scale websites.

FME workbench:

 FME Workbench, the Feature Manipulation Engine, is specially designed to connect data between different formats, transform the data, and automatically run the workflow. In our case, it is used to import and export JSON-LD data from MongoDB.

Dezign for databases v9:

 DeZign for Databases is commercial software used to import a SQL file or data comprising a relational database; it automatically generates Entity Relationship Diagrams (ERD) showing the relationships in the imported data. The software can also import Microsoft Access and Excel files or content to generate an ERD. DeZign for Databases also offers a modern visual data-modeling environment for database application development that makes the database development process much less demanding. The tool reduces faults in database development and improves productivity. You can visualize database structures to understand your database, create new databases, or reverse-engineer existing databases to modify, document, analyze, and optimize them.

lib curl:

 Libcurl is an open source multiprotocol file transfer library that is easy to use on the client side. In our research it is used for importing and exporting files over HTTP, FTP, and TFTP. In addition, it supports IPv6 and very fast file transfer. Finally, the library is written in the C language.

Python Sphinx:

 Sphinx is written in the Python language and makes it easy for users to create documents. Sphinx converts reStructuredText files into HTML websites. Using this tool, users will find it easy to create intelligent, user-friendly documentation.

3.3 Selected tools for migrating SQL to JSON-LD

In this research, the selected tools for migrating RDB to JSON-LD are MongoDB and RDB2LOD. The research papers discussed different kinds of tools and types of conversion methods; however, in this research two tools are used to carry out the migration process. There are several reasons why MongoDB is selected over the other tools. One reason is its schema-free documents, which suit most applications; MongoDB also offers scalability and partition tolerance.

(McPhillips, 2012) stated that MongoDB is built for huge and massive data. In the past five years, many organizations have used MongoDB for their web and Android applications because it is very easy to use and can be understood by non-technical people. In addition, MongoDB is designed for the web platform. It also supports fault tolerance, consistency, replication, and documents stored as JSON-LD, which makes MongoDB the right tool for SQL to JSON-LD migration. MongoDB uses only its own storage, and clients can build further applications on top of the MongoDB tools. It also addresses the RDB problem that such databases do not scale well to larger data sets, since MongoDB provides building blocks to solve larger database problems.

MongoDB is simple and reliable for creating these building blocks, good enough to solve the problem and flexible in doing so. MongoDB is based on key-value data stores, which makes it suitable for our migration approach. It keeps the data in a document database rather than in tables, taking full advantage of the JSON-LD document model. MongoDB uses a REST API for add, delete, modify, and retrieve operations, so it is very simple, and it allows structured data to be represented in semi-structured and unstructured forms. Another feature of MongoDB is its rich query language. The MongoDB database uses the sharding method to distribute processing, which is one of the main reasons why RDB is discarded. The figure below shows the sharding diagram.

Sharding is used in a cluster environment, or across multiple servers, to partition data for scalability. Sharding can be automatic, or manual sharding can be configured based on user requirements. In some cases a server can go down, but MongoDB's sharding method solves the problem by easily adding a new server without downtime. It is also capable of scaling to more than a thousand nodes with no single point of failure; in case of any failure, it is capable of automatic failover.
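The range-based routing behind sharding can be sketched in plain Python: a document goes to whichever shard's key range covers its shard key. The shard names and ranges below are hypothetical, not MongoDB configuration.

```python
# Hypothetical shards, each covering a half-open range of shard-key values.
shards = [
    ("shard-A", 0, 1000),     # keys 0..999
    ("shard-B", 1000, 2000),  # keys 1000..1999
]

def route(shards, shard_key):
    """Return the shard whose range contains the given shard key."""
    for name, low, high in shards:
        if low <= shard_key < high:
            return name
    raise ValueError("no shard covers this key")

print(route(shards, 42))    # shard-A
print(route(shards, 1500))  # shard-B
```

Adding a new server then amounts to adding another range entry (and rebalancing chunks), which is why MongoDB can grow the cluster without downtime.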

Sharding is performed automatically across more than one shard, and together the shards carry the total data set; reads and writes are routed automatically. Each shard is backed by a replica set, and the replica set holds the data that has been sharded; holding shard replicas provides high availability and redundancy. A replica set in MongoDB can span more than one server, all containing the same data. In this arrangement only one server is the primary and the others are secondaries; when the primary server goes down or a disaster occurs, another server becomes the primary and the rest remain secondaries, all managed by MongoDB. Therefore, sharding helps scale the system, while replication supports it behind the scenes and provides high availability, data security, and disaster recovery. Based on (Mughees, 2014), among the different tools MongoDB gives good results when performing queries and filtering the data. MongoDB uses JSON because static filtering is faster with it; these advantages led me to choose MongoDB for migrating SQL to JSON-LD.

The other tool chosen in this research is RDB2LOD, which converts a relational database to the Resource Description Framework, automatically or manually, and stores the result as JSON-LD documents. RDB2LOD has three fundamental procedures: creating an ontology from a database, mapping a database to an existing ontology, and translating the SQL data into RDF stored as JSON-LD. RDB2LOD is well suited to database-to-ontology mapping, associating SQL statements with a database-to-ontology language, and it supports the D2RQ and D2R Server mapping approaches. RDB2LOD can generate an ontology from the relational database, map the relational database to an existing OWL ontology, and translate SPARQL queries over the ontology into SQL queries whose results are stored as JSON-LD documents. Several steps can be carried out with the RDB2LOD tools for RDB2RDF mapping: create an ontology from a single database, map an existing ontology onto the RDB data, and perform queries over the ontology of the RDB data, which is stored in JSON-LD document format.

Figure 3:4 Shows the sharding process

3.4 Justification for Selected tools for SQL to JSON-LD

First of all, the choice of a suitable mapping method is based on the comparison of transformation methods made in the literature review. Based on that comparison, I conclude that MongoDB is the best tool for migrating RDB to JSON-LD. MongoDB is an open source NoSQL database that stores data in JSON-LD. There are many differences if you come from a relational database background, which can make the concepts difficult to understand; likewise, the SQL-to-NoSQL mapping process is complex if you are coming from the SQL background. Mapping SQL to NoSQL keeps the same information, only in a different form, so I would like to explain in more detail how the SQL-to-NoSQL mapping is done. First of all, it maps the tables, rows, and columns: a SQL table corresponds to a collection of documents in MongoDB, and each JSON document corresponds to one row, so the rows that SQL stores in a table are similar to the values that JSON documents store.
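The table-to-document mapping just described can be sketched in plain Python: one SQL row becomes one JSON document. The column names and the string id format are illustrative; MongoDB itself generates a 12-byte ObjectId for "_id" rather than the plain string used here.

```python
import json

# Hypothetical SQL metadata and one row of data.
columns = ["student_id", "name", "course"]
row = (7, "Hari", "Security Technology")

# The primary-key column becomes "_id"; the rest become document fields.
document = {"_id": str(row[0])}
document.update(dict(zip(columns[1:], row[1:])))

print(json.dumps(document))
# {"_id": "7", "name": "Hari", "course": "Security Technology"}
```

Repeating this for every row of every table yields one collection of documents per table, which is the shape the rest of this section assumes.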

Figure 3:5 Shows example of JSON document

The JSON diagram shows one row of a database; it is similar to a row in an RDB. In SQL a table contains many rows, while in the NoSQL database a collection contains many documents. MongoDB does not rely on an auto-increment feature; instead it assigns a Universally Unique Identifier (UUID) to each document in the database. The "_id" in MongoDB represents the primary key of the document and is a 12-byte field. The "_id" field is generated automatically for each document and is unique, meaning it never matches the "_id" of another document. The diagram below shows how a SQL row is converted into the NoSQL database for better understanding. MongoDB also supports schema-less design: in the NoSQL database each document can follow a different schema. For example, if a user wants to add a tag to a web page, they can simply write the tag and save it as a document; on the page the older tag is replaced and the new tag is shown, but the older tag remains available in the MongoDB database for a certain period. The keyword "tag" here stands for a label in an information system attached to a computer file, digital image, and so on. This concept is not available in RDB. Additionally, documents in NoSQL can contain different fields within the same database.

Figure 3:6 Shows example of field in NoSQL

Subsequently, the diagram on the left side shows four fields while the document on the right side shows only three; such differing fields are supported by NoSQL, and all fields are saved in a single document. This schema-less approach cannot be reproduced in RDB tables and rows because creating such a database would not be efficient. Another key difference is that NoSQL does not support inner joins. RDB uses primary keys, composite keys, and foreign keys for the relationships between tables, but in the NoSQL database there are no such relationships.

Figure 3:7 Shows example of SQL join relations and NoSQL relation.

The diagram below shows how MongoDB uses a linking approach between database documents. MongoDB automatically generates an id field, similar to a primary key in RDB, called "_id". The "_id" also helps guard against stale data: if someone changes the document, other copies become stale, and to overcome this problem the user must re-save the document.

Figure 3:8 Shows the summarization of the SQL to NOSQL migrating process.

Therefore, MongoDB NoSQL query processing is totally different from RDB because it uses map and reduce functions. The map and reduce functions run concurrently to perform the query; this is also known as a view in MongoDB.

Figure 3:9 Shows the map reduce query process.

Finally, the RDB2LOD tools allow the creation of an ontology from a database and the mapping of a database to an existing ontology. The tools analyse the data components (tables, columns, primary/foreign keys) and transform them into ontology components such as individuals (instances), classes (concepts), attributes, and relations. Where existing ontology data is available, we use manual mapping, and in some cases we can consider a semi-automatic approach for one-to-one mappings. More refined mappings, such as one-to-many or many-to-many, rest on conditional mappings; in those cases we cannot rely on an automatic or semi-automatic approach, so our tools offer a graphical user interface that allows a user to apply a mapping approach manually. In our tools there are two major components in the mapping approach. The first is D2R Map, which associates ontology components such as classes, object properties, and datatype properties with SQL statements. The other is an additional refined mapping-specification language that we propose based on D2RQ and D2R Server. We have implemented the mapping of elements in the RDB data to owl:Class, and from our database the RDB2LOD tools automatically map and generate relations such as "has", "part-of" and more, based on the SQL tables stored in the database and the standard ontology terms used to represent relationships between classes. After completing the mapping process, the user can start querying using JSON-LD and SPARQL.
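A JSON-LD document of the kind the mapping process produces can be sketched with the standard library alone: the "@context" maps short keys to vocabulary terms and "@id" identifies the resource. The vocabulary URLs and values below are illustrative examples, not output of the actual tool.

```python
import json

# Hypothetical JSON-LD document for one mapped RDB row.
doc = {
    "@context": {
        "name": "http://schema.org/name",
        "partOf": {"@id": "http://schema.org/isPartOf", "@type": "@id"},
    },
    "@id": "http://example.org/department/1",
    "name": "Information Science",
}

text = json.dumps(doc, indent=2)   # serialize for publication over HTTP
parsed = json.loads(text)          # any JSON consumer can read it back
print(parsed["@id"])  # http://example.org/department/1
```

Because a JSON-LD document is ordinary JSON plus the "@context", it can be stored in a document database as-is and still be interpreted as RDF by linked-data consumers.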

3.5 Converting SQL to RDF using RDB2LOD and stored in JSON-LD
document

As we specified beforehand, a data provider ought to perform the basic task of converting RDB to RDF and storing it in JSON-LD documents. The first step is creating an ontology from a single database. The RDB2LOD tools automatically create an ontology from the database using the OWL-DL language. The ontology created from the RDB data is called the local ontology, and its data is stored temporarily. The RDB database consists of several elements (tables, columns, constraints) that are converted into classes, properties, and relations, with the ontology driving the generation process. The generated ontology contains the OWL representation but not the instances; the instances are retrieved and used for queries. The RDB2LOD tools use a mapping document for the query process, which is stored with the local ontology.

After the mapping document is created it can be used for queries. Mapping the single database to the local ontology is not required, because the RDB2LOD tools have already created their own local ontology; however, if an expert user wants to manually map the database to the local ontology, that is also possible with these tools. Both ontology creation from the database and mapping the database to an existing ontology can produce the mapping document, depending on user requirements.

The ontology generation process includes several steps. It begins by converting tables to concepts, and after that converting columns to properties. Database tables representing classes and sub-classes are transformed into OWL classes, and each table representing a many-to-many relationship is transformed into object properties; two inverse object properties are added to the ontology, one for each class whose corresponding table was related to the current table. After ontology generation, the process continues to mapping generation, which uses two kinds of mapping specification: the first is an association with the SQL statement, and the second is the OWL language.
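The table-to-class and column-to-property steps described above can be sketched in a few lines of Python. The schema dictionary and the naming conventions ("capitalized class", "has_" prefix) are invented for illustration; the real tool reads the metadata via JDBC and emits OWL.

```python
# Hypothetical relational schema: table name -> column names.
schema = {"student": ["student_id", "name"], "course": ["course_id", "title"]}

def generate_ontology(schema):
    """Turn each table into a class name and each column into a property."""
    classes, properties = [], []
    for table, columns in schema.items():
        classes.append(table.capitalize())      # table  -> OWL class
        for column in columns:
            properties.append(f"has_{column}")  # column -> property
    return classes, properties

classes, properties = generate_ontology(schema)
print(classes)  # ['Student', 'Course']
```

Foreign keys and many-to-many join tables would additionally become object properties between the generated classes; the sketch covers only the basic table and column cases.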

Finally, when mapping a database to an existing ontology, RDB2LOD maps the relational database to that ontology. In some cases, a human expert can also perform manual mapping between the components of the database and the existing ontology. The result is the mapping JSON-LD document. RDB2LOD provides a graphical user interface for associating SQL statements with the ontology: the mapping is done by selecting an SQL command against the database and associating it with an ontology class or property. The JDBC (Java Database Connectivity) API is used by RDB2LOD to connect to the database and to run the association SQL statements. The API extracts the metadata from the database, including tables and columns, and the extracted metadata is encapsulated in an internal database model. The existing ontology is then loaded and expressed using the Jena Semantic Web Framework, and both the database and the ontology are loaded into the GUI, where the user creates the mapping associations between ontology components and SQL statements. The database model is used as the input to the ontology generation algorithms; executing these algorithms produces an internal ontology model, which is converted into a Jena ontology model to obtain the OWL ontology. At last, the mapping model is translated, using the JDOM API, into XML format and transformed into the mapping document. The metadata itself is managed by the RDBMS.
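The final JDOM step can be illustrated with a comparable sketch using Ruby's standard REXML library (the element names are invented for illustration; the real mapping-document schema is defined by the tool):

```ruby
require "rexml/document"

# Sketch: serialize a tiny in-memory mapping model into an XML mapping
# document, analogous to the JDOM translation step described above.
def mapping_to_xml(associations)
  doc = REXML::Document.new
  root = doc.add_element("mapping")
  associations.each do |assoc|
    el = root.add_element("association")
    el.add_element("sql").text = assoc[:sql]          # the SQL statement side
    el.add_element("ontology").text = assoc[:concept] # the class/property side
  end
  out = String.new
  doc.write(out)
  out
end
```

Each association pairs one SQL statement with one ontology component, mirroring the GUI workflow described above.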

RDB2LOD uses Jena (a Semantic Web framework for Java) as its API for reading, manipulating, and writing OWL ontologies; Jena provides the RDF, OWL, and SPARQL environment for the tool. An SQL parser is used to parse and manipulate the SQL statements, and JDOM (a Java-based document object model for XML) is another tool that handles the XML documents, so the user can easily read and manipulate the database-to-ontology mapping. RDB2LOD supports MySQL and Oracle databases; it benefits from some specific views provided by these RDBMSs for describing the database metadata, and it can be easily extended to support other RDBMSs that provide such views.

3.6 Schematic diagram

Figure 3:10 Development Process

Figure 3:11 Data Flow Diagram for SQL to JSON-LD

Figure 3:12 Data Flow Diagram for SQL to RDF then JSON-LD

CHAPTER 4

IMPLEMENTATION AND RESULT

4.1 Migrating SQL to JSON-LD using MongoDB

Figure 4:1 List of databases in MySQL Workbench and MySQL

Figure 4:2 MongoDB interface showing the same databases as in MySQL

Figure 4:3 Mongify as the intermediary for migrating SQL to MongoDB

Figure 4:4 Configuring the connection between the MySQL database and MongoDB

Figure 4:5 Testing the Mongify connection

Figure 4:6 Source code for translating SQL to MongoDB

Figure 4:7 The translation process from SQL to MongoDB

Figure 4:8 Data after translation into MongoDB, stored in the document database

Figure 4:9 Querying a specific document database

Figure 4:10 Document database contents in JSON format

Figure 4:11 Robomongo presenting the document database in a GUI

4.2 Migrating SQL to RDF and storing it in JSON-LD using RDB2LOD

4.2.1 Standard RDB Mapping

In the standard RDB mapping approach, the metadata of the RDB imported into MySQL is loaded into RDB2LOD to create a standard RDB mapping, as shown in the diagram below.

Figure 4:12 Standard RDB mapping approach

The user needs to key in the database information, such as the server name, port number, database name, username, and password, to access the database as an authorized user. The user then keys in the name of the mapping file to be generated, which is stored as a .ttl file. After entering all the information, the user clicks the Generate button to create the mapping file. If the entered information is correct, a message box pops up as shown below.

Figure 4:13 Successfully generated standard mapping

4.2.2 Custom RDB Mapping

The custom RDB mapping approach aims at an interactive, semi-automatic way of customizing the RDF mapping of the RDB. The custom RDB mapping is shown in the diagram below.

Figure 4:14 Custom relational database mapping

After the user clicks Custom Mapping, the database and ontology panels are initially empty; the user then clicks "Config Database" to configure the database.

Figure 4:15 Interface for the manual mapping process between the relational
database and the ontology

The database configuration is shown below. The user selects the database engine, such as MySQL Server or MS SQL Server, and then keys in the database information, such as the server name, port number, database name, username, and password, to access the database as an authorized user.

Figure 4:16 Configuring the database connection for a MySQL database

After the user configures the database, the tables and columns are automatically arranged into their own form. The user then uses "Open OWL File" to start the manual mapping. In the "Database Tables" box, the tables obtained from the RDB structure are displayed; the classes they can be associated with, taken from the loaded ontology, appear in the "Ontology Classes" selection box. Each association between an RDB table and an ontology class is made by clicking the upper ">>" button, which immediately displays the association in the "Table to Classes" box. To remove an association, the user selects the upper "<<" button. This association determines that the subject of each generated RDF triple is the key field of a table, typed as the ontology class chosen by the user.

Similarly, the "Database Columns" box displays the columns of the previously selected table; the properties they can be associated with, belonging to the selected class, are displayed in the "Datatype Properties" box. Each association between an RDB column and an ontology property is made by selecting the lower ">>" button, and it is immediately displayed in the "Column to Properties" box. To remove an association, the user selects the lower "<<" button. This association determines that the predicate of each generated RDF triple is a column of a table, typed as a datatype property of the class chosen by the user. During the process, the user can also select the RDF syntax format and a custom or default URI. Once the entered information is correct, a message box pops up showing "Mapping File Successfully Generated".

Figure 4:17 Manual mapping process between the relational database and the
ontology

After completing the "Standard RDB Mapping" or "Custom RDB Mapping", the user clicks the output button to perform the RDB-to-RDF mapping process. The RDB2LOD application provides the "Dump RDF Dataset" options window, through which all the contents of the mapped RDB are dumped. In this process, the RDB is mapped into an RDF dataset and the result is stored in that form. The user can also select the RDF syntax format and a custom or default URI. Finally, once the user has entered the correct information, the message box shown below pops up.

Figure 4:18 Dump to RDF dataset

Secondly, the diagram presents the "Query RDF to file" options window, whose goal is to make a SPARQL query against the mapped RDB and save the result, in the form of RDF triples, in a file. The user needs to select the mapping file, enter the query file name, and choose a file format such as .txt, .xml, .json, .csv, .srb, or .ttl. The user can then choose a default or custom URI based on their requirements. As an additional feature, the user can also set up a "Timeout". Finally, the "SPARQL Query" text box in this window is intended for formulating the SPARQL query to be performed. At the end, once the user has entered the correct information, the tool displays a dialog with the message "Query File successfully generated."
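To make the "Query RDF to file" step concrete, the sketch below assembles a simple SPARQL SELECT query as a string and writes it to a file, much as a user might formulate it in the "SPARQL Query" text box. The `ex:` prefix URI is a made-up example, not the tool's default URI.

```ruby
# Sketch: build a SPARQL query over the mapped RDB and save it to a file,
# mirroring the "Query RDF to file" feature described above.
def build_sparql_query(class_name, limit: 10)
  <<~SPARQL
    PREFIX ex: <http://example.org/vocab#>
    SELECT ?s ?p ?o
    WHERE {
      ?s a ex:#{class_name} .
      ?s ?p ?o .
    }
    LIMIT #{limit}
  SPARQL
end

# Save the formulated query, as the tool does for the chosen file format.
File.write("query.txt", build_sparql_query("Creature"))
```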

Figure 4:19 Query RDF to file

Finally, the user clicks "Start-D2R Server", which triggers the D2R Server tool to boot a web server that allows viewing and exploring the contents of the RDB mapped to RDF, or making queries through a SPARQL console. The user then uses the "Select Mapping File ..." button to choose the mapping file based on their requirements, and clicks the Start button to start the server and view the content.

Figure 4:20 Starting the server

Furthermore, after the mapping process is complete, the user opens a browser and types http://localhost:2020/, which shows the D2R Server. The D2R Server presents the RDF and OWL file views.

Figure 4:21 The D2R Server has started

The SPARQL result shows the subject, predicate, and object in RDF form, while the classes and properties are shown in the OWL view through the browser. The user can also type a SPARQL command manually to view the result. The content (entities and attributes) is translated, or mapped, into RDF form.

Figure 4:22 Relation between classes, properties, subject, predicate, and
object

4.2.3 Differences between Standard RDB Mapping and Custom RDB Mapping

Figure 4:23 Result differences between standard mapping and custom mapping

Figure 4:24 Result differences between standard and custom mapping in terms
of SPARQL queries

From the comparisons presented, both in the display and in the SPARQL queries, we can say that the RDB2LOD approach provides better expressiveness when generating RDF triples, presenting well-defined meanings (close to natural language) for their subjects, predicates, and objects, obtained from the customization of the RDB mapping and the incorporation of a domain ontology.

The RDB2LOD approach also provides user interactivity by means of graphical interfaces, eliminating the need for manual configuration and operation of the tools applied to the linked data publishing process. Thus, the customization of the mapping, joining the tables and columns of the RDB with the classes and properties of the ontology, which until then was done manually, is done in an automated manner by this approach. This reduces the time spent and the need for technical knowledge of the language of the mapping file generated by the tool. Customizing the mapping of databases with a large number of tables and columns, which would be impractical by the conventional (manual) method, becomes possible with this approach.

4.2.4 Importing the Standard RDB Mapping and Custom RDB Mapping files into
Protégé 4.3

First, the user needs to open the .rdf, .owl, or .ttl file to view the RDB in RDF form. The user can then browse by clicking the "Active Ontology", "Entities", "Classes", "Object Properties", "Data Properties", "Annotation Properties", "Individuals", "OWLViz", "DL Query", "OntoGraf", "SPARQL Query", and "Ontology Differences" tabs.

Figure 4:25 The SQL data shown in OntoGraf

Figure 4:26 OntoGraf view

Figure 4:27 Relationships between the tables in OWLViz form

Figure 4:28 Object properties for the SQL database

Figure 4:29 Entities of the SQL data

Finally, this proves that an RDB can be mapped into RDF triple form and viewed through Protégé 4.3. The user does not need to key in the data manually in Protégé 4.3: through RDB2LOD, the user can import the RDB into the application, map the database into RDF triple form, and save the result. The saved RDF triple file is then imported into Protégé 4.3 to view the result in RDF and OWL form.

CHAPTER 5

ANALYSIS OF RESULT AND DISCUSSION

To simplify, the main objective of this research is to apply existing mapping methods and tools, as discussed in the previous chapter. The diagrams below compare the difference between the JSON visualization and the OntoGraf representation, which together represent the final result of the research objectives obtained using those tools and methods.

Figure 5:1 How the data is stored in the different migration processes

Figure 5:2 OWL data represented as subject, predicate, and object in the JSON
visualization

Figure 5:3 RDF data represented in OntoGraf form

Figure 5:4 RDF data represented as subject, predicate, and object in OntoGraf

5.1 Discussion

Based on this research, there are many methods and tools for migrating SQL to JSON-LD, but this research has primarily chosen MongoDB as the tool for producing the result. This research also uses the Relational Database to Linked Open Data (RDB2LOD) tool, which migrates SQL to the Resource Description Framework (RDF) and stores the data in JSON and JSON-LD. The reason for choosing a second tool for migrating SQL to JSON-LD is to produce a different result and to determine the advantages and disadvantages of each. MongoDB is easy to use and transforms SQL data into JSON directly, whereas RDB2LOD takes a longer process, migrating the SQL data first to RDF and then to JSON. MongoDB is also faster when there is a larger database, since each record is represented as a single document. The migrated SQL data is stored as documents rather than joined tables, which is one reason MongoDB is faster in the query process. MongoDB supports the sharding technique, also known as horizontal partitioning, during which the data is balanced over the cluster automatically or manually. It also handles concurrent processes during sharding and supports high performance. MongoDB additionally assigns a unique ID to each document, to prevent interference between documents with identical values.
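The per-document unique ID can be illustrated with a small sketch; BSON ObjectIds are simplified here to random hex strings, and this is not the MongoDB driver's actual API.

```ruby
require "securerandom"
require "json"

# Sketch: turn a SQL row (a flat hash of column values) into a
# MongoDB-style document with a unique _id, so two rows with identical
# values never collide in the document store.
def row_to_document(row)
  { "_id" => SecureRandom.hex(12) }.merge(row)
end

row = { "name" => "areatrigger_tavern", "quest" => 7 }
doc = row_to_document(row)
puts JSON.pretty_generate(doc)
```

Calling `row_to_document` twice on the same row yields two documents that differ only in `_id`, which is exactly what keeps identical rows apart after migration.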

The migration process from SQL to JSON and JSON-LD belongs to a new era, so JSON visualization is used to represent the result in tree and graph form. The diagrams show how the data is stored in the different approaches, and there are two types of result. The MongoDB result shows a unique ID for each single document rather than a relationship within the content. This result is suitable for a new user who wants to migrate SQL to JSON-LD in the new Semantic Web era; the MongoDB migration process will fully satisfy such a user's requirements. The RDB2LOD result, on the other hand, shows subject, predicate, and object represented in a parent-child approach. If the user already uses RDF for ontology representation, then this is the better route to the JSON-LD migration process.
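The parent-child, subject-predicate-object shape of the RDB2LOD output can be sketched as a minimal JSON-LD document built from one RDF triple. The vocabulary URI and node names below are placeholders for illustration, not the tool's actual output.

```ruby
require "json"

# Sketch: express an RDF triple (subject, predicate, object) as a minimal
# JSON-LD document, nesting the predicate and object under the subject node.
def triple_to_jsonld(subject, predicate, object)
  {
    "@context" => { "ex" => "http://example.org/vocab#" },
    "@id"      => subject,
    "ex:#{predicate}" => object
  }
end

doc = triple_to_jsonld("ex:creature42", "curhealth", 3500)
puts JSON.generate(doc)
```

The `@id` plays the role of the subject (the parent), while each predicate/object pair hangs off it as a child key, matching the tree shown in the JSON visualization.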

CHAPTER 6
CONCLUSION

In conclusion, this research describes the challenges of the existing relational database system and analyses their effects on an organization. RDBs lack scalability and high performance because they rely on relationships, inner joins, and normalization, which are costly. The aim of this paper is to translate SQL into NoSQL by using the MongoDB tool to store the documents in JSON-LD format. This paper therefore gives an overview of SQL, NoSQL, and JSON-LD. It also describes the steps that transform SQL into an effective NoSQL schema. In a NoSQL database, the document store plays the main role, because the documents in NoSQL are stored in JSON-LD format. In comparison, a SQL database contains only rows and columns to store the data, and the columns are mapped into fields. A NoSQL database also provides horizontal scaling and sharding across distributed networks. One of the objectives of this research, migrating a SQL database to NoSQL, can be completed with the MongoDB tool through the replication method; replication is an additional feature used by NoSQL databases for an easy conversion process. This research also describes the CAP theorem for NoSQL databases and how it differs from SQL. In addition, MongoDB supports only two of the three properties of the CAP theorem, namely consistency and partition tolerance. Furthermore, the database does not support a standard query method.

Moreover, this research provides the steps of the mapping method from SQL to NoSQL. The migration process gives an advantage because it eases the query process. Based on this research, it can be concluded that NoSQL performs better than a SQL database. MongoDB is suitable for distributed networks and for the replication process; it also uses a sharding mechanism that supports multiple users accessing the system. MongoDB is easier than SQL in programming terms. The reason MongoDB performs better and faster is that its data is stored as JSON-LD documents. JSON-LD is highly scalable and independent, and JSON is widely used for transforming data in web services.

In conclusion, the NoSQL database system has more advantages compared to SQL. In addition, this research applied another tool, Relational Database to Linked Open Data (RDB2LOD), that enables migrating Structured Query Language (SQL) data to the Resource Description Framework (RDF), with the data stored in JSON or JSON-LD based on the user's requirements. In this research, the results of MongoDB and RDB2LOD have been compared through JSON visualization and OntoGraf. The research produces two types of result, but both are acceptable. The result of this research is helpful for organizations that want to migrate RDF to JSON-LD or SQL to JSON-LD. In both approaches, the MongoDB and RDB2LOD tools transform the actual content of the SQL database into the data the user requires, with JSON-LD as the result.

REFERENCES

Anderson, J. C., Lehnardt, J., & Slater, N. (2010). CouchDB: The definitive guide. O'Reilly Media, Inc.
Aswamenakul, C., Buranarach, M., & Saikaew, K. R. A review and design of framework for storing and querying RDF data using NoSQL database. Paper presented at the 4th Joint International Semantic Technology Conference.
Bonham-Carter, G. F. (2014). Geographic information systems for geoscientists: Modelling with GIS (Vol. 13). Elsevier.
Cattell, R. (2011). Scalable SQL and NoSQL data stores. ACM SIGMOD Record, 39(4), 12-27.
davevalz. (2013). Rules of engagement - NoSQL column data stores.
Dharaneeswaran, M. 1NF-to-5NF-Normalization-with-Eg. Retrieved from https://www.scribd.com/doc/49645421/1NF-to-5NF-Normalization-with-Eg
Dimovski, D. (2013). Database management as a cloud-based service for small and medium organizations. Master's thesis, Masaryk University, Brno.
Dzhakishev, D. (2014). NoSQL databases in the enterprise: An experience with Tomra's receipt validation system.
Elmasri, R., & Navathe, S. (2009). Fundamentals of database systems. 人民邮电出版社.
Halvorsen, H.-P. (2016). Structured Query Language.
Hernandez, M. J. (2013). Database design for mere mortals: A hands-on guide to relational database design. Pearson Education.
Hunge, M. (2016). GraphGist: First steps with Cypher.
Jones, D. (2006). Developing entity relationship diagrams (ERDs).
Kammerer, M., & Nimis, J. (2014). Application-Mimes: An approach for quantitative comparison of SQL and NoSQL databases. Paper presented at ICEIS (1).
Lamllari, R. (2013). Extending a methodology for migration of the database layer to the cloud considering relational database schema migration to NoSQL.
Lanthaler, M., & Gütl, C. (2012). On using JSON-LD to create evolvable RESTful services. Paper presented at the Third International Workshop on RESTful Design.
Manoj, V. (2014). Comparative study of NoSQL document, column store databases and evaluation of Cassandra. International Journal of Database Management Systems, 6(4), 11.
McPhillips, J. (2012). Transitioning from relational to NoSQL: A case study. Regis University.
Mughees, M. (2014). Data migration from standard SQL to NoSQL.
Presbrey, J. J. W. (2014). Linked data platform for web applications. Massachusetts Institute of Technology.
Ramakrishnan, R., & Gehrke, J. (2000). Database management systems. McGraw-Hill.
Södergren, P., & Englund, B. (2011). Investigating NoSQL from a SQL perspective.
Yu, S. (2009). ACID properties in distributed databases. Advanced eBusiness Transactions for B2B-Collaborations.

APPENDICES

Appendix-A: Weekly Report

Appendix-B: CD Cover

Appendix-C: Source Code

table "areatrigger_involvedrelation" do
column "id", :key, :as => :integer
column "quest", :integer
end

table "areatrigger_tavern" do
column "id", :key, :as => :integer
column "name", :text
end

table "areatrigger_teleport" do
column "id", :key, :as => :integer
column "name", :text
column "required_level", :integer
column "required_item", :integer
column "required_item2", :integer
column "required_quest_done", :integer
column "target_map", :integer
column "target_position_x", :float
column "target_position_y", :float
column "target_position_z", :float
column "target_orientation", :float
end

table "battleground_events" do
column "map", :integer
column "event1", :integer
column "event2", :integer
column "description", :string
end

table "battleground_template" do
column "id", :key, :as => :integer

column "MinPlayersPerTeam", :integer
column "MaxPlayersPerTeam", :integer
column "MinLvl", :integer
column "MaxLvl", :integer
column "AllianceStartLoc", :integer
column "AllianceStartO", :float
column "HordeStartLoc", :integer
column "HordeStartO", :float
end

table "battlemaster_entry" do
column "entry", :integer
column "bg_template", :integer
end

table "command" do
column "name", :string
column "security", :integer
column "help", :text
end

table "conditions" do
column "condition_entry", :integer
column "type", :integer
column "value1", :integer
column "value2", :integer
end

table "creature" do
column "guid", :integer
column "id", :key, :as => :integer
column "map", :integer
column "modelid", :integer

column "equipment_id", :integer, :references => "equipment"
column "position_x", :float
column "position_y", :float
column "position_z", :float
column "orientation", :float
column "spawntimesecs", :integer
column "spawndist", :float
column "currentwaypoint", :integer
column "curhealth", :integer
column "curmana", :integer
column "DeathState", :integer
column "MovementType", :integer
end

table "creature_addon" do
column "guid", :integer
column "mount", :integer
column "bytes1", :integer
column "b2_0_sheath", :integer
column "b2_1_flags", :integer
column "emote", :integer
column "moveflags", :integer
column "auras", :text
end

table "creature_ai_scripts" do
column "id", :key, :as => :integer
column "creature_id", :integer, :references => "creatures"
column "event_type", :integer
column "event_inverse_phase_mask", :integer
column "event_chance", :integer
column "event_flags", :integer
column "event_param1", :integer

column "event_param2", :integer
column "event_param3", :integer
column "event_param4", :integer
column "action1_type", :integer
column "action1_param1", :integer
column "action1_param2", :integer
column "action1_param3", :integer
column "action2_type", :integer
column "action2_param1", :integer
column "action2_param2", :integer
column "action2_param3", :integer
column "action3_type", :integer
column "action3_param1", :integer
column "action3_param2", :integer
column "action3_param3", :integer
column "comment", :string
end

table "creature_ai_summons" do
column "id", :key, :as => :integer
column "position_x", :float
column "position_y", :float
column "position_z", :float
column "orientation", :float
column "spawntimesecs", :integer
column "comment", :string
end

table "creature_ai_texts" do
column "entry", :integer
column "content_default", :text
column "content_loc1", :text
column "content_loc2", :text

column "content_loc3", :text
column "content_loc4", :text
column "content_loc5", :text
column "content_loc6", :text
column "content_loc7", :text
column "content_loc8", :text
column "sound", :integer
column "type", :integer
column "language", :integer
column "emote", :integer
column "comment", :text
end

table "creature_battleground" do
column "guid", :integer
column "event1", :integer
column "event2", :integer
end

table "creature_equip_template" do
column "entry", :integer
column "equipentry1", :integer
column "equipentry2", :integer
column "equipentry3", :integer
end

mongify check database.config
mongify translation database.config
mongify translation database.config > fulldb.rb
mongify process database.config fulldb.rb

sql_connection do
  adapter  "mysql2"
  host     "localhost"
  username "root"
  password "tharan"
  database "fulldb"
  batch_size 10000 # This is defaulted to 10000 but in case you want to make that smaller (on lower RAM machines)
end

mongodb_connection do
  host     "localhost"
  database "fulldb"
end

Faculty of Information Science and Technology (FIST)
Final Year Project Meeting Log

MEETING DATE: MEETING NO.:


PROJECT ID:
PROJECT TITLE :
SESSION : SUPERVISOR :
STUDENT ID & Name: CO- SUPERVISOR :
All to be filled in by student
1. WORK DONE [Please write the details of the work done after the last meeting.]

2. WORK TO BE DONE

3. PROBLEMS ENCOUNTERED

4. COMMENTS

Supervisor’s Signature & Co-Supervisor’s Signature Student’s Signature


Stamp & Stamp (if any)
NOTES:
1. Items 1 – 3 are to be completed by the students before coming for the meeting. Item 4 is to be
completed by the supervisor.
2. For FYP Phase 1, total six log sheets are to be submitted (every other week*).
3. For FYP Phase 2, total six log sheets are to be submitted (every other week**).
4. Log sheets are compulsory assessment criteria for FYP. Student who fails to meet the
requirements of log sheets will not be allowed to submit FYP report.

*: week 1, 3, 5, 7, 9, 11 or 2, 4, 6, 8, 10 of the first trimester (week 11: report submission, weeks 13 &
14: presentation)
**: week 1, 3, 5, 7, 9, 11 or 2, 4, 6, 8, 10 of the second trimester (week 11: report submission,
weeks 13 & 14: presentation)

Appendix B: Checklist for FYP Interim Submission

Faculty of Information Science and Technology (FIST)


Checklist for Interim Report Submission
(To be filled in by Student)
STUDENT’S DETAILS
Project Code FIST
Name
ID No
Title of Thesis
Supervisor Name

REPORT ARRANGEMENT √ Comments (if any differences)


1. Cover of The Interim Report
2. Title Page of the Interim Report
3. Copyright Page of the Interim Report
4. Declaration Page of Interim report
5. Acknowledgement
6. Table of Contents
7. Abstract
8. List of Tables
9. List of Figures
10. List of Symbols
11. List of Appendices
12. Chapter 1: Introduction – objectives,
scope
13. Chapter 2: Literature Review
14. Chapter 3: Title:
15. Chapter 4: Title:
16. Chapter 5: Title:
17. References – APA style
18. Appendices
19. CD/ DVD and envelope as shown in Appendix K
20. Attachment : FYP Meeting Logs (all) 1 set

FORMAT OF REPORT √ Comments


1. Page Numbering
2. Font and Type Face
3. Front Cover
4. Tables and Figures
5. Comb Bind
6. Colour of the Front Cover
7. Number of words > 5000 (Main content only)
Checked by

________________________
Student’s Signature & Date
