Adbms Tech-Neo Searchable
Syllabus
Mumbai University
B. E. (Computer Engineering)
Course Code : CSDO501
Course Name : Advance Database Management System
Course Objectives :
2. To specify the various approaches used for using XML and JSON technologies.
3. To apply the concepts behind the various types of NoSQL databases and utilize it for MongoDB.
Course Outcomes : After the successful completion of this course learner will be able to :
1. Design distributed database using the various techniques for query processing.
2. Measure query cost and perform distributed transaction management.
3. Organize the data using XML and JSON database for better interoperability.
4. Compare different types of NoSQL databases.
5. Formulate NoSQL queries using MongoDB.
6. Describe various trends in advance databases through temporal, graph based and spatial based databases.
Module    Contents    Hrs.
1    Distributed Databases    3
     3.2 Basic JSON syntax, (JavaScript Object Notation) JSON data types, Stringifying and parsing the JSON for sending & receiving, JSON Object retrieval using key-value pair and jQuery, XML Vs JSON. (Refer Chapter 3)
4    NoSQL Distribution Model    10
     4.1 NoSQL database concepts: NoSQL data modeling, Benefits of NoSQL, comparison between SQL and NoSQL database system.
     4.2 Replication and sharding, Distribution Models, Consistency in distributed data, CAP theorem, Notion of ACID Vs BASE, handling Transactions, consistency and eventual consistency.
     4.3 Types of NoSQL databases: Key-value data store, Document database and Column Family Data store, Comparison of NoSQL databases w.r.t CAP theorem and ACID properties. (Refer Chapter 4)
5    NoSQL using MongoDB
     5.1 NoSQL using MongoDB: Introduction to MongoDB Shell, Running the MongoDB shell, MongoDB client, Basic operations with MongoDB shell, Basic Data Types, Arrays, Embedded Documents.
     5.2 Querying MongoDB using find() functions, advanced queries using logical operators and sorting, simple aggregate functions, saving and updating document. MongoDB Distributed environment: Concepts of replication and horizontal scaling through sharding in MongoDB. (Refer Chapter 5)
     6.3 Spatial database: Introduction, data types, models, operators and queries. (Refer Chapter 6)
Chapter 2 : Distributed Database Handling
MODULE 1
CHAPTER 1
Distributed Databases
Syllabus
1.1 Introduction
1.1.1 Difference between Centralized and Distributed Database
1.1.2 Transparency in DDBMS
UQ. Write a note on client server architecture.
1.3 Data Fragmentation, Replication and Allocation Techniques for Distributed Database Design
1.3.1 Replication
1.3.2 Fragmentation
UQ. Give two examples of horizontal and vertical fragmentation each.
UQ. Give derived horizontal fragmentation for emp and pay. Write resultant fragments.
1.3.3 Syntax for Creating Fragments
1.3.4 Data Replication
A Distributed Database (DDB) is a database that is not stored on one system; it is divided among different systems or sites, i.e., multiple computers that are connected through a computer network.
Definition
* A distributed database is a logically related collection of shared data that is physically distributed over a computer network on different sites.
* A Distributed Database System (DDBS) is the software that manages data stored on different computers connected through a network. It follows the principle that the user does not come to know where the data is scattered across sites or servers; the user thinks that a single system provides the data requested through a query.
Example
* Suppose you want to fetch data related to a given task from different folders, and those folders are on different drives. The related data is then distributed across folders.
* The data in these folders can be in the same format, such as documents, or in different formats, such as Excel sheets and documents, or files with any other extension.
Fig. 1.1.1 : Centralized Database System (clients connected to a centralized database through communication channels)
Fig. 1.1.2 : Distributed Database System
Parameter | Centralized database | Distributed database
Location of data | The database is located on a single machine. | The database is located on various sites.
Maintenance | It is easy to maintain. | It is difficult to maintain.
Design of data | It has a simple design of data which is easily understandable. | It has a complex design of data which is difficult to understand.
Response time | It takes more response time. | It takes less response time.
Processing of query | The query is processed by a single server, so the load falls on that one system. | The query is processed by many servers, so the load does not fall on one system.
Failure of system | If the centralized server fails, the entire system is halted. | If one system or server fails, the system continues to work with the other systems.
Data traffic | There will be data traffic as the data is stored on one server. | There will not be data traffic as the data is divided or copied among a number of servers.
Advantages | (1) All data is stored at a single location, so it becomes easier to access and communicate data. (2) Minimal data redundancy. (3) Less costly. | (1) The database can be easily expanded as data is already spread across sites at different physical locations. (2) The distributed database can easily be accessed from different networks. (3) This database is more secured.
Disadvantages | (1) Data traffic will be there as all data is stored at one location. (2) If any kind of failure occurs at the centralized system, there is a risk that the entire data will be lost. | (1) Very costly, and it is difficult to maintain because of its complexity. (2) It is difficult to provide a uniform view to the user since the database is spread across different physical locations.
1.1.2 Transparency in DDBMS
Transparency is one of the features of a DDBMS. It means hiding the internal implementation details from the user: how the data is distributed and where it is stored are hidden from the user.
(1) Distribution transparency : It allows the distributed data to be treated as a single logical database. The user doesn't know which data is partitioned or where it is distributed.
(MU-New Syllabus w.e.f academic year 21-22)(M5-68) Tech-Neo Publications...A SACHIN SHAH Venture
Advance Database Management System (MU-Sem 5-Com Distributed Databases)._Page no. (1-4
(2) Transaction transparency : It allows a transaction to update data at more than one network site. It maintains database integrity because the transaction is either completed or aborted.
(3) Failure transparency : It ensures that the system continues to operate in the event of a node or network failure.
(4) Performance transparency : It allows the system to perform as if it were a centralized DBMS.
(5) Heterogeneity transparency : It allows the integration of several different local DBMSs under a common global schema.
(6) Replication transparency : It hides from the user which data is replicated.
(7) Fragmentation transparency : The end user doesn't need to know the fragment names or fragment locations prior to data retrieval (i.e., which fragment is accessed by the query fired by the user).
(3) The user fires a query by selecting subject C. The user doesn't know that the data is fetched from server S1, as he is unaware that the data is divided among servers for good performance. This implementation fact is hidden from the user and a centralized view is shown to the user (distribution/fragmentation transparency).
* In a heterogeneous distributed database, the sites or servers can use different DBMSs, which can cause problems in query processing and transactions.
* This architecture has two views: the logical and the component architectural models of a DDB.
[Figure: schema architecture of a DDB; each user's external view is mapped onto the Local Conceptual Schema (LCS) of a site]
integration of all the data that is stored on every site and divided as per the design of the database. It is represented by the Global Conceptual Schema (GCS), which provides network transparency.
* Each node has its own Local Internal Schema (LIS) based on the physical organization details at that particular site.
* The logical organization of the data at each site, i.e., the data that is local to it and not remote, is shown by the Local Conceptual Schema (LCS). The GCS, the LCS and their underlying mappings provide fragmentation and replication transparency as per the design of the database.
* Fig. 1.2.2 shows the component architecture of a DDB. It is an extension of its centralized counterpart. Its components are responsible for executing a query whose data is available on different servers.
* The global query compiler references the Global Conceptual Schema (GCS) from the global system catalog to verify and impose already defined constraints.
[Figure: component architecture of a DDB; the user issues an interactive global query]
* (A) Shared memory (tightly coupled) architecture : In this architecture, multiple processors share secondary (disk) storage and also share primary memory. Data and code of a parallel program are stored in the main memory, which is accessible to all the processors.
[Figure: shared memory architecture; Processor 1, Processor 2, ..., Processor n connected through an interconnection network]
* (B) Shared disk (loosely coupled) architecture : In this architecture, multiple processors share secondary (disk) storage but each has its own primary memory.
* These architectures enable processors to communicate without the overhead of exchanging messages over a network.
[Figure: shared disk architecture; Processor 1, Processor 2, ..., Processor n connected through interconnection networks]
(1) The local schema is the conceptual schema (full database definition) of a component database, and the component schema is derived by translating the local schema into a canonical data model or Common Data Model (CDM) for the FDBS.
Schema translation from the local schema to the component schema is accompanied by generating mappings to transform commands on a component schema into commands on the corresponding local schema.
(2) The export schema represents the subset of a component schema that is available to the FDBS.
(3) The federated schema is the global schema or view. It is the result of integrating all the shareable export schemas.
(4) The external schemas define the schema for a user group or an application that is designed for those users only.
* This layer provides the interface to the user. The programs at this layer present Web interfaces or GUI forms to the client in order to interface with the application or system.
* Web browsers are often used, and the languages used for creating the interface include HTML, XHTML, CSS, Java, JavaScript, etc.
* This layer handles user input, output, and navigation by accepting user commands and displaying the information needed by the user, usually in the form of static or dynamic Web pages. This layer typically communicates with the application layer via the HTTP protocol.
* This layer programs the application logic. For example, queries can be formulated based on user input from the client, or query results can be formatted and sent to the client in a proper format so that the user can understand the output.
* Additional application functionality can be handled at this layer, such as security checks, identity verification, and other functions.
* The application layer can interact with the database using ODBC, JDBC, SQL/CLI, or other database access techniques.
3. Database server
* This layer communicates with the application layer and is responsible for handling the query and update requests of the user by processing the requests and sending the results back.
* SQL is used to access the database if it is relational or object-relational, and stored database procedures may also be invoked. Query results (and queries) may be formatted and transmitted between the application server and the database server.
To process an SQL query, the application layer interacts with the database layer in the following way:
(1) The application server formulates the user query based on input from the client layer and divides it into a number of independent site queries. Each site query is sent to the appropriate database server site.
(2) Each database server processes its local query independently and sends the results to the application server. Increasingly, XML is being used as the standard for data exchange, so the database server may format the query result into XML before sending it to the application server.
(3) The application server combines the results of all subqueries to produce the result of the originally required query, formats it into HTML or some other form accepted by the client, and sends it to the client site for display.
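The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the site names, the CUSTOMER rows and the helper functions `run_site_query` and `application_server` are hypothetical, and each "database server" is simulated by an in-memory list rather than a real DBMS.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]

# Hypothetical database server sites: each one holds part of a CUSTOMER table.
SITES: Dict[str, List[Row]] = {
    "site_mumbai": [{"cust_id": 1, "name": "Asha", "city": "Mumbai"}],
    "site_pune": [
        {"cust_id": 2, "name": "Ravi", "city": "Pune"},
        {"cust_id": 3, "name": "Meera", "city": "Pune"},
    ],
}

def run_site_query(site: str, predicate: Callable[[Row], bool]) -> List[Row]:
    """Step 2: a database server processes its local query independently."""
    return [row for row in SITES[site] if predicate(row)]

def application_server(predicate: Callable[[Row], bool]) -> List[Row]:
    """Steps 1 and 3: send one site query to every site, then combine results."""
    combined: List[Row] = []
    for site in sorted(SITES):
        combined.extend(run_site_query(site, predicate))
    return sorted(combined, key=lambda row: row["cust_id"])

# The client asks for all customers located in Pune.
result = application_server(lambda row: row["city"] == "Pune")
```

In a real three-tier system the site queries would travel over the network (for example through JDBC or ODBC) and the combined result would be formatted as HTML before being sent to the client.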
There are two ways in which the data can be stored on different sites: fragmentation and replication.
1.3.1 Replication
The process of storing copies of data on more than one server is called replication of data. The system maintains multiple copies of data to increase the availability of data and to reduce the response time of queries.
Advantages
Disadvantages
(1) If a change is made at one site, it needs to be reflected at every site where a copy is stored; otherwise the copies of the data may become inconsistent.
(2) Concurrency control becomes complex as concurrent execution needs to be coordinated over a number of sites.
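The first disadvantage can be made concrete with a small sketch. The class `ReplicatedItem` and the site names S1, S2 and S3 are hypothetical; the point is only that one logical update turns into one physical write per copy, and that skipping any copy leaves the replicas inconsistent.

```python
class ReplicatedItem:
    """A data item whose copies are stored at several sites (hypothetical names)."""

    def __init__(self, sites, value):
        self.copies = {site: value for site in sites}

    def read(self, site):
        # A read can be answered by any single copy, e.g. the nearest one.
        return self.copies[site]

    def write_all(self, value):
        # One logical update becomes one physical write per stored copy.
        for site in self.copies:
            self.copies[site] = value
        return len(self.copies)

    def write_one(self, site, value):
        # Updating a single copy only: the replicas now disagree.
        self.copies[site] = value

    def is_consistent(self):
        return len(set(self.copies.values())) == 1


item = ReplicatedItem(["S1", "S2", "S3"], value=100)
writes = item.write_all(120)                      # three physical writes
consistent_after_full_update = item.is_consistent()
item.write_one("S2", 150)                         # change not propagated
consistent_after_partial_update = item.is_consistent()
```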
1.3.2 Fragmentation
* The process of dividing the database into smaller chunks or parts is called fragmentation.
* As per the requirements of the application, fragments may be stored at different locations or sites.
* It should be possible to reconstruct the original database from the fragments without loss of data.
* Every fragment is a subpart of the original table.
* Fragmentation doesn't create copies of data; it just divides the data, so data consistency will not be a problem.
There are three types of data fragmentation: 1. Horizontal fragmentation 2. Vertical fragmentation 3. Mixed (hybrid) fragmentation
1. Horizontal fragmentation
* In this type of fragmentation, the table data is divided horizontally into groups of rows to create subsets of the table.
* It is done by applying some condition on an attribute or column of the table. The rows of the table are separated into fragments.
* The rows are split horizontally by applying a condition on attributes, depending on the query. The condition can be on one or more attributes.
Example
* Consider the table customer (Custid, Name, address, City), in which the values of city are Mumbai, Delhi and Pune.
[Table: CUSTOMER (CustID, name, address, city) and its horizontal fragments, e.g. HF2 : city = "PUNE"]
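A minimal sketch of this example follows. The rows are made up for illustration and the helper `horizontal_fragment` is a hypothetical name; each fragment is simply a selection on the city attribute, and the union of the fragments gives back the original table.

```python
# Hypothetical CUSTOMER rows (Custid, Name, address, City).
customer = [
    {"cust_id": 1, "name": "Asha", "address": "A-1", "city": "Mumbai"},
    {"cust_id": 2, "name": "Ravi", "address": "B-7", "city": "Pune"},
    {"cust_id": 3, "name": "Meera", "address": "C-3", "city": "Delhi"},
    {"cust_id": 4, "name": "Kiran", "address": "D-9", "city": "Pune"},
]

def horizontal_fragment(rows, condition):
    """Select the rows that satisfy the fragment's condition."""
    return [row for row in rows if condition(row)]

hf1 = horizontal_fragment(customer, lambda r: r["city"] == "Mumbai")
hf2 = horizontal_fragment(customer, lambda r: r["city"] == "Pune")
hf3 = horizontal_fragment(customer, lambda r: r["city"] == "Delhi")

# Reconstruction: the union of all horizontal fragments is the original table.
reconstructed = sorted(hf1 + hf2 + hf3, key=lambda r: r["cust_id"])
```

Every fragment keeps all the columns of the table; only the set of rows differs from one fragment to another.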
Note : Horizontal fragmentation works on division of data based on conditions on the attributes. No. of rows ...
The owner relation and the member relation participate in this type of fragmentation (derived horizontal fragmentation). The relation with the primary key is the owner relation and the relation with the foreign key is the member relation.
Let S be the owner relation and R the member relation. The derived horizontal fragments of R are defined as
$R_i = R \ltimes S_i$, $1 \le i \le n$
where $S_i$ is a primary horizontal fragment of S and n is the maximum number of fragments defined on R.
$S_i = \sigma_{condition}(S)$
Consider the following example in which PAY is the owner relation and EMP is the member relation.
Fig. 1.3.2 : Derived horizontal fragmentation (relations PAY(TITLE, SAL), EMP(ENO, ENAME, TITLE), PROJ(PNO, PNAME, BUDGET, LOC) and ASG(ENO, PNO, RESP, DUR) connected by links L1, L2 and L3)
Consider that PAY1 and PAY2 are primary horizontal fragments of PAY. The derived horizontal fragments of EMP can then be defined as :
$DHF1 = EMP \ltimes PAY1$
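The semijoin can be sketched as follows. The salary threshold of 30000, the sample rows and the helper name `semijoin` are assumptions made only for illustration; PAY1 and PAY2 are taken to be the primary fragments of PAY with low and high salaries, and the second derived fragment is built in the same way as DHF1.

```python
# Hypothetical owner relation PAY(TITLE, SAL) and member relation EMP(ENO, ENAME, TITLE).
pay = [
    {"title": "Elect. Eng.", "sal": 40000},
    {"title": "Syst. Anal.", "sal": 34000},
    {"title": "Programmer", "sal": 24000},
]
emp = [
    {"eno": "E1", "ename": "J. Doe", "title": "Elect. Eng."},
    {"eno": "E2", "ename": "M. Smith", "title": "Syst. Anal."},
    {"eno": "E3", "ename": "A. Lee", "title": "Programmer"},
    {"eno": "E4", "ename": "J. Miller", "title": "Programmer"},
]

# Primary horizontal fragments of the owner relation (assumed condition on SAL).
pay1 = [row for row in pay if row["sal"] <= 30000]
pay2 = [row for row in pay if row["sal"] > 30000]

def semijoin(member, owner_fragment, key):
    """Keep the member rows whose key value appears in the owner fragment."""
    keys = {row[key] for row in owner_fragment}
    return [row for row in member if row[key] in keys]

dhf1 = semijoin(emp, pay1, "title")   # EMP semijoin PAY1
dhf2 = semijoin(emp, pay2, "title")   # EMP semijoin PAY2
```

The fragments of EMP are thus decided by the fragments of PAY, not by a condition on EMP's own attributes.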
2. Vertical fragmentation
* The data is divided by selecting specific columns or attributes of the table; no condition is required. If we combine all the vertical fragments, we should get the original table with the correct number of attributes.
* The schema of the table or relation is divided into smaller schemas (sub-schemas). Each fragment must contain a common candidate key so as to ensure a lossless join. If the candidate key isn't included, it will not be possible to reconstruct the original table from the vertical fragments, as there will not be any relation among the data.
Example
* Consider the table customer (Custid, Name, address, City, Phno).
* If we put the attributes name, address and city in one fragment and phno in another fragment, we have two vertical fragments with the candidate key cust_id, which relates the two fragments to each other.
[Table: CUSTOMER and its vertical fragments VF1 and VF2]
* Customer = VF1 join VF2 (by performing the join operation on the vertical fragments we get the original table customer).
Note : Vertical fragmentation works on division of the columns. The number of rows will be the same in all fragments as there is no division of rows.
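A short sketch of the customer example, with made-up rows and hypothetical helper names `project` and `join_on_key`: both fragments carry the key cust_id, and joining them on that key rebuilds the original table.

```python
# Hypothetical CUSTOMER rows (Custid, Name, address, City, Phno).
customer = [
    {"cust_id": 1, "name": "Asha", "address": "A-1", "city": "Mumbai", "phno": "111"},
    {"cust_id": 2, "name": "Ravi", "address": "B-7", "city": "Pune", "phno": "222"},
]

def project(rows, attributes):
    """Keep only the listed attributes of every row (no condition on rows)."""
    return [{a: row[a] for a in attributes} for row in rows]

# Each vertical fragment repeats the candidate key cust_id.
vf1 = project(customer, ["cust_id", "name", "address", "city"])
vf2 = project(customer, ["cust_id", "phno"])

def join_on_key(left, right, key):
    """Join two fragments on the common key to rebuild the full rows."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]

reconstructed = join_on_key(vf1, vf2, "cust_id")
```

If cust_id were left out of VF2, there would be no way to tell which phone number belongs to which customer, which is why the common key is required.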
Types of vertical fragmentation
(a) Complete vertical fragmentation : If the vertical fragments together contain all the attributes of the original table and share one common key, it is complete vertical fragmentation.
(b) Mixed (hybrid) fragmentation : This type of fragmentation is the combination of horizontal and vertical fragmentation.
of table.
Either way can be used, and both are considered mixed or hybrid fragmentation.
[Figure: Example 1 of mixed fragmentation; a table divided into Fragment 1 and Fragment 2]
1.3.3 Syntax for Creating Fragments
(i) Full replication makes the concurrency control and recovery techniques more expensive.
(ii) Update operations are slower, since a single logical update must be performed on every copy of the database to keep the copies consistent. This is especially true if many copies of the database exist.
2) No Replication
The opposite of full replication is having no replication, that is, each fragment is stored at exactly one site. In this case, all fragments must be disjoint, except for the repetition of primary keys among vertical (or mixed) fragments.
[Figure: no replication of data; User 1, User 2 and User 3 access the original database on a single server]
(ii) Slows down the query execution process, as multiple clients are accessing the same server.
3) Partial replication
* Partial replication means that only some fragments of the database are replicated on more than one server; in other words, some fragments of the database may be replicated whereas others may not.
* The number of copies of each fragment can range from one up to the total number of sites in the distributed system.
E.g. sales forces, financial planners, and claims adjustors carry partially replicated databases with them on laptops and PDAs and synchronize them periodically with the server database.
[Figure: partial replication; some fragments of the original database are copied to another server (recovery location)]
Q.1 Write difference between centralized and distributed database.
Q.7 Which operations are performed on horizontal and vertical fragments to get the original table?
1.5 Multiple Choice Questions
Q. 1.1 A homogenous distributed database is which of the following?
(a) The same DBMS is used at each location and data are not distributed across all nodes
(b) The same DBMS is used at each location and data are distributed across all nodes
(c) A different DBMS is used at each location and data are not distributed across all nodes
(d) A different DBMS is used at each location and data are distributed across all nodes
Ans.: (a)
Q. 1.2 Storing a separate copy of the database at multiple locations is which of the following?
(a) Data Replication
(b) Horizontal Partitioning
(c) Vertical Partitioning
(d) None of the above
Ans.: (a)
Q. 1.3 A distributed database is a collection of data which belong ______ to the same system but are spread over the ______ of the network.
(a) Logically, sites
(b) Physically, sites
(c) Database, DBMS
(d) None of the above
Ans.: (a)
Q. 1.4 Which of the following is/are the main goals of a distributed database?
(a) Interconnection of database
Q. 1.7 Some of the columns of a relation are at different sites is which of the following?
(a) Horizontal Partitioning
(b) Vertical Partitioning
(c) Replication
Chapter Ends...
CHAPTER 2
Distributed Database Handling
Syllabus
Distributed Transaction Management - Definition, properties, types, architecture. Distributed Query Processing - Characterization of Query Processors, Layers/phases of query processing.
Distributed Concurrency Control - Taxonomy, Locking based, Basic TO algorithm. Recovery in Distributed Databases - Failures in distributed database, 2PC and 3PC protocol.
UQ. Explain how concurrency control is achieved in distributed database.
2.5 Recovery in distributed database
2.5.1 Distributed Two-Phase Commit (2PC)
UQ. Explain 2PC protocol in detail.
2.5.2 Distributed Three-Phase Commit (3PC)
2.6 Descriptive Questions
Multiple Choice Questions
Chapter Ends
2.1.1 Transaction
Definition
* A transaction is a sequence of steps that forms a logical unit of work on a database; it may be an entire program operating on the data.
* The series of steps necessary to accomplish a logical unit of work is referred to as one transaction.
1. Atomicity
* All changes to data are performed as if they were a single operation. That is, either all the changes are performed or none of them are, which makes sure that the database stays in a consistent state.
* For example, in an application that transfers funds from one account to another, the atomicity property ensures that, if a debit is made successfully from one account, the corresponding credit is made to the other account.
2. Consistency
3. Isolation
* When one transaction is executing, other transactions do not interfere with it and it executes independently. The intermediate state of a transaction is invisible to other transactions, so transactions that run concurrently appear to be serialized.
* For example, in an application that transfers funds from one account to another, the isolation property ensures that another transaction sees the transferred funds in one account or the other, but not in both, nor in neither.
4. Durability
* When the transaction successfully completes, changes to data persist and will not be undone, even in the event of a system failure.
* For example, in an application that transfers funds from one account to another, the durability property ensures that the changes made to each account will be there in the database.
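The funds-transfer example used above can be tried out with Python's built-in `sqlite3` module. This is a minimal single-site sketch, not a distributed transaction: the table `account`, the account ids and the helper functions `transfer` and `balances` are assumptions made for illustration. It shows atomicity: when the credit cannot be applied, the debit is rolled back as well.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Debit src and credit dst as one atomic unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            cur = conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            if cur.rowcount != 1:
                raise ValueError("destination account not found")
        return True
    except ValueError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 100)])
conn.commit()

ok = transfer(conn, "A", "B", 200)      # both updates are committed together
failed = transfer(conn, "A", "Z", 50)   # credit fails, so the debit is undone
final = balances(conn)
```

After the failed transfer the balance of account A is unchanged, because the debit that had already been executed was rolled back together with the rest of the transaction.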
Transactions have been classified according to a number of criteria. One criterion is the duration of transactions. Accordingly, transactions may be classified as online or batch. These two classes are also called short-life and long-life transactions.
1. Online Transactions
* Online transactions are characterized by very short execution/response times and by access to a relatively small portion of the database.
* This class of transactions probably covers a large majority of the applications that we use in daily life.
* Examples are banking transactions and airline reservation transactions.
2. Batch Transactions
* Batch transactions are transactions that take longer to execute (the response time can be measured in minutes, hours, or even days) and access a larger portion of the database.
* The applications that might require batch transactions are statistical applications, report generation, complex queries and image processing.
* Another classification is based on the organization of the read and write operations:
3. Flat Transactions
Flat transactions have a single start point (Begin transaction) and a single termination point (End transaction).
4. Nested Transactions
* A transaction model in which a transaction includes other transactions with their own begin and commit points is called a nested transaction.
* It looks as shown below :
Begin transaction abc
Begin
    Begin transaction abc1
    ...
    end. {abc1}
    Begin transaction abc2
End
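Many relational systems do not offer fully nested transactions, but savepoints give a similar structure. The sketch below uses SQLite savepoints through Python's `sqlite3` module as a stand-in, which is an assumption of this example rather than the model described in the text; the table `log` and the savepoint names `sub1` and `sub2` are hypothetical.

```python
import sqlite3

# isolation_level=None switches off implicit transactions, so BEGIN,
# SAVEPOINT and COMMIT are issued explicitly, like begin/end markers.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (step TEXT)")

conn.execute("BEGIN")                                   # begin outer transaction
conn.execute("INSERT INTO log VALUES ('outer: start')")

conn.execute("SAVEPOINT sub1")                          # begin first inner unit
conn.execute("INSERT INTO log VALUES ('sub1: work')")
conn.execute("RELEASE SAVEPOINT sub1")                  # end first inner unit

conn.execute("SAVEPOINT sub2")                          # begin second inner unit
conn.execute("INSERT INTO log VALUES ('sub2: work')")
conn.execute("ROLLBACK TO SAVEPOINT sub2")              # undo only sub2's work
conn.execute("RELEASE SAVEPOINT sub2")

conn.execute("COMMIT")                                  # end outer transaction

steps = [row[0] for row in conn.execute("SELECT step FROM log ORDER BY rowid")]
```

Only the work of the second inner unit is undone; the outer transaction and the first inner unit still commit.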
2.2 Distributed Query Processing
(Dec. 11)
Query processing refers to the set of activities involved in extracting data from a database. In a distributed database, data is available on more than one server, so it is important to design the database in such a way that response time is reduced and the database stays in a consistent state.
It describes how a query is executed internally and how the output is produced on the user's desktop. In a distributed system, the cost of data transmission over the network must also be considered. The stages of distributed database query processing are given below and are common to all the servers or sites.
[Figure: layers of distributed query processing. Control site: query decomposition, data localization (fragment schema) producing an algebraic query on fragments, global optimization (allocation schema) producing a distributed query execution plan. Local sites: distributed execution.]
facilities provided by the language.
(2) The goal of normalization is to transform the query to a normalized form to facilitate further processing.
(3) This process includes the lexical and analytical analysis and the treatment of the WHERE clause.
There are two possible normal forms.
1. Conjunctive NF
This is a conjunction ($\land$ predicate) of disjunctions ($\lor$ predicates) as follows:
$(p_{11} \lor p_{12} \lor \dots \lor p_{1n}) \land \dots \land (p_{m1} \lor p_{m2} \lor \dots \lor p_{mn})$
OR DUR=24
* Query analysis enables rejection of normalized queries for which further processing is either impossible or unnecessary.
The main reasons for rejection are that the query is type incorrect or semantically incorrect.
Type incorrect :
* If any of its attribute or relation names are not defined in the global schema.
* If operations are applied to attributes of the wrong type.
Semantically incorrect :
I. Query Graph
This graph is used for most queries involving select, project, and join operations. In a query graph, one node represents the result relation and every other node represents an operand relation. An edge between two nodes that are not results represents a join, whereas an edge whose destination node is the result represents a project.
[Figure: query graph with the node RESULT and edge labels DUR ≥ 36, ASG.PNO = PROJ.PNO and PNAME = "CAD/CAM"]
II. Join Graph
AND TITLE = "PROGRAMMER"
The query graph shown in the figure is disconnected, which tells us that the query is semantically incorrect.
[Figure: query graph with edge labels EMP.ENO = ASG.ENO and TITLE = "Programmer"]
* A user query expressed on a view may be enriched with several predicates to achieve view-relation correspondence and ensure semantic integrity and security.
* The enriched query qualification may then contain redundant predicates.
* Such redundancy may be eliminated by simplifying the qualification with the following well known idempotency rules :
1. $p_1 \land \lnot(p_1) \Leftrightarrow \text{false}$
2. $p_1 \land (p_1 \lor p_2) \Leftrightarrow p_1$
3. $p_1 \lor \text{false} \Leftrightarrow p_1$
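The three rules listed above can be checked mechanically by trying every combination of truth values. The sketch below does exactly that; the dictionary `RULES` and the helper `holds` are names introduced only for this illustration.

```python
from itertools import product

# Each rule maps truth values of p1 and p2 to (left-hand side, right-hand side).
RULES = {
    "p1 AND NOT p1 <=> false": lambda p1, p2: ((p1 and not p1), False),
    "p1 AND (p1 OR p2) <=> p1": lambda p1, p2: ((p1 and (p1 or p2)), p1),
    "p1 OR false <=> p1": lambda p1, p2: ((p1 or False), p1),
}

def holds(rule):
    """A rule holds if both sides agree for every truth assignment."""
    return all(
        lhs == rhs
        for lhs, rhs in (rule(p1, p2) for p1, p2 in product([False, True], repeat=2))
    )

results = {name: holds(rule) for name, rule in RULES.items()}
```

Because each rule holds for all assignments, a query processor can replace the left-hand side by the simpler right-hand side without changing the result of the query.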
Example
* Second, the root node is created as a project operation; its attributes are found in the SELECT clause.
* Third, the SQL WHERE clause is translated into a sequence of relational operations (select, join, union, etc.).
For converting an SQL query into tree form, there are some transformation rules to follow, as given below.
* The input to the second layer is an algebraic query on global relations. The second layer localizes the query's data using the data distribution information in the fragment schema.
* This layer checks which fragments are required for the query and transforms the distributed query into a query on fragments.
* A global relation can be reconstructed by applying the fragmentation rules and then deriving a program, called a localization program, of relational algebra operators which act on fragments.
* Generating a query on fragments is done in two steps. First, the query is mapped into a fragment query by substituting each relation by its reconstruction program (also called materialization program). Second, the fragment query is simplified and restructured to produce another "good" query.
• The goal of query optimization is to find an execution strategy for the query which is optimal (requires less
time to execute).
• The previous layers have already optimized the query, for example, by eliminating redundant expressions.
However, this optimization is independent of fragment characteristics such as fragment allocation and
cardinalities.
• Query optimization consists of finding the "best" ordering of operators in the query, including
communication operators, that minimizes a cost function (response time).
• The output of the query optimization layer is an optimized algebraic query with communication operators
included on fragments. It is typically represented as a distributed query execution plan.
The last layer is performed by all the sites that hold fragments involved in the query. Each subquery
executing at one site is called a local query; it is optimized using the local schema of the site and then
executed.
2.2.5 Data Transfer Cost in Query Processing (Example of SQL Query Executing at
Different Sites)
This requires a request and transfer cost for the data over the network.
Take the example of EMPLOYEE and DEPARTMENT tables.
Consider an EMPLOYEE table with 1000 records with each record of 100 bytes, and a
DEPARTMENT table with 10 records with each record of 20 bytes. Suppose the EMPLOYEE table is in DB1 at
location 1 and the DEPARTMENT table is in DB2 at location 2.
Consider the query to find the names of the employees and their department names. Suppose each resulting
record will have 20 bytes and all the employees and departments are being selected.
• Suppose this query is being executed at location 3.
2.2.6 The Possibilities to Execute the SQL Query in Distributed Database
Case 1
Since location 3 does not have any of the tables, both tables need to be transferred to location 3. Hence
the cost of data transfer is as below :
Cost of transferring EMPLOYEE data: 1000 records * 100 bytes = 1,00,000 bytes
Cost of transferring DEPARTMENT data: 10 records * 20 bytes = 200 bytes.
Therefore, total cost = 1,00,000 bytes + 200 bytes = 1,00,200 bytes
Here the cost of transferring result records does not arise, as the result is requested at this location itself.
Case 2
• Suppose we transfer EMPLOYEE records to location 2 and process the query there, then transfer the result
to location 3. This transfer needs to consider the transfer cost of the EMPLOYEE records and the transfer cost
of the result records.
• So the cost of transferring EMPLOYEE data: 1000 records * 100 bytes = 1,00,000 bytes
• Cost of transferring the result: 1000 records * 20 bytes = 20,000 bytes
• Therefore, total cost = 1,00,000 bytes + 20,000 bytes = 1,20,000 bytes
Case 3
• Suppose we transfer DEPARTMENT records to location 1 and process the query there, then transfer the
result to location 3. This transfer needs to consider the transfer cost of the DEPARTMENT records and the
transfer cost of the result records. Hence:
• Cost of transferring DEPARTMENT data: 10 records * 20 bytes = 200 bytes
• Cost of transferring the result : 1000 records * 20 bytes = 20,000 bytes
• Therefore, total cost = 200 bytes + 20,000 bytes = 20,200 bytes
• Hence case 3 is the best approach, since it gives the minimal data transfer cost. The
federated method has to calculate these costs depending on the query, table size, result size, processing
location, etc., and determine which method to use for query processing. This is similar to normal DB query
processing, where we reduce the number of records early, for example by applying the filter condition first.
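The arithmetic of the three cases can be reproduced with a short script. This is a minimal sketch using the table sizes given in the example; the variable names and the dictionary of strategies are illustrative, and only network transfer volume is counted (processing cost is ignored).

```python
# Sizes taken from the EMPLOYEE / DEPARTMENT example in the text.
EMP_ROWS, EMP_ROW_BYTES = 1000, 100
DEPT_ROWS, DEPT_ROW_BYTES = 10, 20
RESULT_ROWS, RESULT_ROW_BYTES = 1000, 20

emp_bytes = EMP_ROWS * EMP_ROW_BYTES            # ship the whole EMPLOYEE table
dept_bytes = DEPT_ROWS * DEPT_ROW_BYTES         # ship the whole DEPARTMENT table
result_bytes = RESULT_ROWS * RESULT_ROW_BYTES   # ship the join result

# Bytes moved over the network for each execution strategy.
costs = {
    "case 1: both tables to the query site": emp_bytes + dept_bytes,
    "case 2: EMPLOYEE to location 2, result to the query site": emp_bytes + result_bytes,
    "case 3: DEPARTMENT to location 1, result to the query site": dept_bytes + result_bytes,
}

best = min(costs, key=costs.get)
for name, cost in costs.items():
    print(f"{name}: {cost} bytes")
print("cheapest:", best)
```

A real optimizer would also weigh local processing cost and network latency; the sketch only shows why shipping the small table is the cheapest option here.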
In a distributed system the database is divided across multiple locations and multiple transactions are allowed to
execute at the same time, so the system must be able to recover from site or communication failures.
There can be some issues or problems when dealing with a distributed environment, as listed below:
1. Multiple copies of the data items
2. Failure of individual sites
3. Failure of communication links
4. Distributed commit
5. Distributed deadlock
The concurrency control method should maintain consistency of these multiple copies of data. The recovery
method is responsible for making a copy consistent with other copies if the site on which the copy is stored
fails and recovers later.
When one or more individual sites fail, the DDBMS should continue to operate with its running sites. When
a site recovers, its local database must be brought up-to-date with the rest of the sites before it rejoins the
system.
The system must be able to deal with the failure of one or more of the communication links that connect the
sites. Such failures can cause network partitioning.
If some sites fail during the commit process, then problems can arise with committing a transaction that is
accessing databases stored on multiple servers. The two-phase commit (2PC) protocol is used to deal with
this problem.
Deadlock may occur among several sites due to distribution of data, so there should be some techniques for
dealing with this problem.
A lock is defined as a variable associated with a data item that describes the status of the item with respect to
the possible operations that can be applied to it.
In other words, a lock is a mechanism used to control access to an item by a number of
transactions; if the data item is shared by conflicting operations, then only one operation will access the data at a time.
Database systems with a lock-based protocol use a mechanism by which a transaction cannot read or write
data until it acquires an appropriate lock on it.
There are two types of locks :
1. Binary Locks: A lock on a data item can be in one of two states: the data item is either locked or
unlocked.
2. Shared/exclusive: The locking mechanism differentiates the locks based on how the data item will be
used. If a lock is acquired on a data item to perform a write operation, it is an exclusive lock. If a
lock is required to perform only a read operation, then it is a shared lock. Allowing more than one transaction
to write on the same data item would lead the database into an inconsistent state. Read locks are shared
because no update is performed on a data item that holds a read lock.
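The shared/exclusive rule described above can be illustrated with a small lock table. This is a minimal sketch, not a real lock manager: the class names `ItemLock` and `LockTable` are hypothetical, requests that cannot be granted simply return `False` instead of waiting, and lock upgrades are not handled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

@dataclass
class ItemLock:
    mode: Optional[str] = None                       # "S", "X", or None when unlocked
    holders: Set[str] = field(default_factory=set)   # transactions holding the lock

class LockTable:
    def __init__(self) -> None:
        self._locks: Dict[str, ItemLock] = {}

    def acquire(self, txn: str, item: str, mode: str) -> bool:
        """Try to lock `item` in mode "S" (shared) or "X" (exclusive)."""
        if mode not in ("S", "X"):
            raise ValueError("mode must be 'S' or 'X'")
        lock = self._locks.setdefault(item, ItemLock())
        if not lock.holders:                  # item is free: grant any mode
            lock.mode = mode
            lock.holders.add(txn)
            return True
        if mode == "S" and lock.mode == "S":  # readers can share the item
            lock.holders.add(txn)
            return True
        return False                          # every other combination conflicts

    def release(self, txn: str, item: str) -> None:
        lock = self._locks.get(item)
        if lock is not None and txn in lock.holders:
            lock.holders.discard(txn)
            if not lock.holders:
                lock.mode = None

table = LockTable()
print(table.acquire("T1", "A", "S"))  # first reader
print(table.acquire("T2", "A", "S"))  # second reader shares the lock
print(table.acquire("T3", "A", "X"))  # writer conflicts with the readers
```

Only the S/S combination is compatible; any request involving an exclusive lock conflicts with every other holder.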
There are two options for handling the locks on data items with a lock manager, as given below:
Single lock manager approach (Binary locks: locked or unlocked)
• In this approach, the system maintains a single lock manager that resides on a single chosen site, say Si.
• All lock and unlock requests are made at site Si.
• When a transaction needs to lock a data item, it sends a lock request to Si,
and the lock manager determines whether the lock can be granted or not.
• If the data item is available, the lock manager sends a message to the site which initiated the request;
if it is not available, the request is delayed until it can be granted.
• The transaction can read the data item from any one of the sites at which a replica of the data item
resides.
• In case of a write, all the sites where a replica of the data item resides must be involved in the writing.
Advantages
It is easy to implement, as it requires two messages for handling lock requests and unlock requests, and
deadlock handling is easy: as all lock and unlock requests are made at one server, the deadlock-handling
algorithms can be applied directly.
Disadvantages
Advantage
The work is distributed, which makes the system robust to failures.
Disadvantage
Deadlock detection and recovery is more complicated.
Sr.No | T1        | T2
1     | lock-S(A) |
2     |           | lock-S(A)
3     | lock-X(B) |
4     | ...       | ...
5     | Unlock(A) |
6     |           | Lock-X(C)
7     | Unlock(B) |
8     |           | Unlock(A)
9     |           | Unlock(C)
o Strict 2-PL
o Rigorous 2-PL
o Conservative 2-PL
These variants add extra restrictions on top of Basic 2-PL.
Strict 2-PL
This requires that, in addition to the locking being 2-Phase, all Exclusive(X) locks held by the transaction
not be released until after the Transaction Commits.
Following Strict 2-PL ensures that our schedule is :
o Recoverable
o Cascadeless
Hence, it gives us freedom from Cascading Aborts, which were still possible in Basic 2-PL, and moreover
guarantees Strict Schedules. However, Deadlocks are still possible!
Rigorous 2-PL
This requires that, in addition to the locking being 2-Phase, all Exclusive(X) and Shared(S) locks held by the
transaction not be released until after the Transaction Commits.
Following Rigorous 2-PL ensures that our schedule is :
o Recoverable
o Cascadeless
Conservative 2-PL
This protocol requires the transaction to lock all the items it accesses before the Transaction begins execution
by predeclaring its read-set and write-set.
If any of the predeclared items needed cannot be locked, the transaction does not lock any of the items;
instead, it waits until all the items are available for locking.
It is difficult to use in practice because of the need to predeclare the read-set and the write-set, which is not
possible in many situations.
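The difference between Basic and Strict 2-PL can be made concrete by checking a single transaction's sequence of lock operations. The sketch below is illustrative only: the operation encoding (`"lock-S"`, `"lock-X"`, `"unlock"`, `"commit"`) and the function names are assumptions made for this example.

```python
from typing import List, Optional, Tuple

Op = Tuple[str, Optional[str]]  # (action, data item); item is None for "commit"

def is_two_phase(ops: List[Op]) -> bool:
    """Basic 2-PL: no lock may be acquired after the first unlock."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True              # the shrinking phase has started
        elif action.startswith("lock") and shrinking:
            return False                  # acquiring a lock while shrinking
    return True

def is_strict(ops: List[Op]) -> bool:
    """Strict 2-PL: two-phase, and no exclusive lock released before commit."""
    exclusive = set()
    committed = False
    for action, item in ops:
        if action == "lock-X":
            exclusive.add(item)
        elif action == "commit":
            committed = True
        elif action == "unlock" and item in exclusive and not committed:
            return False
    return is_two_phase(ops)

basic = [("lock-S", "A"), ("lock-X", "B"), ("unlock", "A"), ("unlock", "B")]
strict = [("lock-S", "A"), ("lock-X", "B"), ("unlock", "A"),
          ("commit", None), ("unlock", "B")]
print(is_two_phase(basic), is_strict(basic))
print(is_two_phase(strict), is_strict(strict))
```

Rigorous 2-PL would apply the same commit check to shared locks as well, and Conservative 2-PL would require every lock request to appear before the first data access.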
In this type of two-phase locking mechanism, lock managers are distributed to all sites. They are responsible
for managing locks for data at that site. If no data is replicated, it is equivalent to primary copy 2PL.
In this approach, there are a number of lock managers, where each lock manager controls locks of data items
stored at its local site. The location of the lock manager is based upon data distribution and replication.
The basic principle of distributed two-phase locking is the same as the basic two-phase locking protocol.
However, in a distributed system there are sites designated as lock managers.
A lock manager controls lock acquisition requests from transaction monitors. In order to enforce
co-ordination between the lock managers in various sites, at least one site is given the authority to see all
transactions and detect lock conflicts.
• Depending upon the number of sites that can detect lock conflicts, distributed two-phase locking
approaches can be of three types :
1. Centralized two-phase locking
For every data item, two timestamps are maintained, as listed below:
• Read timestamp : Timestamp of the youngest transaction which has performed a read operation on the data
item.
• Write timestamp : Timestamp of the youngest transaction which has performed a write operation on the
data item.
• The Timestamp Ordering Protocol is used to order the transactions based on their timestamps. The order of
transactions is simply the ascending order of transaction creation.
• The priority of the older transaction is higher, which is why it executes first. To determine the timestamp of the
transaction, this protocol uses the system time or a logical counter.
• The lock-based protocol is used to manage the order between conflicting pairs among transactions at
execution time, whereas timestamp-based protocols start working as soon as a transaction is created.
Let's assume there are two transactions T1 and T2. Suppose transaction T1 entered the system at time 007
and transaction T2 entered the system at time 009. T1 has the higher priority, so it executes first,
as it entered the system first.
The timestamp ordering protocol also maintains the timestamps of the last 'read' and 'write' operations on a data item.
Basic Timestamp ordering protocol works as given below :
1. Check the following condition whenever a transaction Ti issues a Read(X) operation:
o If TS(Ti) < W_TS(X) then the operation is rejected and Ti is rolled back; otherwise the operation is
executed.
Where,
TS(Ti) denotes the timestamp of the transaction Ti.
R_TS(X) denotes the most recent Read timestamp of data item X.
W_TS(X) denotes the most recent Write timestamp of data item X.
This rule states that if TS(Ti) < W_TS(X), then the operation is rejected and Ti is rolled back.
Timestamp ordering rules can be modified to make the schedule view serializable: instead of rolling Ti back,
the 'write' operation itself is ignored.
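The checks above can be sketched in a few lines of Python. The read rule follows the text directly. The write rule is not spelled out in the text, so the sketch assumes the standard textbook form (reject a write if a younger transaction has already read or written the item). The `thomas` flag models the modification in which an obsolete write is ignored instead of causing a rollback, commonly known as Thomas' write rule. The class and function names are illustrative.

```python
class Item:
    """A data item X with its read and write timestamps."""
    def __init__(self) -> None:
        self.r_ts = 0  # R_TS(X): timestamp of the youngest reader so far
        self.w_ts = 0  # W_TS(X): timestamp of the youngest writer so far

def read(ts: int, x: Item) -> str:
    # Read rule from the text: reject if a younger transaction already wrote X.
    if ts < x.w_ts:
        return "rollback"
    x.r_ts = max(x.r_ts, ts)
    return "ok"

def write(ts: int, x: Item, thomas: bool = False) -> str:
    # Assumed standard write rule: reject if a younger transaction already read X.
    if ts < x.r_ts:
        return "rollback"
    # A younger transaction already wrote X: roll back, or skip the obsolete
    # write when the Thomas-style modification is enabled.
    if ts < x.w_ts:
        return "ignore" if thomas else "rollback"
    x.w_ts = ts
    return "ok"

x = Item()
print(write(9, x))               # T2 (timestamp 9) writes X
print(read(7, x))                # T1 (timestamp 7) arrives too late to read X
print(write(7, x, thomas=True))  # T1's obsolete write is skipped
```

The timestamps 7 and 9 mirror the T1/T2 example in the text; with the modification enabled, T1's late write is dropped while T1 itself keeps running.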
1. Transaction Failure : This is the condition where a transaction cannot continue its execution.
This type of failure affects only a few tables or processes. The failure can be because of logical errors in the
code or because of system errors like deadlock or unavailability of the system resources needed to execute the
transaction.
2. System Crash : This can be because of hardware or software failure or because of external factors like
power failure. In most cases, data in secondary memory is not affected by this crash. This
is because the database has many integrity checkpoints to prevent data loss from secondary memory.
3. Disk Failure : These are issues with hard disks, like formation of bad sectors, disk head crash,
unavailability of disk, etc.
Distributed concurrency control and recovery techniques must deal with these and other problems. Some of
the techniques used to deal with recovery and concurrency control in DDBMSs are given below.
To deal with replicated data items in a distributed database, a number of concurrency control methods
have been proposed that extend the concurrency control techniques for centralized databases.
The idea is to designate a particular copy of each data item as a distinguished copy. The locks for this
data item are associated with the distinguished copy, and all locking and unlocking requests are sent to
the site that contains that copy.
A number of different methods are based on this idea, but they differ in their method of choosing the
distinguished copies. In the primary site technique, all distinguished copies are kept at the same site.
A modification of this approach is the primary site with a backup site. Another approach is the primary
copy method, where the distinguished copies of the various data items can be stored in different sites. A
site that includes a distinguished copy of a data item basically acts as the coordinator site for
concurrency control on that item.
In this method a single primary site is designated to be the coordinator site for all database items and
all locks are kept at that site, and all requests for locking or unlocking are sent there.
This method is an extension of the centralized locking approach.
For example, if all transactions follow the two-phase locking protocol, serializability is guaranteed. The
advantage of this approach is that it is a simple extension of the centralized approach and thus is not
overly complex. However, it has certain inherent disadvantages.
One is that all locking requests are sent to a single site, possibly overloading that site and causing a
system bottleneck. A second disadvantage is that failure of the primary site paralyzes the system, since
all locking information is kept at that site. This can limit system reliability and availability.
Although all locks are accessed at the primary site, the items themselves can be accessed at any site at
which they reside. For example, once a transaction obtains a Read_lock on a data item from the primary
site, it can access any copy of that data item. However, once a transaction obtains a Write_lock and
updates a data item, the DDBMS is responsible for updating all copies of the data item before releasing
the lock.
The concurrency control methods described so far for replicated items rely on a distinguished copy that
maintains the locks for that item.
In the voting method, there is no distinguished copy; instead, a lock request is sent to all sites that include a
copy of the data item.
Each copy maintains its own lock and can grant or deny the request for it. If a transaction that requests a
lock is granted that lock by a majority of the copies, it holds the lock and informs all copies that it has
been granted the lock. If a transaction does not receive a majority of votes granting it a lock within a
certain time-out period, it cancels its request and informs all sites of the cancellation.
The voting method is considered a truly distributed concurrency control method, since the responsibility
for a decision resides with all the sites involved. The voting method has higher message traffic among sites than
the distinguished copy methods. If any site failure occurs during the voting, the algorithm becomes extremely
complex.
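The majority decision in the voting method can be sketched as follows. This is a simplified illustration: the function name and the encoding of replies (`True` for grant, `False` for deny, `None` for no reply before the time-out) are assumptions for the example, and no real messaging is performed.

```python
from typing import Dict, Optional

def request_lock_by_voting(votes: Dict[str, Optional[bool]]) -> str:
    """Decide a lock request from the replies of all sites holding a copy.

    votes maps a site name to True (grant), False (deny), or None
    (no reply received within the time-out period).
    """
    granted = sum(1 for reply in votes.values() if reply is True)
    # A strict majority of ALL copies is needed, so silent sites count against.
    if 2 * granted > len(votes):
        return "granted"    # the transaction informs all copies it holds the lock
    return "cancelled"      # the transaction informs all sites of the cancellation

print(request_lock_by_voting({"S1": True, "S2": True, "S3": False}))
print(request_lock_by_voting({"S1": True, "S2": None, "S3": False}))
```

Because any two majorities must overlap in at least one copy, two conflicting requests cannot both be granted.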
Distributed Recovery
In some cases it is quite difficult even to determine whether a site is down without exchanging
numerous messages with other sites. For example, suppose that site X sends a message to site Y and
expects a response from Y but does not receive it. There are several possible explanations:
(a) The message was not delivered to Y because of communication failure.
(b) Site Y is down and could not respond.
(c) Site Y is running and sent a response, but the response was not delivered.
Without additional information or the sending of additional messages, it is difficult to determine what
actually happened.
Another problem with distributed recovery is distributed commit. When a transaction is updating data at
several sites, it cannot commit until it is sure that the effect of the transaction on every site cannot be
lost. This means that every site must first have recorded the local effects of the transactions permanently
in the local site log on disk.
The two-phase commit protocol is often used to ensure the correctness of distributed commit.
MU - May 14
Assume that there is a set of grocery stores, and the head of all the stores wants to query the available rice
inventory at the connected stores in order to move inventory from store to store and balance the quantity of
rice inventory across all stores.
• The task is performed by a single transaction T that has a component Ti at the i-th store and a component T0
at the store S0 where the manager is located. The following sequence of activities is performed by T:
a) Component T0 of transaction T is created at the head site (head office).
b) T0 sends messages to all the stores to order them to create components Ti.
c) Every Ti executes a query at store i to discover the quantity of available rice inventory and reports
this number to T0.
d) Each store receives instructions, updates its inventory level and makes shipments to other stores where
required.
But there are some problems that we can face during the execution of the above process:
1) The atomicity property of the transaction may be violated, because a store (Si) may be instructed twice to
send the inventory, which may leave the database in an inconsistent state.
To ensure the atomicity property, transaction T must either commit at all the sites, or it must abort at all
sites.
2) However, the system at store i may crash, and the instructions from T0 may never be received by Ti
because of a network issue or some other reason.
• The distributed two-phase commit protocol solves the above problems faced during execution of a
distributed transaction.
• There are two phases :
A. Phase 1: Prepare Phase
B. Phase 2: Commit/Abort Phase
[Figure: Two-phase commit message flow between the transaction coordinator and a participant: Prepare (vote request) in the prepare phase, followed by the decision in the commit phase]
• After the controlling site has received the first "Not Ready" message from any participant :
o The controlling site sends a "Global Abort" message to the participants.
o The participants abort the transaction and send an "Abort ACK" message to the controlling site.
o When the controlling site receives the "Abort ACK" message from all the participants, it considers the
transaction as aborted. For better understanding consider the following scenario of 2PC protocol.
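The coordinator's decision logic can be sketched as a small simulation. The "Not Ready", "Global Abort" and "Abort ACK" messages come from the text; the "Prepare", "Ready", "Global Commit" and "Commit ACK" names are assumed here as their commit-path counterparts. For simplicity the sketch collects all votes before deciding, whereas the text lets the controlling site abort as soon as the first "Not Ready" arrives; failures and time-outs are not modelled.

```python
from typing import Dict, List, Tuple

Message = Tuple[str, str, str]  # (sender, receiver, message text)

def two_phase_commit(votes: Dict[str, str]) -> Tuple[str, List[Message]]:
    """Simulate one 2PC round. votes maps participant -> "Ready" / "Not Ready"."""
    log: List[Message] = []

    # Phase 1 (prepare): the coordinator asks every participant to vote.
    for p in votes:
        log.append(("coordinator", p, "Prepare"))
    for p, vote in votes.items():
        log.append((p, "coordinator", vote))

    # Phase 2 (commit/abort): commit only if every participant voted "Ready".
    all_ready = all(vote == "Ready" for vote in votes.values())
    decision = "Global Commit" if all_ready else "Global Abort"
    ack = "Commit ACK" if all_ready else "Abort ACK"
    for p in votes:
        log.append(("coordinator", p, decision))
    for p in votes:
        log.append((p, "coordinator", ack))
    return decision, log

decision, log = two_phase_commit({"S1": "Ready", "S2": "Not Ready"})
print(decision)
for sender, receiver, text in log:
    print(f"{sender} -> {receiver}: {text}")
```

A single "Not Ready" vote is enough to abort the transaction at every site, which is what preserves atomicity across sites.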
[Figure: 2PC scenario showing Phase ONE and Phase TWO between the transaction coordinator and a participant]
Fig. 2.5.3 : Distributed three phase commit protocol
• 3PC is an extension of 2PC in which the commit phase is divided into two parts, by adding a
prepare-to-commit phase, to improve fault tolerance.
The working of the protocol is as given below:
• In the second phase, which acts as the prepare-to-commit stage, the coordinator sends a prepare
message to the participants from the first phase.
• In this phase, the coordinator essentially asks the participants whether they are prepared to commit and, if
they are not, the commit is aborted.
• If the coordinator succeeds in the second phase, it will move on to the decision phase. Once the
coordinator receives a yes from all participants stating that they are prepared to commit, the coordinator
will send out a commit message. Then the participants will commit the specified transaction.
• If the coordinator receives a negative message while in a voting state, times out or fails, then it will
automatically abort the transaction. In this case, the coordinator will send an abort message to all
participants and all participants will execute the abort operation on the transaction.
2.7 MULTIPLE CHOICE QUESTIONS
Q. 2.1 Which of the following is NOT a step of query decomposition layer in distributed query
processing?
(a) Normalized query is analyzed semantically
(b) Semantically correct query is simplified
(c) Simplified calculus query is restructured as an algebraic query
(d) The algebraic query is executed by the local sites    Ans. : (d)
Q. 2.2 Let us assume that in 2PC protocol a transaction coordinator failed after a decision is taken (to
abort/commit) and shared among the participating sites. What should the coordinator do during restart
(recovery)?
(a) Abort the transaction in any case
(b) Commit the transaction in any case
(c) Commit/abort only if received all acknowledgements from participating sites

(d) It will be granted as soon as it is released by A    Ans. : (c)
CHAPTER 3 Data Interoperability - XML and JSON
XML Databases : Document Type Definition, XML Schema, Querying and Transformation: XPath and XQuery.
Basic JSON syntax (JavaScript Object Notation), JSON data types, Stringifying and parsing the JSON for sending
& receiving, JSON Object retrieval using key-value pair and JQuery, XML Vs JSON.
3.1 XML Databases
GQ. Explain XML Based databases. OR What is DTD ? Explain with example.
3.1.1 Building Blocks of XML File with respect to DTD
GQ. Explain XML Schema in details.
3.1.2 Querying and Transformation: XPATH and XQuery
GQ. Explain the XPATH and XQuery.
GQ. Explain Data retrieval from XML Using XQuery.
GQ. What is XPATH and its uses ?
3.2 Basic JSON (JavaScript Object Notation) Syntax
3.2.1 JSON Data Types
3.2.2 What is a JSON Object ?
3.2.3 JSON Arrays
3.2.4 Parsing JSON Data in JavaScript
GQ. Explain the parsing and stringifying function with respect to JSON retrieval.
3.2.5 Stringifying and Parsing the JSON for Sending and Receiving
3.2.6 Applications of JSON
3.3
GQ. Differentiate between XML and JSON.
3.4 Multiple Choice Questions
Chapter Ends
GQ. Explain XML Based databases.
XML is a software- and hardware-independent technique for storing and transporting data between
applications.
<!DOCTYPE email
• #PCDATA specifies Parsed Character Data: text data that will be parsed by the parser.
CDATA specifies character data: text that will not be parsed by the parser.
1. Elements
An element can have any number of unique attributes.
• Attributes give more information about the XML element or, more precisely, define a property of the
element.
<PhoneNo>23456789</PhoneNo>
</email>
• For the above XML file we will see how to write an XML Schema or XSD.
XML Schema Document
<?xml version = "1.0" encoding = "UTF-8"?>
<xs:complexType>
<xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
• The number of applications that use XML to exchange, mediate and store information is increasing
day by day, so querying tools for effective management of XML data are becoming very important.
• Tools for querying and transforming XML data are especially important for extracting information from
enormous amounts of XML data and converting data across different XML schemas. A relational query's
output can be a relation, and an XML query's output can be an XML document. Querying and transformation
can thus be merged into a single tool.
A. XPATH
• The query language XPath is used to navigate around an XML document. It is typically used to find
matching patterns for specific elements or attributes. It is an official recommendation of the W3C (World Wide Web
Consortium).
• It is used to explore an XML document's elements and attributes. XPath includes a number of expressions
that can be used to extract information from an XML document.
Basic components of XPATH
2. Path Expressions : XPath provides powerful path expressions to select nodes or lists of nodes in XML
documents.
3. Standard Function : String values, numeric values, date and time comparison, node and sequence
manipulation, Boolean values, and other basic functions are all available in XPath.
4. XSLT : XPath is one of the major elements in the XSLT standard and is used to transform XML documents
into various other types of document.
• An XPath expression generally defines a pattern in order to select a set of nodes. These patterns are used by
XSLT to perform transformations or by XPointer for addressing purposes.
• The XPath specification specifies seven types of nodes which can be the output of execution of an XPath
expression :
o Root
o Element
o Text
o Attribute
o Comment
o Processing Instruction
o Namespace
• XPath uses a path expression to select a node or a list of nodes from an XML document.
List of useful paths and expressions to select any node / list of nodes
1. Node name: It is useful in selecting all the nodes with the given node name.
2. /: It is used to start the selection right from the root node.
3. //: It is used to show that the selection starts from the current node that matches the selection.
4. .: It is used to select the current node.
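These path expressions can be tried out with Python's standard `xml.etree.ElementTree` module, which supports a limited subset of XPath. The XML below is a hypothetical two-student sample modelled on the student records used in this section (the `rollno` value 102 and the reduced set of child elements are assumptions); all paths are written relative to the root element, since ElementTree does not support absolute paths on elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, modelled on the student records in this section.
XML = """
<class>
  <student rollno="102">
    <firstname>Yogini</firstname>
    <marks>65</marks>
  </student>
  <student rollno="103">
    <firstname>Rushi</firstname>
    <marks>90</marks>
  </student>
</class>
"""

root = ET.fromstring(XML)  # root is the <class> element

current = root.find(".")                                # ".": the current node
students = root.findall("student")                      # node name: matching children
all_marks = [m.text for m in root.findall(".//marks")]  # "//": search all descendants
rushi = root.find("student[@rollno='103']/firstname")   # attribute predicate

print(current.tag)
print(len(students))
print(all_marks)
print(rushi.text)
```

A full XPath engine, such as the one used by an XSLT processor, supports many more expressions and functions than this subset.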
• Below is an example with a sample XML document, student_info.xml, and its style sheet
document, student_design.xsl, which uses XPath expressions in the select attribute of various XSL tags
to get the values of roll no, firstname, lastname, nickname and marks of each student node.
Student_info.xml
<lastname> Verma</lastname>
<nickname> Yogini </nickname>
<marks>65</marks>
</student>
<student rollno = "103">
<firstname> Rushi</firstname>
<lastname>Sing </lastname>
<nickname> Rushi</nickname>
<marks>90</marks>
</student>
</class>
Student_design.xsl
<?xml version = "1.0" encoding = "UTF-8"?>
<xsl:stylesheet version = "1.0"
xmlns:xsl = "http://www.w3.org/1999/XSL/Transform">
<xsl:template match = "/">
<html>
<body>
<h2>Student information</h2>
<table border = "1">
<tr bgcolor = "green">
<th> Roll No</th>
<th> First Name</th>
<th>Last Name</th>
<th> Nick Name</th>
<th> Marks</th>
</tr>
<xsl:for-each select = "class/student">
<tr>
* Here is how the data fetched from the XML file using XPath expressions and formatted with the XSL style sheet looks.
B. XQuery
* XQuery is to XML what SQL is to databases. Just as SQL is designed to query a database as per the requirements, XQuery does the same for XML.
* XQuery is a functional query language, created with the intention of querying XML data, that may be used to get data from XML files.
* Use XQuery to take data from multiple databases, from XML files, from remote Web documents, even from CGI scripts, and to produce XML results that you can process with XSLT.
* Both hierarchical and tabular data can be obtained with XQuery. Tree and graph structures can be queried with XQuery. XQuery may be used to query webpages directly, to create webpages, and to transform XML files.
* For example, suppose we have book data in the Books.xml file and we need to find the books whose price is above 40. We can write the XQuery in a file with the extension .xqy.
Books.xml
...
</book>
<book category="XML">
<title lang="english">Complete XML</title>
<author>Robert</author>
<author>Peter</author>
<year>2013</year>
<price>50.00</price>
</book>
<book category="XML">
...
</books>
Book.xqy
for $x in doc("Books.xml")/books/book
where $x/price>40
return $x/title
* In the above example we have seen the retrieval of XML data with XQuery, and you may notice that it works for XML in the same way that SQL works for databases.
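The for / where / return clauses of the query above map loosely onto iterating, filtering and projecting a collection. The following JavaScript sketch is only an analogy, not an XQuery engine: the `books` array is a hypothetical in-memory stand-in for the book elements of Books.xml, and the second entry is invented for illustration.

```javascript
// Hypothetical in-memory stand-in for the <book> elements of Books.xml.
const books = [
  { title: "Complete XML", price: 50.0 },  // taken from the sample document
  { title: "Pocket Guide", price: 35.0 },  // invented entry, for illustration only
];

// for $x in doc("Books.xml")/books/book   -> iterate over the array
// where $x/price > 40                     -> filter()
// return $x/title                         -> map()
const titles = books
  .filter((x) => x.price > 40)
  .map((x) => x.title);

console.log(titles);
```

Only the books that satisfy the where condition survive the filter step, and the return clause decides which part of each surviving item is emitted.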
* Let's see one more example, retrieving XML data elements from the Product.xml file using XQuery.
Product.xml
<Prod_catalog>
<product dept="WMN">
<number>557</number>
<name language="en"> Fleece Pullover</name>
<colorChoices> navy black</colorChoices>
</product>
<product dept="ACC">
<number>563</number>
<name language="en"> Floppy Sun Hat</name>
</product>
<product dept="ACC">
<number>443 </number>
<product dept="MEN">
<number>784</number>
<name language="en"> Cotton Dress Shirt</name>
<colorChoices> white gray </colorChoices>
</Prod_catalog>
Let's write an XQuery to retrieve the elements from the XML.
Query
for $prod in doc("Product.xml")/Prod_catalog/product
where $prod/@dept = "ACC"
order by $prod/name
return $prod/name
Results
* JSON or JavaScript Object Notation is a lightweight text-based open standard designed for human-readable data interchange.
* Douglas Crockford created the JSON format, which was originally documented in RFC 4627 (since superseded by RFC 8259).
* JSON is a data representation format in which data is represented as key-value pairs. A JSON file has the extension .json.
Why we use JSON?
Straightforward syntax.
It can be parsed natively in JavaScript with JSON.parse(); older code used the eval() function, which is unsafe for untrusted input.
Easy to create and manipulate.
{
"Prod_id": "01",
"Model": "MI 10",
"Version": "5th",
"Prize": "25000"
},
{
"Prod_id": "02",
"Model": "Samsung",
"Version": "2nd",
"Prize": "30000"
}
]
}
1. String : String data is written as double-quoted Unicode text, with the backslash as the escape character.
2. Number : Numbers use the double-precision floating-point format of JavaScript. A number can be an integer, a fraction or an exponent.
3. Object : It is used to represent an unordered collection of key: value pairs.
4. Value : It can be a string, number, object, array, true, false or null.
5. Whitespace : It can be used between any pair of tokens.
A JSON object is a set of keys along with their values, without any specific order.
The keys and their values are grouped using curly braces, both opening and closing "{ }". So, in the previous example, when we were creating a JSON with a car attribute, we were actually creating a JSON car object. There are certain rules that need to be followed while creating a JSON structure; we will learn about those rules while discussing the key-value pairs.
So, in order to create a JSON, the first thing we will need is an attribute. Here, we are creating an "Employee" JSON object. Next we need to specify the properties of the object; let's assume our employee has a "First Name", "Last Name", "employee ID" and "designation". These properties of the employee are represented as "Keys" in the JSON structure.
Employee:
{
"Employee_id" : "1001",
"Firstname" : "Raghav",
"Lastname" : "Shastry",
"Designation" : "Manager"
}
Everything within the curly braces is known as the JSON Employee Object.
A basic JSON object is represented by key-value pairs. In the previous example, we used a JSON to represent employee data.
We have represented different properties of the employee: "Employee ID", "First Name", "Last Name" and "designation". Each of these "keys" has a value in the JSON. For example, "First Name" has been represented by the value "Raghav". Similarly, we have represented the other keys by using different values.
Values are represented by putting a ":" colon between them and the keys.
Values can be of any data type like String, Integer, Boolean etc.
Arrays in JSON are similar to the ones present in any programming language: an array in JSON is also an ordered collection of data. The array starts with a left square bracket "[" and ends with a right square bracket "]". The values inside the array are separated by commas. There are some basic rules that need to be followed if you are going to use an array in a JSON.
Let's have a look at a sample JSON with an array. We will use the same Employee object that we used earlier and add another property, "Technical Skills". An employee can have expertise in multiple programming languages, so in this case an array offers a better way to record multiple language expertise values.
* For Example
Employee :
{
"Employee_id" : "1001",
"Firstname" : "Raghav",
"Lastname" : "Shastry",
"Designation" : "Manager",
"Technical_Skills" : ["Java", "C", "C++", ".net"]
}
Rules for using Arrays in JSON
* An array in JSON will start with a left square bracket and will end with a right square bracket.
* Values inside the array will be separated by a comma.
Objects, key-value pairs, and arrays are the different components of JSON. These can be used together to record any data in a JSON.
* The JSON.parse() method in JavaScript makes it simple to parse JSON data from the web server. This method parses a JSON string and creates a JavaScript value or object from it. A syntax error will occur if the provided string is not valid JSON.
* Let's see one example. Suppose we have received a JSON encoded string such as,
{"name": "Amol", "age": 22, "country": "India"}
* Let's convert this JSON encoded string into a JavaScript object as below,
var json = '{"name": "Amol", "age": 22, "country": "India"}';
var obj = JSON.parse(json); // obj.name is "Amol"
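Because JSON.parse() throws a SyntaxError on invalid input, data arriving from a server is often parsed inside a try/catch. The sketch below is one possible way to do this; the helper name `parseJsonSafely` and the shape of its return value are assumptions made for illustration, not part of any standard API.

```javascript
// Hypothetical helper: wraps JSON.parse() so that invalid JSON is reported
// as a value instead of an uncaught exception.
function parseJsonSafely(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    if (err instanceof SyntaxError) {
      return { ok: false, error: err.message };
    }
    throw err; // anything else is unexpected, so re-throw it
  }
}

// Valid JSON: the string becomes a JavaScript object.
const good = parseJsonSafely('{"name": "Amol", "age": 22, "country": "India"}');

// Invalid JSON: the closing quote after Amol is missing.
const bad = parseJsonSafely('{"name": "Amol!, "age": 22}');

console.log(good.ok, good.value.name);
console.log(bad.ok);
```

Returning a small result object keeps the calling code free of exception handling; throwing a custom error instead would be an equally reasonable design.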
* The most common use of JSON is to exchange information between client and server. At the sending and receiving ends the data is held in object form, but while it travels over the network it must be in string format.
* It is possible to create a single JSON object or an array of objects using JavaScript, as per the requirement. So the conversion of object to string and of string to object is required at the respective ends.
* Two main functions are useful in this conversion, and both are built into JavaScript:
1. JSON.parse()
2. JSON.stringify()
* Parsing : The data that we receive from a web server is always a string. We use JSON.parse() to parse the data and convert it to a JavaScript object. Before it is parsed it is only a string, some text, and you cannot access the data embedded in it. After parsing it becomes a JavaScript object, and you can access the data.
* Suppose the data received from the web server is,
{"name":"Amey", "age":35, "city":"Pune"}
* Then it can be parsed, so that it gets converted into a JavaScript object, as,
var obj = JSON.parse('{"name":"Amey", "age":35, "city":"Pune"}');
* Stringify : A JavaScript object is converted to a JSON string using JSON.stringify(). The data that is sent to a web server must be a string, so the JSON.stringify() method is used to convert a JavaScript object to a string before sending.
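The two conversions can be seen together in a short round trip. This is a minimal sketch that reuses the sample values from the parsing example; the variable names are chosen for illustration.

```javascript
// A JavaScript object as it exists in the program before sending.
const student = { name: "Amey", age: 35, city: "Pune" };

// Object -> string: this is the form that can be sent to a web server.
const payload = JSON.stringify(student);

// String -> object: what the receiving side does with the text.
const restored = JSON.parse(payload);

// The optional third argument of JSON.stringify() controls indentation.
const pretty = JSON.stringify(student, null, 2);

console.log(payload);
console.log(restored.city);
console.log(pretty);
```

The compact form is what normally travels over the network; the indented form is mainly useful for logs and for files that people read.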
JSON refers to JavaScript Object Notation. We use JSON to transfer data from server to client and from client to server.
* Now that we have seen JSON and its data formats, we can define JSON data as key-value pairs as,
var Stud_Name = {"name":"Amey", "age":35, "city":"Pune"};
This is how we can define a JSON data object in terms of key-value pairs. In the example above the keys name, age and city are unique, and each has an associated value. Let's discuss how we can retrieve the JSON data using JQuery.
* First we can find a value in the JSON object using its key. Let's see how,
var Name_of_Student = Stud_Name.name;
var Age_of_Student = Stud_Name.age;
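Key-based retrieval needs no library: once the data is a JavaScript object, plain property access works, and jQuery's `$.each()` is one optional way to loop over the pairs. The sketch below uses only core JavaScript; the variable names other than `Stud_Name` are assumptions for illustration.

```javascript
// Same sample object as in the text.
const Stud_Name = { name: "Amey", age: 35, city: "Pune" };

// Dot notation: the key is known when the code is written.
const nameOfStudent = Stud_Name.name;

// Bracket notation: the key is held in a variable at run time.
const key = "city";
const cityOfStudent = Stud_Name[key];

// Walk over every key-value pair of the object.
const pairs = Object.entries(Stud_Name).map(([k, v]) => `${k}=${v}`);

console.log(nameOfStudent, cityOfStudent);
console.log(pairs);
```

Bracket notation is the form to use when key names come from user input or from another JSON document.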
3.2.6 Applications of JSON
* Helps you to transfer data from a server.
* The JSON file format helps to transmit and serialize all types of structured data.
* Allows you to perform asynchronous data calls without the need to do a page refresh.
3.3 XML VS JSON
Parameter | XML | JSON
Tags | XML has user defined tags. | JSON does not use start and end tags.
Arrays | XML does not use arrays. | JSON can have an array of JSON objects.
Parsing | XML requires DOM parsing. | JSON is parsed into ready to use JavaScript objects.
3.4 MULTIPLE CHOICE QUESTIONS
Q. 3.1 XML stands for ..........
(a) Extensible Markup Language
(b) Excessive Markup Language
(c) Executive Markup Language
(d) Extensible Managing Language    Ans. : (a)
Q. 3.2 The XML format has a simpler set of .......... than HTML.
(a) loader rule (b) parsing rules
(c) generator rule (d) logical rule    Ans. : (b)
Q. 3.3 All information in XML is ..........
(a) Unicode text (b) multi code
(c) multi text (d) simple text    Ans. : (a)
Q. 3.4 In XML the attribute value must always be quoted with ..........
(a) double quotes (b) single quotes
(c) both a and b (d) name of attributes    Ans. : (c)
Q. 3.5 Which of the following isn't a JSON type?
(a) String (b) Object
(c) Date (d) Array    Ans. : (c)
Q. 3.6 What is the purpose of method JSON.parse()?
(a) Parses a string from JSON to JSON2
(b) Parses a string to integer
(c) Parses a string to JSON
(d) Parses integer to string    Ans. : (c)
Q. 3.7 What is JSON return?
(a) json.loads() takes in a string and returns a json object. json.dumps() takes in a json object and returns a string.
(b) json.loads() takes in a json object and returns a json object. json.dumps() takes in a json object and returns a string.
(c) json.loads() takes in a string and returns a json object. json.dumps() takes in a string and returns a string.
(d) None of these    Ans. : (a)
Q. 3.8 Which is the correct format of writing a JSON name/value pair?
(a) "name" : "value" (b) name = 'value'
(c) name = "value" (d) 'name' : 'value'    Ans. : (a)
Q. 3.9 What is a JSONStringer used for?
(a) It is used to quickly create JSON text.
(b) It is used to create number strings in JSON.
(c) It quickly converts JSON to Java strings.
(d) It is used to create JSON ordered pairs.    Ans. : (a)
Q. 3.10 .......... is a major element in the W3C's XSLT standard.
(a) XQuery (b) XPATH
(c) XPOINTER (d) XLINK    Ans. : (b)
Q. 3.11 XPath is used to navigate through ..........
(a) elements and attributes
(b) files
(c) different pages
(d) none of these    Ans. : (a)
Q. 3.22 XPath is a major element in ..........
(a) XSLT (b) XSL
(c) XML (d) XHTML    Ans. : (a)
Q. 3.23 JSON name/value pair is written as
(a) 'name' : 'value' (b) name = 'value'
(c) name = "value" (d) "name" : "value"    Ans. : (d)
Q. 3.24 In the below notation, Employee is of type
{ "Employee": [ "Amy", "Bob", "John" ] }
(a) Not a valid JSON string
(b) Array (c) Class (d) Object    Ans. : (b)
Q. 3.25 Which of the following is not a JSON type?
(a) Object (b) Date
(c) Array (d) String    Ans. : (b)
Q. 3.26 What is the value of obj in the following code?
var obj = JSON.parse('{"fruit": "Apple"}', function(k, v) { if (v == "Apple") return "Orange"; else return v; });
(a) {"fruit": "Apple"} (b) {"fruit": "Orange"}
(c) {"Orange"} (d) {"Apple"}    Ans. : (b)
Q. 3.27 What is the value of json in the following code?
var obj = { fruit: 'apple', toJSON: function () { return 'orange'; } }; var json = JSON.stringify({x: obj});
(a) {"x":"orange"} (b) {"fruit":"apple"}
(c) {"y":"apple"} (d) {"fruit":"orange"}    Ans. : (a)
Q. 3.28 What is used by the JSONObject and JSONArray constructors to parse JSON source strings?
(a) JSONTokener (b) JSONParser
(c) JParser (d) ParserJ    Ans. : (a)
Q. 3.29 Which statement about the space parameter in JSON.stringify() is false?
(a) It controls spacing in the resulting JSON string
(b) It removes whitespace
(c) It is an optional parameter
(d) All three statements are false    Ans. : (b)
Q. 3.30 What is a JSONStringer used for?
(a) It is used to quickly create JSON text.
(b) It quickly converts JSON to Java strings.
(c) It is used to create number strings in JSON.
(d) It is used to create JSON ordered pairs.    Ans. : (a)
Q. 3.31 What is the value of json in the following code?
var days = {}; days['Monday'] = true; days['Wednesday'] = true; days['Sunday'] = false; var json = JSON.stringify({x: days});
(a) {"day": {"Monday":"true", "Wednesday":"true", "Sunday":"false"}}
(d) A collection of native-value pairs, and ordered list of arrays, or values.    Ans. : (a)
Q. 3.34 Does whitespace matter in JSON?
(a) No, it will be stripped out.
(b) Yes, only within strings.
CHAPTER 4 NoSQL Distribution Model
Syllabus
NoSQL database concepts: NoSQL data modeling, Benefits of NoSQL, comparison between SQL and NoSQL database system.
Replication and sharding, Distribution Models Consistency in distributed data, CAP theorem, Notion of ACID Vs BASE, handling Transactions, consistency and eventual consistency.
Types of NoSQL databases: Key-value data store, Document database and Column Family Data store, Comparison of NoSQL databases w.r.t CAP theorem and ACID properties.
4.1 NoSQL database concepts
4.1.1 What is meant by NoSQL Databases ?
4.1.2 Why NoSQL is in existence?
4.1.3 Benefits of NoSQL databases over RDBMS
4.1.4 Challenges in using RDBMS
4.1.5 Types of NoSQL Databases
4.1.5(A) Performance Parameters of NoSQL Database Models
4.1.6 NoSQL Data Modelling
4.1.6(A) Document Oriented Databases
4.1.6(B) Graph Based Databases
4.1.6(C) Key Value Databases
4.1.6(D) Column Store Databases
4.1.7 Benefits of NoSQL
4.1.8 Comparison between SQL and NoSQL Database System
4.2 Replication and Sharding
4.2.1 What is Replication?
4.2.2 Master-Slave Replication
4.2.3 What is MongoDB Sharding ?
4.2.4 How Data is Distributed Across Shards
4.2.5 Distribution Models Consistency in Distributed Data
4.2.6 Update and Read Consistency
4.2.7 CAP Theorem
4.2.8 Notion of ACID Vs BASE
4.3 Types of NoSQL databases
4.3.1 Comparison of NoSQL Databases w.r.t CAP Theorem and ACID Properties
4.3.2 RDBMS To NoSQL Database w.r.t ACID and BASE
4.3.3 Features of NoSQL Database
4.4 Multiple Choice Questions
Chapter End
4.1 NOSQL DATABASE CONCEPTS
A NoSQL (originally referring to "non SQL" or "non relational") database is one that stores and retrieves data using methods other than the tabular relations employed in relational databases. It refers to a wide range of database technologies that were created in response to an increase in the volume of data kept on users, things, and goods, as well as the frequency with which this data is accessed, and the performance and processing requirements. NoSQL databases are typically organised as key-value pairs, graph databases, document-oriented databases, or column-oriented databases.
The term NoSQL was first used in 1998 by Carlo Strozzi for a relational database that omitted the use of SQL. The term was picked up again in 2009 and used for conferences of advocates of non-relational databases such as Last.fm developer Johan Oskarsson, who organized the NoSQL meet up in San Francisco.
4.1.1 What is meant by NoSQL Databases?
NoSQL is a non-relational database management system, and it differs from the relational database management system in many ways:
1. Increased performance
2. Higher scalability
3. Schema less
4. Dynamic
Relational databases are best suited to a limited volume of simple, structured data, but given today's data trends the traditional RDBMS has some limitations.
RDBMS assumes a well-defined structure of data and assumes that the data is largely uniform.
It needs the schema of your application and its properties (columns, types, etc.) to be defined up-front before building the application. This does not match well with the agile development approaches for highly dynamic applications.
As the data starts to grow larger, you have to scale your database vertically, i.e. by adding more capacity to the existing servers.
4.1.5(A) Performance Parameters of NoSQL Database Models
We have seen the different types of NoSQL database models above. They have some performance parameters that differentiate them from each other; let's discuss the same.
* Document-oriented databases handle a document as a whole, not as a collection of name/value pairs. This allows you to group a variety of documents into a single collection. Document databases allow you to index documents based on their attributes as well as their primary identifier. Today, there are a variety of open-source document databases available, but MongoDB and CouchDB are the most popular. MongoDB has grown into one of the most widely used NoSQL databases.
* A few NoSQL databases available in the market are MongoDB, HBase, Cassandra, Amazon SimpleDB and Hypertable.
* Document oriented databases use key-value pairs to represent the data in the database. This type of database stores records in the form of documents. For example, if you store 4 records in an SQL database (MySQL, ORACLE), it creates 4 new rows in the table into which you insert the values. Similarly, when you add 4 new records to a document oriented database such as MongoDB, it
[Figure: sample MongoDB documents, each with an _id and key-value fields]
* As already discussed for SQL based and NoSQL databases, SQL databases are schema oriented and offer features such as constraints that prevent data duplication, whereas NoSQL databases are schema less, so duplicate data and differing data formats are permitted. The schema describes how your relationships are logically associated, i.e. the logical structure of the database. NoSQL databases like MongoDB are schema less and store data in documents. Each and every document is allocated an _id, which is an ObjectId: a 12-byte value, usually written as a 24-character hexadecimal string.
* In the above image you can see the object id allocated to each document, and the structure of a MongoDB database, where values are stored as key-value pairs.
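The 12 bytes of an ObjectId are not arbitrary: the first four bytes hold the creation time as seconds since the Unix epoch. The sketch below decodes that part by hand with core JavaScript only; the hexadecimal string is a made-up sample value, not an id taken from this document, and no MongoDB driver is used.

```javascript
// Hypothetical ObjectId written as a 24-character hexadecimal string.
const objectIdHex = "507f1f77bcf86cd799439011";

// Two hexadecimal characters encode one byte, so 24 characters are 12 bytes.
const byteLength = objectIdHex.length / 2;

// The first 4 bytes (8 hex characters) are a big-endian timestamp in seconds.
const timestampSeconds = parseInt(objectIdHex.slice(0, 8), 16);

// JavaScript dates count milliseconds, hence the factor of 1000.
const createdAt = new Date(timestampSeconds * 1000);

console.log(byteLength, timestampSeconds, createdAt.toISOString());
```

Because the timestamp comes first, ids generated later compare greater than ids generated earlier, at one-second granularity.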
o Nodes: Nodes represent entities or instances such as people, businesses, accounts, or any other item to be tracked. In a relational database, they are roughly comparable to a record, relation, or row; in a document-store database, they are roughly equivalent to a document.
o Edges: Edges, also known as relationships, are the lines that connect nodes to other nodes and represent the interaction between them. Examining the links and interconnections of nodes, properties, and edges reveals meaningful patterns. There are two types of edges: directed and undirected. An edge linking two nodes in an undirected graph has only one meaning. The edges linking two different nodes in a directed graph have different meanings depending on their orientation. Edges are the most important notion in graph databases, as they provide an abstraction that can't be easily implemented in a relational or document-store paradigm.
o Properties: They are information associated with nodes. For example, if Wikipedia were one of the nodes, it might be tied to properties such as website, reference material, or words that start with the letter w, depending on which aspects of Wikipedia are germane to a given database.
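The three building blocks can be sketched with ordinary JavaScript collections. This is a toy model for illustration, not how a graph database stores data internally; the node ids, the edge types and the `neighbours` helper are all assumptions.

```javascript
// Nodes: entities, each carrying its own properties.
const nodes = new Map([
  ["wikipedia", { properties: { kind: "website", topic: "reference material" } }],
  ["alice", { properties: { kind: "person" } }],
  ["bob", { properties: { kind: "person" } }],
]);

// Edges: relationships between nodes, either directed or undirected.
const edges = [
  { from: "alice", to: "wikipedia", type: "EDITS", directed: true },
  { from: "alice", to: "bob", type: "KNOWS", directed: false },
];

// Hypothetical helper: nodes reachable from `id` in a single step.
function neighbours(id) {
  const result = [];
  for (const edge of edges) {
    if (edge.from === id) {
      result.push(edge.to);
    } else if (!edge.directed && edge.to === id) {
      result.push(edge.from); // undirected edges can be followed both ways
    }
  }
  return result;
}

console.log(neighbours("alice"));
console.log(neighbours("bob"));
console.log(neighbours("wikipedia"));
```

The directed EDITS edge can only be followed from alice to wikipedia, while the undirected KNOWS edge is visible from both ends, which is the difference in meaning described above.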
* In a key/value pair, the key is a single value in the set that may be quickly looked up to retrieve data. Key/value stores come in a variety of forms, with some keeping data in memory and others allowing it to be saved to disk. Oracle's Berkeley DB is a basic yet powerful key/value store.
* In comparison to relational databases, key-value databases operate in a totally different way.
* RDBs specify the database's data structure as a sequence of tables with fields that have well-defined data types. Exposing the data types to the database software lets it apply a variety of optimizations. Key-value systems, on the other hand, handle data as a single opaque collection, which may have different fields for each record. This provides more flexibility and adheres more closely to modern notions such as object oriented programming. Because optional values are not represented by placeholders or input parameters, as they are in most relational DBs, key-value databases frequently use far less memory to store the same data.
Key | Value
K1 | AAA,BBB,CCC
K2 | AAA,BBB
K3 | AAA,DDD
K4 | AAA,2,01/01/2015
K5 | 3,ZZZ,5623
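The table can be mimicked with JavaScript's built-in Map, which gives the same put/get-by-key behaviour. This is an in-memory sketch for illustration only; a real key-value database adds persistence, replication and so on.

```javascript
// The store knows nothing about the structure of the values: they are opaque.
const store = new Map();
store.set("K1", "AAA,BBB,CCC");
store.set("K2", "AAA,BBB");
store.set("K4", "AAA,2,01/01/2015");

// Lookup is always by key.
const value = store.get("K4");

// Interpreting the value is the application's job, not the store's.
const fields = value.split(",");

// A key that was never written simply has no entry; no placeholder is stored.
const missing = store.get("K9");

console.log(value, fields.length, missing);
```

Note that K1 and K2 hold a different number of fields, which a fixed-column table could only represent with empty placeholders.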
* Data can be stored efficiently with column-oriented storage. It avoids wasting space on nulls by simply not storing a column when no value exists for that column.
* Each data unit can be thought of as a collection of key/value pairs, with the unit itself being identifiable by a primary identifier, sometimes known as the primary key. This primary key is commonly referred to as the row-key in Bigtable and its clones.
* In a column oriented NoSQL database, data is stored in cells grouped in columns of data rather than as rows of data. The columns are grouped logically into column families, which can contain a virtually unlimited number of columns that can be created at runtime or at the definition of the schema.
* The column store databases are :
o Google's Bigtable, Cassandra,
o HBase
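The idea of row-keys, column families and sparse columns can be sketched with nested JavaScript objects. This toy model only illustrates the data layout; the `put` and `get` helpers and the sample row keys are assumptions, and it does not reflect how Bigtable, Cassandra or HBase are implemented.

```javascript
// rowKey -> { columnFamily -> { columnName -> value } }
const table = new Map();

// Hypothetical helper: write one cell, creating the row and family on demand.
function put(rowKey, family, column, value) {
  if (!table.has(rowKey)) table.set(rowKey, {});
  const row = table.get(rowKey);
  if (!row[family]) row[family] = {};
  row[family][column] = value;
}

// Hypothetical helper: read one cell; absent cells yield undefined.
function get(rowKey, family, column) {
  const row = table.get(rowKey);
  return row && row[family] ? row[family][column] : undefined;
}

put("user1", "profile", "name", "Raghav");
put("user1", "profile", "city", "Pune");
put("user2", "profile", "name", "Amey"); // no city stored for user2

console.log(get("user1", "profile", "city"));
console.log(get("user2", "profile", "city"));
```

The row for user2 holds a single column, so the missing city costs no storage at all, which is the null-handling benefit described above.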
Best suitable for Big Data : Large amounts of data are generated by each business, application, and service, and must be appropriately kept. It emphasises the concept of 'big data', which is concerned with the data industrial revolution. RDBMS, in most cases, are unable to store unstructured data, as well as many types and large amounts of data. To address 'big data' volumes, enterprises have turned to NoSQL platforms such as MongoDB or Hadoop.
Location Independence : While working with the database, users get an abstract view of the data. Irrespective of location, users can submit queries, which can be processed by any available site at any time, because the sites are replicated and kept synchronised.
Flexible and Agile Data Model : Traditional database systems, especially big production databases, are notorious for causing enormous headaches when it comes to handling changes in storage and operating design. Even minor changes must be carefully monitored in such a system. NoSQL database systems, on the other hand, have no such limits in their data storage architecture. They are adaptable to changes in data genre as well as data storage architecture, allowing for comparative agility such as the addition of new columns without significant adjustments or breakdown.
Analytics and Business Intelligence : A key strategic reason businesses move to a NoSQL database system from a Relational Database Management System is the more flexible data model found in most NoSQL databases. The relational data model is based on defined relationships between tables, which themselves are defined by a determined column structure, all of which are explicitly organized in a database schema. A NoSQL data model is often referred to as a schema less data model, and it is able to accept all kinds of data, structured, semi structured and unstructured, much more easily than a relational database, which relies on a predefined schema.
NoSQL databases are cheaper : NoSQL databases are intended to use inexpensive commodity hardware for constructing clusters of servers, which helps in managing huge volumes of data and transactions. Traditional RDBMSs, on the other hand, need expensive storage and proprietary servers; this means they have a higher cost per volume of data stored.
Parameter | SQL | NoSQL
Storage | SQL databases store information in tables. | NoSQL databases are document based, key value based, graph based or column based data stores.
Data transaction | SQL databases are better for multi row transactions. | NoSQL are better for unstructured data like documents or JSON.
Schema | These databases have fixed or static or predefined schema. | They have dynamic schema.
MongoDB is a significant member of the NoSQL movement and a leading non-relational database management system. MongoDB stores data as documents made of key-value pairs rather than in tables with fixed schemas like a relational database management system (RDBMS).
In big production contexts, it also provides a variety of horizontal scalability options. MongoDB is a NoSQL document database system that scales horizontally and uses a key-value structure to store data.
Scaling NoSQL databases to meet rising demand on your application is quite simple compared to traditional
database servers - you simply add anew server, make a few configuration modifications, and it joins to your
existing servers, enlarging the cluster. All existing databases and collections are replicated and synchronised
with the other member nodes automatically. When the full data volume of your database(s) can fit on a single
server, a replication cluster works well. A full copy of your databases will be stored on each server in your
replication cluster.
Replica Sets duplicate MongoDB data across many servers and let the database fail over automatically
in the event of a server loss. Clients can connect directly to secondary instances to scale read
workloads. Note that the older master/slave MongoDB replication is not the same as a Replica Set:
it lacks automatic failover.
4.2.2 Master-Slave Replication
With master-slave distribution, you replicate data across multiple nodes. One node is designated as the
master, or primary. This master is the authoritative source for the data and is usually responsible for
processing any updates to that data.
[Figure: master-slave replication. All updates are saved at the master node; reads can be done at the master node; changes propagate to the slaves.]
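The master-slave arrangement described above can be sketched in a few lines of JavaScript. This is only a toy in-memory model, not how MongoDB implements replication: the class names `StoreNode` and `MasterSlaveCluster` are hypothetical, propagation is immediate rather than asynchronous, and the sketch assumes that slaves may also serve reads.

```javascript
// Toy master-slave cluster: one master takes writes, every node can serve reads.
class StoreNode {
  constructor(name) {
    this.name = name;
    this.data = new Map();
  }
}

class MasterSlaveCluster {
  constructor(slaveCount) {
    this.master = new StoreNode("master");
    this.slaves = [];
    for (let i = 0; i < slaveCount; i++) {
      this.slaves.push(new StoreNode("slave-" + i));
    }
    this.readCounter = 0;
  }

  // All updates go to the master, which then copies them to each slave.
  // Real systems usually propagate asynchronously; here it is immediate.
  write(key, value) {
    this.master.data.set(key, value);
    for (const slave of this.slaves) {
      slave.data.set(key, value);
    }
  }

  // Reads rotate over the master and the slaves to spread the load (an assumption).
  read(key) {
    const nodes = [this.master, ...this.slaves];
    const node = nodes[this.readCounter % nodes.length];
    this.readCounter += 1;
    return { servedBy: node.name, value: node.data.get(key) };
  }
}

const msCluster = new MasterSlaveCluster(2);
msCluster.write("State", "Maharashtra");
const firstRead = msCluster.read("State");
const secondRead = msCluster.read("State");
console.log(firstRead, secondRead);
```

Because every write passes through a single node, the master remains the one authoritative copy, while adding slaves only increases read capacity.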
MongoDB scales by using a method known as "sharding": storing data records across multiple servers in
order to distribute the read and write load as well as the data storage needs. It is MongoDB's approach
to handling data growth. As data grows in size, a single system may not be able to store it all or
provide satisfactory read and write throughput.
(MU-New Syllabus w.e.f academic year 21-22)(M5-68) Tech-Neo Publications... SACHIN SHAH Venture
The difficulty of horizontal scaling is solved by sharding. You use sharding to increase the number of
machines available to handle data expansion and read and write operations.
Many NoSQL databases offer auto-sharding, where the database takes on the responsibility of allocating data
to shards and ensuring that data access goes to the right shard. This can make it much easier to use sharding
in an application. Sharding is particularly valuable for performance because it can improve both read and
write performance.
Using replication, particularly with caching, can greatly improve read performance but does little for
applications that have a lot of writes. Sharding provides a way to horizontally scale writes.
4.2.4 How Data Is Distributed Across Shards ?
* A collection in MongoDB is similar to a table, and documents are similar to the individual rows in a
table. In a typical database, data is partitioned using a unique key. MongoDB distributes data, or
shards, at the collection (table) level, with data partitioned using the shard key.
* The shard key is based on an indexed key that is present in each document in the collection. To separate
sharded keys, MongoDB uses either range-based partitioning or hash-based partitioning.
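The two partitioning styles named above can be compared with a small sketch. Everything here is an illustrative assumption: the shard count, the range boundaries and the `toyHash` function are made up (MongoDB uses its own hashing and chunk management), and `Marks` is assumed to be a non-negative shard key.

```javascript
// Hypothetical sketch: route a shard-key value to a shard by range or by hash.
const SHARD_COUNT = 3; // assumed cluster size

// Range-based: each shard owns a contiguous interval of shard-key values.
// The boundaries below are invented for illustration.
const ranges = [
  { shard: 0, min: 0, max: 40 },        // 0 <= key < 40
  { shard: 1, min: 40, max: 80 },       // 40 <= key < 80
  { shard: 2, min: 80, max: Infinity }, // key >= 80
];

function rangeShard(key) {
  const hit = ranges.find((r) => key >= r.min && key < r.max);
  return hit.shard;
}

// Hash-based: a hash of the key decides the shard, so neighbouring keys scatter.
// This toy hash is NOT the function MongoDB uses.
function toyHash(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // keep h an unsigned 32-bit integer
  }
  return h;
}

function hashShard(key) {
  return toyHash(key) % SHARD_COUNT;
}

const marks = [56, 78, 95, 105];
const placement = marks.map((m) => ({
  Marks: m,
  byRange: rangeShard(m),
  byHash: hashShard(m),
}));
console.log(placement);
```

Range partitioning keeps neighbouring keys on the same shard, which suits range queries; hash partitioning spreads them out, which tends to even out the write load.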
* The CAP theorem is frequently cited in the NoSQL community as a reason why consistency has to be handled
carefully and may need to be relaxed. Eric Brewer proposed it in 2000 [Brewer], and Seth Gilbert and
Nancy Lynch [Lynch and Gilbert] refined it a few years later.
* With regard to handling consistency, the basic statement of the CAP theorem is: given the three
properties of Consistency, Availability, and Partition tolerance, you can only get two. Obviously this
depends very much on how you define these three properties, and differing opinions have led to several
debates on what the real consequences of the CAP theorem are.
* A distributed system cannot be consistent, available and tolerant to network partitions at the same
instant; only two of these properties can be satisfied at a time. Since every distributed system has to
tolerate network partitions, because its communicating nodes are themselves distributed, one has to
choose between availability, where the system is always available to accept reads and writes, and
consistency, where an update operation is synchronized with all other nodes at the same time.
Fig. 4.2.2 : Three main features of a distributed system (Consistency, Availability, Partition Tolerance)
* Consistency : Consistency means that, for every transaction, all nodes see the same copy of a
replicated data item. Each node in a distributed cluster must return the same, most recent, successful
write, so every client has the same view of the data. There are many consistency models; the one
referred to in CAP is linearizability (atomic consistency), a particularly strong form of consistency.
* Availability : Every read or write request for a data item is either processed successfully or receives
an error message indicating that the operation cannot be performed. To be available, every non-failing
node on the network must respond to all read and write requests in a reasonable length of time.
* Partition Tolerance : Partition tolerance means that the system can keep running even if the network
connecting the nodes fails, resulting in two or more partitions, each with its own set of nodes that can only
communicate with one another. That is, despite network partitions, the system continues to function and
maintains its consistency guarantees. Network partitions are an unavoidable reality. Once a partition heals,
distributed systems that ensure partition tolerance can recover gracefully.
* ACID and BASE are two sets of properties that a database may be designed to provide, and they
determine how the system remains available to its users.
* The CAP theorem states that it is impossible to achieve both consistency and availability in a partition
tolerant distributed system.
* The fundamental difference between ACID and BASE database models is the way they deal with this
limitation.
Eventual Consistency
Eventual consistency is a consistency model that enables the data store to be highly available. It is also
known as optimistic replication and is key to distributed systems.
Suppose we use multiple replicas of a database to store data and a write request comes to one of
the replicas. In such a situation, the database has to find a strategy for making this write reach the
other replicas, so that they can all apply it and become consistent.
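The situation described above, where a write lands on one replica and reaches the others later, can be illustrated with a toy simulation. The `Replica` and `Cluster` classes and the explicit `propagate()` step are hypothetical; real databases use their own replication protocols and conflict handling, none of which is modelled here.

```javascript
// Toy model: three replicas; a write lands on one and is copied to the others later.
class Replica {
  constructor(name) {
    this.name = name;
    this.data = new Map();
  }
}

class Cluster {
  constructor(names) {
    this.replicas = names.map((n) => new Replica(n));
    this.pending = []; // updates accepted but not yet copied everywhere
  }

  // Accept the write on one replica and queue it for the rest.
  write(replicaIndex, key, value) {
    this.replicas[replicaIndex].data.set(key, value);
    this.pending.push({ from: replicaIndex, key, value });
  }

  read(replicaIndex, key) {
    return this.replicas[replicaIndex].data.get(key);
  }

  // Deliver every queued update to the replicas that have not seen it yet.
  propagate() {
    for (const update of this.pending) {
      this.replicas.forEach((replica, i) => {
        if (i !== update.from) replica.data.set(update.key, update.value);
      });
    }
    this.pending = [];
  }
}

const cluster = new Cluster(["r0", "r1", "r2"]);
cluster.write(0, "State", "Maharashtra");
const staleRead = cluster.read(2, "State"); // r2 has not caught up yet
cluster.propagate();
const freshRead = cluster.read(2, "State"); // r2 now agrees with r0
console.log(staleRead, freshRead);
```

Between the write and the propagation step a client reading from another replica sees old data; once writes stop and propagation completes, all replicas return the same value.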
But NoSQL databases are all quite different from each other as well.
Let's discuss a few of them below.
Key value data store : The simplest type of NoSQL database is a key-value store. Every data element in the
database is stored as a key-value pair consisting of an attribute name (or "key") and a value.
In a sense, a key-value store is like a relational database with only two columns: the key or attribute
name (such as State) and the value (such as Maharashtra), as below.
"State" : "Maharashtra"
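As a minimal illustration of the key-value idea, JavaScript's built-in `Map` behaves like a tiny in-memory key-value store. The "City" entry is a made-up extra pair; a real key-value database adds persistence and distribution on top of the same put/get/delete operations.

```javascript
// A Map used as a tiny in-memory key-value store.
const store = new Map();

store.set("State", "Maharashtra"); // put: key -> value
store.set("City", "Mumbai");       // hypothetical second pair

const state = store.get("State");     // lookup by key
const missing = store.get("Country"); // unknown key gives undefined

store.delete("City");                 // remove a pair by key
console.log(state, missing, store.size);
```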
Because a column-family (columnar) store keeps the values of each column together, if you just need to
analyse a few columns, you can read those columns directly without wasting RAM on irrelevant data.
Because the values in a column are frequently of the same kind, they benefit from more efficient
compression, which speeds up reads. The values of a column in a columnar database can also be easily
aggregated.
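The benefit for aggregation can be sketched by laying the same records out row-wise and column-wise. This is an in-memory illustration only; the helper `toColumns` is hypothetical, and the sample records reuse the student values that appear later in the chapter.

```javascript
// Row-oriented layout: each record is kept together.
const rowStore = [
  { StudentName: "Tarun", Section: "A", Marks: 105 },
  { StudentName: "Saurabh", Section: "A", Marks: 95 },
  { StudentName: "Atish", Section: "B", Marks: 78 },
  { StudentName: "Pramod", Section: "c", Marks: 56 },
];

// Column-oriented layout: one array per column.
function toColumns(rows) {
  const columns = {};
  for (const row of rows) {
    for (const [name, value] of Object.entries(row)) {
      if (!(name in columns)) columns[name] = [];
      columns[name].push(value);
    }
  }
  return columns;
}

const columnStore = toColumns(rowStore);

// Aggregating one column touches only that column's array.
const totalMarks = columnStore.Marks.reduce((sum, m) => sum + m, 0);
const averageMarks = totalMarks / columnStore.Marks.length;
console.log(totalMarks, averageMarks);
```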
4.3.1 Comparison of NoSQL Databases w.r.t CAP Theorem and ACID Properties
Due to a mismatch between the in-memory data structure and relational data structure of applications, many
problems were faced by application developers. By using NoSQL databases, developers do not need to
convert in-memory structure to relational structure. Hence, they also use it as an integration point to the
application.
Relational databases were not designed in such a way that they can run perfectly on clusters.
The storage requirement is growing day by day and the solution is moving towards distributed systems.
The organizations are shifting to NoSQL databases to achieve higher scalability, higher speed, and
continuous availability.
* RDBMS systems were not designed to scale out. They have to handle things like foreign keys and maintain
relations over the entire data set, and the problem is handling that data, with its foreign key
relationships, across a large set of machines.
* According to CAP, only two properties out of three can be achieved. If consistency is an absolute
requirement, we have to give up one of the other two. Because RDBMSs follow ACID (Atomicity, Consistency,
Isolation, Durability), it is difficult to scale them.
* The need for Speed : Whenever a very fast response time is required, the data should be placed in
memory, so we have to choose a database that stores its data in memory.
* The need for Scale : With an increasing number of users and growing data volumes, organizations
require databases which are easily scalable.
* Need for Continuous Availability : Slow performance can drive a customer away and nothing is worse than
downtime. There is a difference between the high availability that an RDBMS offers with a master-slave
architecture and the continuous availability, with no downtime, that NoSQL databases like Cassandra
offer by spreading redundant copies of data throughout a cluster across multiple locations.
* Need for Location Independence : The ability to serve data quickly to multiple locations is critical.
Because of its fundamental master-slave design, an RDBMS struggles to provide fast read access to many
locations.
features over multiple machines for storing files.
(a) AMS EMS
(c) File system (d) None of the mentioned
Ans. : (a)
Q. 4.4 Which of the following is a wide-column store?
(a) Cassandra (b) Riak
(c) MongoDB (d) Redis
Ans. : (a)
Q. 4.5 Why MongoDB is known as best NoSQL
(c) ObjectID is a 20-byte BSON type
(d) None of the mentioned
Ans. : (b)
Q. 4.7 Which of the following language is MongoDB written in?
(a) Javascript (b) C
(c) C++ (d) All of the mentioned
Ans. : (d)
Q. 4.8 What is the aim of NoSQL?
(a) Not suitable for storing structured data.
(b) Allow storing non-structured data.
(c) New data format to store large datasets
(d) An alternative to SQL databases to store textual data
Ans. : (c)
Q. 4.9 The core principle of NoSQL is
(a) Low availability (b) High availability
(c) Both A and B (d) None of the above
Ans. : (b)
Q. 4.10 Which architecture does NoSQL follow?
(a) Shared Memory
(b) Shared Nothing
(c) Shared Disk
(d) Shared Nothing Architecture
Ans. : (d)
Q. 4.11 Which of the following is a NoSQL Database Type?
(a) SQL (b) JSON
(c) Document databases (d) All of the above
Ans. : (c)
the key named post_text from the first document retrieved?
(a) db.posts.find({}, {_id:0, post_text:1})
(b) db.posts.findOne({post_text:1})
(c) db.posts.findOne({}, {post_text:1})
(d) db.posts.findOne({}, {_id:0, post_text:1})
Ans. : (d)
Q. 4.15 What is true about Replication?
(a) Replication is the process of synchronizing data across multiple servers.
(b) Replication provides redundancy and increases data availability with multiple copies of data on
different database servers.
(c) Replication protects a database from the loss of a single server.
(d) All of the above
Ans. : (d)
Q. 4.16 In MongoDB client, how to initiate a new replica set?
(a) rs.initiate() (b) rs.conf()
(c) rs.status() (d) rs.new()
Ans. : (a)
Q. 4.17 ______ is the process of storing data records across multiple machines and it is MongoDB's
approach to meeting the demands of data growth.
(a) Sharding (b) Config Servers
(c) Query Routers (d) Projection
Ans. : (a)
Q. 4.18 Single replica set has limitation of?
(a) 10 Nodes (b) 12 Nodes
(c) 8 Nodes (d) Infinite Nodes
Ans. : (b)
(d) Fetches the posts with likes between 100 and 200, sets the _id of the first document as null
and then increments it 1 every time
Ans. : (a)
Q. 4.21 Which of the following aggregation commands in MongoDB does not support sharded collections?
(a) aggregate
(b) mapReduce
(c) group
(d) All of the above
Ans. : (c)
Q. 4.22 ______ is a binary serialization format used to store documents and make remote procedure calls
in MongoDB.
(a) BSON (b) GridFS
(c) JSON (d) None of the mentioned
Ans. : (a)
Q. 4.23 Point out the correct statement.
(a) ObjectIds are small, likely unique, fast to generate, and ordered 12 Byte Hexadecimal number
(b) ObjectIds are large, likely unique, and ordered
(c) ObjectIds values consists of 18-byte
(d) ObjectIds values consists of 8-byte
Ans. : (a)
Q. 4.24 Which of the following data type is deprecated?
(a) Double (b) String
(c) Object (d) Undefined
Ans. : (d)
Q. 4.25 In the mongo shell, you can access the creation time of the ObjectId, using the ______ method.
(a) getTime() (b) getTimestamp()
(c) Timestamp() (d) None of the mentioned
Ans. : (b)
Q. 4.26 What is eventual consistency
(a) At any time, the system is linearizable
(b) At any time, concurrent reads from any node return the same values
(c) If writes stop, all reads will return the same value after a while
(d) If writes stop, a distributed system will become consistent
Ans. : (c)
Q. 4.27 ______ are operations that process data records and return computed results.
(a) ReplicaAgg (b) SumCalculation
(c) Aggregations (d) None of the mentioned
Ans. : (c)
Q. 4.28 Point out the wrong statement.
(a) Map-reduce cannot have a finalize stage to make final modifications to the result
(b) Map-reduce is less efficient and more complex than the aggregation pipeline
(c) Specifically, a user with the userAdmin role can grant itself any privilege in the database
(d) All of the mentioned
Ans. : (a)
Q. 4.29 The aggregation pipeline can use ______ to improve its performance during some of its stages.
(a) indexes (b) OptmData
(c) functions (d) all of the mentioned
Ans. : (a)
Q. 4.30 MongoDB uses the ______ notation to access the elements of an array and to access the fields of
an embedded document.
(a) Dot
(b) Array
(c) Nested Sets
(d) None of the mentioned
Ans. : (a)
Q. 4.31 MongoDB indexes use a ______ data structure.
(a) Hash
(b) Map
(c) B-tree
(d) Red Black tree
Ans. : (c)
Q. 4.32 MongoDB uses ______ indexes to index the content stored in arrays.
(a) single key (b) multi key
(c) compkey (d) none of the mentioned
Ans. : (b)
Q. 4.33 A replica set can have only ______ primary.
(a) One (b) Two
(c) Three (d) Many
Ans. : (a)
Q. 4.34 MongoDB supports sharding through the configuration of a sharded ______
(a) shapes (b) clusters
(c) clusters (d) Databases
Ans. : (b)
Chapter Ends...
MODULE 5
CHAPTER 5 NoSQL using MongoDB
NoSQL using MongoDB: Introduction to MongoDB Shell, Running the MongoDB shell, MongoDB client, Basic
operations with MongoDB shell, Basic Data Types, Arrays, Embedded Documents.
Querying MongoDB using find() functions, advanced queries using logical operators and sorting, simple aggregate
functions, saving and updating document. MongoDB Distributed environment: Concepts of replication and horizontal
scaling through sharding in MongoDB.
5.1 NoSQL using MongoDB
5.1.1 MongoDB Client
5.1.2 Comparative Analysis of SQL Database Objects and NoSQL Database Objects
5.1.3 Basic Operations with MongoDB Shell
5.1.4 Basic Data Types in MongoDB
5.1.5 Arrays
5.2 Querying MongoDB using find() functions
5.2.1 Sorting in MongoDB
5.2.2 MongoDB Distributed Environment
5.2.2(A) Replication in MongoDB
5.2.2(B) Sharding Components
* MongoDB, like other database management systems such as MySQL and Oracle, offers performance,
scalability, and availability for database management.
* MongoDB is a widely used NoSQL database that stores data as JSON-like documents. This document model is
what gives MongoDB its scalability and flexibility.
Introduction to MongoDB Shell
* The mongo shell is an interactive JavaScript interface to MongoDB. You can use the mongo shell to
query and update data as well as perform administrative operations.
* The mongo shell is included as part of the MongoDB server installation. If you have already installed the
server, the mongo shell is installed to the same location as the server binary.
[Screenshot: mongo shell running in a Windows command prompt (cmd.exe - mongo.exe Student), evaluating 3 + 4]
* In the above image, when we type 3 + 4 the MongoDB shell, which is JavaScript enabled, evaluates the
expression and shows the result 7.
* Let's see how to start the shell and connect to a MongoDB database.
* After a successful download you can connect to the server, but first the MongoDB server instance must be
started and running. Verify that the MongoDB server instance "mongod" is running on the machine.
Afterwards, open a command prompt, navigate to the bin folder of the MongoDB installation directory, and
type the "mongo" command; your client will connect to the MongoDB server.
* Let's see how to start the MongoDB database from the binary distribution on a Windows machine.
> Step 1 : Open a command prompt and navigate to the bin folder of the MongoDB installation directory as
shown below.
C:\Users\admin>E:
E:\>cd E:\SKN DATA\DBMS Lab\mongodb-windows-64-3.4.9\mongodb_3.4\bin
E:\SKN DATA\DBMS Lab\mongodb-windows-64-3.4.9\mongodb_3.4\bin>
> Step 2 : It is necessary to start the MongoDB server before running any client. The client and
server instances are as follows:
(1) Server Instance : mongod
[Screenshot: command prompt starting the mongod server instance, with the start command and the log output highlighted in red boxes]
* The upper red box in the above image shows the command we need to type to start the server.
* Here mongod.exe is the executable that starts the server instance mongod, and E:/student is a folder
created on the hard drive that is passed to the server when it is started.
> Step 3 : Having started the MongoDB server in step 2, keep that command prompt running and open a new
command prompt to run the client. Start the client with the following command: "mongo.exe student".
[Screenshot: client start-up in the command prompt; the CONTROL [initandlisten] log lines include "WARNING: Access control is not enabled for the database."]
* In the above image we have started the client using the command mongo.exe student. It starts because
our server is running on port number 27017 on localhost and the directory (student) is configured.
* We have now successfully started the MongoDB server and the client through the MongoDB shell. Let's try
various CRUD operations on the MongoDB database in the next section.
* As seen above, the server is started in one command prompt with the proper command and, if everything
is fine, it listens on localhost on port 27017. Now it is time to run the client, which opens a
connection with the MongoDB server running on that port. If the connection is successful, we get access
to the database directory selected at the time of starting the server.
* We can start the client instance mongo by executing mongo.exe in a separate command prompt and, while
calling it, selecting the database to which we need to connect.
* In the above image, the command shown in the red box starts the MongoDB client instance, which connects
to the MongoDB server already running on port 27017 on localhost.
5.1.2 Comparative Analysis of SQL Database Objects and NoSQL Database Objects
* The comparative analysis of the various SQL database objects and their NoSQL counterparts is as below:
SQL database object | NoSQL (MongoDB) object
Table | Collection
Rows/Records/Tuples | Documents
* As shown above, each SQL database object has a corresponding object in NoSQL databases.
(2) To display the help manual for MongoDB commands you can use,
db.help();
The help options for collection methods can be shown in the following way,
db.<Collection_name>.help();
(3) To display the list of databases in MongoDB :
show dbs;
or
show databases;
(4) To display the list of collections in the current database :
show collections;
(5) To display the list of users of the current database :
show users;
(6) To display the various roles of the users of the current database :
show roles;
(7) To create a new database in MongoDB : Let's create the Books database.
use Books
This command creates the Books database in MongoDB and selects it as the current database. Note that
until at least one collection has been created in the empty database, it will not be displayed in the
list shown by the show dbs command.
(8) To create a collection in the database we may use the below command :
db.createCollection("Collection_Name");
Here we have created a new collection in the Books database created above. Now execute the show dbs
and show collections commands and notice the difference: the Books database is now displayed in the
list.
MongoDB Enterprise > show dbs;
[Database list with sizes; only the entry Student and the size 0.078GB are legible in the source]
MongoDB Enterprise > show collections;
DBMS_Books
MongoDB Enterprise >
(1) So far we have created a database and seen different database operations. Now let's insert data into
the collection created with the name DBMS_Books. As discussed above, when we insert data in MongoDB it is
inserted as a document, just like inserting rows in SQL databases. Let's see a few examples.
MongoDB Enterprise > db.DBMS_Books.insert({Rook_id : 2, Book_Name : "Complete Guide to DBMS",
Author : "Desai", Edition : 4});
WriteResult({ "nInserted" : 1 })
As you can see above, we have inserted one document into the MongoDB database. The insert operation is
written as db.Collection_name.insert({}): first we use the db object, which is the instance of the
currently selected database, then collection_name, the collection into which we are supposed to insert
the records, followed by the data in the form of key : value pairs. The data is written inside
parentheses (), with all key-value pairs inside curly brackets {}.
db.Collection_name.find()
Here you can see that when we add data to the collection it is inserted as a document, and when we
display it, it is shown as above. The most important part is that every document has an _id field
holding an ObjectId. This ObjectId is a 12-byte hexadecimal number that the MongoDB database itself
adds to every document.
This 12-byte ObjectId is unique and is a combination of different pieces of information, such as,
_id: ObjectId(4 bytes timestamp,
3 bytes machine id,
2 bytes process id,
3 bytes incrementer)
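Since the first 4 bytes of an ObjectId hold a timestamp, the creation time can be read back from the hexadecimal string. The sketch below does this in plain JavaScript for the ObjectId value shown in this chapter; the helper name `objectIdTimestamp` is hypothetical (in the mongo shell itself, an ObjectId offers a getTimestamp() method for the same purpose).

```javascript
// Read the creation time (seconds since the Unix epoch) from an ObjectId hex string.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hexadecimal characters (12 bytes)");
  }
  // The first 8 hex characters are the 4 timestamp bytes.
  return parseInt(hex.slice(0, 8), 16);
}

const seconds = objectIdTimestamp("60e7a377632c042c266f6cb8");
const createdAt = new Date(seconds * 1000); // Date expects milliseconds
console.log(seconds, createdAt.toISOString());
```

Because the timestamp comes first, ObjectIds generated later compare as larger, which is why they are described as roughly ordered by creation time.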
(3) Display the documents in a formatted way.
db.collection_name.find().pretty();
The pretty() function is used to display the content of the document, in the form of key-value pairs,
in a formatted way such as below,
{
"_id" : ObjectId("60e7a377632c042c266f6cb8"),
"Rook_id" : 2,
"Book_Name" : "Complete Guide to DBMS",
"Author" : "Desai",
"Edition" : 4
}
As we can see, when the pretty() function is used with the find() function, the data is displayed in a
formatted way.
(1) String : The String is the most commonly used data type to store data. A String in MongoDB must be
valid UTF-8.
(2) Integer : The Integer type is used to store a numerical value. An Integer can be 32 bit or 64 bit
depending upon your server.
(3) Boolean : The Boolean type is used to store a Boolean (true/false) value.
(4) Double : The Double type is used to store floating point values.
(5) Min/Max keys : The Min/Max type is used to compare a value against the lowest and highest BSON
elements.
(6) Arrays : The Array type is used to store arrays, lists or multiple values under one key.
(7) Timestamp : The Timestamp type can be handy for recording when a document has been modified or added.
(8) Object : This data type is used for embedded documents.
(14) Code : This data type is used to store JavaScript code in the document.
(15) Regular expression : This data type is used to store regular expressions.
As we have gone through many SQL and NoSQL operations, and have already discussed finding information
from a table in a variety of ways, let's now see the different ways to fetch data from MongoDB
collections.
MongoDB Enterprise > db.DBMS_Books.find().pretty();
{
"_id" : ObjectId("60e7a377632c042c266f6cb8"),
"Rook_id" : 2,
"Book_Name" : "Complete Guide to DBMS",
"Author" : "Desai",
"Edition" : 4
}
{
...
],
"Edition" : 5
}
MongoDB Enterprise >
In the above query we have fetched all the documents from the DBMS_Books collection and, as already
discussed, the pretty() function is used to display the documents in a formatted way.
(2) To find a specific document from the collection with a specific condition we can use the below
command,
MongoDB Enterprise > db.Student.find({Marks : 56}).pretty();
{
"_id" : ObjectId("59b96d9d3fca9f8e61527676"),
"StudnetName" : "Pramod",
"Section" : "c",
"Marks" : 56,
"AdmissionDate" : ISODate("2017-09-13T17:40:45.128Z")
}
MongoDB Enterprise >
In the above example we want to find the students who have secured 56 marks in the examination. That is
why we have written a condition in the find function, Marks : 56, in the form of a key : value pair,
and the result displays the student matching the specified criteria.
(2) Demonstrate the use of findOne() : Similar to the find() function, the findOne() function displays
only the first document from the collection.
MongoDB Enterprise > db.Student.findOne();
{
"_id" : ObjectId("59b2d7196f0568336449e0c9"),
"StudentName" : "Tarun",
"Section" : "A",
"Marks" : 105,
"Subject" : [ ],
"AdmissionDate" : ISODate("2017-09-13T17:37:09.022Z")
}
MongoDB Enterprise >
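The difference between find() with a key : value condition and findOne() can be mimicked on a plain array. This is a hypothetical in-memory stand-in, not the MongoDB implementation: the helper functions and the simplified sample documents (with a consistently spelled StudentName field) are assumptions for illustration.

```javascript
// Hypothetical in-memory stand-ins for find() and findOne() with an equality filter.
const studentDocs = [
  { StudentName: "Tarun", Section: "A", Marks: 105 },
  { StudentName: "Saurabh", Section: "A", Marks: 95 },
  { StudentName: "Atish", Section: "B", Marks: 78 },
  { StudentName: "Pramod", Section: "c", Marks: 56 },
];

// True when every key in the filter equals the same key in the document.
function equalsFilter(doc, filter) {
  return Object.keys(filter).every((key) => doc[key] === filter[key]);
}

// find(): all matching documents (an empty filter matches everything).
function find(docs, filter = {}) {
  return docs.filter((doc) => equalsFilter(doc, filter));
}

// findOne(): the first matching document, or null when nothing matches.
function findOne(docs, filter = {}) {
  const hit = docs.find((doc) => equalsFilter(doc, filter));
  return hit === undefined ? null : hit;
}

console.log(find(studentDocs, { Marks: 56 }));
console.log(findOne(studentDocs));
```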
(3) Sort the documents in ascending or descending order :
For example : Let's display the documents in ascending order of Marks.
MongoDB Enterprise > db.Student.find().pretty().sort({Marks : 1});
{
"_id" : ObjectId("59b96d9d3fca9f8e61527676"),
"StudnetName" : "Pramod",
"Section" : "c",
"Marks" : 56,
"AdmissionDate" : ISODate("2017-09-13T17:40:45.128Z")
}
{
"_id" : ObjectId("59b96d863fca9f8e61527675"),
"StudnetName" : "Atish",
"Section" : "B",
"Marks" : 78,
"AdmissionDate" : ISODate("2017-09-13T17:40:22.257Z")
}
{
"_id" : ObjectId("59b2d7196f0568336449e0c9"),
"StudentName" : "Tarun",
"Section" : "A",
"Marks" : 105,
"Subject" : [ ],
"AdmissionDate" : ISODate("2017-09-13T17:37:09.022Z")
}
MongoDB Enterprise >
"Subject" : [ ],
"AdmissionDate" : ISODate("2017-09-13T17:37:09.022Z")
}
{
"_id" : ObjectId("59b2d7266f0568336449e0ca"),
"StudentName" : "Saurabh",
"Section" : "A",
"Marks" : 95,
"AdmissionDate" : ISODate("2017-09-13T17:37:09.022Z")
}
{
"_id" : ObjectId("59b96d863fca9f8e61527675"),
"StudnetName" : "Atish",
"Section" : "B",
"Marks" : 78,
"AdmissionDate" : ISODate("2017-09-13T17:40:22.257Z")
}
{
"_id" : ObjectId("59b96d9d3fca9f8e61527676"),
"StudnetName" : "Pramod",
"Section" : "c",
"Marks" : 56,
"AdmissionDate" : ISODate("2017-09-13T17:40:45.128Z")
}
MongoDB Enterprise >
(4) We can use the limit() function to restrict the output of the find() function: limit() displays only
the number of documents specified. For example, if we want to display only 2 documents from the
collection, we can use the limit function as,
MongoDB Enterprise > db.Student.find().limit(2).pretty();
{
"_id" : ObjectId("59b2d7196f0568336449e0c9"),
"StudentName" : "Tarun",
"Section" : "A",
"Marks" : 105,
"Subject" : [ ],
"AdmissionDate" : ISODate("2017-09-13T17:37:09.022Z")
}
{
...
"AdmissionDate" : ISODate("2017-09-13T17:37:09.022Z")
}
MongoDB Enterprise >
To display the records of students whose marks are below 80, use the following,
db.Student.find({Marks : {$lt : 80 }}).pretty();
Advanced queries using logical operators and sorting
* We can use the logical operators in MongoDB whenever a logical relation between conditions is required.
* The AND and OR operators are written as $and and $or, each taking an array of conditions.
* $and operator in MongoDB : Let's select the range of students whose marks are greater than 50 and less
than 80.
* $or operator in MongoDB : Let's select the students whose marks are either 56 or 78 and display the
content in a formatted way.
The MongoDB $not operator performs a logical NOT operation on the given expression and fetches selected
documents that do not match the expression and the document that do not contain the field as well, specified in
the expression.
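The queries for these three cases are not reproduced above, so the following is only a minimal sketch. The three filter documents are in the shape one would pass to db.Student.find(...). The matches and testField helpers are hypothetical stand-ins for the server, covering only $and, $or, $not, $gt, $lt and equality, so that the sketch runs without a MongoDB instance. The document named "NoMarks" is invented to show how $not treats a missing field.

```javascript
// Sample documents modelled on the Student collection used in this chapter.
const students = [
  { StudentName: "Tarun", Section: "A", Marks: 105 },
  { StudentName: "Saurabh", Section: "A", Marks: 95 },
  { StudentName: "Atish", Section: "B", Marks: 78 },
  { StudentName: "Pramod", Section: "c", Marks: 56 },
  { StudentName: "NoMarks", Section: "B" } // hypothetical: has no Marks field
];

// Filters in the shape accepted by db.Student.find(<filter>).
const between50And80 = { $and: [{ Marks: { $gt: 50 } }, { Marks: { $lt: 80 } }] };
const marks56Or78 = { $or: [{ Marks: 56 }, { Marks: 78 }] };
const notAbove80 = { Marks: { $not: { $gt: 80 } } };

// Hypothetical helper: evaluates one field condition (tiny subset of operators).
function testField(value, cond) {
  if (cond === null || typeof cond !== "object") return value === cond; // equality
  return Object.entries(cond).every(([op, arg]) => {
    if (op === "$gt") return value !== undefined && value > arg;
    if (op === "$lt") return value !== undefined && value < arg;
    if (op === "$not") return !testField(value, arg);
    throw new Error("operator not covered by this sketch: " + op);
  });
}

// Hypothetical helper: evaluates a whole filter against one document.
function matches(doc, filter) {
  return Object.entries(filter).every(([key, cond]) => {
    if (key === "$and") return cond.every((f) => matches(doc, f));
    if (key === "$or") return cond.some((f) => matches(doc, f));
    return testField(doc[key], cond);
  });
}

const names = (filter) =>
  students.filter((d) => matches(d, filter)).map((d) => d.StudentName);

console.log(names(between50And80)); // [ 'Atish', 'Pramod' ]
console.log(names(marks56Or78));    // [ 'Atish', 'Pramod' ]
console.log(names(notAbove80));     // [ 'Atish', 'Pramod', 'NoMarks' ]
```

Note that the $not filter also returns the document without a Marks field, which is the behaviour described in the paragraph above.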
Syntax
{
"_id" : ObjectId("59b96d9d3fca9f8e61527676"),
"StudnetName" : "Pramod",
"Section" : "c",
"Marks" : 56,
"AdmissionDate" : ISODate("2017-09-13T17:40:45.128Z")
}
}
MongoDB Enterprise >
Sorting the documents in MongoDB: we can arrange the documents in ascending or descending order,
depending upon the requirements.
db.Student.find().sort({Marks : 1}).pretty()
We have already executed the above statement in the last point and have seen that Marks : 1 displays the
documents in ascending order of marks.
db.Student.find().sort({Marks : -1}).pretty()
We have already executed the above statement in the last point and have seen that Marks : -1 displays the
documents in descending order of marks.
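As a rough illustration of what the sort specification means, the sketch below applies {Marks : 1} and {Marks : -1} to an in-memory array. The sortBy helper is hypothetical and handles a single numeric field only. It is not how the server sorts.

```javascript
// Sample documents modelled on the Student collection used in this chapter.
const students = [
  { StudentName: "Tarun", Marks: 105 },
  { StudentName: "Saurabh", Marks: 95 },
  { StudentName: "Atish", Marks: 78 },
  { StudentName: "Pramod", Marks: 56 }
];

// Hypothetical helper: applies a one-field sort specification such as { Marks: 1 }.
// 1 means ascending and -1 means descending, as in sort() of the mongo shell.
function sortBy(docs, spec) {
  const [field, direction] = Object.entries(spec)[0];
  // Copy first so that the input array is not reordered in place.
  return [...docs].sort((a, b) => (a[field] - b[field]) * direction);
}

console.log(sortBy(students, { Marks: 1 }).map((d) => d.Marks));  // [ 56, 78, 95, 105 ]
console.log(sortBy(students, { Marks: -1 }).map((d) => d.Marks)); // [ 105, 95, 78, 56 ]
```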
Simple aggregate functions
Data records are processed and computed results are returned through aggregation processes. Aggregation
operations combine values from several documents into a single result and can execute a number of functions on
the gathered data. Aggregation is performed in MongoDB using three approaches: the aggregation pipeline, the
map-reduce function, and single-purpose aggregation methods.
Aggregation
MongoDB's aggregation pipeline framework is modelled on the basic concept of data processing pipelines.
When we aggregate, the documents enter a multi-stage pipeline that transforms them into aggregated
results.
[Screenshot: aggregate query run in the mongo shell (cmd.exe - mongo.exe Student)]
In the above example we have executed a simple aggregate operation. In the first half of the query,
$match selects the records that match the given key, here {Section: A}, so all the
records that have Section: A are grouped together. In the second half, we aggregate the result by taking the sum of
the marks of the students who have Section: A.
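Since the query itself is only visible in the screenshot, the sketch below shows one plausible pipeline of this kind and an in-memory equivalent of its two stages. The output field name total is an assumption, and the filter and reduce code is merely an illustration of what $match and $group compute, not MongoDB's implementation.

```javascript
// Sample documents modelled on the Student collection used in this chapter.
const students = [
  { StudentName: "Tarun", Section: "A", Marks: 105 },
  { StudentName: "Saurabh", Section: "A", Marks: 95 },
  { StudentName: "Atish", Section: "B", Marks: 78 },
  { StudentName: "Pramod", Section: "c", Marks: 56 }
];

// A pipeline in the shape one would pass to db.Student.aggregate(...).
// The field name "total" is chosen for this sketch.
const pipeline = [
  { $match: { Section: "A" } },
  { $group: { _id: "$Section", total: { $sum: "$Marks" } } }
];

// Stage 1 ($match): keep only the documents with the requested Section.
const matched = students.filter((d) => d.Section === pipeline[0].$match.Section);

// Stage 2 ($group with $sum): add up Marks per Section value.
const totals = new Map();
for (const d of matched) {
  totals.set(d.Section, (totals.get(d.Section) || 0) + d.Marks);
}
const result = [...totals].map(([section, total]) => ({ _id: section, total }));

console.log(result); // [ { _id: 'A', total: 200 } ]
```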
db.collection.save( )
Parameters
In the following example, the save() method performs an insert since the document passed to the method does not
contain the _id field :
Update operations modify existing documents in a collection. MongoDB provides the following methods to
update documents of a collection:
In MongoDB, update operations target a single collection. All write operations in MongoDB are atomic on
the level of a single document.
You can specify criteria, or filters, that identify the documents to update. These filters use the same syntax as
read operations.
MongoDB Enterprise>db.Student.updateOne({StudentName :"Atish"}, {$set : {Marks : 70}});
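The behaviour of updateOne with $set can be sketched in memory as follows. The updateOne function here is a hypothetical stand-in, not the driver method. It supports only equality filters and $set, and the second "Atish" document is invented to show that only the first match is modified.

```javascript
const students = [
  { StudentName: "Atish", Section: "B", Marks: 78 },
  { StudentName: "Atish", Section: "A", Marks: 60 }, // hypothetical second match
  { StudentName: "Pramod", Section: "c", Marks: 56 }
];

// Hypothetical stand-in for db.Student.updateOne(filter, update).
function updateOne(docs, filter, update) {
  // Find the first document whose fields equal every field of the filter.
  const doc = docs.find((d) =>
    Object.entries(filter).every(([key, value]) => d[key] === value));
  if (!doc) return { matchedCount: 0, modifiedCount: 0 };
  // Check whether $set would actually change anything, then apply it.
  const changed = Object.entries(update.$set).some(([key, value]) => doc[key] !== value);
  Object.assign(doc, update.$set);
  return { matchedCount: 1, modifiedCount: changed ? 1 : 0 };
}

const result = updateOne(students, { StudentName: "Atish" }, { $set: { Marks: 70 } });
console.log(result);                         // { matchedCount: 1, modifiedCount: 1 }
console.log(students.map((d) => d.Marks));   // [ 70, 60, 56 ]
```

To change every matching document instead of only the first, the shell offers updateMany with the same filter and update arguments.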
MongoDB is the leader in a new generation of databases that are designed for scalability. With a technique
called “sharding” you are able to easily distribute data and grow your deployment over inexpensive hardware or
in the cloud. One of the benefits of scaling with MongoDB is that sharding is automatic and built into the
database. This relieves developers of having to build sharding logic into the application code to scale out the
system. This section covers the concepts of replication and horizontal scaling through sharding in MongoDB.
5.2.2(A) Replication in MongoDB
A replica set in MongoDB is a group of mongod processes that maintain the same data set. Replica sets
provide redundancy and high availability, and are the basis for all production deployments. This section
introduces replication in MongoDB as well as the components and architecture of replica sets.
* Sharding is the process of splitting a large data set into chunks of smaller data sets across
multiple MongoDB instances in a distributed environment.
* MongoDB sharding provides a scalable solution for storing a large amount of data across a number of
servers rather than on a single server.
* In practical terms, it is not feasible to store exponentially growing data on a single machine. Querying a huge
amount of data stored on a single server could lead to high resource utilization and may not provide
satisfactory read and write throughput.
* Basically, there are two types of scaling methods for handling growing data in the
system:
* Shard is a Mongo instance that handles a subset of the original data. Shards are required to be deployed in a
replica set.
* Mongos is a Mongo instance that acts as an interface between a client application and a sharded cluster. It
works as a query router to the shards.
* Config Server is a Mongo instance which stores metadata information and configuration details of the cluster.
MongoDB requires the config server to be deployed as a replica set.
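The division of work between the router and the shards can be pictured with a toy example. This is a simplified sketch under stated assumptions: the three arrays stand for shards, shardFor plays the role that mongos plays in a real cluster, and toyHash is an invented hash function. A real cluster maps ranges of the shard key (or of its hash) to chunks using metadata kept on the config servers, which this sketch does not model.

```javascript
// Three hypothetical shards, each holding a subset of the documents.
const shards = [[], [], []];

// Toy string hash for illustration only; it is not the hash MongoDB uses.
function toyHash(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // keep the value in 32 bits
  }
  return h;
}

// Router role: decide which shard is responsible for a shard-key value.
function shardFor(key) {
  return toyHash(key) % shards.length;
}

// Writes go to exactly one shard, chosen by the shard key (StudentName here).
function insert(doc) {
  shards[shardFor(doc.StudentName)].push(doc);
}

// A query that includes the shard key needs to visit only one shard.
function findByName(name) {
  return shards[shardFor(name)].filter((d) => d.StudentName === name);
}

["Tarun", "Saurabh", "Atish", "Pramod"].forEach((name, i) =>
  insert({ StudentName: name, Marks: 50 + i }));

console.log(shards.map((s) => s.length)); // documents held by each shard
console.log(findByName("Atish"));         // looked up on a single shard
```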
(3) Replication requires high-end hardware or vertical scaling for handling large datasets, which is too expensive
compared to adding additional servers in sharding.
(4) In replication, read performance can be enhanced by adding more slave/secondary servers, whereas, in
sharding, both read and write performance will be enhanced by adding more shard nodes.
5.4 MULTIPLE CHOICE QUESTIONS
Q. 5.1 Which of the following is not a NoSQL database?
(a) SQL Server (b) MongoDB
(c) Cassandra (d) mariadb
Ans. : (a)
Q. 5.2 "Sharding" a database across many server instances can be achieved with ______
(a) LAN (b) SAN
(c) MAN (d) All of the mentioned
Ans. : (b)
Q. 5.3 In our posts collection, which command can be used to find all the posts whose author names lie
between "A" and "C" in dictionary order?
(a) db.posts.find( { post_author : { $gte : "A" , $lte : "C" } } );
(b) db.posts.find( { post_author : { $gte : "C" , $lte : "A" } } );
(c) db.posts.find( { post_author : { $gt : "A" , $lt : "C" } } );
(d) This type of search is not supported by MongoDB. $lt and $gt operators are
Ans. : (a)
Q. 5.4 What is the interactive shell for MongoDB called?
(a) mongo (b) mongodb
(c) dbmong (d) none of the mentioned
Ans. : (a)
Q. 5.5 ______ provides statistics on the per-collection level.
(a) mongosniff (b) mongotop
(c) mongooplog (d) mongofiles
Ans. : (b)
Q. 5.6 ______ is a command-line tool that displays a summary list of status statistics for a currently
running MongoDB instance.
(a) mongostat (b) mongotop
(c) mongooplog (d) mongofiles
Ans. : (a)
Q. 5.7 Mongo looks for a database server listening on port 27017 on the ______ interface.
(a) web (b) localhost
(c) web host (d) all of the mentioned
Ans. : (b)
Q. 5.8 After starting the mongo shell, your session will use the ______ database by default.
(a) mongo (b) master
(c) test (d) primary
Ans. : (c)
Q. 5.9 ______ command displays the list of databases.
(a) show db (b) show dbs
(c) show data (d) display dbs
Ans. : (b)
Q. 5.10 Which of the following operations is used to switch to a new database mydb ?
(a) use dbs (b) use db
(c) use mydb (d) use mydbs
Ans. : (c)
Q. 5.11 Which of the following also returns a list of databases?
(a) show databases (b) show database
(c) display dbs (d) all of the mentioned
Ans. : (a)
Q. 5.12 Command to check list of collections is ______
(a) show collection (b) show collections
(c) show collect (d) none of the mentioned
Ans. : (b)
Q. 5.13 When you query a collection, MongoDB returns a ______ object that contains the results of the
query.
(a) row (b) cursor
(c) columns (d) none of the mentioned
Ans. : (b)
Q. 5.14 Which of the following methods returns true if the cursor has documents?
(a) hasMethod() (b) hasNext()
(c) hasDoc() (d) all of the mentioned
Ans. : (b)
Q. 5.15 ______ method renders the document in a JSON-like format.
(a) displayjson (b) print
(c) printjson (d) printdoc
Ans. : (c)
Q. 5.16 Which of the following methods is called while accessing documents using the array index
notation?
(a) cur.toArray() (b) cursor.toArray()
(c) doc.toArray() (d) all of the mentioned
Ans. : (b)
Q. 5.17 The ______ method limits the number of documents in the result set.
(a) limit() (b) limitOf()
(c) limitBy() (d) none of the mentioned
Ans. : (a)
Q. 5.18 Which of the following lines skips the first 5 documents in the bios collection and returns all
remaining documents?
(a) db.bios.find().limit( 5 ) (b) db.bios.find().skip( 1 )
(c) db.bios.find().skip( 5 ) (d) db.bios.find().sort( 5 )
Ans. : (c)
Q. 5.19 A query may include a ______ that specifies the fields from the matching documents to return.
(a) selection (b) projection
(c) union (d) none of the mentioned
Ans. : (b)
Q. 5.20 Point out the correct statement.
(a) Secondary indexes allow applications to store a view of a portion of the collection in an efficient
data structure
(b) MongoDB has full support for secondary indexes
(c) Most indexes store an ordered representation of all values of a field or a group of fields
(d) All of the mentioned
Ans. : (b)
Q. 5.21 MongoDB stores all documents in ______
(a) tables (b) collections
(c) rows (d) all of the mentioned
Ans. : (b)
Q. 5.22 Which of the following operations adds a new document to the users collection?
(a) add (b) insert
(c) truncate (d) drop
Ans. : (b)
Q. 5.23 Which of the following preferences determines how the client directs read operations to the set?
(a) read (b) write
(c) update (d) delete
Ans. : (a)
Q. 5.24 Applications can also control the behavior of write operations using ______ concern.
(a) read (b) write
(c) truncate (d) all of the mentioned
Ans. : (b)
Chapter Ends...
MODULE 6
CHAPTER 6 Trends in Advance Databases
Temporal database: Concepts, time representation, time dimension, incorporating time in relational databases.
Graph Database: Introduction, Features, Transactions, consistency, Availability, Querying, Case Study Neo4J
6.3 Spatial database: Introduction, data types, models, operators and queries
6.3.1 Spatial Data Types
6.3.2 Spatial Operators
6.3.3 Models of Spatial Information
UQ. Explain different types of spatial data models.
6.4 Descriptive Questions
6.5 Multiple Choice Questions
Chapter Ends
Consider the examples below, where time is stored along with the data for analysis:
A patient database must store information about the medical history of a patient.
Judicial records.
Various sensory information.
So we define a temporal database as a "database that stores the states of the real
world across time".
© Bi-temporal Data,
EMP_VALID
Name NIN Salary Dept no VST VET
* The temporal data types are DATE (specifying Year, Month, and Day as YYYY-MM-DD), TIME
(specifying Hour, Minute, and Second as HH:MM:SS), TIMESTAMP (specifying a Date/Time
combination, with options for including sub-second divisions if they are needed), INTERVAL (a relative
time duration, such as 10 days or 250 minutes), and PERIOD (an anchored time duration with a fixed
starting point and end point).
* A temporal database will store information concerning when certain events occur, or when certain facts are
true. The events or facts are typically associated in the database with a single time point in some granularity.
* For example, a bank deposit event may be associated with the timestamp when the deposit was made, or the
total monthly sales of a product (fact) may be associated with a particular month (say, February 1999). Note
that even though such events or facts may have different granularities, each is still associated with a single
time value in the database. Duration events or facts, on the other hand, are associated with a specific time
period in the database.
* For example, an employee may have worked in a company from August 15, 1993 till November 20, 1998. A
time period is represented by its start and end time points [start-time, end-time], so the above period is
represented as [1993-08-15, 1998-11-20]. Such a time period is often used to mean the set of all time points
from start-time to end-time, inclusive, in the specified granularity. Hence, assuming day granularity, the
period [1993-08-15, 1998-11-20] represents the set of all days from August 15, 1993 until November 20,
1998.
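The inclusive interpretation of a period at day granularity can be sketched as below. The helper names dayNumber, contains and daysIn are chosen for this illustration only. Dates are handled as UTC day numbers so that daylight-saving changes cannot disturb the arithmetic.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Convert a YYYY-MM-DD string to a whole number of days since 1970-01-01 (UTC).
function dayNumber(iso) {
  const [year, month, day] = iso.split("-").map(Number);
  return Date.UTC(year, month - 1, day) / DAY_MS;
}

// The employment period from the example above, as [start-time, end-time].
const period = { start: "1993-08-15", end: "1998-11-20" };

// A day belongs to the period if it lies between the end points, both included.
function contains(p, iso) {
  const t = dayNumber(iso);
  return dayNumber(p.start) <= t && t <= dayNumber(p.end);
}

// Number of days in the set that the period represents (both end points count).
function daysIn(p) {
  return dayNumber(p.end) - dayNumber(p.start) + 1;
}

console.log(contains(period, "1993-08-15")); // true  (start day is included)
console.log(contains(period, "1998-11-20")); // true  (end day is included)
console.log(contains(period, "1998-11-21")); // false
console.log(daysIn(period));                 // size of the set of days
```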
A temporal database using this interpretation is called a transaction time database. Other interpretations can
also be intended, but these two are considered to be the most common ones, and they are referred to as time
dimensions. In some applications, only one of the dimensions is needed and in other cases both time
dimensions are required, in which case the temporal database is called a bitemporal database. If other
interpretations are intended for time, the user can define the semantics and program the applications
appropriately, and it is called a user-defined time.
Consider the example of a person, John: John was born on April 3, 1992 in Chennai. His father registered his
birth after three days on April 6, 1992. He did his entire schooling and college in Chennai. He got a job in
Mumbai and shifted to Mumbai on June 21, 2015. He registered his change of address only on Jan 10, 2016.
SQL supports data types that are used to integrate time with data. These data types are:
Date : four digits for the year (1-9999), two digits for the month (1-12), and two digits for the day (1-31).
Time : two digits for the hour, two digits for the minute, and two digits for the second, plus optional
fractional digits.
Timestamp : the fields of date and time, with six fractional digits for the seconds field.
Incorporating time in relational databases
Incorporating Time in Relational Databases Using Tuple Versioning
Valid Time Relations
The contents of a valid time temporal database look as shown below, with the attributes Name,
City, Valid From, Valid Till.
* Let us now see how the different types of temporal databases may be represented in the relational
model. First, suppose that we would like to include the history of changes as they occur in the real world.
EMP_VT
DEPT_VT
* Consider again the emp and dept database and assume that the granularity level is day. Then, we could
convert the two relations EMPLOYEE and DEPARTMENT into valid time relations by adding the
attributes VST (Valid Start Time) and VET (Valid End Time), whose data type is DATE in order to
provide day granularity. The relations are renamed EMP_VT and DEPT_VT, respectively, as shown in
Fig. 6.1.3.
* If an update is applied to the database before it becomes effective in the real world, it is called a proactive
update. If the update is applied to the database after it becomes effective in the real world, it is called a
retroactive update. An update that is applied at the same time as it becomes effective is called a
simultaneous update.
* The action that corresponds to deleting an employee in a nontemporal database would typically be applied to
a valid time database by closing the current version of the employee being deleted.
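Tuple versioning with VST and VET can be illustrated with a small in-memory sketch. Everything specific in it is an assumption made for the illustration: the employee "Smith" and the salary figures are invented, null is used to mark a version that is still valid, and the old version is closed on the day before the new one starts because periods are treated as inclusive at day granularity.

```javascript
const NOW = null; // assumption: null in VET marks the version that is currently valid

// EMP_VT reduced to a few attributes; the row itself is hypothetical.
const empVT = [{ Name: "Smith", Salary: 25000, VST: "2002-06-15", VET: NOW }];

// Day before a YYYY-MM-DD date, computed in UTC.
function previousDay(iso) {
  const [year, month, day] = iso.split("-").map(Number);
  return new Date(Date.UTC(year, month - 1, day - 1)).toISOString().slice(0, 10);
}

function currentVersion(table, name) {
  return table.find((t) => t.Name === name && t.VET === NOW);
}

// Update: close the current version and append a new version of the whole tuple.
function updateSalary(table, name, salary, effectiveFrom) {
  const current = currentVersion(table, name);
  if (!current) throw new Error("no current version for " + name);
  current.VET = previousDay(effectiveFrom);
  table.push({ ...current, Salary: salary, VST: effectiveFrom, VET: NOW });
}

// Delete: only close the current version, so the history stays in the table.
function deleteEmployee(table, name, lastValidDay) {
  const current = currentVersion(table, name);
  if (current) current.VET = lastValidDay;
}

updateSalary(empVT, "Smith", 30000, "2003-06-01");
deleteEmployee(empVT, "Smith", "2004-08-10");
console.log(empVT);
// [ { Name: 'Smith', Salary: 25000, VST: '2002-06-15', VET: '2003-05-31' },
//   { Name: 'Smith', Salary: 30000, VST: '2003-06-01', VET: '2004-08-10' } ]
```

Whether such an update counts as proactive, retroactive or simultaneous depends only on when updateSalary is called relative to the effectiveFrom date.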
Transaction Time Relations
* In a transaction time database, whenever a change is applied to the database, the actual timestamp of the
transaction that applied the change (insert, delete, or update) is recorded.
* Such a database is most useful when changes are applied simultaneously in the majority of cases, for
example, real-time stock trading or banking transactions.
* If we convert the nontemporal database into a transaction time database, then the two relations EMPLOYEE
and DEPARTMENT are converted into transaction time relations by adding the attributes TST (Transaction
Start Time) and TET (Transaction End Time), whose data type is typically TIMESTAMP.
* A transaction time database has also been called a rollback database, because a user can logically roll
back to the actual database state at any past point in time T.
* There are various options for storing the tuples in a temporal relation.
* One is to store all the tuples in the same table and another option is to create two tables: one for the currently
valid information and the other for the rest of the tuples.
* The tuple versioning approach is already discussed for implementing temporal databases.
* In this approach, whenever one attribute value is changed, a whole new tuple version is created, even though
all the other attribute values will be identical to the previous tuple version. An alternative approach can be
used in database systems that support complex structured objects, such as object databases or object-
relational systems. This approach is called attribute versioning.
In attribute versioning, a single complex object is used to store all the temporal changes of the object. Each
attribute that changes over time is called a time varying attribute.
It has its values versioned over time by adding temporal periods to the attribute.
The temporal periods may represent valid time, transaction time, or bitemporal time, depending on the application
requirements.
Uni-Temporal Relations : Has one axis of time, either Valid Time or Transaction Time.
6.1.4 Bi-Temporal Relation (Data Using Both Valid and Transaction Time)
A bi-temporal database includes both the valid time and the transaction time. Transaction time records the
time period during which a database entry is made. So, now the database will have four additional entries: the
valid from, valid till, transaction entered and transaction superseded.
The database contents will look as shown below: Name, City, Valid From, Valid Till, Entered, Superseded
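The example of John given earlier can be written out as bi-temporal rows, which makes the difference between the two time dimensions concrete. This is a sketch under assumptions: "9999-12-31" stands for an open-ended period, periods are inclusive at day granularity (so the Chennai address is valid till the day before the move), and cityOf is a helper name invented here.

```javascript
const FOREVER = "9999-12-31"; // assumption: marks a period that is still open

// John's address history. ISO dates compare correctly as plain strings.
const address = [
  // Recorded at birth registration; superseded when the move was registered.
  { Name: "John", City: "Chennai", ValidFrom: "1992-04-03", ValidTill: FOREVER,
    Entered: "1992-04-06", Superseded: "2016-01-10" },
  // Corrected view entered on 10 Jan 2016: Chennai was valid only until the move.
  { Name: "John", City: "Chennai", ValidFrom: "1992-04-03", ValidTill: "2015-06-20",
    Entered: "2016-01-10", Superseded: FOREVER },
  { Name: "John", City: "Mumbai", ValidFrom: "2015-06-21", ValidTill: FOREVER,
    Entered: "2016-01-10", Superseded: FOREVER }
];

// Where did the person live on validOn, according to the database as it stood on asRecordedOn?
function cityOf(name, validOn, asRecordedOn) {
  const row = address.find((r) =>
    r.Name === name &&
    r.ValidFrom <= validOn && validOn <= r.ValidTill &&         // valid time
    r.Entered <= asRecordedOn && asRecordedOn < r.Superseded);  // transaction time
  return row ? row.City : null;
}

console.log(cityOf("John", "2015-12-01", "2015-12-31")); // Chennai (move not yet registered)
console.log(cityOf("John", "2015-12-01", "2016-02-01")); // Mumbai  (after the registration)
console.log(cityOf("John", "1992-04-04", "1992-04-05")); // null    (birth not yet registered)
```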
6.2.1 Introduction
A graph database is an online database management system with Create, Read, Update and Delete
(CRUD) operations working on a graph data model. Data is represented as a graph, that is, a collection of vertices
(nodes) and edges. It is possible to store data associated with both individual nodes and individual edges.
For example, Twitter's data can be easily represented as a graph. Consider a small network of followers.
The relationships are key here in establishing the semantic context: namely, that simran follows john, and
that john, in turn, follows simran. Ruth and john likewise follow each other. So it is easy to show all these
connections with the help of a graph database. A graph is composed of two elements: a node and a
relationship. Each node represents an entity (a person, place, thing) and each relationship represents how
two nodes are associated.
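The follower network described above can be held in a very small node and relationship structure. The sketch below is only an in-memory illustration of the data model. The id values, the Person label and the FOLLOWS type name are assumptions, and a graph database would store and index this quite differently.

```javascript
// Nodes: each entity, with its properties.
const nodes = [
  { id: 1, label: "Person", properties: { name: "simran" } },
  { id: 2, label: "Person", properties: { name: "john" } },
  { id: 3, label: "Person", properties: { name: "Ruth" } }
];

// Relationships: directed, typed connections between two nodes.
const relationships = [
  { type: "FOLLOWS", from: 1, to: 2 }, // simran follows john
  { type: "FOLLOWS", from: 2, to: 1 }, // john follows simran
  { type: "FOLLOWS", from: 3, to: 2 }, // Ruth follows john
  { type: "FOLLOWS", from: 2, to: 3 }  // john follows Ruth
];

const idOf = (name) => nodes.find((n) => n.properties.name === name).id;
const nameOf = (id) => nodes.find((n) => n.id === id).properties.name;

// Whom does this person follow? (outgoing FOLLOWS relationships)
function follows(name) {
  const id = idOf(name);
  return relationships
    .filter((r) => r.type === "FOLLOWS" && r.from === id)
    .map((r) => nameOf(r.to));
}

// Who follows this person? (incoming FOLLOWS relationships)
function followersOf(name) {
  const id = idOf(name);
  return relationships
    .filter((r) => r.type === "FOLLOWS" && r.to === id)
    .map((r) => nameOf(r.from));
}

console.log(follows("john"));               // [ 'simran', 'Ruth' ]
console.log(followersOf("john"));           // [ 'simran', 'Ruth' ]
console.log(follows("Ruth"));               // [ 'john' ]
console.log(follows("simran").includes("Ruth")); // false
```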
* This general-purpose structure allows you to model all kinds of scenarios, from a system of roads, to a
network of devices, to a population's medical history or anything else defined by relationships.
6.2.2 Features of Graph Database
1. Performance
Your data volume will definitely increase in the future, but what's going to increase at an even faster
clip is the connections (or relationships) between your data. Big data will definitely get bigger, but
connected data will grow exponentially.
In the traditional databases, relationship queries come to a grinding halt as the number and depth of
relationships increase. In contrast, graph database performance stays constant even as your data grows
year over year.
2. Flexibility
With graph databases, your IT and data architecture teams move at the speed of business because the
structure and schema of a graph data model flex as your solutions and industry change. Your team
doesn't have to exhaustively model your domain ahead of time (and then exhaustively remodel and
migrate the DB after some exec asks for a change); instead, you can add to the existing structure without
endangering current functionality.
With the graph database model, you are the one dictating changes and taking charge, whereas the
RDBMS data model restricts you to its tabular way of seeing the world.
3. Agility
Developing with graph technology aligns perfectly with today's agile, test-driven development practices,
allowing your graph-database-backed application to evolve with your changing business requirements.
Your agile team now has a database that keeps up with your daily demands.
The main building blocks of Graph DB Data Model are:
o Nodes
o Relationships
o Properties
Fig. 6.2.1 : Graph DB Data Model
2. Soft State
Neo4j is a popular Graph Database. Other Graph Databases are Oracle NoSQL Database, OrientDB,
HyperGraphDB, GraphBase, InfiniteGraph, and AllegroGraph.
Querying
Cypher is the high-level query language of Neo4j. There are
declarative commands for creating nodes and relationships (see Figures 24.4(a) and (b)), as well as for
finding nodes and relationships based on specifying patterns.
Deletion and modification of data is also possible in Cypher. We introduced the CREATE command in the
previous section, so we will now give a brief overview of some of the other features of Cypher. A Cypher
query is made up of clauses. When a query has several clauses, the result from one clause can be the input to
the next clause in the query.
Cypher Keywords
Most programming languages have keywords. In the same way, there are a few keywords in Cypher
reserved for specific actions in parts of a query. We need to be able to create, read, update, or delete data
in Neo4j, and keywords help us accomplish that functionality.
Let us check in detail two common keywords : A. MATCH B. RETURN
A. MATCH
The MATCH keyword in Cypher is used to search for an existing node, relationship, label, property, or pattern
in the database. Compared with SQL, MATCH works like SELECT in SQL.
You can find all node labels in the database, search for a particular node, find all the nodes with a particular
relationship, look for patterns of nodes and relationships, and much more using MATCH.
B. RETURN
The RETURN keyword in Cypher specifies what values or results you might want to return from a Cypher
query. You can tell Cypher to return nodes, relationships, node and relationship properties, or patterns in
your query results. RETURN is not required when doing write procedures, but is needed for reads.
The node and relationship variables we discussed earlier become important when using RETURN. In order to
bring back nodes, relationships, properties, or patterns, you need to have variables specified in your MATCH
clause for the data you want to return.
Cypher query examples
Let us look at some examples of the syntax we have learned so far using MATCH and RETURN keywords.
Each example will start with an explanation of what we are trying to achieve and have an image below of the
results of the query run in Neo4j Browser.
Example 1
* Find the labeled Person nodes in the graph. Note that we must use a variable like p for the Person node if
we want to retrieve the node in the RETURN clause.
o MATCH (p:Person)
o RETURN p
o LIMIT
Example 2
* Find Person nodes in the graph that have a name of 'Tom Hanks'. Remember that we can name our variable
anything we want, as long as we reference that same name later.
* Query can be written in Cypher as :
MATCH (tom:Person {name: 'Tom Hanks'})
RETURN tom
(Link for more queries: https://neo4j.com/developer/cypher/querying/)
6.2.6 Neo4j Database Server Setup with Windows exe File
> Step 1: Visit the Neo4j official site using https://neo4j.com/. On clicking, this link will take you to the
homepage of the Neo4j website.
> Step 2: As highlighted in the above screenshot, this page has a Download button on the top right hand side.
Click it.
> Step 3: This will redirect you to the downloads page, where you can download the community edition and
the enterprise edition of Neo4j. Download the community edition of the software by clicking the respective
button.
[Screenshots: Neo4j download page and the downloaded Neo4j 3.1.1 community edition installer for Windows x64]
Select the folder where you would like Neo4j Community Edition to be installed, then click
Next.
> Step 6: Accept the license agreement and proceed with the installation. After completion of the process, you
can observe that Neo4j is installed in your system.
(https://www.tutorialspoint.com/neo4j/neo4j_environment_setup.htm)
Cisco Systems
"Real-Time Graph Analysis of Documents Saves Company Over 4 Million Employee Hours". The sales
team at Cisco Systems relies on an extensive series of documents that help them close deals with potential
customers. By using Neo4j, Cisco was able to create a metadata graph to make relevant sales content findable,
saving the company millions of hours of otherwise-wasted staff time.
The Company
Cisco Systems is an IT leader that designs, manufactures and sells networking equipment to
enterprise and service providers, small businesses and individuals. With more than 70,000 employees in over
165 countries, they are constantly working to create and patent new networking technologies. An integral
part of their DNA is creating long-lasting customer partnerships, working with them to identify their needs and
provide solutions that support their success.
The Challenge
* Because of the scope of Cisco's sales pipeline, there is a huge amount of content, such as documents, files
and presentations, in their internal database that Cisco's sales team relies on to sign potential customers.
However, there was a major content findability problem: each salesperson spent up to one hour every day
trying to find the content relevant to their prospects' needs.
* The company was relying on a typical index-driven search engine that their employees could search with a
series of keywords. But because files didn't have assigned metadata, it was a challenge to pull up relevant
content. The problem was too much content, and no deeper understanding of the content.
The Strategy
To address their findability issue, Cisco had a big job ahead of them. They would have to assign metadata to
all of their content and find a way to make conventional document browsing smarter so their sales team
wouldn't have to go through long, complicated routes to get to the relevant content. They would also need to
assign metadata tags to a huge library of historical files and tag new documents in real time.
The Solution
* Cisco turned to Neo4j to solve these challenges. To assign metadata to the large collection of Cisco's
historical documents, the first step was to transform the file types, such as Microsoft Word and PDF, into
a latent Dirichlet allocation (LDA) format so the documents could be clustered by large data platforms.
Once the documents were clustered, a collection of common keywords and phrases was fed into Neo4j,
where they were combined to create an ontology.
* For real-time document processing, the document is sent from the content management system to a
machine tagging service that reprocesses the document, assigns tags and adds the keywords and phrases into
the Neo4j database while returning the document to the document repository. The ability to assign metadata
to historical data, and in real time, solved Cisco's content findability problem.
* But Neo4j took it one step further. Based on keywords, content ratings and the number of times the
document has been accessed, Neo4j was also able to provide content recommendations, providing sellers
with additional information they could leverage when closing deals with customers.
The Result
* Now Cisco has a robust search engine that saves their staff time and increases their ability to focus on
additional customers. They have fewer search results, which are in turn more accurate and effective. With
about 20 million documents, search is done in half the time.
Cisco created their own global sales kit to converge related content together so their salespeople can click on
any grouping of subjects. The sales kit tracks views and how often a piece of content was downloaded, and all
of that rich information comes back to their Neo4j system.
Cisco's sellers now have the ability to search their vast document database and quickly provide relevant
content to their customers and prospects. The company now saves over four million hours a year that are
now used to engage with more prospects and close more deals.
Example : Oracle Spatial Extension works with the Oracle 10g DBMS. It supports spatial data types (e.g.
polygon) and operations (e.g. overlap) callable from the SQL3 query language, and it
has spatial indices, e.g. R-trees.
% 6.3.1 Spatial Data Types
ed ue Belair ksa ee
~
: Buildin :
For examples : Dullcings, cellular towers, or stationary vehicles. Moving vehicles and other moving objects
.
can be represented by sequence of point locations that change over time.
4
Lines :- ItIti is a representati
Pp i
on of moving ‘
through or connections in; space and it shows sequence of points
Lines oa objects having length, such as roads or rivers, whose spatial characteristics can be
approximated by sequence of connected lines,
Polygons : Polygons are used to represent characteristics of objects that have boundary, like states, lakes, f or
countries.
Attribute data
Geographic Information Systems (GIS) use the descriptive data that is associated with the features in the map.
For example, in a map representing the districts within an Indian state (e.g. Delhi), the attributes could be population, largest city/town, area in square miles, water portion of the land, and so on.
Image data
It includes camera data such as satellite images and aerial photographs. Objects of interest, such as buildings and roads, can be identified and overlaid on these images.
A. Topological operators
Topological properties do not vary when topological transformations, like translation or rotation, are applied.
Topological operators are hierarchically structured in many levels.
o The base level offers operators with the ability to check for detailed topological relations between regions with a broad boundary.
o The higher levels offer more abstract operators that allow users to query uncertain spatial data independent of the geometric data model.
Examples : open (region), close (region), and inside (point, loop).
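As an illustration of the inside (point, loop) operator, the sketch below uses the standard ray-casting test and then checks that the answer is unchanged when the point and the loop are translated together. The function names and coordinates are hypothetical, and points lying exactly on the boundary are not handled specially.

```python
from typing import List, Tuple

Coord = Tuple[float, float]


def inside(point: Coord, loop: List[Coord]) -> bool:
    """Ray casting: shoot a ray to the right and toggle at every edge it crosses."""
    x, y = point
    is_inside = False
    n = len(loop)
    for i in range(n):
        x1, y1 = loop[i]
        x2, y2 = loop[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                is_inside = not is_inside
    return is_inside


def translate(coords: List[Coord], dx: float, dy: float) -> List[Coord]:
    """Shift every coordinate by the same offset."""
    return [(x + dx, y + dy) for x, y in coords]


square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
p_in, p_out = (1.0, 1.0), (5.0, 1.0)

moved_square = translate(square, 10.0, -3.0)
moved_in, moved_out = translate([p_in, p_out], 10.0, -3.0)

# The topological relation is the same before and after the translation.
print(inside(p_in, square), inside(moved_in, moved_square))
print(inside(p_out, square), inside(moved_out, moved_square))
```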
B. Projective operators
Projective operators, like convex hull, are used to establish predicates regarding the concavity or convexity of objects.
Example : Being inside the object's concavity.
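One way to read the "inside the concavity" predicate is: the point lies inside the convex hull of the object but outside the object itself. The sketch below follows that reading, using Andrew's monotone chain algorithm for the hull and a ray-casting test for containment. All function names and the U-shaped sample object are assumptions made for illustration.

```python
from typing import List, Tuple

Coord = Tuple[float, float]


def cross(o: Coord, a: Coord, b: Coord) -> float:
    """Cross product of OA and OB; positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Coord]) -> List[Coord]:
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Coord] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Coord] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def inside(point: Coord, loop: List[Coord]) -> bool:
    """Ray-casting containment test (boundary points not handled specially)."""
    x, y = point
    result = False
    n = len(loop)
    for i in range(n):
        x1, y1 = loop[i]
        x2, y2 = loop[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                result = not result
    return result


def in_concavity(point: Coord, polygon: List[Coord]) -> bool:
    """Inside the convex hull of the object, but not inside the object."""
    return inside(point, convex_hull(polygon)) and not inside(point, polygon)


# A U-shaped object with a notch cut into its top edge (made-up coordinates).
u_shape = [(0.0, 0.0), (6.0, 0.0), (6.0, 4.0), (4.0, 4.0),
           (4.0, 2.0), (2.0, 2.0), (2.0, 4.0), (0.0, 4.0)]

print(convex_hull(u_shape))
print(in_concavity((3.0, 3.0), u_shape))  # point in the notch
print(in_concavity((1.0, 1.0), u_shape))  # point in the solid part
```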
C. Metric operators
1. Field 2. Object
* Field : These models are used to model spatial data that is continuous in nature, e.g. terrain elevation, air quality index, temperature data, and soil variation characteristics.
* Object : These models have been used for applications such as transportation networks, land parcels, buildings, and other objects that possess both spatial and non-spatial attributes.
A spatial application is modeled using either a field based or an object based model, which depends on the requirements and the traditional choice of model for the application. Example : High traffic analysing system, etc.
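The difference between the two models can be sketched in a few lines of Python. In this illustration a field is a function that returns a value for every location in the study area (here backed by a small elevation grid), while an object is a discrete record carrying both spatial and non-spatial attributes. The grid values, `CELL_SIZE`, `elevation_at` and `LandParcel` are all hypothetical, and coordinates are assumed to be non-negative.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Field model: a value exists for every location in the study area.
# Hypothetical 3 x 3 elevation grid (metres); each cell covers 10 x 10 units.
ELEVATION_GRID = [
    [120.0, 125.0, 131.0],
    [118.0, 122.0, 128.0],
    [115.0, 119.0, 124.0],
]
CELL_SIZE = 10.0


def elevation_at(x: float, y: float) -> float:
    """Return the elevation of the grid cell that contains (x, y)."""
    col = min(int(x // CELL_SIZE), len(ELEVATION_GRID[0]) - 1)
    row = min(int(y // CELL_SIZE), len(ELEVATION_GRID) - 1)
    return ELEVATION_GRID[row][col]


# Object model: discrete entities with spatial and non-spatial attributes.
@dataclass
class LandParcel:
    boundary: List[Tuple[float, float]]  # spatial attribute
    owner: str                           # non-spatial attribute
    land_use: str                        # non-spatial attribute


parcel = LandParcel(
    boundary=[(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)],
    owner="A. Sharma",
    land_use="residential",
)

print(elevation_at(15.0, 5.0))        # field: ask for the value at a location
print(parcel.owner, parcel.land_use)  # object: ask for an entity's attributes
```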
* Requests for spatial data that make use of spatial operations are called spatial queries. Spatial queries can be divided as shown below :
1. Range queries : These types of spatial queries find all objects of a particular type that are within a given spatial area.
Example : Find all hospitals within the Pimpri Chinchwad area. A variation of this query is, for a given location, to find all objects within a particular distance, for example, find all banks within a 5 km range.
2. Nearest neighbor queries : These types of spatial queries find the object of a particular type which is nearest to a given location.
4. Spatial Queries : List the names of all bookstores within ten miles of a particular region in the city. List all customers who live in Maharashtra and its adjoining states.
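The range query and nearest neighbor query described above can be sketched with a simple linear scan. The bank names and coordinates below are made up, distances are assumed to be planar and measured in kilometres, and a real spatial database would normally use a spatial index rather than scanning every object.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float]

# Hypothetical bank locations, in kilometres from some origin.
banks: Dict[str, Coord] = {
    "Bank A": (1.0, 1.0),
    "Bank B": (4.0, 5.0),
    "Bank C": (9.0, 9.0),
}


def range_query(objects: Dict[str, Coord], center: Coord,
                radius: float) -> List[str]:
    """Names of all objects within `radius` of `center`, sorted by name."""
    return sorted(name for name, loc in objects.items()
                  if math.dist(loc, center) <= radius)


def nearest_neighbor(objects: Dict[str, Coord], location: Coord) -> str:
    """Name of the object closest to `location`."""
    return min(objects, key=lambda name: math.dist(objects[name], location))


# Range query: find all banks within a 5 km range of the location (2, 2).
print(range_query(banks, (2.0, 2.0), 5.0))

# Nearest neighbor query: find the bank nearest to the location (8, 8).
print(nearest_neighbor(banks, (8.0, 8.0)))
```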
Important application domains with spatial data and queries are listed below :
1. Army Field Commander : Has there been any significant enemy troop movement since last night?
2. Insurance Risk Manager : Which homes are most likely to be affected in the next great flood on the Mississippi?
3. Medical Doctor : Based on this patient's MRI, have we treated somebody with a similar condition?
4. Mobile phone user : Where is the nearest gas station? Where is the nearest Domino's pizza shop?
Two types of spatial data that are particularly important to consider for evaluation or analysis are as given below :
Computer Aided Design (CAD) data : It includes spatial information about how objects like buildings, cars ... computer-aided-design databases are integrated-circuit ...
6.5 MULTIPLE CHOICE QUESTIONS
Q. 6.1 Most __ allow the representation of simple geometric objects such as points, lines and polygons.
(a) Active database
(b) Temporal database
(c) Spatial database
(d) Deductive databases
Ans. : (c)
Q. 6.2 GIS stands for
(a) Geographic Information System
(b) Generic Information System
(c) Geological Information System
(d) Geographic Information Sharing
Ans. : (a)
Q. 6.3 GIS deals with which kind of data
(a) Numeric data
(b) Binary data
(c) Spatial data
(d) Complex data
Ans. : (c)
Q. 6.4 By 'spatial data' we mean data that has
(a) Complex values
(b) Positional values
(c) Graphic values
(d) Decimal values
Ans. : (b)
Q. 6.5 'Spatial databases' are also known as __
(a) Geodatabases
(b) Monodatabases
(c) Concurrent databases
(d) None of the above
Ans. : (a)
Q. 6.6 A (geographic) field is a geographic phenomenon for which, for every point in the study area
(a) A value can be determined
(b) A value cannot be determined
(c) A value is not relevant
(d) A value is missing
Ans. : (a)
Q. 6.7 The term that means the value of a data at a particular time is __
(a) Temporal data
(b) Spatial data
(c) Interval data
(d) Graphical data
Ans. : (a)
Q. 6.8 Neo4j is
(a) Graph database
(b) Relational database
(c) Query language
(d) Temporal database
Ans. : (a)
Q. 6.9 Cypher is used for querying in
(a) Graph database
(b) Relational database
(c) Query language
(d) Temporal database
Ans. : (a)
Q. 6.10 Events or facts are represented in __
(a) Graph database
(b) Relational database
(c) Query language
(d) Temporal database
Ans. : (a)
Chapter Ends...