CS 8492-DATABASE MANAGEMENT SYSTEMS

UNIT V ADVANCED TOPICS

Distributed Databases: Architecture, Data Storage, Transaction Processing – Object-based
Databases: Object Database Concepts, Object-Relational features, ODMG Object Model,
ODL, OQL – XML Databases: XML Hierarchical Model, DTD, XML Schema, XQuery –
Information Retrieval: IR Concepts, Retrieval Models, Queries in IR systems.

Part-A

1. Define mobile database with an example.


Mobile databases are separate from the main database and can easily be transported to various
places. Even though they are not connected to the main database, they can still communicate with the
database to share and exchange data.
The mobile database includes the following components:
1. The main system database that stores all the data and is linked to the mobile database.
2. The mobile database that allows users to view information even while on the move. It shares
information with the main database.
3. The device that uses the mobile database to access data. This device can be a mobile phone,
laptop etc.
4. A communication link that allows the transfer of data between the mobile database and the
main database.

2. List information types of documents necessary for relevance ranking of documents in IR.
( Nov/Dec-2019)
Relevance ranking is based on factors such as:
o Term frequency
 Frequency of occurrence of the query keyword in the document
o Inverse document frequency
 How many documents the query keyword occurs in
 Fewer documents → give more importance to the keyword
o Hyperlinks to documents
 More links to a document → the document is more important
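The first two factors can be combined into a single score. The following is a minimal Python sketch, assuming a toy document collection and a log-based IDF weighting; the function name `tf_idf_scores` and the sample documents are illustrative assumptions, not a formula given in these notes.

```python
import math
from collections import Counter

def tf_idf_scores(docs, keyword):
    """Score each document for one keyword as TF * IDF (illustrative weighting)."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    # Document frequency: in how many documents does the keyword occur?
    df = sum(1 for words in tokenized.values() if keyword in words)
    if df == 0:
        return {name: 0.0 for name in docs}
    # Fewer documents containing the keyword -> larger IDF -> more importance.
    idf = math.log(len(docs) / df)
    scores = {}
    for name, words in tokenized.items():
        # Relative term frequency of the keyword in this document.
        tf = Counter(words)[keyword] / len(words) if words else 0.0
        scores[name] = tf * idf
    return scores

# Hypothetical three-document collection.
docs = {
    "d1": "database systems store data",
    "d2": "distributed database database design",
    "d3": "xml documents and schemas",
}
scores = tf_idf_scores(docs, "database")
```

Here `d2` scores highest because the keyword occurs twice in it, and `d3` scores zero because the keyword does not occur in it at all.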

3. What can one understand from an allocation schema? (Nov/Dec-2019)


Each fragment—or each copy of a fragment—must be assigned to a particular site in the
distributed system. This process is called data distribution (or data allocation). The choice of sites
and the degree of replication depend on the performance and availability goals of the system and on
the types and frequencies of transactions submitted at each site.
For example, if high availability is required, transactions can be submitted at any site, and most
transactions are retrieval only, a fully replicated database is a good choice. Finding an optimal or even
a good solution to distributed data allocation is a complex optimization problem.

4. How is a transaction performed in an object-oriented database? (Nov/Dec-2018)

 Database systems were implemented to provide an efficient and consistent way of manipulating
data, so object-oriented database systems must also support these features to be called
"database systems".
 One of the basic notions is the ACID (Atomicity, Consistency, Isolation, Durability) transaction.

 When a transaction is committed, the list of modified objects is traversed recursively by
following all references in each object.
 So if a reference to a newly created object N is assigned to a persistent object P, then
object P is included in the list of modified objects.
 During transaction commit, the OODBMS traverses object P and reaches object N. It detects that N is
not yet persistent (it has not been assigned an OID) and makes it persistent by allocating space for it in
storage and assigning an OID.
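The commit-time traversal described above can be sketched in Python as follows. The classes `PersistentObject` and `ToyStore` are hypothetical names invented for this illustration; a real OODBMS would also write object state to disk and handle concurrency, which this sketch leaves out.

```python
import itertools

class PersistentObject:
    """Toy object; `oid` stays None until the object becomes persistent."""
    def __init__(self, name):
        self.name = name
        self.oid = None
        self.refs = []          # references to other objects

class ToyStore:
    """Hypothetical store that tracks modified objects for one transaction."""
    def __init__(self):
        self._next_oid = itertools.count(1)
        self.storage = {}       # oid -> object
        self.modified = []      # persistent objects changed in this transaction

    def make_persistent(self, obj):
        obj.oid = next(self._next_oid)
        self.storage[obj.oid] = obj

    def add_reference(self, source, target):
        source.refs.append(target)
        # A persistent object that changes joins the list of modified objects.
        if source.oid is not None and source not in self.modified:
            self.modified.append(source)

    def commit(self):
        # Follow references transitively from every modified object
        # (an explicit stack stands in for recursion).
        visited = set()
        stack = list(self.modified)
        while stack:
            obj = stack.pop()
            if id(obj) in visited:
                continue
            visited.add(id(obj))
            if obj.oid is None:         # reachable but not yet persistent
                self.make_persistent(obj)
            stack.extend(obj.refs)
        self.modified.clear()

store = ToyStore()
p = PersistentObject("P")
store.make_persistent(p)        # P is already persistent
n = PersistentObject("N")       # N is newly created
store.add_reference(p, n)       # P now refers to N
store.commit()                  # N becomes persistent by reachability
```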

5. State the function of XML schema. (Nov/Dec-2017)


 registerSchema(): registers an XML Schema with Oracle Database.
 deleteSchema(): deletes a previously registered XML Schema.
 copyEvolve(): updates (evolves) a registered XML Schema.

6. Mention two features of Multimedia databases.(Apr/May-2019)


 Media data – The actual data representing an object.
 Media format data – Information such as sampling rate, resolution, encoding scheme etc.
about the format of the media data after it goes through the acquisition, processing and
encoding phase.
 Media keyword data – Keyword descriptions relating to the generation of the data. It is also
known as content descriptive data. Example: date, time and place of recording.
 Media feature data – Content dependent data such as the distribution of colors, kinds of
texture and different shapes present in data.

7. Compare sequential access devices versus random access devices with an example. (Apr/May-2019)
Comparing random versus sequential operations is one way of assessing application
efficiency in terms of disk use. Accessing data sequentially is much faster than accessing it randomly
because of the way in which the disk hardware works. The seek operation, which occurs when the
disk head positions itself at the right disk cylinder to access data requested, takes more time than any
other part of the I/O process.
Because reading randomly involves a higher number of seek operations than does sequential
reading, random reads deliver a lower rate of throughput. The same is true for random writing.

8. Define a distributed database management system. (Nov/Dec-16,Apr/May-18)


A distributed database management system (DDBMS) is a centralized software system that
manages a distributed database in a manner as if it were all stored in a single location.

Features:
 DDBMS is used to create, retrieve, update and delete distributed databases.
 It synchronizes the database periodically and provides access mechanisms by the virtue of
which the distribution becomes transparent to the users.
 It ensures that the data modified at any site is universally updated.
 It is used in application areas where large volumes of data are processed and accessed by
numerous users simultaneously.
 It is designed for heterogeneous database platforms.
 It maintains confidentiality and data integrity of the databases.

9. How does the concept of an object in the object-oriented model differ from the concept of an
entity in the Entity-Relationship (E-R) model? (Nov/Dec-16)
E-R Model:
ER model is used to represent real life scenarios as entities. The properties of these entities are
their attributes in the ER diagram and their connections are shown in the form of relationships. An ER
model is generally considered as a top down approach in data designing.
An example of ER model is:

Object Oriented Model:


Object oriented data model is based on using real life scenarios. In this model, the scenarios are
represented as objects. The objects with similar functionalities are grouped together and linked to
different other objects.
An Example of the Object Oriented data model is:

10. What is the difference between XML schema and XML DTD?
DTD, or Document Type Definition, and XML Schema, which is also known as XSD, are two
ways of describing the structure and content of an XML document. DTD is the older of the two, and as
such, it has limitations that XML Schema has tried to improve. The first difference between DTD and
XML Schema, is namespace awareness; XML Schema is, while DTD is not. Namespace awareness
removes the ambiguity that can result in having certain elements and attributes from multiple XML
vocabularies, by giving them namespaces that put the element or attribute into context.

Part of the reason why XML Schema is namespace aware while DTD is not, is the fact that XML
Schema is written in XML, and DTD is not. Therefore, XML Schemas can be programmatically
processed just like any XML document. XML Schema also eliminates the need to learn another
language, as it is written in XML, unlike DTD.

11. What is Crawling and Indexing the web? (Nov/Dec-2014)


Crawling the web:
Search engines have crawlers (also known as spiders) that “crawl” the World Wide Web to discover the
pages that exist, which helps identify the best web pages to evaluate for a query. Crawlers travel from
page to page by following website links.
These website links bind together pages in a website and websites across the web, and in doing
so, create a pathway for the crawlers to reach the trillions of interconnected website pages that exist.
Indexing the web:
Indexing is the act of adding information about a web page to a search engine’s index. The index
is a collection of web pages—a database—that includes information on the pages crawled by search
engine spiders.
The index catalogs and organizes:
 Detailed data on the nature of the content and topical relevance of each web page
 A map of all the pages that each page links to
 The clickable (anchor) text of any links
 Other information about links, such as if they are ads or not, where they are located on the
page, and other aspects of the context of the link and what that implies about the page receiving
the link
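Crawling by following links, together with two of the index entries listed above (the link map and the anchor text), can be sketched in Python. The in-memory `WEB` dictionary and the function name `crawl` are assumptions made for illustration; a real crawler would fetch pages over HTTP and parse HTML.

```python
from collections import deque

# Hypothetical in-memory "web": page -> list of (anchor text, target page).
WEB = {
    "home": [("Courses", "courses"), ("About us", "about")],
    "courses": [("DBMS notes", "dbms"), ("Home", "home")],
    "about": [],
    "dbms": [("Home", "home")],
    "orphan": [("Home", "home")],   # no page links here, so the crawl never reaches it
}

def crawl(web, seed):
    """Breadth-first crawl that follows links and records a link map and anchor texts."""
    link_map = {}        # page -> pages it links to
    anchor_text = {}     # target page -> anchor texts of links pointing at it
    queue = deque([seed])
    seen = {seed}
    while queue:
        page = queue.popleft()
        links = web.get(page, [])
        link_map[page] = [target for _, target in links]
        for text, target in links:
            anchor_text.setdefault(target, []).append(text)
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return link_map, anchor_text

link_map, anchor_text = crawl(WEB, "home")
```

Starting from `home`, the crawl discovers only pages reachable through links, so `orphan` never enters the index.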

12. What is Relevance Ranking? (Nov/Dec-2014)
Relevancy ranking is the method that is used to order the results list in such a way that the records
most likely to be of interest to a user will be at the front. This makes searching easier for users as they
won't have to spend as much time looking through records for the information that interests them.

13. Write short notes on Multi-DBMS Architectures.

This is an integrated database system formed by a collection of two or more autonomous database
systems. A multi-DBMS can be expressed through six levels of schemas:
 Multi-database View Level − Depicts multiple user views comprising subsets of the integrated
distributed database.
 Multi-database Conceptual Level − Depicts the integrated multi-database that comprises global logical
multi-database structure definitions.
 Multi-database Internal Level − Depicts the data distribution across different sites and the
multi-database to local data mapping.
 Local database View Level − Depicts the public view of local data.
 Local database Conceptual Level − Depicts local data organization at each site.
 Local database Internal Level − Depicts physical data organization at each site.
There are two design alternatives for a multi-DBMS:
 Model with multi-database conceptual level.
 Model without multi-database conceptual level.

14. What is homogeneous distributed database and heterogeneous distributed database.


A homogeneous distributed database has identical software and hardware running all database instances,
and may appear through a single interface as if it were a single database.
A heterogeneous distributed database may have different hardware, operating systems, database
management systems, and even data models for different databases.


15. Draw the representation Object Oriented Database.

16. Difference between DTD and XSD.

No.  DTD                                             XSD
1.   DTD stands for Document Type Definition.        XSD stands for XML Schema Definition.
2.   DTDs are derived from SGML syntax.              XSDs are written in XML.
3.   DTD doesn't support datatypes.                  XSD supports datatypes for elements and attributes.
4.   DTD doesn't support namespaces.                 XSD supports namespaces.
5.   DTD defines child order but not exact counts.   XSD defines child order and exact occurrence counts.
6.   DTD is not extensible.                          XSD is extensible.
7.   DTD is not simple to learn.                     XSD is simple to learn because you don't need to learn a new language.
8.   DTD provides less control over XML structure.   XSD provides more control over XML structure.

Part-B

1. Discuss in detail about the distributed databases. (Apr/May-2019) (or)
Explain with diagrammatic illustration the architecture of a distributed database
management system. (Apr/May-18)

Distributed Database System:

 Distributed database is a system in which storage devices are not connected to a common
processing unit.
 Database is controlled by Distributed Database Management System and data may be stored at
the same location or spread over the interconnected network. It is a loosely coupled system.
 Shared nothing architecture is used in distributed databases.

 The above diagram is a typical example of distributed database system, in which communication
channel is used to communicate with the different locations and every system has its own
memory and database.
Goals of Distributed Database system:
The concept of distributed database was built with a goal to improve:
Reliability: In a distributed database system, if one system fails or stops working for some time,
another system can complete the task.
Availability: In a distributed database system, availability can be maintained even if a server fails,
because another system is available to serve the client request.
Performance: Performance can be improved by distributing the database over different locations, so that
the databases are available at every location and are easy to maintain.
Types of distributed databases:
(i) Homogeneous distributed databases system:
 Homogeneous distributed database system is a network of two or more databases (With same
type of DBMS software) which can be stored on one or more machines.
 Example: Consider three departments that all use Oracle-9i as their DBMS. If changes
are made in one department, the other departments are updated as well.

(ii) Heterogeneous distributed database system.
 Heterogeneous distributed database system is a network of two or more databases with different
types of DBMS software, which can be stored on one or more machines.
 In this system, data can be accessed by several databases in the network with the help of generic
connectivity (ODBC and JDBC).
Example: In the following diagram, different DBMS software are accessible to each other using
ODBC and JDBC.

Distributed Data Storage:


There are 2 ways in which data can be stored on different sites. These are:
(i) Replication
In this approach, the entire relation is stored redundantly at 2 or more sites. If the entire
database is available at all sites, it is a fully redundant database. Hence, in replication, systems
maintain copies of data.
This is advantageous as it increases the availability of data at different sites. Also, now
query requests can be processed in parallel.
However, it has certain disadvantages as well. Data needs to be constantly updated. Any
change made at one site needs to be recorded at every site where that relation is stored, or else it may
lead to inconsistency. This is a lot of overhead. Also, concurrency control becomes much more
complex, as concurrent access now needs to be checked over a number of sites.
(ii) Fragmentation
In this approach, the relations are fragmented (i.e., they’re divided into smaller parts) and
each of the fragments is stored in different sites where they’re required. It must be made sure that
the fragments are such that they can be used to reconstruct the original relation (i.e, there isn’t any
loss of data).

Fragmentation is advantageous because it doesn’t create copies of data, so consistency is not a problem.
Fragmentation of relations can be done in two ways:
 Horizontal fragmentation – Splitting by rows – The relation is fragmented into groups of
tuples so that each tuple is assigned to at least one fragment.
 Vertical fragmentation – Splitting by columns – The schema of the relation is divided into
smaller schemas. Each fragment must contain a common candidate key so as to ensure
lossless join.
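The reconstruction requirement can be illustrated with a small Python sketch: horizontal fragments are recombined by a union, and vertical fragments by a join on the common key. The relation `ACCOUNT`, its attribute names, and the helper functions are assumptions chosen for illustration.

```python
# Toy relation; the attribute names and rows are illustrative assumptions.
ACCOUNT = [
    {"acc_no": "A-101", "branch": "Downtown", "balance": 500},
    {"acc_no": "A-102", "branch": "Perryridge", "balance": 400},
    {"acc_no": "A-201", "branch": "Brighton", "balance": 900},
]

def horizontal_fragments(rows, predicate):
    """Split by rows: tuples satisfying the predicate, and the rest."""
    return [r for r in rows if predicate(r)], [r for r in rows if not predicate(r)]

def vertical_fragment(rows, key, columns):
    """Split by columns: keep the key plus the chosen columns."""
    return [{c: r[c] for c in (key, *columns)} for r in rows]

def join_on_key(left, right, key):
    """Rebuild full tuples by joining two vertical fragments on the shared key."""
    right_by_key = {r[key]: r for r in right}
    return [{**l, **right_by_key[l[key]]} for l in left]

# Horizontal: reconstruction is the union of the fragments.
high, low = horizontal_fragments(ACCOUNT, lambda r: r["balance"] >= 500)
rebuilt_h = sorted(high + low, key=lambda r: r["acc_no"])

# Vertical: reconstruction is a join on the common key.
v1 = vertical_fragment(ACCOUNT, "acc_no", ["branch"])
v2 = vertical_fragment(ACCOUNT, "acc_no", ["balance"])
rebuilt_v = join_on_key(v1, v2, "acc_no")
```

Both `rebuilt_h` and `rebuilt_v` equal the original relation, which is what makes the fragmentation lossless. Without `acc_no` in both vertical fragments there would be no way to pair a branch with its balance.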

2. Give XML representation of bank management system and also explain about
Document Type Definition and XML schema. (Nov/Dec-2018)

Since the XML format is widely accepted, a wide variety of tools are available to assist in
its processing, including browser software and database tools. Just as SQL is the dominant language
for querying relational data, XML is becoming the dominant format for data exchange.
<bank>
<account>
<account-number> A-101 </account-number>
<branch-name> Downtown </branch-name>
<balance> 500 </balance>
</account>
<account>
<account-number> A-102 </account-number>
<branch-name> Perryridge </branch-name>
<balance> 400 </balance>
</account>
<account>
<account-number> A-201 </account-number>
<branch-name> Brighton </branch-name>
<balance> 900 </balance>
</account>
<customer>
<customer-name> Johnson </customer-name>
<customer-street> Alma </customer-street>
<customer-city> Palo Alto </customer-city>
</customer>
<customer>

<customer-name> Hayes </customer-name>
<customer-street> Main </customer-street>
<customer-city> Harrison </customer-city>
</customer>
<depositor>
<account-number> A-101 </account-number>
<customer-name> Johnson </customer-name>
</depositor>
<depositor>
<account-number> A-201 </account-number>
<customer-name> Johnson </customer-name>
</depositor>
<depositor>
<account-number> A-102 </account-number>
<customer-name> Hayes </customer-name>
</depositor>
</bank>
(a) XML representation of bank information.
Structure of XML Data:
The fundamental construct in an XML document is the element. An element is simply a pair
of matching start- and end-tags, and all the text that appears between them. XML documents must
have a single root element that encompasses all other elements in the document. In the example in
Figure (a), the <bank> element forms the root element. Further, elements in an XML document must
nest properly. For instance,
<account> . . . <balance> . . . </balance> . . . </account>
is properly nested, whereas
<account> . . . <balance> . . . </account> . . . </balance>
is not properly nested. While proper nesting is an intuitive property, we may define it more
formally. Text is said to appear in the context of an element if it appears between the start-tag and
end-tag of that element. Tags are properly nested if every start-tag has a unique matching end-tag
that is in the context of the same parent element.
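A parser enforces both rules (single root element, proper nesting). The following minimal Python sketch uses the standard library's `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<account><balance>500</balance></account>"
bad = "<account><balance>500</account></balance>"   # tags overlap instead of nesting
two_roots = "<account/><account/>"                   # no single root element
```

`is_well_formed(good)` is accepted, while the overlapping tags in `bad` and the second root in `two_roots` are rejected with a parse error.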
 The ability to nest elements within other elements provides an alternative way to represent
information. Figure (c) shows a representation of the bank information from Figure (a), but with
account elements nested within customer elements.
 The nested representation makes it easy to find all accounts of a customer, although it
would store account elements redundantly if they are owned by multiple customers.

Nested representations are widely used in XML data interchange applications to avoid
joins.
 For instance, a shipping application would store the full address of sender and receiver
redundantly on a shipping document associated with each shipment, whereas a normalized
representation may require a join of shipping records with a company-address relation to
get address information.
 In addition to elements, XML specifies the notion of an attribute. For instance, the type of
an account can be represented as an attribute, as in Figure (d). The attributes of an element
appear as name=value pairs before the closing “>” of a tag.
 Attributes are strings, and do not contain markup. Furthermore, attributes can appear only
once in a given tag, unlike subelements, which may be repeated.
<account>
This account is seldom used any more.
<account-number> A-102 </account-number>
<branch-name> Perryridge </branch-name>
<balance> 400 </balance>
</account>
(b) Mixture of text with subelements
<bank-1>
<customer>
<customer-name> Johnson </customer-name>
<customer-street> Alma </customer-street>
<customer-city> Palo Alto </customer-city>
<account>
<account-number> A-101 </account-number>
<branch-name> Downtown </branch-name>
<balance> 500 </balance>
</account>
<account>
<account-number> A-201 </account-number>
<branch-name> Brighton </branch-name>
<balance> 900 </balance>
</account>
</customer>
<customer>
<customer-name> Hayes </customer-name>
<customer-street> Main </customer-street>
<customer-city> Harrison </customer-city>
<account>
<account-number> A-102 </account-number>
<branch-name> Perryridge </branch-name>
<balance> 400 </balance>
</account>
</customer>
</bank-1>
(c) Nested XML representation of bank information.
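The trade-off between the flat layout of Figure (a) and the nested layout of Figure (c) can be sketched in Python. The XML strings below are trimmed versions of the figures (street, city and branch elements omitted, whitespace removed), and the function names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

FLAT = """<bank>
  <account><account-number>A-101</account-number><balance>500</balance></account>
  <account><account-number>A-102</account-number><balance>400</balance></account>
  <account><account-number>A-201</account-number><balance>900</balance></account>
  <depositor><account-number>A-101</account-number><customer-name>Johnson</customer-name></depositor>
  <depositor><account-number>A-201</account-number><customer-name>Johnson</customer-name></depositor>
  <depositor><account-number>A-102</account-number><customer-name>Hayes</customer-name></depositor>
</bank>"""

NESTED = """<bank-1>
  <customer><customer-name>Johnson</customer-name>
    <account><account-number>A-101</account-number><balance>500</balance></account>
    <account><account-number>A-201</account-number><balance>900</balance></account>
  </customer>
  <customer><customer-name>Hayes</customer-name>
    <account><account-number>A-102</account-number><balance>400</balance></account>
  </customer>
</bank-1>"""

def balances_flat(xml_text, customer):
    """Flat layout: join depositor with account on account-number."""
    root = ET.fromstring(xml_text)
    owned = {d.findtext("account-number") for d in root.iter("depositor")
             if d.findtext("customer-name") == customer}
    return sorted(int(a.findtext("balance")) for a in root.iter("account")
                  if a.findtext("account-number") in owned)

def balances_nested(xml_text, customer):
    """Nested layout: accounts are children of the customer, so no join is needed."""
    root = ET.fromstring(xml_text)
    for c in root.iter("customer"):
        if c.findtext("customer-name") == customer:
            return sorted(int(a.findtext("balance")) for a in c.findall("account"))
    return []
```

Both functions return the same balances for a customer, but the flat version needs a pass over the depositor elements first, which is the join that the nested representation avoids.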
 One final syntactic note is that an element of the form <element></element>, which
contains no subelements or text, can be abbreviated as <element/>; abbreviated elements
may, however, contain attributes.
 Since XML documents are designed to be exchanged between applications, a namespace
mechanism has been introduced to allow organizations to specify globally unique names
to be used as element tags in documents.
...
<account acct-type="checking">
<account-number> A-102 </account-number>
<branch-name> Perryridge </branch-name>
<balance> 400 </balance>
</account>
...
(d) Use of attributes.
 The idea of a namespace is to prepend each tag or attribute with a universal resource
identifier (for example, a Web address). Thus, for example, if First Bank wanted to ensure
that XML documents it created would not duplicate tags used by any business partner’s XML
documents, it can prepend a unique identifier with a colon to each tag name.
 The bank may use a Web URL such as http://www.FirstBank.com as a unique identifier. Using
long unique identifiers in every tag would be rather inconvenient, so the namespace standard
provides a way to define an abbreviation for identifiers.
 In Figure (e), the root element (bank) has an attribute xmlns:FB, which declares that FB is
defined as an abbreviation for the URL given above. The abbreviation can then be used in
various element tags, as illustrated in the figure.
 A document can have more than one namespace, declared as part of the root element.
Different elements can then be associated with different namespaces. A default namespace
can be defined, by using the attribute xmlns instead of xmlns:FB in the root element.
 Elements without an explicit namespace prefix would then belong to the default
namespace.
 Sometimes we need to store values containing tags without having the tags
interpreted as XML tags. To allow this, XML provides the following construct:
<![CDATA[<account> · · ·</account>]]>
Because it is enclosed within CDATA, the text <account> is treated as normal text data, not as a
tag. The term CDATA stands for character data.
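The effect of CDATA can be checked with Python's standard `xml.etree.ElementTree` parser. The wrapping `<note>` element and the account number are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# The <note> wrapper is a hypothetical element that holds the CDATA section.
doc = "<note><![CDATA[<account> A-101 </account>]]></note>"
note = ET.fromstring(doc)

content = note.text          # the markup-looking text, kept as plain characters
child_count = len(note)      # number of child elements parsed inside <note>
```

The parser reports the `<account>` markup as ordinary text of `<note>` and creates no child element for it.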
<bank xmlns:FB="http://www.FirstBank.com">
...
<FB:branch>
<FB:branchname> Downtown </FB:branchname>
<FB:branchcity> Brooklyn </FB:branchcity>
</FB:branch>
...
</bank>
(e) Unique tag names through the use of namespaces.
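How a parser expands the FB abbreviation can be seen with Python's standard `xml.etree.ElementTree`. The document below is the content of Figure (e) with the elided parts left out; the prefix map `ns` is an assumption of this sketch and may use any prefix, as long as it maps to the same URL.

```python
import xml.etree.ElementTree as ET

doc = """<bank xmlns:FB="http://www.FirstBank.com">
  <FB:branch>
    <FB:branchname>Downtown</FB:branchname>
    <FB:branchcity>Brooklyn</FB:branchcity>
  </FB:branch>
</bank>"""

root = ET.fromstring(doc)
# Prefix map used only for searching.
ns = {"FB": "http://www.FirstBank.com"}
branch = root.find("FB:branch", ns)
name = branch.findtext("FB:branchname", namespaces=ns)
```

Internally the parser stores the tag as `{http://www.FirstBank.com}branch`, so the full identifier is attached to every prefixed tag even though the document writes only `FB:`. The unprefixed `bank` element stays outside the namespace because no default namespace is declared.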
XML Document Schema:
 Databases have schemas, which are used to constrain what information can be stored in the
database and to constrain the data types of the stored information.
 In contrast, by default, XML documents can be created without any associated schema: An
element may then have any subelement or attribute.
 While such freedom may occasionally be acceptable given the self-describing nature of the
data format, it is not generally useful when XML documents must be processed
automatically as part of an application, or even when large amounts of related data are to be
formatted in XML.
 Here, we describe the document-oriented schema mechanism included as part of the XML
standard, the Document Type Definition, as well as the more recently defined XMLSchema.
Document Type Definition:
 The Document Type Definition (DTD) is an optional part of an XML document. The main
purpose of a DTD is much like that of a schema: to constrain and type the information
present in the document.

 However, the DTD does not in fact constrain types in the sense of basic types like integer or
string. Instead, it only constrains the appearance of subelements and attributes within an
element.
 The DTD is primarily a list of rules for what pattern of subelements appear within an
element. Figure (f) shows a part of an example DTD for a bank information document;
<!DOCTYPE bank [
<!ELEMENT bank ( ( account | customer | depositor )+ )>
<!ELEMENT account ( account-number, branch-name, balance )>
<!ELEMENT customer ( customer-name, customer-street, customer-city )>
<!ELEMENT depositor ( customer-name, account-number )>
<!ELEMENT account-number ( #PCDATA )>
<!ELEMENT branch-name ( #PCDATA )>
<!ELEMENT balance ( #PCDATA )>
<!ELEMENT customer-name ( #PCDATA )>
<!ELEMENT customer-street ( #PCDATA )>
<!ELEMENT customer-city ( #PCDATA )>
]>
(f) Example of a DTD
 The account element is defined to contain subelements account-number, branch-name and
balance (in that order). Similarly, customer and depositor have the attributes in their schema
defined as subelements.
 Finally, the elements account-number, branch-name, balance, customer-name, customer-street,
and customer-city are all declared to be of type #PCDATA. The keyword #PCDATA
indicates text data; it derives its name, historically, from “parsed character data.”
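Because a DTD rule is a pattern over subelements, it behaves much like a regular expression over child tag names. The Python sketch below hand-translates two of the rules from Figure (f) into regular expressions. This is an illustration of the idea only, not a DTD validator, and `CONTENT_MODELS` and `children_match` are hypothetical names.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical hand translation of two content models into regular expressions
# over the sequence of child tag names (each name is followed by one space).
CONTENT_MODELS = {
    "bank": re.compile(r"((account|customer|depositor) )+"),
    "account": re.compile(r"account-number branch-name balance "),
}

def children_match(element):
    """Check an element's direct children against its content model, if it has one."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return True     # no rule recorded for this tag in the sketch
    child_tags = "".join(child.tag + " " for child in element)
    return model.fullmatch(child_tags) is not None

ok = ET.fromstring(
    "<account><account-number>A-101</account-number>"
    "<branch-name>Downtown</branch-name><balance>500</balance></account>"
)
wrong_order = ET.fromstring(
    "<account><balance>500</balance>"
    "<account-number>A-101</account-number>"
    "<branch-name>Downtown</branch-name></account>"
)
empty_bank = ET.fromstring("<bank/>")
```

The account with its subelements in the declared order matches, the reordered one does not, and an empty bank fails because the `+` in its rule requires at least one subelement.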
XML Schema:
 An effort to redress many of these DTD deficiencies resulted in a more sophisticated
schema language, XMLSchema. Here we present an example of XMLSchema and list some areas in
which it improves on DTDs, without giving full details of XMLSchema’s syntax.
 Figure (g) shows how the DTD in Figure (f) can be represented by XMLSchema. The first
element is the root element bank, whose type is declared later.
 The example then defines the types of elements account, customer, and depositor. Observe
the use of types xsd:string and xsd:decimal to constrain the types of data elements.
 Finally the example defines the type BankType as containing zero or more occurrences of
each of account, customer and depositor. XMLSchema can define the minimum and
maximum number of occurrences of subelements by using minOccurs and maxOccurs.

 The default for both minimum and maximum occurrences is 1, so these have to be explicitly
specified to allow zero or more accounts, depositors, and customers. Among the benefits that
XMLSchema offers over DTDs are these:
 It allows user-defined types to be created.
 It allows the text that appears in elements to be constrained to specific types, such as
numeric types in
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="bank" type="BankType"/>
<xsd:element name="account">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="account-number" type="xsd:string"/>
<xsd:element name="branch-name" type="xsd:string"/>
<xsd:element name="balance" type="xsd:decimal"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
<xsd:element name="customer">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="customer-name" type="xsd:string"/>
<xsd:element name="customer-street" type="xsd:string"/>
<xsd:element name="customer-city" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
<xsd:element name="depositor">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="customer-name" type="xsd:string"/>
<xsd:element name="account-number" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
<xsd:complexType name="BankType">
<xsd:sequence>
<xsd:element ref="account" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element ref="customer" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element ref="depositor" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:complexType>
</xsd:schema>
(g) XMLSchema version of the DTD in Figure (f).
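What a type such as xsd:decimal adds over a DTD can be illustrated with a small Python check on balance values. This is a rough sketch, not schema validation: Python's standard library has no XSD validator, so the regular expression below is written by hand to approximate the textual form of xsd:decimal, and `invalid_balances` is a hypothetical helper name.

```python
import re
import xml.etree.ElementTree as ET

# Approximate textual form of xsd:decimal: optional sign, digits,
# optional fraction, and no exponent.
DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")

def invalid_balances(xml_text):
    """Return the balance texts that do not look like decimal numbers."""
    bad = []
    for balance in ET.fromstring(xml_text).iter("balance"):
        text = (balance.text or "").strip()   # surrounding whitespace is ignored
        if DECIMAL.fullmatch(text) is None:
            bad.append(text)
    return bad

typed_ok = "<bank><account><balance> 500 </balance></account></bank>"
typed_bad = "<bank><account><balance> five hundred </balance></account></bank>"
```

A DTD would accept both documents, since #PCDATA allows any text; a typed schema rejects the second one.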

3. Illustrate the approach to store relations in distributed databases. (Nov/Dec-2019)
There are four basic strategies for distributing databases:
1. Data replication
2. Horizontal partitioning
3. Vertical partitioning
4. Combinations of the above

1. Data replication
 Data Replication is the process of storing data in more than one site or node. It is useful
in improving the availability of data. It is simply copying data from a database from one
server to another server so that all the users can share the same data without any
inconsistency.
 The result is a distributed database in which users can access data relevant to their tasks
without interfering with the work of others.
 Data replication encompasses duplication of transactions on an ongoing basis, so that
the replicate is in a consistently updated state and synchronized with the source. However
in data replication data is available at different locations, but a particular relation has to
reside at only one location.
 There can be full replication, in which the whole database is stored at every site. There can
also be partial replication, in which some frequently used fragments of the database are
replicated and others are not.
Types of Data Replication:
Transactional Replication – In transactional replication, users receive full initial copies of the
database and then receive updates as data changes. Data is copied in real time from the publisher
to the receiving database (subscriber) in the same order in which the changes occur at the publisher;
therefore, transactional consistency is guaranteed in this type of replication.
Transactional replication is typically used in server-to-server environments. It does not
simply copy the data changes, but rather consistently and accurately replicates each change.

Snapshot Replication – Snapshot replication distributes data exactly as it appears at a specific
moment in time and does not monitor for updates to the data. The entire snapshot is generated and
sent to users. Snapshot replication is generally used when data changes are infrequent.
It is a bit slower than transactional replication because each attempt moves multiple records from
one end to the other. Snapshot replication is a good way to perform initial synchronization
between the publisher and the subscriber.
Merge Replication – Data from two or more databases is combined into a single database. Merge
replication is the most complex type of replication because it allows both publisher and subscriber
to independently make changes to the database. Merge replication is typically used in server-to-
client environments. It allows changes to be sent from one publisher to multiple subscribers.
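The difference between snapshot and transactional replication can be sketched in Python. The `Publisher` and `Subscriber` classes are hypothetical and keep everything in memory; real replication would ship the changes over a network, and merge replication (with changes on both sides and conflict handling) is not covered.

```python
import copy

class Publisher:
    """Toy publisher that keeps its data and an ordered log of changes."""
    def __init__(self):
        self.data = {}
        self.log = []               # (key, value) pairs in commit order

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))

class Subscriber:
    def __init__(self):
        self.data = {}
        self.applied = 0            # number of log entries already applied

    def load_snapshot(self, publisher):
        # Snapshot replication: copy the data as it looks at this moment.
        self.data = copy.deepcopy(publisher.data)
        self.applied = len(publisher.log)

    def apply_changes(self, publisher):
        # Transactional replication: replay new changes in the publisher's order.
        for key, value in publisher.log[self.applied:]:
            self.data[key] = value
        self.applied = len(publisher.log)

pub = Publisher()
pub.write("A-101", 500)
sub = Subscriber()
sub.load_snapshot(pub)      # initial synchronization
pub.write("A-101", 450)
pub.write("A-102", 400)
stale = dict(sub.data)      # the snapshot alone does not see later updates
sub.apply_changes(pub)      # replaying the log brings the subscriber up to date
```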
Advantages of Data Replication:
Reliability − In case of failure of any site, the database system continues to work since a copy is
available at another site(s).
Reduction in Network Load − Since local copies of data are available, query processing can be
done with reduced network usage, particularly during prime hours. Data updating can be done at
non-prime hours.
Quicker Response − Availability of local copies of data ensures quick query processing and
consequently quick response time.
Simpler Transactions − Transactions require fewer joins of tables located at different
sites and minimal coordination across the network. Thus, they become simpler in nature.
Disadvantages of Data Replication:
Increased Storage Requirements − Maintaining multiple copies of data is associated with
increased storage costs. The storage space required is in multiples of the storage required for a
centralized system.
Increased Cost and Complexity of Data Updating − Each time a data item is updated, the
update needs to be reflected in all the copies of the data at the different sites. This requires
complex synchronization techniques and protocols.
Undesirable Application – Database coupling − If complex update mechanisms are not used,
removing data inconsistency requires complex co-ordination at application level. This results in
undesirable application – database coupling.
Fragmentation:
Fragmentation is the task of dividing a table into a set of smaller tables. The subsets of the
table are called fragments. Fragmentation can be of three types: horizontal, vertical, and hybrid
(combination of horizontal and vertical). Horizontal fragmentation can further be classified into
two techniques: primary horizontal fragmentation and derived horizontal fragmentation.

Fragmentation should be done in such a way that the original table can be reconstructed from
the fragments whenever required. This requirement is called “reconstructiveness.”
Advantages of Fragmentation:
 Since data is stored close to the site of usage, efficiency of the database system is
increased.
 Local query optimization techniques are sufficient for most queries since data is locally
available.
 Since irrelevant data is not available at the sites, security and privacy of the database
system can be maintained.
Disadvantages of Fragmentation:
 When data from different fragments are required, access times may be very high (access is slow).
 In case of recursive fragmentations, the job of reconstruction will need expensive
techniques.
 Lack of back-up copies of data in different sites may render the database ineffective in
case of failure of a site.
(i)Vertical Fragmentation:
In vertical fragmentation, the fields or columns of a table are grouped into fragments. In
order to maintain reconstructiveness, each fragment should contain the primary key field(s) of the
table. Vertical fragmentation can be used to enforce privacy of data.
For example, let us consider that a University database keeps records of all registered
students in a Student table having the following schema.
Relation: STUDENT

Regd_No Name Course Address Semester Fees Marks

Now, the fees details are maintained in the accounts section. In this case, the designer will
fragment the database as follows:
CREATE TABLE STD_FEES AS SELECT Regd_No, Fees FROM STUDENT;
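The reconstructiveness rule for vertical fragments can be checked with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the sample rows and the second fragment name STD_INFO are assumptions made only for illustration and are not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT (
        Regd_No INTEGER PRIMARY KEY, Name TEXT, Course TEXT,
        Address TEXT, Semester INTEGER, Fees REAL, Marks INTEGER
    );
    INSERT INTO STUDENT VALUES
        (1, 'Asha', 'Computer Science', 'Chennai', 3, 45000.0, 82),
        (2, 'Ravi', 'Physics', 'Madurai', 5, 38000.0, 74);

    -- Two vertical fragments; both keep the primary key Regd_No.
    CREATE TABLE STD_FEES AS SELECT Regd_No, Fees FROM STUDENT;
    -- STD_INFO is a hypothetical name for the remaining columns.
    CREATE TABLE STD_INFO AS
        SELECT Regd_No, Name, Course, Address, Semester, Marks FROM STUDENT;
""")

original = conn.execute(
    "SELECT Regd_No, Name, Course, Address, Semester, Fees, Marks "
    "FROM STUDENT ORDER BY Regd_No").fetchall()

# Reconstruction: join the fragments on the shared primary key.
rebuilt = conn.execute(
    "SELECT i.Regd_No, i.Name, i.Course, i.Address, i.Semester, f.Fees, i.Marks "
    "FROM STD_INFO i JOIN STD_FEES f ON i.Regd_No = f.Regd_No "
    "ORDER BY i.Regd_No").fetchall()

print(rebuilt == original)
```

Because every fragment carries Regd_No, the join recovers each original row exactly once.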
(ii)Horizontal Fragmentation:
Horizontal fragmentation groups the tuples of a table according to the values of one or
more fields. Horizontal fragmentation should also conform to the rule of reconstructiveness. Each
horizontal fragment must have all columns of the original base table.
For example, in the student schema, if the details of all students of Computer Science
Course need to be maintained at the School of Computer Science, then the designer will
horizontally fragment the database as follows:

CREATE TABLE COMP_STD AS SELECT * FROM STUDENT WHERE COURSE = 'Computer
Science';
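For horizontal fragments, reconstruction is a union of the fragments rather than a join. The following is a minimal sketch using Python's built-in sqlite3 module; the sample rows and the complementary fragment OTHER_STD are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT (Regd_No INTEGER PRIMARY KEY, Name TEXT, Course TEXT);
    INSERT INTO STUDENT VALUES
        (1, 'Asha', 'Computer Science'),
        (2, 'Ravi', 'Physics'),
        (3, 'Meena', 'Computer Science');

    -- Each horizontal fragment keeps all columns but only some rows.
    CREATE TABLE COMP_STD AS
        SELECT * FROM STUDENT WHERE Course = 'Computer Science';
    -- OTHER_STD is a hypothetical fragment holding the remaining rows.
    CREATE TABLE OTHER_STD AS
        SELECT * FROM STUDENT WHERE Course <> 'Computer Science';
""")

original = conn.execute("SELECT * FROM STUDENT ORDER BY Regd_No").fetchall()

# Reconstruction: the union of the disjoint fragments gives back the table.
rebuilt = conn.execute(
    "SELECT * FROM COMP_STD UNION ALL SELECT * FROM OTHER_STD ORDER BY Regd_No"
).fetchall()

print(rebuilt == original)
```

The two predicates are disjoint and together cover every row in this sample, which is what makes the union lossless.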
(iii)Hybrid Fragmentation:
In hybrid fragmentation, a combination of horizontal and vertical fragmentation
techniques is used. This is the most flexible fragmentation technique since it generates
fragments with minimal extraneous information. However, reconstruction of the original table is
often an expensive task.
Hybrid fragmentation can be done in two alternative ways:
 At first, generate a set of horizontal fragments; then generate vertical fragments from one
or more of the horizontal fragments.
 At first, generate a set of vertical fragments; then generate horizontal fragments from one
or more of the vertical fragments.

4. How effectiveness of retrieval is measured? Discuss. (Nov/Dec-2019)

Assessing system effectiveness: evaluation measures:

Evaluation measures provide a way of quantifying retrieval effectiveness. Together, the
test collection and evaluation measure provide a simulation of the user of an information retrieval
system. For example, in the case of ad hoc retrieval the user is modelled as submitting a single
query and being presented with a ranked list of results.

One assumes that the user then starts at the top of the ranked list and works their way
down examining each document in turn for relevance. This, of course, is an estimation of how
users behave; in practice they are often far less predictable. There are also further complications
that must be considered.

(i)Set-based measures:
 Two simple measures developed early on were precision and recall. These are set-based
measures: documents in the ranking are treated as unique and the ordering of results is
ignored. Precision measures the fraction of retrieved documents that are relevant; recall
measures the fraction of relevant documents that are retrieved.
 Precision and recall hold an approximate inverse relationship: higher precision is often
coupled with lower recall. However, this is not always the case as it has been shown that
precision is affected by the retrieval of non-relevant documents; recall is not.
 Compared to other evaluation measures, precision is simple to compute because one only
considers the set of retrieved documents (as long as relevance can be judged). However,
computing recall requires knowing all the relevant documents in the entire
collection, which is impossible in many cases (e.g., for Web search). In this situation,
techniques such as pooling are used.

 Often preference is given to either precision or recall. For example, in Web search the
focus is typically on obtaining high precision by finding as many relevant documents as
possible in the top n results. However, there are certain domains, such as patent search,
where the focus is on finding all relevant documents through an exhaustive search.
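The two set-based measures can be computed directly from their definitions. The following is a minimal sketch; the document identifiers and the relevance judgments are made-up sample data.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, ignoring result order."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)          # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Four documents retrieved; three documents are judged relevant overall.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
print(p, r)
```

Here two of the four retrieved documents are relevant (precision 2/4) and two of the three relevant documents were found (recall 2/3).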

(ii)Rank-based measures:
 More commonly used measures are based on evaluating ranked retrieval results, where
importance is placed not only on obtaining the maximum number of relevant documents,
but also on returning relevant documents higher in the ranked list.
 A common way to evaluate ranked outputs is to compute precision at various levels of
recall (e.g., 0.0, 0.1, 0.2, ... 1.0), or at the rank positions of all the relevant documents,
and to average the scores (referred to as average precision).
 A further common measure is precision at a fixed rank position, for example Precision at
rank 10 (P10 or P@10). Because the number of relevant documents can influence the
P@10 score, an alternative measure called R-precision can be used: precision is measured
at the rank position Rq, the total number of relevant documents for query q.
 More recently, measures based on non-binary (or graded) relevance judgments have been
utilised, such as discounted cumulative gain. In such measures, each document is given a
score indicating relevance (e.g., relevant=2; partially-relevant=1; non-relevant=0).
 Discounted Cumulative Gain (DCG) computes a value for the number of relevant
documents retrieved that includes a discount function to progressively reduce the
importance of relevant documents found further down the ranked results list. This
reflects the assumption that users prefer relevant documents higher in the ranked list.
 The measure also makes the assumption that highly relevant documents are more useful
than partially relevant documents, which in turn are more useful than non-relevant
documents. The score can be normalised to provide a value in the range 0 to 1, known
as normalised DCG (nDCG).
 The measure can be averaged across multiple topics similar to computing mean average
precision, and it has also been extended to compute the value of retrieved results across
multiple queries in a session, referred to as normalised session Discounted Cumulative
Gain or nsDCG.
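The rank-based measures above can be sketched in a few lines. The ranking, the relevance judgments and the graded gains are made-up sample data. The discount used here, gain divided by log2(rank + 1), is one common choice and is an assumption, since the text does not fix a particular discount function. The ideal ordering for nDCG is taken over the same retrieved list.

```python
import math


def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k results that are relevant (e.g. P@10)."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def average_precision(ranking, relevant):
    """Mean of the precision values at the rank of each relevant document."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def r_precision(ranking, relevant):
    """Precision at rank R, where R is the number of relevant documents."""
    return precision_at_k(ranking, relevant, len(relevant)) if relevant else 0.0


def dcg(gains):
    """Discounted cumulative gain of graded judgments listed in rank order."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


ranking = ["d3", "d7", "d1", "d9", "d5"]
relevant = {"d1", "d3", "d5"}
gains = [2, 0, 1, 0, 1]   # relevant=2, partially-relevant=1, non-relevant=0

print(precision_at_k(ranking, relevant, 3))
print(average_precision(ranking, relevant))
print(r_precision(ranking, relevant))
print(ndcg(gains))
```

Averaging average_precision over a set of queries gives mean average precision, and averaging ndcg over queries gives the averaged nDCG mentioned above.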

(iii)Other measures:
 Additional measures have been developed to evaluate different information retrieval
problems. For example, to measure the success of search tasks where just one relevant
document is required (known-item search), measures, such as Mean Reciprocal
Rank (MRR), can be used.
 In practice it is important to select an evaluation measure that is suitable for the given
task; for example, if the problem is known-item search then the mean reciprocal rank
would be appropriate; for an ad hoc search task then mean average precision or averaged
normalised discounted cumulative gain would be applicable.
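Mean Reciprocal Rank for known-item search can be sketched as follows. Each query contributes 1/rank of the first relevant document, or 0 if none is retrieved; the three sample queries are made up for illustration.

```python
def reciprocal_rank(ranking, relevant):
    """1/rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average the reciprocal rank over (ranking, relevant) pairs."""
    return sum(reciprocal_rank(rk, rel) for rk, rel in runs) / len(runs)


runs = [
    (["a", "b", "c"], {"b"}),   # known item found at rank 2
    (["x", "y"], {"x"}),        # found at rank 1
    (["p", "q"], {"z"}),        # not found
]
mrr = mean_reciprocal_rank(runs)
print(mrr)
```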


5. Explain in detail about the Deductive DB and Spatial DB.( Apr/May-2019)

(i)The Deductive DB:

 In a deductive database system, we typically specify rules through a declarative language,
that is, a language in which we specify what to achieve rather than how to achieve it. An
inference engine (or deduction mechanism) within the system can deduce new facts from the
database by interpreting these rules.
 The model used for deductive databases is closely related to the relational data model, and
particularly to the domain relational calculus formalism. It is also related to the field
of logic programming and the Prolog language.
 The deductive database work based on logic has used Prolog as a starting point. A
variation of Prolog called Datalog is used to define rules declaratively in conjunction with
an existing set of relations, which are themselves treated as literals in the language.
 Although the language structure of Datalog resembles that of Prolog, its operational
semantics—that is, how a Datalog program is executed—is still different.
A deductive database uses two main types of specifications: facts and rules.
 Facts are specified in a manner similar to the way relations are specified, except that it is
not necessary to include the attribute names. Recall that a tuple in a relation describes
some real-world fact whose meaning is partly determined by the attribute names.
 In a deductive database, the meaning of an attribute value in a tuple is determined solely
by its position within the tuple.
 Rules are somewhat similar to relational views. They specify virtual relations that are not
actually stored but that can be formed from the facts by applying inference mechanisms
based on the rule specifications.
 The main difference between rules and views is that rules may involve recursion and
hence may yield virtual relations that cannot be defined in terms of basic relational views.

 The evaluation of Prolog programs is based on a technique called backward
chaining, which involves a top-down evaluation of goals. In the deductive databases
that use Datalog, attention has been devoted to handling large volumes of data stored in a
relational database.
 Hence, evaluation techniques have been devised that resemble those for a bottom-up
evaluation. Prolog suffers from the limitation that the order of specification of facts and
rules is significant in evaluation.
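Bottom-up evaluation of a recursive rule can be illustrated with a small fixpoint loop. The sketch below assumes two Datalog-style rules, ancestor(X, Y) :- parent(X, Y) and ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y), over made-up parent facts. It is a naive evaluation written for clarity, not the optimized technique a real deductive database would use.

```python
# Facts: positional tuples, as in a deductive database (no attribute names).
parent = {("ann", "bob"), ("bob", "cat"), ("cat", "dan")}


def ancestors(parent_facts):
    """Derive all ancestor facts bottom-up until no new fact appears."""
    derived = set(parent_facts)              # rule 1: every parent is an ancestor
    while True:
        # rule 2: join parent(X, Z) with the ancestor(Z, Y) facts derived so far
        new = {(x, y)
               for (x, z) in parent_facts
               for (z2, y) in derived
               if z == z2}
        if new <= derived:                   # fixpoint reached
            return derived
        derived |= new


result = ancestors(parent)
print(sorted(result))
```

The recursive rule is what a basic (non-recursive) relational view cannot express: the number of joins needed depends on the data.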
(ii)The Spatial DB:

Spatial data is associated with geographic locations such as cities, towns etc. A spatial
database is optimized to store and query data that represents objects defined in a
geometric space.

Characteristics of Spatial Database:


A spatial database system has the following characteristics
1. It is a database system
2. It offers spatial data types (SDTs) in its data model and query language.
3. It supports spatial data types in its implementation, providing at least spatial indexing and
efficient algorithms for spatial join.
Example:
A road map is a visualization of geographic information. A road map is a 2-dimensional
object which contains points, lines, and polygons that can represent cities, roads, and political
boundaries such as states or provinces.
In general, spatial data can be of two types:
1. Vector data: This data is represented as discrete points, lines and polygons
2. Raster data: This data is represented as a matrix of square cells.
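The two representations can be contrasted with a small sketch. A polygon is stored as vector data (a list of corner points), a point-in-polygon test answers a simple spatial query, and the same region is then converted to raster data by sampling the centre of each grid cell. The rectangle, the grid size and the function name are assumptions chosen for illustration.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: count polygon edges crossed by a ray going right."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Vector data: a state boundary as a polygon, cities as points.
state = [(0, 0), (4, 0), (4, 3), (0, 3)]
print(point_in_polygon((2, 1), state))   # city inside the boundary
print(point_in_polygon((5, 1), state))   # city outside the boundary

# Raster data: a 4 x 5 matrix of square cells, 1 where the cell centre is inside.
grid = [[1 if point_in_polygon((col + 0.5, row + 0.5), state) else 0
         for col in range(5)]
        for row in range(4)]
for row in grid:
    print(row)
```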


The Spatial Data


The spatial data in the form of points, lines, polygons etc. is used by many different
databases as shown above.

6. Illustrate the usage of OQL, the ODMG’s query language.(Nov/Dec-2019)

(i) Object Query Language (OQL):

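The OQL illustrated in this part is the SQL-like query language of the Java Heap Analysis
Tool (HAT) for querying the Java heap; it is a different language from the OQL of the ODMG
standard described in part (ii). It allows the wanted information to be filtered and selected
from the Java heap. While pre-defined queries such as "show all instances of class X" are
already supported by HAT, OQL adds more flexibility. OQL is based on the JavaScript
expression language.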
OQL query is of the form:
select <JavaScript expression to select>
[ from [instanceof] <class name> <identifier>
[ where <JavaScript boolean expression to filter> ] ]

where class name is a fully qualified Java class name (example: java.net.URL) or an array class
name: [C is the char array name, [Ljava.io.File; is the name of java.io.File[], and so on.
 Note that fully qualified class name does not always uniquely identify a Java class at runtime.
There may be more than one Java class with the same name but loaded by different loaders.
So, class name is permitted to be id string of the class object.
 If instanceof keyword is used, subtype objects are selected. If this keyword is not specified,
only the instances of exact class specified are selected. Both from and where clauses are
optional.

 In the select and (optional) where clauses, the expression used is a JavaScript expression.
Java heap objects are wrapped as convenient script objects so that fields may be accessed in
natural syntax.
 For example, Java fields can be accessed with obj.field_name syntax and array elements can
be accessed with array[index] syntax. Each Java object selected is bound to a JavaScript
variable of the identifier name specified in from clause.
OQL Examples:
 select all Strings of length 100 or more

select s from java.lang.String s where s.value.length >= 100

 select all int arrays of length 256 or more

select a from [I a where a.length >= 256

 show content of Strings that match a regular expression

select s.value.toString() from java.lang.String s
where /java/.test(s.value.toString())

 show path value of all File objects

select file.path.value.toString() from java.io.File file

 show names of all ClassLoader classes

select classof(cl).name from instanceof java.lang.ClassLoader cl

 show instances of the Class identified by given id string

select o from instanceof 0xd404b198 o

OQL built-in objects, functions:

Heap object:
The heap built-in object supports the following methods:
 heap.forEachClass -- calls a callback function for each Java Class

heap.forEachClass(callback);

 heap.forEachObject -- calls a callback function for each Java object

heap.forEachObject(callback, clazz, includeSubtypes);

(clazz is the class whose instances are selected. If not specified, defaults to
java.lang.Object. includeSubtypes is a boolean flag that specifies whether to include
subtype instances or not. Default value of this flag is true.)
 heap.findClass -- finds Java Class of given name

heap.findClass(className);

where className is name of the class to find. The resulting Class object has following
properties:
o name - name of the class.

o superclass - Class object for super class (or null if java.lang.Object).


o statics - name, value pairs for static fields of the Class.
o fields - array of field objects. field object has name, signature properties.
o loader - ClassLoader object that loaded this class.
o signers - signers that signed this class.
o protectionDomain - protection domain to which this class belongs.
Class objects have the following methods:
o isSubclassOf - tests whether given class is direct or indirect subclass of this class or
not.
o isSuperclassOf - tests whether given Class is direct or indirect superclass of this class
or not.
o subclasses - returns array of direct and indirect subclasses.
o superclasses - returns array of direct and indirect superclasses.

(ii) The ODMG’s query language:

• ODMG (Object Data Management Group) is a consortium of ODBMS vendors.


• It proposed a standard for ODBMS
 in 1993 as ODMG 1.0,
 in 1997 as ODMG 2.0 and
 in 2000 as ODMG 3.0.
• The ODMG 3.0 standard is made up of several parts:
‣ Object model
‣ Object definition language (ODL)
‣ Object query language (OQL)
‣ Binding to OOP languages
ODMG Objects and Literals:
• The basic building blocks of the object model are
‣ Objects
‣ Literals
• The main difference between the two is that an object has both an object identifier and a
state (current value), whereas a literal has a value but no identifier.
• In either case, value can have a complex structure.

Literal Types:
Collection literals: The ODMG Object Model supports collection literals of the following
types: set<t>, bag<t>, list<t>, array<t>, dictionary<t, v>, where t is the type of the objects or
values in the collection.
Object Structure:
• The structure of object can be either atomic or not, in which case the object is composed
of other objects.
• An atomic object type is user-defined. There are no built-in atomic object types included
in the ODMG Object Model.
• In the ODMG Object Model, instances of collection objects are composed of distinct
elements, each of which can be an instance of an atomic type, another collection, or a
literal type.
Collection Objects:
The collections supported by the ODMG Object Model include:
‣ Set<t>: A Set object is an unordered collection of elements, with no duplicates allowed.
‣ Bag<t>: A Bag object is an unordered collection of elements that may contain
duplicates.
‣ List<t>: A List object is an ordered collection of elements.
‣ Array<t> : An Array object is a dynamically sized, ordered collection of elements that
can be located by position.
‣ Dictionary<t,v>: A Dictionary object is an unordered sequence of key-value pairs with
no duplicate keys.
Each of these is a type generator, parameterized by the type shown within the angle
brackets.
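The difference between these collection kinds can be seen by loading the same values into rough Python analogues. The mapping below (set, Counter, list, dict) is only an analogy chosen for illustration; it is not an ODMG language binding, and the sample values are made up.

```python
from collections import Counter

enrolments = ["DBMS", "OS", "DBMS"]

as_set = set(enrolments)            # Set<t>: unordered, duplicates removed
as_bag = Counter(enrolments)        # Bag<t>: unordered, duplicates counted
as_list = list(enrolments)          # List<t>: ordered, duplicates kept
as_array = list(enrolments)         # Array<t>: ordered, located by position
as_dict = {"DBMS": 82, "OS": 74}    # Dictionary<t,v>: key-value pairs, unique keys

print(len(as_set), as_bag["DBMS"], as_list[2], as_array[0], as_dict["OS"])
```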
ODMG Interface:
• An interface is a specification of the abstract behavior of an object type.
• It is a signature for the persistent object. An interface tells the external world how to
interact with an object. That is, an interface describes the externally visible properties of a
type of object: its visible attributes, relationships and operations.
• Interfaces are non-instantiable but they serve to define operations that can be inherited
by the user-defined objects for a particular application.
• State properties of an interface (i.e., its attributes and relationships) cannot be inherited.
Interfaces and Behavior Inheritance:
• In ODMG, two types of inheritance relationships exist.

• An interface is a specification of the abstract behavior of an object type, which specifies
the operation signatures.
• Interfaces are noninstantiable – that is, one cannot create objects that correspond to an
interface definition.
• They are mainly used to specify abstract operations that can be inherited by classes or
by other interfaces.
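• Subtyping pertains to the inheritance of behavior only and it is specified by a colon (:).
The other relationship, EXTENDS, is specified by the keyword extends and is used to inherit
both state and behavior among classes.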

7. Brief on the methods to store XML documents. (Nov/Dec-2019)

Existing architectures of XML database management systems are presented below and
discussed with respect to the opportunities they offer for storing and querying XML data.

1) Storage of XML documents in relational systems:


Storing XML data in relational systems has been popular for the last several years and
has also been implemented by a few commercial DBMSs. The reason is that it is based on the
adaptation and reuse of relational technology. However, there are significant differences among
the existing methods in how they store, and hence query, XML data in an RDBMS.
Shredding XML documents into relations:
 XML documents are transformed into atomic values and after that stored as relations in
relational tables.
 XQueries are translated to SQL queries to be evaluated by the RDBMS query processor.
The advantage of this method is that it does not require big modifications of the existing
database engine.
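The shredding approach can be sketched with Python's standard xml.etree.ElementTree and sqlite3 modules. The small document below reuses element names from the FXTRADE vocabulary given later in this answer; the flat table layout and the sample query are assumptions made for illustration, not the mapping of any particular product.

```python
import sqlite3
import xml.etree.ElementTree as ET

XML_DOC = """<ROWSET>
  <ROW num="1">
    <CURRENCY1>GBP</CURRENCY1><CURRENCY2>JPY</CURRENCY2>
    <AMOUNT>10000</AMOUNT>
    <ACCOUNT><BANKCODE>812</BANKCODE><BANKACCT>00365888</BANKACCT></ACCOUNT>
  </ROW>
</ROWSET>"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FXTRADE (CURRENCY1 TEXT, CURRENCY2 TEXT, AMOUNT REAL, "
    "BANKCODE TEXT, BANKACCT TEXT)")

# Shred: turn every ROW element into atomic values of one relational tuple.
for row in ET.fromstring(XML_DOC).findall("ROW"):
    conn.execute(
        "INSERT INTO FXTRADE VALUES (?, ?, ?, ?, ?)",
        (row.findtext("CURRENCY1"), row.findtext("CURRENCY2"),
         float(row.findtext("AMOUNT")),
         row.findtext("ACCOUNT/BANKCODE"), row.findtext("ACCOUNT/BANKACCT")))

# A request over the XML ("accounts that sold GBP") becomes a plain SQL query.
result = conn.execute(
    "SELECT BANKCODE, BANKACCT FROM FXTRADE WHERE CURRENCY1 = 'GBP'").fetchall()
print(result)
```

The nested ACCOUNT element is flattened into two columns here; a real mapping would have to decide how to handle repeated or optional child elements.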
XML, stored as unparsed text:
 XML data is stored in VARCHAR or LOB (large object) columns of relational tables.
XML data is queried by an XQuery processor external to the database and invoked as a
user-defined function.
 This approach is used by commercial systems offering XML support (DB2, MS SQL).
This solution is relatively simple but the entire XML document usually has to be first
loaded into memory in order to be processed.
Hybrid XML-relational databases:
 XML documents are stored on disk pages in tree structures matching the XML data
model. Hence no mapping is needed between XML and relational structures.
 The native XML storage is complemented with XML indexes. Both XQuery and SQL
are supported.

2) Native XML Data Management Systems:
These are systems like Niagara and Timber that support only XQuery. In this method,
the XML document is broken into nodes and the node information is stored in a B+ tree, with
all document nodes stored in order at the leaf level. In Niagara, so-called inverted list indexes
are created to enable efficient structural join algorithms.

There are various ways to solve the problem of effective, automatic conversion of XML
data into and out of relational databases.
Database vendors such as IBM, Microsoft, Oracle, and Sybase have developed tools to assist
in converting XML documents into relational tables. The various solutions are as follows.
 Oracle XML SQL Utility models XML document elements as a collection of nested tables.
Enclosed elements are modeled by employing the Oracle Object datatype. The "SQL-to-
XML" conversion constructs an XML document by using a one-to-one association between a
table, referenced by Object datatype, and a nested element. "XML-to-SQL" might require
either data model amending (converting it from relational into object-relational) or
restructuring the original XML document.
 IBM DB2 XML Extender allows storing XML documents either as BLOB-like objects or as
decomposed into a set of tables. The latter transformation, known as XML collection, is
defined in XML 1.0 syntax.
 Microsoft approaches the problem by extending SQL-92 and by introducing OPENXML row
set.
 Sybase Adaptive Server introduces the ResultSetXml Java class as a base for processing
XML documents in both directions.

Let us use the following vocabulary as an example.

<!-- Primitive Types -->
<!ELEMENT CURRENCY1 (#PCDATA)>
<!ATTLIST CURRENCY1 e-dtype NMTOKEN #FIXED "string"
                    e-dsize NMTOKEN #FIXED "3">
<!ELEMENT CURRENCY2 (#PCDATA)>
<!ATTLIST CURRENCY2 e-dtype NMTOKEN #FIXED "string"
                    e-dsize NMTOKEN #FIXED "3">
<!ELEMENT AMOUNT (#PCDATA)>
<!ATTLIST AMOUNT e-dtype NMTOKEN #FIXED "decimal">
<!ELEMENT SETTLEMENT (#PCDATA)>
<!ATTLIST SETTLEMENT e-dtype NMTOKEN #FIXED "date">
<!ELEMENT BANKCODE (#PCDATA)>
<!ATTLIST BANKCODE e-dtype NMTOKEN #FIXED "string">
<!ELEMENT BANKACCT (#PCDATA)>
<!ATTLIST BANKACCT e-dtype NMTOKEN #FIXED "string">

<!-- Derived Types -->
<!ELEMENT ACCOUNT (BANKCODE, BANKACCT)>
<!ELEMENT FXTRADE (CURRENCY1, CURRENCY2, AMOUNT, SETTLEMENT, ACCOUNT)>

Oracle XML-SQL Utility (XSU):

SQL to XML Mapping:

Oracle translates the chain of object references from the database into the hierarchical
structure of XML elements. In an object-relational database, the field ACCOUNT in the table
FXTRADE is modeled as an object reference of type AccountType:

CREATE TYPE AccountType AS OBJECT
(
BANKCODE VARCHAR (100), BANKACCT VARCHAR (100)
);

CREATE TABLE FXTRADE
(
CURRENCY1 CHAR (3), CURRENCY2 CHAR (3),
AMOUNT NUMERIC (18,2), SETTLEMENT DATE,
ACCOUNT AccountType -- object reference
);


A corresponding XML document generated from the given object-relational model (using
"SELECT * FROM FXTRADE") looks like

<?xml version="1.0"?>
<ROWSET>
  <ROW num="1">
    <CURRENCY1>GBP</CURRENCY1>
    <CURRENCY2>JPY</CURRENCY2>
    <AMOUNT>10000</AMOUNT>
    <SETTLEMENT>20010325</SETTLEMENT>
    <ACCOUNT>
      <BANKCODE>812</BANKCODE>
      <BANKACCT>00365888</BANKACCT>
    </ACCOUNT>
  </ROW>
  <!-- additional rows ... -->
</ROWSET>

Storing XML in the database:


In this example OracleXMLSave is employed to store our XML document in an object-
relational model; the insertXML method performs the actual insertion of the data.

import java.sql.*;
import oracle.xml.sql.dml.OracleXMLSave;

public class testXMLInsert
{
    public static void main(String args[]) throws SQLException
    {
        Connection conn = getConnection("scott","tiger");
        // Target table for the rows found in the XML document
        OracleXMLSave sav = new OracleXMLSave(conn, "scott.FXTRADE");
        // Assume that the user passes in this document as 0-arg
        sav.insertXML(args[0]);
        sav.close();
    }
    ...
}

This works when the XML and the object-relational model in the database are synchronized.
If they are not, there are two options:

 Adjust the object-relational model -- a modifiable object-relational view can be
constructed to accomplish multi-table modifications; or,
 alternatively, the XML document can be decomposed into a collection of "flat"
subdocuments, using XSLT.

XSU does not permit storage of attribute values; it's recommended that you transform attributes
into elements.

8. Explain the different approaches used for object oriented databases.


The concepts of object orientation can be incorporated into a programming language that is used
to manipulate the database.

 Object-relational systems – add complex types and object-orientation to a relational
language.
 Persistent programming languages – extend an object oriented programming language to
deal with databases by adding concepts such as persistence and collections.

(i) Object-relational systems:


 Perhaps the best hope for the immediate future of database objects is the Object-
Relational model.
 A recent development, stimulated by the advent of the Object-oriented model, the Object-
Relational model aims to address some of the problems of pure OO technology - such as
the poor support for ad hoc query languages - and open database technology, and provide
better support for existing relational products, by extending the Relational model to
incorporate the key features of Object-orientation.
 The Object-Relational model also provides scope for those using existing Relational
databases to migrate towards the incorporation of objects, and this perhaps is its key
strength, in that it provides a path for the vast number of existing Relational database
users gradually to migrate to an Object database platform, while maintaining the support
of their Relational vendor.
 A major addition to the Relational model is the introduction of a stronger type system
that can accommodate the use of complex data types while still allowing the Relational
model to be preserved.
 Several large database suppliers, including IBM Informix and Oracle, have embraced the
Object-Relational model as the way forward.

Object-Relational Database

(ii)Persistent programming languages:

 Persistent data: data that continue to exist even after the program that created it has
terminated.
 A persistent programming language is a programming language extended with constructs to
handle persistent data. It differs from embedded SQL in at least two ways:
1. In a persistent programming language, the query language is fully integrated with the
host language and both share the same type system. Any format changes required in
databases are carried out transparently.
With embedded SQL, by contrast, (1) the host language and the DML have different type
systems, so code conversion operates outside of the OO type system and hence has a
higher chance of having undetected errors; and (2) format conversion takes a substantial
amount of code.
2. Using embedded SQL, a programmer is responsible for writing explicit code to fetch
data into memory or store data back to the database.
In a persistent programming language, a programmer can manipulate persistent data
without having to write such code explicitly.

 Drawbacks:
(1) Powerful but easy to make programming errors that damage the database;
(2) Harder to do automatic high-level optimization; and
(3) Do not support declarative querying well.

(a) Persistence of Objects:

Several approaches have been proposed to make the objects persistent.

a. Persistence by class. Declare class to be persistent: all objects of the class are then
persistent objects. Simple, not flexible since it is often useful to have both transient
and persistent objects in a single class. In many OODB systems, declaring a class to be
persistent is interpreted as ``persistable'' -- objects in the class potentially can be made
persistent.
b. Persistence by creation. Introduce new syntax to create persistent objects.
c. Persistence by marking. Mark an object persistent after it is created (and before the
program terminates).
d. Persistence by reference. One or more objects are explicitly declared as (root)
persistent objects. All other objects are persistent iff they are referred to, directly or
indirectly, from a root persistent object. It is easy to make an entire data structure
persistent by merely declaring the root of the structure as persistent, but it is expensive
for the database system to follow the chains of references to detect which objects are
persistent.
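Persistence by reference amounts to a reachability computation from the root objects. The sketch below represents objects by name and references by a dictionary; the object names and the function persistent_objects are hypothetical and serve only to illustrate the idea.

```python
def persistent_objects(roots, references):
    """Return every object reachable, directly or indirectly, from the roots."""
    reachable, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj not in reachable:
            reachable.add(obj)
            # Follow the chain of references held by this object.
            stack.extend(references.get(obj, ()))
    return reachable


# "dept" is declared as the persistent root; "temp" is never referenced from it.
references = {
    "dept": ["emp1", "emp2"],
    "emp1": ["addr1"],
    "temp": ["emp2"],
}
persistent = persistent_objects(["dept"], references)
print(sorted(persistent))
```

The traversal visits every reachable object, which is the cost the text refers to when it calls following the chains expensive.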

(b) Object Identity and Pointers


 The association of an object with a physical location in storage (as in C++) may change
over time.
 There are several degrees of permanence of identity:
a. Intraprocedure: Identity persists only during the execution of a single procedure,
e.g., local variables within procedures.
b. Intraprogram: Identity persists only during the execution of a single program or
query, e.g., global variables in programming languages, and main memory or
virtual memory pointers.
c. Interprogram: Identity persists from one program execution to another, e.g.,
pointers to file system data on disk but may change if the way data is stored in the
file system is changed.
d. Persistent: Identity persists not only among program executions but also among
structural reorganizations of the data. This is the persistent form of identity
required for object-oriented systems.
 In persistent extension of C++, object identifiers are implemented as ``persistent
pointers'' which can be viewed as a pointer to an object in the database.

(c)Storage and Access of Persistent Objects:
Objects storage in a database:
Code (that implements methods) should be stored in the database as part of the
schema, along with type definitions, but many implementations store them outside of the
database, to avoid having to integrate system software such as compilers with the database
system.
Data: stored individually for each object.
Find the objects:
1. Give names to objects like we give names to files: works only for small sets of
objects.
2. Expose object identifiers or persistent pointers to the objects.
3. Store collections of objects and allow programs to iterate over the collections to
find required objects. The collections can be modelled as objects of a collection
type. A special case of a collection is a class extent, which is a collection of all
objects belonging to the class.
 Most OODB systems support all three ways of accessing persistent objects. All objects
have object identifiers.
 Names are typically given only to class extents and other collection objects, and
perhaps to other selected objects, but most objects are not given names.
 Class extents are usually maintained for all classes that can have persistent objects, but
in many implementations, they contain only persistent objects of the class.

9. Explain about object oriented database in detail.

An OODBMS (object oriented database management system) is based on a data model in
which data is stored in the form of objects, which are instances of classes. These
classes and objects together make up an object oriented data model.
Components of Object Oriented Data Model:
The OODBMS is based on three major components, namely: Object structure, Object classes, and
Object identity. These are explained below.
1. Object Structure:
The structure of an object refers to the properties that an object is made up of. These
properties of an object are referred to as attributes. Thus, an object is a real world entity with
certain attributes that make up the object structure. An object also encapsulates data and code
into a single unit, which in turn provides data abstraction by hiding the implementation details
from the user.

The object structure is further composed of three types of components: Messages, Methods, and
Variables. These are explained below.
 Messages:
A message provides an interface or acts as a communication medium between an object and
the outside world. A message can be of two types:
 Read-only message: If the invoked method does not change the value of a variable,
then the invoking message is said to be a read-only message.
 Update message: If the invoked method changes the value of a variable, then the
invoking message is said to be an update message.
 Methods:
When a message is passed, the body of code that is executed is known as a method.
Every time a method is executed, it returns a value as output. A method can be of two
types:
 Read-only method: When the value of a variable is not affected by a method, then it
is known as read-only method.
 Update-method: When the value of a variable changes by a method, then it is known
as an update method.
 Variables –
It stores the data of an object. The data stored in the variables makes the object
distinguishable from one another.
2. Object Classes:
An object, which is a real world entity, is an instance of a class. Hence we first need to
define a class; objects are then created which differ in the values they store but share the
same class definition. The objects in turn correspond to the various messages and variables
stored in the class.
Example :
class CLERK
{ //variables
string name;
string address;
int id;
int salary;

//messages
string get_name();
string get_address();
int annual_salary();
};
In the above example, CLERK is a class that holds the object variables and
messages.
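The distinction between read-only and update messages can be tried out on a runnable counterpart of the CLERK class. This is a sketch only: the set_address method, the sample values and the assumption that the stored salary is monthly are additions made for illustration and are not part of the original class.

```python
class Clerk:
    """Python counterpart of the CLERK class: variables plus messages."""

    def __init__(self, name, address, clerk_id, monthly_salary):
        # Variables: the state that distinguishes one object from another.
        self._name = name
        self._address = address
        self._id = clerk_id
        self._monthly_salary = monthly_salary   # assumed to be a monthly figure

    # Read-only messages: the invoked methods do not change any variable.
    def get_name(self):
        return self._name

    def get_address(self):
        return self._address

    def annual_salary(self):
        return 12 * self._monthly_salary

    # Update message (hypothetical): the invoked method changes a variable.
    def set_address(self, new_address):
        self._address = new_address


clerk = Clerk("Asha", "Chennai", 101, 20000)
salary_before = clerk.annual_salary()
clerk.set_address("Madurai")
print(clerk.get_name(), clerk.get_address(), salary_before)
```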
 An OODBMS also supports inheritance in an extensive manner as in a database there may be
many classes with similar methods, variables and messages.
 Thus, the concept of class hierarchy is maintained to depict the similarities among various
classes.
 The concept of encapsulation, that is, data or information hiding, is also supported by the
object oriented data model. This data model also provides the facility of abstract data types
apart from the built-in data types like char, int and float.
 ADTs are user defined data types that hold values and can also have methods
attached to them.
 Thus, an OODBMS provides numerous facilities to its users, both built-in and user defined. It
incorporates the properties of an object oriented data model with a database management
system, and supports programming concepts like classes and objects along
with other concepts like encapsulation, inheritance and user defined ADTs
(abstract data types).
