
Chapter 12

Object-Oriented Databases


- Based upon the object-oriented programming paradigm.
- Overcome the limitations imposed by the relational model. For example, traditional database applications
consist of data-processing tasks with conceptually simple data types. The basic data items
are records that are fairly small and whose fields are atomic, that is, they are not further
structured. Today, more complex data types are demanded. A composite attribute such as address
could be viewed as an atomic data item of type string, but that hides details such as street address, city,
state, and postal code, which could be of interest to queries. So the address is better represented as
a structured data type.
- A system built with object-oriented methods is one whose components are encapsulated chunks of
data and function, which can inherit attributes and behavior from other such components, and
whose components communicate with each other via messages.

Concepts within the object-oriented data model
1. Object structure
- Corresponds to an entity in the E-R model.
- An object is an abstraction of a set of real-world things such that
- All the things in the set - the instances - have the same characteristics, and
- All instances are subject to and conform to the same set of rules and policies.
- An abstraction of something in a problem domain, reflecting the capabilities of the
system to keep information about it (attributes, states) and interact with it (services).
- An object has a set of variables (the object's data), a set of messages to which it responds, and
a set of methods (each a body of code implementing a message).
- Message passing refers to sending requests among objects without regard to specific
implementation details.
- Invoking a method denotes the act of sending a message to an object and the
execution of the corresponding method.

2. Object Classes
 A group of similar objects.

Complied By : AU 1
 A collection of one or more objects with a uniform set of attributes and services,
including a description of how to create new objects in the class.
 Each such object is called an instance of its class.
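
The terms above (class, instance, variables, messages, methods) can be illustrated with a minimal Python sketch. The `Account` class, its attributes, and the sample values are hypothetical and chosen only for illustration; they do not come from any particular database system.

```python
class Account:
    """A class: describes the variables and methods shared by all its instances."""

    def __init__(self, acc_no, balance=0):
        # Variables: the data held by each individual object.
        self.acc_no = acc_no
        self.balance = balance

    def deposit(self, amount):
        # Method: the body of code that implements the "deposit" message.
        if amount <= 0:
            raise ValueError("deposit amount must be positive")
        self.balance += amount
        return self.balance


# Creating an instance of the class (hypothetical sample values).
acct = Account("A-101", balance=500)

# Sending the "deposit" message; the caller does not need to know
# how the corresponding method is implemented.
new_balance = acct.deposit(250)
```

Here `acct` is an instance of `Account`, and the call `acct.deposit(250)` plays the role of sending a message and invoking the corresponding method.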

Key characteristics of OO method

• abstraction <=> generalization-specialization; aggregation-decomposition
• inheritance <=> generalization hierarchies
• encapsulation <=> data capsule
• message passing <=> communication

OO Methods promote
• Reuse
– through specialization: class libraries
– on higher levels
– new concept of “design”
• Polymorphism
– operation can apply to several types
– implementation: inheritance & abstract classes
• Prototyping
– fast development
– reuse
– No change of paradigm in the life cycle (?)
• Distribution
– encapsulation
– communication by message passing
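
As a rough illustration of polymorphism implemented through inheritance and abstract classes, the following Python sketch applies one operation to several types. The class names and the fee rule are assumptions made up for this example.

```python
from abc import ABC, abstractmethod


class Account(ABC):
    """Abstract class: declares an operation that every specialization must implement."""

    def __init__(self, balance):
        self.balance = balance

    @abstractmethod
    def monthly_charge(self):
        """Return the fee for this account type."""


class SavingsAccount(Account):
    def monthly_charge(self):
        return 0  # hypothetical rule: savings accounts are free


class CheckingAccount(Account):
    def monthly_charge(self):
        # Hypothetical rule: a fee of 5 applies below a balance of 1000.
        return 5 if self.balance < 1000 else 0


def total_charges(accounts):
    # The same operation applies to several types; each object
    # responds with the implementation of its own class.
    return sum(a.monthly_charge() for a in accounts)


charges = total_charges(
    [SavingsAccount(200), CheckingAccount(200), CheckingAccount(5000)]
)
```

`total_charges` is written once against the abstract `Account` interface and reused for every specialization, which is the kind of reuse through specialization listed above.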

Object-Oriented languages

Object orientation can be used in one of two ways:
- The concepts of object orientation are used purely as a design tool and are then encoded
into, for example, a relational database.
- The concepts of object orientation are incorporated into a language that is used to manipulate
the database. Possible languages are

• SQL extended with complex types and object orientation. Systems that
provide object-oriented extensions to relational systems are called object-relational
systems.
• An existing object-oriented programming language extended to deal
with databases. Such languages are called persistent programming languages.

Note: For more on persistent programming languages, see Silberschatz, chapters 8 and 9.

Distributed Database system


 A logically interrelated collection of shared data that is physically distributed over a computer network is
called a distributed database.
 A distributed database system is one in which the database is stored on several computers that
communicate with one another through communication media such as high-speed networks
or telephone lines.
 The computers in a distributed system are referred to by a number of different names, such as sites or
nodes.
 The sites of a distributed database are typically geographically separated, separately administered, and have a slower
interconnection than the processors of a parallel database system.
 In a distributed system, a transaction can be local or global. A local transaction is one that accesses data
only from the site where the transaction was initiated. A global transaction is one that either accesses
data at a site different from the one at which the transaction was initiated or accesses data at several different sites.
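
The distinction between local and global transactions can be expressed as a small check. This is a minimal sketch; the function name and the site labels are hypothetical.

```python
def classify_transaction(origin_site, accessed_sites):
    """Return "local" if data is accessed only at the initiating site, else "global"."""
    accessed = set(accessed_sites)
    if accessed <= {origin_site}:
        return "local"
    return "global"


# Hypothetical transactions, all initiated at site1.
only_home = classify_transaction("site1", ["site1"])
other_site = classify_transaction("site1", ["site2"])
several_sites = classify_transaction("site1", ["site1", "site3"])
```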

[Figure: Site 1, Site 2, and Site 3 connected through a communication network.]
Fig: A distributed system.

Why distributed database system?


 Sharing data: A distributed database system lets users at one site access data stored at
other sites. E.g. in a distributed banking system, a user in one branch can access data at another
branch in order to transfer funds between the two branches.
 Autonomy: In a distributed system, there is a global database administrator responsible for the
entire system. A part of these responsibilities is delegated to the local database administrator for
each site. Each local administrator may have a different degree of local autonomy, i.e. each
site is able to retain a degree of control over the data that are stored locally.
 Availability: Availability means having the required data at the required time. A
distributed system can ensure this: if data items are replicated at several sites, a transaction needing a particular
data item may find that item at any of those sites. Thus the failure of one site does not necessarily imply that the
system shuts down. Real-time applications must ensure this property.
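
The availability argument can be made concrete with a toy model of replicated data. The sketch below assumes a simple "write to every replica, read from any available replica" policy, which the notes do not prescribe; the class name, site names, and values are hypothetical, and failure handling for writes is deliberately ignored.

```python
class ReplicatedItemStore:
    """Toy model: every site holds a full copy of the same data items."""

    def __init__(self, sites):
        self.replicas = {site: {} for site in sites}  # site name -> local copy
        self.failed = set()                           # sites currently down

    def write(self, key, value):
        # Assumption: write-all, i.e. every replica is updated on each write.
        for copy in self.replicas.values():
            copy[key] = value

    def read(self, key, local_site):
        # Try the local replica first, then any other site that is still up.
        candidates = [local_site] + [s for s in self.replicas if s != local_site]
        for site in candidates:
            if site not in self.failed:
                return self.replicas[site][key], site
        raise RuntimeError("no site holding the item is available")


store = ReplicatedItemStore(["site1", "site2", "site3"])
store.write("A-101", 500)

value_before, served_by_before = store.read("A-101", local_site="site1")

store.failed.add("site1")  # site1 goes down
value_after, served_by_after = store.read("A-101", local_site="site1")
```

After `site1` fails, the read is still answered from another replica, which is the sense in which the failure of a site does not shut the system down. The loop in `write` also hints at the cost: every update touches every replica.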
Characteristics of DDBS:
 Collection of logically related shared data
 Data is split into a number of fragments
 Fragments may be replicated
 Fragments and replicas are allocated to sites
 Sites are linked by a communication network
 Data at each site is under the control of a DBMS

 The DBMS at each site can handle local applications autonomously.
 Each DBMS participates in at least one global application

Complexities of distributed database system


 Software-development cost: a distributed database system is more difficult to implement, which increases its cost.
 Greater potential for bugs: since the sites that constitute the distributed system operate in parallel, it is
harder to ensure the correctness of algorithms, especially their operation during failures of part of the
system and during recovery from failures.
 Increased processing overhead: the exchange of messages and the additional computation needed for inter-site coordination add processing overhead.

Classification of distributed database system


1. Homogeneous distributed database
 All sites have identical DBMS software
 Local sites surrender a portion of their autonomy in terms of their right to change
schemas or DBMS software.
[Figure: Sites A, B, C, and D, each running an Oracle database, connected by a communication network.]
Fig: Homogeneous DDBMS with Oracle

2. Heterogeneous distributed database

 Different sites may use different schemas and different DBMS software
 Sites may not be aware of one another and they may provide only limited facilities for
cooperation in transaction processing.

[Figure: Sites A, B, C, and D, running Oracle, MS Access, INGRES, and DB2 among them, connected by a communication network.]
Fig: Heterogeneous DDBMS with different DBMS

Architecture of DDBS
The ANSI-SPARC three-level architecture applies to a centralized DBMS only; a DDBMS needs additional schema levels.

[Figure: layers of the DDBMS architecture, from top to bottom: global external schemas 1 to n; global conceptual schema; fragmentation schema; allocation schema; then, for each site, a local mapping schema, a local conceptual schema, a local internal schema, and the database.]
Fig: Architecture of DDBMS

1. Global conceptual schema
 Describes the whole database as if it were not distributed.
 Corresponds to the conceptual level of the ANSI-SPARC architecture.
 Contains definitions of entities, relationships, constraints, security and integrity information.
 Provides physical data independence from the distributed environment.
 Global external schemas provide logical data independence.
2. Fragmentation and allocation schemas
 The fragmentation schema is a description of how data is to be logically partitioned.
 The allocation schema is a description of where data is to be located, taking account of any
replication.
3. Local schemas
 Each local DBMS has its own set of schemas
 Local conceptual and local internal schemas correspond to the equivalent levels of the ANSI
SPARC architecture.
 The local mapping schema maps fragments in the allocation schema into external objects in the
local database.
 The local mapping schema is DBMS independent and is the basis for supporting heterogeneous DBMSs.
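
One way to picture the fragmentation and allocation schemas is as two lookup tables that together resolve a global relation name to the sites holding its fragments. The sketch below is illustrative only; the relation, fragment, and site names are hypothetical, and a real DDBMS catalog holds far more information.

```python
# Fragmentation schema: how each global relation is logically partitioned.
FRAGMENTATION_SCHEMA = {
    "account": ["account1", "account2"],
}

# Allocation schema: where each fragment is located, including any replicas.
ALLOCATION_SCHEMA = {
    "account1": ["site_A"],
    "account2": ["site_B", "site_C"],  # this fragment is replicated
}


def sites_for_relation(relation):
    """Resolve a global relation name to the sites holding each of its fragments."""
    return {
        fragment: ALLOCATION_SCHEMA[fragment]
        for fragment in FRAGMENTATION_SCHEMA[relation]
    }


placement = sites_for_relation("account")
```

A query against the global relation `account` can then be routed using `placement`, without the user naming fragments or sites.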

Homogeneous distributed database


3 approaches to data storage in a homogeneous distributed database
1. Allocation
 Each fragment is stored at the site that gives an optimal distribution.
 There are 4 strategies for allocating data:
a. Centralized
b. Fragmented
c. Complete replication
d. Selective replication
2. Replication
 System maintains several replicas of the relation and stores each replica at different
sites.
 Partial replication: only important and frequently used fragments are replicated

 Full replication: a copy of the whole database is kept at every site
Advantages & disadvantages
a. Availability: data remains available even if any one of the sites fails.
b. Increased parallelism: when the majority of accesses to the relation are reads,
several sites can process them in parallel. Replication also minimizes data movement between sites,
as a replica may be found at the site where the transaction is being executed.
c. Increased overhead on update: the system must ensure that all replicas of a relation are
consistent, so whenever an update is made to one replica it must be propagated to all sites
containing replicas. This increases overhead.

3. Data Fragmentation
 The system partitions the relation into several fragments and stores each fragment at a
different site.
 2 schemes
1) Horizontal fragmentation
- Splits the relation by assigning each tuple of r to one or more fragments.
e.g. Account-schema = (acc_no, branch_name, balance)
If the banking system has only 2 branches, then there are 2 different
fragments:
$account_1 = \sigma_{branch\_name = \text{"Perryridge"}}(account)$
$account_2 = \sigma_{branch\_name = \text{"Hillside"}}(account)$
- In general, $r_i = \sigma_{P_i}(r)$,
where $r_i$ is the $i$th fragment and $P_i$ is a predicate on relation $r$.

2) Vertical fragmentation
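- Splits the relation by decomposing the scheme $R$ of relation $r$ into subsets of attributes $R_1, R_2, \ldots, R_n$.

- Each fragment $r_i$ of $r$ is defined by $r_i = \Pi_{R_i}(r)$,

and $r = r_1 \bowtie r_2 \bowtie \cdots \bowtie r_n$ (natural join), which is possible, for example, when each $R_i$ includes the primary key of $R$.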
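
Both fragmentation schemes, and the reconstruction of the original relation, can be sketched with plain Python lists of dictionaries. The sample tuples are made up, and the helpers `select`, `project`, and `natural_join` are hypothetical stand-ins for the relational operators, not a library API. The vertical case assumes the key `acc_no` is kept in every fragment.

```python
# Made-up sample tuples of the account relation (acc_no, branch_name, balance).
account = [
    {"acc_no": "A-101", "branch_name": "Perryridge", "balance": 500},
    {"acc_no": "A-215", "branch_name": "Hillside", "balance": 700},
    {"acc_no": "A-305", "branch_name": "Perryridge", "balance": 350},
]


def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def project(relation, attributes):
    """Projection: keep only the named attributes of every tuple."""
    return [{a: t[a] for a in attributes} for t in relation]


def natural_join(r1, r2, key):
    """Natural join of two fragments whose only common attribute is a unique key."""
    by_key = {t[key]: t for t in r2}
    return [{**t, **by_key[t[key]]} for t in r1 if t[key] in by_key]


# Horizontal fragmentation: one fragment per branch, rebuilt by union.
account1 = select(account, lambda t: t["branch_name"] == "Perryridge")
account2 = select(account, lambda t: t["branch_name"] == "Hillside")
horizontal_rebuilt = account1 + account2

# Vertical fragmentation: each fragment keeps the key, rebuilt by natural join.
branch_part = project(account, ["acc_no", "branch_name"])
balance_part = project(account, ["acc_no", "balance"])
vertical_rebuilt = natural_join(branch_part, balance_part, "acc_no")
```

Because the two horizontal fragments are disjoint, simple list concatenation acts as the union here; overlapping fragments would need duplicate elimination.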

Heterogeneous Distributed Databases
Manipulating information located in a heterogeneous distributed database requires an additional
software layer on top of the existing database systems. This software layer is called a multidatabase system.
The local database systems may employ different logical models and data-definition and data-manipulation
languages, and may differ in their concurrency-control and transaction-management mechanisms. A
multidatabase system creates the illusion of logical database integration without requiring physical
database integration.

Transparencies in DDBMS

1. Distribution Transparency

2. Transaction transparency

3. Performance Transparency

4. DBMS Transparency

1. Distribution Transparency
- The user does not need to know that data is fragmented (fragmentation transparency) or where
data items are located (location transparency).
 Fragmentation transparency
 Location transparency
 Replication transparency
 Local mapping transparency
 Naming transparency

2. Transaction Transparency
o Ensures that all distributed transactions maintain the integrity and
consistency of the distributed database.
o A distributed transaction accesses data stored at more than one location.
o Each transaction is divided into a number of subtransactions, one for each site that has to be accessed; a subtransaction is
represented by an agent.
3. Performance Transparency

o Requires the DDBMS to perform as if it were a centralized DBMS: it should not suffer any performance degradation
due to the distributed architecture.
4. DBMS Transparency
o hides the fact that local DBMSs may be different
o applicable to heterogeneous DDBMS
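
The division of a distributed transaction into subtransactions, described under transaction transparency above, can be sketched as grouping the transaction's operations by the site that stores each data item. The function name, the operation format, and the item-to-site mapping are assumptions made for this example.

```python
from collections import defaultdict


def split_into_subtransactions(operations, location_of):
    """Group (action, item) operations by the site that stores each item.

    Each resulting group corresponds to one subtransaction, which would be
    carried out by an agent at that site.
    """
    subtransactions = defaultdict(list)
    for action, item in operations:
        subtransactions[location_of[item]].append((action, item))
    return dict(subtransactions)


# Hypothetical funds transfer between accounts stored at two different sites.
location_of = {"A-101": "site_A", "B-202": "site_B"}
transfer = [("debit", "A-101"), ("credit", "B-202")]

plan = split_into_subtransactions(transfer, location_of)
```

A real DDBMS must additionally coordinate the agents so that either all subtransactions commit or none do; that coordination is outside the scope of this sketch.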


Multimedia databases

Multimedia databases provide features that allow users to store and query different types of multimedia
information, which includes images (such as photos or drawings), video clips (such as movies, newsreels, or
home videos), audio clips (such as songs, phone messages, or speeches), and documents (such as books or
articles).
DBMSs have been constantly adding to the types of data they support. Today the following types of
multimedia data are available in current systems.
 Text: May be formatted or unformatted. For ease of parsing structured documents, standards like
SGML and variations such as HTML are being used.
 Graphics: Examples include drawings and illustrations that are encoded using some descriptive
standards (e.g. CGM, PICT, and PostScript).
 Images: Includes drawings, photographs, and so forth, encoded in standard formats such as bitmap,
JPEG, and MPEG. Compression is built into JPEG and MPEG. These images are not subdivided into
components. Hence querying them by content (e.g., find all images containing circles) is nontrivial.
 Animations: Temporal sequences of image or graphic data.
 Video: A set of temporally sequenced photographic data for presentation at specified rates, for
example, 30 frames per second.
 Structured audio: A sequence of audio components comprising note, tone, duration, and so forth.
 Audio: Sample data generated from aural recordings in a string of bits in digitized form. Analog
recordings are typically converted into digital form before storage.

 Composite or mixed multimedia data: A combination of multimedia data types such as audio and
video which may be physically mixed to yield a new storage format or logically mixed while
retaining original types and formats. Composite data also contains additional control information
describing how the information should be rendered.

Nature of Multimedia Applications: Multimedia data may be stored, delivered, and utilized in many
different ways. Applications may be categorized based on their data management characteristics as follows:
 Repository applications: A large amount of multimedia data as well as metadata is stored for
retrieval purposes. Examples include repositories of satellite images, engineering drawings and
designs, space photographs, and radiology scanned pictures.
 Presentation applications: A large number of applications involve delivery of multimedia data
subject to temporal constraints; simple multimedia viewing of video data, for example, requires a
system to simulate VCR-like functionality. Complex and interactive multimedia presentations
involve orchestration directions to control the retrieval order of components in a series or in parallel.
Interactive environments must support capabilities such as real-time editing, analysis, or annotation of
video and audio data.
 Collaborative work using multimedia information: This is a new category of applications in which
engineers may execute a complex design task by merging drawings, fitting subjects to design
constraints, and generating new documentation, change notifications, and so forth. Intelligent
healthcare networks as well as telemedicine will involve doctors collaborating among themselves,
analyzing multimedia patient data and information in real time as it is generated.

Data management Issues in Multimedia database

Multimedia information systems are very complex and embrace a large set of issues. Dealing with
thousands of images, documents, audio and video segments, and free text data is a complex job. The
following issues are important.
 Design
– Conceptual, logical, and physical design of multimedia databases has not been fully addressed.
 Storage
– Multimedia data on standard disk like devices presents problems of representation,
compression, mapping to device hierarchies, archiving, and buffering during the input/output
operation.
 Queries and retrieval
– The “database” way of retrieving information is based on query languages and internal index
structures.

 Performance
– For multimedia applications involving only documents and text, performance constraints are
subjectively determined by the user.
– For applications involving video playback or audio-video synchronization, physical limitations
dominate.

Multimedia database application

Large-scale applications of multimedia databases can be expected to encompass a large number of
disciplines and enhance existing capabilities.
 Documents and records management
 Knowledge dissemination
- The multimedia mode, a very effective means of knowledge dissemination, will encompass a phenomenal growth in
electronic books, catalogs, manuals, encyclopedias and repositories of information on many
topics
 Education and training
- For learners ranging from kindergarten students to equipment operators to professionals
- Education and training delivered through digital libraries
 Marketing, advertising, retailing, entertainment, and travel
 Real-time control and monitoring
- For monitoring and controlling complex tasks such as manufacturing operations, nuclear
power plants , patients in intensive care units , and transportation

Multimedia database queries and information retrieval

The main types of database queries that are needed involve locating multimedia sources that contain certain
objects of interest. For example, one may want to retrieve the audio clips (say, speeches) delivered by a person
named Fidel Castro. One may also want to retrieve information based on the activities it includes, for
example video clips of a fighter jet crashing during the bombing of Baghdad. Queries of this type are referred
to as content-based retrieval, because the multimedia source is being retrieved based on its containing
certain objects or activities. Hence, a multimedia database must use some model to organize and index the
multimedia sources based on their contents.

2 approaches to identifying the contents


1. Automatic analysis of the multimedia sources to identify certain mathematical characteristics of
their contents. This approach uses different techniques depending on the type of multimedia source (image,
text, video, or audio).
2. Manual identification of the objects and activities of interest in each multimedia source, and
use of this information to index the sources. This can be applied to all the different multimedia sources, but it
requires a manual preprocessing phase in which a person has to scan each multimedia source to identify and
catalog the objects and activities it contains so that they can be used to index the source.
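
The second (manual) approach amounts to building an index from cataloged labels to sources. A minimal sketch follows, assuming the catalog has already been produced by a person; the file names and labels are invented for illustration.

```python
from collections import defaultdict

# Hypothetical result of the manual preprocessing phase:
# each source is cataloged with the objects and activities it contains.
catalog = {
    "clip_001.mp4": {"aircraft", "explosion"},
    "speech_17.wav": {"speech", "speaker:J. Doe"},
    "photo_42.jpg": {"aircraft"},
}


def build_index(catalog):
    """Invert the catalog: label -> set of sources that contain it."""
    index = defaultdict(set)
    for source, labels in catalog.items():
        for label in labels:
            index[label].add(source)
    return index


def find_sources(index, *labels):
    """Return the sources that contain every one of the given labels."""
    matches = [index.get(label, set()) for label in labels]
    return set.intersection(*matches) if matches else set()


index = build_index(catalog)
with_aircraft = find_sources(index, "aircraft")
aircraft_and_explosion = find_sources(index, "aircraft", "explosion")
```

A content-based query such as "sources containing an aircraft and an explosion" then becomes a set intersection over the index rather than a scan of the media itself.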

Parallel Databases
A parallel database system is a DBMS running across multiple processors and disks that is designed to
perform tasks such as loading data, building indexes, and evaluating queries in parallel.

3 main architecture designs


1. Shared Memory Architecture
2. Shared Disk Architecture
3. Shared Nothing Architecture

1. Shared Memory Architecture


 Multiple CPUs are attached to an interconnection network and share a single, global main memory
and common disk arrays.
 A single copy of a multithreaded OS and a multithreaded DBMS can support these multiple processors
(CPUs).
 Shared memory is a tightly coupled architecture in which multiple CPUs share memory.
 Also known as Symmetric Multiprocessing (SMP)

Advantages
 High-speed data access for a limited number of processors
 Communication is efficient, with low overhead.
Disadvantages
 Not scalable beyond 80 or 100 CPUs working in parallel, as the interconnection network is shared by all
CPUs.
 The bus or the interconnection network becomes a bottleneck as the number of CPUs increases.

[Figure: several CPUs attached to an interconnection network, which connects them to a shared memory and to the disks.]
Fig: Shared-memory architecture

2. Shared Disk Architecture


 Multiple CPUs are attached to an interconnection network.
 Each CPU has its own memory, but all have access to the same disks.
 Memory is not shared among CPUs, so each node has its own copy of the OS and DBMS.
 Loosely coupled architecture optimized for applications that are inherently centralized.
 Each CPU can access all disks directly but each has its own private memory. Such systems are also known as
clusters.
Advantages
 No memory bottleneck, as each CPU has its own memory
 Load balancing is easy

 Better fault tolerance
Disadvantages
 As the number of CPUs increases, problems of interference and contention also increase.
 Scalability problems

[Figure: several CPUs, each with its own private memory, attached to an interconnection network that connects them to the shared disks.]
Fig: Shared-disk Architecture

3. Shared nothing architecture


 Multiprocessor architecture in which each processor has its own memory and disk
 Also known as Massively Parallel Processing (MPP)
Advantages
 Better scalability, as no resources are shared
 Linear speed-up and linear scale-up
 Minimizes network references
Disadvantages
 Communication costs are higher, as sending data involves software interaction
 Accessing non-local disks is costly, so data may need to be redistributed and reorganized.

[Figure: nodes, each consisting of a CPU with its own private memory and disk, connected only through an interconnection network.]
Fig: Shared-nothing Architecture
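
In a shared-nothing system, data is typically spread across the nodes so that each node works only on its own partition and a coordinator combines the partial results. The sketch below simulates this on one machine with threads. The partitioning rule (key modulo number of nodes), the sample rows, and all function names are assumptions for illustration, not a description of any specific product.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, key, num_nodes):
    """Assign each row to one node based on its integer key (key mod num_nodes)."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[row[key] % num_nodes].append(row)
    return nodes


def local_sum(rows):
    """Work done independently at one node, on its own partition only."""
    return sum(row["balance"] for row in rows)


def parallel_total(nodes):
    """Run the local step for every node concurrently, then combine the results."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        return sum(pool.map(local_sum, nodes))


# Made-up rows: account numbers 1..6 with balances 100..600.
rows = [{"acc_no": i, "balance": 100 * i} for i in range(1, 7)]

nodes = partition(rows, "acc_no", num_nodes=3)
total = parallel_total(nodes)
```

Only the small partial sums cross the "network" here, which reflects the aim of minimizing network references. A query that needs rows from another node's partition would require shipping data between nodes, which is where the higher communication cost arises.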
