
Distribution

Q1 - Rajat
Q2 - Naman & Pranav (Completed)
Q3 - Pratik
Q4 - Shivank

Put the complete answers here, and at the end each person should prepare their own formatted Word document.

Questions
Q2. Write the graphical database schema for a complete university database (Ch. 2, Navathe).
Q4. Introduce commercial databases
a. IBM Db2
b. Oracle
c. Ms SQL Server

1. Answer 1: Apply cost-based functions for the SELECT operation and the JOIN operation.

The cost of executing a query includes the following components:

Access cost to secondary storage: This is the cost of transferring (reading and writing) data
blocks between secondary disk storage and main memory buffers. This is also known as disk I/O
(input/output) cost. The cost of searching for records in a disk file depends on the type of access
structures on that file, such as ordering, hashing, and primary or secondary indexes. In addition,
factors such as whether the file blocks are allocated contiguously on the same disk cylinder or
scattered on the disk affect the access cost.

Disk storage cost: This is the cost of storing on disk any intermediate files that are generated by
an execution strategy for the query.

Computation cost: This is the cost of performing in-memory operations on the records within the
data buffers during query execution. Such operations include searching for and sorting records,
merging records for a join or a sort operation, and performing computations on field values. This
is also known
as CPU (central processing unit) cost.
Memory usage cost: This is the cost pertaining to the number of main memory buffers needed
during query execution.

Communication cost: This is the cost of shipping the query and its results from the database site to the site or terminal where the query originated. In distributed databases (see Chapter 25), it would also include the cost of transferring tables and results among various computers during query evaluation.

For large databases, the main emphasis is often on minimizing the access cost to secondary storage. Simple cost functions ignore other factors and compare different query execution strategies in terms of the number of block transfers between disk and main memory buffers. For smaller databases, where most of the data in the files involved in the query can be completely stored in memory, the emphasis is on minimizing computation cost.
Catalog Information Used in Cost Functions

For a file whose records are all of the same type

number of records (tuples) (r)


average record size (R)
number of file blocks (b)
blocking factor (bfr)

We must also keep track of the primary file organization for each file. The file records may be unordered, ordered by an attribute with or without a primary or clustering index,
or hashed (static hashing or one of the dynamic hashing methods) on a key attribute. Information
is also kept on all primary, secondary, or clustering indexes and their indexing attributes.

Number of levels (x) of each multilevel index (primary, secondary, or clustering) is needed for cost
functions that estimate the number of block accesses that occur during
query execution.

In some cost functions the number of first-level index blocks (bI1) is needed.

number of distinct values (d) of an attribute

selectivity (sl), which is the fraction of records satisfying an equality condition on the
attribute.

selection cardinality (s= sl*r) of an attribute, which is the average number of records that
will satisfy an equality selection condition on that attribute.

For a key attribute, d = r, sl = 1/r, and s = 1.

For a nonkey attribute, by assuming that the d distinct values are uniformly distributed among the records, we estimate sl = (1/d) and so s = (r/d).
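To make the last two estimates concrete, here is a small Python sketch (the statistics are hypothetical values of our choosing, not from any real catalog) that derives sl and s for a key and a nonkey attribute under the uniform-distribution assumption:

# Hypothetical catalog statistics (illustrative values only).
r = 10000          # number of records in the file
d_key = r          # a key attribute has d = r distinct values
d_nonkey = 125     # distinct values of a nonkey attribute

sl_key = 1 / d_key           # selectivity of a key attribute: 1/r
s_key = sl_key * r           # selection cardinality: 1
sl_nonkey = 1 / d_nonkey     # assumed uniform distribution: 1/d
s_nonkey = sl_nonkey * r     # selection cardinality: r/d

print(sl_key, s_key)         # 0.0001 1.0
print(sl_nonkey, s_nonkey)   # 0.008 80.0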
Information such as the number of index levels is easy to maintain because it does not change
very often. However, other information may change frequently.
Cost Functions for SELECT

These cost functions are estimates that ignore computation time, storage cost, and other factors.
The cost for method Si is referred to as CSi block accesses.

S1—Linear search (brute force) approach. We search all the file blocks to retrieve all records
satisfying the selection condition; hence, CS1a = b.
For an equality condition on a key attribute, only half the file blocks are searched on the average
before finding the record, so a rough estimate for CS1b = (b/2) if the record is found; if no record
is found that satisfies the condition, CS1b = b.

S2—Binary search. This search accesses approximately CS2 = log2 b + ⌈s/bfr⌉ − 1 file blocks. This reduces to log2 b if the equality condition is on a unique (key) attribute, because s = 1 in this case.

S3a—Using a primary index to retrieve a single record. For a primary index, retrieve one disk
block at each index level, plus one disk block from the data file. Hence, the cost is one more disk
block than the number of index levels: CS3a = x + 1.

S3b—Using a hash key to retrieve a single record. For hashing, only one disk block needs to
be accessed in most cases. The cost function is approximately CS3b = 1 for static hashing or
linear hashing, and it is 2 disk block accesses for extendible hashing.

S4—Using an ordering index to retrieve multiple records. If the comparison condition is >,
>=, <, or <= on a key field with an ordering index, roughly half the file records will satisfy the
condition. This gives a cost function of CS4 = x + (b/2). This is a very rough estimate, and although
it may be correct on the average, it may be quite inaccurate in individual cases. A more accurate
estimate is possible if the distribution of records is stored in a histogram.

S5—Using a clustering index to retrieve multiple records. One disk block is accessed at each
index level, which gives the address of the first file disk block in the cluster. Given an equality
condition on the indexing attribute, s records will satisfy the condition, where s is the selection
cardinality of the
indexing attribute.

This means that ⌈s/bfr⌉ file blocks will be in the cluster of file blocks that hold all the selected records, giving CS5 = x + ⌈s/bfr⌉.

S6—Using a secondary (B+-tree) index. For a secondary index on a key (unique) attribute, the
cost is x + 1 disk block accesses. For a secondary index on a nonkey (nonunique) attribute, s
records will satisfy an equality condition, where s is the selection cardinality of the indexing
attribute. However, because the index is nonclustering, each of the records may reside on a
different disk block, so the (worst case) cost estimate is CS6a = x + 1 + s. The additional 1 is to
account for the disk block that contains the record pointers after the index is searched. If the
comparison condition is >, >=, <, or <= and half the file records are assumed to satisfy the
condition, then (very roughly) half the first-level index blocks are accessed, plus half the file
records via the index. The cost estimate for this case, approximately, is CS6b = x + (bI1/2) + (r/2).
The r/2 factor can be refined if better selectivity estimates
are available through a histogram. The latter method CS6b can be very costly.

S7—Conjunctive selection. We can use either S1 or one of the methods S2 to S6 discussed above. In the latter case, we use one condition to retrieve the records and then check in the main
memory buffers whether each retrieved record satisfies the remaining conditions in the
conjunction. If multiple indexes exist, the search of each index can produce a set of record
pointers (record ids) in the main memory buffers. The intersection of the sets of record pointers
(referred to in S9) can be computed in main memory, and then the resulting records are retrieved
based on their record ids.
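The block-access estimates above can be transcribed directly into code. The following Python sketch restates S1–S6 (the function and parameter names are ours, not from the text); note that the worked examples later in this answer evaluate the nonkey S6 equality case as x + s, so the sketch does the same:

from math import ceil, log2

def cs1a(b): return b                # S1: linear search, nonkey attribute
def cs1b(b): return b / 2            # S1: linear search, key attribute (average)
def cs2(b, s, bfr):                  # S2: binary search
    return ceil(log2(b)) + ceil(s / bfr) - 1
def cs3a(x): return x + 1            # S3a: primary index, single record
def cs3b(): return 1                 # S3b: hash key (static or linear hashing)
def cs4(x, b): return x + b / 2      # S4: ordering index, range condition
def cs5(x, s, bfr): return x + ceil(s / bfr)     # S5: clustering index, equality
def cs6a_key(x): return x + 1        # S6: secondary index, equality on a key
def cs6a_nonkey(x, s): return x + s  # S6: equality on a nonkey attribute (the
                                     # worst-case formula in the text adds 1 more
                                     # for the block of record pointers)
def cs6b(x, bI1, r): return x + bI1 / 2 + r / 2  # S6: range condition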

Example of Using the Cost Functions

In a query optimizer, it is common to enumerate the various possible strategies for executing a
query and to estimate the costs for different strategies. An optimization technique, such as
dynamic programming, may be used to find the optimal (least) cost estimate efficiently, without
having to consider all possible execution strategies. We do not discuss optimization algorithms
here; rather, we use a simple example to illustrate how cost estimates may be used. Suppose
that the EMPLOYEE file has rE = 10,000 records stored in bE = 2000 disk blocks with blocking
factor bfrE = 5 records/block and the following access paths:

1. A clustering index on Salary, with levels xSalary = 3 and average selection cardinality sSalary
= 20. (This corresponds to a selectivity of slSalary = 0.002).

2. A secondary index on the key attribute Ssn, with xSsn = 4 (sSsn = 1, slSsn = 0.0001).

3. A secondary index on the nonkey attribute Dno, with xDno = 2 and first-level index blocks
bI1Dno = 4. There are dDno = 125 distinct values for Dno, so the selectivity of Dno is slDno =
(1/dDno) = 0.008, and the selection cardinality is sDno = (rE * slDno) = (rE/dDno) = 80.

4. A secondary index on Sex, with xSex = 1. There are dSex = 2 values for the Sex attribute, so
the average selection cardinality is sSex = (rE/dSex) = 5000. (Note that in this case, a histogram
giving the percentage of male and female employees may be useful, unless they are
approximately equal.)

We illustrate the use of cost functions with the following examples:


OP1: σSsn=‘123456789’(EMPLOYEE)
OP2: σDno>5(EMPLOYEE)
OP3: σDno=5(EMPLOYEE)
OP4: σDno=5 AND Salary>30000 AND Sex=‘F’(EMPLOYEE)

The cost of the brute force (linear search or file scan) option S1 will be estimated as CS1a = bE
= 2000 (for a selection on a nonkey attribute) or CS1b = (bE/2) = 1000 (average cost for a
selection on a key attribute).

For OP1 we can use either method S1 or method S6a; the cost estimate for S6a is CS6a = xSsn
+ 1 = 4 + 1 = 5, and it is chosen over method S1, whose average cost is CS1b = 1000. For OP2
we can use either method S1 (with estimated cost CS1a = 2000) or method S6b (with estimated
cost CS6b = xDno + (bI1Dno/2) + (rE /2) = 2 + (4/2) + (10,000/2) = 5004), so we choose the linear
search approach for OP2. For OP3 we can use either method S1 (with estimated cost CS1a =
2000) or method S6a (with estimated cost CS6a = xDno + sDno = 2 + 80 = 82), so we choose
method S6a.

Finally, consider OP4, which has a conjunctive selection condition. We need to estimate the cost
of using any one of the three components of the selection condition to retrieve the records, plus
the linear search approach. The latter gives cost estimate CS1a = 2000. Using the condition (Dno
= 5) first gives the cost estimate CS6a = 82.

Using the condition (Salary > 30,000) first gives a cost estimate CS4 = xSalary + (bE/2) = 3 + (2000/2) = 1003. Using the condition (Sex = ‘F’) first gives a cost estimate CS6a = xSex + sSex = 1 + 5000 = 5001.

The optimizer would then choose method S6a on the secondary index on Dno because it has the
lowest cost estimate. The condition (Dno = 5) is used to retrieve the records, and the remaining
part of the conjunctive condition (Salary > 30,000 AND Sex = ‘F’) is checked for each selected
record after it is retrieved into memory. Only the records that satisfy these additional conditions
are included in the result of the operation.
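Plugging the EMPLOYEE statistics into the sketch given earlier reproduces the estimates used in this example (a hedged illustration, reusing the functions defined above):

bE, rE = 2000, 10000
xSsn, xDno, xSalary, xSex = 4, 2, 3, 1
bI1Dno, sDno, sSex = 4, 80, 5000

print(cs1b(bE))                  # 1000.0  linear search, key attribute
print(cs6a_key(xSsn))            # 5       OP1 via the Ssn index
print(cs6b(xDno, bI1Dno, rE))    # 5004.0  OP2 via the Dno index (linear search wins)
print(cs6a_nonkey(xDno, sDno))   # 82      OP3 via the Dno index
print(cs4(xSalary, bE))          # 1003.0  OP4, Salary condition first
print(cs6a_nonkey(xSex, sSex))   # 5001    OP4, Sex condition first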

Cost Functions for JOIN

To develop reasonably accurate cost functions for JOIN operations, we need to have an estimate
for the size (number of tuples) of the file that results after the JOIN operation. This is usually kept as the ratio of the size (number of tuples) of the resulting join file to the size of the CARTESIAN PRODUCT file, if both are applied to the same input files, and it is called the join selectivity (js). If we denote the number of tuples of a relation R by |R|, we have:

js = |(R ⋈c S)| / |(R × S)| = |(R ⋈c S)| / (|R| * |S|)

If there is no join condition c, then js = 1 and the join is the same as the CARTESIAN PRODUCT.
If no tuples from the relations satisfy the join condition, then js = 0. In general, 0 ≤ js ≤ 1. For a
join where the condition c is an equality comparison R.A = S.B, we get the following two special
cases:

1. If A is a key of R, then |(R ⋈c S)| ≤ |S|, so js ≤ (1/|R|). This is because each record in file S will be
joined with at most one record in file R, since A is a key of R. A special case of this condition is
when attribute B is a foreign key of S that references the primary key A of R. In addition, if the
foreign key B has the NOT NULL constraint, then js = (1/|R|), and the result file of the join will
contain |S| records.

2. If B is a key of S, then |(R ⋈c S)| ≤ |R|, so js ≤ (1/|S|).


Having an estimate of the join selectivity for commonly occurring join conditions enables the query
optimizer to estimate the size of the resulting file after the join operation, given the sizes of the
two input files, by using the formula |(R ⋈c S)| = js * |R| * |S|. We can now give some sample approximate cost functions for estimating the cost of some of the join algorithms given in Section 19.3.2. The join operations are of the form R ⋈A=B S, where A and B are domain-compatible attributes of R and S, respectively. Assume that R has bR blocks and that S has bS blocks:

J1—Nested-loop join. Suppose that we use R for the outer loop; then we get the following cost
function to estimate the number of block accesses for this method, assuming three memory
buffers. We assume that the blocking factor for the resulting file is bfrRS and that the join selectivity
is known:

CJ1 = bR + (bR * bS) + (( js * |R| * |S|)/bfrRS)

The last part of the formula is the cost of writing the resulting file to disk. This cost formula can be
modified to take into account different numbers of memory buffers. If nB main memory buffers are
available to perform the join, the cost formula becomes:

CJ1 = bR + (⌈bR/(nB − 2)⌉ * bS) + ((js * |R| * |S|)/bfrRS)

J2—Single-loop join (using an access structure to retrieve the matching record(s)). If an index exists for the join attribute B of S with index levels xB, we can retrieve each record s in R
and then use the index to retrieve all the matching records t from S that satisfy t[B] = s[A]. The
cost depends on the type of index. For a secondary index where sB is the selection cardinality for the join attribute B of S, we get:

CJ2a = bR + (|R| * (xB + 1 + sB)) + (( js * |R| * |S|)/bfrRS)

For a clustering index where sB is the selection cardinality of B, we get

CJ2b = bR + (|R| * (xB + (sB/bfrB))) + (( js * |R| * |S|)/bfrRS)


For a primary index, we get

CJ2c = bR + (|R| * (xB + 1)) + ((js * |R| * |S|)/bfrRS)

If a hash key exists for one of the two join attributes—say, B of S—we get
CJ2d = bR + (|R| * h) + (( js * |R| * |S|)/bfrRS)
where h ≥1 is the average number of block accesses to retrieve a record, given its hash key value.
Usually, h is estimated to be 1 for static and linear hashing and 2 for extendible hashing.

J3—Sort-merge join. If the files are already sorted on the join attributes, the cost function for this
method is

CJ3a = bR + bS + ((js * |R| * |S|)/bfrRS)

If we must sort the files, the cost of sorting must be added; the standard external-sorting cost formulas can be used to estimate it.
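The join cost formulas J1–J3 can likewise be sketched in Python (again, the function names are ours; each function simply transcribes the corresponding formula, including the common term for writing the result file):

from math import ceil

def write_cost(js, R, S, bfrRS):
    # Cost of writing the join result: (js * |R| * |S|) / bfrRS blocks
    return (js * R * S) / bfrRS

def cj1(bR, bS, js, R, S, bfrRS, nB=3):
    # J1: nested-loop join with nB buffers (nB = 3 gives bR + bR*bS + write cost)
    return bR + ceil(bR / (nB - 2)) * bS + write_cost(js, R, S, bfrRS)

def cj2a(bR, R, xB, sB, js, S, bfrRS):
    # J2: single-loop join using a secondary index on B (worst case)
    return bR + R * (xB + 1 + sB) + write_cost(js, R, S, bfrRS)

def cj2c(bR, R, xB, js, S, bfrRS):
    # J2: single-loop join using a primary index on B
    return bR + R * (xB + 1) + write_cost(js, R, S, bfrRS)

def cj3a(bR, bS, js, R, S, bfrRS):
    # J3: sort-merge join, files already sorted on the join attributes
    return bR + bS + write_cost(js, R, S, bfrRS)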

Example of Using the Cost Functions

Suppose that we have the EMPLOYEE file described in the previous example, and assume that the DEPARTMENT file consists of rD = 125 records stored in bD = 13 disk blocks. Consider the
following two join operations:

OP6: EMPLOYEE ⋈Dno=Dnumber DEPARTMENT


OP7: DEPARTMENT ⋈Mgr_ssn=Ssn EMPLOYEE

Suppose that we have a primary index on Dnumber of DEPARTMENT with xDnumber=1 level
and a secondary index on Mgr_ssn of DEPARTMENT with selection cardinality sMgr_ssn= 1 and
levels xMgr_ssn=2.

Assume that the join selectivity for OP6 is jsOP6 = (1/|DEPARTMENT|) = 1/125 because
Dnumber is a key of DEPARTMENT. Also assume that the blocking factor for the resulting join
file is bfrED = 4 records per block. We can estimate the worst-case costs for the JOIN operation
OP6 using the applicable methods J1 and J2 as follows:

1. Using method J1 with EMPLOYEE as outer loop:


CJ1 = bE + (bE * bD) + ((jsOP6 * rE * rD)/bfrED) = 2000 + (2000 * 13) + (((1/125) * 10,000 * 125)/4) = 30,500

2. Using method J1 with DEPARTMENT as outer loop:


CJ1 = bD + (bE * bD) + ((jsOP6 * rE * rD)/bfrED) = 13 + (13 * 2000) + (((1/125) * 10,000 * 125)/4) = 28,513

3. Using method J2 with EMPLOYEE as outer loop:

CJ2c = bE + (rE * (xDnumber + 1)) + ((jsOP6 * rE * rD)/bfrED) = 2000 + (10,000 * 2) + (((1/125) * 10,000 * 125)/4) = 24,500

4. Using method J2 with DEPARTMENT as outer loop:


CJ2a = bD + (rD * (xDno + sDno)) + ((jsOP6 * rE * rD)/bfrED) = 13 + (125 * (2 + 80)) + (((1/125) * 10,000 * 125)/4) = 12,763

Case 4 has the lowest cost estimate and will be chosen. Notice that in case 2 above, if 15 memory
buffers (or more) were available for executing the join instead of just 3, 13 of them could be used
to hold the entire DEPARTMENT relation (outer loop relation) in memory, one could be used as
buffer for the result, and one would be used to hold one block at a time of the EMPLOYEE file
(inner loop file), and the cost for case 2 could be drastically reduced to just bE + bD + (( jsOP6 *
rE * rD)/bfrED) or 4,513. If some other number of main memory buffers was available, say nB =
10, then the cost for case 2 would be calculated as follows, which would also give better
performance than case 4:

CJ1 = bD + (⌈bD/(nB − 2)⌉ * bE) + ((js * |R| * |S|)/bfrRS)
= 13 + (⌈13/8⌉ * 2000) + (((1/125) * 10,000 * 125)/4)
= 13 + (2 * 2000) + 2500 = 6,513
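The case-by-case arithmetic above can be checked with the join sketch given earlier (note that, as with the selection examples, case 4 in the text evaluates the secondary-index term as xDno + sDno, without the extra +1 of the worst-case CJ2a formula):

bE, rE, bD, rD = 2000, 10000, 13, 125
jsOP6, bfrED = 1 / 125, 4
xDnumber, xDno, sDno = 1, 2, 80

out = write_cost(jsOP6, rE, rD, bfrED)      # 2500.0 blocks to write the result
print(bE + bE * bD + out)                   # case 1: 30500.0
print(bD + bD * bE + out)                   # case 2: 28513.0
print(bE + rE * (xDnumber + 1) + out)       # case 3: 24500.0
print(bD + rD * (xDno + sDno) + out)        # case 4: 12763.0
print(cj1(bD, bE, jsOP6, rE, rD, bfrED, nB=10))   # case 2 with 10 buffers: 6513.0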

2. Answer 2
STUDENT :
Student_Name Student_ID Major

COURSE :
Course_ID Title Tot_Credits

SECTION :
Section_ID Course_ID Semester Year

PREREQUISITES :
Pre_Number Course_ID

GRADE REPORT :
Student_ID Section_ID Grade
The graphical representation of the above schema is as follows:

STUDENT
Student_ID | Student_Name | Major

GRADE_REPORT
Student_ID | Section_ID | Grade

COURSE
Course_ID | Title | Tot_Credits

SECTION
Section_ID | Course_ID | Semester | Year

PREREQUISITES
Pre_Number | Course_ID
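As a hedged illustration of the same schema (table and column names are taken from the diagram above; the key and type choices are our assumptions, and GRADE REPORT is renamed GRADE_REPORT to form a valid identifier), the relations can be created directly, here using Python's built-in sqlite3:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (
    Student_ID   INTEGER PRIMARY KEY,
    Student_Name TEXT NOT NULL,
    Major        TEXT
);
CREATE TABLE COURSE (
    Course_ID   TEXT PRIMARY KEY,
    Title       TEXT NOT NULL,
    Tot_Credits INTEGER
);
CREATE TABLE SECTION (
    Section_ID INTEGER PRIMARY KEY,
    Course_ID  TEXT REFERENCES COURSE(Course_ID),
    Semester   TEXT,
    Year       INTEGER
);
CREATE TABLE PREREQUISITES (
    Pre_Number TEXT REFERENCES COURSE(Course_ID),
    Course_ID  TEXT REFERENCES COURSE(Course_ID),
    PRIMARY KEY (Course_ID, Pre_Number)
);
CREATE TABLE GRADE_REPORT (
    Student_ID INTEGER REFERENCES STUDENT(Student_ID),
    Section_ID INTEGER REFERENCES SECTION(Section_ID),
    Grade      TEXT,
    PRIMARY KEY (Student_ID, Section_ID)
);
""")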

3. Answer 3: Overview of SQL3 and ORDBMS design.


SQL:1999 (also called SQL3) was the fourth revision of the SQL database query language. It introduced many new features, many of which required clarifications in the subsequent SQL:2003. SQL:1999 has since been deprecated.
Summary
The ISO standard documents were published between 1999 and 2002 in several installments,
the first one consisting of multiple parts. Unlike previous editions, the standard's name used a
colon instead of a hyphen for consistency with the names of other ISO standards. The first
installment of SQL:1999 had five parts:
● SQL/Framework ISO/IEC 9075-1:1999
● SQL/Foundation ISO/IEC 9075-2:1999
● SQL/CLI : an updated definition of the extension Call Level Interface, originally
published in 1995, also known as CLI-95 ISO/IEC 9075-3:1999
● SQL/PSM : an updated definition of the extension Persistent Stored Modules,
originally published in 1996, also known as PSM-96 ISO/IEC 9075-4:1999
● SQL/Bindings ISO/IEC 9075-5:1999
Three more parts, also considered part of SQL:1999 were published subsequently:
● SQL/MED Management of External Data (SQL:1999 part 9) ISO/IEC 9075-9:2001
● SQL/OLB Object Language Bindings (SQL:1999 part 10) ISO/IEC 9075-10:2000
● SQL/JRT SQL Routines and Types using the Java Programming Language
(SQL:1999 part 13) ISO/IEC 9075-13:2002
New features

Data types

Boolean data types

The SQL:1999 standard calls for a Boolean type, but many commercial SQL servers (Oracle Database, IBM DB2) do not support it as a column type or variable type, or allow it in result sets. Microsoft SQL Server is one of the few database systems that supports Boolean-like values through its BIT data type; every group of up to eight BIT fields occupies one byte of space on disk. MySQL interprets BOOLEAN as a synonym for TINYINT (8-bit signed integer). PostgreSQL provides a standard-conforming Boolean type.

Distinct user-defined types

Sometimes called just distinct types, these were introduced as an optional feature (S011) to
allow existing atomic types to be extended with a distinctive meaning to create a new type and
thereby enabling the type checking mechanism to detect some logical errors, e.g. accidentally
adding an age to a salary. For example:
CREATE TYPE age AS INTEGER FINAL;
CREATE TYPE salary AS INTEGER FINAL;

creates two different and incompatible types. The SQL distinct types use name equivalence, not structural equivalence like typedefs in C. It is still possible to perform compatible operations on columns or data of distinct types by using an explicit type CAST.
Few SQL systems support these. IBM DB2 is one of those that do. Oracle Database does not currently support them, recommending instead that they be emulated with a one-place structured type.

Structured user-defined types

These are the backbone of the object-relational database extension in SQL:1999. They are analogous to classes in object-oriented programming languages. SQL:1999 allows only single inheritance.

Common table expressions and recursive queries


SQL:1999 added a WITH [RECURSIVE] construct allowing recursive queries, like transitive
closure, to be specified in the query language itself; see common table expressions.
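For instance, a transitive-closure query over a prerequisites table (a sketch run through Python's sqlite3, which supports WITH RECURSIVE; the table layout and course codes are invented for illustration):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PREREQUISITES (Course_ID TEXT, Pre_Number TEXT)")
conn.executemany("INSERT INTO PREREQUISITES VALUES (?, ?)",
                 [("CS301", "CS201"), ("CS201", "CS101")])

# All direct and indirect prerequisites of CS301 (transitive closure).
rows = conn.execute("""
    WITH RECURSIVE prereq(course) AS (
        SELECT Pre_Number FROM PREREQUISITES WHERE Course_ID = 'CS301'
        UNION
        SELECT p.Pre_Number
        FROM PREREQUISITES p JOIN prereq ON p.Course_ID = prereq.course
    )
    SELECT course FROM prereq
""").fetchall()
print(rows)   # [('CS201',), ('CS101',)]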

Some OLAP capabilities

GROUP BY was extended with ROLLUP, CUBE, and GROUPING SETS.

Role-based access control

Full support for RBAC via CREATE ROLE.

Keywords

SQL:1999 introduced the UNNEST keyword.

SQL:2003 is the fifth revision of the SQL database query language. The standard consists of nine parts, which are described in detail in the standard documents. It was updated by SQL:2006, which was in turn updated by SQL:2008.
New features
The SQL:2003 standard makes minor modifications to all parts of SQL:1999 (also known
as SQL3), and officially introduces a few new features such as:
● XML-related features (SQL/XML)
● Window functions (see the sketch after this list)
● the sequence generator, which allows standardized sequences
● two new column types: auto-generated values and identity-columns
● the new MERGE statement
● extensions to the CREATE TABLE statement, to allow "CREATE TABLE AS" and
"CREATE TABLE LIKE"
● removal of the poorly implemented "BIT" and "BIT VARYING" data types
● OLAP capabilities (initially added in SQL:1999) were extended with window functions.
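As a brief sketch of one of these additions, the window-function query below runs through Python's sqlite3 (this assumes an SQLite build of 3.25 or later, which added window-function support; the sample data is invented):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (Name TEXT, Dept TEXT, Salary INTEGER)")
conn.executemany("INSERT INTO Emp VALUES (?, ?, ?)",
                 [("A", "R&D", 900), ("B", "R&D", 700), ("C", "HR", 800)])

# Rank employees by salary within each department.
for row in conn.execute("""
    SELECT Name, Dept,
           RANK() OVER (PARTITION BY Dept ORDER BY Salary DESC) AS rnk
    FROM Emp
"""):
    print(row)   # e.g. ('C', 'HR', 1), ('A', 'R&D', 1), ('B', 'R&D', 2)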

An object-relational database (ORD), or object-relational database management system (ORDBMS), is a database management system (DBMS) similar to a relational database, but
with an object-oriented database model: objects, classes and inheritance are directly
supported in database schemas and in the query language. In addition, just as with pure
relational systems, it supports extension of the data model with custom data types and
methods.
[Figure: Example of an object-oriented database model]

An object-relational database can be said to provide a middle ground between relational databases and object-oriented databases. In object-relational databases, the approach is
essentially that of relational databases: the data resides in the database and is manipulated
collectively with queries in a query language; at the other extreme are OODBMSes in which
the database is essentially a persistent object store for software written in an object-oriented
programming language, with a programming API for storing and retrieving objects, and little or
no specific support for querying.

Overview
The basic need for object-relational databases arises from the fact that both relational and object databases have their individual advantages and drawbacks. The isomorphism of the relational database system with a mathematical relation allows it to exploit many useful techniques and theorems from set theory. But these databases fall short when it comes to data complexity and the mismatch between the application and the DBMS. An object-oriented database model allows containers like sets and lists, arbitrary user-defined datatypes, as well as nested objects. This brings commonality between the application type systems and database type systems, which removes any issue of impedance mismatch. But object databases, unlike relational ones, do not provide a mathematical basis for deep analysis.
The basic goal for the Object-relational database is to bridge the gap between relational
databases and the object-oriented modeling techniques used in programming languages such
as Java, C++, Visual Basic .NET, or C#. However, a more popular alternative for achieving such a bridge is to use standard relational database systems with some form of object-relational mapping (ORM) software. Whereas traditional RDBMS or SQL-DBMS products
focused on the efficient management of data drawn from a limited set of data-types (defined
by the relevant language standards), an object-relational DBMS allows software developers to
integrate their own types and the methods that apply to them into the DBMS.
The ORDBMS (like ODBMS or OODBMS) is integrated with an object-oriented programming
language. The characteristic properties of ORDBMS are 1) complex data, 2) type inheritance,
and 3) object behavior. Complex data creation in most SQL ORDBMSs is based on
preliminary schema definition via the user-defined type (UDT). Hierarchy within structured
complex data offers an additional property, type inheritance. That is, a structured type can
have subtypes that reuse all of its attributes and contain additional attributes specific to the
subtype. Another advantage, object behavior, is related to access to program objects. Such program objects must be storable and transportable for database processing; they are therefore usually called persistent objects. Inside a database, all relations with a persistent program object are relations with its object identifier (OID). All of these points can be addressed in a proper relational system, although the SQL standard and its implementations impose arbitrary restrictions and additional complexity.
In object-oriented programming (OOP), object behavior is described through the methods
(object functions). Methods denoted by one name are distinguished by the types of their parameters and the type of object to which they are attached (the method signature). The OOP
languages call this the polymorphism principle, which briefly is defined as "one interface, many
implementations". Other OOP principles, inheritance and encapsulation, are related both to
methods and attributes. Method inheritance is included in type inheritance. Encapsulation in
OOP is a visibility degree declared, for example, through the public, private and
protected access modifiers.
History
Object-relational database management systems grew out of research that occurred in the
early 1990s. That research extended existing relational database concepts by adding object
concepts. The researchers aimed to retain a declarative query-language based on predicate
calculus as a central component of the architecture. Probably the most notable research
project, Postgres (UC Berkeley), spawned two products tracing their lineage to that research:
Illustra and PostgreSQL.
In the mid-1990s, early commercial products appeared. These included Illustra (Illustra
Information Systems, acquired by Informix Software, which was in turn acquired by IBM),
Omniscience (Omniscience Corporation, acquired by Oracle Corporation and became the
original Oracle Lite), and UniSQL (UniSQL, Inc., acquired by KCOMS). Ukrainian developer
Ruslan Zasukhin, founder of Paradigma Software, Inc., developed and shipped the first version
of Valentina database in the mid-1990s as a C++ SDK. By the next decade, PostgreSQL had
become a commercially viable database, and is the basis for several current products that
maintain its ORDBMS features.
Computer scientists came to refer to these products as "object-relational database
management systems" or ORDBMSs.
Many of the ideas of early object-relational database efforts have largely become incorporated
into SQL:1999 via structured types. In fact, any product that adheres to the object-oriented
aspects of SQL:1999 could be described as an object-relational database management
product. For example, IBM's DB2, Oracle database, and Microsoft SQL Server, make claims
to support this technology and do so with varying degrees of success.
Comparison to RDBMS
An RDBMS might commonly involve SQL statements such as these:
CREATE TABLE Customers (
Id CHAR(12) NOT NULL PRIMARY KEY,
Surname VARCHAR(32) NOT NULL,
FirstName VARCHAR(32) NOT NULL,
DOB DATE NOT NULL
);
SELECT InitCap(Surname) || ', ' || InitCap(FirstName)
FROM Customers
WHERE Month(DOB) = Month(getdate())
AND Day(DOB) = Day(getdate())

Most current SQL databases allow the crafting of custom functions, which would allow the
query to appear as:
SELECT Formal(Id)
FROM Customers
WHERE Birthday(DOB) = Today()

In an object-relational database, one might see something like this, with user-defined data-
types and expressions such as BirthDay():
CREATE TABLE Customers (
Id Cust_Id NOT NULL PRIMARY KEY,
Name PersonName NOT NULL,
DOB DATE NOT NULL
);
SELECT Formal( C.Id )
FROM Customers C
WHERE BirthDay ( C.DOB ) = TODAY;

The object-relational model can offer another advantage in that the database can make use of
the relationships between data to easily collect related records. In an address book application,
an additional table would be added to the ones above to hold zero or more addresses for each
customer. Using a traditional RDBMS, collecting information for both the user and their
address requires a "join":
SELECT InitCap(C.Surname) || ', ' || InitCap(C.FirstName), A.city
FROM Customers C JOIN Addresses A ON A.Cust_Id = C.Id -- the join
WHERE A.city = 'New York';

The same query in an object-relational database appears more simply:


SELECT Formal( C.Name )
FROM Customers C
WHERE BirthDay( C.DOB ) = TODAY
AND C.address.city = 'New York'; -- the linkage is 'understood' by the ORDB
4. Answer 4

a. IBM DB2
DB2 is a database product from IBM. It is a Relational Database Management System (RDBMS). DB2 is designed to store, analyze, and retrieve data efficiently. The DB2 product has been extended with support for object-oriented features and non-relational structures such as XML.

History
Initially, IBM developed DB2 for its own specific platform. Around 1990, it decided to develop a Universal Database (UDB) DB2 server that could run on widely used operating systems such as Linux, UNIX, and Windows.

Versions
For IBM DB2, the current UDB version is 10.5, with BLU Acceleration features and the code name 'Kepler'. The DB2 versions to date are listed below:

Version | Code Name
3.4 | Cobweb
8.1, 8.2 | Stinger
9.1 | Viper
9.5 | Viper 2
9.7 | Cobra
9.8 | (added pureScale features only)
10.1 | Galileo
10.5 | Kepler

Data server editions and features


Organizations select the DB2 edition that matches the features they need. The main DB2 server editions and their features are:

Advanced Enterprise Server Edition and Enterprise Server Edition (AESE/ESE): Designed for mid-size to large business organizations. Platforms: Linux, UNIX, and Windows. Features: table partitioning, High Availability Disaster Recovery (HADR), Materialized Query Tables (MQTs), Multidimensional Clustering (MDC), connection concentrator, pureXML, backup compression, and homogeneous federation.

Workgroup Server Edition (WSE): Designed for workgroups or mid-size business organizations. Features: High Availability Disaster Recovery (HADR), online reorganization, pureXML, web service federation support, DB2 homogeneous federation, homogeneous SQL replication, and backup compression.

Express-C: Provides all the core capabilities of DB2 at zero charge. It can run on any physical or virtual system with any size of configuration.

Express Edition: Designed for entry-level and mid-size business organizations. It is a full-featured DB2 data server but offers only limited services. Features: web service federation, DB2 homogeneous federation, homogeneous SQL replication, and backup compression.

Enterprise Developer Edition: Licensed to a single application developer. It is useful for designing, building, and prototyping applications for deployment on any IBM server; the software cannot be used in production.

b. Oracle Database
Oracle Database (Oracle DB) is a relational database management system (RDBMS) from Oracle Corporation. Originally developed in 1977 by Lawrence Ellison and other developers, Oracle DB is one of the most trusted and widely used relational database engines.

The system is built around a relational database framework in which data objects
may be directly accessed by users (or an application front end) through
structured query language (SQL). Oracle is a fully scalable relational database
architecture and is often used by global enterprises, which manage and process
data across wide and local area networks. The Oracle database has its own
network component to allow communications across networks.

Oracle DB is also known as Oracle RDBMS and, sometimes, just Oracle.

There are other database offerings, but most of these command a tiny market
share compared to Oracle DB and SQL Server. Fortunately, the structures of
Oracle DB and SQL Server are quite similar, which is a benefit when learning
database administration.

Oracle DB runs on most major platforms, including Windows, UNIX, Linux and
Mac OS. Different software versions are available, based on requirements and
budget. Oracle DB editions are hierarchically broken down as follows:
● Enterprise Edition: Offers all features, including superior performance and security; it is the most robust edition.
● Standard Edition: Contains base functionality for users who do not require the Enterprise Edition's robust package.
● Express Edition (XE): The lightweight, free, and limited Windows and Linux edition.
● Oracle Lite: For mobile devices.
A key feature of Oracle is that its architecture is split between the logical and the
physical. This structure means that for large-scale distributed computing, also
known as grid computing, the data location is irrelevant and transparent to the
user, allowing for a more modular physical structure that can be added to and
altered without affecting the activity of the database, its data or users. The
sharing of resources in this way allows for very flexible data networks whose
capacity can be adjusted up or down to suit demand, without degradation of
service. It also allows for a robust system to be devised as there is no single
point at which a failure can bring down the database, as the networked schema
of the storage resources means that any failure would be local only.

c. Microsoft SQL Server
Microsoft SQL Server is a relational database management system developed by Microsoft. As a database server, it is a software product with the
primary function of storing and retrieving data as requested by other software
applications—which may run either on the same computer or on another computer
across a network (including the Internet).
Microsoft markets at least a dozen different editions of Microsoft SQL Server, aimed
at different audiences and for workloads ranging from small single-machine
applications to large Internet-facing applications with many concurrent users.
Mainstream editions
Enterprise
Standard
Web
Business Intelligence
Workgroup
Express
Specialized editions
Azure
Compact (SQL CE)
Developer
Embedded (SSEE)
Evaluation
Fast Track
LocalDB
Analytics Platform System (APS)
Datawarehouse Appliance Edition
Discontinued editions
MSDE
Personal Edition
Datacenter

The protocol layer implements the external interface to SQL Server. All operations that
can be invoked on SQL Server are communicated to it via a Microsoft-defined format,
called Tabular Data Stream (TDS). TDS is an application layer protocol, used to transfer
data between a database server and a client. Initially designed and developed by
Sybase Inc. for their Sybase SQL Server relational database engine in 1984, and later
by Microsoft in Microsoft SQL Server, TDS packets can be encased in other physical
transport dependent protocols, including TCP/IP, named pipes, and shared memory.
Consequently, access to SQL Server is available over these protocols. In addition, the
SQL Server API is also exposed over web services.
