
TRAJECTORY EDUCATION

SCHOOL OF COMPUTER SCIENCE THE NO 1 INSTITUTE FOR UGC-NET IN COMPUTER SCIENCE


1. INTRODUCTION

A Database Management System (DBMS) is a set of computer programs that controls the creation, maintenance, and use of the database of an organization and its end users. It allows organizations to place control of organization-wide database development in the hands of database administrators (DBAs) and other specialists. A DBMS may use any of a variety of database models, such as the network model or the relational model. In large systems, a DBMS allows users and other software to store and retrieve data in a structured way. It helps to specify the logical organization of a database and to access and use the information within it. It provides facilities for controlling data access, enforcing data integrity, managing concurrency control, and restoring the database.

The first DBMSs appeared during the 1960's, at a time when projects of momentous scale were being contemplated, planned and engineered. Never before had such large datasets been assembled in this new technology. Problems on the floor were identified and solutions were researched and developed, often in real time. The DBMS became necessary because the data was far more volatile than had earlier been planned, and because the cost of data storage media was still a major limiting factor. Data grew as a collection, and it also needed to be managed at a detailed, transaction-by-transaction level. In the 1980's all the major vendors of hardware systems large enough to support the evolving computerized record keeping needs of larger organizations bundled some form of DBMS with their system solution. The first DBMSs were thus very much vendor specific. IBM as usual led the field, but there was a growing number of competitors and clones whose database solutions offered varying entry points into computerized record keeping.

Main Office, 126 2nd Floor, Kingsway Camp, Delhi-09, 011-47041845, www.trajectoryeducation.com/socm Page no. 1

1.1. DBMS Definitions
Some of the technical terms of DBMS are defined below:

1.1.1. Database

A database is a logically coherent collection of data with some inherent meaning, representing some aspect of the real world, which is designed, built and populated with data for a specific purpose. Ex: consider the names, telephone numbers and addresses of the people you know. You can record this data in an indexed address book. To maintain a database we generally use software such as dBASE IV, MS Access or Excel.

1.1.2. DBMS

It is a collection of programs that enables users to create and maintain a database. In other words, it is general-purpose software that provides users with the processes of defining, constructing and manipulating the database for various applications.

1.1.3. Database system

The database and the DBMS software together are called a database system.

1.2. Components of database
1.2.1. Database administrator (DBA)
In any organization where many persons use the same resources, there is a need for a chief administrator to manage these resources. In a database environment, the primary resource is the database itself and the secondary resource is the DBMS and related software. Managing these resources is the job of the database administrator. The DBA is responsible for authorizing access to the database and for acquiring software and hardware resources as needed.

1.2.2. Database designer
Database designers are responsible for identifying the data to be stored in the database and for choosing appropriate structures to represent and store this data. It is the responsibility of the database designer to communicate with the database users and to understand their requirements.

1.2.3. End users
End users are the persons whose jobs require access to the database for querying, updating and generating reports. The database primarily exists for their use. There are several categories of end users:
A. Casual end users: occasionally access the database, but may need different information each time.
B. Parametric end users: make up a sizable portion of database end users. Their main job function involves constantly querying and updating the database, using standard types of queries and updates called canned transactions that have been carefully programmed and tested. Ex: bank tellers check account balances and post withdrawals and deposits.
C. Sophisticated end users: include engineers, scientists and business analysts who thoroughly familiarize themselves with the facilities of the DBMS so as to implement applications that meet their complex requirements.

D. Stand-alone end users: maintain personal databases by using ready-made software packages that provide easy-to-use menu-based or graphical interfaces. Ex: tax packages that store a variety of personal financial data for tax purposes.
E. System analysts and application programmers: system analysts determine the requirements of end users, especially parametric end users, and develop specifications for canned transactions that meet these requirements. Application programmers implement these specifications as programs; then they test, debug, document and maintain these canned transactions. Such analysts and programmers are commonly known as software engineers.

1.3. Advantages of DBMS
1. Controlling redundancy
2. Restricting unauthorized access
3. Providing persistent storage for program objects and data structures
4. Database interfacing
5. Providing multiple user interfaces
6. Representing complex relationships among data
7. Enforcing integrity constraints
8. Providing backup and recovery

1.4. Disadvantages of File Processing System
1. Data redundancy and inconsistency.
2. Difficulty in accessing data.
3. Data isolation.
4. Data integrity problems.
5. Concurrent access is not possible.
6. Security problems.

2. DATA MODELS

A data model is a set of concepts that can be used to describe the structure of a database.

By the structure of a database we mean the data types, relationships and constraints that should hold for the data. Most data models also include a set of basic operations for specifying modifications on the data.

2.1. Categories of data models
A. High-level or conceptual data models: describe data the way users perceive it. A high-level data model uses concepts such as entities, attributes and relationships.
   Entity: represents a real-world object, such as an employee or a project, that is stored in the database.
   Attribute: represents some property of interest that further describes an entity, such as an employee's name or salary.
   Relationship: represents an association between two or more entities.
B. Low-level or physical data models: describe how the data is stored in the computer.
C. Representational or implementation data models: hide some of the details of data storage but can be implemented on a computer system in a direct way.

2.2. Schemas and instances
The description of a database is called the database schema. The database schema is specified during database design and is not expected to change frequently. A displayed schema is called a schema diagram.

The actual data in the database may change frequently. For example, the database changes every time we add a new student or enter a new grade for a student. The data in the database at a particular moment in time is called the database state, instance or snapshot.

2.3. DBMS architecture
Three important characteristics of the database approach are:
1. Insulation of programs and data
2. Support of multiple user views
3. Use of a catalogue to store the database schema

The architecture of a database system is called the three-schema architecture. It defines schemas at three levels:
1. Internal schema
2. Conceptual schema
3. External schema

1. Internal schema: describes the physical storage structure of the database. The internal schema uses a physical data model and describes the complete details of data storage and access paths for the database.
2. Conceptual schema: describes the structure of the whole database for a community of users. The conceptual schema hides the details of the physical storage structure. A high-level data model or an implementation data model can be used at this level.
3. External schema: describes the part of the database that a particular user group is interested in and hides the rest of the database from that user group.

2.4. Data independence
The three-schema architecture can be used to explain the concept of data independence, which is defined as the capacity to change the schema at one level of a database system without having to change the schema at the next higher level. There are two types of data independence:

2.4.1. Logical data independence

This is the capacity to change the conceptual schema without having to change the external schemas or application programs. We may change the conceptual schema to expand the database (for example, by adding a data item) or to reduce it (by removing one).
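Logical data independence can be illustrated with a small sketch. It assumes an SQLite database reached through Python's standard sqlite3 module; the table student, the view student_names and the function read_names are hypothetical names chosen for this example. The view plays the role of an external schema, and the application code keeps working after the conceptual schema is expanded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema: one base table (hypothetical example data).
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, grade TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Asha', 'A')")

# External schema: a view that exposes only part of the base table.
conn.execute("CREATE VIEW student_names AS SELECT roll_no, name FROM student")


def read_names(connection):
    """Application code written against the external schema only."""
    query = "SELECT roll_no, name FROM student_names ORDER BY roll_no"
    return connection.execute(query).fetchall()


before = read_names(conn)

# Expand the conceptual schema by adding a data item.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")

# The application code and the view definition are unchanged.
after = read_names(conn)
print(before == after)
```

Because read_names refers only to the view, the added column never reaches it, which is the behaviour the definition above describes.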

2.4.2. Physical data independence
This is the capacity to change the internal schema without having to change the conceptual or external schemas. Changes to the internal schema may be needed because some physical files have to be reorganized. Ex: creating additional access structures to improve the performance of retrievals or updates.

2.5. Classification of database management system
We can categorize DBMSs according to the data model they use:
1. Relational data model
2. Network data model
3. Hierarchical data model
4. Object-oriented data model

2.5.1. Relational data model
The relational data model represents a database as a collection of tables, where each table can be stored as a separate file. Most relational databases have a high-level query language and support a limited form of user views.

2.5.2. Network data model
The network data model represents data as record types and also represents a limited type of 1:N relationship, called a set type.

2.5.3. Hierarchical data model
The hierarchical data model represents data as hierarchical tree structures. Each hierarchy represents a number of related records. There is no standard language for the hierarchical model.

2.5.4. Object-oriented data model
The object-oriented data model defines a database in terms of objects, their properties and their operations. Objects with the same structure and behavior belong to a class, and classes are organized into hierarchies or acyclic graphs.

2.6. Database languages and interfaces
2.6.1. DBMS languages
The first step is to specify the conceptual and internal schemas for the database and any mappings between the two. In many DBMSs where no strict separation of levels is maintained, one language, called the data definition language (DDL), is used by the DBA and by database designers to define both schemas. The DBMS has a DDL compiler, whose function is to process DDL statements in order to identify the descriptions of the schema constructs and to store the schema description in the DBMS catalog.

Where a clear separation is maintained between the conceptual schema and the internal schema:
A. The DDL is used to specify the conceptual schema only.
B. The SDL (storage definition language) is used to specify the internal schema only.
The mappings between the two schemas may be specified in either of these languages.

In some DBMSs a VDL (view definition language) is used to specify user views and their mappings to the conceptual schema. In most DBMSs, however, the DDL is used to specify both the conceptual and the external schemas.

Once the database schema is created and the database is filled with data, users must have some means to manipulate the database. Manipulations include:
1. Retrieval
2. Insertion
3. Deletion
4. Modification

For that purpose the DBMS provides a DML (data manipulation language).

2.6.1.1. DML (data manipulation language)

There are two main types of DML:
1. High-level or nonprocedural DML (such as SQL)
2. Low-level or procedural DML

1. High-level or nonprocedural DML: can be used to specify complex database operations concisely. Many DBMSs allow high-level DML statements either to be entered interactively from a terminal or to be embedded in a general-purpose programming language. In the latter case, DML statements must be identified within the program so that they can be extracted by a pre-compiler and processed by the DBMS. A high-level DML such as SQL can specify and retrieve many records in a single DML statement and is hence called a set-at-a-time or set-oriented DML.

2. Low-level or procedural DML: must be embedded in a general-purpose programming language. This type of DML typically retrieves individual records or objects from the database and processes each separately. Hence it needs programming language constructs, such as looping, to retrieve and process each record from a set of records. Low-level DMLs are also called record-at-a-time DMLs because of this property.

Whenever DML commands, whether high-level or low-level, are embedded in a general-purpose programming language, that language is called the host language and the DML is called the data sublanguage. On the other hand, a high-level DML used in a stand-alone interactive manner is called a query language.
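The difference between set-at-a-time and record-at-a-time processing can be sketched as follows. This is only an illustration under stated assumptions: Python is the host language, SQL (through the standard sqlite3 module) is the data sublanguage, and the account table and all function names are hypothetical. SQL itself is a high-level DML, so the second function merely imitates the record-at-a-time style by looping in the host language.

```python
import sqlite3


def make_db():
    """Create a small in-memory database with hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance REAL)")
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)",
        [(1, 100.0), (2, 200.0), (3, 300.0)],
    )
    return conn


def add_interest_set_at_a_time(conn, rate):
    # One declarative statement updates every record in the set.
    conn.execute("UPDATE account SET balance = balance * (1 + ?)", (rate,))


def add_interest_record_at_a_time(conn, rate):
    # The host language loops over the records and updates each one separately.
    rows = conn.execute("SELECT acc_no, balance FROM account").fetchall()
    for acc_no, balance in rows:
        conn.execute(
            "UPDATE account SET balance = ? WHERE acc_no = ?",
            (balance * (1 + rate), acc_no),
        )


def balances(conn):
    return [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY acc_no")]


db1, db2 = make_db(), make_db()
add_interest_set_at_a_time(db1, 0.5)
add_interest_record_at_a_time(db2, 0.5)
print(balances(db1) == balances(db2))
```

Both functions leave the database in the same state; the set-oriented version simply leaves the iteration to the DBMS.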

2.6.2. DBMS interfaces

User-friendly interfaces provided by a DBMS may include the following:

Menu-based interfaces: these interfaces present the user with lists of options, called menus, which lead the user through the formulation of a request. The query is composed step by step by picking options from menus displayed by the system.

Forms-based interfaces: a forms-based interface displays a form to each user. Users can fill out all of the form entries to insert new data, or they can fill out only certain entries. Forms are usually designed and programmed for parametric end users.

Graphical user interfaces: a GUI typically displays a schema to the user in diagrammatic form. The user can then specify a query by manipulating the diagram. Most GUIs use a pointing device, such as a mouse, to pick certain parts of the displayed schema.

Natural language interfaces: a natural language interface refers to the words in its schema, as well as to a set of standard words, to interpret a request. If the interpretation is successful, the interface generates a high-level query corresponding to the natural language request and submits it to the DBMS for processing.

Interfaces for parametric users: parametric users, such as bank tellers, often have a small set of operations that they must perform repeatedly. System analysts and programmers design and implement a special interface for parametric users, often binding each command to a key so that it runs with a minimum of keystrokes.

Interfaces for the DBA: these interfaces are used by the DBA staff. They include commands for creating accounts, setting system parameters, granting account authorization, changing a schema and reorganizing the storage structure of a database.

2.7. Database system environment
The database and the DBMS catalog are usually stored on disk. Access to the disk is controlled primarily by the operating system, which schedules disk input/output.

2.7.1. Data manager
This module of the DBMS:
A. Controls access to the DBMS information that is stored on disk.
B. Uses basic OS services to carry out low-level data transfer between the disk and main memory.
C. Handles buffers in main memory.

2.7.2. DDL compiler
It processes schema definitions specified in the DDL.
It stores the descriptions of the schemas in the DBMS catalog. The DBMS catalog includes the following information:
1. Names of the files
2. Data items
3. Storage details of each file
4. Mapping information

2.7.3. Run-time database processor
It handles database accesses at run time. It receives retrieval or update operations and carries them out on the database. Access to the disk goes through the stored data manager.

2.7.4. Query compiler
It handles high-level queries that are entered interactively and generates calls to the run-time processor for executing the code.

2.7.5. Pre-compiler
It extracts DML commands from an application program written in a host language. These commands are then sent to the DML compiler for compilation into object code.

2.8. Entity Relationship Model
Two terms play a major role in designing a successful database application:
1. Database application
2. Application program

Database application: refers to a particular database (for example, a bank database) and the associated programs that implement its queries and updates. Ex: programs that implement the database updates corresponding to customers making deposits and withdrawals. These programs provide user-friendly graphical interfaces (GUIs) utilizing forms. Part of the database application involves the design, implementation and testing of these application programs.

2.8.1. Entities and attributes

Entities: the basic object that the ER model represents is an entity. An entity may be an object with a physical existence (a particular person, car, house or employee) or an object with a conceptual existence (a company, a job or a university course).

Attribute: a particular property that describes an entity. Ex: an employee entity may be described by the employee's name, age, address, salary and job.

Composite attributes: a composite attribute can be divided into smaller subparts, which represent more basic attributes with independent meanings.

Simple or atomic attributes: attributes that are not divisible are called simple or atomic attributes.

Single-valued attributes: most attributes have a single value for a particular entity; such attributes are called single-valued. Ex: age is a single-valued attribute of a person.

Multi-valued attributes: attributes that may have more than one value for the same entity. Ex: the colors attribute of a car. A car with one color has a single value, whereas a two-tone car has multiple values. Such attributes are called multi-valued attributes.

Stored and derived attributes: in some cases two attribute values are related. Ex: the age and birth date of a person. The value of age can be determined from the current date and the value of the person's birth date. The age attribute is called a derived attribute and the birth date is called a stored attribute.

Null values: in some cases a particular entity may not have an applicable value for an attribute. Ex: the apartment number in the address of a person who lives in a single-family house.

Complex attributes: composite and multi-valued attributes can be nested. We represent a composite attribute by grouping its components between parentheses () and separating them by commas, and a multi-valued attribute by enclosing it in braces { }. Such attributes are called complex attributes. Ex:
    {Address_phone({Phone(Area_code, Phone_number)})}
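The attribute kinds above can be mirrored in a small sketch. It is not part of the ER model itself; Python dataclasses are used purely for illustration, and the class and field names (Name, Person, phones, apartment_number) are assumptions made for this example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Name:
    """Composite attribute: divisible into more basic attributes."""
    first: str
    last: str


@dataclass
class Person:
    name: Name                                       # composite attribute
    birth_date: date                                 # stored, single-valued attribute
    phones: List[str] = field(default_factory=list)  # multi-valued attribute
    apartment_number: Optional[str] = None           # may be null (not applicable)

    def age(self, today: date) -> int:
        """Derived attribute: computed from the stored birth date."""
        years = today.year - self.birth_date.year
        # Subtract one year if the birthday has not yet occurred this year.
        if (today.month, today.day) < (self.birth_date.month, self.birth_date.day):
            years -= 1
        return years


p = Person(Name("Asha", "Rao"), date(2000, 6, 15), phones=["011-5550100", "011-5550101"])
print(p.age(date(2024, 6, 14)), p.age(date(2024, 6, 15)))
```

Keeping age as a method rather than a stored field reflects the point made above: a derived attribute is never stored, so it cannot become inconsistent with the birth date.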

2.8.2. Entity types, entity sets, keys and value sets
Entity types: an entity type defines a collection (or set) of entities that have the same attributes. Each entity type in the database is described by its name and attributes.

Entity sets: the collection of all entities of a particular entity type in the database at any point in time is called an entity set.

Key attributes: an entity type usually has an attribute whose values are distinct for each individual entity in the entity set. Such an attribute is called a key attribute.

Value sets (domains of attributes): each simple attribute of an entity type is associated with a value set (or domain of values), which specifies the set of values that may be assigned to that attribute for each individual entity. Ex: the age of an employee may be specified to lie in the range 16 to 70.

2.8.3. Relationship types, sets and instances
An association among entities is called a relationship. A relationship type R among n entity types E1, E2, ..., En defines a set of associations among entities from these entity types. In other words, the relationship set R is a set of relationship instances.

Degree of a relationship type: the degree of a relationship type is the number of participating entity types. Ex: the works-for relationship between employee and department is of degree two.
    Degree two: binary relationship
    Degree three: ternary relationship

Role names: each entity type that participates in a relationship type plays a particular role in the relationship. The role name specifies the role that a participating entity from the entity type plays in each relationship instance and helps to explain what the relationship means.

Recursive relationships: role names are not necessary where all the participating entity types are distinct, since each entity type name can be used as the role name.

In some cases, the same entity type participates more than once in a relationship type, in different roles. In such cases the role name becomes essential for distinguishing the meaning of each participation. Such relationship types are called recursive relationships. Ex: in a supervision relationship, the employee and supervisor entities are both members of the same employee entity type.

Weak entity types: entity types that do not have key attributes of their own are called weak entity types. A weak entity type is sometimes called the child entity type.

Regular or strong entity types: entity types that have a key attribute are called regular or strong entity types. The identifying entity type of a weak entity type is also sometimes called the parent entity type or dominant entity type.

2.8.4. Notations for ER diagram
[Table: Symbols | Meaning]

2.8.5. Generalization
Generalization can be thought of as a reverse process of abstraction in which we suppress the differences among several entity types, identify their common features and generalize them into a single superclass.

2.8.6. Aggregation
Aggregation is an abstraction concept for building composite objects from their component objects. There are three cases where this concept can be used and related to the EER model:
1. Where we aggregate the attribute values of an object to form the whole object.
2. Where we represent an aggregation relationship as an ordinary relationship.
3. Where we combine objects that are related by a particular relationship instance.

3. RELATIONAL MODEL

The relational model represents the database as a collection of relations. A relation can be thought of as a table of values, where each row in the table represents a collection of related data values. Each row in the table corresponds to an entity or a relationship. In relational model terminology, a row is called a tuple, a column is called an attribute, and the table is called a relation. The data type describing the values that can appear in each column is called a domain.

Domain: a domain D is a set of atomic values. Atomic means that each value in the domain is indivisible. Ex: USA_phone_number is the set of 10-digit phone numbers.

Relation schema: a relation schema R is denoted by R(A1, A2, A3, ..., An), where R is the relation name and Ai, for i = 1, 2, 3, ..., n, are the attributes. Ex:
    Student (name, SSN, home_phone, address, office_phone, age)

3.1. Characteristics of relation
1. Ordering of tuples in a relation: a relation is defined as a set of tuples. Tuples in a relation do not have any specific order.
2. Ordering of values within a tuple: an n-tuple is an ordered list of n values, so the ordering of values in a tuple, and hence of attributes in a relation schema, is significant.

3. Values in the tuples: each value in a tuple is an atomic value, i.e. it is not divisible into components. Composite and multi-valued attributes are therefore not allowed in the relational model.
4. Interpretation of a relation: the relation schema can be interpreted as a declaration or a type of assertion.
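The first two characteristics can be checked with a short sketch. It models a relation as a Python set of named tuples, which is an assumption made only for illustration; the Student type and its data are hypothetical.

```python
from collections import namedtuple

# Each tuple is an ordered list of atomic values, one per attribute.
Student = namedtuple("Student", ["name", "ssn", "age"])

# A relation is a set of tuples: listing order is irrelevant and duplicates collapse.
r1 = {Student("Asha", "111", 20), Student("Ravi", "222", 22)}
r2 = {
    Student("Ravi", "222", 22),
    Student("Asha", "111", 20),
    Student("Asha", "111", 20),  # duplicate tuple, kept only once
}

same_relation = r1 == r2
print(same_relation, len(r2))
```

The two sets compare equal even though their tuples were listed in a different order, and the repeated tuple appears only once.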

Relational constraints: these are the restrictions that apply to the database schema. They include the following.

Domain constraints: specify that the value of each attribute must be an atomic value from the domain of that attribute.

Key constraints: a relation is defined as a set of tuples, and all elements of a set are distinct. Hence all tuples in a relation must be distinct: no two tuples can have the same combination of values for all their attributes.

Entity integrity constraints: no primary key value can be null, because the primary key is used to identify individual tuples in a relation.

Referential integrity constraints: are specified between two relations and are used to maintain consistency among the tuples of the two relations. They are based on the foreign key concept.

3.2. Operations of the relation model
Operations on the relational model can be categorized into retrievals and updates. There are three basic update operations on relations.

Insert operation: provides a list of attribute values for a new tuple t that is to be inserted into a relation R.

Delete operation: is used to delete tuples from a relation. A delete can violate referential integrity if the tuple being deleted is referenced by foreign keys from other tuples in the database. We use a condition to select the tuples to delete. Ex:
    DELETE FROM employee WHERE SSN = '985676';

Update operation: is used to change the values of one or more attributes in a tuple of a relation R. Ex:
    UPDATE employee SET age = 25 WHERE SSN = '576787';

3.3. Relational algebra operation

1. Select operation: is used to select the subset of tuples from a relation that satisfy a selection condition; that is, it selects some of the rows of a relation.

2. Project operation: is used to select some of the columns (a set of attributes) from a relation.

3. Rename operation: is used to rename the relation, its attributes, or both.
    Rename (old table name) to (new table name)

3.4. Set theoretic operation
Several set theoretic operations are used to merge the elements of two sets in various ways. These operations are as follows.

3.4.1. Union
The result of this operation, denoted by R ∪ S, is a relation that includes all tuples that are in R, in S, or in both R and S. Duplicate tuples are eliminated. Union is a commutative operation: R ∪ S = S ∪ R.
    SELECT salesman "ID", name FROM sales_master WHERE city = 'Mumbai'
    UNION
    SELECT client "ID", name FROM client_master WHERE city = 'Mumbai';

3.4.1.1. Restrictions on using a union operation

1. The number of columns in all the queries must be the same.
2. The data types of the corresponding columns in each query must be the same.
3. Union cannot be used in a subquery.
4. Aggregate functions cannot be used with the union clause.

3.4.2. Intersection
The result of this operation, denoted by R ∩ S, is a relation that includes all tuples that are in both R and S.
    SELECT salesman "ID", name FROM sales_master WHERE city = 'Mumbai'
    INTERSECT
    SELECT client "ID", name FROM client_master WHERE city = 'Mumbai';

3.4.3. Set difference
The result of this operation, denoted by R − S, is a relation that includes all tuples that are in R but not in S.
    SELECT product_no FROM product_master
    MINUS
    SELECT product_no FROM sales_order;
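The three set operations can be tried out with the following sketch. It assumes SQLite accessed through Python's sqlite3 module, and the tables r and s with their data are hypothetical. Note that SQLite spells set difference EXCEPT, whereas the MINUS keyword is Oracle syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE r (name TEXT);
    CREATE TABLE s (name TEXT);
    INSERT INTO r VALUES ('Asha'), ('Ravi'), ('Meena');
    INSERT INTO s VALUES ('Ravi'), ('Meena'), ('John');
    """
)


def names(sql):
    """Run a query and return its single column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


union = names("SELECT name FROM r UNION SELECT name FROM s")
intersection = names("SELECT name FROM r INTERSECT SELECT name FROM s")
difference = names("SELECT name FROM r EXCEPT SELECT name FROM s")

print(union)
print(intersection)
print(difference)
```

Both operands have the same number of columns with the same type, which is the compatibility requirement for all three operations.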

3.4.4. Join operation
The join operation, denoted by ⋈, is used to combine related tuples from two relations into single tuples. This operation is very important because it allows us to process relationships among relations.

    R ⋈ (join condition) S
There are several categories of join operations.
1. Cartesian product (cross product or cross join): combines every tuple of R with every tuple of S. The main difference between the Cartesian product and a join is that in a join only the combinations of tuples that satisfy the join condition appear in the result.
2. Equijoin: a join in which the only comparison operator used is =. In the result of an equijoin, each pair of join attributes has identical values, so one attribute of each pair is superfluous. The natural join, denoted by R * S, removes the superfluous attribute.
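The join categories listed above can be compared side by side in a small sketch. It assumes SQLite through Python's sqlite3 module; the employee and department tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (name TEXT, dno INTEGER);
    CREATE TABLE department (dno INTEGER, dname TEXT);
    INSERT INTO employee VALUES ('Asha', 1), ('Ravi', 2), ('Meena', 1);
    INSERT INTO department VALUES (1, 'Research'), (2, 'Sales');
    """
)

# Cartesian product: every employee tuple paired with every department tuple.
cross = conn.execute("SELECT * FROM employee CROSS JOIN department").fetchall()

# Equijoin: only pairs with equal dno; both dno columns appear in the result.
equi = conn.execute(
    "SELECT * FROM employee e JOIN department d ON e.dno = d.dno"
).fetchall()

# Natural join: same pairs, but the duplicate dno column is removed.
natural = conn.execute("SELECT * FROM employee NATURAL JOIN department").fetchall()

print(len(cross), len(equi), len(natural))
print(sorted(natural))
```

The equijoin rows carry four values and the natural join rows carry three, which makes the superfluous attribute visible.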

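3.4.5. Division operation
The division operation is used for a special kind of query that sometimes occurs in database applications. Ex: retrieve the names of the employees who work on all the projects in a given set of projects.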
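A minimal sketch of division follows, assuming relations are modelled as Python sets; the function divide and the works_on and projects data are hypothetical names for this illustration. Dividing a relation of (x, y) pairs by a set of y values returns every x that is paired with all of those y values.

```python
def divide(r, s):
    """Return the x values that appear in r together with every y in s.

    r is a set of (x, y) pairs and s is a set of y values.
    """
    candidates = {x for (x, _) in r}
    return {x for x in candidates if all((x, y) in r for y in s)}


# Which employees work on all of the listed projects?
works_on = {
    ("Asha", "P1"), ("Asha", "P2"),
    ("Ravi", "P1"),
    ("Meena", "P1"), ("Meena", "P2"),
}
projects = {"P1", "P2"}

result = divide(works_on, projects)
print(sorted(result))
```

Ravi is excluded because he works on only one of the two projects.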

3.4.6. Aggregate function
Aggregate functions are applied to collections of values from the database. These functions are as follows:

SUM, AVERAGE, MAX, MIN

3.4.7. COUNT
This function is used to count tuples or attribute values.

3.4.8. Grouping
Grouping is used to group the tuples of a relation by the values of some of their attributes, so that an aggregate function can be applied to each group.
    SELECT company, SUM(amount) FROM sales GROUP BY company HAVING SUM(amount) > 10000;
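The grouping query above, together with the aggregate functions, can be run in a short sketch. It assumes SQLite through Python's sqlite3 module, and the rows in the sales table are hypothetical data invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (company TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('Acme', 6000), ('Acme', 7000),
        ('Zenith', 4000), ('Zenith', 5000),
        ('Orbit', 12000);
    """
)

# Aggregates over the whole relation.
totals = conn.execute(
    "SELECT COUNT(*), SUM(amount), MAX(amount), MIN(amount) FROM sales"
).fetchone()

# Aggregates per group, keeping only the groups that pass the HAVING condition.
big_groups = conn.execute(
    """
    SELECT company, SUM(amount)
    FROM sales
    GROUP BY company
    HAVING SUM(amount) > 10000
    ORDER BY company
    """
).fetchall()

print(totals)
print(big_groups)
```

WHERE would filter individual tuples before grouping, while HAVING filters whole groups after the aggregate has been computed.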

3.4.9. Recursive closure operation
This operation is applied to a recursive relationship. Ex: retrieving all the employees supervised, directly or indirectly, by a given employee.
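One way to compute a recursive closure in SQL is a recursive common table expression, sketched below. This assumes SQLite (which supports WITH RECURSIVE) through Python's sqlite3 module; the employee table, its super_ssn column and the function all_subordinates are hypothetical names for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT, super_ssn TEXT);
    INSERT INTO employee VALUES
        ('1', 'Boss', NULL),
        ('2', 'Asha', '1'),
        ('3', 'Ravi', '2'),
        ('4', 'Meena', '3'),
        ('5', 'John', '1');
    """
)


def all_subordinates(connection, boss_ssn):
    """Return the SSNs of everyone supervised by boss_ssn at any level."""
    sql = """
        WITH RECURSIVE sub(ssn) AS (
            -- direct subordinates
            SELECT ssn FROM employee WHERE super_ssn = ?
            UNION
            -- subordinates of employees already found
            SELECT e.ssn FROM employee e JOIN sub ON e.super_ssn = sub.ssn
        )
        SELECT ssn FROM sub ORDER BY ssn
    """
    return [row[0] for row in connection.execute(sql, (boss_ssn,))]


print(all_subordinates(conn, "2"))
print(all_subordinates(conn, "1"))
```

The query keeps joining the employee relation with the rows found so far until no new subordinates appear, which is the closure of the supervision relationship.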

3.4.10. Outer join
Natural join is denoted by R * S, where R and S are relations. Only tuples from R that have matching tuples in S are selected in the result; tuples without a match are eliminated. Tuples with null values in the join attributes are also eliminated. A set of operations called outer joins can be used when we want to keep all the tuples of R, or of S, or of both relations in the result, whether or not they have matching tuples in the other relation.

Left outer join: R ⟕ S keeps every tuple of R; if no matching tuple is found in S, the attributes of S in the result are set to null.
Right outer join: R ⟖ S keeps every tuple of S; if no matching tuple is found in R, the attributes of R in the result are set to null.
Full outer join: R ⟗ S keeps all tuples of both relations; where no match is found, the missing attributes are set to null.

Outer union: is used to take the union of tuples from two relations that are not union compatible. The relations must be partially compatible, meaning that only some of their attributes are union compatible. These attributes should include a key for both relations. Ex:
    Student (name, SSN, department, advisor)
    Faculty (name, SSN, department, rank)
    Result (name, SSN, department, advisor, rank)
All the tuples of both relations appear in the result.

3.5. Tuple relational calculus
Relational calculus is a formal query language in which we write one declarative expression to specify a retrieval request; hence there is no description of how to evaluate the query. Tuple relational calculus is based on specifying a number of tuple variables. Each tuple variable ranges over a particular relation, meaning that it may take as its value any individual tuple from that relation. A simple tuple relational calculus query is of the form
    {t | cond(t)}

The result is the set of all tuples t that satisfy cond(t). Example: find all employees whose salary is greater than 50,000.

{t | employee(t) and t.salary > 50000}
The condition employee(t) specifies that the range relation of the tuple variable t is employee. To retrieve only some of the attributes, we write
{t.fname, t.lname | employee(t) and t.salary > 50000}
This notation resembles how attribute names are qualified with relation names or aliases in SQL:
select t.fname, t.lname from employee as t where t.salary > 50000;
3.5.1. Expressions and formulas in tuple calculus
A general expression of the tuple relational calculus is of the form
{t1.A1, t2.A2, ..., tn.An | cond(t1, t2, ..., tn)}
where t1, t2, ..., tn are tuple variables, each Ai is an attribute of the relation on which ti ranges, and cond is a condition or formula.
Formula: a formula is made up of predicate calculus atoms, each of which can be one of the following.
1. An atom of the form R(ti), where R is a relation name and ti is a tuple variable. It identifies the range of the tuple variable ti as the relation whose name is R.
2. An atom of the form ti.A op tj.B, where op is one of the comparison operators in the set {=, <, <=, >, >=, ≠}, ti and tj are tuple variables, A is an attribute of the relation on which ti ranges, and B is an attribute of the relation on which tj ranges.
3. An atom of the form ti.A op c or c op tj.B, where op, ti, tj, A and B are as above and c is a constant value.
A formula is made up of one or more atoms connected via the logical operators AND, OR and NOT, and is defined as follows.
1. Every atom is a formula.
2. If f1 and f2 are formulas, then so are (f1 AND f2), (f1 OR f2), NOT (f1) and NOT (f2).



3. The truth values of these formulas are derived from their component formulas f1 and f2 as follows.
a. (f1 AND f2) is true if both f1 and f2 are true; otherwise it is false.
b. (f1 OR f2) is false if both f1 and f2 are false; otherwise it is true.
c. NOT (f1) is true if f1 is false; it is false if f1 is true.
d. NOT (f2) is true if f2 is false; it is false if f2 is true.
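As an illustration, the salary query of section 3.5 can be read almost literally as a set comprehension: the range atom employee(t) becomes the iteration, and the rest of the condition becomes the filter. This is only a sketch; the sample rows are hypothetical, and only the attribute names fname, lname and salary come from the text.

```python
# Hypothetical sample state of the employee relation.
employee = [
    {"fname": "John", "lname": "Smith", "salary": 30000},
    {"fname": "Jennifer", "lname": "Wallace", "salary": 63000},
    {"fname": "James", "lname": "Borg", "salary": 55000},
]

# {t | employee(t) and t.salary > 50000}
high_paid = [t for t in employee if t["salary"] > 50000]

# {t.fname, t.lname | employee(t) and t.salary > 50000}
names = {(t["fname"], t["lname"]) for t in employee if t["salary"] > 50000}

print(names)
```

Like the calculus expression, the comprehension states which tuples are wanted and says nothing about how the relation should be scanned.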

3.5.2. Existential and universal quantifiers
Two special symbols called quantifiers can appear in formulas:
1. the universal quantifier (∀)
2. the existential quantifier (∃)
First we need to define the concept of free and bound tuple variables in a formula.
Bound: a tuple variable t is bound if it is quantified, meaning that it appears in an (∃t) or (∀t) clause.
Free: otherwise it is free.
We define an occurrence of a tuple variable in a formula as free or bound according to the following rules.
1. An occurrence of a tuple variable in a formula F that is an atom is free in F.
2. An occurrence of a tuple variable t is free or bound in a formula made up of logical connectives, (f1 AND f2), (f1 OR f2), NOT (f1) and NOT (f2), depending on whether it is free or bound in f1 or f2. In a formula of the form (f1 AND f2) or (f1 OR f2), a tuple variable may be free in f1 and bound in f2, or vice versa.
3. All free occurrences of a tuple variable t in f are bound in a formula F of the form F = (∃t)(f) or F = (∀t)(f). The tuple variable is bound to the quantifier specified in F.
Example:
F1 = d.dname = 'Research'
F2 = (∃t)(d.dnumber = t.dno)
F3 = (∀d)(d.mgrssn = '12345677')
The tuple variable d is free in both F1 and F2, whereas it is bound to the universal quantifier in F3. The variable t is bound to the existential quantifier in F2.

3.5.3. Rules for the definition of a formula
1. If f is a formula, then so is (∃t)(f), where t is a tuple variable. The formula (∃t)(f) is true if f evaluates to true for some (at least one) tuple assigned to the free occurrences of t in f; otherwise (∃t)(f) is false.
2. If f is a formula, then so is (∀t)(f), where t is a tuple variable. The formula (∀t)(f) is true if f evaluates to true for every tuple (in the universe) assigned to the free occurrences of t in f; otherwise (∀t)(f) is false.
Note: ∃ is called the existential quantifier because a formula (∃t)(f) is true if there exists some tuple that makes f true. ∀ is called the universal quantifier because (∀t)(f) is true only if every possible tuple that can be assigned to the free occurrences of t in f makes f true.

3.6. Transforming the universal and existential quantifiers
We now introduce some well-known transformations from mathematical logic that relate the universal and existential quantifiers. It is possible to transform a universal quantifier into an existential quantifier, and vice versa, to get an equivalent expression.
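The transformations meant here are the standard equivalences of predicate logic. The list below is a reminder written for generic predicates P and Q, which are placeholders and not names from the text. Each line follows from De Morgan's laws: move a negation across the quantifier, flip the quantifier, and exchange AND with OR.

```latex
\begin{align*}
(\forall x)\,(P(x)) &\equiv \lnot(\exists x)\,(\lnot P(x)) \\
(\exists x)\,(P(x)) &\equiv \lnot(\forall x)\,(\lnot P(x)) \\
(\forall x)\,(P(x) \land Q(x)) &\equiv \lnot(\exists x)\,(\lnot P(x) \lor \lnot Q(x)) \\
(\forall x)\,(P(x) \lor Q(x)) &\equiv \lnot(\exists x)\,(\lnot P(x) \land \lnot Q(x)) \\
(\exists x)\,(P(x) \lor Q(x)) &\equiv \lnot(\forall x)\,(\lnot P(x) \land \lnot Q(x)) \\
(\exists x)\,(P(x) \land Q(x)) &\equiv \lnot(\forall x)\,(\lnot P(x) \lor \lnot Q(x))
\end{align*}
```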

3.6.1. Domain relational calculus
There is another type of relational calculus called the domain relational calculus, or simply domain calculus. The QBE language is related to domain calculus, although the formal specification of domain calculus was proposed after the development of QBE. Domain calculus differs from tuple calculus in the type of variables used in formulas: rather than ranging over tuples, the variables range over single values from the domains of attributes. An expression of the domain calculus is of the form
{x1, x2, ..., xn | cond(x1, x2, ..., xn, xn+1, ..., xn+m)}
where x1, x2, ..., xn, xn+1, ..., xn+m are domain variables that range over domains of attributes, and cond is a condition or formula of the domain relational calculus.


A formula is made up of atoms. An atom can be one of the following.
1. An atom of the form R(x1, x2, ..., xj), where R is the name of a relation of degree j and each xi, 1 <= i <= j, is a domain variable.
2. An atom of the form xi op xj, where op is one of the comparison operators in the set {=, <, <=, >, >=, ≠} and xi and xj are domain variables.
3. An atom of the form xi op c or c op xj, where op is one of the comparison operators in the same set, xi and xj are domain variables, and c is a constant value.

4.

DATABASE DESIGN

Conceptual database design gives us a set of relational schemas and integrity constraints (ICs) that can be regarded as a good starting point for the final database design. This initial design must be refined by taking the ICs into account more fully than is possible with just the ER model constructs, and also by considering performance criteria and typical workloads. We concentrate on an important class of constraints called functional dependencies. Other kinds of ICs, for example multi-valued dependencies and join dependencies, also provide useful information. They can sometimes reveal redundancies that cannot be detected using functional dependencies alone.
4.1. Schema Refinement
Redundant storage of information is the root cause of the problems that schema refinement addresses. Although decomposition can eliminate redundancy, it can lead to problems of its own and should be used with caution.
4.1.1. Guidelines for relation schema

1. Semantics of the attributes: every attribute in a relation must belong to that relation, since a relation is a collection of attributes with a definite meaning. Semantics means how the attribute values in a tuple relate to one another. Example: (ename, ssn, bdate, address, dnumber); each attribute gives information about an employee.
2. Redundant information in tuples: to make the best use of storage space, we avoid storing redundant information in a relation. Redundancy also leads to update anomalies:
Insertion anomalies
Deletion anomalies
Modification anomalies
3. Reducing null values in tuples: nulls can waste space at the storage level and may create problems with understanding the meaning of the attributes. Null values can have multiple interpretations:
The attribute does not apply to this tuple.
The attribute value for this tuple is unknown.
The value is known but has not been recorded yet.
4. Spurious tuples: spurious tuples are tuples that represent wrong information; they appear when badly decomposed relations are joined back together. In the example, the spurious tuples are marked by asterisks (*). Example:
Emp_loc (ename, plocation)
Emp_proj (ssn, pno, hours, pname, plocation)

4.2. Functional Dependencies
A functional dependency is denoted by X → Y, between two sets of attributes X and Y. It specifies a constraint on the possible tuples that can form a relation state r of R: for any two tuples t1 and t2 in r that have
t1[X] = t2[X]
we must also have
t1[Y] = t2[Y]
This means that the values of the Y component of a tuple depend on, or are determined by, the values of the X component.
X is called the left-hand side of the FD.
Y is called the right-hand side of the FD.
X functionally determines Y in a relation schema R if and only if, whenever two tuples of r(R) agree on their X-values, they must also agree on their Y-values.
1. If X is a candidate key of R, then X → Y holds for any subset of attributes Y of R, because the key constraint implies that no two tuples will have the same value of X.
2. If X → Y in R, this does not say whether or not Y → X in R.
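The definition can be tested directly on a relation state. The sketch below assumes a relation is a list of dictionaries; the function name fd_holds and the sample rows are illustrative and not from the text. Note that a state can only refute a dependency: an FD is a property of the schema, so finding no violation in one state does not prove that the FD holds in general.

```python
def fd_holds(rows, X, Y):
    """Return True if no two tuples agree on X while disagreeing on Y."""
    seen = {}  # X-value -> the Y-value first seen with it
    for t in rows:
        x_val = tuple(t[a] for a in X)
        y_val = tuple(t[a] for a in Y)
        # setdefault stores y_val on first sight, else returns the stored one
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


# Hypothetical relation state.
emp = [
    {"ssn": "111", "ename": "Smith", "dnumber": 5},
    {"ssn": "222", "ename": "Wong", "dnumber": 5},
    {"ssn": "333", "ename": "Zelaya", "dnumber": 4},
]

print(fd_holds(emp, ["ssn"], ["ename", "dnumber"]))  # ssn is a key here
print(fd_holds(emp, ["dnumber"], ["ename"]))         # violated by dnumber 5
```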


4.2.1. Inference rules for Functional Dependencies
The set of functional dependencies specified on a relation schema R is denoted by F. It is usually impossible to specify all possible functional dependencies that may hold; besides those in F, many others can be inferred from F. The set of all such dependencies is called the closure of F and is denoted by F+. Example:
F = {ssn → {ename, bdate, address, dnumber},
dnumber → {dname, dmgrssn}}
Some of the additional dependencies that can be inferred from F are:
ssn → {dname, dmgrssn}
ssn → ssn
dnumber → dname
The dependencies in F+ are also known as inferred dependencies. To determine a systematic way to infer dependencies, we use inference rules. The notation F ⊨ X → Y denotes that the functional dependency X → Y is inferred from the set of FDs F.

4.2.2. Axioms to check if FD holds
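The rules usually presented under this heading are Armstrong's inference rules. As a reminder, in the notation of section 4.2.1 and with X, Y, Z standing for sets of attributes, the three basic rules can be written as follows; other rules such as decomposition and union can be derived from them.

```latex
\begin{align*}
\text{IR1 (reflexive rule):} \quad & Y \subseteq X \;\Rightarrow\; X \to Y \\
\text{IR2 (augmentation rule):} \quad & \{X \to Y\} \models XZ \to YZ \\
\text{IR3 (transitive rule):} \quad & \{X \to Y,\; Y \to Z\} \models X \to Z
\end{align*}
```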


4.2.3. An Algorithm to Compute Attribute Closure X+ with respect to F
Let X be a subset of the attributes of a relation R and F be the set of functional dependencies that hold for R.
1. Create a hypergraph in which the nodes are the attributes of the relation in question.
2. Create hyperlinks for all functional dependencies in F.
3. Mark all attributes belonging to X.
4. Recursively continue marking unmarked attributes of the hypergraph that can be reached by a hyperlink with all ingoing edges being marked.
Result: X+ is the set of attributes that have been marked by this process.
4.2.3.1. Hypergraph for F
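The marking procedure of section 4.2.3 can also be sketched without drawing the hypergraph. In the sketch below the marked set is a Python set, each functional dependency is a (left side, right side) pair, and the recursive marking is replaced by a loop that repeats until nothing new can be marked. The function name is illustrative; F is the example set from section 4.2.1.

```python
def attribute_closure(X, fds):
    """Compute X+ under the FDs given as (lhs, rhs) pairs of attribute sets."""
    closure = set(X)  # step 3: mark the attributes of X
    changed = True
    while changed:    # step 4: keep marking until a fixed point is reached
        changed = False
        for lhs, rhs in fds:
            # a hyperlink can be followed once all its source nodes are marked
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


F = [
    ({"ssn"}, {"ename", "bdate", "address", "dnumber"}),
    ({"dnumber"}, {"dname", "dmgrssn"}),
]

print(sorted(attribute_closure({"ssn"}, F)))
print(sorted(attribute_closure({"dnumber"}, F)))
```

Since the closure of {ssn} contains dname and dmgrssn, the inferred dependency ssn → {dname, dmgrssn} is confirmed.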


4.3. NORMALIZATION
4.3.1. Basics of normal forms
A set of functional dependencies is specified for each relation. Normalization then proceeds in a top-down fashion, testing each relation against the criteria for the normal forms and decomposing relations as necessary. Initially Codd (1972) proposed 1NF, 2NF and 3NF. A stronger definition of 3NF, called Boyce-Codd normal form, was proposed later by Boyce and Codd. All these normal forms are based on the functional dependencies of a relation. Later, 4NF and 5NF were proposed, based on the concepts of multi-valued dependencies and join dependencies respectively.
4.3.1.1. 1NF (first normal form)
It was defined to disallow multi-valued attributes, composite attributes and their combinations. It states that the domain of an attribute must include only atomic values, and that the value of any attribute in a tuple must be a single value from the domain of that attribute.

4.3.1.2. 2NF (second normal form)



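Second normal form is based on the concept of full functional dependency. An FD X → Y is a full functional dependency if removal of any attribute A from X means that the dependency does not hold any more; that is, for any attribute A ∈ X, (X − {A}) does not functionally determine Y. X → Y is a partial dependency if some attribute A ∈ X can be removed from X and the dependency still holds. A relation schema R is in 2NF if every nonprime attribute of R is fully functionally dependent on the primary key of R.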

4.3.1.3. 3NF (third normal form)
It is based on the concept of transitive dependency. An FD X → Y in a relation schema R is a transitive dependency if there is a set of attributes Z that is neither a candidate key nor a subset of any key of R, and both
X → Z
Z → Y
hold. A relation schema R is in 3NF if it is in 2NF and no nonprime attribute of R is transitively dependent on the primary key.

4.3.1.4. Boyce-Codd normal form (BCNF)
Boyce-Codd normal form (or BCNF) is a normal form used in database normalization. It is a slightly stronger version of the third normal form (3NF). A table is in Boyce-Codd normal form if and only if, for every one of its non-trivial functional dependencies X → Y, X is a superkey; that is, X is either a candidate key or a superset thereof.


Only in rare cases does a 3NF table not meet the requirements of BCNF. A 3NF table which does not have multiple overlapping candidate keys is guaranteed to be in BCNF. Depending on what its functional dependencies are, a 3NF table with two or more overlapping candidate keys may or may not be in BCNF. An example of a 3NF table that does not meet BCNF is:

Today's Court Bookings
Court   Start Time   End Time   Rate Type
1       09:30        10:30      SAVER
1       11:00        12:00      SAVER
1       14:00        15:30      STANDARD
2       10:00        11:30      PREMIUM-B
2       11:30        13:30      PREMIUM-B
2       15:00        16:30      PREMIUM-A



Each row in the table represents a court booking at a tennis club that has one hard court (Court 1) and one grass court (Court 2).
A booking is defined by its Court and the period for which the Court is reserved.
Additionally, each booking has a Rate Type associated with it. There are four distinct rate types:

SAVER, for Court 1 bookings made by members
STANDARD, for Court 1 bookings made by non-members
PREMIUM-A, for Court 2 bookings made by members
PREMIUM-B, for Court 2 bookings made by non-members
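Why this table breaks BCNF can be checked on the sample rows. Each rate type belongs to exactly one court, so Rate Type → Court holds, but Rate Type alone does not identify a booking and so is not a superkey. The sketch below tests both facts on the six rows shown above; the helper name determines is illustrative, and a check on one table state only illustrates the dependency, it does not prove it for the schema.

```python
# Columns: 0 = Court, 1 = Start Time, 2 = End Time, 3 = Rate Type
bookings = [
    (1, "09:30", "10:30", "SAVER"),
    (1, "11:00", "12:00", "SAVER"),
    (1, "14:00", "15:30", "STANDARD"),
    (2, "10:00", "11:30", "PREMIUM-B"),
    (2, "11:30", "13:30", "PREMIUM-B"),
    (2, "15:00", "16:30", "PREMIUM-A"),
]


def determines(rows, lhs, rhs):
    """True if rows that agree on columns lhs also agree on columns rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs)
        val = tuple(row[i] for i in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


all_columns = [0, 1, 2, 3]
rate_determines_court = determines(bookings, [3], [0])
rate_is_superkey = determines(bookings, [3], all_columns)

# BCNF is violated when a non-trivial FD has a left side that is not a superkey.
print(rate_determines_court and not rate_is_superkey)
```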

4.3.1.5. Algorithm for relational database design
We start from a single universal relation schema R = {A1, A2, ..., An} that includes all the attributes of the database. We implicitly make the universal relation assumption, which states that every attribute name is unique. A set F of functional dependencies that should hold on the attributes of R is specified by the database designers. Using the functional dependencies, the algorithms decompose the universal relation schema R into a set of relation schemas D = {R1, R2, ..., Rm}, which becomes the relational database schema; D is called a decomposition of R. We must make sure that each attribute in R appears in at least one relation schema Ri in the decomposition, so that no attributes are lost:
R1 ∪ R2 ∪ ... ∪ Rm = R
This is called the attribute preservation condition of a decomposition.

4.3.1.6. Decomposition and dependency preservation
It is useful if each functional dependency X → Y specified in F either appears directly in one of the relation schemas Ri in the decomposition D or can be inferred from the dependencies that appear in some Ri. This is the dependency preservation condition. We want to preserve the dependencies because each dependency in F represents a constraint on the database. If a dependency is not represented in some individual relation Ri, checking it requires joining two or more relations of the decomposition. Suppose that a relation schema R and a set of functional dependencies F on R are given, and F+ is the closure of F. A decomposition D = {R1, R2, ..., Rm} of R is dependency preserving with respect to F if the union of the projections of F on each Ri in D is equivalent to F.

4.3.1.7. Decomposition and lossless (non-additive) joins

Another property that a decomposition D should possess is the lossless join or nonadditive join property, which ensures that no spurious tuples are generated when a natural join operation is applied to the relations in the decomposition. The condition of no spurious tuples should hold on every legal relation state, that is, every relation state that satisfies the functional dependencies in F. A decomposition D = {R1, R2, ..., Rm} of R has the lossless (nonadditive) join property with respect to the set of dependencies F on R if, for every relation state r of R that satisfies F, the following holds:
*(π_R1(r), π_R2(r), ..., π_Rm(r)) = r
where * is the natural join of all the relations in D.
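The condition can be tried out on a single relation state: project the state onto each schema of the decomposition, join the projections back together, and compare with the original. The sketch below does this for a small hypothetical relation over (ename, pname, plocation); the function names and data are illustrative. A state that passes does not prove the property for every legal state, but a state that fails is enough to show that a decomposition is lossy.

```python
def make_rel(attrs, rows):
    """A relation state as a set of tuples, each a frozenset of (attr, value)."""
    return {frozenset(zip(attrs, row)) for row in rows}


def project(rel, attrs):
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}


def natural_join(r, s):
    out = set()
    for t in r:
        for u in s:
            dt, du = dict(t), dict(u)
            # tuples combine only if they agree on every shared attribute
            if all(dt[a] == du[a] for a in dt.keys() & du.keys()):
                out.add(frozenset({**dt, **du}.items()))
    return out


def lossless_on_state(rel, schemas):
    joined = project(rel, schemas[0])
    for attrs in schemas[1:]:
        joined = natural_join(joined, project(rel, attrs))
    return joined == rel


r = make_rel(
    ("ename", "pname", "plocation"),
    [("Smith", "ProductX", "Bellaire"), ("Wong", "ProductY", "Bellaire")],
)

# Joining on plocation pairs every employee with every project in Bellaire.
bad = [{"ename", "plocation"}, {"pname", "plocation"}]
# Joining on pname works here because each pname has a single plocation.
good = [{"ename", "pname"}, {"pname", "plocation"}]

print(lossless_on_state(r, bad), lossless_on_state(r, good))
```

In the first decomposition the join returns four tuples from an original state of two; the two extra ones are spurious tuples.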

The word loss in lossless refers to loss of information, not to loss of tuples. If a decomposition does not have the lossless join property, we may get additional spurious tuples after the natural join is applied.
4.3.1.8. Multi-valued dependencies and fourth normal form
In this section we study multi-valued dependencies. They are a consequence of first normal form (1NF), which disallows an attribute in a tuple from having a set of values. If two or more independent multi-valued attributes are placed in the same relation schema, we have to repeat every value of one of the attributes with every value of the other attribute to keep the relation state consistent. This constraint is specified by a multi-valued dependency.

An employee may work on several projects and may have several dependents, but the employee's projects and dependents are independent of one another. To keep the relation state consistent, we must have a separate tuple to represent every combination of an employee's dependent and an employee's project. This constraint is specified as a multi-valued dependency.
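The "every combination" requirement can be checked mechanically. For an MVD X →→ Y on a relation with remaining attributes Z, the (Y, Z) pairs found for each X-value must form the full cross product of the Y-values and Z-values seen with that X-value. The sketch below assumes a relation is a list of dictionaries; the function name and the sample employee data are illustrative.

```python
from itertools import product


def mvd_holds(rows, X, Y):
    """Check X ->> Y on a relation state given as a list of dicts."""
    attrs = list(rows[0].keys()) if rows else []
    Z = [a for a in attrs if a not in X and a not in Y]
    groups = {}  # X-value -> set of (Y-value, Z-value) pairs
    for t in rows:
        key = tuple(t[a] for a in X)
        pair = (tuple(t[a] for a in Y), tuple(t[a] for a in Z))
        groups.setdefault(key, set()).add(pair)
    for pairs in groups.values():
        ys = {y for y, _ in pairs}
        zs = {z for _, z in pairs}
        if pairs != set(product(ys, zs)):  # some combination is missing
            return False
    return True


# Hypothetical state: Smith works on X and Y and has dependents John and Anna.
emp = [
    {"ename": "Smith", "pname": "X", "dname": "John"},
    {"ename": "Smith", "pname": "Y", "dname": "Anna"},
    {"ename": "Smith", "pname": "X", "dname": "Anna"},
    {"ename": "Smith", "pname": "Y", "dname": "John"},
]

print(mvd_holds(emp, ["ename"], ["pname"]))      # all four combinations present
print(mvd_holds(emp[:3], ["ename"], ["pname"]))  # (Y, John) is missing
```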


4.3.1.8.1. Inference rules for functional and multi-valued dependencies
Inference rules can be developed that include both FDs and MVDs, so that both types of constraints can be considered together. Inference rules IR1 through IR8 form a complete set for inferring FDs and MVDs from a given set of dependencies. Assume that R = {A1, A2, ..., Am} and that X, Y, Z and W are subsets of R.

4.3.1.8.2. Fourth normal form
A relation schema R is in 4NF with respect to a set of dependencies F (that includes FDs and MVDs) if, for every nontrivial MVD X →→ Y in F+, X is a superkey for R.


4.3.1.9. Loss-less join decomposition

4.3.1.10. Join dependencies and fifth normal form
A join dependency (JD), denoted by JD(R1, R2, ..., Rn), specified on relation schema R, specifies a constraint on the states r of R. The constraint states that every legal state r of R should have a lossless join decomposition into R1, R2, ..., Rn.

A join dependency JD(R1, R2, ..., Rn) specified on relation schema R is a trivial JD if one of the relation schemas Ri in JD(R1, R2, ..., Rn) is equal to R. Such a dependency is called trivial because it has the lossless join property for any relation state r of R and hence does not specify any constraint on R.
4.3.1.10.1. Fifth normal form (project-join normal form)


A relation schema R is in 5NF, or project-join normal form (PJNF), with respect to a set F of functional, multi-valued and join dependencies if, for every nontrivial join dependency JD(R1, R2, ..., Rn) in F+ (that is, implied by F), every Ri is a superkey of R.

4.4. Inclusion dependency
Inclusion dependencies were defined in order to formalize certain interrelational constraints. Example: a foreign key constraint cannot be specified as an FD or an MVD, because it relates attributes across relations; it can be specified as an inclusion dependency. Inclusion dependencies are also used to represent constraints between two relations. An inclusion dependency R.X < S.Y between a set of attributes X of relation schema R and a set of attributes Y of relation schema S specifies that, at any time, every combination of X-values that appears in the state of R must also appear as a combination of Y-values in the state of S. X of R and Y of S must have the same number of attributes. Example: if X = {A1, A2, ..., An} and Y = {B1, B2, ..., Bn}, then Ai corresponds to Bi for 1 <= i <= n.


Inference rules for inclusion dependencies:
1. IDIR1 (reflexive rule): R.X < R.X.
2. IDIR2 (attribute correspondence): if R.X < S.Y, where X = {A1, A2, ..., An} and Y = {B1, B2, ..., Bn} and Ai corresponds to Bi, then R.Ai < S.Bi for 1 <= i <= n.
3. IDIR3 (transitive rule): if R.X < S.Y and S.Y < T.Z, then R.X < T.Z.
Inclusion dependencies of this kind represent referential integrity constraints.
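On a pair of relation states, an inclusion dependency R.X < S.Y amounts to a subset test between two projections, which is what a foreign key check does. The sketch below assumes relations are lists of dictionaries; the function name, the relation names and the sample rows are illustrative, and null values are not modelled.

```python
def inclusion_holds(r, X, s, Y):
    """Check R.X < S.Y: every X-value in r appears as a Y-value in s."""
    if len(X) != len(Y):
        raise ValueError("X and Y must have the same number of attributes")
    s_values = {tuple(u[b] for b in Y) for u in s}
    return all(tuple(t[a] for a in X) in s_values for t in r)


# Hypothetical states: employee.dnumber references department.dnumber.
department = [{"dnumber": 4, "dname": "Admin"}, {"dnumber": 5, "dname": "Research"}]
employee = [{"ssn": "111", "dnumber": 5}, {"ssn": "222", "dnumber": 4}]
orphan = employee + [{"ssn": "333", "dnumber": 9}]

print(inclusion_holds(employee, ["dnumber"], department, ["dnumber"]))
print(inclusion_holds(orphan, ["dnumber"], department, ["dnumber"]))
```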

5.

TRANSACTION MANAGEMENT

5.1. Transaction Concept
A transaction is a unit of program execution that accesses and possibly updates various data items. A transaction usually results from the execution of a user program written in a high-level data manipulation language or programming language (for example SQL, COBOL, C or Pascal), and is delimited by statements or system calls of the form begin transaction and end transaction. The transaction consists of all the operations executed between the begin and the end. To ensure the integrity of the data, we require that the database system maintain the following properties of transactions.
1. Atomicity: either all operations of the transaction are reflected properly in the database, or none are.
2. Consistency: execution of a transaction in isolation (that is, with no other transaction executing concurrently) preserves the consistency of the database.
3. Isolation: even though multiple transactions may execute concurrently, the system guarantees that, for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished execution before Ti started, or Tj started execution after Ti finished.
4. Durability: after a transaction completes successfully, the changes it has made to the database persist, even if there is a system failure.

These properties are called the ACID properties. Access to the database is accomplished by the following two operations.
1. Read(X): transfers the data item X from the database to a local buffer belonging to the transaction that executed the read operation.
2. Write(X): transfers the data item X from the local buffer of the transaction that executed the write back to the database.
Example: let Ti be a transaction that transfers $50 from account A to account B.
Ti: READ(A)
A := A - 50;
WRITE(A)
READ(B)
B := B + 50;
WRITE(B)
The initial values of A and B are $1000 and $2000. Suppose a system failure occurs after WRITE(A) and before WRITE(B). Then the account values are A = $950 and B = $2000, and $50 has been lost; this is the inconsistent state that atomicity must prevent.
5.2. Transaction state
Compensating transaction: the only way to undo the effects of a committed transaction is to execute a compensating transaction.
We establish a simple abstract transaction model in which a transaction must be in one of the following states.
Active: the initial state; the transaction stays in this state while it is executing.
Partially committed: after the final statement has been executed.
Failed: after the discovery that normal execution can no longer proceed.
Aborted: after the transaction has been rolled back and the database has been restored to its state prior to the start of the transaction.
Committed: after successful completion.
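The five states can be read as a small state machine. The transition table below is an assumption based on the usual state diagram for this model (active to partially committed or failed, partially committed to committed or failed, failed to aborted); the text lists the states but not the arrows. The function name is illustrative.

```python
# Assumed transitions between the states listed above.
TRANSITIONS = {
    "active": {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "failed": {"aborted"},
    "committed": set(),  # terminal state
    "aborted": set(),    # terminal state
}


def is_valid_history(states):
    """True if the sequence starts in 'active' and follows allowed transitions."""
    if not states or states[0] != "active":
        return False
    return all(b in TRANSITIONS[a] for a, b in zip(states, states[1:]))


print(is_valid_history(["active", "partially committed", "committed"]))
print(is_valid_history(["active", "failed", "aborted"]))
print(is_valid_history(["active", "committed"]))  # skips partially committed
```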


A transaction enters the failed state after the system determines that the transaction can no longer proceed with its normal execution (for example, because of hardware or logical errors). Such a transaction must be rolled back; it then enters the aborted state. At this point the system has two options.
1. Restart the transaction: only if the transaction was aborted as a result of a hardware or software error that was not created through the internal logic of the transaction.
2. Kill the transaction: usually because of an internal logical error that can be corrected only by rewriting the program, or because the input was bad.
5.3. Implementation of atomicity and durability
The recovery management component of a database system implements the support for atomicity and durability.
Shadow-database scheme: a transaction that wants to update the database first creates a complete copy of the database. All updates are done on the new copy, leaving the original copy, called the shadow copy, untouched. If at any time the transaction has to be aborted, the new copy is deleted and the old copy of the database is unaffected. If the transaction completes, the operating system is asked to make sure that all pages of the new copy have been written to disk. In the UNIX operating system the FLUSH command is used for this. After the FLUSH has completed, the pointer db_pointer is updated to point to the new copy, which becomes the current copy of the database.
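The shadow-copy idea can be sketched with an ordinary file standing in for the database. Here the file name plays the role of db_pointer, os.fsync stands in for the flush step, and the pointer switch is done with os.replace, which swaps the name to the new copy in one step on POSIX systems. The JSON storage, the function name shadow_update and the account data are assumptions made for illustration; this is not how a real DBMS stores its data.

```python
import json
import os
import tempfile


def shadow_update(db_path, update):
    """Apply update() to a new copy; switch to it only if everything succeeds."""
    with open(db_path) as f:
        data = json.load(f)
    update(data)  # changes touch only the in-memory copy; may raise
    directory = os.path.dirname(os.path.abspath(db_path))
    fd, new_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())       # force the new copy onto disk
        os.replace(new_path, db_path)  # the "db_pointer" switch
    except BaseException:
        if os.path.exists(new_path):
            os.remove(new_path)        # abort: delete the new copy
        raise


def transfer(db):
    db["A"] -= 50
    db["B"] += 50


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "db.json")
    with open(path, "w") as f:
        json.dump({"A": 1000, "B": 2000}, f)
    shadow_update(path, transfer)
    with open(path) as f:
        print(json.load(f))
```

If update raises part of the way through, the file on disk is never touched, which is the atomicity argument made in the text.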

5.4. Concurrent Execution
A database system must control the interaction among concurrent transactions to ensure the consistency of the database. In this section we focus on the concept of concurrent execution. Example: consider a set of transactions that access and update bank accounts. Let T1 and T2 be two transactions.
T1: READ(A)
A := A - 50;
WRITE(A)
READ(B)
B := B + 50;
WRITE(B)
T2: READ(A)
TEMP := A * 0.1;
A := A - TEMP;
WRITE(A)
READ(B)
B := B + TEMP;
WRITE(B)
The initial values of A and B are $1000 and $2000.
CASE 1. If T1 is followed by T2: A = $855, B = $2145
CASE 2. If T2 is followed by T1: A = $850, B = $2150
5.5. Schedule
Execution sequences such as these are called schedules; they show the chronological order in which the instructions of the transactions are executed. The two schedules above are serial schedules. Each serial schedule consists of a sequence of instructions from the various transactions, where the instructions belonging to a single transaction appear together in the schedule. If two transactions are running concurrently, the CPU switches between them, so that CPU time is shared among all the transactions and their instructions may be interleaved.


The final values of A and B are A = $855 and B = $2145. Not all concurrent executions result in a correct state; some schedules leave the database in an inconsistent state. Consider the example:

The final values of A and B are A = $900 and B = $2150. The sum A + B is now $3050 instead of $3000; we have gained $50.
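The figures quoted in sections 5.4 and 5.5 can be reproduced with a small schedule simulator. Each transaction is a list of read, calc and write steps acting on its own local buffer, and a schedule is the order in which the transactions take their next step. The interleaving called lost_update below is an assumption: it is one ordering that yields A = 900 and B = 2150, and the schedule meant in the text may differ. Integer division is used for the 10% step to avoid floating-point rounding.

```python
T1 = [
    ("read", "A"),
    ("calc", lambda v: v.update(A=v["A"] - 50)),
    ("write", "A"),
    ("read", "B"),
    ("calc", lambda v: v.update(B=v["B"] + 50)),
    ("write", "B"),
]
T2 = [
    ("read", "A"),
    ("calc", lambda v: v.update(temp=v["A"] // 10, A=v["A"] - v["A"] // 10)),
    ("write", "A"),
    ("read", "B"),
    ("calc", lambda v: v.update(B=v["B"] + v["temp"])),
    ("write", "B"),
]


def run(order, db):
    """Execute the next step of the named transaction for each entry in order."""
    db = dict(db)
    programs = {"T1": T1, "T2": T2}
    position = {name: 0 for name in programs}
    local = {name: {} for name in programs}  # per-transaction local buffer
    for name in order:
        op, arg = programs[name][position[name]]
        position[name] += 1
        if op == "read":
            local[name][arg] = db[arg]
        elif op == "write":
            db[arg] = local[name][arg]
        else:
            arg(local[name])  # computation on the local buffer only
    return db


start = {"A": 1000, "B": 2000}
serial_12 = ["T1"] * 6 + ["T2"] * 6
serial_21 = ["T2"] * 6 + ["T1"] * 6
lost_update = ["T1", "T1", "T2", "T2", "T1", "T2",
               "T1", "T1", "T1", "T2", "T2", "T2"]

print(run(serial_12, start))    # {'A': 855, 'B': 2145}
print(run(serial_21, start))    # {'A': 850, 'B': 2150}
print(run(lost_update, start))  # {'A': 900, 'B': 2150}
```

In the third schedule T2 overwrites the value of A written by T1, so the deduction of $50 is lost while the matching deposit into B survives.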


5.6. Serializability


The database system must control the execution of concurrent transactions to ensure that the database remains consistent. We must first understand which schedules will ensure consistency and which will not. From the point of view of scheduling, a transaction performs only two kinds of operations on a data item Q:
I. Read operation: read(Q)
II. Write operation: write(Q)
Between a read(Q) and a write(Q), a transaction may perform an arbitrary sequence of operations on the copy of Q that resides in the local buffer of the transaction. Here we discuss two forms of schedule equivalence, which lead to:
I. Conflict Serializability
II. View Serializability

5.6.1. Conflict Serializability

Consider a schedule S in which there are two consecutive instructions Ii and Ij, of transactions Ti and Tj respectively (i ≠ j).
1. If Ii and Ij refer to different data items, then we can swap Ii and Ij without affecting the result of any instruction in the schedule.
2. If Ii and Ij refer to the same data item Q, then the order of the two steps may matter. Since we are dealing only with read and write operations, there are four cases.
a. Ii = READ(Q), Ij = READ(Q): the order does not matter, because the same value of Q is read by both Ti and Tj.
b. Ii = READ(Q), Ij = WRITE(Q): the order matters.
c. Ii = WRITE(Q), Ij = READ(Q): the order matters.
d. Ii = WRITE(Q), Ij = WRITE(Q): since both instructions are write operations, their order does not affect either Ti or Tj, but the value obtained by the next READ(Q) instruction of S is affected.
We say that Ii and Ij conflict if they are operations by different transactions on the same data item, and at least one of these instructions is a write operation.


A serial schedule is one in which all the instructions of each transaction are executed together, without being interleaved with instructions of other transactions.

If a schedule S can be transformed into a schedule S' by a series of swaps of non-conflicting instructions, we say that S and S' are conflict equivalent. The concept of conflict equivalence leads to the concept of conflict serializability: we say that a schedule S is conflict serializable if it is conflict equivalent to a serial schedule. There are less stringent definitions of schedule equivalence than conflict equivalence, but in general such analysis is hard to implement and computationally expensive. We will consider one such definition, based only on read and write operations.

5.6.2. View Serializability

It is similar to conflict serializability and is based only on the read and write operations of transactions. Consider two schedules S and S', where the same set of transactions participates in both schedules. The schedules S and S' are said to be view equivalent if they satisfy the following three conditions:



1. For each data item Q, if transaction Ti reads the initial value of Q in schedule S, then transaction Ti must, in schedule S', also read the initial value of Q.
2. For each data item Q, if transaction Ti executes read(Q) in schedule S and that value was produced by transaction Tj, then transaction Ti must, in schedule S', also read the value of Q that was produced by Tj.
3. For each data item Q, the transaction that performs the final write(Q) operation in schedule S must also perform the final write(Q) operation in schedule S'.
A schedule S is view serializable if it is view equivalent to a serial schedule.
5.7. Recoverability
So far we have studied which schedules ensure the consistency of the database, assuming implicitly that there are no transaction failures. We now address the effect of transaction failures during concurrent execution. If a transaction Ti fails, for whatever reason, we need to undo the effect of Ti to ensure the atomicity property. In a system that allows concurrent execution, any transaction Tj that is dependent on Ti (that is, Tj has read a data item written by Ti) must also be aborted. That is why we need to place some restrictions on the types of schedules permitted.
5.7.1. Recoverable schedule

Most database systems require that all schedules be recoverable. A recoverable schedule is one where, for each pair of transactions Ti and Tj such that Tj reads a data item previously written by Ti, the commit operation of Ti appears before the commit operation of Tj.
5.7.2. Cascadeless schedule

Consider the example


T10 writes a value that is read by T11. Suppose T10 fails; T10 must then be rolled back. Since T11 is dependent on T10, and T12 in turn depends on T11, the remaining transactions T11 and T12 must be rolled back as well. The phenomenon in which a single transaction failure leads to a series of transaction rollbacks is called cascading rollback. It is desirable that cascading rollback should not occur in a schedule; schedules in which it cannot occur are called cascadeless schedules. In a cascadeless schedule, for every pair of transactions Ti and Tj such that Tj reads a data item previously written by Ti, the commit operation of Ti appears before the read operation of Tj. It is then easy to verify that every cascadeless schedule is also a recoverable schedule.
5.8. Testing for Serializability
Since every schedule should be serializable, we first need a way to determine whether a given schedule S is serializable or not. Let S be a schedule. We construct a directed graph, called a precedence graph, from S:
G = (V, E)
where V is a set of vertices and E is a set of edges.
Vertices: all the transactions that participate in the schedule.
Edges: all edges Ti → Tj for which one of the following conditions holds.
1. Ti executes write(Q) before Tj executes read(Q)
2. Ti executes read(Q) before Tj executes write(Q)
3. Ti executes write(Q) before Tj executes write(Q)
If the precedence graph has a cycle, S is not conflict serializable; if it has no cycle, S is conflict serializable. For example, for a schedule S whose precedence graph contains the single edge
T1 → T2
all the instructions of T1 are executed before the first instruction of T2.
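The test can be sketched in a few lines. The three edge conditions collapse into one check: two operations of different transactions on the same item, at least one of them a write, give an edge from the earlier transaction to the later one. The schedule is then conflict serializable exactly when the resulting graph has no cycle. The encoding of a schedule as (transaction, operation, item) triples, the function names and the two sample schedules are illustrative.

```python
def precedence_graph(schedule):
    """schedule: list of (transaction, op, item) with op 'R' or 'W'."""
    edges = set()
    for i, (ti, op_i, q_i) in enumerate(schedule):
        for tj, op_j, q_j in schedule[i + 1:]:
            if ti != tj and q_i == q_j and "W" in (op_i, op_j):
                edges.add((ti, tj))  # Ti's operation comes first: Ti -> Tj
    return edges


def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" while on the DFS stack, "done" afterwards

    def visit(n):
        state[n] = "visiting"
        for m in graph[n]:
            if state.get(m) == "visiting":  # back edge closes a cycle
                return True
            if m not in state and visit(m):
                return True
        state[n] = "done"
        return False

    return any(n not in state and visit(n) for n in graph)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))


# Interleaved, but every conflict points from T1 to T2.
ok = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
      ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]
# Both transactions read A before either writes it: edges in both directions.
bad = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]

print(is_conflict_serializable(ok), is_conflict_serializable(bad))
```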


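5.9. Precedence graph
Tested with the precedence graph scheme, the schedule in this example is not conflict serializable, but it is view serializable. The edge T4 → T3 in its precedence graph arises from write operations whose values are never read by any transaction; such writes are called useless writes. To test for view serializability, we therefore need a scheme for deciding whether an edge needs to be inserted in the precedence graph.
Suppose that in schedule S, Tj reads a value of Q written by Ti, giving the edge Ti → Tj. If S is view serializable, then in any serial schedule S' that is equivalent to S, Ti must precede Tj. Suppose now that Tk executes write(Q). Then in schedule S', either Tk precedes Ti (Tk → Ti) or Tk follows Tj (Tj → Tk); Tk cannot appear between Ti and Tj.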


To test for view serializability, we need to extend the precedence graph to include labeled edges. This type of graph is termed a labeled precedence graph.
Rules for inserting labeled edges in the precedence graph:
Consider a schedule S consisting of transactions {T1, T2, ..., Tn}. Let Tb and Tf be two transactions such that Tb issues write(Q) for each Q accessed in S, and Tf issues read(Q) for each Q accessed in S. We construct a new schedule S' from S by inserting Tb at the beginning of S and Tf at the end of S. We construct the labeled precedence graph for schedule S' as follows:
1. Add an edge Ti → Tj if Tj reads the value of a data item Q written by Ti.
2. Remove all edges incident on useless transactions. A transaction Ti is useless if there exists no path in the precedence graph from Ti to Tf.
3. For each data item Q such that Tj reads a value of Q written by Ti, and Tk executes write(Q) and Tk ≠ Tb, do the following:
a. If Ti = Tb and Tj ≠ Tf, then insert the edge Tj → Tk.
b. If Ti ≠ Tb and Tj = Tf, then insert the edge Tk → Ti.
c. If Ti ≠ Tb and Tj ≠ Tf, then insert the pair of edges Tk → Ti and Tj → Tk in the labeled precedence graph, both labeled p, where p is a unique number not used earlier for labeling edges.

6. CONCURRENCY CONTROL

When several transactions execute concurrently in the database, the isolation property may no longer be preserved. The system must therefore control the interaction among concurrent transactions. The mechanisms used for this are termed concurrency control schemes.

6.1. Lock based protocols
One way to ensure serializability is to require that access to data items be done in a mutually exclusive manner; that is, while one transaction is accessing a data item, no other transaction can modify that data item. One way to implement this requirement is to allow a transaction to access a data item only if it is currently holding a lock on that data item.

6.1.1. Locks
There are various modes in which a data item may be locked.
Shared mode: if a transaction Ti has obtained a shared mode lock (denoted by S) on data item Q, then Ti can read but cannot write Q.
Exclusive mode: if a transaction Ti has obtained an exclusive mode lock (denoted by X) on data item Q, then Ti can both read and write Q.
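The two lock modes imply a simple compatibility rule: several transactions may hold shared locks on the same item at once, while an exclusive lock excludes every other lock. A minimal sketch follows; the table COMPATIBLE and the helper can_grant are names assumed for this illustration only.

```python
# COMPATIBLE[(requested, held)] is True when a lock in mode `requested` can
# coexist with a lock in mode `held` owned by another transaction.
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

def can_grant(requested, held_modes):
    """Return True if `requested` conflicts with none of the locks already held."""
    return all(COMPATIBLE[(requested, held)] for held in held_modes)

shared_with_shared = can_grant("S", ["S", "S"])   # readers can share
exclusive_with_shared = can_grant("X", ["S"])     # a writer must wait
exclusive_on_free_item = can_grant("X", [])       # nothing held, so granted
```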
Example:
T1: LOCK-X(B); READ(B); B := B - 50; WRITE(B); UNLOCK(B); LOCK-X(A); READ(A); A := A + 50; WRITE(A); UNLOCK(A)
T2: LOCK-S(A); READ(A); UNLOCK(A); LOCK-S(B); READ(B); UNLOCK(B); DISPLAY(A+B)


Initial amounts: A = $100, B = $200.
Case 1: T1 followed by T2.
Case 2: T2 followed by T1.
Case 3: (schedule not reproduced)


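This schedule displays (A+B) as $250, which is incorrect: the total of A and B is $300 both before and after the transfer. The error arises because T1 unlocked data item B too early, so T2 saw an inconsistent state.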
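The effect of unlocking too early can be reproduced with a short simulation. This sketch assumes that T1 moves $50 from B to A and that T2 runs entirely between T1's unlock of B and T1's lock of A; the function name early_unlock_schedule is hypothetical.

```python
def early_unlock_schedule(a=100, b=200):
    """Replay the interleaving: T1 updates B, T2 reads A and B, T1 updates A."""
    db = {"A": a, "B": b}

    # T1, first half: lock-X(B), read(B), B := B - 50, write(B), unlock(B)
    db["B"] = db["B"] - 50

    # T2, complete: lock-S(A), read(A), unlock(A), lock-S(B), read(B),
    # unlock(B), display(A + B)
    displayed = db["A"] + db["B"]

    # T1, second half: lock-X(A), read(A), A := A + 50, write(A), unlock(A)
    db["A"] = db["A"] + 50

    return displayed, db["A"] + db["B"]

displayed_total, final_total = early_unlock_schedule()
```

With the initial amounts above, T2 displays 250 although the database total ends at 300.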


This situation is called deadlock. When deadlock occurs, the system must roll back one of the two transactions. The data items that were locked by that transaction are then unlocked and become available to the other transaction.

6.1.2. Granting of locks
When a transaction requests a lock on a data item in a particular mode, and no other transaction has a lock on the same data item in a conflicting mode, the lock can be granted. However, consider the following scenario. Suppose transaction T2 holds a shared mode lock on Q and T1 requests an exclusive mode lock on Q; T1 has to wait for T2 to release its lock. Meanwhile, T3 and then T4 request shared mode locks on Q, which are compatible with the lock held by T2 and are therefore granted:
T2: lock-S(Q) (holds)
T1: lock-X(Q) (waits)
T3: lock-S(Q) (requests, granted)
T4: lock-S(Q) (requests, granted)
T1 is still waiting. This situation is called starvation: a particular transaction waits indefinitely for a lock on a data item.

6.1.3. Avoiding starvation of transactions by granting locks

When a transaction Ti requests a lock on data item Q in a particular mode M, the lock is granted provided that:
1. There is no other transaction holding a lock on Q in a mode that conflicts with M.
2. There is no other transaction that is waiting for a lock on Q and that made its lock request before Ti.

6.2. Two phase locking protocol
One protocol that ensures serializability is the two phase locking protocol. This protocol requires that each transaction issue lock and unlock requests in two phases:
1. Growing phase: a transaction may obtain locks but may not release any lock.
2. Shrinking phase: a transaction may release locks but may not obtain any new locks.
The point in the schedule at which a transaction obtains its final lock (the end of its growing phase) is called the lock point of the transaction.
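Whether a single transaction obeys the two phases can be checked mechanically: once it has released any lock, it must not acquire another. The following is a small sketch under the assumption that a transaction is given as a list of ("lock" or "unlock", item) pairs; the name is_two_phase is chosen for this example.

```python
def is_two_phase(actions):
    """Return True if no lock request follows an unlock request."""
    shrinking = False
    for action, _item in actions:
        if action == "unlock":
            shrinking = True          # the shrinking phase has begun
        elif action == "lock" and shrinking:
            return False              # a new lock after a release violates 2PL
    return True

# Locks B, releases it, then locks A: not two-phase.
not_two_phase = [("lock", "B"), ("unlock", "B"), ("lock", "A"), ("unlock", "A")]
# Same data items, but every lock is taken before the first release.
two_phase = [("lock", "B"), ("lock", "A"), ("unlock", "B"), ("unlock", "A")]

first_result = is_two_phase(not_two_phase)
second_result = is_two_phase(two_phase)
```

The first sequence has the same shape as T1 in the locking example of Section 6.1.1, which is why that schedule could expose an inconsistent state.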


Cascading rollbacks can be avoided by a modification of two phase locking called the strict two-phase locking protocol. This protocol requires that all exclusive mode locks taken by a transaction be held until that transaction commits. This requirement ensures that any data item written by an uncommitted transaction is locked in exclusive mode until the transaction commits, preventing any other transaction from reading the data. Another variant is the rigorous two-phase locking protocol, which requires all locks to be held until the transaction commits. It can be easily verified that, with rigorous two-phase locking, transactions can be serialized in the order in which they commit.

6.3. Graph based protocol
If we wish to develop protocols that are not two-phase, we need additional information on how each transaction will access the database. In this model we have prior knowledge about the order in which the database items will be accessed. To acquire such prior knowledge, we impose a partial ordering → on the set D = {d1, d2, ..., dn} of all data items. If di → dj, then any transaction accessing both di and dj must access di before accessing dj. This ordering can be viewed as a directed acyclic graph, called a database graph. For simplicity, we restrict our attention to graphs that are rooted trees and to exclusive locks only.
In the tree protocol, the only lock instruction allowed is lock-X. Each transaction Ti can lock a data item at most once and must observe the following rules:
a. The first lock by Ti may be on any data item.
b. Subsequently, a data item Q can be locked by Ti only if the parent of Q is currently locked by Ti.
c. Data items may be unlocked at any time.
d. A data item that has been locked and unlocked by Ti cannot subsequently be relocked by Ti.
Advantages:
1. Unlocking may occur earlier, which leads to shorter waiting times and increased concurrency.
2. The protocol is deadlock free, so no rollbacks are required.
Disadvantages: a transaction may have to lock data items that it does not access, which results in:
1. Increased locking overhead
2. Additional waiting time
3. Potential decrease in concurrency

6.4. Time-stamp based protocol


In the locking protocols described so far, the order between every pair of conflicting transactions is determined at execution time. A time-stamp based protocol instead fixes the ordering among transactions in advance, using time stamps.
Time-stamp: with each transaction Ti in the system, we associate a unique fixed time stamp, denoted by TS(Ti). This time stamp is assigned by the database system before the transaction Ti starts execution. If a transaction Ti has been assigned time stamp TS(Ti) and a new transaction Tj enters the system, then TS(Ti) < TS(Tj).
There are two simple methods for implementing this scheme:
1. Use the value of the system clock as the time stamp; that is, a transaction's time stamp is equal to the value of the clock when the transaction enters the system.
2. Use a logical counter that is incremented after a new time stamp has been assigned; that is, a transaction's time stamp is equal to the value of the counter when the transaction enters the system.

6.5. Validation based protocol
In cases where the majority of transactions are read-only, the rate of conflicts among transactions may be low. However, we do not know in advance which transactions will be involved in a conflict. To gain that knowledge we need a scheme for monitoring the system. We assume that each transaction Ti executes in the following phases:
1. Read phase: during this phase, the execution of transaction Ti takes place. The values of the various data items are read and stored in variables local to Ti. All write operations are performed on temporary local variables, without updating the actual database.
2. Validation phase: transaction Ti performs a validation test to determine whether it can copy to the database the temporary local variables that hold the results of write operations, without causing a violation of serializability.
3. Write phase: if transaction Ti succeeds in validation (step 2), then the actual updates are applied to the database. Otherwise, Ti is rolled back.
To perform the validation test, three time stamps are associated with each transaction Ti:
a. Start(Ti): the time when Ti started its execution.
b. Validation(Ti): the time when Ti finished its read phase and started its validation phase.
c. Finish(Ti): the time when Ti finished its write phase.

6.6. Recovery system

6.6.1. Failure Classification


There are various types of failure that may occur in a system, each of which needs to be dealt with in a different manner. The simplest type of failure is one that does not result in the loss of information in the system. The failures that are more difficult to deal with are those that result in loss of information. Here we consider only the following types of failure:

6.6.1.1. Transaction failure

There are two types of error that may cause a transaction to fail.
Logical error: the transaction can no longer continue with its normal execution because of some internal condition, such as bad input, data not found, overflow, or resource limit exceeded.
System error: the system has entered an undesirable state (for example, deadlock), as a result of which a transaction cannot continue with its normal execution. The transaction can, however, be re-executed at a later time.
System crash: a bug in the database software or a failure of the operating system causes the loss of the contents of volatile storage.
Disk failure: a disk block loses its contents, for example as a result of a head crash. Backups on media such as tapes are used to recover from this type of failure.

6.6.2. Log based recovery
The most widely used structure for recording database modifications is the log. The log is a sequence of log records and maintains a record of all the updates in the database. An update log record has the following fields:
Transaction identifier: the unique identifier of the transaction that performed the write operation.
Data item identifier: the unique identifier of the data item written. Typically, it is the location of the data item on disk.
Old value: the value of the data item prior to the write operation.
New value: the value that the data item will have after the write operation.
Other log records exist to record significant events during transaction processing:
<Ti, start>: transaction Ti has started.

<Ti, Xj, V1, V2>: transaction Ti has performed a write operation on the data item Xj. Xj had the value V1 before the write, and will have the value V2 after the write.
<Ti, commit>: transaction Ti has committed.
<Ti, abort>: transaction Ti has aborted.

6.7. Deferred Database Modification
In this scheme, all database modifications are recorded in the log, but the write operations of a transaction are deferred until the transaction partially commits. When a transaction partially commits, the information on the log associated with the transaction is used in executing the deferred writes. If the system crashes before the transaction completes its execution, or if the transaction aborts, then the information on the log is simply ignored.
T0: READ(A); A := A - 50; WRITE(A); READ(B); B := B + 50; WRITE(B)
T1: READ(C); C := C - 100; WRITE(C)
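Recovery under deferred modification only needs to redo the writes of committed transactions; records of transactions without a commit record are ignored. The sketch below assumes log records are Python tuples in the <Ti, Xj, V1, V2> layout described above, and the starting balances are illustrative values, not taken from a real system.

```python
def recover_deferred(log, db):
    """Redo the logged writes of every transaction that has a commit record."""
    committed = {rec[0] for rec in log if len(rec) == 2 and rec[1] == "commit"}
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            _txn, item, _old_value, new_value = rec
            db[item] = new_value      # redo: install the new value
    return db

# Hypothetical crash: T0 has committed, T1 has not.
log = [
    ("T0", "start"),
    ("T0", "A", 1000, 950),
    ("T0", "B", 2000, 2050),
    ("T0", "commit"),
    ("T1", "start"),
    ("T1", "C", 700, 600),
]

# With deferred modification, no write of T1 has reached the database.
recovered = recover_deferred(log, {"A": 1000, "B": 2000, "C": 700})
```

After recovery, A and B carry T0's new values while C keeps its old value, because T1 never committed.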

6.8. Immediate Database Modification


The immediate update technique allows database modifications to be output to the database while the transaction is still in the active state. Data modifications written by active transactions are called uncommitted modifications. In the event of a crash or a transaction failure, the system must use the old value field of the log records to restore the modified data items to the values they had before the transaction started.
<T0, start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
<T0, commit>
<T1, start>
<T1, C, 700, 600>
<T1, commit>
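With immediate modification, recovery has two parts: undo the writes of transactions that have no commit record, using the old values, and redo the writes of committed transactions, using the new values. The following sketch replays the log above under the assumption that the crash happens just before <T1, commit> is written; the tuple encoding, the state of the disk at the crash, and the name recover_immediate are all assumptions for illustration.

```python
def recover_immediate(log, db):
    """Undo uncommitted writes (backwards), then redo committed writes (forwards)."""
    committed = {rec[0] for rec in log if len(rec) == 2 and rec[1] == "commit"}

    # Undo pass: scan the log backwards and restore old values.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] not in committed:
            db[rec[1]] = rec[2]

    # Redo pass: scan the log forwards and reinstall new values.
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            db[rec[1]] = rec[3]
    return db

log_at_crash = [
    ("T0", "start"),
    ("T0", "A", 1000, 950),
    ("T0", "B", 2000, 2050),
    ("T0", "commit"),
    ("T1", "start"),
    ("T1", "C", 700, 600),
]

# Assumed disk state at the crash: T0's write to B had not reached disk yet,
# while T1's uncommitted write to C had.
disk = {"A": 950, "B": 2000, "C": 600}
recovered = recover_immediate(log_at_crash, disk)
```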

7. CENTRALIZED AND DISTRIBUTED DATABASE

In the traditional enterprise computing model, an Information Systems department maintains control of a centralized corporate database system. Mainframe computers, usually located at corporate headquarters, provide the required performance levels. Remote sites access the corporate database through wide-area networks (WANs) using applications provided by the Information Systems department. Changes in the corporate environment toward decentralized operations have prompted organizations to move toward distributed database systems that complement the new decentralized organization. Today's global enterprise may have many local-area networks (LANs) joined with a WAN, as well as additional data servers and applications on the LANs. Client applications at the sites need to access data locally through the LAN or remotely through the WAN. For example, a client in Tokyo might locally access a table stored on the Tokyo data server or remotely access a table stored on the New York data server. Both centralized and distributed database systems must deal with the problems associated with remote access:
Network response slows when WAN traffic is heavy. For example, a mission-critical transaction-processing application may be adversely affected when a decision-support application requests a large number of rows.
A centralized data server can become a bottleneck as a large user community contends for data server access.
Data is unavailable when a failure occurs on the network.


7.1. Distributed Database System
A distributed database system is a collection of data that belongs logically to the same system but is physically spread over the sites of a computer network.

7.2. Some advantages of the DDBMS are as follows:
1. Distributed nature of some database applications: some database applications are naturally distributed over different sites.
2. Increased reliability and availability: these are two of the most commonly cited advantages of distributed databases. Reliability is broadly defined as the probability that a system is up at a particular moment. Availability is the probability that the system is continuously available during a time interval.
3. Allowing data sharing while maintaining some measure of local control: it is possible to control the data and software locally at each site. However, certain data can be accessed by users at other remote sites through the DDBMS software. This allows controlled sharing of data throughout the distributed system.
4. Improved performance: when a large database is distributed over multiple sites, a smaller database exists at each site. As a result, local queries and transactions accessing data at a single site have better performance because of the smaller local database. If all transactions were submitted to a single centralized database, performance would decrease.

7.3. Some additional properties:
1. The ability to access remote sites and transmit queries and data among the various sites via a communication network.
2. The ability to decide which copy of a replicated data item to access.
3. The ability to maintain the consistency of copies of a replicated data item.
4. The ability to recover from individual site crashes and from new types of failure, such as the failure of communication links.

7.4. Physical hardware level
The following main factors distinguish a DDBMS from a centralized system:
1. There are multiple computers, called sites or nodes.
2. These sites must be connected by some type of communication network to transmit data and commands among the sites.


The sites may be located within the same building or a group of adjacent buildings and connected via a local area network, or they may be geographically distributed over large distances and connected via a long haul network. Local area networks typically use cables, whereas long haul networks use telephone lines or satellites. It is also possible to use a combination of the two types of network. Networks may have different topologies that define the communication paths among sites.

7.5. Client Server Architecture
The client server architecture has been developed to deal with new computing environments in which a large number of personal computers, workstations, file servers, peripherals, and other equipment are connected together via a network. The idea is to define specialized servers with specific functionalities.

The interaction between client and server might proceed as follows during the processing of an SQL query:
1. The client parses a user query and decomposes it into a number of independent site queries. Each site query is sent to the appropriate server site.
2. Each server processes the local query and sends the resulting relation to the client site.
3. The client site combines the results of the subqueries to produce the result of the originally submitted query.
In this approach the SQL server has also been called a database processor (DP) or a back-end machine, whereas the client has been called an application processor (AP) or a front-end machine. In a typical DDBMS, it is customary to divide the software modules into three levels:
1. The server software is responsible for local data management at a site.



2. The client software is responsible for most of the distribution functions. It accesses the data distribution information from the DDBMS catalog and processes all requests that require access to more than one site.
3. The communication software provides the communication primitives that are used by the client to transmit commands and data among the various sites as needed.

7.6. Data fragmentation
If relation r is fragmented, r is divided into a number of fragments r1, r2, ..., rn. These fragments contain sufficient information to allow reconstruction of the original relation r. This reconstruction can take place through the application of either the union operation or a special type of join operation on the various fragments. There are three different schemes for fragmenting a relation:
I. Horizontal fragmentation
II. Vertical fragmentation
III. Mixed fragmentation

7.6.1. Horizontal fragmentation
In this scheme, each tuple of r is assigned to one or more fragments. The relation r is partitioned into a number of subsets r1, r2, ..., rn. Each tuple of the relation r must belong to at least one of the fragments so that the original relation can be reconstructed. Each fragment can be defined as a selection on the relation r. For reconstruction we use the union operation:
r = r1 ∪ r2 ∪ ... ∪ rn

7.6.2. Vertical fragmentation
In this scheme, the columns (attributes) of r are split across fragments. Vertical fragmentation of r(R) involves the definition of several subsets of attributes R1, R2, ..., Rn of the schema R such that
R = R1 ∪ R2 ∪ ... ∪ Rn
Each fragment ri of r is defined by a projection of r onto Ri. For reconstruction we use the join operation:
r = r1 ⋈ r2 ⋈ ... ⋈ rn

7.6.3. Mixed fragmentation
Mixed fragmentation combines horizontal and vertical fragmentation. The relation r is divided into a number of fragments r1, r2, ..., rn. Each fragment is obtained as the result of applying either the horizontal fragmentation or the vertical fragmentation scheme on relation r, or on a fragment of r that was obtained previously.

7.7. Data Replication
If relation r is replicated, a copy of relation r is stored at two or more sites. In the most extreme case we have full replication, in which a copy is stored at every site in the system. Replication has the following consequences:
Availability: if one of the sites containing relation r fails, the relation can still be found at another site. Thus the system can continue to process queries involving r.
Increased parallelism: when the majority of accesses to the relation r result only in reading the relation, several sites can process queries involving r in parallel. The more replicas of r there are, the greater the chance that the needed data will be found at the site where the transaction is executing.
Increased overhead on update: the system must ensure that all replicas of a relation r are consistent; otherwise erroneous computations may result. Whenever r is updated, the update must be propagated to all sites containing replicas.

7.8. Deadlock handling
A system is in a deadlock state if there exists a set of transactions such that every transaction in the set is waiting for another transaction in the set. Suppose there is a set of waiting transactions {T0, T1, ..., Tn} such that T0 is waiting for a data item held by T1, T1 is waiting for a data item held by T2, ..., and Tn is waiting for a data item held by T0. No transaction can make progress in this situation. There are two principal methods for dealing with the deadlock problem:
a. Deadlock prevention: this protocol ensures that the system will never enter a deadlock state.
b. Deadlock detection and recovery: we allow the system to enter a deadlock state and then try to recover.

7.8.1. Deadlock prevention
There are two approaches to deadlock prevention.
Approach 1:
i. Ensures that no cyclic waits can occur.
ii. Requires all locks to be acquired together.
Approach 2:
i. This approach is closer to deadlock recovery.



ii. Transactions are rolled back instead of waiting for a lock whenever the wait could potentially result in a deadlock.

7.8.1.1. The first approach
The simplest scheme under the first approach requires that each transaction lock all its data items before it begins execution.
Disadvantages:
i. It is often hard to predict, before the transaction begins, which data items need to be locked.
ii. Data item utilization may be very low, since many of the data items may be locked but unused for a long time.

7.8.1.2. The second approach
The second approach for preventing deadlocks is to use preemption and transaction rollback. In preemption, when a transaction T2 requests a lock that is held by T1, the lock granted to T1 may be preempted by rolling back T1 and granting the lock to T2. To control preemption, we assign a unique time stamp to each transaction. The system uses these time stamps only to decide whether a transaction should wait or roll back. Two different deadlock prevention schemes using time stamps have been proposed:
1. Wait-die: this scheme is based on a non-preemptive technique. When Ti requests a data item held by Tj, Ti is allowed to wait only if TS(Ti) < TS(Tj), that is, if Ti is older than Tj. Otherwise, Ti is rolled back (dies).
2. Wound-wait: this scheme is based on a preemptive technique and is a counterpart to the wait-die scheme. When Ti requests a data item held by Tj, Ti is allowed to wait only if TS(Ti) > TS(Tj), that is, if Ti is younger than Tj. Otherwise, Tj is rolled back (Tj is wounded by Ti).
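The two time stamp schemes differ only in which transaction waits and which is rolled back. A minimal sketch of both decisions follows; the function names and the returned strings are choices made for this illustration, and a smaller time stamp is taken to mean an older transaction.

```python
def wait_die(ts_requester, ts_holder):
    """Non-preemptive: an older requester waits, a younger requester dies."""
    if ts_requester < ts_holder:
        return "requester waits"
    return "requester rolls back"

def wound_wait(ts_requester, ts_holder):
    """Preemptive: a younger requester waits, an older requester wounds the holder."""
    if ts_requester > ts_holder:
        return "requester waits"
    return "holder rolls back"

# Hypothetical time stamps: 5 is the older transaction, 10 the younger one.
older_requests_wait_die = wait_die(5, 10)
younger_requests_wait_die = wait_die(10, 5)
older_requests_wound_wait = wound_wait(5, 10)
younger_requests_wound_wait = wound_wait(10, 5)
```

In both schemes it is always the younger transaction that is rolled back, so the oldest transaction is never rolled back by either rule.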

7.8.1.3. Time out based scheme
Another simple technique is based on lock time outs. In this approach, a transaction that has requested a lock waits for at most a specified amount of time. If the lock has not been granted within that time, the transaction is said to time out; it rolls itself back and restarts.
Disadvantages:
i. One or more of the transactions involved in a deadlock must time out and roll back before the others can proceed.
ii. Too short a wait results in transaction rollback even when there is no deadlock.
iii. Such unnecessary rollbacks lead to wasted resources.
iv. Starvation is also a possibility with this scheme.

7.8.2. Deadlock detection and recovery
If a system does not employ some protocol that ensures deadlock freedom, then a detection and recovery scheme must be used. In this scheme the system determines whether a deadlock has occurred; if one has, the system must attempt to recover from the deadlock. To do this, the system must:
i. Maintain information about the current allocation of data items to transactions, as well as any outstanding data item requests.
ii. Provide an algorithm that uses this information to determine whether the system has entered a deadlock state.
iii. Recover from the deadlock, if a deadlock exists.

7.8.2.1. Deadlock detection
To describe deadlocks we use a directed graph called a wait-for graph. The graph consists of a pair G = (V, E), where V is a set of vertices (all the transactions in the system) and E is a set of edges.

7.8.2.1.1. Directed graph
A directed edge Ti → Tj means that Ti is waiting for transaction Tj to release a data item that it needs. A deadlock exists in the system if and only if the wait-for graph contains a cycle. Each transaction involved in the cycle is said to be deadlocked. To detect deadlocks, the system maintains the wait-for graph and searches for a cycle in the graph.
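Searching the wait-for graph for a cycle can be sketched as a depth-first search. The representation below, a dictionary mapping each transaction to the set of transactions it waits for, and the name find_deadlock are assumptions for this example; a real system would maintain the graph incrementally.

```python
def find_deadlock(wait_for):
    """Return the transactions on some cycle of the wait-for graph, or None."""
    def dfs(node, path):
        if node in path:
            # The node is already on the current path: the cycle starts there.
            return path[path.index(node):]
        for nxt in wait_for.get(node, ()):
            cycle = dfs(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in wait_for:
        cycle = dfs(start, [])
        if cycle:
            return cycle
    return None

# T1 waits for T2, T2 waits for T3, and T3 waits for T1: a deadlock.
deadlocked = find_deadlock({"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}})
# T1 waits for T2, but T2 waits for nobody: no deadlock.
no_deadlock = find_deadlock({"T1": {"T2"}, "T2": set()})
```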


7.8.2.2. Recovery from the deadlock
When the system determines that a deadlock exists, it must recover from the deadlock. The most common solution is to roll back one or more transactions to break the deadlock. The following actions need to be taken:
1. Select a victim: decide which transaction is to be rolled back. The choice may depend on:
a. How long the transaction has been running and how much of its task it has completed.
b. How many data items the transaction has used.
c. How many more data items the transaction needs in order to complete.
d. How many transactions will be involved in the rollback.
2. Rollback: once we have decided that a particular transaction must be rolled back, we must determine how far this transaction should be rolled back. Rolling back only as far as necessary to break the deadlock requires the system to maintain additional information about the state of all running transactions.
3. Starvation: if the same transaction is always picked as the victim, it never completes its designated task. This situation is called starvation.

8. SQL (STRUCTURED QUERY LANGUAGE)

SQL (Structured Query Language) is a database sublanguage for querying and modifying relational databases. It was developed by IBM Research in the mid-1970s and standardized by ANSI in 1986. The Relational Model defines two root languages for accessing a relational database: Relational Algebra and Relational Calculus. Relational Algebra is a low-level, operator-oriented language. Creating a query in Relational Algebra involves combining relational operators using algebraic notation. Relational Calculus is a high-level, declarative language. Creating a query in Relational Calculus involves describing what results are desired. SQL is a version of Relational Calculus. The basic structure in SQL is the statement. Semicolons separate multiple SQL statements.

8.1. DDL Statements
DDL stands for data definition language. DDL statements are SQL statements that define or alter a data structure such as a table; they are used to define the database structure or schema. Some examples:
CREATE - creates objects in the database
ALTER - alters the structure of the database
DROP - deletes objects from the database
TRUNCATE - removes all records from a table, and also releases the space allocated for the records
COMMENT - adds comments to the data dictionary
RENAME - renames an object

8.1.1. Implicit commits

In Oracle, a (successful) DDL statement implicitly commits a transaction:
SQL> create table ddl_test_1 (a number);
SQL> insert into ddl_test_1 values (1);
SQL> commit;


SQL> insert into ddl_test_1 values (2);
SQL> create table ddl_test_2 (a number);
SQL> rollback;
The create table statement implicitly committed the transaction. The insertion of the value 2 into ddl_test_1 cannot be rolled back any more.
SQL> select * from ddl_test_1;
         A
----------
         1
         2

8.1.2. Data dictionary

Since DDL changes the definitions of database objects, a DDL statement is always reflected in the data dictionary.

8.2. DML
Data Manipulation Language (DML) statements are used for managing data within schema objects. Some examples:
SELECT - retrieves data from a database
INSERT - inserts data into a table
UPDATE - updates existing data within a table
DELETE - deletes records from a table (all records if no condition is given); the space for the records remains allocated
MERGE - UPSERT operation (insert or update)
CALL - calls a PL/SQL or Java subprogram
EXPLAIN PLAN - explains the access path to data
LOCK TABLE - controls concurrency

8.3. Language Structure


SQL is a keyword based language. Each statement begins with a unique keyword. SQL statements consist of clauses which begin with a keyword. SQL syntax is not case sensitive. The other lexical elements of SQL statements are:

names -- names of database elements: tables, columns, views, users, schemas; names must begin with a letter (a - z) and may contain digits (0 - 9) and underscore (_)
literals -- quoted strings, numeric values, date-time values
delimiters -- + - , ( ) = < > <= >= <> . * / || ? ;

Basic database objects (tables, views) can optionally be qualified by schema name. A dot (".") separates qualifiers:

schema-name.table-name

Column names can be qualified by table name, with optional schema qualification.

8.4. Basic SQL Queries

There are 3 basic categories of SQL statements:

SQL-Data Statements -- query and modify tables and columns
  o SELECT Statement -- query tables and views in the database
  o INSERT Statement -- add rows to tables
  o UPDATE Statement -- modify columns in table rows
  o DELETE Statement -- remove rows from tables
SQL-Transaction Statements -- control transactions
  o COMMIT Statement -- commit the current transaction
  o ROLLBACK Statement -- roll back the current transaction
SQL-Schema Statements -- maintain schema (catalog)
  o CREATE TABLE Statement -- create tables
  o CREATE VIEW Statement -- create views
  o DROP TABLE Statement -- drop tables
  o DROP VIEW Statement -- drop views
  o GRANT Statement -- grant privileges on tables and views to other users
  o REVOKE Statement -- revoke privileges on tables and views from other users

8.4.1. SQL-Data Statements

8.4.1.1. SELECT Statement

The SQL SELECT statement queries data from tables in the database. The statement begins with the SELECT keyword. The basic SELECT statement has 3 clauses:

SELECT FROM WHERE

The SELECT clause specifies the table columns that are retrieved. The FROM clause specifies the tables accessed. The WHERE clause specifies which table rows are used. The WHERE clause is optional; if missing, all table rows are used. For example:

SELECT name FROM s WHERE city='Rome'
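The three clauses can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS (the notes do not prescribe one) and a supplier table s holding the three supplier rows used in the UPDATE example later in this section.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for the DBMS; table s mirrors
# the supplier rows shown in the UPDATE example of these notes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno TEXT PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?)",
    [("S1", "Pierre", "Paris"), ("S2", "John", "London"), ("S3", "Mario", "Rome")],
)

# SELECT names the columns, FROM names the table, WHERE filters the rows.
rome_names = [row[0] for row in conn.execute("SELECT name FROM s WHERE city = 'Rome'")]

# Without a WHERE clause every row of the table is used.
all_names = [row[0] for row in conn.execute("SELECT name FROM s ORDER BY sno")]

print(rome_names, all_names)
```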

8.4.1.2. INSERT Statement

The INSERT Statement adds one or more rows to a table. It has two formats:
INSERT INTO table-1 [(column-list)] VALUES (value-list)

and,
INSERT INTO table-1 [(column-list)] (query-specification)

The first form inserts a single row into table-1 and explicitly specifies the column values for the row. The second form uses the result of query-specification to insert one or more rows into table-1. The result rows from the query are the rows added to the insert table. Note: the query cannot reference table-1.

INSERT Examples
INSERT INTO p (pno, color) VALUES ('P4', 'Brown')

Before:

pno   descr    color
P1    Widget   Blue
P2    Widget   Red
P3    Dongle   Green

After:

pno   descr    color
P1    Widget   Blue
P2    Widget   Red
P3    Dongle   Green
P4    NULL     Brown

INSERT INTO sp SELECT s.sno, p.pno, 500 FROM s, p WHERE p.color='Green' AND s.city='London'

Before:

sno   pno   qty
S1    P1    NULL
S2    P1    200
S3    P1    1000
S3    P2    200

After:

sno   pno   qty
S1    P1    NULL
S2    P1    200
S3    P1    1000
S3    P2    200
S2    P3    500
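Both INSERT forms can be reproduced with a short sketch. It assumes SQLite through Python's sqlite3 module as the DBMS, and it loads the s, p and sp rows from the Before tables of this section; the column types are illustrative guesses.

```python
import sqlite3

# Assumption: SQLite stands in for the DBMS; column types are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE s  (sno TEXT, name TEXT, city TEXT);
CREATE TABLE p  (pno TEXT, descr TEXT, color TEXT);
CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER);
INSERT INTO s  VALUES ('S1','Pierre','Paris'), ('S2','John','London'), ('S3','Mario','Rome');
INSERT INTO p  VALUES ('P1','Widget','Blue'), ('P2','Widget','Red'), ('P3','Dongle','Green');
INSERT INTO sp VALUES ('S1','P1',NULL), ('S2','P1',200), ('S3','P1',1000), ('S3','P2',200);
""")

# Form 1: explicit value list; the unlisted column descr is set to NULL.
conn.execute("INSERT INTO p (pno, color) VALUES ('P4', 'Brown')")
new_part = conn.execute("SELECT pno, descr, color FROM p WHERE pno = 'P4'").fetchone()

# Form 2: the inserted rows are the result of a query
# (every London supplier paired with every green part).
conn.execute(
    "INSERT INTO sp SELECT s.sno, p.pno, 500 FROM s, p "
    "WHERE p.color = 'Green' AND s.city = 'London'"
)
added = conn.execute("SELECT sno, pno, qty FROM sp WHERE qty = 500").fetchall()

print(new_part, added)
```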

8.4.1.3. UPDATE Statement

The UPDATE statement modifies columns in selected table rows. It has the following general format:

UPDATE table-1 SET set-list [WHERE predicate]

The optional WHERE Clause has the same format as in the SELECT Statement. The set-list contains assignments of new values for selected columns.

UPDATE Examples
UPDATE sp SET qty = qty + 20

Before:

sno   pno   qty
S1    P1    NULL
S2    P1    200
S3    P1    1000
S3    P2    200

After:

sno   pno   qty
S1    P1    NULL
S2    P1    220
S3    P1    1020
S3    P2    220

UPDATE s SET name = 'Tony', city = 'Milan' WHERE sno = 'S3'

Before:

sno   name     city
S1    Pierre   Paris
S2    John     London
S3    Mario    Rome

After:

sno   name     city
S1    Pierre   Paris
S2    John     London
S3    Tony     Milan
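The two UPDATE examples can be checked with the following sketch, again assuming SQLite via Python's sqlite3 module and the sample rows from the Before tables. It also makes visible why the S1 row keeps a NULL quantity: NULL + 20 evaluates to NULL.

```python
import sqlite3

# Assumption: SQLite stands in for the DBMS; rows copy the Before tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER);
CREATE TABLE s  (sno TEXT, name TEXT, city TEXT);
INSERT INTO sp VALUES ('S1','P1',NULL), ('S2','P1',200), ('S3','P1',1000), ('S3','P2',200);
INSERT INTO s  VALUES ('S1','Pierre','Paris'), ('S2','John','London'), ('S3','Mario','Rome');
""")

# No WHERE clause: every row is updated. NULL + 20 is still NULL.
conn.execute("UPDATE sp SET qty = qty + 20")
qtys = [row[0] for row in conn.execute("SELECT qty FROM sp ORDER BY rowid")]

# WHERE restricts the update to one supplier; the set-list changes two columns.
conn.execute("UPDATE s SET name = 'Tony', city = 'Milan' WHERE sno = 'S3'")
s3 = conn.execute("SELECT name, city FROM s WHERE sno = 'S3'").fetchone()

print(qtys, s3)
```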

8.4.1.4. DELETE Statement

The DELETE Statement removes selected rows from a table. It has the following general format:
DELETE FROM table-1 [WHERE predicate]

The optional WHERE Clause has the same format as in the SELECT Statement.

DELETE Examples
DELETE FROM sp WHERE pno = 'P1'

Before:

sno   pno   qty
S1    P1    NULL
S2    P1    200
S3    P1    1000
S3    P2    200

After:

sno   pno   qty
S3    P2    200

DELETE FROM p WHERE pno NOT IN (SELECT pno FROM sp)

Before:

pno   descr    color
P1    Widget   Blue
P2    Widget   Red
P3    Dongle   Green

After:

pno   descr    color
P1    Widget   Blue
P2    Widget   Red
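A sketch of both DELETE examples follows, assuming SQLite via Python's sqlite3 module. Each example in the notes starts from the original Before tables, so the subquery delete is run first here, while sp still references P1 and P2.

```python
import sqlite3

# Assumption: SQLite stands in for the DBMS; rows copy the Before tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE p  (pno TEXT, descr TEXT, color TEXT);
CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER);
INSERT INTO p  VALUES ('P1','Widget','Blue'), ('P2','Widget','Red'), ('P3','Dongle','Green');
INSERT INTO sp VALUES ('S1','P1',NULL), ('S2','P1',200), ('S3','P1',1000), ('S3','P2',200);
""")

# Remove parts that no row of sp refers to (only P3 qualifies).
conn.execute("DELETE FROM p WHERE pno NOT IN (SELECT pno FROM sp)")
parts = [row[0] for row in conn.execute("SELECT pno FROM p ORDER BY pno")]

# Remove every shipment of part P1.
conn.execute("DELETE FROM sp WHERE pno = 'P1'")
shipments = conn.execute("SELECT sno, pno, qty FROM sp").fetchall()

print(parts, shipments)
```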

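8.4.2. SQL-Transaction Statements

SQL-Transaction Statements control transactions in database access. This subset of SQL is called here the Data Control Language for SQL (SQL DCL); many references instead call COMMIT and ROLLBACK the Transaction Control Language (TCL) and reserve the name DCL for GRANT and REVOKE.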

8.4.2.1. COMMIT Statement

The COMMIT Statement terminates the current transaction and makes all changes under the transaction persistent. It commits the changes to the database. The COMMIT statement has the following general format:

COMMIT [WORK]

WORK is an optional keyword that does not change the semantics of COMMIT.

8.4.2.2. ROLLBACK Statement

The ROLLBACK Statement terminates the current transaction and rescinds all changes made under the transaction. It rolls back the changes to the database. The ROLLBACK statement has the following general format:
ROLLBACK [WORK]

WORK is an optional keyword that does not change the semantics of ROLLBACK.
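The effect of the two statements can be observed with a small sketch. It assumes SQLite via Python's sqlite3 module; the connection is opened in autocommit mode so that the BEGIN, COMMIT and ROLLBACK statements are sent exactly as written (BEGIN is SQLite's explicit way to start a transaction and is not part of the formats above).

```python
import sqlite3

# Assumption: SQLite stands in for the DBMS. isolation_level=None puts the
# sqlite3 module in autocommit mode, so transaction statements are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER)")
conn.execute("INSERT INTO sp VALUES ('S3', 'P2', 200)")

conn.execute("BEGIN")
conn.execute("UPDATE sp SET qty = qty + 20")
conn.execute("COMMIT")        # the update becomes persistent
after_commit = conn.execute("SELECT qty FROM sp").fetchone()[0]

conn.execute("BEGIN")
conn.execute("DELETE FROM sp")
conn.execute("ROLLBACK")      # the delete is rescinded
after_rollback = conn.execute("SELECT qty FROM sp").fetchone()[0]

print(after_commit, after_rollback)
```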

8.4.3. SQL-Schema Statements

SQL-Schema Statements provide maintenance of catalog objects for a schema: tables, views and privileges. This subset of SQL is also called the Data Definition Language for SQL (SQL DDL).

8.4.3.1. CREATE TABLE Statement

The CREATE TABLE Statement creates a new base table. It adds the table description to the catalog. A base table is a logical entity with persistence. The logical description of a base table consists of:

Schema -- the logical database schema the table resides in
Table Name -- a name unique among tables and views in the Schema
Column List -- an ordered list of column declarations (name, data type)
Constraints -- a list of constraints on the contents of the table

The CREATE TABLE Statement has the following general format:


CREATE TABLE table-name ({column-descr|constraint} [,{column-descr|constraint}]...)

table-name is the new name for the table. column-descr is a column declaration. constraint is a table constraint.
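As an illustration of the general format, the sketch below creates a parts table with three column declarations and two table constraints. It assumes SQLite via Python's sqlite3 module; the column types and the list of allowed colors are hypothetical choices, not taken from the notes.

```python
import sqlite3

# Assumption: SQLite stands in for the DBMS; types and the CHECK list
# are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE p (
        pno   TEXT NOT NULL,                                  -- column-descr
        descr TEXT,                                           -- column-descr
        color TEXT,                                           -- column-descr
        PRIMARY KEY (pno),                                    -- constraint
        CHECK (color IN ('Blue', 'Red', 'Green', 'Brown'))    -- constraint
    )
""")

# The catalog now holds the ordered column list of the new base table.
columns = [row[1] for row in conn.execute("PRAGMA table_info(p)")]

# A row that violates the CHECK constraint is rejected.
try:
    conn.execute("INSERT INTO p VALUES ('P9', 'Gadget', 'Pink')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(columns, rejected)
```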

8.4.3.2. CREATE VIEW Statement

The CREATE VIEW statement creates a new database view. A view is effectively a SQL query stored in the catalog. The CREATE VIEW statement has the following general format:

CREATE VIEW view-name [ ( column-list ) ] AS query-1 [ WITH [CASCADED|LOCAL] CHECK OPTION ]

view-name is the name for the new view. column-list is an optional list of names for the columns of the view, comma separated. query-1 is any SELECT statement without an ORDER BY clause. The optional WITH CHECK OPTION clause is a constraint on updatable views.

column-list must have the same number of columns as the select list in query-1. If column-list is omitted, all items in the select list of query-1 must be named. In either case, duplicate column names are not allowed for a view.

8.4.3.3. DROP TABLE Statement

The DROP TABLE Statement removes a previously created table and its description from the catalog. It has the following general format:
DROP TABLE table-name {CASCADE|RESTRICT}

table-name is the name of an existing base table in the current schema. The CASCADE and RESTRICT specifiers define the disposition of other objects dependent on the table. A base table may have two types of dependencies:

- A view whose query specification references the table being dropped.
- Another base table that references the table being dropped in a constraint (a CHECK constraint or REFERENCES constraint).

RESTRICT specifies that the table not be dropped if any dependencies exist. If dependencies are found, an error is returned and the table isn't dropped. CASCADE specifies that any dependencies are removed before the drop is performed:

- Views that reference the base table are dropped, and the sequence is repeated for their dependencies.
- Constraints in other tables that reference this table are dropped; the constraint is dropped but the table retained.

8.4.3.4. DROP VIEW Statement

The DROP VIEW Statement removes a previously created view and its description from the catalog. It has the following general format:
DROP VIEW view-name {CASCADE|RESTRICT}

view-name is the name of an existing view in the current schema. The CASCADE and RESTRICT specifiers define the disposition of other objects dependent on the view. A view may have two types of dependencies:

- A view whose query specification references the view being dropped.
- A base table that references the view being dropped in a constraint (a CHECK constraint).

RESTRICT specifies that the view not be dropped if any dependencies exist. If dependencies are found, an error is returned and the view isn't dropped. CASCADE specifies that any dependencies are removed before the drop is performed:

- Views that reference the view being dropped are dropped, and the sequence is repeated for their dependencies.
- Constraints in base tables that reference this view are dropped; the constraint is dropped but the table retained.
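The life cycle of a view, from CREATE VIEW with a column-list to DROP VIEW, can be sketched as follows. The example assumes SQLite via Python's sqlite3 module; the view name london_suppliers and its column names are hypothetical. SQLite's DROP VIEW does not take the CASCADE or RESTRICT specifiers and SQLite has no WITH CHECK OPTION, so those parts of the formats above are not exercised.

```python
import sqlite3

# Assumption: SQLite stands in for the DBMS; the view and its column
# names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE s (sno TEXT, name TEXT, city TEXT);
INSERT INTO s VALUES ('S1','Pierre','Paris'), ('S2','John','London'), ('S3','Mario','Rome');
""")

# The column-list has as many names as the select list of the query.
conn.execute(
    "CREATE VIEW london_suppliers (supplier_no, supplier_name) AS "
    "SELECT sno, name FROM s WHERE city = 'London'"
)

# The view is queried like a table; the stored query runs underneath.
rows = conn.execute("SELECT supplier_no, supplier_name FROM london_suppliers").fetchall()

# Dropping the view removes its description from the catalog.
conn.execute("DROP VIEW london_suppliers")
views_left = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'view'"
).fetchone()[0]

print(rows, views_left)
```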

8.4.3.5. GRANT Statement

The GRANT Statement grants access privileges for database objects to other users. It has the following general format:
GRANT privilege-list ON [TABLE] object-list TO user-list

privilege-list is either ALL PRIVILEGES or a comma-separated list of privileges: SELECT, INSERT, UPDATE, DELETE. object-list is a comma-separated list of table and view names. user-list is either PUBLIC or a comma-separated list of user names.

The GRANT statement grants each privilege in privilege-list for each object (table) in object-list to each user in user-list. In general, the access privileges apply to all columns in the table or view, but it is possible to specify a column list with the UPDATE privilege specifier:
UPDATE [ ( column-1 [, column-2] ... ) ]

If the optional column list is specified, UPDATE privileges are granted for those columns only. The user-list may specify PUBLIC. This is a general grant, applying to all users (and future users) in the catalog. Privileges granted are revoked with the REVOKE Statement.

The optional specifier WITH GRANT OPTION may follow user-list in the GRANT statement. WITH GRANT OPTION specifies that, in addition to access privileges, the privilege to grant those privileges to other users is granted.

GRANT Statement Examples

GRANT SELECT ON s,sp TO PUBLIC
GRANT SELECT,INSERT,UPDATE(color) ON p TO art,nan
GRANT SELECT ON supplied_parts TO sam WITH GRANT OPTION

8.4.3.6. REVOKE Statement

The REVOKE Statement revokes access privileges for database objects previously granted to other users. It has the following general format:
REVOKE privilege-list ON [TABLE] object-list FROM user-list

The REVOKE Statement revokes each privilege in privilege-list for each object (table) in object-list from each user in user-list. All privileges must have been previously granted. The user-list may specify PUBLIC. This must apply to a previous GRANT TO PUBLIC.

REVOKE Statement Examples

REVOKE SELECT ON s,sp FROM PUBLIC
REVOKE SELECT,INSERT,UPDATE(color) ON p FROM art,nan
REVOKE SELECT ON supplied_parts FROM sam
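The "each privilege, each object, each user" rule of GRANT and REVOKE is a cross product, which the toy model below makes explicit. This is not how any DBMS stores privileges: the grants set and the grant, revoke and allowed helpers are hypothetical names used only to illustrate the rule, and column-level UPDATE and WITH GRANT OPTION are left out.

```python
from itertools import product

# Toy privilege catalog: a set of (privilege, object, user) triples.
# Hypothetical model for illustration only, not a real DBMS mechanism.
grants = set()

def grant(privileges, objects, users):
    """Record every privilege for every object for every user."""
    grants.update(product(privileges, objects, users))

def revoke(privileges, objects, users):
    """Remove every privilege for every object from every user."""
    grants.difference_update(product(privileges, objects, users))

def allowed(privilege, obj, user):
    """A grant to PUBLIC applies to all users."""
    return (privilege, obj, user) in grants or (privilege, obj, "PUBLIC") in grants

grant(["SELECT"], ["s", "sp"], ["PUBLIC"])
grant(["SELECT", "INSERT"], ["p"], ["art", "nan"])
before = allowed("INSERT", "p", "nan")

revoke(["SELECT", "INSERT"], ["p"], ["art", "nan"])
after = allowed("INSERT", "p", "nan")
public_read = allowed("SELECT", "sp", "sam")

print(before, after, public_read)
```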

8.5. Union, Intersect and Except

The UNION, EXCEPT, and INTERSECT operators all operate on multiple result sets to return a single result set:

- The UNION operator combines the output of two query expressions into a single result set. Query expressions are executed independently, and their output is combined into a single result table.
- The EXCEPT operator evaluates the output of two query expressions and returns the difference between the results. The result set contains all rows returned from the first query expression except those rows that are also returned from the second query expression.
- The INTERSECT operator evaluates the output of two query expressions and returns only the rows common to each.

The following figure illustrates these concepts with Venn diagrams, in which the shaded portion indicates the result set.

[Figure: Venn diagrams of the UNION, EXCEPT, and INTERSECT result sets]

UNION combines the rows from two or more result sets into a single result set.
EXCEPT evaluates two result sets and returns all rows from the first set that are not also contained in the second set.
INTERSECT computes a result set that contains the common rows from two result sets.

UNION, INTERSECT, and EXCEPT operators can be combined in a single UNION expression. In statements that include multiple operators, the default order of evaluation (precedence) for these operators is left to right; however, the INTERSECT operator is evaluated before UNION and EXCEPT. The order of evaluation can be modified with parentheses.

8.5.1. ALL

If ALL is specified, duplicate rows returned by union_expression are retained. If two query expressions return the same row, two copies of the row are returned in the final result. If ALL is not specified, duplicate rows are eliminated from the result set.

In statements with multiple UNION, EXCEPT, and INTERSECT operators in which the ALL keyword is used, the order of evaluation can affect the results. The placement of the ALL keyword in relation to the query evaluation determines the duplicates that are retained and eliminated. If the last operation performed does not contain the ALL keyword, any duplicates retained from previous evaluations are eliminated.

Examples:

This example contains a simple UNION operation that combines data from the City and Hq_City columns in the Store and Market tables:

select hq_city as ca_cities
from market
where hq_state like 'CA%'
union
select city
from store
where state like 'CA%'

CA_CITIES
Cupertino
Los Angeles
Los Gatos
Oakland
San Francisco
San Jose

This example adds the ALL keyword, so duplicate Hq_City and City entries are retained:

select hq_city as ca_cities
from market
where hq_state like 'CA%'
union all
select city
from store
where state like 'CA%'

CA_CITIES
San Jose
San Francisco
Oakland
Los Angeles
Los Gatos
San Jose
Cupertino
Los Angeles
San Jose

The following example replaces the UNION operator with EXCEPT; consequently, only those California cities that are in the Market table but not the Store table are returned:

select hq_city as ca_cities
from market
where hq_state like 'CA%'
except
select city
from store
where state like 'CA%'

CA_CITIES
Oakland
San Francisco

The following example replaces the UNION operator with INTERSECT; consequently, only those California cities that are in both the Market table and the Store table are returned:

select hq_city as ca_cities
from market
where hq_state like 'CA%'
intersect
select city
from store
where state like 'CA%'

CA_CITIES
Los Angeles
San Jose

The following query uses multiple INTERSECT operations to return a list of common key values in five different tables:

select prodkey as common_keys from product
intersect
select classkey from class
intersect
select promokey from promotion
intersect
select perkey from period
intersect
select storekey from store

COMMON_KEYS
1
3
4
5
12
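The four city queries can be replayed with the sketch below, assuming SQLite via Python's sqlite3 module. The Market and Store rows are an assumption chosen to be consistent with the result sets shown above; the state filter is dropped for brevity, and an ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

# Assumption: SQLite stands in for the DBMS; the rows are inferred from the
# result sets in the notes and are not the original sample database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE market (hq_city TEXT);
CREATE TABLE store  (city TEXT);
INSERT INTO market VALUES ('San Jose'), ('San Francisco'), ('Oakland'), ('Los Angeles');
INSERT INTO store  VALUES ('Los Gatos'), ('San Jose'), ('Cupertino'), ('Los Angeles'), ('San Jose');
""")

def cities(operator):
    """Combine the two city columns with the given set operator."""
    sql = (
        "SELECT hq_city AS ca_cities FROM market "
        f"{operator} SELECT city FROM store ORDER BY 1"
    )
    return [row[0] for row in conn.execute(sql)]

union = cities("UNION")            # duplicates eliminated
union_all = cities("UNION ALL")    # duplicates retained
difference = cities("EXCEPT")      # in market but not in store
common = cities("INTERSECT")       # in both tables

print(union, union_all, difference, common, sep="\n")
```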

The following example uses parentheses to force the order of evaluation in a query that contains a UNION operation and an INTERSECT operation. The parentheses force the UNION operator to be evaluated first; without them, the INTERSECT operator would take precedence and the result set might differ.

(select prod_name from product natural join sales_canadian
union
select prod_name from product natural join sales_mexican)
intersect
select prod_name from product natural join sales

Because parentheses override the default order of evaluation for UNION, EXCEPT, and INTERSECT operations, they sometimes determine whether duplicate rows are retained or eliminated from the final result set. The following two queries illustrate this point.

8.6. Cursors

Every SQL statement executed by the RDBMS has a private SQL area that contains information about the SQL statement and the set of data returned. In PL/SQL, a cursor is a name assigned to a specific private SQL area for a specific SQL statement. There can be either static cursors, whose SQL statement is determined at compile time, or dynamic cursors, whose SQL statement is determined at runtime. Static cursors are covered in greater detail in this section. Dynamic cursors in PL/SQL are implemented via the built-in package DBMS_SQL.

8.6.1. Explicit Cursors

Explicit cursors are SELECT statements that are DECLAREd explicitly in the declaration section of the current block or in a package specification. Use OPEN, FETCH, and CLOSE in the execution or exception sections of your programs.

8.6.1.1. Declaring explicit cursors

To use an explicit cursor, you must first declare it in the declaration section of a block or package. There are three types of explicit cursor declarations:

A cursor without parameters, such as:

CURSOR company_cur IS SELECT company_id FROM company;

A cursor that accepts arguments through a parameter list:

CURSOR company_cur (id_in IN NUMBER) IS SELECT name FROM company WHERE company_id = id_in;

A cursor header that contains a RETURN clause in place of the SELECT statement:

CURSOR company_cur (id_in IN NUMBER) RETURN company%ROWTYPE IS SELECT * FROM company;

This technique can be used in packages to hide the implementation of the cursor in the package body.

8.6.2. Implicit Cursors

Whenever a SQL statement is directly in the execution or exception section of a PL/SQL block, you are working with implicit cursors. These statements include INSERT, UPDATE, DELETE, and SELECT INTO statements. Unlike explicit cursors, implicit cursors do not need to be declared, OPENed, FETCHed, or CLOSEd. SELECT statements handle the %FOUND and %NOTFOUND attributes differently from explicit cursors. When an implicit SELECT statement does not return any rows, PL/SQL immediately raises the NO_DATA_FOUND exception and control passes to the exception section. When an implicit SELECT returns more than one row, PL/SQL immediately raises the TOO_MANY_ROWS exception and control passes to the exception section. Implicit cursor attributes are referenced via the SQL cursor. For example:
BEGIN
   UPDATE activity SET last_accessed = SYSDATE WHERE UID = user_id;
   IF SQL%NOTFOUND THEN
      INSERT INTO activity_log (uid, last_accessed) VALUES (user_id, SYSDATE);
   END IF;
END;
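The same "update, and insert if nothing was found" pattern can be sketched outside PL/SQL. The example below is an analogue, not PL/SQL: it assumes SQLite via Python's sqlite3 module, where the rowcount attribute of the cursor returned by execute plays the role of the SQL%NOTFOUND test. For brevity it uses a single activity table, and the touch function is a hypothetical helper.

```python
import sqlite3

# Assumption: SQLite stands in for the DBMS; this is an analogue of the
# PL/SQL block above, using one table instead of two.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity (uid INTEGER PRIMARY KEY, last_accessed TEXT)")

def touch(uid):
    """Refresh the access time of uid, inserting a row if none exists."""
    cur = conn.execute(
        "UPDATE activity SET last_accessed = datetime('now') WHERE uid = ?", (uid,)
    )
    # rowcount == 0 corresponds to SQL%NOTFOUND after the UPDATE.
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO activity (uid, last_accessed) VALUES (?, datetime('now'))",
            (uid,),
        )
        return "inserted"
    return "updated"

first = touch(7)
second = touch(7)
print(first, second)
```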

8.7. Triggers

Triggers are programs that execute in response to changes in table data or certain database events. There is a predefined set of events that can be "hooked" with a trigger, enabling you to integrate your own processing with that of the database. A triggering event fires or executes the trigger.

8.7.1. Creating Triggers

The syntax for creating a trigger is:


CREATE [OR REPLACE] TRIGGER trigger_name
BEFORE | AFTER | INSTEAD OF trigger_event
ON [ NESTED TABLE nested_table_column OF view ] | table_or_view_reference | DATABASE
trigger_body;

INSTEAD OF triggers are valid only on views, starting with Oracle8. Creating a trigger on a nested table column requires Oracle8i. Trigger events are defined in the following table.

Trigger Event -- Description

INSERT -- Fires whenever a row is added to the table_reference.
UPDATE -- Fires whenever an UPDATE changes the table_reference. UPDATE triggers can additionally specify an OF clause to restrict firing to updates OF certain columns.
DELETE -- Fires whenever a row is deleted from the table_reference. Does not fire on TRUNCATE of the table.
CREATE (Oracle8i) -- Fires whenever a CREATE statement adds a new object to the database. In this context, objects are things like tables or packages (found in ALL_OBJECTS). Can apply to a single schema or the entire database.
ALTER (Oracle8i) -- Fires whenever an ALTER statement changes a database object. In this context, objects are things like tables or packages (found in ALL_OBJECTS). Can apply to a single schema or the entire database.
DROP (Oracle8i) -- Fires whenever a DROP statement removes an object from the database. In this context, objects are things like tables or packages (found in ALL_OBJECTS). Can apply to a single schema or the entire database.
SERVERERROR (Oracle8i) -- Fires whenever a server error message is logged. Only AFTER triggers are allowed in this context.
LOGON (Oracle8i) -- Fires whenever a session is created (a user connects to the database). Only AFTER triggers are allowed in this context.
LOGOFF (Oracle8i) -- Fires whenever a session is terminated (a user disconnects from the database). Only BEFORE triggers are allowed in this context.
STARTUP (Oracle8i) -- Fires when the database is opened. Only AFTER triggers are allowed in this context.
SHUTDOWN (Oracle8i) -- Fires when the database is closed. Only BEFORE triggers are allowed in this context.
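The UPDATE event with an OF clause can be tried with a small sketch. It assumes SQLite via Python's sqlite3 module rather than Oracle, so the syntax differs from the format above (SQLite wraps the trigger body in BEGIN ... END and has no OR REPLACE); the audit table sp_audit and the trigger name sp_qty_audit are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for the DBMS; sp_audit and sp_qty_audit
# are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER);
CREATE TABLE sp_audit (sno TEXT, pno TEXT, old_qty INTEGER, new_qty INTEGER);
INSERT INTO sp VALUES ('S2','P1',200), ('S3','P2',200);

-- AFTER trigger restricted to updates OF the qty column.
CREATE TRIGGER sp_qty_audit AFTER UPDATE OF qty ON sp
BEGIN
    INSERT INTO sp_audit VALUES (OLD.sno, OLD.pno, OLD.qty, NEW.qty);
END;
""")

conn.execute("UPDATE sp SET qty = qty + 20 WHERE sno = 'S2'")  # fires the trigger
conn.execute("UPDATE sp SET pno = 'P3' WHERE sno = 'S3'")      # qty not updated: no firing

audit = conn.execute("SELECT * FROM sp_audit").fetchall()
print(audit)
```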

Triggers can fire BEFORE or AFTER the triggering event. AFTER data triggers are slightly more efficient than BEFORE triggers.

8.8. Dynamic SQL

Dynamic SQL is a programming technique that enables you to build SQL statements dynamically at runtime. You can create more general purpose, flexible applications by using dynamic SQL because the full text of a SQL statement may be unknown at compilation. For example, dynamic SQL lets you create a procedure that operates on a table whose name is not known until runtime. Oracle includes two ways to implement dynamic SQL in a PL/SQL application:

- Native dynamic SQL, where you place dynamic SQL statements directly into PL/SQL blocks.
- Calling procedures in the DBMS_SQL package.

Static SQL statements do not change from execution to execution. The full texts of static SQL statements are known at compilation, which provides the following benefits:

- Successful compilation verifies that the SQL statements reference valid database objects.
- Successful compilation verifies that the necessary privileges are in place to access the database objects.
- Performance of static SQL is generally better than dynamic SQL.
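The idea of a routine that works on a table whose name is known only at runtime can be sketched as follows. This is not Oracle's native dynamic SQL or DBMS_SQL; it assumes SQLite via Python's sqlite3 module, and count_rows is a hypothetical helper. Because no compiler has checked the statement, the sketch validates the table name against the catalog before building the text, which mirrors the checks that static SQL gets at compilation.

```python
import sqlite3

# Assumption: SQLite stands in for the DBMS; count_rows is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE s (sno TEXT, name TEXT, city TEXT);
CREATE TABLE p (pno TEXT, descr TEXT, color TEXT);
INSERT INTO s VALUES ('S1','Pierre','Paris'), ('S2','John','London');
INSERT INTO p VALUES ('P1','Widget','Blue');
""")

def count_rows(table_name):
    """Count the rows of a table whose name is supplied at runtime."""
    # Table names cannot be bound as parameters, so check the name
    # against the catalog before placing it in the statement text.
    known = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    }
    if table_name not in known:
        raise ValueError(f"unknown table: {table_name}")
    sql = f'SELECT COUNT(*) FROM "{table_name}"'  # statement text built at runtime
    return conn.execute(sql).fetchone()[0]

print(count_rows("s"), count_rows("p"))
```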

9. QBE

QBE stands for "Query By Example." It is a feature included with various database applications that provides a user-friendly method of running database queries. Without QBE, a user typically must write input commands using correct SQL (Structured Query Language) syntax. This is a standard language that nearly all database programs support. However, if the syntax is slightly incorrect, the query may return the wrong results or may not run at all.

The Query By Example feature provides a simple interface for a user to enter queries. Instead of writing an entire SQL command, the user can just fill in blanks or select items to define the query she wants to perform. For example, a user may want to select an entry from a table called "Table1" with an ID of 123. Using SQL, the user would need to input the command "SELECT * FROM Table1 WHERE ID = 123". The QBE interface may allow the user to just click on Table1, type "123" in the ID field and click "Search."

QBE is offered with most database programs, though the interface often differs between applications. For example, Microsoft Access has a QBE interface known as "Query Design View" that is completely graphical. The phpMyAdmin application used with MySQL offers a Web-based interface where users can select a query operator and fill in blanks with search terms. Whatever QBE implementation is provided with a program, the purpose is the same: to make it easier to run database queries and to avoid the frustrations of SQL errors.

10. QUERY PROCESSING AND OPTIMIZATION

SQL query processing requires that the DBMS identify and execute a strategy for retrieving the results of the query. The SQL query determines what data is to be found, but does not define the method by which the data manager searches the database. Hence, query optimization is necessary for high-level relational queries and provides an opportunity for the DBMS to systematically evaluate alternative query execution strategies and to choose an optimal strategy.

10.1. Query Processing

The steps necessary for processing an SQL query are shown in Figure 2. The SQL query statement is first parsed into its constituent parts. The basic SELECT statement is formed from the three clauses SELECT, FROM, and WHERE. These parts identify the various tables and columns that participate in the data selection process. The WHERE clause is used to determine the order and precedence of the various attribute comparisons through a conditional expression. An example query to determine the names and addresses of all patients of Doctor 1234 is shown as query Q1 below. The WHERE clause uses a conjunctive clause which combines two attribute comparisons. More complex conditions are possible.

Q1: SELECT Name, Address, Dr_Name
    FROM Patient, Physician
    WHERE Patient.Doctor = Physician.Provider AND Physician.Provider = 1234

The query optimizer has the task of determining the optimum query execution plan. The term optimizer is actually a misnomer, because in many cases the optimum strategy is not found. The goal is to find a reasonably efficient strategy for executing the query. Finding the perfect strategy is usually too time consuming and can require detailed information on both the data storage structure and the actual data content. Usually this information is simply not available.

Once the execution plan is established, the query code is generated. Various techniques such as memory management, disk caching and parallel query execution can be used to improve the query performance. However, if the plan is not correct, then the query performance cannot be optimum.

10.2. Query Optimizing

There are two main techniques for query optimization. The first approach is to use a rule based or heuristic method for ordering the operations in a query execution strategy. The rules usually state general characteristics for data access, for example that it is more efficient to search a table using an index, if available, than to perform a full table scan. The second approach systematically estimates the cost of different execution strategies and chooses the least cost solution. This approach uses simple statistics about the data structure size and organization as arguments to a cost estimating equation. In practice most commercial database systems use a combination of both techniques.

10.3. Indexes

Consider, for example, a rule-based technique for query optimization that states that indexed access to data is preferable to a full table scan. Whenever a single condition specifies the selection, it is a simple matter to check whether or not an indexed access path exists for the attribute involved in the condition. Queries Q2 and Q3 are identical in syntactic structure. However, query Q2 uses an index on the patient number, whereas query Q3 has no index on the patient name. Assuming a balanced tree based index, query Q2 will in the worst case access on the order of log2(n) entries to locate the required row in the table. Conversely, query Q3 must search on average n/2 rows to find the entry during a full table scan, and n rows if the entry does not exist in the table. When n = 1,000,000 this is the difference between accessing 20 rows versus 500,000 rows for a successful search. Clearly, indexing can significantly improve query performance. However, it is not always practical to index every attribute in every table, thus certain types of user queries can respond quite differently from others.

Q2: SELECT * FROM Patient WHERE Patient.SSN = 11111111

In this query, the SSN attribute is the primary key index for the Patient table.

Q3: SELECT * FROM Patient WHERE Patient.Name = 'Doe, John Q.'

In this query, no index exists on the Name attribute. This requires a full table scan.

10.4. Selectivities

A more significant problem occurs when more than one condition is used in a conjunctive selection. In this case the selectivity of each condition must be considered. Selectivity is defined as the ratio of the number of rows that satisfy the condition to the total number of rows in the table. This is the probability that a row satisfies the condition, assuming a uniform distribution. If the selectivity is small, then only a few rows are selected by the condition, and it is desirable to use this condition first when retrieving records. To calculate selectivities, the database manager needs statistics on all table and attribute values. The heuristic rule states that, for multiple conjunctive conditions, the order of application is from smallest selectivity to largest.

Queries Q4 and Q5 illustrate multiple conditions in a conjunctive selection on the Patient table. Consider the case where the selectivity on Age is 10,000/1,000,000 = 0.01 (Age is assumed to be uniformly distributed between 0 and 100). The selectivity on Gender is 500,000/1,000,000 = 0.5 (Gender is assumed to be either M or F). It is clear that by using Age as the first retrieval condition, 10,000 rows are accessed for testing against the Gender condition, versus accessing 500,000 rows if the Gender attribute was chosen first. This is a 50 times performance difference. Selectivities can be used only if statistics are maintained by the database manager. If this information is not available, then the order of condition testing often defaults to the order of conditions as specified in the WHERE clause.

Q4: SELECT * FROM Patient WHERE Age = 45 AND Gender = 'M'

In this query, the Age attribute is specified first.

Q5: SELECT * FROM Patient WHERE Gender = M AND Age = 45 This query specifies Gender first. 10.5. Uniformity In many cases the actual data does not follow a uniform distribution. Consider the case where 95% of the patients live in the province of New Brunswick and the remaining 5% live in 199 different states and countries of the world. In this case there are 200 different values for the Area attribute. The selectivity of the Area attribute, assuming a uniform distribution, is 5,000/1,000,000 = 0.005. Thus, this attribute will be accessed first given any query with a conjunctive clause relating Area and Age. In the example below, query Q6 selects Area based on the province of Ontario. We estimate that (5% of 1,000,000) / 199, or 251 patients live in Ontario. These rows are accessed first and then tested against the Age condition. Conversely, query Q7 selects patients in the province of New Brunswick. In this case, 950,000 patient rows are accessed, or more than 3,700 times the number of rows for the Ontario example. The distribution was skewed sufficiently to result in a poor choice by the query optimizer. Clearly, non-uniform data distributions can significantly affect query performance. Q6: SELECT * FROM Patient WHERE Area = Ontario AND Age = 45 A uniform distribution for out of province residents predicts that 251 patients live in Ontario.

Q7: SELECT * FROM Patient WHERE Area = 'New Brunswick' AND Age = 45
The actual data has 950,000 patients living in New Brunswick.
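The effect of the skew can be checked with a short calculation. The sketch below uses only the figures quoted in this section; the helper name `estimation_error` is an assumption introduced for illustration.

```python
TOTAL_ROWS = 1_000_000
DISTINCT_AREAS = 200

# Uniform assumption: every Area value is equally likely.
uniform_estimate = TOTAL_ROWS // DISTINCT_AREAS

# Distribution described in the text: 95% in New Brunswick,
# the remaining 5% spread evenly over 199 other areas.
new_brunswick_rows = TOTAL_ROWS * 95 // 100
other_area_rows = round((TOTAL_ROWS - new_brunswick_rows) / (DISTINCT_AREAS - 1))

def estimation_error(actual: int, estimate: int) -> float:
    """How many times larger the actual row count is than the estimate."""
    return actual / estimate

# Rows touched by Q7 (New Brunswick) relative to Q6 (Ontario).
ratio_q7_vs_q6 = new_brunswick_rows / other_area_rows

print(uniform_estimate, other_area_rows, round(ratio_q7_vs_q6))
```

For New Brunswick the uniform estimate of 5,000 rows is off by a factor of 190, which is why statistics that capture the actual value distribution matter to an optimizer.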

10.6. Disjunctive Clauses

A disjunctive clause occurs when simple conditions are connected by the OR logical connective rather than AND. These clauses are much harder to process and optimize. For example, consider query Q8, which uses a disjunctive clause relating a specific doctor and the patient's area of residence. With such a condition, little optimization can be done because the rows satisfying the query are the union of the rows satisfying each of the individual conditions. If any one of the search conditions does not have an access path, then the query optimizer is compelled to choose a full table scan to satisfy the query. Performance can be improved only if an access path exists on every condition in the disjunctive clause. In this case, the row sets satisfying each condition can be found and then combined with a union operation that eliminates duplicate rows. However, set union operations can also be expensive. The customary way to implement a union is to sort the relations on the same attributes and then scan the sorted files to eliminate duplicate rows. Superficially, the differences between queries Q8 and Q9 appear trivial, yet the queries can have profound differences in performance. In many cases the use of disjunctive clauses in queries results in either a brute-force linear search of the table or a sort of a potentially large amount of data.

Q8: SELECT * FROM Patient WHERE Doctor = 1234 OR Area = 'Ontario'
This query combines one doctor's patients with all Ontario patients.

Q9: SELECT * FROM Patient WHERE Doctor = 1234 AND Area = 'Ontario'
This query identifies only the Ontario patients of a particular doctor.
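The difference between the two queries can be sketched with toy in-memory indexes. This is a simplified illustration, not how any particular DBMS implements it: the rows, the `build_index` helper, and the `union_sorted` helper are all hypothetical. It assumes an access path exists on both Doctor and Area, so the OR query becomes a sort-based union of two index lookups, while the AND query needs only one lookup followed by a filter.

```python
from collections import defaultdict

# Hypothetical in-memory Patient rows: (row_id, doctor, area).
patients = [
    (1, 1234, "Ontario"),
    (2, 1234, "New Brunswick"),
    (3, 5678, "Ontario"),
    (4, 5678, "Quebec"),
]

def build_index(rows, column):
    """Map each value of the given column to the list of row ids holding it."""
    index = defaultdict(list)
    for row in rows:
        index[row[column]].append(row[0])
    return index

doctor_index = build_index(patients, 1)
area_index = build_index(patients, 2)

def union_sorted(left, right):
    """Sort the combined row ids, then scan once to drop duplicates."""
    result = []
    for row_id in sorted(left + right):
        if not result or result[-1] != row_id:
            result.append(row_id)
    return result

# Q8 (OR): union of the two index lookups, duplicates removed.
q8_rows = union_sorted(doctor_index[1234], area_index["Ontario"])

# Q9 (AND): one index lookup, then test the second condition on those rows only.
ontario_ids = set(area_index["Ontario"])
q9_rows = [row_id for row_id in doctor_index[1234] if row_id in ontario_ids]

print(q8_rows, q9_rows)
```

Row 1 satisfies both conditions, so it appears in both lookups and must be removed once by the union step; that duplicate elimination is the extra cost the text refers to.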

10.7. Join Selectivities

The JOIN operation is one of the most time-consuming operations in query processing. A join operation matches two tables across domain-compatible attributes. One common technique for performing a join is a nested (inner-outer) loop, or brute-force, approach. In this case, for every row in the first table a scan of the second table is performed and every record is tested against the join condition. A second technique is to use an access structure or index to retrieve the matching records. In this case, for every row in the first table an index is used to access the matching records from the second table.

One factor that significantly affects the performance of the join is the percentage of rows in one table that will be joined with rows in the other table. This is called the join selection factor. This factor depends not only on the two tables to be joined, but also on the join fields if there are multiple join conditions between the two tables. For example, query Q10 joins each Physician row with the Patient rows. Each physician is expected to exist once in the Patient table (after all, a physician is also a patient), but 999,000 patient rows will not be joined.

Suppose indexes exist on each of the join attributes. There are two options for performing the join. The first retrieves each Patient row and then uses the index into the Physician table to find the matching record. In this case, no matching records will be found for those patients who are not also physicians. The second option first retrieves each Physician row and then uses the index into the Patient table to find the matching Patient row. In this case, every physician will have one matching patient row. The second option is clearly more efficient than the first, because the join selection factor of Physician with respect to the join condition is 1. Conversely, the Patient selection factor with respect to the same join condition is 1,000/1,000,000 = 0.001.
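The two options can be compared by counting index probes. The sketch below is a simplified model, not a DBMS implementation: Python sets stand in for the indexes, the table sizes are scaled down by a factor of 10 from the text (100,000 patients, 100 physicians) so it runs quickly while keeping the same ratio, and the function name `index_join` is hypothetical.

```python
def index_join(outer_keys, inner_index):
    """Index nested-loop join: probe the inner index once per outer row.

    Returns (number of index probes, number of matches found).
    """
    probes = 0
    matches = 0
    for key in outer_keys:
        probes += 1
        if key in inner_index:
            matches += 1
    return probes, matches

# Hypothetical SSN values; every physician is also a patient.
patient_ssns = range(100_000)
physician_ssns = range(0, 100_000, 1_000)  # 100 physicians

patient_index = set(patient_ssns)      # stands in for an index on Patient.SSN
physician_index = set(physician_ssns)  # stands in for an index on Physician.Dr_SSN

# Option 1: scan Patient, probe the Physician index.
option1_probes, option1_matches = index_join(patient_ssns, physician_index)

# Option 2: scan Physician, probe the Patient index.
option2_probes, option2_matches = index_join(physician_ssns, patient_index)

# Join selection factor = matched outer rows / total outer rows.
patient_factor = option1_matches / option1_probes
physician_factor = option2_matches / option2_probes

print(option1_probes, option2_probes, patient_factor, physician_factor)
```

Both options produce the same 100 joined rows, but the first performs 1,000 times as many index probes, which mirrors the 1,000/1,000,000 versus 1 selection factors given in the text.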
Choosing optimum join methods requires that various table sizes and other statistics be used to compute estimated join selectivities.

Q10: SELECT * FROM Patient, Physician WHERE Patient.SSN = Physician.Dr_SSN
Q11: SELECT * FROM Patient, Physician WHERE Physician.Dr_SSN = Patient.SSN
If join selectivities are not used, then these two queries can exhibit quite different performance.

10.8. Views

A view in SQL is a single table that is derived from other tables. A view can be considered as a virtual table or as a stored query. A view is often used to specify a frequently used query. This is of particular benefit if tables must be joined or
restricted. One difficulty with views is that a view can hide the query complexity from the user. For example, view V1 describes a virtual table that contains the same number of rows as the Physician table. Query Q12 accesses the Patient, Physician, and Treatment tables through view V1 to determine the total cost of services that ophthalmologists have rendered. Conversely, query Q13 accesses only the Physician table to retrieve (different) data on ophthalmologists. The problem is that Q12 and Q13 appear to be of the same order of complexity, given that knowledge of the view is hidden, yet each query will clearly have a different performance profile.

V1: CREATE VIEW DrService (Dr, Specialty, Age, TotCost) AS
    SELECT Provider, Specialty, Age, SUM(Cost)
    FROM Patient, Physician, Treatment
    WHERE SSN = Dr_SSN AND DrNum = Provider
    GROUP BY Provider, Specialty, Age
This view matches the Physician table to the Treatment table, and then joins the result to the Patient table.

Q12: SELECT * FROM DrService WHERE Specialty = 'Ophthalmologist'
This query performs a three-way join, through the view.

Q13: SELECT * FROM Physician WHERE Specialty = 'Ophthalmologist'
This query simply scans one table.
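The hidden cost of the view can be observed with Python's built-in `sqlite3` module. This is a minimal sketch under stated assumptions: the column types, the sample rows, and the assignment of columns to tables (SSN and Age in Patient; DrNum, Dr_SSN, and Specialty in Physician; Provider and Cost in Treatment) are guesses made for illustration, and SQLite stands in for whatever DBMS is actually used. The exact text of the query plan varies between SQLite versions, so only the number of plan steps is compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patient   (SSN INTEGER PRIMARY KEY, Age INTEGER);
    CREATE TABLE Physician (DrNum INTEGER PRIMARY KEY, Dr_SSN INTEGER, Specialty TEXT);
    CREATE TABLE Treatment (Provider INTEGER, Cost REAL);

    INSERT INTO Patient   VALUES (100, 52), (200, 47), (300, 30);
    INSERT INTO Physician VALUES (1, 100, 'Ophthalmologist'), (2, 200, 'Cardiologist');
    INSERT INTO Treatment VALUES (1, 150.0), (1, 50.0), (2, 300.0);

    CREATE VIEW DrService (Dr, Specialty, Age, TotCost) AS
        SELECT Provider, Specialty, Age, SUM(Cost)
        FROM Patient, Physician, Treatment
        WHERE SSN = Dr_SSN AND DrNum = Provider
        GROUP BY Provider, Specialty, Age;
""")

q12 = "SELECT * FROM DrService WHERE Specialty = 'Ophthalmologist'"
q13 = "SELECT * FROM Physician WHERE Specialty = 'Ophthalmologist'"

q12_rows = conn.execute(q12).fetchall()
q13_rows = conn.execute(q13).fetchall()

# Each row of EXPLAIN QUERY PLAN describes one step (scan, search, sort, ...).
plan_q12 = conn.execute("EXPLAIN QUERY PLAN " + q12).fetchall()
plan_q13 = conn.execute("EXPLAIN QUERY PLAN " + q13).fetchall()

print(q12_rows, q13_rows)
print(len(plan_q12), "plan steps through the view;", len(plan_q13), "on the base table")
```

The two SELECT statements look equally simple, yet the plan for the view query touches three tables and aggregates, while the plan for the base-table query is a single scan.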

11. OODBMS

An object-oriented database management system (OODBMS, sometimes shortened to ODBMS for object database management system) is a database management system (DBMS) that supports the modelling and creation of data as objects. This includes some kind of support for classes of objects and the inheritance of class properties and methods by subclasses and their objects. There is currently no widely agreed-upon standard for what constitutes an OODBMS, and OODBMS products are considered to be still in their infancy. In the meantime, the object-relational database management system (ORDBMS), the idea that object-oriented database concepts can be superimposed on
relational databases, is more commonly encountered in available products. An object-oriented database interface standard is being developed by an industry group, the Object Data Management Group (ODMG). The Object Management Group (OMG) has already standardized an object-oriented data brokering interface between systems in a network. An object-oriented database system must satisfy two criteria: it should be a DBMS, and it should be an object-oriented system, i.e., to the extent possible, it should be consistent with the current crop of object-oriented programming languages. The first criterion translates into five features: persistence, secondary storage management, concurrency, recovery, and an ad hoc query facility. The second translates into eight features: complex objects, object identity, encapsulation, types or classes, inheritance, overriding combined with late binding, extensibility, and computational completeness.

11.1. Characteristics of Object-Oriented Database

Object-oriented database technology is a marriage of object-oriented programming and database technologies. The accompanying figure illustrates how these programming and database concepts have come together to provide what we now call object-oriented databases.

Perhaps the most significant characteristic of object-oriented database technology is that it combines object-oriented programming with database technology to provide an integrated application development system. There are many advantages to including the definition of operations with the definition of data. First, the defined operations apply ubiquitously and are not dependent on the particular database application running at the moment. Second, the data types can be extended to support complex data such as multimedia by defining new object classes that have operations to support the new kinds of information.

11.2. Advantage of OODBMS

The OODBMS has many advantages and benefits. First, object orientation is a more natural way of thinking. Second, the defined operations of these types of systems are not dependent on the particular database application running at a given moment. Third, the data types of object-oriented databases can be extended to support complex data such as images, digital audio, and video, along with other multimedia operations. Other benefits of OODBMS include reusability, stability, and reliability. Another benefit is that relationships are represented explicitly, often supporting both navigational and associative access to information. This can improve data access performance compared with the relational model. Another important benefit is that users are allowed to define their own methods of access to data and how it will be represented or manipulated. The most significant benefit of the OODBMS is that these databases have extended into areas not served by the RDBMS. Medicine, multimedia, and high-energy physics are just a few of the fields relying on object-oriented databases.

11.3. Disadvantage of OODBMS

As with the relational database method, object-oriented databases also have disadvantages or limitations. One disadvantage of OODBMS is that it lacks a common data model. There is also no current standard, since the technology is still considered to be in the development stages.
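The object-oriented features listed in this chapter (attributes packaged with behaviour, inheritance, overriding with late binding, extensible types, and object identity) can be illustrated with plain Python classes. This sketch shows only the programming-language side; it does not persist anything, and the class names `MediaObject`, `Image`, and `AudioClip` are hypothetical examples rather than part of any OODBMS product.

```python
class MediaObject:
    """Base class: data (attributes) packaged with operations (methods)."""

    def __init__(self, title, size_bytes):
        self.title = title
        self.size_bytes = size_bytes

    def describe(self):
        return f"{self.title} ({self.size_bytes} bytes)"


class Image(MediaObject):
    """New data type added by subclassing; inherits title and size."""

    def __init__(self, title, size_bytes, width, height):
        super().__init__(title, size_bytes)
        self.width = width
        self.height = height

    def describe(self):  # overrides the inherited operation
        return f"{super().describe()}, {self.width}x{self.height} px"


class AudioClip(MediaObject):
    """Another extension of the type system, with its own behaviour."""

    def __init__(self, title, size_bytes, seconds):
        super().__init__(title, size_bytes)
        self.seconds = seconds

    def describe(self):
        return f"{super().describe()}, {self.seconds} s"


# Late binding: the method that runs depends on each object's class.
catalog = [Image("X-ray", 2048, 640, 480), AudioClip("Heartbeat", 1024, 12)]
descriptions = [item.describe() for item in catalog]

# Object identity: two objects with equal state are still distinct objects.
first_scan = Image("Scan", 10, 1, 1)
second_scan = Image("Scan", 10, 1, 1)
same_object = first_scan is second_scan

print(descriptions, same_object)
```

An OODBMS adds the database half of the definition (persistence, concurrency, recovery, querying) on top of objects like these.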

12. ORACLE

The Oracle Database (commonly referred to as Oracle RDBMS or simply Oracle) is a relational database management system (RDBMS) produced and marketed by Oracle Corporation. As of 2009, Oracle remains a major presence in database computing.

12.1. Storage

The Oracle RDBMS stores data logically in the form of tablespaces and physically in the form of data files. Tablespaces can contain various types of segments, such as data segments and index segments. Segments in turn comprise one or more extents. Extents comprise groups of contiguous data blocks. Data blocks form the basic units of data storage. Oracle tracks its data storage with the help of information stored in the SYSTEM tablespace. The SYSTEM tablespace contains the data dictionary and often (by default) indexes and clusters. A data dictionary consists of a special collection of tables that contains information about all user objects in the database.
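The containment hierarchy described above (tablespace, segment, extent, data block) can be modelled as nested data structures. This is a conceptual sketch only: the class names, the object names `USERS`, `PATIENT`, and `PATIENT_PK`, the extent sizes, and the 8192-byte block size are all assumptions chosen for illustration and do not describe Oracle's internal structures.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 8192  # bytes per data block; an assumed value for illustration


@dataclass
class Extent:
    block_count: int  # number of contiguous data blocks in this extent


@dataclass
class Segment:
    name: str
    kind: str  # for example "data" or "index"
    extents: list = field(default_factory=list)

    def blocks(self) -> int:
        """Total data blocks across all extents of the segment."""
        return sum(extent.block_count for extent in self.extents)


@dataclass
class Tablespace:
    name: str
    segments: list = field(default_factory=list)

    def size_bytes(self) -> int:
        """Space allocated to all segments in the tablespace."""
        return sum(segment.blocks() for segment in self.segments) * BLOCK_SIZE


users = Tablespace("USERS", [
    Segment("PATIENT", "data", [Extent(8), Extent(8)]),
    Segment("PATIENT_PK", "index", [Extent(8)]),
])

print(users.segments[0].blocks(), users.size_bytes())
```

A segment grows by acquiring further extents, which is why the table segment in this sketch holds two extents while the smaller index segment holds one.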

12.2. Database Schema

Oracle database conventions refer to defined groups of object ownership (generally associated with a "username") as schemas. Most Oracle database installations traditionally came with a default schema called SCOTT. After the installation process has set up the sample tables, the user can log into the database with the username scott and the password tiger. The name of the SCOTT schema originated with Bruce Scott, one of the first employees at Oracle (then Software Development Laboratories), who had a cat named Tiger. The SCOTT schema has seen less use because it uses few of the features of the more recent releases of Oracle. Most recent examples supplied by Oracle Corporation reference the default HR or OE schemas. Other default schemas include:

SYS (essential core database structures and utilities)
SYSTEM (additional core database structures and utilities, and privileged account)
OUTLN (used to store metadata for stored outlines for stable query-optimizer execution plans)
BI, IX, HR, OE, PM, and SH (expanded sample schemas containing more data and structures than the older SCOTT schema)

12.3. Memory architecture

Each Oracle instance uses a System Global Area (SGA), a shared-memory area, to store its data and control information. Each Oracle instance allocates itself an SGA when it starts and de-allocates it at shutdown. The information in the SGA consists of the following elements, each of which has a fixed size, established at instance startup:

the database buffer cache: this stores the most recently used data blocks. These blocks can contain modified data not yet written to disk (sometimes known as "dirty blocks"), unmodified blocks, or blocks written to disk since modification (sometimes known as "clean blocks"). Because the buffer cache evicts the least recently used blocks first, the most active buffers stay in memory to reduce I/O and to improve performance.
the redo log buffer: this stores redo entries, a log of changes made to the database. The instance writes redo log buffers to the redo log as quickly and efficiently as possible. The redo log aids in instance recovery in the event of a system failure.
the shared pool: this area of the SGA stores shared-memory structures such as shared SQL areas in the library cache and internal information in the data dictionary cache. An insufficient amount of memory allocated to the shared pool can cause performance degradation.
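The behaviour of a buffer cache that keeps the most active blocks in memory can be sketched with a small least-recently-used (LRU) cache. This is a toy model under stated assumptions, not Oracle's implementation: the `BufferCache` class is hypothetical, and it writes a dirty block out only at the moment of eviction, whereas a real instance flushes dirty blocks through separate background activity.

```python
from collections import OrderedDict


class BufferCache:
    """Toy block cache with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block_id -> (data, dirty flag), oldest first
        self.disk_writes = []        # ids of dirty blocks flushed on eviction

    def read(self, block_id, load_from_disk):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)  # cache hit: mark as most recent
            return self.blocks[block_id][0]
        data = load_from_disk(block_id)        # cache miss: costs an I/O
        self._insert(block_id, data, dirty=False)
        return data

    def write(self, block_id, data):
        # The change stays in memory; the block is now dirty.
        self._insert(block_id, data, dirty=True)

    def _insert(self, block_id, data, dirty):
        self.blocks[block_id] = (data, dirty)
        self.blocks.move_to_end(block_id)
        while len(self.blocks) > self.capacity:
            victim, (_, was_dirty) = self.blocks.popitem(last=False)
            if was_dirty:
                self.disk_writes.append(victim)  # dirty block must reach disk


cache = BufferCache(capacity=2)
cache.write(1, "a")                       # block 1 is dirty
cache.read(2, lambda block_id: "b")       # block 2 is clean
cache.read(1, lambda block_id: "a-disk")  # hit: block 1 becomes most recent
cache.read(3, lambda block_id: "c")       # evicts block 2 (clean, no write)
cache.read(4, lambda block_id: "d")       # evicts block 1 (dirty, flushed)

print(list(cache.blocks), cache.disk_writes)
```

Block 1 survives the first eviction only because it was touched again, which is the sense in which active buffers stay in memory.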

12.3.1. Library cache

The library cache stores shared SQL, caching the parse tree and the execution plan for every unique SQL statement. If multiple applications issue the same SQL statement, each application can access the shared SQL area. This reduces the amount of memory needed and reduces the processing time used for parsing and execution planning.

12.3.2. Data dictionary cache

The data dictionary comprises a set of tables and views that map the structure of the database. Oracle databases store information here about the logical and physical structure of the database. The data dictionary contains information such as:

user information, such as user privileges
integrity constraints defined for tables in the database
names and datatypes of all columns in database tables
information on space allocated and used for schema objects

The Oracle instance frequently accesses the data dictionary in order to parse SQL statements. The operation of Oracle depends on ready access to the data dictionary: performance bottlenecks in the data dictionary affect all Oracle users. Because of this, database administrators should make sure that the data dictionary cache has sufficient capacity to cache this data. Without enough memory for the data dictionary cache, users see severe performance degradation. Allocating sufficient memory to the shared pool, where the data dictionary cache resides, prevents these particular performance problems.

12.3.3. Program Global Area

The Program Global Area (PGA) memory area of an Oracle instance contains data and control information for Oracle's server processes. The size and content of the PGA depend on the Oracle server options installed. This area consists of the following components:

stack space: the memory that holds the session's variables, arrays, and so on.
session information: unless using the multithreaded server, the instance stores its session information in the PGA. (In a multithreaded server, the session information goes in the SGA.)
private SQL area: an area in the PGA which holds information such as bind variables and runtime buffers.
sorting area: an area in the PGA which holds information on sorts, hash joins, etc.

12.4. Configuration

Database administrators control many of the tunable variations in an Oracle instance by means of values in a parameter file. This file in its default ASCII form ("pfile") normally has a name of the format init<SID-name>.ora. The default binary equivalent, the server parameter file ("spfile"), which is dynamically reconfigurable to some extent, defaults to the format spfile<SID-name>.ora. Within an SQL-based environment, the views V$PARAMETER and V$SPPARAMETER give read access to parameter values.

13. OBJECTIVE QUESTIONS

1. The advantages of Structured Query Language (SQL) include which of the following in relation to GIS databases?
a. It is widely used.
b. It is good at handling geographical concepts.
c. It is simple and easy to understand.
d. It uses a pseudo-English style of questioning.

2. Which of the following are characteristics of an RDBMS?
a. Tables are linked by common data known as keys.
b. It cannot use SQL.
c. Queries are possible on individual or groups of tables.
d. Keys may be unique or have multiple occurrences in the database.
e. Data are organized in a series of two-dimensional tables each of which contains records for one entity.

3. What is a 'tuple'?
a. An attribute attached to a record.
b. Another name for the key linking different tables in a database.
c. A row or record in a database table.
d. Another name for a table in an RDBMS.

4. Which of the following are issues to be considered by users of large corporate GIS databases?
a. The need for multiple copies of the same data and subsequent merging after separate updates.
b. The need for manual transfer of records to paper.
c. The need for concurrent access and multi-user update.
d. The need to manage long transactions.
e. The need for multiple views or different windows into the same databases.

5. Which of the following are features of the object-oriented approach to databases?
a. The ability to develop more realistic models of the real world.
b. The ability to develop databases using natural language approaches.
c. The ability to represent the world in a non-geometric way.
d. The need to split objects into their component parts.
e. The ability to develop database models based on location rather than state and behavior.

6. Redundancy is minimized with a computer-based database approach.
a. True
b. False

7. The relational database model is based on concepts proposed in the 1960s and 1970s.
a. True
b. False
8. A row in a database can also be called a domain.
a. True
b. False

9. A first step in database creation should be needs analysis.
a. True
b. False

10. In entity attribute modelling a many-to-many relationship is represented by M:M.
a. True
b. False

11. In a networked web-based GIS all communications must go through an internet map server.
a. True
b. False

12. In an OO database approach 'object = attributes + behaviour'.
a. True
b. False

13. In an OO database objects may inherit some or all of the characteristics of other objects.
a. True
b. False

14. Referring to the following tables, what type of relationship exists between the Product table and the Manufacturer table?

PRODUCT
=======
Product ID
Product Description
Manufacturer ID

MANUFACTURER
============
Manufacturer ID
Manufacturer Name

a. Product - Many; Manufacturer - Many
b. Product - One or Many; Manufacturer - One or Many
c. Product - Many; Manufacturer - One
d. Product - One; Manufacturer - One
e. Product - One; Manufacturer - Many

15. You are writing a database application to run on your DBMS. You do not want your users to be able to view the underlying table structures. At the same time you want to allow certain update operations. Referring to the above scenario, what structure will you deploy?
a. Cursor table
b. Table filter
c. Dynamic procedure
d. View
e. Summary table

16. You are defining the operational process of your RDBMS. Referring to the scenario above, which one of the following is a valid ongoing "operational process"?
a. OS requirement
b. User analysis
c. Performance monitoring
d. Data dictionary specification
e. System requirement

17. You have been asked to construct a query in the company's RDBMS. You have deployed a Right Outer Join operation. Referring to the scenario above, what will happen to the final results when there is NO match between the tables?
a. The right table will return ALL rows.
b. The right table will return NULL.
c. Both tables will return NULL.
d. The left table will return ALL rows.
e. The left table will return NULL.

18. Which phase of the data modeling process contains security review?
a. Structure
b. Design issue
c. Data source
d. Storage issue
e. Operational process

19. Which one of the following is NOT a characteristic of metadata?
a. Data about data
b. Describes a data dictionary
c. Self-describing
d. Includes user data
e. Supports its own structure

20. Which one of the following capabilities do you expect to see in a majority of RDBMS extensions to ANSI SQL-92?
a. Encryption key management
b. Graphical User Interface Widgets
c. Thread creation, execution, & coordination
d. Network socket creation/operation
e. If/Then, for, do/while statements

21. What can a mandatory one-to-one relationship indicate?
a. More entities are needed.
b. The model should be denormalized.
c. The tables are not properly indexed.
d. The model cannot be implemented physically.
e. More attributes are needed.

22. For performance, you denormalize your database design and create some redundant columns. Referring to the scenario above, what RDBMS construct can you use to automatically prevent the repeated columns from getting out of sync?
a. Cursors
b. Constraints
c. Views
d. Stored procedures
e. Trigger

23. You are running a query against a relational database. Referring to the scenario above, what clause or command do you use in the query to help avoid a costly table scan?
a. GROUP BY clause
b. INDEX command
c. HAVING clause
d. FROM clause
e. WHERE clause
