Assignment

Name: Ankit Rathi
Registration No.: 581112441
Learning Center: Apar India Institute, Dwarka
Learning Center Code: 02009
Course: MBA
Subject: MB0034 – Database Management System
Semester: III

Sikkim Manipal University – MI0034

MBA SEMESTER III MI0034 - Database Management System Assignment Set- 1 (60 Marks)

Q1. Differentiate between Traditional File System & Modern Database System? Describe the properties of Database & the Advantage of Database?

Traditional File Systems Vs Modern Database Management Systems

1. Traditional File System: This is the system that was followed before the advent of DBMS, i.e., it is the older way.
   Modern DBMS: This is the modern way, which has replaced the older concept of the file system.

2. Traditional File System: Data definition is part of the application program and works with only that specific application.
   Modern DBMS: Data definition is part of the DBMS. It is independent of the application and can be used with any application.

3. Traditional File System: File systems are design driven; they require a design/coding change when a new kind of data occurs. E.g.: in a traditional employee master file having Emp_name, Emp_id, Emp_addr, Emp_desig, Emp_dept, Emp_sal, if we want to insert one more column, Emp_Mob number, it requires a complete restructuring of the file or redesign of the application code, even though basically all the data except that in one column is the same.
   Modern DBMS: One extra column (attribute) can be added without any difficulty. Minor coding changes in the application program may be required.

4. Traditional File System: It keeps redundant [duplicate] information in many locations. This might result in the loss of data consistency. E.g.: employee names might exist in separate files like the Payroll Master File and also the Employee Benefit Master File. Now if an employee changes his or her last name, the name might be changed in the Payroll Master File but not in the Employee Benefit Master File.
   Modern DBMS: Redundancy is eliminated to the maximum extent if the database is properly defined.

5. Traditional File System: Data is scattered in various files, and each of these files may be in a different format, making it difficult to write new application programs to retrieve the appropriate data.
   Modern DBMS: This problem is largely solved, because the data is held in one place with a common definition.

6. Traditional File System: Security features are to be coded in the application program itself.
   Modern DBMS: Coding for security requirements is not required, as most of them are taken care of by the DBMS.

Hence, a database management system is the software that manages a database, and is responsible for its storage, security, integrity, concurrency, recovery and access. The DBMS has a data dictionary, referred to as the system catalog, which stores data about everything it holds, such as names, structure, locations and types. This data is also referred to as metadata.

Properties of Database

The following are the important properties of a database:

1. A database is a logical collection of data having some implicit meaning. If the data are not related, then it is not a proper database. E.g. a student studying in class II got 5th rank.

   Stud_name    Class       Rank obtained
   Vijetha      Class II    5th

2. A database consists of both data and the description of the database structure and constraints. E.g.

   Field Name   Type            Description
   Stud_name    Character       It is the student's name
   Class        Alphanumeric    It is the class of the student

3. A database can be of any size and of varying complexity. If we consider the example of an employee database, the name and address of the employees may consist of very few records, each with a simple structure. E.g.

   Emp_name  Emp_id  Emp_addr                                                       Emp_desig          Emp_Sal
   Prasad    100     Shubhodaya, Near Katariguppe Big Bazaar, BSK II stage, Bangalore  Project Leader     40000
   Usha      101     #165, 4th main Chamrajpet, Bangalore                           Software engineer  10000
   Nupur     102     #12, Manipal Towers, Bangalore                                 Lecturer           30000
   Peter     103     Syndicate house, Manipal                                       IT executive       15000

   Like this there may be n number of records.

4. The DBMS is considered a general-purpose software system that facilitates the process of defining, constructing and manipulating databases for various applications.

5. A database provides insulation between programs and data, and data abstraction. Data abstraction is a feature that lets users work with the data of interest without depending on how the data is physically structured.

6. The data in the database is used by a variety of users for a variety of purposes. E.g. in a hospital database management system, the view of the patient database used by one kind of user is different from the view used by the doctor. The data are not stored separately for the different users; in fact they are stored in a single database. This property is nothing but multiple views of the database.

7. A multi-user DBMS must allow the data to be shared by multiple users simultaneously. For this purpose the DBMS includes concurrency control software to ensure that updates made to the database by several users at the same time are applied correctly. This property explains multiuser transaction processing.

Advantages of using DBMS

1. Redundancy is reduced.
2. Data located on a server can be shared by clients.
3. Integrity (accuracy) can be maintained.
4. Security features protect the data from unauthorized access.

5. Modern DBMS support internet based applications.
6. In a DBMS the application program and the structure of the data are independent.
7. Consistency of data is maintained.
8. DBMS supports multiple views. A DBMS has many users, and each one of them might use it for different purposes and may need to view and manipulate only a portion of the database, depending on requirement.

Q2. What is the disadvantage of sequential file organization? How do you overcome it? What are the advantages & disadvantages of Dynamic Hashing?

In this file organization, the records of the file are stored one after another both physically and logically. That is, the record with sequence number 16 is located just after the 15th record. A record of a sequential file can only be accessed by reading all the previous records.

The records are discriminated from one another using the record length declared in the associated FD statement of the FILE-SECTION. For example, if the record structure that the programmer has declared is 52 bytes, blocks of 52 bytes of data (records) are assumed to be placed one after another in the file. If the programmer is reading the data in a sequential file, every READ statement brings 52 bytes into memory.

If the file contains, say, 52 byte records, but the programmer tries to read this file with a program which has declared 40 byte records (i.e. the total length of the FD structure is 40 bytes), the program will certainly read some pieces of information into memory, but after the first READ statement some meaningless pieces of records will be brought into memory and the program will start processing physical records which contain logically meaningless data.

It is the programmer's responsibility to take care of the record sizes in files. You must be careful when declaring record structures for files. Any mistake you make in record sizes will cause your program to read/write erroneous information. This is especially dangerous if the file contents are being altered (changed, updated).

Since the records are simply appended to each other when building SEQUENTIAL files, you simply end up with a STREAM of bytes. If this stream does not contain any "Carriage Return/Line Feed" control characters, the whole file will appear as a single LINE of characters and would be impossible to process with regular text editors. Text editors are good at reading/writing/modifying text files. These programs assume that the file consists of LINES and expect the lines to be separated from each other by a pair of control characters called "Carriage Return/Line Feed" (or CR/LF).

COBOL has a special type of sequential file organization, called LINE SEQUENTIAL ORGANIZATION, which places a CR/LF pair at the end of each record while adding records to a file and expects such a pair while reading. LINE SEQUENTIAL files are much easier to use while developing programs, because you can always use a simple text editor to see the contents of your sequential file and trace/debug your program.

Please note that LINE SEQUENTIAL files have two extra characters for each record. For files which have millions of records, this might use up a significant amount of disk space.

SEQUENTIAL files have only one ACCESS MODE and that is "sequential access". Therefore you need not specify an ACCESS MODE in the SELECT statement. Typical SELECT statements for SEQUENTIAL files are:

    SELECT MYFILE ASSIGN TO DISK "MYFILE.DAT"
        ORGANIZATION IS SEQUENTIAL.
    SELECT MYFILE-2 ASSIGN TO DISK "C:\DATADIR\MYFILE2.TXT"
        ORGANIZATION IS LINE SEQUENTIAL.

In the FILE-SECTION, you must provide FD blocks for each file; hence for a sequential file you could have something like:

    FD  MYFILE.
    01  MYFILE-REC.
        02  M-NAME          PIC X(16).
        02  M-SURNAME       PIC X(16).
        02  M-BIRTHDATE.
            03  M-BD-YEAR   PIC 9999.
            03  M-BD-MONTH  PIC 99.
            03  M-BD-DAY    PIC 99.

Note: You must NOT provide record fields for the extra two CR/LF bytes in record descriptions of LINE SEQ files. Once you declare the file to be a LINE SEQ file, these two extra bytes are automatically taken into consideration and added for all new records that are added to the file.

It is NOT possible to delete records of a seq file. If you do not want a specific record to be kept in a seq file any more, all you can do is modify the contents of the record so that it contains some special values that your program will recognize as deleted (remember to open the file in I-O mode and REWRITE a new record).
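The record-length mismatch described above (a file of 52 byte records read with a 40 byte record declaration) can be illustrated with a small Python sketch. This is not COBOL and not part of the original answer; the field widths follow the FD example (16 + 16 + 8 = 40 bytes), while the extra 12 byte city field, the sample values and the helper names `pack` and `read_records` are assumptions made only for illustration.

```python
# Sketch: what happens when fixed-length records are read with the wrong size.
# The 12-byte "city" field and the sample data are hypothetical.

def pack(name, surname, birthdate, city):
    """Build one 52-byte record: 16 + 16 + 8 + 12 bytes, space padded."""
    text = name.ljust(16) + surname.ljust(16) + birthdate + city.ljust(12)
    return text.encode("ascii")

def read_records(data, record_len):
    """Cut a byte stream into consecutive blocks of record_len bytes."""
    return [data[i:i + record_len] for i in range(0, len(data), record_len)]

# Two records appended one after another, with no CR/LF separators.
data = pack("PRASAD", "RAO", "19800101", "BANGALORE") + \
       pack("USHA", "NAIR", "19850215", "MANIPAL")

good = read_records(data, 52)   # correct record length
bad = read_records(data, 40)    # program declared only 40 bytes

print(good[1][:16])  # name field of the second record
print(bad[1][:16])   # tail of record 1 mixed with the head of record 2
```

With the correct length the second READ returns the second record; with the 40 byte declaration it returns the city of the first record followed by the first characters of the second name, which is the logically meaningless data the text warns about.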

Disadvantages of sequential file organization: The file can only be processed sequentially. If you need to read record number N, you must first read the previous N-1 records. This is especially bad for programs that make frequent searches in the file. We must use linear search or binary search to locate the desired record, and that results in more I/O operations and a number of unnecessary comparisons.

To overcome these disadvantages, hashing techniques are used. In the hashing technique, or direct file organization, the key value is converted into an address by performing some arithmetic manipulation on the key value, which provides very fast access to records. Let us consider a hash function h that maps the key value k to the value h(k). The value h(k) is used as an address.

The basic terms associated with the hashing techniques are:
1) Hash table: It is simply an array that holds the addresses of records.
2) Hash function: It is the transformation of a key into the corresponding location or address in the hash table (it can be defined as a function that takes a key as input and transforms it into a hash table index).
3) Hash key: Let 'R' be a record; its key hashes into a key value called the hash key.

The different hashing techniques are:
- Internal hashing
- Dynamic hashing
- Extendable hashing

Dynamic Hashing Technique

A major drawback of static hashing is that the address space is fixed. Hence it is difficult to expand or shrink the file dynamically.
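The terms above, and the fixed address space of static hashing, can be sketched in a few lines of Python. This is only an illustration: the modulo hash function, the bucket count of 4 and the names `h`, `insert` and `lookup` are assumptions, not something prescribed by the text.

```python
# Sketch of static hashing: a fixed number of buckets and a hash function h(k).
NUM_BUCKETS = 4  # fixed address space (assumed value)

def h(key):
    """Hash function: transforms a key into a hash table index."""
    return key % NUM_BUCKETS

# Hash table: one list of (key, record) pairs per bucket.
table = [[] for _ in range(NUM_BUCKETS)]

def insert(key, record):
    table[h(key)].append((key, record))

def lookup(key):
    # Only one bucket is searched, instead of the whole file.
    for k, record in table[h(key)]:
        if k == key:
            return record
    return None

for emp_id, name in [(100, "Prasad"), (101, "Usha"), (102, "Nupur"),
                     (103, "Peter"), (104, "Asha")]:
    insert(emp_id, name)

print(lookup(102))
print(len(table[h(104)]))  # keys 100 and 104 collide in the same bucket
```

Because the number of buckets never changes, more and more keys end up sharing a bucket as the file grows, which is the drawback that dynamic hashing addresses.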

In dynamic hashing, the access structure is built on the binary representation of the hash value. The number of buckets is not fixed [as in regular hashing] but grows or diminishes as needed. The file can start with a single bucket; once that bucket is full and a new record is inserted, the bucket overflows and is split into two buckets. The records are distributed among the two buckets based on the value of the first [leftmost] bit of their hash values. Records whose hash values start with a 0 bit are stored in one bucket, and those whose hash values start with a 1 bit are stored in the other bucket. At this point, a binary tree structure called a directory is built. The directory has two types of nodes:

1. Internal nodes: These guide the search; each has a left pointer corresponding to a 0 bit and a right pointer corresponding to a 1 bit.
2. Leaf nodes: Each leaf node holds a pointer to a bucket (a bucket address).

If a bucket overflows, for example when a new record is inserted into the bucket for records whose hash values start with 10 and causes overflow, then all records whose hash value starts with 100 are placed in the first split bucket, and the second bucket contains those whose hash value starts with 101. The levels of the binary tree can be expanded dynamically.

Advantages of dynamic hashing:
1. Splitting causes only minor reorganization, since only the records in one bucket are redistributed to the two new buckets.
2. The space overhead of the directory table is negligible.
3. Performance does not degrade as the file grows (this is also the main advantage of extendable hashing). The main space saving is that no buckets need to be reserved for future growth; rather, buckets can be allocated dynamically.

Disadvantages:
1. The index tables can grow rapidly and become too large to fit in main memory. When part of the index table is stored on secondary storage, it requires extra access.

2. The directory must be searched before accessing the bucket, resulting in two block accesses instead of one in static hashing.
3. A disadvantage of extendable hashing is that it involves an additional level of indirection.

Q3. What is relationship type? Explain the difference among a relationship instance, relationship type & a relation set?

Answer later

Q4. What is SQL? Discuss.

SQL stands for Structured Query Language. The Structured Query Language is used for programming the database. The history of SQL began in an IBM laboratory in San Jose, California, where SQL was developed in the 1970s. It is the standard command set used to communicate with the RDBMS. It is a non-procedural language, meaning that SQL describes what data to retrieve, delete or insert, rather than how to perform the operation.

A SQL query is not necessarily a question to the database. It can be a command to do one of the following:
- Create or delete a table.
- Insert, modify or delete rows.
- Search several rows for specific information and return the result in order.
- Modify security information.

The SQL statements can be grouped into the following categories:
1. DDL (Data Definition Language)
2. DML (Data Manipulation Language)

3. DCL (Data Control Language)
4. TCL (Transaction Control Language)

DDL: (Data Definition Language)
The DDL statements provide commands for defining relation schemas, i.e. for creating tables, indexes, sequences etc., and commands for dropping, altering and renaming objects.

DML: (Data Manipulation Language)
The DML statements are used to alter the database tables in some way. The UPDATE, INSERT and DELETE statements respectively alter existing rows in a database table, insert new records into a database table, or remove one or more records from a database table.

DCL: (Data Control Language)
The Data Control Language statements are used to grant permissions to a user, revoke permissions from a user, and lock certain permissions for a user.

Grant:
    SQL DBA> Grant all on emp to public;
    SQL DBA> Grant select, update on EMP to L.Suresh;
    SQL DBA> Grant ALL on EMP to Akash with Grant option;

Revoke: Revoke takes away privileges on one or more tables or views.
    SQL DBA> Revoke UPDATE, DELETE FROM L.Suresh;
    SQL DBA> Revoke Import from Akash;
    SQL DBA> Revoke all on emp from Akash;

TCL: (Transaction Control Language)
It is used to control transactions. E.g.: Commit
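The statement categories above can be tried out with a short Python sketch that uses the standard `sqlite3` module. This is an illustration under assumptions: the answer talks about Oracle, whereas SQLite is used here only because it ships with Python; the `emp` table and its rows are sample data; and SQLite has no GRANT or REVOKE, so the DCL category is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# DDL: define the relation schema.
cur.execute(
    "CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, emp_name TEXT, emp_sal INTEGER)"
)

# DML: insert new rows.
cur.execute("INSERT INTO emp VALUES (100, 'Prasad', 40000)")
cur.execute("INSERT INTO emp VALUES (101, 'Usha', 10000)")

# TCL: make the inserts permanent.
conn.commit()

# DML again: change and remove rows, then undo both with a rollback (TCL).
cur.execute("UPDATE emp SET emp_sal = 12000 WHERE emp_id = 101")
cur.execute("DELETE FROM emp WHERE emp_id = 100")
conn.rollback()

# Query: the table is back in its committed state.
rows = cur.execute(
    "SELECT emp_id, emp_name, emp_sal FROM emp ORDER BY emp_id"
).fetchall()
print(rows)
```

The rollback discards the UPDATE and DELETE because they were never committed, which is what "controlling transactions" means in practice.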

SQL* COMMANDS: This subsection discusses the often used commands in the SQL environment. For example, if your SQL commands are saved in a file (typically in Notepad), you can execute this file using the "at" (@) command. Similarly, there are a number of such commands:

    @<filename>    Runs the command file stored in <filename>

DATA TYPES IN ORACLE 8i SQL: The table below shows the listing of the data types allowed in Oracle.

    DATA TYPE         DESCRIPTION
    CHAR(size)        Fixed length character. Max = 2000
    VARCHAR2(size)    Variable length character. Max = 4000
    DATE              Date, valid range is from Jan 1, 4712 B.C. to Dec 31, 4712 A.D.
    BLOB              Binary large object. Max = 4GB
    CLOB              Character large object. Max = 4GB
    BFILE             Pointer to binary OS file
    LONG              Character data of variable size. Max = 2GB
    LONG RAW          Raw binary data. Rest is same as LONG
    NUMBER(size)      Numbers. Max size = 40 digits
    NUMBER(size,d)    Numbers, range = 1.0E-130 to 9.9E125
    DECIMAL           Same as NUMBER. Size/d can't be specified
    FLOAT             Same as NUMBER
    INTEGER           Same as NUMBER. Size/d can't be specified
    SMALLINT          Same as NUMBER

Q5. What is Normalization? Discuss various types of Normal Forms?

Introduction to Normalization

Having seen how to create a database using SQL, we now study how to normalize the data in the database. Normalization is the process of building database structures to store data. It matters because any application ultimately depends on its data structures. If the data structures are poorly designed, the application will start from a poor foundation, and it will require a lot more work to create a useful and efficient application.

Normalization is the formal process for deciding which attributes should be grouped together in a relation. Normalization serves as a tool for validating and improving the logical design, so that the logical design avoids unnecessary duplication of data, i.e. it eliminates redundancy and promotes integrity. In the normalization process we analyze and decompose complex relations into smaller, simpler and well-structured relations.

Normal forms Based on Primary Keys

First Normal Form (1NF): A relation schema R is in first normal form if every attribute of R takes only single atomic values. We can also define it as: the intersection of each row and column contains one and only one value. To transform an un-normalized table (a table that contains one or more repeating groups) to first normal form, we identify and remove the repeating groups within the table.

E.g. Dept

    D.Name    D.No    D.location
    R&D       5       [England, London, Delhi]
    HRD       4       Bangalore

    Figure A

Consider the figure, in which each dept can have a number of locations. This is not in first normal form because D.location is not an atomic attribute. The domain of D.location contains multiple values.

There is a technique to achieve first normal form: remove the attribute D.location that violates first normal form and place it in a separate relation Dept_location.
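The 1NF technique just described (moving the multivalued D.location attribute into a separate Dept_location relation) can be sketched in Python as follows. The dictionary layout, the lowercase field names and the function name `to_first_normal_form` are assumptions made for illustration; the data is taken from Figure A.

```python
# Un-normalized Dept table: d_location holds a repeating group.
dept = [
    {"d_name": "R&D", "d_no": 5, "d_location": ["England", "London", "Delhi"]},
    {"d_name": "HRD", "d_no": 4, "d_location": ["Bangalore"]},
]

def to_first_normal_form(rows):
    """Split the repeating group into its own relation keyed by d_no."""
    department = [{"d_name": r["d_name"], "d_no": r["d_no"]} for r in rows]
    dept_location = [
        {"d_no": r["d_no"], "d_location": loc}
        for r in rows
        for loc in r["d_location"]
    ]
    return department, dept_location

department, dept_location = to_first_normal_form(dept)
for row in dept_location:
    print(row)  # one atomic location value per row
```

After the split, every row and column intersection holds a single value, and the department number links each location back to its department.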

Functional dependency: The concept of functional dependency was introduced by Prof. Codd in 1970 during the emergence of definitions for the three normal forms. A functional dependency is a constraint between two sets of attributes in a relation from a database. Given a relation R, a set of attributes X in R is said to functionally determine another attribute Y in R (X->Y) if and only if each value of X is associated with exactly one value of Y. X is called the determinant set and Y is the dependent attribute.

For e.g.: Consider the example of the STUDENT_COURSE database.

STUDENT_COURSE

In the STUDENT_COURSE database, (Sid) student id does not uniquely identify a tuple and therefore it cannot be a primary key. Similarly, (Cid) course id cannot be a primary key. But the combination (Sid, Cid) uniquely identifies a row in STUDENT_COURSE. Therefore (Sid, Cid) is the primary key, which uniquely retrieves Sname, address, course and marks, which are dependent on the primary key.
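The definition of X->Y can be checked mechanically on sample data. The sketch below is only an illustration: the STUDENT_COURSE rows are invented, since the original table is not reproduced in this answer, and the function name `holds` is an assumption.

```python
def holds(rows, lhs, rhs):
    """Return True if every value of lhs is associated with one value of rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, else returns the old y.
        if seen.setdefault(x, y) != y:
            return False
    return True

# Hypothetical STUDENT_COURSE rows.
student_course = [
    {"sid": 1, "cid": "C1", "sname": "Asha", "course": "DBMS", "marks": 80},
    {"sid": 1, "cid": "C2", "sname": "Asha", "course": "OS", "marks": 70},
    {"sid": 2, "cid": "C1", "sname": "Ravi", "course": "DBMS", "marks": 65},
]

print(holds(student_course, ["sid"], ["marks"]))         # Sid alone
print(holds(student_course, ["cid"], ["marks"]))         # Cid alone
print(holds(student_course, ["sid", "cid"], ["marks"]))  # the combination
```

On this sample, neither Sid nor Cid alone determines marks, but the combination (Sid, Cid) does, which matches the argument for choosing it as the primary key. A check on sample rows can only refute a dependency; it cannot prove that the dependency holds for all possible data.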

Second Normal Form (2NF)

Second normal form is based on the concept of full functional dependency. A relation R is in second normal form if every non-prime attribute A in R is fully functionally dependent on the primary key of R.

[Figure 9.2: 2NF and 3NF. (a) Normalizing EMP_PROJ into 2NF relations]

[Figure 9.2 (continued): Normalizing EMP_DEPT into 3NF relations]

A partial functional dependency is a functional dependency in which one or more non-key attributes are functionally dependent on part of the primary key. It creates redundancy in the relation, which results in anomalies when the table is updated.

Third Normal Form (3NF)

This is based on the concept of transitive dependency. We should design the relational schema in such a way that there are no transitive dependencies, because they lead to update anomalies. A functional dependency [FD] x->y in a relation schema 'R' is a transitive dependency if there is a set of attributes 'Z', neither a key nor a subset of a key, such that x->z and z->y both hold. The dependency SSN->Dmgr is transitive through Dnum in the Emp_dept relation because SSN->Dnum and Dnum->Dmgr hold, and Dnum is neither a key nor a subset [part] of the key.
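The transitive dependency SSN->Dmgr can be derived with the standard attribute-closure computation. The sketch below is an illustration only: the function name `closure` and the lowercase attribute names are assumptions, and only the two dependencies named in the text are used.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already determined, so is the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# The dependencies named in the text for Emp_dept.
fds = [(("ssn",), ("dnum",)), (("dnum",), ("dmgr",))]

print(sorted(closure({"ssn"}, fds)))   # ssn reaches dmgr through dnum
print(sorted(closure({"dnum"}, fds)))  # dnum does not determine ssn
```

Since the closure of Dnum does not contain SSN, Dnum is not a key, yet it determines Dmgr; that is why SSN->Dmgr counts as a transitive dependency.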

According to Codd's definition, a relational schema 'R' is in 3NF if it satisfies 2NF and no non-prime attribute is transitively dependent on the primary key. The Emp_dept relation is not in 3NF; we can normalize it by decomposing it into E1 and E2.

Note: Transitivity is a mathematical property which states that if a relation is true between the first value and the second value, and between the second value and the third value, then it is true between the first and the third value.

Example 2: Consider a relation schema 'Lots' which describes the parts of land for sale in various countries of a state. Suppose there are two candidate keys: property_ID and {Country_name, lot#}; that is, lot numbers are unique only within each country, but property_ID numbers are unique across countries for the entire state.

Based on the two candidate keys property_ID and {Country_name, lot#}, we know that functional dependencies FD1 and FD2 hold. Suppose the following two additional functional dependencies hold in LOTS:

    FD3: Country_name -> tax_rate
    FD4: Area -> price

Here, FD3 says that the tax rate is fixed for a given country, and FD4 says that the price of a lot is determined by its area. The Lots relation schema violates 2NF, because tax_rate is partially dependent upon the candidate key {Country_name, lot#}. Due to this, the Lots relation is decomposed into two relations, Lots1 and Lots2. Lots1 violates 3NF, because price is transitively dependent on the candidate key of Lots1 via the attribute area. Hence we could decompose LOTS1 into LOTS1A and LOTS1B.

A relation schema R is in 3NF when every non-prime attribute of R satisfies the conditions below:
1. It is fully functionally dependent on every key of 'R'.
2. It is non-transitively dependent on every key of 'R'.

Fourth Normal Form (4NF)

Multivalued dependencies are based on the concept of first normal form, which prohibits attributes having a set of values. If we have two or more multivalued independent attributes in the same relation, we get into a situation where we have to repeat every value of one of the attributes with every value of the other attribute to keep the relation state consistent and to maintain independence among the attributes involved. This constraint is specified by a multivalued dependency.

Consider a table EMPLOYEE that has the attributes name, project and hobby. An employee can work on more than one project and can have more than one hobby. The employee's projects and hobbies are independent of one another. A given project or hobby is associated with any number of employees. To keep the relation state consistent, we must have separate tuples to represent every combination of an employee's projects and the employee's hobbies.

EMPLOYEE

    NAME    PROJECT      HOBBY
    A       Microsoft    Cricket
    A       Oracle       Music
    A       Microsoft    Music
    A       Oracle       Cricket
    B       Intel        Movies
    B       Sybase       Reading
    B       Intel        Reading
    B       Sybase       Movies

The drawback of the EMPLOYEE relation is redundant data. The values Reading and Movies of hobby are repeated with each value of project. This redundancy is undesirable and leads to update anomalies. For example, if we wish to add one more project for employee B, then we must add one tuple for each of B's hobbies, i.e. two more tuples.

Definition (MVD): A relation R(X, Y, Z) is said to have a multivalued dependency of Y on X if the set of Y values for a given (X, Z) pair does not depend on Z, but depends only on X. We then say "X multi-determines Y" or "Y is multi-dependent on X". Such a dependency is called a Multivalued Dependency (MVD) and is represented by double arrows (X ->> Y).

We can also define an MVD as follows: for each value of X there is a set of values for Y and a set of values for Z; however, the sets of values for Y and Z are independent of each other. So wherever two independent one-to-many relationships (A:B and A:C) are mixed in the same relation, a multivalued dependency arises. Multivalued dependencies can be avoided using the fourth normal form.

One way to remove the redundancy is to decompose the EMPLOYEE relation into two relations, PROJECT and HOBBY. Now, if we wish to insert Sybase in the PROJECT relation, only one entry is required.

Decomposed relations to reduce redundancy

    PROJECT                     HOBBY
    NAME    PROJECT             NAME    HOBBY
    A       Microsoft           A       Cricket
    A       Oracle              A       Music
    B       Intel               B       Movies
    B       Sybase              B       Reading
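The decomposition of EMPLOYEE into PROJECT and HOBBY can be checked with a short Python sketch: projecting the eight tuples onto (NAME, PROJECT) and (NAME, HOBBY) and then joining the two projections on NAME should give back exactly the original relation. The use of Python sets and the variable names are assumptions for illustration; the tuples are the ones listed in the EMPLOYEE table of this answer.

```python
# EMPLOYEE(NAME, PROJECT, HOBBY) as a set of tuples.
employee = {
    ("A", "Microsoft", "Cricket"), ("A", "Oracle", "Music"),
    ("A", "Microsoft", "Music"), ("A", "Oracle", "Cricket"),
    ("B", "Intel", "Movies"), ("B", "Sybase", "Reading"),
    ("B", "Intel", "Reading"), ("B", "Sybase", "Movies"),
}

# Project onto the two smaller relations (duplicates vanish in a set).
project = {(name, proj) for name, proj, _ in employee}
hobby = {(name, hob) for name, _, hob in employee}

# Natural join of PROJECT and HOBBY on NAME.
rejoined = {
    (name, proj, hob)
    for name, proj in project
    for other, hob in hobby
    if name == other
}

print(len(employee), len(project), len(hobby))
print(rejoined == employee)  # nothing lost, nothing spurious
```

Eight tuples shrink to four plus four, and the join reproduces the original relation, so the decomposition removes the redundancy without losing information.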

Fourth Normal Form (4NF): The definition of 4NF is violated when a relation has undesirable multivalued dependencies; such relations can be identified and decomposed into 4NF relations.

Alternate definition: A relation R is said to be in 4NF if, for every MVD A ->> B that holds over R, one of the following is true:
- B is a subset of A, or A ∪ B = R (the MVD is trivial), or
- A is a super key.

The EMPLOYEE relation is not in 4NF because of the non-trivial MVDs NAME ->> PROJECT and NAME ->> HOBBY (the project and hobby attributes of the employee relation are independent of each other), and NAME is not a super key of EMPLOYEE. To bring this relation into 4NF you have to decompose EMPLOYEE into PROJECT and HOBBY.

Q6. What do you mean by Shared Lock & Exclusive lock? Describe briefly two phase locking protocol?

Shared Locks: A shared lock is used for read only operations, i.e., for operations that do not change or update the data, e.g. a SELECT statement. Shared locks allow concurrent transactions to read (SELECT) a data item. No other transaction can modify the data while shared locks exist. Shared locks are released as soon as the data has been read.

Exclusive Locks: Exclusive locks are used for data modification operations, such as UPDATE, DELETE and INSERT. An exclusive lock ensures that multiple updates cannot be made to the same resource simultaneously. No other transaction can read or modify data that is locked by an exclusive lock. Exclusive locks are held until the transaction commits or rolls back, since they are used for write operations.

There are three locking operations: read_lock(X), write_lock(X), and unlock(X). A lock associated with an item X, LOCK(X), now has three possible states: "read-locked", "write-locked", or "unlocked". A read-locked item is also called share-locked, because other transactions are allowed to read the item, whereas a write-locked item is called exclusive-locked, because a single transaction exclusively holds the lock on the item.

Each record in the lock table will have four fields: <data item name, LOCK, no_of_reads, locking_transaction(s)>. The value (state) of LOCK is either read-locked or write-locked.

read_lock(X):
    B: if LOCK(X) = "unlocked"
           then begin LOCK(X) <- "read-locked";
                      no_of_reads(X) <- 1
                end
       else if LOCK(X) = "read-locked"
           then no_of_reads(X) <- no_of_reads(X) + 1
       else begin
                wait (until LOCK(X) = "unlocked" and the lock manager wakes up the transaction);
                goto B
            end;

write_lock(X):
    B: if LOCK(X) = "unlocked"
           then LOCK(X) <- "write-locked"
       else begin
                wait (until LOCK(X) = "unlocked" and the lock manager wakes up the transaction);
                goto B
            end;

unlock(X):
    if LOCK(X) = "write-locked"
        then begin LOCK(X) <- "unlocked";
                   wake up one of the waiting transactions, if any
             end
    else if LOCK(X) = "read-locked"
        then begin
                 no_of_reads(X) <- no_of_reads(X) - 1;
                 if no_of_reads(X) = 0
                     then begin LOCK(X) <- "unlocked";
                                wake up one of the waiting transactions, if any
                          end
             end;

The Two Phase Locking Protocol

The two phase locking protocol is a discipline for acquiring and releasing locks on shared resources so that concurrent transactions remain serializable. (The basic protocol does not by itself prevent deadlocks.) The process consists of two phases.

1. Growing phase: In this phase the transaction may acquire locks, but may not release any locks. Therefore this phase is also called the resource acquisition activity.
2. Shrinking phase: In this phase the transaction may release locks, but may not acquire any new locks. This includes the modification of data and the release of locks. Here two activities are grouped together to form the second phase.

In the beginning, the transaction is in the growing phase: whenever a lock is needed, the transaction acquires it. As soon as a lock is released, the transaction enters the shrinking phase and can no longer acquire new locks.
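The lock table operations and the two phase rule can be put together in a small Python sketch. It is a simplification under stated assumptions: instead of making a transaction wait, `read_lock` and `write_lock` simply return False when the lock cannot be granted; lock upgrades are not handled; and the class names `LockTable` and `TwoPhaseTransaction` are invented for this illustration.

```python
class LockTable:
    """Tracks the state of each item: read-locked, write-locked or absent (unlocked)."""

    def __init__(self):
        self.state = {}    # item -> "read-locked" or "write-locked"
        self.holders = {}  # item -> set of transaction names holding the lock

    def read_lock(self, txn, item):
        if self.state.get(item) == "write-locked":
            return False                      # a real system would wait here
        self.state[item] = "read-locked"
        self.holders.setdefault(item, set()).add(txn)
        return True

    def write_lock(self, txn, item):
        if item in self.state:                # read-locked or write-locked
            return False
        self.state[item] = "write-locked"
        self.holders[item] = {txn}
        return True

    def unlock(self, txn, item):
        holders = self.holders.get(item, set())
        holders.discard(txn)
        if not holders:                       # last holder gone: item is unlocked
            self.state.pop(item, None)
            self.holders.pop(item, None)


class TwoPhaseTransaction:
    """Refuses to acquire a lock once any lock has been released."""

    def __init__(self, name, table):
        self.name = name
        self.table = table
        self.shrinking = False

    def lock(self, item, exclusive=False):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested in shrinking phase")
        if exclusive:
            return self.table.write_lock(self.name, item)
        return self.table.read_lock(self.name, item)

    def unlock(self, item):
        self.shrinking = True                 # growing phase is over
        self.table.unlock(self.name, item)


table = LockTable()
t1 = TwoPhaseTransaction("T1", table)
t2 = TwoPhaseTransaction("T2", table)
print(t1.lock("X"))                  # shared lock granted
print(t2.lock("X"))                  # shared locks can coexist
print(t2.lock("Y", exclusive=True))  # exclusive lock on a free item
print(t1.lock("Y"))                  # refused: Y is write-locked by T2
```

Once `t1.unlock("X")` is called, any further `t1.lock(...)` raises an error, which is exactly the growing/shrinking rule described above.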
