Q. 1. What do you mean by database?

Ans. A database is a collection of occurrences of multiple record types, containing the relationships between records, data aggregates and data items. A database may also be defined as a collection of interrelated data stored together, without harmful and unnecessary redundancy (duplicate data), to serve multiple applications. The data is stored so that it is independent of the programs which use it. A common and controlled approach is used for adding new data and for modifying, retrieving or deleting existing data within the database. A running database has a function in a corporation, factory, government department or other organization. A database is used for searching the data to answer queries. A database may be designed for batch processing, real-time processing or online processing.

DATABASE SYSTEM

A database system is an integrated collection of related files along with the details about their definition, interpretation, manipulation and maintenance. It is a system which satisfies the data needs of various applications in an organization without unnecessary redundancy. A database system is based on the data. A database system is run by using software called a DBMS (Database Management System). A database system protects the data from unauthorized access.

Foundation Data Concept

A hierarchy of several levels of data has been devised that differentiates between different groupings, or elements, of data. Data are logically organized into:

Character

It is the most basic logical data element. It consists of a single alphabetic, numeric, or other symbol.

Field

It consists of a grouping of characters. A data field represents an attribute (a characteristic or quality) of some entity (object, person, place, or event).

Record

The related fields of data are grouped to form a record. Thus, a record represents a collection of attributes that describe an entity. Fixed-length records contain a fixed number of fixed-length data fields. Variable-length records contain a variable number of fields and field lengths.

File

A group of related records is known as a data file, or table. Files are frequently classified by the application for which they are primarily used, such as a payroll file or an inventory file, or by the type of data they contain, such as a document file or a graphical image file. Files are also classified by their permanence, for example, a master file versus a transaction file. A transaction file contains records of all transactions occurring during a period, whereas a master file contains all the permanent records. A history file is an obsolete transaction or master file retained for backup purposes or for long-term historical storage, called archival storage.

Database

It is an integrated collection of logically related records or objects. A database consolidates records previously stored in separate files into a common pool of data records that provides data for many applications. The data stored in a database is independent of the application programs using it and of the type of secondary storage devices on which it is stored.
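The hierarchy above (character, field, record, file, database) can be pictured with a small Python sketch. All names here (StudentRecord, student_file, the sample students) are hypothetical and chosen only for illustration; a real DBMS stores these levels very differently.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    """A record: related fields that together describe one entity."""
    roll_no: int   # field: one attribute of the entity
    name: str      # field: itself a grouping of characters
    course: str    # field


# A file (or table): a group of related records.
student_file = [
    StudentRecord(1, "Asha", "BCA"),
    StudentRecord(2, "Ravi", "MCA"),
]

# A database: an integrated collection of logically related files.
database = {"students": student_file, "fees": []}

# Characters are the most basic element: a field breaks down into them.
first_student = database["students"][0]
characters = list(first_student.name)
print(characters)  # ['A', 's', 'h', 'a']
```

Reading the sketch from the bottom up gives the same order as the text: characters make up a field, fields make up a record, records make up a file, and files make up the database.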

Q. 2. What are the various characteristics of DBMS?

Ans. The major characteristics of the database approach are:
• Self-describing Nature of a Database System
• Insulation between Programs and Data, and Data Abstraction
• Support of Multiple Views of the Data
• Sharing of Data and Multi-user Transaction Processing

Q. 3. What are the various characteristics of DBMS approach?

Ans.

1. Self-contained nature: A DBMS contains the data plus a full description of the data (called "metadata"). Metadata is data about data: data formats, record structures, locations, how to access, indexes. Metadata is stored in a catalog and is used by the DBMS software to know how to access the data. Contrast this with the file processing approach, where application programs need to know the structure and format of records and data.

2. Program-data independence: Data independence is the immunity of application programs to changes in storage structures and access techniques, e.g. adding a new field, changing an index structure, changing a data format. In a DBMS environment these changes are reflected in the catalog, and applications are not affected. Traditional file processing programs would all have to change, possibly substantially.

3. Data abstraction: A DBMS provides users with a conceptual representation of data (for example, as objects with properties and inter-relationships). Storage details are hidden. The conceptual representation is provided in terms of a data model.

4. Support for multiple views: A DBMS may allow different users to see different "views" of the DB, according to the perspective each one requires.
• A view may be a subset of the data. For example, the people using the payroll system need not (and should not) see data about students and class schedules.
• A view may present data in a different form from the way it is stored. For example, someone interested in student transcripts might get a view which is formed by combining information from separate files or tables.
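The two kinds of view described in point 4 can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table names, column names and sample rows are hypothetical, and SQLite stands in for whatever DBMS is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_no INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE grade   (student_no INTEGER, course TEXT, mark INTEGER);
    CREATE TABLE staff   (staff_no INTEGER PRIMARY KEY, name TEXT, salary REAL);

    INSERT INTO student VALUES (1, 'Asha');
    INSERT INTO grade   VALUES (1, 'DBMS', 82), (1, 'Networks', 74);
    INSERT INTO staff   VALUES (10, 'Ravi', 42000);

    -- View as a subset: payroll users see staff pay and nothing about students.
    CREATE VIEW payroll_view AS
        SELECT staff_no, name, salary FROM staff;

    -- View in a different form: a transcript combined from two stored tables.
    CREATE VIEW transcript_view AS
        SELECT s.name, g.course, g.mark
        FROM student AS s JOIN grade AS g ON g.student_no = s.student_no;
""")

payroll = conn.execute("SELECT * FROM payroll_view").fetchall()
transcript = conn.execute(
    "SELECT * FROM transcript_view ORDER BY course"
).fetchall()
print(payroll)
print(transcript)
```

Neither view stores any data of its own; each is a stored query, so the underlying tables can be reorganized without the users of the views noticing, which is the point of data abstraction.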

The database approach brings a number of advantages: (a) reduces redundancy (b) avoids inconsistencies (c) data can be shared (d) standards can be enforced (e) security restrictions can be applied (f) integrity can be maintained

a. Centralized control of the data resource: The DBMS provides centralized control of data in an organization.

b. Redundancy and Inconsistencies: Redundancy is unnecessary duplication of data, for example if the accounts department and the registration department both keep student name, number and address. Redundancy wastes space and duplicates effort in maintaining the data. Redundancy also leads to inconsistency. Inconsistent data is data which contradicts itself, e.g. two different addresses for a given student number. Inconsistency cannot occur if data is represented by a single entry (i.e. if there is no redundancy). Controlled redundancy: some redundancy may be desirable (for efficiency). A DBMS should be aware of it and take care of propagating updates to all copies of a data item. This is an objective, not yet currently supported.

c. Sharing:
• Need concurrency control
• Multiple user views

d. Standards: e.g. data formats, record structures, naming, documentation. There may be international, organizational and departmental standards.

e. Security: restricting unauthorized access. The DBMS should perform security checks on all accesses.

f. Integrity: maintaining validity of data, e.g. employee numbers must be in some range; every course must have an instructor; student number must be unique; hours worked cannot be more than 150. These things are expressed as constraints. The DBMS should perform integrity checks on all updates. Currently DBMSs provide limited integrity checks.
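The integrity rules listed under point f can be expressed as constraints that the DBMS checks on every update. The following is a minimal sketch using Python's sqlite3 module; the table name, the column names and the 1000 to 9999 employee-number range are assumptions made for illustration (the text only says "some range").

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        -- unique by definition, and restricted to an assumed range
        emp_no       INTEGER PRIMARY KEY CHECK (emp_no BETWEEN 1000 AND 9999),
        -- hours worked cannot be more than 150
        hours_worked INTEGER NOT NULL CHECK (hours_worked <= 150)
    )
""")
conn.execute("INSERT INTO employee VALUES (1001, 40)")


def try_insert(row):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


duplicate = try_insert((1001, 10))       # same employee number twice
too_many_hours = try_insert((1002, 200)) # more than 150 hours
out_of_range = try_insert((5, 20))       # employee number outside the range
valid = try_insert((1003, 150))

for label, result in [("duplicate", duplicate), ("too many hours", too_many_hours),
                      ("out of range", out_of_range), ("valid", valid)]:
    print(label, "->", result)
```

Because each employee number can appear only once, the table also illustrates point b: data represented by a single entry cannot contradict itself.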

Q. What are the various types of databases?

Ans. Types of Databases: Continuing developments in information technology and its business applications have resulted in the evolution of several major types of databases. Several major conceptual categories of databases that may be found in computer-using organizations include:

Operational Databases: These databases store detailed data needed to support the operations of the entire organization. They are also called subject area databases (SADB), transaction databases, and production databases. Examples are customer databases, personnel databases, inventory databases, and other databases containing data generated by business operations.

Distributed Databases: Many organizations replicate and distribute copies or parts of databases to network servers at a variety of sites. These distributed databases can reside on network servers on the World Wide Web, on corporate Intranets or extranets, or on other company networks. Distributed databases may be copies of operational or analytical databases, hypermedia or discussion databases, or any other type of database. Replication and distribution of databases is done to improve database performance and security.

External Databases: Access to external, privately owned online databases or data banks is available for a fee to end users and organizations from commercial online services, and with or without charge from many sources on the Internet, especially the Web.

Hypermedia Databases: A hypermedia database consists of hyperlinked pages of multimedia (text, graphics, photographic images, video clips, audio segments, etc.). From a database management point of view, the set of interconnected multimedia pages at a website is a database of interrelated hypermedia page elements, rather than interrelated data records.

Q. What do you mean by DBMS?

Ans. A database is the information to be stored, whereas the database management system is the system used to manage the database. The structure of the data may be regarded in terms of its hardware implementation, called the physical structure, or independently of its hardware implementation, called the logical structure. In either case, the data structure is regarded as static because a database cannot process anything. The DBMS is regarded as dynamic because it is through the DBMS that all database processing takes place. How the DBMS presents data to the user is called the view structure.

There are two general modes for data use: queries and transactions. Both forms use the DBMS for processing. A query is processed for presentation in views, and none of this processing is written to the database. A transaction is processed for updating values in the database variables. These updates are written to the database.

A DBMS provides various functions like data security, data integrity, data sharing, data concurrence, data independence, data recovery etc. However, the database management systems now available in the market, like Sybase, Oracle, and MS-Access, do not all provide the same set of functions, though all are meant for data management.

A DBMS is a set of programs that is used to store and manipulate data. This includes the following:
• Adding new data, for example adding details of a new student.
• Changing existing data, for example modifying the fee paid by a student.
• Deleting unwanted data, for example deleting the details of students who have completed the course.

A DBMS is best described as a collection of programs that manage the database structure and that control shared access to the data in the database. Current DBMSes also store the relationships between the database components; they also take care of defining the required access paths to those components. A database management system (DBMS) is the combination of data, hardware, software and users to help an enterprise manage its operational data. The main function of a DBMS is to provide efficient and reliable methods of data retrieval to many users. Efficient data retrieval is an essential function of database systems. A DBMS must be able to deal with several users who try to simultaneously access several items and, most frequently, the same data item.

Q. What are the various components of DBMS?

Ans. Basic Components: A database system has four components. These are: 1. Data 2. Hardware 3. Software 4. Users. These four components are important for understanding and designing the database system.

1. Data

As we have discussed above, data is raw-hand information collected by us. Data is made up of data items or data aggregates. A data item is the smallest unit of named data; it may consist of bits or bytes. A data item is often referred to as a field or data element. A data aggregate is a collection of data items within the record, which is given a name and referred to as a whole. Data can be collected orally or in writing. A database can be integrated and shared. Data stored in a system is partitioned into one or two databases, so if by chance data is lost or damaged at one place, it can be accessed from the second place by using the sharing facility of the database system. Shared data can also be reused according to the user's requirement. Data must also be in integrated form. Integration means that data should be in unique form, i.e. collected in a well-defined manner with no redundancy. For example, the roll number in a class is in non-redundant form and so has unique existence, but names in a class may be redundant and can create a lot of problems later on in using and accessing the data.

2. Hardware

Hardware is also a major and primary part of the database. Without hardware nothing can be done. The definition of hardware is "that which we can touch and see", i.e. it has physical existence. All physical items are in this category. For example, all the input, output and storage devices like keyboard, mouse, scanner, monitor, and storage devices (hard disk, floppy disk, magnetic disk, magnetic drum etc.) are commonly used with a computer system.

3. Software

Software is another major part of the database system. It is the other side of hardware. Hardware and software are two sides of a coin; they go side by side. We can define software as that which we cannot touch and see. Software can only be executed. Software is further subdivided into two categories. The first type is system software (like all the operating systems, all the languages and system packages etc.) and the second is application software (payroll, electricity billing, hospital management, hostel administration etc.). By using software, data can be manipulated, organized and stored.

4. Users

Without users, all of the above components (data, hardware and software) are meaningless. The user collects the data and operates and handles the hardware. The operator also feeds in the data and arranges it in order by executing the software.

Other components
1. People: database administrator; system developer; end user.
2. CASE tools: Computer-aided Software Engineering (CASE) tools.
3. User interface: Microsoft Access; PowerBuilder.
4. Application Programs: PowerBuilder script language; Visual Basic; C++; COBOL.
5. Repository: stores definitions of data (called METADATA), screen and report formats, menu definitions, etc.
6. Database: stores the actual occurrences of data.
7. DBMS: provides tools to manage all of this: create data, maintain data, control security access to data and to the repository, etc.
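The three basic DBMS operations named in the answer on what a DBMS is (adding, changing and deleting data), and the difference between a transaction and a query, can be sketched as follows. This is a minimal illustration with Python's sqlite3 module; the student table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        roll_no   INTEGER PRIMARY KEY,
        name      TEXT,
        fee_paid  REAL,
        completed INTEGER   -- 1 if the student has completed the course
    )
""")

# A transaction: its updates are written to the database.
# "with conn" commits on success and rolls back if an exception is raised.
with conn:
    # Adding new data: details of new students.
    conn.execute("INSERT INTO student VALUES (1, 'Asha', 0, 0)")
    conn.execute("INSERT INTO student VALUES (2, 'Ravi', 500, 1)")
    # Changing existing data: the fee paid by a student.
    conn.execute("UPDATE student SET fee_paid = 1200 WHERE roll_no = 1")
    # Deleting unwanted data: students who have completed the course.
    conn.execute("DELETE FROM student WHERE completed = 1")

# A query: the result is presented to the user, nothing is written back.
rows = conn.execute("SELECT roll_no, name, fee_paid FROM student").fetchall()
print(rows)  # [(1, 'Asha', 1200.0)]
```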

Q. What are the various functions of DBMS?

Ans. These functions will include support for at least all of the following:

• Data definition: The DBMS must be able to accept data definitions (external schemas, the conceptual schema, the internal schema, and all associated mappings) in source form and convert them to the appropriate object form.

• Data manipulation: The DBMS must be able to handle requests from the users to retrieve, update, or delete existing data in the database, or to add new data to the database. In other words, the DBMS must include a data manipulation language (DML) processor component.

• Data security and integrity: The DBMS must monitor user requests and reject any attempt to violate the security and integrity rules defined by the DBA.

• Data recovery and concurrency: The DBMS, or else some other related software component usually called the transaction manager, must enforce certain recovery and concurrency controls.

• Data dictionary: The DBMS must provide a data dictionary function. The data dictionary can be regarded as a database in its own right (but a system database, rather than a user database). The dictionary contains "data about the data" (sometimes called metadata), that is, definitions of other objects in the system, rather than just "raw data". In particular, all the various schemas and mappings (external, conceptual, etc.) will physically be stored, in both source and object form, in the dictionary. A comprehensive dictionary will also include cross-reference information, showing, for instance, which programs use which pieces of the database, which users require which reports, which terminals are connected to the system, and so on. The dictionary might even (in fact, probably should) be integrated into the database it defines, and thus include its own definition. It should certainly be possible to query the dictionary just like any other database, so that, for example, it is possible to tell which programs and/or users are likely to be affected by some proposed change to the system.

• Performance: It goes without saying that the DBMS should perform all of the functions identified above as efficiently as possible.
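The idea that the dictionary can be queried "just like any other database" can be seen in SQLite, which keeps its schema in a system table. The sketch below uses Python's sqlite3 module; the course table is hypothetical, and sqlite_master together with PRAGMA table_info is only one system's (much simpler) counterpart of the full data dictionary described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE course (
        code       TEXT PRIMARY KEY,
        title      TEXT NOT NULL,
        instructor TEXT NOT NULL   -- every course must have an instructor
    )
""")

# The system table sqlite_master keeps each definition in source form
# and is queried with ordinary SELECT statements.
source_form = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'course'"
).fetchone()[0]

# PRAGMA table_info yields one row per column:
# (cid, name, type, notnull, dflt_value, pk)
column_names = [row[1] for row in conn.execute("PRAGMA table_info(course)")]

print(source_form)
print(column_names)  # ['code', 'title', 'instructor']
```

A program that wants to know whether a proposed change (say, dropping the instructor column) affects it could inspect the same metadata before running.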

Q. What are the advantages and disadvantages of a database approach?

Ans.

ADVANTAGES OF DBMS

One of the major advantages of using a database system is that the organization's data can be handled easily, with centralized management and control over the data by the DBA. The main advantages of a DBMS are:

1. Controlling Redundancy: In a DBMS, redundancy (duplicate data) is controlled. If any duplicate data arises, the DBA can control it and arrange the data in a non-redundant way. The DBMS stores the data on the basis of a primary key, which is always unique and holds non-redundant information. For example, Roll no is the primary key used to store the student data. In traditional file processing, every user group maintains its own files. Each group independently keeps its own files on, e.g., students. Therefore, much of the data is stored twice or more. Redundancy leads to several problems:
• Duplication of effort
• Storage space wasted when the same data is stored repeatedly
• Files that represent the same data may become inconsistent (since the updates are applied independently by each user group)
In a DBMS we can use controlled redundancy.

2. Restricting Unauthorized Access: A DBMS should provide a security and authorization subsystem.
• Some db users will not be authorized to access all information in the db (e.g., financial data).
• Some users are allowed only to retrieve data.
• Some users are allowed both to retrieve and to update the database.

3. Providing Persistent Storage for Program Objects and Data Structures: The data structures provided by the DBMS must be compatible with the programming language's data structures. E.g., object-oriented DBMSs are compatible with programming languages such as C++ and SMALLTALK, and the DBMS software automatically performs conversions between programming data structures and file formats.

4. Permitting Inferencing and Actions Using Deduction Rules: Deductive database systems provide capabilities for defining deduction rules for inferencing new information from the stored database facts.

5. Inconsistency can be reduced: In a database system, data may to some extent be stored in an inconsistent way. Inconsistency is another form of duplicacy. Suppose that an employee "Japneet" working in department "Computer" is represented by two distinct entries in a database. In this way inconsistent data is stored, and the DBA can remove this inconsistent data by using the DBMS.

6. Data can be shared: In a database system data can be easily shared by different users. For example, student data can be shared by the teaching department, administrative block, accounts branch, laboratory etc.

7. Standards can be enforced or maintained: By using a database system, standards can be maintained in an organization. The DBA is the overall controller of the database system. A manually maintained database is hard to standardize, but when the DBA uses a DBMS and enters the data in the computer, standards can be enforced or maintained by the computerized system.

8. Security can be maintained: Passwords can be applied in a database system, or files can be secured by the DBA. A database system also offers different coding techniques to encode the data, i.e. to keep the data safe from unauthorized access. It also provides a login facility for securing and saving the data against either accidental or intentional threats.

9. Integrity can be maintained: In a database system, data can be written or stored in an integrated way. Integration means unification and sequencing of data. In other words, integrity can be defined as "the data contained in the database is both accurate and consistent". Data can be accessed if it is compiled in a unique form. We can take a primary key and some secondary keys for integration of data. Centralized control can also ensure that adequate checks are incorporated in the DBMS to provide data integrity.

10. Confliction can be removed: In a database system, data can be written or arranged in a well-defined manner by the DBA, so there is no confliction between the databases. The DBA selects the best file structure and accessing strategy to get better performance for the representation and use of the data.

11. Providing Multiple User Interfaces: For example query languages, programming language interfaces, forms, menu-driven interfaces, etc.

12. Representing Complex Relationships Among Data: A DBMS is used to represent complex relationships among data.

13. Providing Back-up and Recovery: The DBMS also provides back-up and recovery features. The same recovery procedure can also be maintained to access the data by using the DBMS facility.

DISADVANTAGES OF DBMS

A database management system has many advantages, but because some major problems arise in using a DBMS, it also has some disadvantages. These are explained as:

1. Cost: A significant disadvantage of a DBMS is cost. In addition to the cost of purchasing or developing the software, the organization will also have to purchase or upgrade the hardware, and so it becomes a costly system. Additional cost also occurs due to migration of data from one DBMS environment to another.

2. Problems associated with centralization: Centralization also means that data is accessible from a single source. As we know, the centralized data can be accessed by each user, so there is no security of data from unauthorized access, and data can be damaged or lost.

3. Complexity of back-up and recovery: Back-up and recovery are fairly complex in a DBMS environment. If you take a back-up of the data, it may affect the multi-user database system which is in operation. A damaged database can be recovered from the back-up floppy, but loading it into the concurrent multi-user database system may introduce duplicacy.

4. Confidentiality, Privacy and Security: When information is centralized and is made available to users from remote locations, the possibilities of abuse are often greater than in a conventional system. To reduce the chances of unauthorized users accessing sensitive information, it is necessary to take technical, administrative and, possibly, legal measures. Most databases store valuable information that must be protected against deliberate trespass and destruction.

5. Data Quality: Since the database is accessible to users remotely, adequate controls are needed to control users updating data and to control data quality. With an increased number of users accessing data directly, there are enormous opportunities for users to damage the data. Unless there are suitable controls, the data quality may be compromised.

6. Data Integrity: Since a large number of users could be using a database concurrently, technical safeguards are necessary to ensure that the data remain correct during operation. The main threat to data integrity comes from several different users attempting to update the same data at the same time. The database therefore needs to be protected against inadvertent changes by the users.

7. Enterprise Vulnerability: Centralizing all data of an enterprise in one database may mean that the database becomes an indispensable resource. The survival of the enterprise may depend on reliable information being available from its database. The enterprise therefore becomes vulnerable to the destruction of the database or to unauthorized modification of the database.

8. The Cost of using a DBMS: Conventional data processing systems are typically designed to run a number of well-defined, preplanned processes. Such systems are often "tuned" to run efficiently for the processes that they were designed for. Although the conventional systems are usually fairly inflexible, in that new applications may be difficult to implement and/or expensive to run, they are usually very efficient for the applications they are designed for. The database approach, on the other hand, provides a flexible alternative where new applications can be developed relatively inexpensively. The flexible approach is not without its costs, and one of these costs is the additional cost of running applications that the conventional system was designed for. Using standardized software is almost always less machine-efficient than specialized software.
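The threat described under Data Integrity, several users updating the same data at the same time, is often called a lost update. The sketch below simulates it deterministically on a single sqlite3 connection in Python, with two imaginary users A and B; the account table and the amounts are hypothetical, and a real multi-user DBMS would rely on locking or transactions rather than on this hand-written interleaving.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")


def read_balance():
    return conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]


# Unsafe pattern: both users read the old value, then each writes back
# "old value + deposit", so the second write overwrites the first.
seen_by_a = read_balance()
seen_by_b = read_balance()
conn.execute("UPDATE account SET balance = ? WHERE id = 1", (seen_by_a + 50,))
conn.execute("UPDATE account SET balance = ? WHERE id = 1", (seen_by_b + 30,))
lost_update_result = read_balance()   # 130: user A's deposit of 50 is lost

# Safer pattern: let the DBMS do the read-modify-write in one statement.
conn.execute("UPDATE account SET balance = 100 WHERE id = 1")   # reset
conn.execute("UPDATE account SET balance = balance + 50 WHERE id = 1")
conn.execute("UPDATE account SET balance = balance + 30 WHERE id = 1")
safe_result = read_balance()          # 180: both deposits are kept

print(lost_update_result, safe_result)
```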

Q. List five significant differences between a file-processing system and a DBMS.

Ans. Before differentiating between file and database systems, there is a need to understand the DBMS and its components. Let us consider an organization that has a huge amount (collection) of data on its different departments, its employees, its products, sale and purchase orders etc. As we know, such data is accessed simultaneously by several different employees. Some users apply a number of queries and want answers quickly. If the data is stored in files, it will create a problem of slow processing. Suppose we try to deal with this type of data management problem by storing the data in a collection of operating system files. Such a technique creates a number of problems or drawbacks, which are discussed below:

1. As we do not have 1000GB of main memory (primary memory) to store the data, we store the data in some permanent storage device (secondary memory) like magnetic disk or magnetic tape etc. So the file-oriented system fails in primary memory cases, and we apply a database management system to store the data files permanently.

2. Even if we had such a large amount of primary memory on a 16-bit or 32-bit computer system, a problem would occur in the file-based system in using the data by direct or random addressing. Also, we cannot call more than 2GB or 4GB of data directly into primary memory at a time. So there is a need for a database program to identify the data.

3. Some programs are too lengthy and complex, and cannot store a large amount of data in the files related to the operating system. A database system makes this simple and fast.

4. We cannot change and access file-oriented data simultaneously, so we require a type of system which can be used to access the large amount of data concurrently.

5. We also cannot recall or recover file-oriented data, but centralized database management solves this type of problem.

6. A file-oriented operating system provides only a password mechanism for security, but this is not successful when a number of users are accessing the same data by using the same login.

In the end we can say that a DBMS is a piece of software that is designed to make the processing faster and easier.

Q. Describe major advantages of a database system over file system.

Or

Discuss the DBMS and File processing system. Also give the limitations of file processing system.

Ans.

TRADITIONAL FILE PROCESSING

Data are organized, stored, and processed in independent files of data records. In the traditional file processing approach, each business application was designed to use one or more specialized data files containing only specific types of data records.

TRADITIONAL FILE SYSTEM OR FILE ORIENTED APPROACH

The business computers of 1980 were used for processing business records and producing information using the file-oriented approach or file processing environment. At that time that system was reliable and faster than the manual system of record keeping and processing. In this system the data is organized in the form of different files. Since that system was a collection of files, we can say it was a file-oriented system. The following terms were commonly used in this approach, or are the features of the file-oriented system.

1. Master file: The file that is created only once, i.e. at the starting of computerization, or a file which rarely changes. For example, in a bank master file the account no, name and balance are entered only once and change less frequently.

2. Transaction file: A file that is created repeatedly after a regular interval of time. For example, the payroll file of employees is updated at the end of every month.

3. File activity ratio: The number of records processed in one run divided by the total number of records. For example, if we change 100 records in a bank file containing 200 records, then the file activity ratio is 100/200 = 0.5. It should be noted that this ratio is low for a master file.

4. File volatility ratio: It is the number of records updated in a transaction file divided by the total number of records. The file volatility ratio of a transaction file is very high.

5. Work file: A temporary file that helps in sorting and merging of records from one file to another.

6. Data island: In this system each department has its own files designed for local applications. Each department has its own data processing staff, set of policies, working rules and report formats. It means programs were dependent on the file structure or format of the file. If the structure of the file changes, the program also has to be changed.

7. File organization: It means the arrangement of records in a particular order. There were three types of file organization: 1. Sequential 2. Direct 3. Indexed sequential.

These days the file-oriented approach is still used, but it has the following limitations:

LIMITATIONS OF FILE ORIENTED APPROACH

• Duplicate data: Since all the files are independent of each other, some of the fields or files are stored more than once. Hence duplicacy is greater in the file approach, whereas a DBMS has controlled duplicacy.

• Separated and isolated data: To make a decision, a user might need data from two separate files. First, the files were evaluated by analysts and programmers to determine the specific data required from each file and the relationships between the data. Then applications could be written in a third-generation language to process and extract the needed data. Imagine the work involved if data from several files was needed!

• Inconsistency: In this system, data is not consistent. If a data item is changed, all the files containing that data item need to be changed and updated properly. If all the files are not updated properly, there may be a high risk of inconsistency. A DBMS has data consistency.

• Poor data integrity: A file is said to have data integrity when an item is not stored in a duplicate manner. It has been seen that file-oriented systems have poor data integrity control. Data integrity has been achieved in DBMS.

• Every operation is programmable: Processing tasks like searching, editing, deletion etc. each need separate programs. It means there were no functions available for these operations. A DBMS has ready-made commands for such operations.

• Data inflexibility: Program-data interdependency and data isolation limited the flexibility of file processing systems in providing users with ad hoc information requests. Because designing applications was so programming-intensive, MIS department staff usually restricted information requests. Therefore, users often resorted to manual methods to obtain needed information.

• Application programs are dependent on the file format: In a file processing system the physical formats of the files are entered in the programs. A change in the file means a change in the program and vice versa. There is no such problem in a DBMS.

• Poor data security: All the files are stored in flat form or as text files. These files can be easily located and trapped because the file approach has no data security.

• Concurrency problem: It means using the same record at the same time. This problem was common in the file approach but can be controlled in a DBMS.

• Difficult to represent complex objects: Some objects may need variable-length records, which cannot easily be computerized using this approach. A DBMS has the capability to handle fixed-length records as well as variable-length records.

• Cannot support heavy databases: The databases on the Internet cannot be handled by the file system, but a DBMS like Oracle is used for heavy database applications.

• Difficulty in representing data from the user's view: To create useful applications for the user, data from various files must often be combined. In file processing it was difficult to determine relationships between isolated data in order to meet user requirements.

The DBMS, on the other hand, has an advantage on each of these points, as noted above.
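The limitation "application programs are dependent on the file format" can be made concrete with a short Python sketch. The fixed-width layout, the field widths and the student data are all hypothetical, and sqlite3 is used only as a convenient stand-in for a DBMS.

```python
import sqlite3


# File approach: the program hard-codes the physical layout of each record
# (roll number in characters 0-3, name in characters 4-13).
def parse_fixed_width(line):
    return int(line[0:4]), line[4:14].strip()


old_line = f"{1:04d}{'Asha':<10}"
# The file format changes: a 4-character course field is added before the name.
new_line = f"{1:04d}{'BCA':<4}{'Asha':<10}"

old_parsed = parse_fixed_width(old_line)   # (1, 'Asha')
new_parsed = parse_fixed_width(new_line)   # (1, 'BCA Asha'): silently wrong

# DBMS approach: the program names the columns it wants, and the catalog
# absorbs the change when a new column is added.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Asha')")

query = "SELECT roll_no, name FROM student"
before = conn.execute(query).fetchall()
conn.execute("ALTER TABLE student ADD COLUMN course TEXT")
after = conn.execute(query).fetchall()

print(old_parsed, new_parsed)
print(before, after)
```

The file-based parser has to be rewritten whenever the layout changes, while the same query keeps returning the same rows after the table gains a column.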

The file processing approach finally became too cumbersome, costly, and inflexible to supply the information needed to manage modern businesses. It was replaced by the database management approach. File processing systems had the following major problems:
• Data Redundancy: Independent data files included a lot of duplicated data; the same data was recorded and stored in several files. This data redundancy caused problems when data had to be updated, since separate file maintenance programs had to be developed and coordinated to ensure that each file was properly updated. Unfortunately, a lot of inconsistencies occurred among data stored in separate files.
• Lack of Data Integration: Having independent files made it difficult to provide end users with information for ad hoc requests that required accessing data stored in several different files. Special computer programs had to be written to retrieve data from each independent file. This was so difficult, time-consuming, and costly for some organizations that it was impossible to provide end users or management with such information.
• Data Dependence: In file processing systems, major components of the system (the organization of files, their physical locations on storage hardware, and the application software used to access those files) depended on one another in significant ways. Changes in the format and structure of data and records in a file required that changes be made to all of the programs that used that file. This program maintenance effort was a major burden of file processing systems.
• Other Problems: It was easy for data elements to be defined differently by different end users and applications. Integrity of the data was suspect because there was no control over their use and maintenance by authorized end users.
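The redundancy problem described above can be illustrated with a small sketch. It assumes two departments that each keep their own copy of one student record; the file names, the record and the addresses are hypothetical and only serve the illustration.

```python
# A minimal sketch of data redundancy in a file-oriented system.
# Both "files" and all values below are hypothetical.
accounts_file = {101: {"name": "Amrita", "address": "101, Kashmir Avenue"}}
library_file = {101: {"name": "Amrita", "address": "101, Kashmir Avenue"}}


def update_address(records, student_no, new_address):
    """Each independent file needs its own maintenance step."""
    records[student_no]["address"] = new_address


def is_consistent(student_no):
    """True when both copies of the record hold the same address."""
    return (accounts_file[student_no]["address"]
            == library_file[student_no]["address"])


# Only the accounts file is updated; the library copy is forgotten.
update_address(accounts_file, 101, "22, Mall Road")
inconsistent_after_partial_update = not is_consistent(101)

# Propagating the update to every copy restores consistency.
update_address(library_file, 101, "22, Mall Road")
consistent_after_full_update = is_consistent(101)

print(inconsistent_after_partial_update, consistent_after_full_update)
```

With a single shared copy of the record there would be only one place to update, which is the point the database approach makes.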

Q. 10. What are the various types of database users?

Ans. Without users all of the above said components (data, hardware and software) are meaningless. Users can collect the data and operate and handle the hardware. The operator also feeds the data and arranges the data in order by executing the software. Users are mainly of four types. These are:

(a) Naive user
A naive user has no knowledge of the database system or any of its supporting software. These users work at the end form. They are like a layman with a little knowledge of the computer system. These users are mainly used for collecting the data in notebooks or on pre-designed forms. Users of an automated teller machine (ATM) are in this category. Internet users who are not computer-based persons are also in this form. A naive user can work on any simple GUI-based, menu-driven system.

(b) End User or Data Entry Operators
Data entry operators are preliminary computer-based users. The function of data entry operators is only to operate the computer (start/stop the computer), to feed or type the collected information (data) into a menu-driven application program, and to execute it according to the analyst's requirement. These users are also called on-line users. They communicate with the database directly via an on-line terminal or indirectly via a user interface. These users require a certain amount of expertise in the computer programming language, but require complete knowledge of computer operations.

(c) Application programmer
He is also called a simple programmer. The work of the application programmer is to develop a new project, i.e. a program for a particular application, or to modify an existing program. The application programmer works according to instructions given by the database administrator (DBA). The application programmer can handle programming languages like Fortran, Cobol, dBase etc.

(d) DBA (Data Base Administrator)
The DBA is a major user. The DBA is either a single person or a group of persons. The DBA is only the custodian of the business firm or organization, not the owner of the organization, just as a bank manager is the DBA of a bank, who takes care of the bank's money and does not use it. Only the DBA can handle the information collected by end users and give instructions to the application programmer for developing a new program or modifying an existing program. The DBA is also called the overall controller of the organization.

In the computer department of a firm either a system analyst or an EDP (Electronic Data Processing) Manager works as DBA. In other words the DBA is the overall controller of the complete hardware and software.

RESPONSIBILITIES OF DBA
As we know, the DBA is the overall commander of a computer system, so the DBA has a number of duties, but some of his/her major responsibilities are as follows:
1. DBA can control the data, hardware, and software and gives the instructions to the application programmer, end user and naive user.
2. DBA decides the information contents of the database. He/She decides the suitable database file structure for arrangement of data and uses the proper DDL techniques.
3. DBA compiles the whole data in a particular order and sequence.
4. DBA decides where data can be stored, i.e. takes decisions about the storage structure.
5. DBA decides which access strategy and technique should be used for accessing the data.
6. DBA co-operates with the user.
7. DBA also defines and applies authorization checks and validation procedures.
8. DBA also takes backup of the data on a backup storage device so that if data is lost it can be recovered and compiled again. DBA also recovers the damaged data.
9. DBA communicates with the user by appropriate meetings, and liaises with the users in the business to gain the confidence of the customer about availability of data.
10. DBA also changes the environment according to user or industry requirements and monitors the performance.
11. DBA should be a good decision-maker. The decisions taken by the DBA should be correct, accurate and efficient.
12. DBA should have leadership quality.

Q. 11. Discuss the architecture of database management system.

Ans. DBMS ARCHITECTURE

Many different frameworks have been suggested for the DBMS over the last several years. The generalized architecture of a database system is called the ANSI/SPARC (American National Standards Institute/Standards Planning and Requirements Committee) model. It grew out of a study group on database management systems set up by ANSI/SPARC in 1972. According to this approach, three levels of a database system were suggested and they are:
• External view (individual user view)
• Conceptual view (global or community user view)
• Internal level (physical or storage view)

For the system to be usable, it must retrieve data efficiently. This concern has led to the design of complex data structures for the representation of data in the database. Since many database system users are not computer trained, developers hide the complexity from users through several levels of abstraction, to simplify users' interactions with the system. These are the three views or levels of the architecture.

OBJECTIVES OF THREE LEVEL ARCHITECTURE

The database views were suggested because of the following reasons or objectives of the levels of a database:
1. To make changes easy in the database when some changes are needed by the environment.
2. The external view or user views should not depend upon any change made in another view. For example, changes in hardware, operating system or internal view should not change the external view.
3. The users of the database should not worry about the physical implementation and internal working of the database system.
4. The data should reside at the same place and all the users can access it as per their requirements.
5. The DBA can change the internal structure without affecting the user's view.
6. The database should be simple and changes can be easily made.
7. It is independent of all hardware and software.

All the three levels are described below.

External-View level
The highest level of abstraction, where only those parts of the entire database are included which are of concern to a user. Despite the use of simpler structures at the logical level, some complexity remains because of the large size of the database. Many users of the database system will not be concerned with all this information. Instead, such users need to access only a part of the database. So that their interaction with the system is simplified, the view level of abstraction is defined. The system may provide many views for the same database.

The features of this view are:
• The external or user view is at the highest level of database architecture.
• Here only one portion of the database will be given to the user.
• One portion may have many views.
• Many users and programs can use the part of the database they are interested in.
• By creating separate views of the database, we can maintain security.
• Only limited access (read only, write only etc.) can be provided in this view.

For example: The head of the accounts department is interested only in accounts and not in library information; the library department is only interested in books, staff and students etc. But all such data like student, books, accounts, staff etc. is present at one place and every department can use it as per need.

Databases change over time as information is inserted and deleted. The collection of information stored in the database at a particular moment is called an instance of the database. The overall design of the database is called the database schema. Schemas are changed infrequently, if at all. Database systems have several schemas, partitioned according to the levels of abstraction that we discussed. At the lowest level is the physical schema; at the intermediate level is the logical schema; and at the highest level is a subschema.

Conceptual-Logical level
Database administrators, who must decide what information is to be kept in the database, use this level of abstraction. One conceptual view represents the entire database. There is only one conceptual view per database. The description of data at this level is in a format independent of its physical representation. It also includes features that specify the checks to retain data consistency and integrity. The features are:
• The conceptual or logical view describes the structure of the data for many users.
• Only the DBA can define it.
• It is the global view seen by many users.
• It is represented at the middle level of the three level architecture.
• It is defined by defining the name, type and length of each data item. The create table commands of Oracle create this view.
• It is independent of all hardware and software.
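The relation between the conceptual level and the external level can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table student, its columns and the two view names are hypothetical, and SQLite stands in for whatever DBMS is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: one table describes the whole (hypothetical) college data.
conn.execute("""
    CREATE TABLE student (
        roll_no   INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        fees_paid INTEGER,
        books_out INTEGER
    )
""")
conn.execute("INSERT INTO student VALUES (15, 'Amrita', 1500, 2)")

# External level: each department sees only the portion it is interested in.
conn.execute(
    "CREATE VIEW accounts_view AS SELECT roll_no, name, fees_paid FROM student"
)
conn.execute(
    "CREATE VIEW library_view AS SELECT roll_no, name, books_out FROM student"
)

accounts_rows = conn.execute("SELECT * FROM accounts_view").fetchall()
library_rows = conn.execute("SELECT * FROM library_view").fetchall()

print(accounts_rows)  # [(15, 'Amrita', 1500)]
print(library_rows)   # [(15, 'Amrita', 2)]
```

The data is stored once, yet the accounts department never sees the library column and the library never sees the fees, which is how separate views help with security.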

At the conceptual level the entire database is described in terms of a small number of relatively simple structures, together with the relationships that exist among the data. Although implementation of the simple structures at the logical level may involve complex physical-level structures, the user of the logical level does not need to be aware of this complexity.

Internal-Physical level
The lowest level of abstraction describes how the data are stored in the database. The features are:
• It is the third level in the three level architecture.
• It describes the actual or physical storage of data.
• It stores the data on hardware so that it can be stored in optimal time and accessed in optimal time.
• It stores concepts like:
  • B-tree and hashing techniques for storage of data.
  • Primary keys, secondary keys, pointers and sequences for data search.
  • Data compression techniques.
• It is represented as
  FILE EMP [ INDEX ON EMPNO FIELD = { (EMPNO) BYTE (4), ENAME BYTE (25) } ]

Mapping between views
• The conceptual/internal mapping:
  o defines the correspondence between the conceptual and internal views
  o specifies the mapping from conceptual records to their stored counterparts
• An external/conceptual mapping:
  o defines the correspondence between a particular external view and the conceptual view
• A change to the storage structure definition means that the conceptual/internal mapping must be changed accordingly, so that the conceptual schema may remain invariant, achieving physical data independence.
• A change to the conceptual definition means that the conceptual/external mapping must be changed accordingly, so that the external schema may remain invariant, achieving logical data independence.

Q. 12. Write a note on Database Language and Interfaces.

Ans. Some main types of languages and facilities are provided by DBMS:
1. Programming Language
2. Data Manipulation Language
3. Data Definition Language
4. Schema Description Language
5. Sub-Schema Description Language
6. SQL (Structured Query Language)

1. Programming Language
All the programming languages like Cobol, Fortran, C, C++, Pascal etc. have a structured and logical form, with syntax and semantics, so they are commonly used to solve general and scientific problems. All business-oriented problems can be solved by third generation (3GL) and fourth generation (4GL) languages.

2. DML
A language that gives instructions to the programming language and other languages is called a data manipulation language (DML). It is an extension of the programming language, used to manipulate data in the database. This language creates an interface (linkage) between the user and the application program. DML involves retrieval of data from the database, insertion of new data into the database, and deletion or modification of the existing data. Some data manipulation operations are also called QUERY or QUERY OPERATIONS. A query is a statement in DML that requests the retrieval of data from the database, i.e. a search of the data according to the user requirement. The subset of the DML used to pose queries is known as the query language.

DML provides commands to select and retrieve data from the database. Commands used in the DML also insert, update and delete records. The commands have different syntax for different programming languages. For example, Fortran, Cobol, C etc. provide this type of facility with the help of the database management system. The data manipulation functions provided by the DBMS can be invoked in an application program directly by procedural calls or by preprocessor statements. This processing can be done by the compiler. The DML can be a procedural language according to the user requirement. If the DML is non-procedural then the user will indicate only what is to be retrieved. In both cases the DBMS works out how to produce the exact answer to the DML request.

3. DDL
The database management system provides a facility known as Data Definition Language or data description language (DDL). DDL can be used to define the conceptual schema (global) and also to give some details about how to implement this schema in the physical devices used to store the data. The definition includes all the entity sets and their associated attributes as well as the relationships among the entity sets. The definitions also include some constraints which are used in DML. DDL also maintains meta-data (data about the data in the database). Meta-data is held in a data dictionary, directory or system catalog that describes the data. The dictionary contains the information about the data stored in the database and it is consulted by the DBMS before any data manipulation operation. The DBMS maintains the information on the file structure and also uses some access method to access the data efficiently. DDL is used in support of DML. We can say that there is another language, the Data Sub Language (DSL), which is the combination of both DML and DDL.

DSL = DML + DDL

4. Schema Description Language (SDL) or Schema
It is necessary to describe the organization of the data in a formal manner. The logical and physical database descriptions are used by DBMS software. The complete and overall description of data is referred to as the schema. The words schema and subschema were brought into DBMS by CODASYL (Conference on Data Systems Languages) and by CODASYL's database task group. The schema is also referred to as the conceptual model or global view (community view) of data. Suppose we have a complete description of collected data covering all classes and student data, all employee (teaching and non-teaching) data and other data related to the college; this is called the schema of the college. We can say that we relate the whole college data logically, and this is called the schema.

5. Sub-Schema Description Language
The term schema is used to mean an overall chart of the data items, types and record types stored in a database. The term sub-schema refers to an application programmer's view of the data he uses. A sub-schema is a part of the schema. Many different sub-schemas can be derived from one schema. An application programmer does not use the whole data, i.e. the full schema. For example, in an organization the purchase-order for the maintenance department is a sub-schema of the whole schema description of the purchase department in the whole industry. Two or more application programmers use different sub-schemas. One person named A uses the sub-schema purchase-order whereas programmer B uses the sub-schema supplier. Their operations and views are different according to their own sub-schemas, but the two sub-schemas can be combined on the basis of a common key.

6. Structured Query Language (SQL)
SQL originated with System R, IBM's relational system, so it is a relational language. The language was developed in 1974 at IBM's San Jose Research Center, where it was first called SEQUEL and followed an earlier language named SQUARE. The purpose of this language is to provide non-procedural commands which are used for validation of the data and for searching the data. By using this language we can pose any query about the data. This language served as both DDL and DML for System R. SQL resembles relational algebra and relational calculus in a relational system approach. Languages of this kind are also called relational languages and are used in commercial RDBMSs. Some commonly used SQL-based systems are ORACLE, INGRES, SYBASE etc.

DBMS INTERFACES
Types of interfaces provided by the DBMS include:

Menu-Based Interfaces for Web Clients or Browsing
• Present users with lists of options (menus).
• Lead the user through formulation of a request.
• The query is composed of options selected from the menus displayed by the system.

Forms-Based Interfaces
• Display a form to each user.
• The user can fill out the form to insert new data, or fill out only certain entries.
• Designed and programmed for naive users as interfaces to canned transactions.

Graphical User Interfaces
• Display a schema to the user in diagram form. The user can specify a query by manipulating the diagram. GUIs use both forms and menus.

Natural Language Interfaces
• Accept requests in written English or other languages and attempt to understand them.
• The interface has its own schema and a dictionary of important words. It uses the schema and dictionary to interpret a natural language request.

Interfaces for Parametric Users
• Parametric users have a small set of operations they perform.
• Analysts and programmers design and implement a special interface for each class of naive users.
• Often a small set of commands is included to minimize the number of keystrokes required (i.e. function keys).

Interfaces for the DBA
• Systems contain privileged commands only for DBA staff.
• These include commands for creating accounts, setting parameters, authorizing accounts, changing the schema, reorganizing the storage structures etc.

Q. 13. Describe the Classification of Database Management Systems.

Ans. Categories of DBMS

DBMS (Database Management System)
It is software to manage many databases. A DBMS is a software component or logical tool to handle databases. All the queries from users about the data stored in the database are handled by the DBMS. There are many DBMSs available in the market like dBase, FoxBASE, FoxPro, Oracle, Unify, Access etc.

RDBMS (Relational Data Base Management System)
Each database system uses an approach to store and maintain the data. For this purpose three data models were developed: the Hierarchical Model, the Network Model and the Relational Model. In the hierarchical model the data is arranged in the form of trees, in the network model the data is arranged in the form of pointers and networks, and in the relational model the data is arranged in the form of tables.

Data stored in the form of tables is easy to store, maintain and understand. Many DBMSs have been developed using the hierarchical and network models. Any DBMS that uses the relational data model for data storage and modeling is called an RDBMS. In an RDBMS we can create relations among tables and access the information from those tables, while the tables are stored in separate files and may or may not have identical structures. The RDBMS is based upon the rules given by Dr. Codd, known as Dr. Codd's Rules.

HDBMS (Heterogeneous DBMS)
In an RDBMS we store information related to the same kind of data, like student data, teacher data, employee data etc. In an HDBMS we store data in the database which is entirely different in kind.

DDBMS (Distributed DBMS)
During the 1950s and 1960s there was a trend to use independent or decentralized systems. There was a duplication of hardware and facilities. In a centralized database system, the DBMS and data reside at a single place and all the control is limited to a single location, but the PCs are distributed geographically. The current trend is towards distributed systems. A distributed system is parallel computing using multiple independent computers communicating over a network to accomplish a common objective or task. The type of hardware, programming languages, operating systems and other resources may vary drastically. It is similar to computer clustering, with the main difference being a wide geographic dispersion of the resources. For example, an organization may have an office in one building and many sub-buildings that are connected using a LAN. This is a centralized system connected to intelligent remote sites. Each remote site has its own storage and processing capabilities, but in a centralized or network system there is a single storage.

OODBMS (Object Oriented DBMS)
Object-Oriented Database Management Systems (OODBMSs) have been developed to support new kinds of applications for which semantics and content are represented more efficiently with the object model. However, OODBMSs present two main problems:
• Impedance mismatch: It is basically due to two reasons. Firstly, operating systems offer no suitable abstractions, so when a client object has to invoke a method offered by a server object, and the two objects are not in the same address space, it is necessary to use the mechanisms offered by the operating system. These mechanisms do not fit the object-oriented paradigm, since they are oriented towards communicating processes. To solve this problem intermediate software is included (e.g. COM or CORBA). Secondly, an impedance mismatch is also caused every time object-oriented applications need to use the operating system services.
• Interoperability problem between object models: Although different system elements use the object-oriented paradigm, an interoperability problem can exist between them. An application implemented in the C++ language, with the C++ object model, can easily interact with its own objects, but when it wants to use objects that have been created with another programming language or another object-oriented database, an interoperability problem appears.

Programming languages like C, PASCAL and FORTRAN use the POP (Procedure Oriented Approach) to develop applications, but the current trend is towards OOP (Object Oriented Programming). Languages like C++, Java, C# (C Sharp) and Visual Basic 6 use this approach. A DBMS which follows the OOP approach is called an OODBMS. Many databases have been developed that follow this approach (OO approach), like Oracle.

Q. 14. Explain the difference between physical and logical data independence.

Ans. Data Independence
One of the biggest advantages of a database is data independence. The ability to modify a schema definition at one level without affecting the schema definition at the next higher level is called data independence. It means we can change the structure of a database without affecting the data required by users and programs. This feature was not available in the file oriented approach. There are two levels of data independence:
1. Physical data independence
2. Logical data independence

Physical data independence is the ability to modify the physical schema without causing application programs to be rewritten. Modifications at the physical level are occasionally necessary to improve performance. It means we change the physical storage level without affecting the conceptual or external view of the data. The new changes are absorbed by mapping techniques.

Logical data independence is the ability to modify the logical schema without causing application programs to be rewritten. Modifications at the logical level are necessary whenever the logical structure of the database is altered (for example, when money-market accounts are added to a banking system). It means we can change the conceptual schema without affecting the external views. If we add some new columns to a table or remove some columns from it, the user views and programs should not change. For example, consider two users A and B. Both are selecting empno and ename. If user B adds a new column salary to his view of the table, it will not affect the external view of user A, although the internal view of the database has changed for both users A and B. Now user A can also print the salary.

User A's external view (view before adding a new column)
User B's external view (view after adding a new column salary)

It means that if we make a change in a view, the programs which use that view need not be changed. Mapping techniques absorb the new changes. Physical data independence, by contrast, means we change the physical storage level without affecting the conceptual or external view of the data. Logical data independence is more difficult to achieve than physical data independence, since application programs are heavily dependent on the logical structure of the data that they access.

Q. 15. What is physical data independence?

Ans. Physical data independence is the ability to modify the physical schema without causing application programs to be rewritten. Modifications at the physical level are occasionally necessary to improve performance. It means we change the physical storage level without affecting the conceptual or external view of the data. The new changes are absorbed by mapping techniques.
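Both kinds of data independence can be sketched with Python's built-in sqlite3 module. This is an illustration under assumptions rather than a statement about any particular DBMS: the emp table, the views user_a and user_b and the index name are hypothetical. User A's program is written once and is run unchanged after a physical change (a new index) and after a logical change (a new salary column).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT)")
conn.execute("INSERT INTO emp VALUES (1, 'Smith')")

# User A's external view selects only empno and ename.
conn.execute("CREATE VIEW user_a AS SELECT empno, ename FROM emp")


def user_a_program():
    """Application program of user A; it is never rewritten below."""
    return conn.execute("SELECT empno, ename FROM user_a").fetchall()


before = user_a_program()

# Physical change: add an index to improve performance.
conn.execute("CREATE INDEX idx_emp_ename ON emp (ename)")
after_physical_change = user_a_program()

# Logical change: add a salary column and give user B a view that uses it.
conn.execute("ALTER TABLE emp ADD COLUMN salary INTEGER")
conn.execute("CREATE VIEW user_b AS SELECT empno, ename, salary FROM emp")
after_logical_change = user_a_program()

print(before == after_physical_change == after_logical_change)  # True
```

The view plays the role of the mapping: it absorbs the change, so user A's program keeps returning the same rows.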

Q. 16. What do you mean by data redundancy?

Ans. Redundancy is unnecessary duplication of data, for example if the accounts department and the registration department both keep student name, number and address. Redundancy wastes space and duplicates effort in maintaining the data. Redundancy also leads to inconsistency. Inconsistent data is data which contradicts itself, e.g. two different addresses for a given student number. Inconsistency cannot occur if data is represented by a single entry (i.e. if there is no redundancy).

Controlled redundancy: Some redundancy may be desirable (for efficiency). A DBMS should be aware of it, and take care of propagating updates to all copies of a data item. This is an objective that is not always fully supported.

Q. 17. What do you mean by database schema?

Ans. It is necessary to describe the organization of the data in a formal manner. The logical and physical database descriptions are used by DBMS software. The complete and overall description of data is referred to as the schema. The words schema and subschema were brought into DBMS by CODASYL (Conference on Data Systems Languages) and by CODASYL's database task group. The schema is also referred to as the conceptual model or global view (community view) of data. Suppose we have a complete description of collected data covering all classes and student data, all employee (teaching and non-teaching) data and other data related to the college; this is called the schema of the college. We can say that we relate the whole college data logically, and this is called the schema.

Q. 18. Explain the distinctions among the terms primary key, candidate key and superkey.
Or What are the various keys?
Or What is the significance of foreign key?

Ans. Keys: A number of keys can be defined, but the commonly and mainly used keys are explained below:

1. Primary Key
A key is a single attribute, or a combination of two or more attributes, of an entity set that is used to identify one or more instances of the set. The attribute Roll # uniquely identifies an instance of the entity set STUDENT. For example, Roll No. 15 tells us about the student Amrita, who has the address 101, Kashmir Avenue and phone no. 112746 and has paid fees of 1500. The value 15 is unique and gives a unique identification of the student. So here Roll No. is a unique attribute, and such a uniquely identifying attribute is called the Primary Key. A primary key cannot be duplicated.

From the definition of candidate key, it should be clear that each relation must have at least one candidate key, even if it is the combination of all the attributes in the relation, since all tuples in a relation are distinct. Some relations may have more than one candidate key. The primary key of a relation is an arbitrarily but permanently selected candidate key. The primary key is important since it is the sole identifier for the tuples in a relation. Any tuple in a database may be identified by specifying the relation name, the primary key and its value. Also, for a tuple to exist in a relation, it must be identifiable and therefore it must have a primary key. The relational data model therefore imposes the following two integrity constraints:
(a) No component of a primary key value can be null;
(b) Attempts to change the value of a primary key must be carefully controlled.

The first constraint is necessary because if we want to store information about some entity, then we must be able to identify it, otherwise difficulties are likely to arise. For example, if a relation CLASS (STUNO, LECTURER, CNO) has (STUNO, LECTURER) as the primary key, then allowing tuples like

STUNO   LECTURER   CNO
3123    NULL       CP302
NULL    SMITH      CP302

is going to lead to ambiguity, since the two tuples above may or may not be identical and the integrity of the database may be compromised. Older commercial database systems often did not enforce the concept of a primary key, so it was possible to reach a database state in which the integrity of the database was violated.

The second constraint above deals with changing of primary key values. Since the primary key is the tuple identifier, changing it needs very careful controls.
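The first integrity constraint can be tried out with Python's built-in sqlite3 module, using the CLASS relation from the example above. This is a sketch under assumptions: SQLite stands in for the DBMS, and NOT NULL is written out explicitly because SQLite does not by itself forbid NULL in an ordinary composite primary key. The helper try_insert is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CLASS (STUNO, LECTURER, CNO) with (STUNO, LECTURER) as the primary key.
# NOT NULL is stated explicitly; SQLite needs it to reject NULL key components.
conn.execute("""
    CREATE TABLE class (
        stuno    INTEGER NOT NULL,
        lecturer TEXT    NOT NULL,
        cno      TEXT,
        PRIMARY KEY (stuno, lecturer)
    )
""")
conn.execute("INSERT INTO class VALUES (3123, 'SMITH', 'CP302')")


def try_insert(row):
    """Hypothetical helper: report whether the DBMS accepts the tuple."""
    try:
        conn.execute("INSERT INTO class VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


# A NULL component in the primary key is refused.
null_key_result = try_insert((3123, None, "CP302"))

# A second tuple with the same primary key value is refused as well.
duplicate_key_result = try_insert((3123, "SMITH", "CP999"))

print(null_key_result, duplicate_key_result)  # rejected rejected
```

Because both inserts are refused, the ambiguous tuples from the example can never enter the relation.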

Method 1: Only a select group of users should be authorised to change primary key values.

Method 2: Updates on primary key values should be banned. If it is necessary to change a primary key, the tuple is first deleted and then a new tuple with the new primary key value but the same other values is inserted. Of course, this requires that the old values of the attributes be remembered and reinserted in the database.

Method 3: A different command for updating primary keys should be made available. Making a distinction between altering the primary key and altering another attribute of a relation reminds users that care needs to be taken in updating primary keys.

2. Secondary Key

A secondary key is an attribute or combination of attributes that is not the primary key and may hold duplicate data. In other words, a key which does not give unique identification and has duplicate information is called a secondary key. A secondary key is used after the primary key has been identified, and data can also be located through a combination of secondary keys. For example, in a STUDENT entity, if Roll Number is the primary key, then the name of the student, the address of the student, the phone number of the student and the fees paid by the student are all secondary keys.

3. Super Key

If we add additional attributes to a primary key, the resulting combination still uniquely identifies an instance of the entity set. Such keys are called super keys. A primary key is therefore a minimum super key. For example, if DOB (the date of birth attribute) is the primary key, then DOB taken together with any additional attribute still identifies each record uniquely, and the combination is a super key. Super keys are rarely used in a small database file and have little importance nowadays, although a super key gives a fuller description of the record.

4. Candidate Key

There may be two or more attributes or combinations of attributes that uniquely identify an instance of an entity set. These attributes or combinations of attributes are called candidate keys. A candidate key gives unique identification, and the primary key is chosen from among the candidate keys. For example, if Roll No. and Name are two different attributes and each of them uniquely identifies a student, then both Roll No. and Name are candidate keys. A candidate key must be minimal: if Roll No. alone is unique, the combination (Roll No., Name) is a super key rather than a candidate key.
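The difference between a super key and a candidate key can be checked mechanically on a small table. The following is a minimal sketch, assuming a hypothetical STUDENT instance; the column names, the rows and the helper functions `is_super_key` and `candidate_keys` are invented for illustration. A sample of rows can only show that a set of attributes is not a key; whether it is a key for all possible data is a design decision.

```python
from itertools import combinations

# Hypothetical STUDENT rows (illustrative values only).
STUDENTS = [
    {"roll_no": 101, "name": "Asha", "city": "Delhi"},
    {"roll_no": 102, "name": "Ravi", "city": "Delhi"},
    {"roll_no": 103, "name": "Asha", "city": "Pune"},
    {"roll_no": 104, "name": "Ravi", "city": "Delhi"},
]


def is_super_key(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows):
    """Return the minimal super keys of this particular instance."""
    columns = sorted(rows[0])
    found = []
    for size in range(1, len(columns) + 1):
        for attrs in combinations(columns, size):
            # A set that contains an already-found key is not minimal.
            if any(set(key) <= set(attrs) for key in found):
                continue
            if is_super_key(rows, attrs):
                found.append(attrs)
    return found


print(is_super_key(STUDENTS, ["roll_no", "name"]))  # super key, but not minimal
print(candidate_keys(STUDENTS))
```

Here (roll_no, name) passes the super key test, but only roll_no survives the minimality check, which is why it is the natural choice of primary key.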

5. Alternative Key

A candidate key which is not chosen as the primary key is called an alternative key. For example, if Roll No. and Name are the candidate keys and Roll No. is the primary key, then the other candidate key, Name, works as the alternative key.

6. Foreign Key

Suppose there are some relations as: S (S#, SName, Status, City), P (P#, PName, Color, Weight, City) and SP (S#, P#, QTY). We know that SP is defined as the relationship between the relation S and the relation P. S# and P# are the primary keys of the relations S and P respectively. In the relation SP, the attribute S# is actually the primary key of S but does not work on its own as the primary key of SP, so it is called a foreign key. In the same way P# is a foreign key that refers to the relation P, and the primary key of SP is the combination (S#, P#). Similarly, in a relation ASSIGNMENT with the attributes Emp#, Prod# and Job#, every attribute that is the primary key of another relation is a foreign key.

Q. 1*. What are the major functions of a database administrator?

Ans. RESPONSIBILITIES OF DBA

As we know, the DBA is the overall commander of a computer system, so he/she has a number of duties. Some of his/her major responsibilities are as follows:

1. The DBA controls the data, hardware and software, and gives instructions to the application programmer, end user and naive user.

2. The DBA decides the information contents of the database. He/she decides the suitable database file structure for the arrangement of data and uses the proper DDL techniques.

3. The DBA compiles the whole data in a particular order and sequence.

4. The DBA decides where data can be stored, i.e. takes the decision about the storage structure.

5. The DBA decides which access strategy and technique should be used for accessing the data.

6. The DBA communicates with the users through appropriate meetings and co-operates with them.

7. The DBA also defines and applies authorization checks and validation procedures.

8. The DBA also takes a backup of the data on a backup storage device, so that if data is lost it can be recovered and compiled again. The DBA also recovers damaged data.

9. The DBA also changes the environment according to user or industry requirements and monitors the performance.

10. The DBA should be a good decision-maker. The decisions taken by the DBA should be correct, accurate and efficient.

11. The DBA should have leadership quality.

12. The DBA liaises with the users in the business to gain the confidence of the customer about the availability of data.

Q. 2. What do you mean by relationships? Explain different types of relationships.

Ans. Relationships: One table (relation) may be linked with another in what is known as a relationship. Relationships may be built into the database structure to facilitate the operation of relational joins at runtime.

1. A relationship is between two tables in what is known as a one-to-many or parent-child or master-detail relationship, where an occurrence on the 'one' or 'parent' or 'master' table may have any number of associated occurrences on the 'many' or 'child' or 'detail' table. To achieve this, the child table must contain fields which link back to the primary key on the parent table. These fields on the child table are known as a foreign key, and the parent table is referred to as the foreign table (from the viewpoint of the child).

2. It is possible for a record on the parent table to exist without corresponding records on the child table, but it should not be possible for an entry on the child table to exist without a corresponding entry on the parent table.

3. A child record without a corresponding parent record is known as an orphan.

4. It is possible for a table to be related to itself. For this to be possible it needs a foreign key which points back to the primary key. Note that these two keys cannot be comprised of exactly the same fields, otherwise the record could only ever point to itself.

5. A table may be the subject of any number of relationships, and it may be the parent in some and the child in others.

6. Some database engines allow a parent table to be linked via a candidate key, but if this were changed it could result in the link to the child table being broken.

7. Some database engines allow relationships to be managed by rules known as referential integrity or foreign key restraints. These will prevent entries on child tables from being created if the foreign key does not exist on the parent table, or will deal with entries on child tables when the entry on the parent table is updated or deleted.
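The referential integrity rules described in the answer above can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, assuming two hypothetical tables named `parent` and `child`; the table names, column names and values are illustrative. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, and `ON DELETE CASCADE` is just one of several ways a database engine may deal with child entries when the parent entry is deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE child ("
    " id INTEGER PRIMARY KEY,"
    " parent_id INTEGER NOT NULL REFERENCES parent(id) ON DELETE CASCADE)"
)
conn.execute("INSERT INTO parent VALUES (1, 'master row')")
conn.execute("INSERT INTO child VALUES (10, 1)")  # valid: parent 1 exists

# An orphan (child row pointing at a missing parent) is refused.
try:
    conn.execute("INSERT INTO child VALUES (11, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the parent removes its child rows because of ON DELETE CASCADE.
conn.execute("DELETE FROM parent WHERE id = 1")
remaining_children = conn.execute("SELECT COUNT(*) FROM child").fetchone()[0]

print(orphan_rejected, remaining_children)
```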

Relational Joins

The join operator is used to combine data from two or more relations (tables) in order to satisfy a particular query. Two relations may be joined when they share at least one common attribute. The join is implemented by considering each row in an instance of each relation. A row in relation R1 is joined to a row in relation R2 when the value of the common attribute(s) is equal in the two relations. The join of two relations is often called a binary join.

The join of two relations creates a new relation. The notation 'R1 x R2' indicates the join of relations R1 and R2. For example, consider the following:

Note that the instances of relation R1 and R2 contain the same data values for attribute B. Data normalisation is concerned with decomposing a relation (e.g. R(A,B,C,D,E)) into smaller relations (e.g. R1 and R2). The data values for attribute B in this context will be identical in R1 and R2. The instances of R1 and R2 are projections of the instances of R(A,B,C,D,E) onto the attributes (A,B,C) and (B,D,E) respectively. A projection will not eliminate data values: duplicate rows are removed, but this will not remove a data value from any attribute.

The join of relations R1 and R2 is possible because B is a common attribute. The result of the join is:

The row (2 4 5 7 4) was formed by joining the row (2 4 5) from relation R1 to the row (4 7 4) from relation R2. The two rows were joined since each contained the same value for the common attribute B. The row (2 4 5) was not joined to the row (6 2 3) since the values of the common attribute (4 and 6) are not the same.

The relations joined in the preceding example shared exactly one common attribute. However, relations may share multiple common attributes. All of these common attributes must be used in creating a join. For example, the instances of relations R1 and R2 in the following example are joined using the common attributes B and C:

Before the join:

After the join:

The row (6 1 4 9) was formed by joining the row (6 1 4) from relation R1 to the row (1 4 9) from relation R2. The join was created since the common set of attributes (B and C) contained identical values (1 and 4). The row (6 1 4) from R1 was not joined to the row (1 2 1) from R2 since the common attributes did not share identical values: (1 4) in R1 and (1 2) in R2.

The join operation provides a method for reconstructing a relation that was decomposed into two relations during the normalisation process. The join of two rows, however, can create a new row that was not a member of the original relation. Thus invalid information can be created during the join process.
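The join rule described above (pair two rows only when every common attribute holds equal values) can be sketched in a few lines of Python. This is an illustrative sketch, not a database engine: relations are modelled as lists of dictionaries, the attribute names A to E follow the text, and the helper `natural_join` and the sample values are assumptions made for the example.

```python
def natural_join(r1, r2):
    """Join two relations (lists of dicts) on every attribute name they share."""
    if not r1 or not r2:
        return []
    common = [a for a in r1[0] if a in r2[0]]
    result = []
    for row1 in r1:
        for row2 in r2:
            # Rows are combined only if all common attributes match.
            if all(row1[a] == row2[a] for a in common):
                result.append({**row1, **row2})
    return result


# One common attribute (B).
R1 = [{"A": 2, "B": 4, "C": 5}]
R2 = [{"B": 4, "D": 7, "E": 4}, {"B": 6, "D": 2, "E": 3}]
single = natural_join(R1, R2)

# Two common attributes (B and C): both must match.
S1 = [{"A": 6, "B": 1, "C": 4}]
S2 = [{"B": 1, "C": 4, "D": 9}, {"B": 1, "C": 2, "D": 1}]
double = natural_join(S1, S2)

print(single)
print(double)
```

In the second call the row with B = 1 and C = 2 is left out even though B matches, because all common attributes take part in the join.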

Now suppose that a list of courses with their corresponding room numbers is required. Relations R1 and R4 contain the necessary information and can be joined using the attribute HOUR. The result of this join is:

This join creates the following invalid information (denoted by the coloured rows):

• Smith, Jones, and Brown take the same class at the same time from two different instructors in two different rooms.
• Jenkins (the Maths teacher) teaches English.
• Goldman (the English teacher) teaches Maths.
• Both instructors teach different courses at the same time.

Another possibility for a join is R3 and R4 (joined on INSTRUCTOR). The result would be:

This join creates the following invalid information:

• Jenkins teaches Math I and Algebra simultaneously at both 8:00 and 9:00.

A correct sequence is to join R1 and R3 (using COURSE) and then join the resulting relation with R4 (using both INSTRUCTOR and HOUR). The result would be:

Extracting the COURSE and ROOM attributes (and eliminating the duplicate row produced for the English course) would yield the desired result:

The correct result is obtained since the sequence (R1 x R3) x R4 satisfies the lossless (gainless?) join property.

A relational database is in 4th normal form when the lossless join property can be used to answer unanticipated queries. However, the choice of joins must be evaluated carefully. Many different sequences of joins will recreate an instance of a relation. Some sequences are more desirable since they result in the creation of less invalid data during the join operation.

Suppose that a relation is decomposed using functional dependencies and multi-valued dependencies. Then at least one sequence of joins on the resulting relations exists that recreates the original instance with no invalid data created during any of the join operations.

For example, suppose that a list of grades by room number is desired. This question, which was probably not anticipated during database design, can be answered without creating invalid data by either of the following two join sequences:

The required information is contained with relations R2 and R4, but these relations cannot be joined directly. In this case the solution requires joining all 4 relations.

The database may require a 'lossless join' relation, which is constructed to assure that any ad hoc inquiry can be answered with relational operators. This relation may contain attributes that are not logically related to each other. This occurs because the relation must serve as a bridge between the other relations in the database. For example, the lossless join relation will contain all attributes that appear only on the left side of a functional dependency. Other attributes may also be required, however, in developing the lossless join relation.
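Whether a particular decomposition creates invalid rows when it is joined back can be tested on sample data. The sketch below is only an illustration under stated assumptions: the timetable rows are invented, and the helpers `project`, `natural_join` and `spurious_rows` are hypothetical names. A check on one instance can expose a lossy decomposition, but it cannot prove that a decomposition is lossless for every possible instance.

```python
def project(rows, attrs):
    """Projection onto attrs with duplicate rows removed."""
    seen = []
    for row in rows:
        reduced = {a: row[a] for a in attrs}
        if reduced not in seen:
            seen.append(reduced)
    return seen


def natural_join(r1, r2):
    """Join two non-empty relations on all shared attribute names."""
    common = [a for a in r1[0] if a in r2[0]]
    return [{**x, **y} for x in r1 for y in r2
            if all(x[a] == y[a] for a in common)]


def spurious_rows(original, attrs1, attrs2):
    """Rows produced by rejoining two projections that were not in the original."""
    rejoined = natural_join(project(original, attrs1), project(original, attrs2))
    return [row for row in rejoined if row not in original]


# Hypothetical timetable (illustrative values only).
TIMETABLE = [
    {"COURSE": "Maths", "INSTRUCTOR": "Jenkins", "HOUR": "8:00"},
    {"COURSE": "English", "INSTRUCTOR": "Goldman", "HOUR": "8:00"},
]

# Joining on HOUR alone pairs every course with every instructor at that hour.
lossy = spurious_rows(TIMETABLE, ["COURSE", "HOUR"], ["INSTRUCTOR", "HOUR"])
# Joining on COURSE, which determines the instructor here, creates nothing new.
lossless = spurious_rows(TIMETABLE, ["COURSE", "INSTRUCTOR"], ["COURSE", "HOUR"])

print(len(lossy), len(lossless))
```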

Consider relational schema R (A, B, C, D), with A → B and C → D. Relations R1 (A, B) and R2 (C, D) are in 4th normal form. A third relation R3 (A, C), however, is required to satisfy the lossless join property. This relation can be used to join attributes B and D. This is accomplished by joining relations R1 and R3 and then joining the result to relation R2. No invalid data is created during these joins. The relation R3 is the lossless join relation for this database design.

A relation is usually developed by combining attributes about a particular subject or entity. The lossless join relation, however, is developed to represent a relationship among various relations. The lossless join relation may be difficult to populate initially and difficult to maintain, a result of including attributes that are not logically associated with each other.

The attributes within a lossless join relation often contain multi-valued dependencies. Consideration of 4th normal form is important in this situation. The lossless join relation can sometimes be decomposed into smaller relations by eliminating the multi-valued dependencies. These smaller relations are easier to populate and maintain.

Q. 21. What is an ER-diagram? Construct an ER diagram for a hospital with a set of patients and a set of doctors. Associate with each patient a log of the various tests and examinations conducted.

Or

Discuss in detail the ER diagram.

Or

Draw an ER diagram for a library management system, make suitable assumptions. Describe various symbols used in ER diagram.

Or

Construct an ER diagram for a university registrar's office. The office maintains data about each class, including the instructor, the enrollment and the time and place of the class meetings. For each student class pair, a grade is recorded. Also design a relational database for the said E.R. diagram.

Or

What is one to many relationship? Give examples.

Ans. The E-R model grew out of the exercise of using commercially available DBMS to model application databases. Earlier DBMS were based on the hierarchical and network approaches. E-R is a generalization of these models. Although it has some means of describing the physical database model, it is basically useful in the design of the logical database model. This analysis is then used to organize data as relations, normalizing the relations and finally obtaining a relational database model.

The entity-relationship model for data uses three features to describe data. These are:

1. Entities, which specify distinct real-world items in an application.

2. Relationships, which connect entities and represent meaningful dependencies between them.

3. Attributes, which specify properties of entities and relationships.

We illustrate these terms with an example. A vendor supplying items to a company, for example, is an entity. The item he supplies is another entity. A vendor and an item are related in the sense that a vendor supplies an item. The act of supplying defines a relationship between a vendor and an item. An entity set is a collection of similar entities. We can thus define a vendor set and an item set. Each member of an entity set is described by some attributes. For example, a vendor may be described by the attributes:

(vendor code, vendor name, address)

An item may be described by the attributes:

(item code, item name)

A relationship can also be characterized by a number of attributes. We can think of the relationship as supply between the vendor and item entities. The relationship supply can be described by the attributes:

(order no., date of supply)
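The vendor, item and supply example can be written down directly as data types, which makes the three features (entities, relationships, attributes) concrete. The following is a minimal sketch using Python dataclasses; the class names and the sample values such as "Acme Traders" are assumptions made for illustration, while the attribute lists follow the text.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Vendor:                 # entity described by (vendor code, vendor name, address)
    vendor_code: str
    vendor_name: str
    address: str


@dataclass(frozen=True)
class Item:                   # entity described by (item code, item name)
    item_code: str
    item_name: str


@dataclass(frozen=True)
class Supply:                 # relationship between a vendor and an item
    vendor: Vendor
    item: Item
    order_no: int             # attributes of the relationship itself
    date_of_supply: date


vendor = Vendor("V1", "Acme Traders", "12 Market Road")
item = Item("I7", "Bolt")
supply = Supply(vendor, item, order_no=501, date_of_supply=date(2024, 1, 15))

# An entity set is simply a collection of similar entities.
vendor_set = {vendor}
item_set = {item}
print(supply.vendor.vendor_name, "supplies", supply.item.item_name)
```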

Relationship between Entity Sets

The relationship between entity sets may be many-to-many (M:N), one-to-one (1:1), many-to-one (M:1) or one-to-many (1:M). The 1:1 relationship between entity sets E1 and E2 indicates that for each entity in either set there is at most one entity in the second set that is associated with it. The 1:M relationship from entity set E1 to E2 indicates that for an occurrence of the entity from the set E1, there could be zero, one or more entities from the entity set E2 associated with it. Each entity in E2 is associated with at most one entity in the entity set E1. In the M:N relationship between entity sets E1 and E2, there is no restriction on the number of entities in one set associated with an entity in the other set. The database structure employing the E-R model is usually shown pictorially using an entity-relationship (E-R) diagram.

To illustrate these different types of relationships consider the following entity sets: DEPARTMENT, MANAGER, EMPLOYEE, and PROJECT.
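The four kinds of relationship can be told apart by counting, for a given set of associated pairs, how many partners each entity has on either side. The sketch below assumes a hypothetical helper `cardinality` and invented sample pairs. It classifies only the instances it is given; the real cardinality of a relationship is a rule of the organization and is decided at design time.

```python
from collections import defaultdict


def cardinality(pairs):
    """Classify a relationship instance given as (e1, e2) pairs."""
    right_of = defaultdict(set)   # e1 -> associated e2 entities
    left_of = defaultdict(set)    # e2 -> associated e1 entities
    for e1, e2 in pairs:
        right_of[e1].add(e2)
        left_of[e2].add(e1)
    e1_has_many = any(len(s) > 1 for s in right_of.values())
    e2_has_many = any(len(s) > 1 for s in left_of.values())
    if e1_has_many and e2_has_many:
        return "M:N"
    if e1_has_many:
        return "1:M"
    if e2_has_many:
        return "M:1"
    return "1:1"


manages = [("Sales", "Ann"), ("HR", "Bob")]                 # DEPARTMENT to MANAGER
supervises = [("Ann", "e1"), ("Ann", "e2"), ("Bob", "e3")]  # MANAGER to EMPLOYEE
works_on = [("e1", "p1"), ("e1", "p2"), ("e2", "p1")]       # EMPLOYEE to PROJECT

print(cardinality(manages), cardinality(supervises), cardinality(works_on))
```

Reversing each pair in `supervises` gives the many-to-one view from EMPLOYEE to MANAGER.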

The relationship between a DEPARTMENT and a MANAGER is usually one-to-one: there is only one manager per department and a manager manages only one department. This relationship between entities is shown in Figure. Each entity is represented by a rectangle and the relationship between them is indicated by a direct line. The relationship from MANAGER to DEPARTMENT and from DEPARTMENT to MANAGER is both 1:1. Note that a one-to-one relationship between two entity sets does not imply that for an occurrence of an entity from one set at any time there must be an occurrence of an entity in the other set. In the case of an organization, there could be times when a department is without a manager or when an employee who is classified as a manager may be without a department to manage. Figure shows some instances of one-to-one relationships between the entities DEPARTMENT and MANAGER.

A one-to-many relationship exists from the entity MANAGER to the entity EMPLOYEE because there are several employees reporting to the manager. As we just pointed out, there could be an occurrence of the entity type MANAGER having zero occurrences of the entity type EMPLOYEE reporting to him or her. A reverse relationship, from EMPLOYEE to MANAGER, would be many to one, since many employees may be supervised by a single manager. However, given an instance of the entity set EMPLOYEE, there could be only one instance of the entity set MANAGER to whom that employee reports (assuming that no employee reports to more than one manager). The relationship between the entities is illustrated in Figure, and a further Figure shows some instances of this relationship.

Figure: 1:M Relationship

Figure: Instances of 1:M Relationship

The relationship between the entity EMPLOYEE and the entity PROJECT can be derived as follows: each employee could be involved in a number of different projects, and a number of employees could be working on a given project. This relationship between EMPLOYEE and PROJECT is many-to-many. It is illustrated in Figure, and a further Figure shows some instances of such a relationship.

Figure: M:N Relationship

Figure: Instances of M:N Relationship

In the entity-relationship (E-R) diagram, entities are represented by rectangles, relationships by a diamond-shaped box and attributes by ellipses or ovals. The E-R diagram for vendor, item and their relationship is illustrated in Figure (a).

Figure (a): E-R diagram for vendor, item and their Relationship

Representation of Entity Sets in the form of Relations

Entity relationship diagrams are useful in representing the relationship among entities; they show the logical model of the database. E-R diagrams allow us to have an overview of the important entities for developing an information system and of their relationships. Having obtained the E-R diagrams, the next step is to replace each entity set and relationship set by a table or a relation. Each table has a name. The name used is the entity name. Each table has a number of rows and columns. Each row contains a member of the entity set. Each column corresponds to an attribute. Thus in the E-R diagram, the vendor entity is replaced by the table below.

Table: Table For the Entity Vendor

The above table is also known as a relation. Vendor is the relation name. Each row of a relation is called a tuple. The titles used for the columns of a relation are known as relation attributes. Each tuple in the above example describes one vendor. Each element of a tuple gives a specific property of that vendor. Each property is identified by the title used for an attribute column.

In a relation the rows may be in any order. The columns may also be depicted in any order. No two rows can be identical.

Since it is inconvenient to show the whole table corresponding to a relation, a more concise notation is used to depict a relation. It consists of the relation name and its attributes. The identifier of the relation is shown in bold face.

A specified value of a relation identifier uniquely identifies the row of a relation. If a relationship is M:N, then the identifier of the relationship entity is a composite identifier, which includes the identifiers of the entity sets which are related. On the other hand, if the relationship is 1:N, then the identifier of the relationship entity is the identifier of one of the entity sets in the relationship. For example, the relations and identifiers corresponding to the E-R diagram of Figure are as shown:

Figure: E-R Diagram for Teacher, Student and their relationship

Teacher (Teacher-id, name, department, address)

Teaches (Teacher-id, Student-id)

Student (Student-id, name, department, address)

One may ask why an entity set is being represented as a relation. The main reasons are the ease of storing relations as flat files in a computer and, more importantly, the existence of a sound theory on relations, which ensures good database design. The raw relations obtained as a first step in the above examples are transformed into normal relations. The rules for these transformations, called normalization, are based on sound theoretical principles and ensure that the final normalized relations reduce duplication of data, ensure that no mistakes occur when data are added or deleted, and simplify retrieval of required data.
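The Teacher, Teaches and Student relations can be declared as tables to see the composite identifier of an M:N relationship at work. This is a minimal sketch using Python's built-in `sqlite3` module; the lower-case table and column names and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teacher (teacher_id TEXT PRIMARY KEY, name TEXT,
                          department TEXT, address TEXT);
    CREATE TABLE student (student_id TEXT PRIMARY KEY, name TEXT,
                          department TEXT, address TEXT);
    -- M:N relationship: the identifier combines both entity identifiers.
    CREATE TABLE teaches (
        teacher_id TEXT REFERENCES teacher(teacher_id),
        student_id TEXT REFERENCES student(student_id),
        PRIMARY KEY (teacher_id, student_id)
    );
""")
conn.execute("INSERT INTO teacher VALUES ('T1', 'Rao', 'Physics', 'Block A')")
conn.execute("INSERT INTO student VALUES ('S1', 'Meena', 'Physics', 'Hostel 2')")
conn.execute("INSERT INTO teaches VALUES ('T1', 'S1')")

# No two rows of a relation can be identical: the same pair is refused.
try:
    conn.execute("INSERT INTO teaches VALUES ('T1', 'S1')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

pair_count = conn.execute("SELECT COUNT(*) FROM teaches").fetchone()[0]
print(duplicate_rejected, pair_count)
```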

Q. 22. Discuss relational approach of database management system? Explain with the help of suitable relational operations to demonstrate insert, delete and update functions.

Or

What is relational model? Compare and contrast it with network and hierarchical model.

Ans. Database models are collections of conceptual tools for describing data semantics and data constraints. A DBMS has a number of ways to represent the data, but the important and commonly used models are of four types, among which three are mainly used. These are:

I. Relational Model or Relational Approach

II. Hierarchical Model or Hierarchical Approach

III. Network Model or Network Approach

I. Relational Data Model

The Relational Data Model has been developed through deep research and by testing and trying through many stages. This model has the advantages that it is simple to implement and easy to understand. We can express queries by using a query language in this model. In this model a relation is constructed only by setting the association among the attributes of an entity as well as the relationship among different entities. One of the main reasons for introducing this model was to increase the productivity of application programmers by eliminating the need to change application programs when a change is made to the database. In this model the user need not know the exact physical structure. The data structure used in the data model represents both entities and the relationships between them.

We can explain the relational view of data in the relational approach on the basis of the following example. Suppose there are three tables in which data is organized. These tables are the Supplier table (S table or S relation), the Part table (P table or P relation) and the Shipment table (SP table or SP relation).

The S table has some fields or attributes. These are supplier number (S#), supplier name, status of the supplier and the city in which the supplier resides.

Similarly, the P table has the fields part number (P#), part name, part color, weight of the part and location where the part is stored. The SP table contains the fields supplier number (S#), part number (P#) and the quantity which the supplier can ship. Each supplier has a unique supplier number S# and similarly each part has a unique part number P#.

These three tables are called relational tables. The S table is also called the S relation because it gives the relationship between different attributes. These attributes are the field names and are in the form of columns. Rows of such a table are called tuples. The pool of values for a particular attribute is called its domain. In other words, a domain is a pool of values from which the actual values appearing in a given column are drawn. For example, in the S table, S#, Sname and status are the attributes, and s1, s2, s3 are values drawn from the domain of S#.

A relational table or relation can be defined as:

Definition: A relation represented by a table having n columns, defined on domains D1, D2, ...., Dn, is a subset of the cartesian product D1 x D2 x .... x Dn.

Another definition is: given a collection of sets D1, D2, ...., Dn, R is a relation on these n sets if it is a set of ordered n-tuples such that each attribute value belongs to the corresponding domain D1, D2, ...., Dn.

These three relations are represented by diagram:

S table (Entity) or S Relation:

P table (Entity) or P Relation:

In the S table, insertion, deletion and modification can be done easily.

II. Hierarchical Model

It is a tree structure. It has one root and many branches; we call it a parent child relationship. In this a single file has a relation with many files, and similarly we can say that it is the arrangement of individual data with group data.

In an organization chart the manager is the parent root and the employees working under the manager are the children. The representation of this model is expressed by linking different tables. Such a representation is better when a linkage has many relationships with one. Sometimes it will create ambiguity in designing and defining the association and relationship between entities.

SP table (Entity) or SP Relation:

In the hierarchical approach, insertion can be done only if a child has a parent, and insertion on the child side is easy. Deletion on the child side is also easy, but a parent that has one or more children cannot simply be deleted. In the parent child relationship, updation in both parent and child is difficult.

III. Network Approach

It is a complex approach of DBMS. In this we link all the records by using a chain or pointer. It has many to many relationships. The network approach is created when there is more than one relation in the database system. The network approach starts from one point and after connecting similar types of data it returns back to the same record. The network approach is more symmetric than the hierarchical structure.

In the network model, insertion at any point is very complex. We can insert only by creating a new record having linkage with another record. Similarly, deletion is also complex: if we delete any record then the chain is disconnected and the whole structure vanishes. Updation is also complex because we cannot change a name or any data record, because the records are connected with each other.

Difference between Relational, Hierarchical and Network Approaches:

(A) Relational Approach: The Relational Approach (RA) has relationships between different entities and between attributes in a particular entity. RA is in tabular form.

RA has one to one relationships. RA has tables in asymmetric form. The relational approach creates relationships between different entities and different attributes in the same entity. RA is simple in nature. Insertion, deletion and updation in a relational table are very easy. The language used in RA is SQL, and products such as Oracle, Sybase and Ingres are based on it. It is a better approach to represent the data than the other models.

(B) Hierarchical Approach: The Hierarchical Approach (HA) creates a linkage between two or more entities. HA has a parent child relationship. HA has one to many relationships. The HA relationship is in symmetric form by defining parents and their children. It is complex in nature. Insertion, deletion and updation are a little more difficult than in RA. HA is used in IMS.

(C) Network Approach: The Network Approach (NA) has a chain among many entities. NA has a chaining technique or pointer technique. NA has many to many relationships. The NA relationship is in fully symmetric form because it has one chain symmetry. It is more complex than RA and HA. Insertion, deletion and updation are very difficult. NA follows DBTG (Database Task Group) sets, which have different owners and members.

Q. 23. What are graphical user interfaces?

Ans. A graphical user interface (GUI, sometimes pronounced "gooey") is a method of interacting with a computer through a metaphor of direct manipulation of graphical images and widgets in addition to text. GUIs display visual elements such as icons, windows and other gadgets.

Q. 2!. Define the term dangling pointer.

Ans. A pointer that points to nothing, for example to a record that has been deleted, is called a dangling pointer.

Q. What is the usage of unified modelling language (UML)?

Ans. UML is a graphical language for visualizing, specifying, constructing and documenting the artifacts of an object oriented, software-intensive system.

Q. 2#. Write a short note on Mapping.

Ans. Mappings

• The conceptual-internal mapping: defines the correspondence between the conceptual view and the internal view; specifies the mapping from conceptual records to their stored counterparts.

• An external-conceptual mapping: defines the correspondence between a particular external view and the conceptual view.

• A change to the storage structure definition means that the conceptual-internal mapping must be changed accordingly, so that the conceptual schema may remain invariant, achieving physical data independence.

• A change to the conceptual definition means that the external-conceptual mapping must be changed accordingly, so that the external schema may remain invariant, achieving logical data independence.

Q. 2". Distinguish between RDBMS and DBMS.

Ans.

Q. What is relational algebra?

Ans. Relational Algebra and Relational Calculus are two approaches to specifying manipulations on relational databases. The distinction between them is somewhat analogous to that between procedural and declarative programming. Relational Algebra is equivalent to Relational Calculus, in that every expression in one has an equivalent expression in the other. Thus relational completeness of a database language can also be established by showing that it can define any relation expressible in Relational Algebra.

Relational Algebra comprises a set of basic operations. An operation is the application of an operator to one or more source (or input) relations to produce a new relation as a result. This is illustrated in Figure 8.1 below. More abstractly, we can think of such an operation as a function that maps arguments from specified domains to a result in a specified range. In this case, the domain and range happen to be the same, i.e. relations.

The relational algebraic operations can be divided into basic set-oriented operations and relational-oriented operations. The former are the traditional set operations; the latter are those for performing joins, selections, projection and division.

Q. Define relational algebra. Explain the various traditional set operations and relational operations of it.

Or

Discuss the basic operations that can be performed using relational algebra and SQL.

Ans. Relational Algebra is a procedural language. It specifies the operations to be performed on existing relations to derive result relations. Therefore, it defines the complete schema for each of the result relations. Relational Algebra is a collection of operations to manipulate relations. Each operation takes one or more relations as its operands and produces another relation as its result. Some mainly used operations are join, selection, projection and division.

Basic Operations

Basic operations are the traditional set operations: union, difference, intersection and cartesian product. Three of these four basic operations

require that the operand relations be union compatible. The cartesian product can be defined on any two relations. Two relations P and Q are said to be union compatible if both P and Q are of the same degree n and the domains of the corresponding n attributes are identical, i.e. if P = {P1, ...., Pn} and Q = {Q1, ...., Qn} then

Dom(Pi) = Dom(Qi) for i = 1, 2, ...., n

where Dom(Pi) represents the domain of the attribute Pi. In other words, two relations are union compatible if they have the same arity and a one-to-one correspondence of the attributes, with the corresponding attributes defined over the same domain.

Some basic operations used in Relational Algebra are:

Traditional Set Operations: The traditional set operations are subdivided as:

(a) UNION

(b) INTERSECTION

(c) DIFFERENCE

(d) CARTESIAN PRODUCT

Relational Set Operators: Similarly, the relational set operations are subdivided as:

(a) PROJECTION

(b) SELECTION

(c) JOIN

(d) DIVISION

Traditional Set Operations

(i) UNION (U): The union of two relations A and B is done by the UNION command as:

A UNION B

It is the set of all tuples belonging to either A or B or both. Let us consider sets A and B as:

Table A:

) . then + is the resultant set by union operations. ?e can combine it as) A A ! (in mathematical form. Then A A1&>1 ! is the set of supplier samples for suppliers who are either located in Dondon city or supply part 23 (or both). The + is resented by) 'or example. let us consider A be the set of suppliers tuples for suppliers in Dondon and ! is the set of supplier who supply part 23.Table !) &f 2 and N are two sets. &t is denoted by the symbol A (union).

(ii) INTERSECTION (∩): The intersection operation selects the common tuples from the two relations. It is denoted by the symbol ∩. The intersection of two relations A and B is defined as:

A INTERSECT B

For example, if A and B are two relations, then the intersection of the two is the relation R1 holding exactly the tuples that appear in both A and B.

(iii) DIFFERENCE (−): The difference operation removes the common tuples from the first relation. It is denoted by (−). The difference between two relations A and B is defined as:

A MINUS B

It is the set of all tuples belonging to A but not belonging to B. We can represent it as A − B. For example, for the relations A and B above, A − B holds the tuples of A that do not also appear in B.

(iv) CARTESIAN PRODUCT: It is denoted by 'X' or 'x' (cross). The cartesian product of two relations A and B is defined as:

A TIMES B
Or
A x B

The extended cartesian product, or simply the cartesian product, of two relations is the concatenation of tuples belonging to the two relations. A new resultant relation schema is created consisting of all possible combinations of the tuples. It is represented as:

R = P x Q

For example, let A be the set of all supplier numbers and B the set of all part numbers. Then A TIMES B is the set of all possible supplier number - part number pairs.
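The four traditional set operations can be tried out with ordinary Python sets, treating each relation as a set of tuples. This is only a sketch: the supplier and part values are invented for illustration, and the `union_compatible` helper is a hypothetical stand-in that compares Python types where the text speaks of attribute domains.

```python
from itertools import product

# Hypothetical supplier relations; each tuple is (supplier_no, city).
A = {("S1", "London"), ("S4", "London")}   # suppliers located in London
B = {("S1", "London"), ("S2", "Paris")}    # suppliers who supply part P1


def union_compatible(r, s):
    """Same degree and, position by position, the same Python type
    (the type plays the role of the attribute domain here)."""
    degrees = {len(t) for t in r | s}
    if len(degrees) > 1:
        return False
    signatures = {tuple(type(v) for v in t) for t in r | s}
    return len(signatures) <= 1


assert union_compatible(A, B)

union = A | B          # A UNION B
intersection = A & B   # A INTERSECT B
difference = A - B     # A MINUS B

# A TIMES B needs no union compatibility: concatenate every pair of tuples.
supplier_nos = {("S1",), ("S2",)}
part_nos = {("P1",), ("P2",)}
times = {s + p for s, p in product(supplier_nos, part_nos)}

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
print(sorted(times))
```

Because a Python set never holds duplicates, the results behave like relations: a tuple present in both A and B appears only once in the union.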

Relational Set Operations: The basic set operations, which provide a very limited data manipulating facility, have been supplemented by the definition of the following operations:

(i) Projection (π): The projection of a relation is defined as the projection of all its tuples over some set of attributes, i.e. it yields a vertical subset of a relation. The projection operation is used either to reduce the number of attributes in the result or to reorder attributes. For example, if P is the table, then we can project on the field name and get the resultant projected table holding only the name column.

(ii) Selection (σ): Selection is the selection of some tuples based on some condition. It is denoted by σ. It yields a horizontal subset of a relation, i.e. it reduces the number of tuples. For example, if P is the relation and the condition is to select all tuples having roll no. < 105, then R is the resultant table after selection on P:

R = σ P (RN < 105)
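Projection and selection as described above can be sketched in a few lines of Python. The relation P below and its rows are made up for illustration; only the attribute RN (roll number) and the condition RN < 105 come from the text, and `project` and `select` are hypothetical helper names.

```python
# Hypothetical student relation P; RN stands for roll number as in the text.
P = [
    {"RN": 101, "name": "Asha", "city": "Delhi"},
    {"RN": 104, "name": "Ravi", "city": "Pune"},
    {"RN": 107, "name": "Asha", "city": "Agra"},
]


def project(relation, attributes):
    """Vertical subset: keep the named attributes, in the given order."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attributes}
        if reduced not in result:   # a relation is a set, so duplicates collapse
            result.append(reduced)
    return result


def select(relation, predicate):
    """Horizontal subset: keep the tuples for which the condition holds."""
    return [row for row in relation if predicate(row)]


names = project(P, ["name"])                 # projection on the field name
R = select(P, lambda row: row["RN"] < 105)   # selection with RN < 105
```

Note that the projection returns two rows, not three: the two students named Asha collapse into one tuple once the other attributes are dropped.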

(iii) Join: The join operator allows the combining of two relations to form a single new relation. Joins are of three types:

(i) Theta Join (ii) Natural Join (iii) Equi Join

Theta Join is the joining of two tables on the basis of a comparison condition (=, <, >, etc.) between their attributes. Equi Join is a theta join in which the condition is equality between fields of the two tables, typically a common key field. Natural Join is an equi join over all attributes that have the same name in both tables, with the common attributes appearing only once in the result. For example, if S and P are two tables, they can be joined on the CITY field with the condition S.CITY = P.CITY.

(iv) Division (÷): The division operator divides a dividend relation A of degree m + n by a divisor relation B of degree n. It produces a result relation of degree m. Suppose A is a relational table of supplier number - part number pairs and B is a relational table of part numbers; then A DIVIDE BY B gives the resultant table R of the suppliers who supply every part listed in B.

Q. What are single-valued and multivalued attributes?

Ans. A single-valued attribute holds exactly one value for each entity (for example, a date of birth), whereas a multivalued attribute may hold several values for the same entity (for example, the phone numbers of a person).

BCNF is a special case of 3NF. It was proposed as a simpler form of 3NF, but it was found to be stricter than 3NF, because every relation in BCNF is also in 3NF; however, a relation in 3NF is not necessarily in BCNF.

Features:
1. Key attributes (candidate keys) are composite (there is no single attribute key which identifies a record).
2. More than one candidate key is there.
3. In each candidate key at least one attribute is overlapping.

Definition: A normalized relation scheme R<S, F> is in BCNF if for every nontrivial FD in F+ of the form X → A, where X ⊆ S and A ∈ S, X is a super key of R.

The three features listed above describe the situation in which 3NF and BCNF can differ; a table that has them must be checked against the definition before we can say that it is in BCNF.

A relation schema R(S, F) (S = set of attributes, F = set of all functional dependencies) is in BCNF if, for every dependency X → Y with a set of attributes X which is a subset of S and an attribute Y which belongs to S, one of the following two conditions holds:

(i) Either Y belongs to X (Y ∈ X), i.e. the dependency is trivial,
(ii) Or X is a super key.

Super Key: A set of attributes that contains a key. Adding any attribute to a primary key gives a super key.

Trivial dependency: A dependency whose right hand side is a subset of its left hand side is known as a trivial dependency.

In the example relation, Course ID is such an attribute which is overlapping. Hence the relation shown is in 3NF and also in BCNF.

A relation is in BCNF if every determinant is a candidate key. Consider a relation with the keys (SID, Major) and (SID, FName) in which FName is a determinant:

- It is in 1NF by definition.
- It is in 2NF since any non-key attributes are dependent on the entire key.
- It is in 3NF because it has no transitive dependencies.
- It is not in BCNF because it has a determinant, FName, that is not a candidate key.
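The test "every determinant is a candidate key" can be mechanised with the attribute closure. The sketch below is one possible way to do it, not a method given in the text; the function names are hypothetical, and the attribute Subject and the two dependencies are assumptions chosen so that FName is a determinant without being a key.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the nontrivial FDs whose determinant is not a super key of schema."""
    bad = []
    for lhs, rhs in fds:
        trivial = rhs <= lhs
        is_super_key = closure(lhs, fds) >= schema
        if not trivial and not is_super_key:
            bad.append((lhs, rhs))
    return bad


# Assumed relation (SID, Subject, FName); 'Subject' is a hypothetical attribute.
schema = frozenset({"SID", "Subject", "FName"})
fds = [
    (frozenset({"SID", "Subject"}), frozenset({"FName"})),
    (frozenset({"FName"}), frozenset({"Subject"})),
]
violations = bcnf_violations(schema, fds)

# After decomposition, the relation on (FName, Subject) has FName as its key.
adv_subj = bcnf_violations(
    frozenset({"FName", "Subject"}),
    [(frozenset({"FName"}), frozenset({"Subject"}))],
)
```

Here `violations` holds the single dependency FName → Subject, which is the reason the relation fails BCNF, while the decomposed relation reports no violations.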

STU-ADV    Key: (SID, FName)
ADV-SUBJ   Key: (FName)

Relations in BCNF

Now we can say that a relation is in BCNF if and only if every nontrivial, left-irreducible FD has a candidate key as its determinant. Or, less formally, a relation is in BCNF if and only if the only determinants are candidate keys.

Q. Define the term data manipulation language.

Ans. DDL (Data Definition Language): Database management systems provide a facility known as Data Definition Language or data description language (DDL). DDL can be used to define the conceptual schema (global) and also to give some details about how to implement this schema in the physical devices used to store the data.

DML (Data Manipulation Language): A language that lets users or application programs access and manipulate the data in the database, i.e. retrieve, insert, update and delete it, is called a data manipulation language (DML).

Q. What is RDBMS?

Ans. A Relational Database Management System (RDBMS) is a software package which manages a relational database, optimized for rapid and flexible retrieval of data; it is also called a database engine. In other words, a Relational Database Management System is a computer program that lets you store, index, and retrieve tables of data. It is used to store, process and manage data arranged in relational tables, and is often used for transaction processing and data warehouses.

An RDBMS is a database management system (like Oracle) in which the database is organized and accessed according to the relationships between data items. In a relational database, relationships between data items are expressed by means of tables. Interdependencies among these tables are expressed by data values rather than by pointers. This allows a high degree of data independence.

Relational Database Management Systems (RDBMS) are database management systems that maintain data records and indices in tables. Relationships may be created and maintained across and among the data and tables. An RDBMS has the ability to access data organized in tabular files that can be related to each other by a common field (item), and the capability to recombine the data items from different files, providing powerful tools for data usage. The simplest way to look at an RDBMS is as a spreadsheet that multiple users can update. The most important thing that an RDBMS does is provide transactions.

Almost all full-scale database systems are RDBMS's. Some of the best-known RDBMS's include Oracle, Informix, Sybase, PostgreSQL and Microsoft Access.

In recent years, database management systems (DBMS) have established themselves as the primary means of data storage for information systems ranging from large commercial transaction processing applications to PC-based desktop applications. At the heart of most of today's information systems is a relational database management system (RDBMS). RDBMSs have been the workhorse for data management operations for over a decade and continue to evolve and mature, providing sophisticated storage, retrieval, and distribution functions to enterprise-wide data processing and information management systems. Compared to the file systems, relational database management systems provide organizations with the capability to easily integrate and leverage the massive amounts of operational data into meaningful information systems. The evolution of high-powered database engines such as Oracle9 has fostered the development of advanced "enabling" technologies including client-server, data warehousing, and online analytical processing, all of which comprise the core of today's state-of-the-art information management systems.

Characteristics of a Relational Database

• Relational databases consist of one or more tables.
• Each table is about one aspect (or subject) of the database. Thus contexts and finds are different subjects and are in different tables.
• Each table consists of rows and fields.
• Each row corresponds to one instance of the subject of the table. Thus each row is about one context.
• Each field corresponds to a variable and is named to indicate its role. For example, finds have a name, a size and a weight.
• Each row must be unique. This is a logical result of the row being about one instance. If you have duplicate rows, the results of searching are unpredictable.
• Each cell (where the fields and rows intersect) contains only one value. This is important because otherwise it is not possible to search properly. If you find a need for two values per cell, the design has to be altered.
• If fields in different tables have the same range of values and are thus about the same object, they are called 'keys'. There is an association between the fields and thus the tables; these can be 'joined' by the database software in queries. The rows corresponding to matching values can be retrieved from different tables.

As a result, using a relational database, the same database can be viewed in many different ways. Relational databases are powerful because they require few assumptions about how data is related or how it will be extracted from the database.

Q. What do you mean by Relational Constraints?

Ans. Whenever data is shared, there is a need to ensure the accuracy of the values within database tables. The integrity of the data in a relational database must be maintained as multiple users access and change the data. The term data integrity has the following meanings:

1. The condition in which data is identically maintained during any operation, such as transfer, storage, and retrieval.
2. The preservation of data for their intended use.
3. Relative to specified operations, the a priori expectation of data quality.

Another aspect of data integrity is the assurance that data can only be accessed and altered by those authorized to do so. Data integrity means, in part, that you can correctly and consistently navigate and manipulate the tables in the database. There are two basic rules to ensure data integrity: entity integrity and referential integrity.

The entity integrity rule states that the value of the primary key can never be a null value (a null value is one that has no value and is not the same as a blank). Because a primary key is used to identify a unique row in a relational table, its value must always be specified and should never be unknown. The integrity rule requires that insert, update, and delete operations maintain the uniqueness and existence of all primary keys.

The referential integrity rule states that if a relational table has a foreign key, then every value of the foreign key must either be null or match the values in the relational table in which that foreign key is a primary key.

Types of Data Integrity

1. Null Rule: A null rule is a rule defined on a single column that allows or disallows inserts or updates of rows containing a null (the absence of a value) in that column.

2. Unique Column Values: A unique value rule defined on a column (or set of columns) allows the insert or update of a row only if it contains a unique value in that column (or set of columns).

3. Primary Key Values: A primary key value rule defined on a key (a column or set of columns) specifies that each row in the table can be uniquely identified by the values in the key.

4. Referential Integrity Rules: A referential integrity rule is a rule defined on a key (a column or set of columns) in one table that guarantees that the values in that key match the values in a key in a related table (the referenced value). Referential integrity also includes the rules that dictate what types of data manipulation are allowed on referenced values and how these actions affect dependent values.

Rules for Referential Integrity

The rules associated with referential integrity are:

• Restrict: Disallows the update or deletion of referenced data.
• Set to Null: When referenced data is updated or deleted, all associated dependent data is set to NULL.
• Set to Default: When referenced data is updated or deleted, all associated dependent data is set to a default value.
• Cascade: When referenced data is updated, all associated dependent data is correspondingly updated. When a referenced row is deleted, all associated dependent rows are deleted.
• No Action: Disallows the update or deletion of referenced data. This differs from RESTRICT in that it is checked at the end of the statement, or at the end of the transaction if the constraint is deferred. (Oracle uses No Action as its default action.)

5. Complex Integrity Checking: Complex integrity checking is a user-defined rule for a column (or set of columns) that allows or disallows inserts, updates, or deletes of a row based on the value it contains for the column (or set of columns).

Integrity Constraints Description

An integrity constraint is a declarative method of defining a rule for a column of a table. Oracle supports the following integrity constraints:

• NOT NULL constraints for the rules associated with nulls in a column
• UNIQUE key constraints for the rule associated with unique column values
• PRIMARY KEY constraints for the rule associated with primary identification values
• FOREIGN KEY constraints for the rules associated with referential integrity. Oracle supports the use of FOREIGN KEY integrity constraints to define the referential integrity actions, including:
  o Update and delete No Action
  o Delete CASCADE
  o Delete SET NULL
• CHECK constraints for complex integrity rules

You cannot enforce referential integrity using declarative integrity constraints if child and parent tables are on different nodes of a distributed database. However, you can enforce referential integrity in a distributed database using database triggers (see next section).

Advantages of Integrity Constraints

This section describes some of the advantages that integrity constraints have over other alternatives, which include:

• Enforcing business rules in the code of a database application

• Using stored procedures to completely control access to data
• Enforcing business rules with triggered stored database procedures

Types of Integrity Constraints

You can use the following integrity constraints to impose restrictions on the input of column values:

• NOT NULL Integrity Constraints
• UNIQUE Key Integrity Constraints
• PRIMARY KEY Integrity Constraints
• Referential Integrity Constraints
• CHECK Integrity Constraints

1. NOT NULL Integrity Constraints

By default, all columns in a table allow nulls. Null means the absence of a value. A NOT NULL constraint requires that a column of a table contain no null values. For example, you can define a NOT NULL constraint to require that a value be input in the last name column for every row of the employees table.

2. UNIQUE Key Integrity Constraints

A UNIQUE key integrity constraint requires that every value in a column or set of columns (key) be unique, that is, no two rows of a table have duplicate values in a specified column or set of columns.

Unique Keys: The columns included in the definition of the UNIQUE key constraint are called the unique key. Unique key is often incorrectly used as a synonym for the terms UNIQUE key constraint or UNIQUE index. However, note that key refers only to the column or set of columns used in the definition of the integrity constraint. If the UNIQUE key consists of more than one column, then that group of columns is said to be a composite unique key. For example, a composite UNIQUE key constraint on area code and telephone number lets you enter an area code and telephone number any number of times, but the combination of a given area code and given telephone number cannot be duplicated in the table. This eliminates unintentional duplication of a telephone number.

UNIQUE Key Constraints and Indexes: Oracle enforces unique integrity constraints with indexes. For example, Oracle enforces the UNIQUE key constraint by implicitly creating a unique index on the composite unique key. Therefore, composite UNIQUE key constraints have the same limitations imposed on composite indexes: up to 32 columns can constitute a composite unique key.

Combine UNIQUE Key and NOT NULL Integrity Constraints: UNIQUE key constraints allow the input of nulls unless you also define NOT NULL constraints for the same columns. In fact, any number of rows can include nulls for columns without NOT NULL constraints, because nulls are not considered equal to anything. A null in a column (or in all columns of a composite UNIQUE key) always satisfies a UNIQUE key constraint. Columns with both unique keys and NOT NULL integrity constraints are common. This combination forces the user to enter values in the unique key and also eliminates the possibility that any new row's data will ever conflict with an existing row's data.

3. PRIMARY KEY Integrity Constraints

Each table in the database can have at most one PRIMARY KEY constraint. The values in the group of one or more columns subject to this constraint constitute the unique identifier of the row. In effect, each row is named by its primary key values. The Oracle implementation of the PRIMARY KEY integrity constraint guarantees that both of the following are true:

• No two rows of a table have duplicate values in the specified column or set of columns.
• The primary key columns do not allow nulls. That is, a value must exist for the primary key columns in each row.

Primary Keys: The columns included in the definition of a table's PRIMARY KEY integrity constraint are called the primary key. Although it is not required, every table should have a primary key so that:

• Each row in the table can be uniquely identified
• No duplicate rows exist in the table

PRIMARY KEY Constraints and Indexes: Oracle enforces all PRIMARY KEY constraints using indexes. For example, a primary key constraint created for the deptno column is enforced by the implicit creation of:

• A unique index on that column
• A NOT NULL constraint for that column

Composite primary key constraints are limited to 32 columns, which is the same limitation imposed on composite indexes. The name of the index is the same as the name of the constraint. Also, you can specify the storage options for the index by including the ENABLE clause in the CREATE TABLE or ALTER TABLE statement used to create the constraint. If a usable index exists when a primary key constraint is created, then the primary key constraint uses that index rather than implicitly creating a new one.

4. Referential Integrity Constraints

Different tables in a relational database can be related by common columns, and the rules that govern the relationship of the columns must be maintained. Referential integrity rules guarantee that these relationships are preserved. A referential integrity constraint requires that for each row of a table, the value in the foreign key matches a value in a parent key. Two terms recur in what follows: the foreign key, held in the child table, and the parent key that it references.
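The constraint types described above can be observed with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the text describes Oracle, whereas SQLite is used here only because it needs no server, and the two engines differ in detail (for instance, SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`). The `dept` and `emp` tables, their columns and the `rejected` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript("""
CREATE TABLE dept (
    deptno INTEGER PRIMARY KEY,
    dname  TEXT NOT NULL
);
CREATE TABLE emp (
    empno     INTEGER PRIMARY KEY,
    last_name TEXT NOT NULL,
    area_code TEXT,
    phone     TEXT,
    salary    INTEGER CHECK (salary > 0),
    deptno    INTEGER REFERENCES dept (deptno) ON DELETE CASCADE,
    UNIQUE (area_code, phone)              -- composite unique key
);
INSERT INTO dept VALUES (10, 'Accounts');
INSERT INTO emp VALUES (1, 'Rao', '011', '5550100', 3000, 10);
""")


def rejected(sql):
    """Run one statement and report whether a constraint refused it."""
    try:
        with conn:                # commits on success, rolls back on error
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


results = {
    "null last_name": rejected("INSERT INTO emp VALUES (2, NULL, '011', '5550101', 3000, 10)"),
    "duplicate phone": rejected("INSERT INTO emp VALUES (3, 'Sen', '011', '5550100', 3000, 10)"),
    "unknown dept": rejected("INSERT INTO emp VALUES (4, 'Das', '022', '5550102', 3000, 99)"),
    "bad salary": rejected("INSERT INTO emp VALUES (5, 'Roy', '033', '5550103', -1, 10)"),
    "nulls in unique key": rejected("INSERT INTO emp VALUES (6, 'Jha', NULL, NULL, 3000, 10)"),
}

with conn:
    conn.execute("DELETE FROM dept WHERE deptno = 10")   # cascades to emp
remaining = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
```

The first four inserts are refused by the NOT NULL, UNIQUE, FOREIGN KEY and CHECK constraints respectively. The fifth is accepted, which matches the remark above that nulls always satisfy a UNIQUE key constraint. Deleting the parent row then removes every dependent employee row through the cascade.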

Self-Referential Integrity Constraints: Another type of referential integrity constraint is called a self-referential integrity constraint. This type of foreign key references a parent key in the same table. For example, a self-referential integrity constraint can ensure that every value in the mgr column of the emp table corresponds to a value that currently exists in the empno column of the same table, but not necessarily in the same row, because every manager must also be an employee. This integrity constraint eliminates the possibility of erroneous employee numbers in the mgr column.

Nulls and Foreign Keys: The relational model permits the value of foreign keys either to match the referenced primary or unique key value, or be null. If any column of a composite foreign key is null, then the non-null portions of the key do not have to match any corresponding portion of a parent key.

Q. What is the difference between the Relational algebra and the Relational Calculus?

Ans.
1. RA is a procedural language, whereas RC is a non-procedural query system.
2. RA has operators like join, union, intersection, division, difference, projection, selection etc., whereas RC has tuple-oriented and domain-oriented expressions.
3. The expressive power of RA and RC is equivalent. This means any query that could be expressed in RA could be expressed by a formula in RC.
4. Any RC formula can be translated into an algebraic query.
5. Modifying a query is easier in RA than in RC.
6. RA has a mathematical form and has no specific query language; RC also has a mathematical form but has one query language, QUEL.
7. Relational algebra is easier to manipulate and understand than RC.
8. Neither is more powerful than the other: as stated in point 3, RA and RC can express the same queries; they differ in how a query is written.
9. RC queries are formed as WFFs (well-formed formulas), whereas RA does not form any formula.
10. Relational algebra operations manipulate some relations and provide some expression in the form of queries, whereas relational calculus queries are formed on the basis of pairs of expressions.
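The procedural versus non-procedural contrast in points 1 and 3 can be made concrete with a small Python sketch. It is only an analogy: the `works` tuples are invented, `select` and `project` are hypothetical helpers standing in for σ and π, and a set comprehension stands in for a tuple calculus formula.

```python
# Hypothetical 'works' relation: (person_name, company_name, salary).
works = {
    ("Ann", "First Bank Corporation", 12000),
    ("Bob", "Small Bank Corporation", 9000),
    ("Cy", "First Bank Corporation", 8000),
}


# Relational algebra style: say which operators to apply, and in what order.
def select(relation, predicate):
    return {t for t in relation if predicate(t)}


def project(relation, positions):
    return {tuple(t[i] for i in positions) for t in relation}


step1 = select(works, lambda t: t[1] == "First Bank Corporation")
step2 = select(step1, lambda t: t[2] > 10000)
ra_answer = project(step2, [0])

# Relational calculus style: describe the tuples wanted and nothing else,
# { t.person_name | t in works and t.company_name = ... and t.salary > 10000 }
rc_answer = {
    (t[0],)
    for t in works
    if t[1] == "First Bank Corporation" and t[2] > 10000
}
```

Both formulations produce the same relation, which illustrates the equivalence of expressive power: the algebra version fixes an order of operations, the calculus version only states the condition the result must satisfy.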

Q. 8. Write a note on SQL basic queries.

Ans. Structured Query Language (SQL) is the language used to manipulate relational databases. SQL is tied very closely with the relational model. In the relational model, data is stored in structures called relations or tables. SQL statements are issued for the purpose of:

• Data definition: Defining tables and structures in the database (DB).
• Data manipulation: Inserting new data, updating existing data, deleting existing data, and querying the database (retrieving existing data from the database).

Another way to say this is that the SQL language is actually made up of 1) the Data Definition Language (DDL), used to create, alter and drop schema objects such as tables and indexes, and 2) the Data Manipulation Language (DML), used to manipulate the data within those schema objects.

In relational databases, the table is the fundamental building block of a database application. Tables are used to store data on Employees, Equipment, Materials, Warehouses, Purchase Orders, Customer Orders, etc. Each table has one or more attributes or columns that describe the table. Columns in the Employee table, for example, might be Last Name, First Name, Salary, Hire Date, Social Security Number, etc.

SQL*Plus commands allow a user to manipulate and submit SQL statements. Specifically, they enable a user to:

• Enter, edit, store, retrieve, and run SQL statements
• List the column definitions for any table
• Format, perform calculations on, store, and print query results in the form of reports
• Access and copy data between SQL databases

The following is a list of SQL*Plus commands and their functions:

• / - Execute the current SQL statement in the buffer, same as RUN
• ACCEPT - Accept a value from the user and place it into a variable
• APPEND - Add text to the end of the current line of the SQL statement in the buffer
• AUTOTRACE - Trace the execution plan of the SQL statement and gather statistics
• BREAK - Set the formatting behavior for the output of SQL statements
• BTITLE - Place a title on the bottom of each page in the printout from a SQL statement
• CHANGE - Replace text on the current line of the SQL statement with new text
• CLEAR - Clear the buffer
• COLUMN - Change the appearance of an output column from a query
• COMPUTE - Does calculations on rows returned from a SQL statement
• CONNECT - Connect to another Oracle database or to the same Oracle database under a different user name
• COPY - Copy data from one table to another in the same or different databases
• DEL - Delete the current line in the buffer
• DESCRIBE - List the columns with data types of a table (can be abbreviated as DESC)
• EDIT - Edit the current SQL statement in the buffer using an external editor such as vi or emacs
• EXIT - Exit the SQL*Plus program
• GET - Load a SQL statement into the buffer but do not execute it
• HELP - Obtain help for a SQL*Plus command (in some installations)
• HOST - Drop to the operating system shell
• INPUT - Add one or more lines to the SQL statement in the buffer
• LIST - List the current SQL statement in the buffer
• QUIT - Exit the SQL*Plus program
• REMARK - Place a comment following the REMARK keyword
• RUN - Execute the current SQL statement in the buffer
• SAVE - Save the current SQL statement to a script file
• SET - Set an environment variable to a new value
• SHOW - Show the current value of an environment variable
• SPOOL - Send the output from a SQL statement to a file
• START - Load a SQL statement located in a script file and then run that SQL statement
• TIMING - Used to time the execution of SQL statements for performance analysis
• TTITLE - Place a title on the top of each page in the printout from a SQL statement
• UNDEFINE - Delete a user defined variable

Q. 9. What are the various features of SQL?

Ans. SQL Features

1. SQL encompasses a range of uses and users. DBAs, application programmers, management and end users can use SQL.
2. It is meant to be an English-like language, using set English phrases to manipulate the database. How well it achieves this is questionable.
3. It is non-procedural. You specify the information required, not the navigation and operations required to access the data. Each RDBMS has an inbuilt query optimiser which parses your SQL statements and works out the optimum path to the required data.
4. When you query data, all the rows affected by your statement are dealt with in one go as a set; they are not dealt with separately. The work area that holds the set is known as a CURSOR.
5. It provides commands for the following tasks:
• querying data
• inserting, updating and deleting data
• creating, modifying and deleting database objects
• controlling access to the database and database objects
• guaranteeing database consistency
• monitoring database performance and configuration

Q. 10. What is a trigger?

Ans. Triggers are special stored procedures that are executed when a table undergoes an INSERT, a DELETE, or an UPDATE operation. Triggers often enforce referential integrity and can also call other stored procedures.

Or

Triggers are parameter-less procedures that are triggered (fired) either before or after inserting, updating or deleting rows from a table. Because they are fired by the event and not by choice, they cannot have parameters.

Q. 11. What is the difference between a procedural and a non-procedural language?

Ans.

Procedural
• Must be embedded in a general purpose programming language.
• Typically retrieves individual records or objects from the database and processes each separately.
• Therefore it needs to use programming language constructs such as loops.
• Low-level DMLs are also called record-at-a-time DMLs because of this.

Non-procedural
• Can be used on its own to specify complex database operations.
• DBMSs allow DML statements to be entered interactively from a terminal, or to be embedded in a programming language. If the commands are embedded in a general purpose programming language, the statements must be identified so they can be extracted by a pre-compiler and processed by the DBMS.
• High-level DMLs, such as SQL, can specify and retrieve many records in a single DML statement, and are called set-at-a-time or set-oriented DMLs.
• High-level languages are often called declarative, because the DML often specifies what to retrieve, rather than how to retrieve it.

Q. 12. Consider the following employee database, where person-name is the primary key of Employee, Works and Managers, and company-name is the primary key of Company.

Employee (person-name, street, city)
Works (person-name, company-name, salary)

Company (company-name, city)
Managers (person-name, manager-name)

Give an expression in SQL for each of the following queries.
(i) Find the names of all employees who work for First Bank Corporation and live in Las Vegas.
(ii) Find the names, street address and cities of residence of all employees who work for First Bank Corporation and earn more than $10,000.
(iii) Find all employees who do not work for First Bank Corporation.
(iv) Find the company that has the smallest payroll.
(v) Find all employees in the database who do not live in the same cities and on the same streets as do their managers.

Ans.
(i) select e.person_name from employee e, works w where e.person_name = w.person_name and w.company_name = 'First Bank Corporation' and e.city = 'Las Vegas';
(ii) select e.person_name, e.street, e.city from employee e, works w where e.person_name = w.person_name and w.company_name = 'First Bank Corporation' and w.salary > 10000;
(iii) select person_name from works where company_name <> 'First Bank Corporation';
(iv) select company_name from works group by company_name having sum(salary) <= all (select sum(salary) from works group by company_name);
(v) select e.person_name from employee e, managers m, employee em where e.person_name = m.person_name and m.manager_name = em.person_name and (e.city <> em.city or e.street <> em.street);

Q. 13. Consider the following relational database and give an expression in relational algebra to express each of the following queries:

employee (person-name, street, city)
works (person-name, company-name, salary)
company (company-name, city)
managers (person-name, manager-name)

(a) Find the names of all employees who work for First Bank Corporation.
(b) Find the names and cities of residence of all employees who work for First Bank Corporation.
(c) Find the names of all employees who do not work for First Bank Corporation.
(d) Find names of all employees who earn more than $10,000 per annum.
(e) Find names of all employees who earn more than every employee of Small Bank Corporation.

Ans.
(a) π person-name (σ company-name = "First Bank Corporation" (works))
(b) π person-name, city (employee ⋈ σ company-name = "First Bank Corporation" (works))
(c) π person-name (σ company-name ≠ "First Bank Corporation" (works))
(d) π person-name (σ salary > 10000 (works))
(e) π person-name (works) − π w1.person-name (ρ w1 (works) ⋈ [w1.salary ≤ w2.salary ∧ w2.company-name = "Small Bank Corporation"] ρ w2 (works))

Q. List any two procedural programming languages.

Ans.
1. PL/SQL (Oracle)
2. PL/pgSQL (PostgreSQL)
3. SQL PL (DB2)

Q. What are row triggers?

Ans. A row level trigger is fired each time a row of the table is affected by the triggering statement, e.g. if an update statement updates multiple rows of a table, a row trigger is fired once for each row affected by the update statement. If a triggering statement affects no rows, a row trigger is not executed at all.
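The once-per-row behaviour can be checked with a small sketch using Python's built-in sqlite3 module. SQLite stands in here for whatever DBMS the question has in mind; the `salary_log` table, the trigger name and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE works (person_name TEXT PRIMARY KEY, company_name TEXT, salary INTEGER);
CREATE TABLE salary_log (person_name TEXT, old_salary INTEGER, new_salary INTEGER);

-- Row trigger: the body runs once for every row the UPDATE touches.
CREATE TRIGGER log_salary_change
AFTER UPDATE OF salary ON works
FOR EACH ROW
BEGIN
    INSERT INTO salary_log VALUES (OLD.person_name, OLD.salary, NEW.salary);
END;

INSERT INTO works VALUES ('Ann', 'First Bank Corporation', 12000);
INSERT INTO works VALUES ('Cy',  'First Bank Corporation', 8000);
INSERT INTO works VALUES ('Bob', 'Small Bank Corporation', 9000);
""")


def log_size():
    """Number of times the trigger body has run so far."""
    return conn.execute("SELECT COUNT(*) FROM salary_log").fetchone()[0]


# One statement, two affected rows: the trigger fires twice.
conn.execute(
    "UPDATE works SET salary = salary + 500 "
    "WHERE company_name = 'First Bank Corporation'"
)
after_two_rows = log_size()

# One statement, no affected rows: the trigger does not fire at all.
conn.execute("UPDATE works SET salary = 0 WHERE company_name = 'No Such Company'")
after_no_rows = log_size()
```

The log grows by one entry per updated row after the first statement and stays unchanged after the second, which is the behaviour the answer describes.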

1". Ans. -iews are based on select statement which derive their data from real tables (+#AT# -&#? (reate a new view based on b3s in the database. Nuestion 1o. 1#. 0ompany 9company)name4 city. Ans.6. Define C?ie'7. 66. Ans.Q. "imilar Nuestion (hapter. 0onsider the fo((o'in$ re(ationa( database 1mp(oyee 9person)name4 street4 city. The table names must already exist. Define the term DD&.. 1*. 9b. 1%. Q. -ind names4 cities of residence of a(( emp(oyees 'ho 'or3 for first Ban3 corporation and earn more than @1. -ind the names of a(( emp(oyees 'ho 'or3 for first Ban3 corporation. A view is a method of organising table data to meet a specific need..6. (+#AT# -&#? has the following syntax) . Q. "imilar Nuestion (hapter. 38. Q. Ans. DDD is data definition languages.4. Wor3s 9person)name4 company)name4 sa(ary. Nuestion 1o. The new view name must not exist. Mana$er 9person)name4 mana$er)name.. 6ive re(ationa( a($ebra e2pression for each of the fo((o'in$ >ueriesB 9a.

Additional information on the SELECT statement and SQL queries can be found in the next section. Note that in many systems an ORDER BY clause may not be added to the SQL SELECT statement when defining a view. In general, views are read-only. That is, one may query a view, but it is normally the case that views cannot be operated on with INSERT, UPDATE or DELETE. This is especially true where a view joins two or more tables together or contains an aggregate function.

DROP VIEW: Drop a view from the database. The view name must already exist in the database. The syntax for the DROP VIEW command is:

DROP VIEW <view name>;

Q. What do you mean by Entities and Attributes? Or Define entity and attribute.

Ans. Entities and their Attributes: Entities are the basic units in modeling classes of concrete (real) or abstract objects. An entity either has concrete existence or represents an idea or concept; e.g. a building, a room, a chair, an employee, etc. are all different entities. An entity type or entity set is a group of similar objects of an organization about which data is maintained. Examples of entity sets are transactions, job positions, employees, inventories of raw and finished products, students, academic staff, non-academic staff, managers, etc.

To store data on an entity set, we have to create a model for it. For example, employees of an organization are modeled by the entity set EMPLOYEE. We must choose some properties as characteristics of an employee that may be useful to the organization. Some of these properties are employee name, employee no., employee address, employee skill and employee pay. The properties that characterize an entity set are called its attributes. An attribute is also referred to by the terms data item, data element, data field, item, elementary item, or object property.

An object can belong to different entity sets simultaneously. A person can be a student as well as a part-time employee. Consider the modeling of a flight crew. It consists of a group of individuals employed by an organization who belong to the entity sets EMPLOYEE and PERSON. The individual members of the flight crew have different skills and functions, so the entity set EMPLOYEE is given the attribute Skill, with its possible values.

Chapter 3: Database Design Theory And Methodology (Part 1)

Q. 1. What is normalization? Discuss various Normal forms with the help of examples.

Ans. Normalization is a design technique that is widely used as a guide in designing relational databases. Normalization is essentially a two-step process that puts data in tabular form by removing repeating groups and then removes duplicated data from the relational tables. Normalization theory is based on the concept of normal forms. A relational table is said to be in a particular normal form if it satisfies a certain set of constraints. There are currently five normal forms that have been defined. In this section, we will cover the first three normal forms, which were defined by E. F. Codd.

Significance of Normalization
• Improves update efficiency.
• Removes many causes of anomalous data dependencies.
• Allows better checks for consistency.
• Is (usually) better for query handling.
• But carries computational penalties in some SQL operations.

Normalization is also significant for the following reasons:
1. To make it feasible to represent any relation in the database.
2. To obtain powerful relational retrieval using relational operators.
3. To free relations from undesirable insertion, update and deletion anomalies.
4. To reduce the need for restructuring the relations as new data types are introduced.

Normalization Avoids
• Duplication of Data: The same data is listed in multiple lines of the database.
• Insert Anomaly: A record about an entity cannot be inserted into the table without first inserting information about another entity. Example: cannot enter a customer without a sales order.
• Delete Anomaly: A record cannot be deleted without deleting a record about a related entity. Example: cannot delete a sales order without deleting all of the customer's information.
• Update Anomaly: Information cannot be updated without changing it in many places. Example: to update customer information, it must be updated for each sales order the customer has placed.

Before Normalization
1. Begin with a list of all of the fields that must appear in the database. Think of this as one big table.
2. Do not include computed fields.
3. One place to begin getting this information is from a printed document used by the system.
4. Additional attributes besides those for the entities described on the document can be added to the database.

Normal Forms
The normalization process, as first proposed by Codd (1972), takes a relation schema through a series of tests to "certify" whether it satisfies a certain normal form. The process, which proceeds in a top-down fashion by evaluating each relation against the criteria for normal forms and decomposing relations as necessary, can thus be considered as relational design by analysis. Initially, Codd proposed three normal forms, which he called first, second and third normal form. A stronger definition of 3NF, called Boyce-Codd normal form (BCNF), was proposed later by Boyce and Codd. All these normal forms are based on the functional dependencies among the attributes of a relation. Later, a 4NF and a 5NF were proposed, based on the concepts of multivalued dependencies and join dependencies, respectively.

Need of Normalization
Normalization of data can hence be looked upon as a process of analyzing the given relation schemas, based on their FDs and primary keys, to achieve the desirable properties of:
1. Minimizing redundancy.
2. Minimizing the insertion, deletion, and update anomalies.
Normal forms are based on the primary key.

Normalization: It is the process of restructuring an unstructured relation into a structured one with the purpose of removing redundancy and anomalies.

First Normal Form (1NF)
Definition: A relation schema is said to be in 1NF if the values in the domain of each attribute of the relation are atomic. In other words, only one value is associated with each attribute, and the value is not a set of values or a list of values. A database schema is in 1NF if every relation schema included in the database schema is in 1NF.

A relation is in 1NF if and only if all underlying domains contain scalar values only. Here scalar means atomic: there should be a single value at the intersection of each row and column, as shown in the FIRST relation obtained from the original relations.
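As a rough illustration of the 1NF requirement, the sketch below flattens a repeating group so that every row/column intersection holds a single value. The sample rows and the function name to_first_normal_form are invented for the example; they are not taken from the FIRST relation in the text.

```python
# Hypothetical unnormalised data: each supplier row carries a repeating
# group of (part, quantity) pairs, so the "shipments" value is not atomic.
unnormalised = [
    {"supplier": "S1", "city": "London", "shipments": [("P1", 300), ("P2", 200)]},
    {"supplier": "S2", "city": "Paris", "shipments": [("P1", 300)]},
]

def to_first_normal_form(rows):
    """Emit one flat tuple per (supplier, part) pair; every value is atomic."""
    flat = []
    for row in rows:
        for part, qty in row["shipments"]:
            flat.append((row["supplier"], row["city"], part, qty))
    return flat

first = to_first_normal_form(unnormalised)
```

Note that flattening repeats the city once per shipment. The result is in 1NF but carries the kind of redundancy that the later normal forms are meant to remove.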

The functional dependencies in relation FIRST are as follows: S# → CITY, CITY → STATUS, and {S#, P#} → QTY.

But problems occur with each of the three update operations.

INSERT: We cannot insert the fact that a particular supplier is located in a particular city until that supplier supplies at least one part. For example, the FIRST relation does not show that supplier S5 is located in Athens. The reason is that, until S5 supplies some part, we have no appropriate primary key value.

DELETE: If we delete the only FIRST tuple for a particular supplier, we destroy not only the shipment connecting that supplier to some part but also the information that the supplier is located in a particular city. For example, if we delete the FIRST tuple with S# value S3 and P# value P2, we lose the information that S3 is located in Paris.

UPDATE: The city value for a given supplier appears in FIRST many times, in general. This redundancy causes update problems. For example, if supplier S1 moves from London to Amsterdam, we are faced with either the problem of searching FIRST to find every tuple connecting S1 and London (and changing it) or the possibility of producing an inconsistent result (the city for S1 might be given as Amsterdam in one tuple and London in another). To overcome these problems we move to 2NF.

Before proceeding to the next form, let us denote:
R = Relation scheme
S = Set of attributes
F = Set of functional dependencies

Second Normal Form (2NF)
Definition: A relation schema R<S, F> is in second normal form (2NF) if it is in 1NF and if all nonprime attributes are fully functionally dependent on the relation key(s). A database schema is in 2NF if every relation schema included in the database schema is in 2NF.

Features:
1. A relation is in 2NF if it is in 1NF and every non-key attribute is fully dependent on the key.
2. If the key is a single attribute, then the relation is automatically in 2NF.

Second Normal Form (definition assuming only one candidate key, which is thus the primary key): A relation is in 2NF if and only if it is in 1NF and every non-key attribute is irreducibly dependent on the primary key.

So we decompose the FIRST relation into two tables, SECOND and SP. It should be clear that the revised structure overcomes all the problems with update operations sketched earlier.

INSERT: We can insert the information that S5 is located in Athens, even though S5 does not currently supply any parts, by simply inserting the appropriate tuple into SECOND.

DELETE: We can delete the shipment connecting S3 and P2 by deleting the appropriate tuple from SP; we do not lose the information that S3 is located in Paris.

UPDATE: The S#-CITY redundancy has been eliminated. Thus we can change the city for S1 from London to Amsterdam by changing it once and for all in the relevant SECOND tuple.

Still, SECOND has problems with the update operations, in the following ways:

INSERT: We cannot insert the fact that a particular city has a particular status; e.g., we cannot state that any supplier in Rome must have a status of 50 until we have some supplier actually located in that city.

DELETE: If we delete the only SECOND tuple for a particular city, we destroy not only the information for the supplier concerned but also the information that the city has that particular status. For example, if we delete the SECOND tuple of S5, we lose the information that the status for Athens is 30.

UPDATE: The status for a given city appears in SECOND many times, in general (the relation still contains some redundancy). Thus, if we need to change the status for London from 20 to 30, we are faced with either the problem of searching SECOND to find every tuple for London (and changing it) or the possibility of producing an inconsistent result (the status for London might be given as 20 in one tuple and 30 in another).

Again, to overcome such problems, we replace the original relation (SECOND, in this case) by two projections, SC and CS, which gives 3NF.

Third Normal Form (3NF)
Definition: A relation schema R<S, F> is in 3NF if, for all nontrivial functional dependencies in F of the form X → A, either X contains a key (i.e., X is a superkey) or A is a prime attribute. A database schema is in 3NF if every relation schema included in the database schema is in 3NF.

Feature: A relation R is in 3NF if and only if it is in 2NF and every non-key attribute is non-transitively dependent on the primary key.

3NF (definition assuming only one candidate key, which is thus the primary key): A relation is in 3NF if and only if it is in 2NF and every non-key attribute is nontransitively dependent on the primary key. ("No transitive dependencies" implies no mutual dependencies.)

Relations SC and CS are both in 3NF. Thus, with these relations we have removed the transitive dependency from relation SECOND.

Boyce-Codd Normal Form (BCNF)
A relation is in BCNF if and only if every nontrivial, left-irreducible FD has a candidate key as its determinant. Or, less formally:

BCNF (informal definition): A relation is in BCNF if and only if the only determinants are candidate keys.

Definition: A normalized relation scheme R<S, F> is in BCNF if, for every nontrivial FD in F of the form X → A where X ⊆ S and A ∈ S, X is a superkey of R.

Explanation: Relations FIRST and SECOND, which are not in 3NF, are not in BCNF either; relations SP, SC and CS, which were in 3NF, are also in BCNF. Relation FIRST contains three determinants, namely {S#}, {CITY}, and {S#, P#}; of these, only {S#, P#} is a candidate key, so FIRST is not in BCNF. Similarly, SECOND is not in BCNF either, because the determinant {CITY} is not a candidate key. Relations SP, SC and CS, on the other hand, are each in BCNF, because in each case the (single) candidate key is the only determinant in the relation.

Features: BCNF differs from 3NF only when a relation has all three of the following features:
1. The candidate keys are composite (there is no single attribute that identifies a record).
2. There is more than one candidate key.
3. The candidate keys overlap, i.e. they have at least one attribute in common.
A relation in 3NF that lacks any of these three features is automatically in BCNF.
4. BCNF is a special case of 3NF. It was proposed as a simpler form of 3NF, but it was found to be stricter than 3NF, because every relation in BCNF is also in 3NF; however, a relation in 3NF is not necessarily in BCNF.

Example: a composite key working as a candidate key.

ADVISOR
• It is in 1NF by definition.
• It is in 2NF since all non-key attributes are dependent on the entire key.
• It is in 3NF because it has no transitive dependencies.
• It is not in BCNF because it has a determinant, FName, that is not a candidate key.

A relation is in BCNF if every determinant is a candidate key.
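The rule "every determinant must be a candidate key" can be checked mechanically. The sketch below is one way to do it, under stated assumptions: the attribute list (SID, Major, FName) and the two FDs are assumed for illustration, since the full ADVISOR schema is not reproduced here, and the helper names closure and bcnf_violations are invented.

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """List the nontrivial FDs whose determinant is not a superkey."""
    all_attrs = set(attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)            # ignore trivial FDs
        and closure(lhs, fds) != all_attrs     # determinant is not a superkey
    ]

# Assumed schema and FDs (hypothetical): a student has one advisor per major,
# and each advisor advises in exactly one major.
ADVISOR = ("SID", "Major", "FName")
ADVISOR_FDS = [
    (("SID", "Major"), ("FName",)),
    (("FName",), ("Major",)),
]

violations = bcnf_violations(ADVISOR, ADVISOR_FDS)
```

Under these assumed FDs, the only violation is the FD whose determinant is FName, which matches the conclusion stated for ADVISOR.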

Relations in BCNF
ADVISOR is replaced by two projections: STU-ADV (key involving SID) and ADV-SUBJ (key: FName).

Now we can say that a relation is in BCNF if and only if every nontrivial, left-irreducible FD has a candidate key as its determinant. Or, less formally, a relation is in BCNF if and only if the only determinants are candidate keys.

Multi-Valued Dependencies and Fourth Normal Form (4NF)
Definition: Given a relation schema R and a set D of FDs and MVDs satisfied by R, consider sets of attributes X and Y where X ⊆ R and Y ⊆ R. The relation schema R is in fourth normal form (4NF) if, for all multivalued dependencies of the form X →→ Y in D+, either X →→ Y is a trivial MVD or X is a superkey of R. A database schema is in 4NF if all relation schemas included in the database schema are in 4NF.

Join Dependencies and Fifth Normal Form
So far in this chapter we have assumed that the sole operation necessary or available in the further normalization process is the replacement of a relation in a non-loss way by exactly two of its projections. This assumption has successfully carried us as far as 4NF. It comes perhaps as a surprise, therefore, to discover that there exist relations that cannot be non-loss-decomposed into two projections but can be non-loss-decomposed into three (or more). To coin an ugly but convenient term, we will describe such a relation as "n-decomposable" (for some n > 2), meaning that the relation in question can be non-loss-decomposed into n projections but not into m for any m < n. A relation that can be non-loss-decomposed into two projections we will call "2-decomposable."

In short

1NF
• A relation is in 1NF if it contains no repeating groups.
• To convert an unnormalised relation to 1NF, either:
  • flatten the table and change the primary key, or
  • decompose the relation into smaller relations, one for the repeating groups and one for the non-repeating groups.
• Remember to put the primary key from the original relation into both new relations.
• The decomposition option is liable to give the best results.

2NF
• A relation is in 2NF if it contains no repeating groups and no partial key functional dependencies.
• Rule: A relation in 1NF with a single key field must be in 2NF.
• To convert a relation with partial functional dependencies to 2NF, create a set of new relations:
  • one relation for the attributes that are fully dependent upon the key;
  • one relation for each part of the key that has partially dependent attributes.

3NF
• A relation is in 3NF if it contains no repeating groups, no partial functional dependencies, and no transitive functional dependencies.
• To convert a relation with transitive functional dependencies to 3NF, remove the attributes involved in the transitive dependency and put them in a new relation.
• Rule: A relation in 2NF with only one non-key attribute must be in 3NF.
• In a normalized relation, a non-key field must provide a fact about the key, the whole key, and nothing but the key.
• Relations in 3NF are sufficient for most practical database design problems. However, 3NF does not guarantee that all anomalies have been removed.
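The 2NF and 3NF conversion rules can be sketched with plain projections and a join. The rows below are hypothetical sample data in the spirit of the FIRST relation (S#, STATUS, CITY, P#, QTY); the helper project and the variable names are invented for the example.

```python
COLS = ("S#", "STATUS", "CITY", "P#", "QTY")
FIRST = [  # hypothetical sample rows; STATUS depends on CITY, CITY on S#
    ("S1", 20, "London", "P1", 300),
    ("S1", 20, "London", "P2", 200),
    ("S2", 10, "Paris", "P1", 300),
    ("S3", 10, "Paris", "P2", 200),
]

def project(rows, cols, keep):
    """Relational projection: keep the named columns and drop duplicate rows."""
    idx = [cols.index(c) for c in keep]
    return sorted({tuple(row[i] for i in idx) for row in rows})

# 2NF step: split off the attributes that depend on only part of the key (S#).
second = project(FIRST, COLS, ("S#", "STATUS", "CITY"))
sp = project(FIRST, COLS, ("S#", "P#", "QTY"))

# 3NF step: remove the transitive dependency S# -> CITY -> STATUS from SECOND.
sc = project(second, ("S#", "STATUS", "CITY"), ("S#", "CITY"))
cs = project(second, ("S#", "STATUS", "CITY"), ("CITY", "STATUS"))

# Natural join of SECOND and SP on S# should give back FIRST (non-loss).
rejoined = sorted(
    (s, status, city, p, qty)
    for (s, status, city) in second
    for (s2, p, qty) in sp
    if s == s2
)
```

After the split, each supplier's city is stored once in SC and each city's status once in CS, so the update anomalies described for FIRST and SECOND no longer arise in this sample.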

Q. Define the term functional dependency. Or What are multivalued dependencies?

Ans. FUNCTIONAL DEPENDENCIES

Introduction
Basically, a functional dependence (usually abbreviated FD) is a many-to-one relationship from one set of attributes to another within a given relation. In the shipments relation SP, for example, there is a functional dependence from the set of attributes {S#, P#} to the set of attributes {QTY}. What this means is that for any given value of the attribute pair (S#, P#), there is exactly one corresponding value of the attribute QTY.

FDs provide a basis for a scientific attack on a number of practical problems. This is because FDs possess a rich set of interesting formal properties, which make it possible to treat the problems in question in a formal and rigorous manner.

The terms functional dependence and functional dependency are used interchangeably in the technical literature. Customary English usage would suggest that the term "dependence" be used for the FD concept per se and would reserve the term "dependency" for "the object that depends." But we very frequently need to refer to FDs in the plural, and "dependencies" seems to trip off the tongue more readily than "dependences"; hence our use of both terms.

Basic definitions
In order to illustrate the ideas of the present section, we make use of a slightly revised version of the shipments relation, one that includes, in addition to the usual attributes S#, P# and QTY, an attribute CITY, representing the city for the relevant supplier. We will refer to this revised relation as SCP to avoid confusion. A possible tabulation of relation SCP is given in Fig. 9.1.

Now, it is very important in this area, as in so many others, to distinguish clearly between (a) the value of a given relation (i.e., relation variable) at a given point in time and (b) the set of all possible values that the given relation (variable) might assume at different times. In what follows, we will first define the concept of functional dependency as it applies to Case (a), and then extend it to apply to Case (b). Here then is the definition for Case (a).

Let R be a relation, and let X and Y be arbitrary subsets of the set of attributes of R. Then we say that Y is functionally dependent on X, in symbols X → Y (read "X functionally determines Y," or simply "X arrow Y"), if and only if each X-value in R has associated with it precisely one Y-value in R. In other words, whenever two tuples of R agree on their X-value, they also agree on their Y-value.

For example, the tabulation of relation SCP shown in Fig. 9.1 satisfies the FD {S#} → {CITY}, because every SCP tuple with a given S# value also has the same CITY value. Indeed, it also satisfies several more FDs. (Exercise: Check these against the tabulation.)

Fig. 9.1: The relation SCP (sample tabulation)

The left-hand side and right-hand side of an FD are sometimes called the determinant and the dependent, respectively. As the definition states, the determinant and dependent are both sets of attributes. When the set contains just one attribute, however, i.e. when it is a singleton set, we will often drop the set brackets and write just, e.g., S# → CITY.
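The Case (a) definition translates directly into a check on one relation value: group the tuples by their X-value and make sure each group has a single Y-value. The sketch below does this; the SCP rows are invented sample data (the tabulation from the figure is not reproduced here), and the name satisfies_fd is hypothetical.

```python
def satisfies_fd(rows, lhs, rhs):
    """True if, in this relation value, tuples that agree on lhs also agree on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y-value seen for each X-value; any other is a violation.
        if seen.setdefault(x, y) != y:
            return False
    return True

# Hypothetical sample tabulation of SCP (S#, CITY, P#, QTY).
SCP = [
    {"S#": "S1", "CITY": "London", "P#": "P1", "QTY": 100},
    {"S#": "S1", "CITY": "London", "P#": "P2", "QTY": 100},
    {"S#": "S2", "CITY": "Paris", "P#": "P1", "QTY": 200},
    {"S#": "S2", "CITY": "Paris", "P#": "P2", "QTY": 200},
    {"S#": "S3", "CITY": "Paris", "P#": "P2", "QTY": 300},
]

supplier_determines_city = satisfies_fd(SCP, ["S#"], ["CITY"])
city_determines_supplier = satisfies_fd(SCP, ["CITY"], ["S#"])
supplier_determines_qty = satisfies_fd(SCP, ["S#"], ["QTY"])
```

In this sample S# → QTY happens to hold as well, but only because of the particular rows chosen. A check of this kind says what one relation value satisfies, not what must hold for every legal value.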

The foregoing definitions apply to "Case (a)," i.e., to individual relation values. However, when we consider relation variables, and in particular when we consider base relations, we are usually interested not so much in the FDs that happen to hold in the particular value that the variable happens to have at some particular time, but rather in those FDs that hold for all possible values of that variable. In the case of SCP, for example, the FD S# → CITY holds for all possible values of SCP, because, at any given time, a given supplier has exactly one corresponding city, and so any two tuples appearing in SCP at the same time with the same supplier number must necessarily have the same city as well. In fact, the statement that this FD holds "for all time" (i.e., for all possible values of SCP) is an integrity constraint for SCP: it places limits on the values that SCP can legitimately assume.

Here then is the "Case (b)" definition of functional dependency (the extension over the Case (a) definition is the phrase "in every possible legal value of R").

• Let R be a relation variable, and let X and Y be arbitrary subsets of the set of attributes of R. Then we say that Y is functionally dependent on X, in symbols X → Y (read "X functionally determines Y," or simply "X arrow Y"), if and only if, in every possible legal value of R, each X-value has associated with it precisely one Y-value. In other words, in every possible legal value of R, whenever two tuples agree on their X-value, they also agree on their Y-value.

Henceforth, we will usually take the term "functional dependency" to have this latter, more demanding, time-independent meaning (barring explicit statements to the contrary). Time-independent FDs that apply to the relation variable SCP include {S#, P#} → QTY and S# → CITY. Notice in particular that some FDs that do hold in the sample tabulation of Fig. 9.1, such as S# → QTY, do not hold "for all time."

In other words, the statement that (e.g.) "every shipment for a given supplier has the same shipment quantity" happens to be true for the sample values in the figure, but it is not true for all possible legal values of SCP.

It is worth pointing out that if X is a candidate key of relation R, and in particular if it is the primary key, then all attributes Y of relation R must necessarily be functionally dependent on X (this fact follows from the definition of candidate key). In the usual parts relation, for example, every attribute must necessarily be functionally dependent on the part number P#. In fact, if relation R satisfies the FD A → B and A is not a candidate key, then R will involve some redundancy. In the case of relation SCP, for example, the fact that a given supplier is located in a given city appears many times, in general (see the figure).

Now, even if we restrict our attention to FDs that hold "for all time," the set of FDs satisfied by all legal values of a given relation can still be very large, as the SCP example suggests. We would therefore like to reduce that set to a manageable size. Why is this objective desirable? One reason is that (as already stated) FDs represent integrity constraints, and hence the DBMS needs to check them when updates are performed. Given a particular set S of FDs, therefore, it is desirable to find some other set T that is (ideally) much smaller than S and has the property that every FD in S is implied by the FDs in T. If such a set T can be found, it is sufficient that the DBMS enforce the FDs in T, and the FDs in S will then be enforced automatically. The problem of finding such a set T is thus of considerable practical interest.

Trivial and Nontrivial Dependencies
Note: In the remainder of this section, we will occasionally abbreviate "functional dependency" to just "dependency." Similarly for "functionally dependent on," "functionally determines," etc.

One obvious way to reduce the size of the set of FDs we have to deal with is to eliminate the trivial dependencies. An FD is trivial if it cannot possibly fail to be satisfied; for example, {S#, P#} → S# is trivial. In fact, an FD is trivial if and only if the right-hand side is a subset (not necessarily a proper subset) of the left-hand side.

As the name implies, trivial dependencies are not very interesting in practice; we are usually more interested in nontrivial dependencies (which are, of course, precisely the ones that are not trivial), because these are the ones that correspond to "genuine" integrity constraints. When we are dealing with formal dependency theory, however, we cannot necessarily assume that all dependencies are nontrivial.

Closure of a Set of Dependencies
As already suggested, certain FDs imply others. As a simple example, the FD {S#, P#} → {CITY, QTY} implies both of the following FDs:

{S#, P#} → CITY
{S#, P#} → QTY

As a more complex example, suppose we have a relation R with three attributes A, B, and C, such that the FDs A → B and B → C both hold in R. Then it is easy to see that the FD A → C also holds in R. The FD A → C here is an example of a transitive FD: C is said to depend on A transitively, via B.

The set of all FDs that are implied by a given set S of FDs is called the closure of S, and is denoted S+. Clearly we need a way of computing S+ from S. The first attack on this problem appeared in a paper by Armstrong, which gave a set of rules of inference (more usually called Armstrong's axioms) by which new FDs can be inferred from given ones. Those rules can be stated in a variety of equivalent ways, one of the simplest of which is as follows.

Armstrong's inference rules: Let A, B, and C be arbitrary subsets of the set of attributes of the given relation R, and let us agree to write (e.g.) AB to mean the union of A and B. Then:
1. Reflexivity: If B is a subset of A, then A → B.
2. Augmentation: If A → B, then AC → BC.
3. Transitivity: If A → B and B → C, then A → C.

Each of these three rules can be directly proved from the definition of functional dependence (the first is just the definition of a trivial dependence, of course). Moreover, the rules are complete, in the sense that, given a set S of FDs, all FDs implied by S can be derived from S using the rules. They are also sound, in the sense that no additional FDs (i.e., FDs not implied by S) can be so derived. In other words, the rules can be used to derive precisely the closure S+.

Several further rules can be derived from the three given above, the following among them.

These additional rules can be used to simplify the practical task of computing S+ from S. (In what follows, D is another arbitrary subset of the set of attributes of R.)
4. Self-determination: A → A.
5. Decomposition: If A → BC, then A → B and A → C.
6. Union: If A → B and A → C, then A → BC.
7. Composition: If A → B and C → D, then AC → BD.

And Darwen proves the following rule, which he calls the General Unification Theorem:
8. If A → B and C → D, then A ∪ (C − B) → BD (where "∪" is union and "−" is set difference).

The name "General Unification Theorem" refers to the fact that several of the earlier rules can be seen as special cases.

Example: Suppose we are given relation R with attributes A, B, C, D, E, F, and the FDs

A → BC
B → E
CD → EF

Observe that we are extending our notation slightly (though not incompatibly) by writing, e.g., BC for the set consisting of attributes B and C; previously BC would have meant the union of B and C, where B and C were sets of attributes.

Note: If you would prefer a more concrete example, take A as employee number, B as department number, C as manager's employee number, D as project number for a project directed by that manager (unique within manager), E as department name, and F as percentage of time allocated by the specified manager to the specified project.

We now show that the FD AD → F holds in R, and so is a member of the closure of the given set:
1. A → BC (given)
2. A → C (1, decomposition)
3. AD → CD (2, augmentation)
4. CD → EF (given)
5. AD → EF (3 and 4, transitivity)
6. AD → F (5, decomposition)

Closure of a Set of Attributes

We have not yet given an effective algorithm for computing the closure S+ of a given set S of FDs. However, in this section we give an effective way of determining whether a given (specified) FD is in that closure. We begin our discussion with the notion of a superkey.

A superkey for a relation R is a set of attributes of R that includes at least one candidate key of R as a subset (not necessarily a proper subset, of course). The definition of "superkey" can thus be derived from that of "candidate key" by simply deleting the irreducibility requirement. It follows immediately that the superkeys for a given relation R are precisely those subsets K of the set of attributes of R such that the functional dependency K → A holds true for every attribute A of R.

Now suppose we know the FDs that hold for some given relation, and we need to determine the candidate keys for that relation. The candidate keys are, by definition, those superkeys that are irreducible. So determining whether or not a given set of attributes K is a superkey is a big step toward determining whether K is in fact a candidate key. To determine whether K is a superkey, we need to determine whether the set of all attributes functionally dependent on K is in fact the set of all attributes of R. And so, given a set S of FDs that hold in R, we need a way of determining the set of all attributes of R that are functionally dependent on K, the so-called closure K+ of K under S.

A simple algorithm for computing this closure is as follows: initialize CLOSURE[K,S] to K; then repeatedly go through the FDs X → Y in S (the inner loop), adding Y to CLOSURE[K,S] whenever X is a subset of CLOSURE[K,S]; stop when a complete pass leaves CLOSURE[K,S] unchanged.

Example: Suppose we are given relation R with attributes A, B, C, D, E, F, and FDs

A → BC
E → CF
B → E
CD → EF

We now compute the closure {A,B}+ of the set of attributes {A,B} under this set of FDs.
1. We initialize the result CLOSURE[K,S] to {A,B}.
2. We now go round the inner loop four times, once for each of the given FDs. On the first iteration (for the FD A → BC), we find that the left-hand side is indeed a subset of CLOSURE[K,S] as computed so far, so we add attributes (B and) C to the result. CLOSURE[K,S] is now the set {A,B,C}.

3. On the second iteration (for the FD E → CF), we find that the left-hand side is not a subset of the result as computed so far, which thus remains unchanged.
4. On the third iteration (for the FD B → E), we add E to CLOSURE[K,S], which now has the value {A,B,C,E}.
5. On the fourth iteration (for the FD CD → EF), CLOSURE[K,S] remains unchanged.
6. Now we go round the inner loop four times again. On the first iteration, the result does not change; on the second, it expands to {A,B,C,E,F}; on the third and fourth, it does not change.
7. Now we go round the inner loop four times again. CLOSURE[K,S] does not change, and so the whole process terminates, with {A,B}+ = {A,B,C,E,F}.

Note, therefore, that {A,B} is not a superkey (and hence not a candidate key, a fortiori).

An important corollary of the foregoing is as follows: Given a set S of FDs, we can easily tell whether a specific FD X → Y follows from S, because that FD will follow if and only if Y is a subset of the closure X+ of X under S. In other words, we now have a simple way of determining whether a given FD X → Y is in the closure S+ of S.

Irreducible Sets of Dependencies
Let S1 and S2 be two sets of FDs. If every FD implied by S1 is implied by the FDs in S2, i.e., if S1+ is a subset of S2+, we say that S2 is a cover for S1. What this means is that if the DBMS enforces the constraints represented by the FDs in S2, then it will automatically be enforcing the FDs in S1.

Next, if S2 is a cover for S1 and S1 is a cover for S2, i.e., if S1+ = S2+, we say that S1 and S2 are equivalent. Clearly, if S1 and S2 are equivalent, then if the DBMS enforces the constraints represented by the FDs in S2, it will automatically be enforcing the FDs in S1, and vice versa.

Now we define a set S of FDs to be irreducible if and only if it satisfies the following three properties:
1. The right-hand side (the dependent) of every FD in S involves just one attribute (i.e., is a singleton set).
2. The left-hand side (the determinant) of every FD in S is irreducible in turn, meaning that no attribute can be discarded from the determinant without changing the closure S+ (i.e., without converting S into some set not equivalent to S). We will say that such an FD is left-irreducible.
3. No FD in S can be discarded from S without changing the closure S+ (i.e., without converting S into some set not equivalent to S).

For example, consider the familiar parts relation P. The FDs from the part number P# to each of the other attributes of P (among others) hold in that relation. This set of FDs is easily seen to be irreducible: the right-hand side is a single attribute in each case, the left-hand side is obviously irreducible in turn, and none of the FDs can be discarded without changing the closure (i.e., without losing some information). By contrast, a set of FDs is not irreducible if some right-hand side contains more than one attribute, if some determinant contains an attribute that can be dropped, or if some FD is implied by the others.
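The attribute-closure procedure and the cover and equivalence tests described above can be sketched in a few lines. This is a minimal sketch, not a tuned implementation: attributes are assumed to be single letters so that a string such as "CD" can stand for the set {C, D}, and the function names are invented for the example.

```python
def attribute_closure(attrs, fds):
    """Closure K+ of a set of attributes under a list of (lhs, rhs) FDs."""
    closure = set(attrs)
    changed = True
    while changed:                      # repeat until a full pass adds nothing
        changed = False
        for lhs, rhs in fds:            # the "inner loop" over the given FDs
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure

def implies(fds, lhs, rhs):
    """X -> Y follows from fds if and only if Y is a subset of X+."""
    return set(rhs) <= attribute_closure(lhs, fds)

def covers(s2, s1):
    """s2 is a cover for s1 if every FD in s1 is implied by s2."""
    return all(implies(s2, lhs, rhs) for lhs, rhs in s1)

def equivalent(s1, s2):
    """Two sets of FDs are equivalent if each is a cover for the other."""
    return covers(s1, s2) and covers(s2, s1)

# The FDs of the worked example: A -> BC, E -> CF, B -> E, CD -> EF.
S = [("A", "BC"), ("E", "CF"), ("B", "E"), ("CD", "EF")]

ab_closure = sorted(attribute_closure("AB", S))
ab_is_superkey = attribute_closure("AB", S) == set("ABCDEF")
ad_determines_f = implies(S, "AD", "F")

# The same FDs rewritten with singleton right-hand sides are equivalent to S.
S_split = [("A", "B"), ("A", "C"), ("E", "C"), ("E", "F"),
           ("B", "E"), ("CD", "E"), ("CD", "F")]
same_closure = equivalent(S, S_split)
```

Testing equivalence this way avoids computing the full closures S1+ and S2+, which can be very large. Only the attribute closures of the determinants that actually appear are needed.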

We now claim that for every set of FDs, there exists at least one equivalent set that is irreducible. In fact, this is easy to see. Let the original set of FDs be S. Thanks to the decomposition rule, we can assume without loss of generality that every FD in S has a singleton right-hand side. Next, for each FD f in S, we examine each attribute A in the left-hand side of f; if S and the set of FDs obtained by eliminating A from the left-hand side of f are equivalent, we delete A from the left-hand side of f. Then, for each FD f remaining in S, if S and S − {f} are equivalent, we delete f from S. The final set S is irreducible and is equivalent to the original set S.

Example: Suppose we are given relation R with attributes A, B, C, D, and FDs

A → BC
B → C
A → B
AB → C
AC → D

We now compute an irreducible set of FDs that is equivalent to this given set.
1. The first step is to rewrite the FDs such that each one has a singleton right-hand side:
A → B
A → C
B → C
A → B
AB → C
AC → D
We observe immediately that the FD A → B occurs twice, so one occurrence can be eliminated.
2. Next, attribute C can be eliminated from the left-hand side of the FD AC → D, because we have A → C, so A → AC by augmentation, and we are given AC → D, so A → D by transitivity; thus the C on the left-hand side (LHS) of AC → D is redundant.
3. Next, we observe that the FD AB → C can be eliminated, because again we have A → C, so AB → CB by augmentation, so AB → C by decomposition.

4. Finally, the FD A → C is implied by the FDs A → B and B → C, so it can also be eliminated. We are left with:

A → B
B → C
A → D

This set is irreducible.

A set I of FDs that is irreducible and is equivalent to some other set S of FDs is said to be an irreducible cover for S. Thus, given some particular set S of FDs that need to be enforced, it is sufficient for the system to find and enforce an irreducible cover I instead. We should make it clear, however, that a given set of FDs does not necessarily have a unique irreducible cover.

Q. What is query optimization?

Ans. Heuristic query optimization covers techniques that apply heuristic rules to modify the internal representation of a query, which is usually in the form of a query tree or a query graph data structure, to improve its expected performance. The parser of a high-level query first generates an initial internal representation, which is then optimized according to heuristic rules. Following that, a query execution plan is generated to execute groups of operations based on the access paths available on the files involved in the query. One of the main heuristic rules is to apply SELECT and PROJECT operations before applying the JOIN or other binary operations. This is because the size of the file resulting from a binary operation, such as JOIN, is usually a multiplicative function of the sizes of the input files. The SELECT and PROJECT operations reduce the size of a file and hence should be applied before a join or other binary operation.

Q. What are the various guidelines for database design?

Ans. The process of database design can be stated as follows: Design the logical and physical structure of one or more databases to accommodate the information needs of the users in an organisation for a defined set of applications. The goals of database design are multiple:

1. Satisfy the information content requirements of the specified users and applications.

2. Provide a natural and easy-to-understand structuring of the information.
3. Support processing requirements and any performance objectives such as response time, processing time, and storage space.

These goals are very hard to accomplish and measure, and they involve an inherent tradeoff: if one attempts to achieve more "naturalness" and "understandability" of the model, it may be at the cost of performance. The problem is aggravated because the database design process often begins with informal and poorly defined requirements. In contrast, the result of the design activity is a rigidly defined database schema that cannot be modified easily once the database is implemented.

We can identify six main phases of the database design process:

(i) Requirements collection and analysis
(ii) Conceptual database design
(iii) Choice of a DBMS
(iv) Data model mapping (also called logical database design)
(v) Physical database design
(vi) Database system implementation and tuning

Q. Discuss the concepts of normalization in detail.

Ans. NORMAL FORMS BASED ON PRIMARY KEYS

Normalization

In very simple words, normalization is a technique which helps to determine the most appropriate grouping of data items into records, segments or tuples. This is necessary as the data items are arranged in tables which indicate the structure, relationships and integrity in the relational database.

Normal Forms

The normalization process, as first proposed by Codd (1972), takes a relation schema through a series of tests to "certify" whether it satisfies a certain normal form. The process, which proceeds in a top-down fashion by evaluating each relation against the criteria for normal forms and decomposing relations as necessary, can thus be considered as relational design by analysis. Initially, Codd proposed three normal forms, which he called first, second and third normal form. A stronger definition of 3NF, called Boyce-Codd normal form (BCNF), was proposed later by Boyce and Codd. All these normal forms are based on the functional

dependencies among the attributes of a relation. Later, a fourth normal form (4NF) and a fifth normal form (5NF) were proposed, based on the concepts of multivalued dependencies and join dependencies, respectively.

Need of Normalization

Normalization of data can hence be looked upon as a process of analyzing the given relation schemas based on their FDs and primary keys to achieve the desirable properties of:

1. Minimizing redundancy.
2. Minimizing the insertion, deletion, and update anomalies.

Normalization: It is the process of restructuring an unstructured relation into a structured one with the purpose of removing redundancy and anomalies. Normal forms are based on the primary key.

First Normal Form (1NF)

Definition: A relation schema is said to be in 1NF if the values in the domain of each attribute of the relation are atomic. In other words, only one value is associated with each attribute and the value is not a set of values or a list of values. A database schema is in 1NF if every relation schema included in the database schema is in 1NF.

A relation is in 1NF if and only if all underlying domains contain scalar values only. Here scalar means atomic: there should be a single value at the intersection of each row and column, as shown in the FIRST relation obtained from the original relation.

Given the functional dependencies that hold in relation FIRST, problems occur with each of the three update operations.

INSERT: We cannot insert the fact that a particular supplier is located in a particular city until that supplier supplies at least one part. For example, the FIRST relation does not show that supplier S5 is located in Athens. The reason is that, until S5 supplies some part, we have no appropriate primary key value.

DELETE: If we delete the only FIRST tuple for a particular supplier, we destroy not only the shipment connecting that supplier to some part but also the information that the supplier is located in a particular city. For example, if we delete the FIRST tuple with S# value S3 and P# value P2, we lose the information that S3 is located in Paris.

UPDATE: The city value for a given supplier appears in FIRST many times, in general. This redundancy causes update problems. For example, if supplier S1 moves from London to Amsterdam, we are faced with either the problem of searching FIRST to find every tuple

connecting S1 and London (and changing it) or the possibility of producing an inconsistent result (the city for S1 might be given as Amsterdam in one tuple, London in another). Therefore, to overcome this problem we move to 2NF.

Before proceeding to the next form, let us denote:

R = relation scheme
S = set of attributes
F = set of functional dependencies

Second Normal Form (2NF)

Definition: A relation schema R(S, F) is in second normal form (2NF) if it is in 1NF and if all nonprime attributes are fully functionally dependent on the relation key(s). A database schema is in 2NF if every relation schema included in the database schema is in 2NF.

Features:

1. A relation is in 2NF if it is in 1NF and every non-key attribute is fully dependent on the key.
2. If the key is a single attribute, then a relation in 1NF is automatically in 2NF.
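The full-dependency requirement can be tested by looking for non-key attributes that are determined by a proper subset of the key. The sketch below assumes a single candidate key and assumes the usual suppliers-and-parts layout for FIRST, namely FIRST(S#, STATUS, CITY, P#, QTY) with the FDs {S#, P#} → QTY, S# → CITY and CITY → STATUS; the function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of the attribute set attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs.issubset(result) and not rhs.issubset(result):
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, attributes, fds):
    """List (key subset, attribute) pairs that violate 2NF.

    A pair is reported when a non-key attribute is determined by a
    proper, non-empty subset of the (single) candidate key.
    """
    found = []
    non_key = set(attributes) - set(key)
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds)
            for attr in sorted(non_key & determined):
                found.append((subset, attr))
    return found


# Assumed FDs of FIRST (not reproduced in the text above).
first_fds = [
    (frozenset({"S#", "P#"}), frozenset({"QTY"})),
    (frozenset({"S#"}), frozenset({"CITY"})),
    (frozenset({"CITY"}), frozenset({"STATUS"})),
]
first_attrs = {"S#", "STATUS", "CITY", "P#", "QTY"}

print(partial_dependencies({"S#", "P#"}, first_attrs, first_fds))
# [(('S#',), 'CITY'), (('S#',), 'STATUS')]
```

With a single-attribute key the loop over proper subsets is empty, which matches feature 2: no partial dependency can exist.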

Second Normal Form (definition assuming only one candidate key, which is thus the primary key): A relation is in 2NF if and only if it is in 1NF and every non-key attribute is irreducibly dependent on the primary key.

So we decompose the FIRST relation into two relations, SECOND and SP. It should be clear that the revised structure overcomes all the problems with update operations sketched earlier:

INSERT: We can insert the information that S5 is located in Athens, even though S5 does not currently supply any parts, by simply inserting the appropriate tuple into SECOND.

DELETE: We can delete the shipment connecting S3 and P2 by deleting the appropriate tuple from SP; we do not lose the information that S3 is located in Paris.

UPDATE: The S#-CITY redundancy has been eliminated. Thus we can change the city for S1 from London to Amsterdam by changing it once and for all in the relevant SECOND tuple.

Still, relation SECOND causes problems with update operations in the following ways:

INSERT: We cannot insert the fact that a particular city has a particular status (e.g., we cannot state that any supplier in Rome must have a status of 50) until we have some supplier actually located in that city.

DELETE: If we delete the only SECOND tuple for a particular city, we destroy not only the information for the supplier concerned but also the information that the city has that particular status. For example, if we delete the SECOND tuple for S5, we lose the information that the status for Athens is 30.

UPDATE: The status for a given city appears in SECOND many times, in general.

Super key: Any set of attributes that contains a key is a super key; in particular, adding any attribute to the primary key gives a super key.

Trivial dependency: If the right-hand side is a subset of the left-hand side, the dependency is known as a trivial dependency, e.g. (S#, P#) → S#.

A relation schema R(S, F) (S = set of attributes, F = set of functional dependencies) is in BCNF if, for every FD X → Y in which X is a subset of S and Y is an attribute belonging to S, one of the following two conditions holds:

i. Y belongs to X (X → Y is a trivial dependency), or
ii. X is a super key.

A relation is in BCNF if every determinant is a candidate key. Hence the relation shown is in 3NF and also in BCNF. Whereas, in the Advisor example:

- It is in 1NF by definition.

- It is in 2NF since all non-key attributes are dependent on the entire key.
- It is in 3NF because it has no transitive dependencies.

Relation SECOND still contains some redundancy. Thus, if we need to change the status for London from 20 to 30, we are faced with either the problem of searching SECOND to find every tuple for London (and changing it) or the possibility of producing an inconsistent result (the status for London might be given as 20 in one tuple and 30 in another). Again, to overcome such problems we replace the original relation (SECOND, in this case) by two projections, SC and CS, giving 3NF. Relations SC and CS are both in 3NF.

Figure: Functional dependencies in the relations SC and CS.

Thus, by this replacement we have removed the transitive dependency from relation SECOND.

Third Normal Form (3NF)

Definition: A relation schema R(S, F) is in 3NF if, for all nontrivial functional dependencies in F of the form X → A, either X contains a key (i.e., X is a super key) or A is a prime attribute. A database schema is in 3NF if every relation schema included in the database schema is in 3NF.

Feature: A relation R is in 3NF if and only if it is in 2NF and every non-key attribute is non-transitively dependent on the primary key.

Third Normal Form (definition assuming only one candidate key, which is thus the primary key): A relation is in 3NF if and only if it is in 2NF and every non-key attribute is non-transitively dependent on the primary key. ("No transitive dependencies" implies no mutual dependencies.)

Boyce-Codd Normal Form (BCNF)

A relation is in BCNF if and only if every nontrivial, left-irreducible FD has a candidate key as its determinant. Or, less formally,

BCNF (informal definition): a relation is in BCNF if and only if the only determinants are candidate keys.

BCNF was proposed as a simpler form of 3NF, but it was found to be stricter than 3NF: every relation in BCNF is also in 3NF; however, a relation in 3NF is not necessarily in BCNF. BCNF is thus a special (stricter) case of 3NF.

Definition: A normalized relation scheme R(S, F) is in BCNF if, for every nontrivial FD in F of the form X → A, where X is a subset of S and A is an attribute in S, X is a super key of R.

Features: BCNF differs from 3NF only for relations that show all of the following features:

1. The key attributes (candidate keys) are composite (there is no single attribute that identifies a record).
2. There is more than one candidate key.
3. In each candidate key, at least one attribute is overlapping (shared with another candidate key).

A relation in 3NF that does not show all three of these features is automatically in BCNF as well.

Explanation: Relations FIRST and SECOND, which are not in 3NF, are not in BCNF either; relations SP, SC and CS, which are in 3NF, are also in BCNF. Relation FIRST contains three determinants, namely {S#}, {CITY}, and {S#, P#}; of these, only {S#, P#} is a candidate key, so FIRST is not in BCNF. Similarly, SECOND is not in BCNF either, because the determinant {CITY} is not a candidate key. Relations SP, SC and CS, on the other hand, are each in BCNF, because in each case the (single) candidate key is the only determinant in the relation.
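The explanation above amounts to testing, for every nontrivial FD, whether its determinant is a super key. The following sketch does exactly that; the attribute lists and FDs of FIRST, SECOND, SP, SC and CS are assumed from the usual suppliers-and-parts example, and the function names are hypothetical.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs.issubset(result) and not rhs.issubset(result):
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the nontrivial FDs whose determinant is not a super key."""
    violations = []
    for lhs, rhs in fds:
        if rhs.issubset(lhs):
            continue  # trivial FD, always allowed
        if closure(lhs, fds) != set(attributes):
            violations.append((set(lhs), set(rhs)))
    return violations


def fd(lhs, rhs):
    """Build an FD as a pair of frozensets of attribute names."""
    return (frozenset(lhs), frozenset(rhs))


# Assumed schemas and FDs (illustrative only).
key_to_qty = fd(["S#", "P#"], ["QTY"])
supplier_to_city = fd(["S#"], ["CITY"])
city_to_status = fd(["CITY"], ["STATUS"])

relations = {
    "FIRST": ({"S#", "STATUS", "CITY", "P#", "QTY"},
              [key_to_qty, supplier_to_city, city_to_status]),
    "SECOND": ({"S#", "STATUS", "CITY"}, [supplier_to_city, city_to_status]),
    "SP": ({"S#", "P#", "QTY"}, [key_to_qty]),
    "SC": ({"S#", "CITY"}, [supplier_to_city]),
    "CS": ({"CITY", "STATUS"}, [city_to_status]),
}

for name, (attrs, fds) in relations.items():
    status = "in BCNF" if not bcnf_violations(attrs, fds) else "not in BCNF"
    print(name, status)
```

Under these assumptions the loop reports FIRST and SECOND as not in BCNF and SP, SC and CS as in BCNF, in line with the explanation.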

- It is not in BCNF because it has a determinant, FName, that is not a candidate key.

Figure: Relations in BCNF: STU-ADV (SID, FName) with key (SID, FName), and ADV-SUBJ with key (FName).

Now we can say that a relation is in BCNF if and only if every nontrivial left-irreducible FD has a candidate key as its determinant. Or, less formally, a relation is in BCNF if and only if the only determinants are candidate keys.

Fourth Normal Form (4NF)

Definition: Given a relation schema R such that the set D of FDs and MVDs is satisfied, consider sets of attributes X and Y, where X and Y are subsets of R. The relation schema

R is in fourth normal form (4NF) if, for all multivalued dependencies of the form X →→ Y in D+, either X →→ Y is a trivial MVD or X is a super key of R. A database scheme is in 4NF if all relation schemas included in the database schema are in 4NF.

Multi-Valued Dependencies and Fourth Normal Form

The 3NF and BCNF normal forms serve the purpose well most of the time. However, there are occasions where higher normal forms must be considered. The next higher form of normalization is the fourth normal form. It makes use of a new kind of dependency, called a multi-valued dependency (MVD); MVDs are a generalization of FDs. Likewise, the definition of fifth normal form makes use of another new kind of dependency, called a join dependency (JD). JDs in turn are a generalization of MVDs.

Multi-valued dependence: Let R be a relation, and let A, B, and C be subsets of the attributes of R. Then we say that B is multi-dependent on A, in symbols A →→ B (read "A multi-determines B," or simply "A double arrow B"), if and only if, for every possible legal value of R, the set of B values matching a given (A value, C value) pair depends only on the A value and is independent of the C value.

To understand it, we will take an example. Suppose we are given a relation HCTX (H for "hierarchy") containing information about courses, teachers, and texts, in which the attributes corresponding to teachers and texts are relation-valued (see the figure). As you can see, each HCTX tuple consists of a course name, plus a relation containing teacher names, plus a relation containing text names (two such tuples are shown in the figure). The intended meaning of such a tuple is that the specified course can be taught by any of the specified teachers and uses all of the specified texts as references. We assume that, for a given course, there can exist any number of corresponding teachers and any number of corresponding texts. Moreover, we also assume (perhaps not very realistically!) that teachers and texts are quite independent of one another; that is, no matter who actually teaches any particular offering of a given course, the same texts are used. Finally, we also assume that a given teacher or a given text can be associated with any number of courses.
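The definition and the course example above can be sketched in a few lines of Python. The sample values (two courses with their teacher and text names) and the function names flatten and satisfies_mvd are illustrative assumptions, not taken from the figure. The sketch lists every teacher/text combination per course as flat triples and checks that the set of teachers matching a (course, text) pair depends on the course alone.

```python
from itertools import product

# Hypothetical HCTX value: course -> (set of teachers, set of texts).
hctx = {
    "Physics": ({"Prof. Green", "Prof. Brown"},
                {"Basic Mechanics", "Principles of Optics"}),
    "Math": ({"Prof. Green"},
             {"Basic Mechanics", "Vector Analysis", "Trigonometry"}),
}


def flatten(nested):
    """Build flat (course, teacher, text) triples with every combination."""
    return {
        (course, teacher, text)
        for course, (teachers, texts) in nested.items()
        for teacher, text in product(teachers, texts)
    }


def satisfies_mvd(relation):
    """Check A ->> B for a set of (a, b, c) triples.

    The MVD holds when the B values matching each (a, c) pair are exactly
    the B values matching a alone, i.e. they do not depend on c.
    """
    b_by_a = {}
    b_by_ac = {}
    for a, b, c in relation:
        b_by_a.setdefault(a, set()).add(b)
        b_by_ac.setdefault((a, c), set()).add(b)
    return all(b_set == b_by_a[a] for (a, _c), b_set in b_by_ac.items())


flat = flatten(hctx)
print(len(flat))            # 7 triples: 2 * 2 for Physics plus 1 * 3 for Math
print(satisfies_mvd(flat))  # True

# Dropping one combination breaks the independence of teachers and texts.
broken = flat - {("Physics", "Prof. Brown", "Principles of Optics")}
print(satisfies_mvd(broken))  # False
```

The second check shows why all combinations must be present: once one triple is missing, the teachers available for a course depend on which text is considered.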

Now suppose that we want to eliminate the relation-valued attributes. One way to do this is simply to replace relation HCTX by a relation CTX with three scalar attributes COURSE, TEACHER, and TEXT, as indicated in the figure. As you can see from the figure, each tuple of HCTX gives rise to m * n tuples in CTX, where m and n are the cardinalities of the TEACHERS and TEXTS relations in that HCTX tuple. Note that the resulting relation CTX is "all key" (the sole candidate key for HCTX, by contrast, was just {COURSE}).

Figure: Value for relation CTX corresponding to the HCTX value in the previous figure.

The meaning of relation CTX is basically as follows: a tuple {COURSE: c, TEACHER: t, TEXT: x} appears in CTX if and only if course c can be taught by teacher t and uses text x as a reference. Observe that, for a given course, all possible combinations of teacher and text appear; that is, CTX satisfies the (relation) constraint:

if tuples (c, t1, x1), (c, t2, x2) both appear
then tuples (c, t1, x2), (c, t2, x1) both appear also.

Now, it should be apparent that relation CTX involves a good deal of redundancy, leading as usual to certain update anomalies. For example, to add the information that the physics course can be taught by a new teacher, it is necessary to insert two new tuples, one for each of the two texts. Can we avoid such problems? Well, it is easy to see that:

1. The problems in question are caused by the fact that teachers and texts are completely independent of one another;
2. Matters would be much improved if CTX were decomposed into its two projections (call them CT and CX) on {COURSE, TEACHER} and {COURSE, TEXT} respectively (see the figure).

Figure: Values for relations CT and CX corresponding to the CTX value in the previous figure.

To add the information that the physics course can be taught by a new teacher, all we have to do now is insert a single tuple into relation CT. (Note that relation CTX can be recovered by joining CT and CX back together again, so the decomposition is nonloss.) Thus, it does seem reasonable to suggest that there should be a way of "further normalizing" a relation like CTX.

Note: At this point, you might object that the redundancy in CTX was unnecessary in the first place, and hence that the corresponding update anomalies were unnecessary too. More specifically, you might suggest that CTX need not include all possible TEACHER/TEXT combinations for a given course; for example, two tuples are obviously sufficient to show that the physics course has two teachers and two texts. The problem is, which two tuples? Any particular choice leads to a relation having a very unobvious interpretation and very strange update behavior (try stating the predicate for such a relation, i.e., try stating the criteria for deciding whether or not some given update is an acceptable operation on that relation).

Informally, therefore, it is obvious that the design of CTX is bad and the decomposition into CT and CX is better. The trouble is, however, these facts are not formally obvious. Note in particular that CTX satisfies no functional dependencies at all (apart from trivial ones such as COURSE → COURSE); in fact, CTX is in BCNF, since as already noted it is all key, and any "all key" relation must necessarily be in BCNF. (Note that the two projections CT and CX are also all key and hence in BCNF.) The ideas of the previous chapter are therefore of no help with the problem at hand.

The existence of "problem" BCNF relations like CTX was recognized very early on, and the way to deal with them was also understood, at least intuitively. However, it was not until 1977 that these intuitive ideas were put on a sound theoretical footing by Fagin's introduction of the notion of multi-valued dependencies (MVDs). Multi-valued dependencies are a generalization of functional dependencies, in the sense that every FD is an MVD, but the converse is not true (i.e., there exist MVDs that are not FDs). In the case of relation CTX there are two MVDs that hold:

COURSE →→ TEACHER
COURSE →→ TEXT

Note the double arrows; the MVD A →→ B is read as "B is multi-dependent on A," or, equivalently, "A multi-determines B." Let us concentrate on the first MVD, COURSE →→ TEACHER. Intuitively, what this MVD means is that, although a course does not have a single corresponding teacher (i.e., the functional dependence COURSE → TEACHER does not hold), nevertheless, each course does have a well-defined set of corresponding teachers. By "well-defined" here we mean, more precisely, that for a given course c and a given text x, the set of teachers t matching the pair (c, x) in CTX depends on the value c alone; it makes no difference which particular value of x we choose. The second MVD, COURSE →→ TEXT, is interpreted analogously.

Here then is the formal definition:

Multi-valued dependence: Let R be a relation, and let A, B, and C be subsets of the attributes of R. Then we say that B is multi-dependent on A, in symbols A →→ B (read "A multi-determines B," or simply "A double arrow B"), if and only if, in every possible legal value of R, the set of B values matching a given (A value, C value) pair depends only on the A value and is independent of the C value.

It is easy to show that, given the relation R{A, B, C}, the MVD A →→ B holds if and only if the MVD A →→ C also holds. MVDs always go together in pairs in this way. For this reason it is common to represent them both in one statement, thus:

A →→ B | C

For example:

COURSE →→ TEACHER | TEXT

We stated above that multi-valued dependencies are a generalization of functional dependencies, in the sense that every FD is an MVD. More precisely, an FD is an MVD in which the set of dependent (right-hand side) values matching a given determinant

(left-hand side) value is always a singleton set. Thus, if A → B, then certainly A →→ B.

Returning to our original CTX problem, we can now see that the trouble with relations such as CTX is that they involve MVDs that are not also FDs. (In case it is not obvious, we point out that it is precisely the existence of those MVDs that leads to the necessity of, for example, inserting two tuples to add another physics teacher. Those two tuples are needed in order to maintain the integrity constraint that is represented by the MVD.) The two projections CT and CX do not involve any such MVDs, which is why they represent an improvement over the original design. We would therefore like to replace CTX by those two projections, and an important theorem proved by Fagin allows us to make exactly that replacement:

Theorem (Fagin): Let R{A, B, C} be a relation, where A, B and C are sets of attributes. Then R is equal to the join of its projections on {A, B} and {A, C} if and only if R satisfies the MVDs A →→ B | C.

Fourth normal form: Relation R is in 4NF if and only if, whenever there exist subsets A and B of the attributes of R such that the nontrivial MVD A →→ B is satisfied, then all attributes of R are also functionally dependent on A. (An MVD A →→ B is trivial if either A is a superset of B or the union of A and B is the entire heading.)

In other words, the only nontrivial dependencies (FDs or MVDs) in R are of the form K → X (i.e., a functional dependency from a super key K to some other attribute X). Equivalently, R is in 4NF if it is in BCNF and all MVDs in R are in fact "FDs out of keys." Note in particular, therefore, that 4NF implies BCNF.

Relation CTX is not in 4NF, since it involves an MVD that is not an FD at all, let alone an FD "out of a key." The two projections CT and CX are both in 4NF. Thus 4NF is an improvement over BCNF, in that it eliminates another form of undesirable dependency. What is more, 4NF is always achievable; that is, any relation can be nonloss-decomposed into an equivalent collection of 4NF relations.

Join Dependencies and Fifth Normal Form

So far in this chapter we have assumed that the sole operation necessary or available in the further normalization process is the replacement of a relation in a nonloss way by exactly two of its projections. This assumption has successfully carried us as far as 4NF. It comes perhaps as a surprise, therefore, to discover that there exist relations that cannot be nonloss-decomposed into two projections but can be nonloss-decomposed into three (or more). To coin an ugly but convenient term, we will describe such a relation as "n-decomposable" (for some n greater than 2), meaning that the relation in question can be nonloss-decomposed into n projections but not into m for any m smaller than n. A relation that can be nonloss-decomposed into two projections we will call "2-decomposable."

Consider relation SPJ from the suppliers-parts-projects database (but ignore QTY for simplicity); a sample value is shown at the top of the figure. Note that relation SPJ is all key and involves no nontrivial FDs or MVDs at all, and is therefore in 4NF. Note too that the figure also shows:

(a) The three binary projections SP, PJ, and JS corresponding to the SPJ relation value shown at the top of the figure;
(b) The effect of joining the SP and PJ projections (over P#);
(c) The effect of joining that result and the JS projection (over J# and S#).

Figure: Relation SPJ is the join of all three of its binary projections but not of any two.

Observe that the result of the first join is to produce a copy of the original SPJ relation plus one additional (spurious) tuple, and the effect of the second join is then to eliminate that spurious tuple, thereby bringing us back to the original SPJ relation. In other words, the original SPJ relation is 3-decomposable.
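The two joins described above can be replayed on a small sample value. The tuples below are assumed for illustration (the figure itself is not reproduced here); relations are modelled as Python sets of tuples, and the joins are written as set comprehensions.

```python
# Assumed sample value for SPJ: (supplier, part, project) triples.
spj = {
    ("S1", "P1", "J2"),
    ("S1", "P2", "J1"),
    ("S2", "P1", "J1"),
    ("S1", "P1", "J1"),
}

# (a) The three binary projections.
sp = {(s, p) for s, p, j in spj}
pj = {(p, j) for s, p, j in spj}
js = {(j, s) for s, p, j in spj}

# (b) Join SP and PJ over the part number.
sp_pj = {(s, p, j) for s, p in sp for p2, j in pj if p == p2}
spurious = sp_pj - spj

# (c) Join that result with JS over project and supplier.
rejoined = {(s, p, j) for s, p, j in sp_pj if (j, s) in js}

print(sorted(spurious))  # [('S2', 'P1', 'J2')]
print(rejoined == spj)   # True
```

The first join yields five tuples, one of them spurious; the second join removes it because the pair (J2, S2) does not appear in JS.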

Note: The net result is the same whatever pair of projections we choose for the first join, though the intermediate result is different in each case. Exercise: Check this claim.

The example of the figure is of course expressed in terms of relations. However, the 3-decomposability of SPJ could be a more fundamental, time-independent property (i.e., a property satisfied by all legal values of the relation) if the relation satisfies a certain time-independent integrity constraint. To understand what that constraint must be, observe first that the statement "SPJ is equal to the join of its three projections SP, PJ and JS" is precisely equivalent to the following statement:

if the pair (s1, p1) appears in SP
and the pair (p1, j1) appears in PJ
and the pair (j1, s1) appears in JS
then the triple (s1, p1, j1) appears in SPJ

because the triple (s1, p1, j1) obviously appears in the join of SP, PJ, and JS. (The converse of this statement, that if (s1, p1, j1) appears in SPJ then (s1, p1) appears in projection SP etc., is clearly true for any degree-3 relation SPJ.) Since (s1, p1) appears in SP if and only if (s1, p1, j2) appears in SPJ for some j2, and similarly for (p1, j1) and (j1, s1), we can rewrite the statement above as a constraint on SPJ:

if (s1, p1, j2), (s2, p1, j1), (s1, p2, j1) appear in SPJ
then (s1, p1, j1) also appears in SPJ

And if this statement is true for all time, i.e., for all possible legal values of relation SPJ, then we do have a time-independent constraint on the relation (albeit a rather bizarre one). Notice the cyclic nature of that constraint ("if s1 is linked to p1 and p1 is linked to j1 and j1 is linked back to s1 again, then s1 and p1 and j1 must all coexist in the same tuple"). A relation will be n-decomposable for some n greater than 2 if and only if it satisfies some such (n-way) cyclic constraint.

Suppose then that relation SPJ does in fact satisfy that time-independent constraint (the sample values in the figure are consistent with this hypothesis). For brevity, let us agree to refer to that constraint as Constraint 3D (3D for 3-decomposable).

What does Constraint 3D mean in real-world terms? Let us try to make it a little more concrete by giving an example. The constraint says that, in the portion of the real world that relation SPJ is supposed to represent, it is a fact that, if (for example)

(a) Smith supplies monkey wrenches, and
(b) Monkey wrenches are used in the Manhattan project, and
(c) Smith supplies the Manhattan project,

then

(d) Smith supplies monkey wrenches to the Manhattan project.

Note that a, b, and c together normally do not imply d. We are saying there is no trap here, because there is an additional real-world constraint in effect, namely Constraint 3D, that makes the inference of d from a, b, and c valid in this particular case.

To return to the main topic of discussion: because Constraint 3D is satisfied if and only if the relation concerned is equal to the join of certain of its projections, we refer to that constraint as a join dependency (JD). A JD is a constraint on the relation concerned, just as an MVD or an FD is a constraint on the relation concerned.

Join dependency: Let R be a relation, and let A, B, ..., Z be subsets of the attributes of R. Then we say that R satisfies the JD *(A, B, ..., Z) (read "star A, B, ..., Z") if and only if every possible legal value of R is equal to the join of its projections on A, B, ..., Z.

For example, if we agree to use SP to mean the subset {S#, P#} of the set of attributes of SPJ, and similarly for PJ and JS, then relation SPJ satisfies the JD *(SP, PJ, JS).

We have seen, then, that relation SPJ, with its JD *(SP, PJ, JS), can be 3-decomposed. The question is, should it be? And the answer is "Probably yes." Relation SPJ (with its JD) suffers from a number of problems over update operations, problems that are removed when it is 3-decomposed.

Figure: Sample update problems in SPJ.
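One of the update problems can be made concrete by computing which tuples Constraint 3D forces into SPJ after an insert. The sketch below is a hedged illustration: the sample tuples and the function names required_by_3d and missing_for_3d are assumptions, not part of the original figure.

```python
def required_by_3d(spj):
    """Triples (s, p, j) that Constraint 3D requires to be present.

    A triple is required when (s, p), (p, j) and (j, s) each appear in
    the corresponding binary projection of spj.
    """
    sp = {(s, p) for s, p, j in spj}
    pj = {(p, j) for s, p, j in spj}
    js = {(j, s) for s, p, j in spj}
    return {
        (s, p, j)
        for s, p in sp
        for p2, j in pj
        if p == p2 and (j, s) in js
    }


def missing_for_3d(spj):
    """Triples that must still be inserted for spj to satisfy Constraint 3D."""
    return required_by_3d(spj) - spj


# Assumed starting value: it satisfies the constraint.
before = {("S1", "P1", "J2"), ("S1", "P2", "J1")}
print(missing_for_3d(before))  # set()

# Inserting one more shipment forces a second insert.
after = before | {("S2", "P1", "J1")}
print(missing_for_3d(after))   # {('S1', 'P1', 'J1')}
```

In the 3-decomposed design the same fact is recorded by inserting pairs into SP, PJ and JS, and no additional triple has to be remembered by the user.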

to decompose such a relation into smaller components.(.ey must be in fact an 'D in which that . we have A !.decomposed into its pro*ections on (A. into the pro*ections specified by the *oin dependency.!. because (as we have seen) an %-D is a special case of a JD. relation "2* is not in <1'. (@owever.because we have not yet explained what it means for a JD to be implied by candidate . 'ormally.eysGit is a fact that any relation in <1' is automatically in 91' also. of course.en as a definition of multi. or (e uivalently) that JDs are a generalization of %-Ds. () if and only if the %-Ds A [[ ! and A [[ ( hold in A. &t satisfies a certain *oin dependency.*oin normal form (2J1') if and only if every nontrivial *oin dependency that holds for + is implied by the candidate . since they do not involve any (nontrivial) JDs at all.5 +elation "2J is not in <1'. 1ote) ?e explain below what it means for a JD to be 4implied by candidate .A() if and only if it satisfies the %-Ds A !. after 8. and J" are each in <1'. "ince this theorem can be ta.decomposition.eys of +. there does not exist a still higher form of dependency such that JDs are merely a special case of that higher form G so long as we restrict our attention to dependencies that deal with a relation being decomposed via pro*ection and recomposed via *oin.eys. to the effect that +(A. that is certainly not implied by its sole candidate .decomposability is not implied by the fact that the combinations ("V. ?e have also seen that it is possible. namely (onstraint 8D. then other types of dependencies might come into play). the three pro*ections "2. 23. because (a) it can be 8 decomposed and (b) 8. which we now define) 'ifth normal form) A relation + is in <1'. + (A. That decomposition process can be repeated until all resulting relations are in fifth normal form. also called pro*ection. To state this differently. !) and (A. That is.ey being the combination of all of its attributes).ey. can now be restated as follows) 3. 
Fagin's theorem can be restated in terms of join dependencies (JDs): a relation R{A, B, C} satisfies the JD *{AB, AC} if and only if it satisfies the MVDs A →→ B | C. Since this theorem can be taken as a definition of multi-valued dependency, it follows that an MVD is just a special case of a JD.

Note: It follows from the definition that join dependencies are the most general form of dependency possible (using, of course, the term "dependency" in a very special sense), as long as relations are decomposed by projection and recomposed by join. If we permit other decomposition and recomposition operators, other kinds of dependency might come into play.

Returning now to our example, we can see that the problem with relation SPJ is that it involves a JD that is not an MVD, and hence not an FD either. It is possible, and probably desirable, to decompose such a relation into smaller components, namely the projections specified by the join dependency.

A JD *{A, B, ..., Z} is trivial if and only if one of the projections A, B, ..., Z is the identity projection of R (i.e., the projection over all attributes of R). A relation R is in fifth normal form (5NF) if and only if every nontrivial join dependency that holds for R is implied by the candidate keys of R. Relation SPJ is not in 5NF: it satisfies a JD that is not implied by the fact that {S#, P#, J#} is a candidate key (that key being the combination of all of its attributes). By contrast, the projections SP, PJ and JS are each in 5NF.

Although it might not yet be obvious, any relation in 5NF is automatically in 4NF as well, because an MVD is a special case of a JD. In fact any MVD that is implied by a candidate key must be an FD in which that candidate key is the determinant. 5NF is always achievable; that is, any given relation can be nonloss-decomposed into an equivalent collection of 5NF relations.

We now explain what it means for a JD to be implied by candidate keys. First we consider a simple example. Suppose once again that the familiar SUPPLIERS relation S has two candidate keys, {S#} and {SNAME}. Then that relation satisfies several join dependencies; for example, it satisfies the JD

*{ {S#, SNAME, STATUS}, {S#, CITY} }

That is, relation S is equal to the join of its projections on {S#, SNAME, STATUS} and {S#, CITY}, and hence can be nonloss-decomposed into those projections. (This fact does not mean that it should be so decomposed, of course, only that it could be.) This JD is implied by the fact that {S#} is a candidate key. Likewise, relation S also satisfies the JD

*{ {S#, SNAME}, {S#, STATUS}, {SNAME, CITY} }

This JD is implied by the fact that {S#} and {SNAME} are both candidate keys.

As the foregoing example suggests, a given JD *{A, B, ..., Z} is implied by candidate keys if and only if each of A, B, ..., Z is in fact a superkey for the relation in question. Thus, given a relation R, we can tell whether it is in 5NF so long as we know all candidate keys and all JDs in R. However, discovering all the JDs might itself be a nontrivial operation. That is, whereas it is relatively easy to identify FDs and MVDs (because they have a fairly straightforward real-world interpretation), the same cannot be said for JDs that are not MVDs and not FDs, because the intuitive meaning of JDs might not be obvious. Hence the process of determining when a given relation is in 4NF but not in 5NF, and so could probably be decomposed to advantage, is still unclear. Experience suggests that such relations are pathological cases and likely to be rare in practice.

In conclusion, we note that it follows from the definition that 5NF is the ultimate normal form with respect to projection and join (which accounts for its alternative name, projection-join normal form). That is, a relation in 5NF is guaranteed to be free of anomalies that can be eliminated by taking projections. For if a relation is in 5NF, the only join dependencies are those that are implied by candidate keys, and so the only valid decompositions are ones that are based on those candidate keys. (Each projection in such a decomposition will consist of one or more of those candidate keys, plus zero or more additional attributes.) For example, the SUPPLIERS relation S is in 5NF. It can be further decomposed in several nonloss ways, as we saw earlier, but every projection in any such decomposition will still include one of the original candidate keys, and hence there does not seem to be any particular advantage in further reduction.
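The idea of a join dependency that is not an MVD can be checked mechanically. The following is a minimal sketch in Python, not taken from the text: the four SPJ tuples are illustrative sample data, and the helper names row, project and natural_join are assumptions made for this example. It shows a relation that is not equal to the join of any two of its projections but is equal to the join of all three.

```python
def row(**attrs):
    """Build one tuple of a relation as a hashable set of (attribute, value) pairs."""
    return frozenset(attrs.items())

def project(relation, attrs):
    """Projection: keep only the named attributes; duplicate rows collapse."""
    return {frozenset((a, v) for a, v in r if a in attrs) for r in relation}

def natural_join(left, right):
    """Natural join: combine every pair of rows that agree on their shared attributes."""
    joined = set()
    for l in left:
        for r in right:
            dl, dr = dict(l), dict(r)
            if all(dl[a] == dr[a] for a in dl.keys() & dr.keys()):
                joined.add(frozenset({**dl, **dr}.items()))
    return joined

# Illustrative sample data (an assumption, not given in the text).
SPJ = {
    row(S="S1", P="P1", J="J2"),
    row(S="S1", P="P2", J="J1"),
    row(S="S2", P="P1", J="J1"),
    row(S="S1", P="P1", J="J1"),
}

SP = project(SPJ, {"S", "P"})
PJ = project(SPJ, {"P", "J"})
JS = project(SPJ, {"J", "S"})

two_way = natural_join(SP, PJ)         # contains a spurious tuple
three_way = natural_join(two_way, JS)  # the spurious tuple is filtered out again

print(len(SPJ), len(two_way), len(three_way))
```

With this sample data the two-way join produces one tuple that was never in SPJ, while the three-way join gives back exactly the original relation. That is the situation in which the relation satisfies the JD *{SP, PJ, JS} without satisfying any MVD.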

The Normalization Procedure Summarized

Up to this point in this chapter, we have been concerned with the technique of nonloss decomposition as an aid to database design. The basic idea is as follows: Given some 1NF relation R and some set of FDs, MVDs, and JDs that apply to R, we systematically reduce R to a collection of "smaller" (i.e., lower-degree) relations that are equivalent to R in a certain well-defined sense but are also in some way more desirable. (The original relation R might have been obtained by first eliminating certain relation-valued attributes.) Each step of the reduction process consists of taking projections of the relations resulting from the preceding step. The given constraints are used at each step to guide the choice of which projections to take next. The overall process can be stated informally as a set of rules, thus:

1. Take projections of the original 1NF relation to eliminate any FDs that are not irreducible. This step will produce a collection of 2NF relations.
2. Take projections of those 2NF relations to eliminate any transitive FDs. This step will produce a collection of 3NF relations.
3. Take projections of those 3NF relations to eliminate any remaining FDs in which the determinant is not a candidate key.

System Implementation Techniques (Part 1)

Q. 1. What is database security? Explain the mechanism for maintaining database security.

Ans. Security in a database involves both policies and mechanisms to protect the data and ensure that it is not accessed, altered or deleted without proper authorization. As information becomes increasingly important in an organization, more and more databases are created day by day. So all databases should be secured from unauthorized access or manipulation by unknown persons. Data has to be protected in the database.

There are two dimensions for the protection of data in the database. First, a certain class of data is available only to those persons who are authorized to access it. This makes the data confidential; e.g., the medical records of patients in a hospital are accessible only to health care officers. Second, the data must be protected from accidental or intentional corruption or destruction; e.g., data on national defense is vital to the security of a state, and safety depends on the data processed in a chemical plant. In addition to the economic or strategic reasons for protecting data from unauthorized access, corruption or destruction, there is a privacy dimension for data security and integrity.

Security and Integrity Threats:

Some types of threats can only be addressed using social, behavioral and control mechanisms. The threats are either accidental or intentional, so there are two types of security and integrity threats.

Accidental Security and Integrity Threats

Some accidental security and integrity threats are:

1. A user can get access to a portion of the database which other users cannot access. If that user damages a part of the data accidentally, then the whole data may be corrupted; e.g., if an application programmer accidentally deletes some function or subroutine, then the whole of the program in the database will be affected.
2. Sometimes the failure of one portion affects the whole data. For example, if the power supply fails during transaction processing, the computed data will not be transferred to the storage device and so the data will be lost. Proper recovery procedures are normally used to recover from failures occurring during transaction processing.
3. Sometimes concurrent processing or concurrent usage of data causes problems, and data is lost or damaged.
4. Sometimes system errors occur. A dial-in user may be assigned the identity of another dial-in user who was disconnected accidentally or who hung up without going through a log-off procedure.
5. Sometimes improper authorization causes problems, which could lead to database security and/or integrity violation.
6. Hardware failure also causes the problem of data destruction. To guard against hardware failure, security and integrity mechanisms are needed.

Malicious or Intentional Security and Integrity Threats

Some intentional security and integrity threats are:

1. A computer system operator or system programmer can intentionally bypass the normal security and integrity mechanisms, alter or destroy the data in the database, or make unauthorized copies of sensitive data.
2. An unauthorized user can get access to a secure terminal or the password of an authorized user and compromise the database. Such a user could also destroy the database file.

3. Authorized users could pass on sensitive information under pressure or for personal gain.
4. System and application programmers could bypass normal security in their programs by directly accessing database files, making changes and copies for illegal use.
5. An unauthorized person can get access to the computer system, physically or by using a communication channel, and compromise the database.

Protection

Four levels of defense (protection) are generally recognized for database security. These are:

(a) Human Factors: These encompass the ethical, legal and social environments. An organization depends on these to provide a certain degree of protection.
(b) Physical Security: The mechanisms include appropriate locks and keys and an entry log to the computing facility and terminals. Security of the physical storage devices (magnetic tapes, disk packs, etc.) within the organization, and when being transmitted from one location to another, must be maintained. User identification and passwords have to be kept confidential, otherwise an unauthorized user can compromise the database.
(c) Administrative Controls: Controls are the security and access control policies that determine what information will be accessible to what class of user and the type of access that will be allowed to this class.
(d) OS and DBMS Mechanisms: These provide strong security features. The operating system protects data and programs in both primary and secondary memory, and users are also established by the operating system. The DBMS provides transaction management, and audit and recovery data through its logging process. The DBMS also has integrity constraints and validation procedures for checking users and procedures.

Protection and Version Methods of Protection:

Protection

Protection is the branch of security concerned with keeping data safe from unauthorized access by using different mechanisms. Some protection methods are:

(a) Identification and Authentication: The authorization mechanism prepares the user profile for a user and indicates the portion of the database accessible to that user and the mode of access allowed. The enforcement of the security policies in the database system requires that the system knows the identity of the user making the request. So before making any request, the user has to identify herself or himself to the system and authenticate the identification to confirm that the user is in fact the correct person. The simplest and most common authentication scheme is a password. The user enters the user name or number and then authenticates herself or himself by the password. These are used once, for the initial sign-on to the system. But for sensitive and important data, authentication and identification procedures can be applied at every step. Sometimes a badge, card or keys are used for access.

(b) Distributed System Protection: For the protection of data, security enforcement in a distributed system can be enhanced by distribution. Sensitive information can be fragmented and stored at dispersed sites. The leakage of some portion of the fragmented data may not be as disastrous as the leakage of unfragmented data. Also, with distribution, different sites can have different levels of security and protection of data. This is a very good way to protect data.

(c) Cryptography and Encryption: Suppose the defence forces want to transmit a message in a protected way. The message is:

"Tanks are coming towards AMRITSAR"

One method of transmitting this message is to substitute a different character of the alphabet for each character in the message. If we ignore the spaces between words and the punctuation, and the substitution is made by shifting each character by a different random amount, then the above message can be transformed into:

"bolrbsfdpnn*ohupxbsebnB*ulbs

The above process is cryptography. This is also called encryption of data. Before transmission, data should be encrypted.

Integrity

Integrity implies that any properly authorized access, alteration or deletion of data in the database does not change the validity of the data. Security and integrity concepts are distinct but are related to each other. In practice, integrity depends on security. The mechanism that is applied to ensure that the data in the database is correct and consistent is called data integrity.

System Integrity: Database integrity involves the correctness of data. This correctness has to be preserved in the presence of concurrent operations, errors in the user's operations and application programs, and failures in hardware and software. Integrity includes a recovery system for lost and damaged data and also checks the data stored in memory. Integrity also covers the maintenance of data that has been damaged by an unauthorized person.

Data Integrity: This requires guarding against invalid database operations. An operation here is used to indicate any action performed on behalf of a user or application program that modifies the state of the database. Such operations are the result of actions such as update, insert or delete. In database integrity there are some types of constraints that the database has to enforce to maintain the consistency and validity of the data. Integrity constraints are hard to understand when the constraint rules are coded in application programs. Centralizing the integrity checking directly under the DBMS reduces duplication and ensures the consistency and validity of the database. The centralized integrity constraints can be maintained in a system catalog (data dictionary) and can be accessible to the database users via the query language.

Q. 2. What is database security?

Ans. Security in a database involves both policies and mechanisms to protect the data and ensure that it is not accessed, altered or deleted without proper authorization. As information becomes increasingly important in an organization, more and more databases are created day by day. So all databases should be secured from unauthorized access or manipulation by unknown persons. Data has to be protected in the database.

There are two dimensions for the protection of data in the database. First, a certain class of data is available only to those persons who are authorized to access it. This makes the data confidential; e.g., the medical records of patients in a hospital are accessible only to health care officers. Second, the data must be protected from accidental or intentional corruption or destruction; e.g., data on national defense is vital to the security of a state, and safety depends on the data processed in a chemical plant. In addition to the economic or strategic reasons for protecting data from unauthorized access, corruption or destruction, there is a privacy dimension for data security and integrity.

Q. 3. Discuss the concept of transaction in detail. Or What are the desirable properties of transaction?

Ans. A transaction is a logical unit of work that contains one or more SQL statements. A transaction is an atomic unit. The effects of all the SQL statements in a transaction can be either all committed (applied to the database) or all rolled back (undone from the database). In other words, a transaction is a logical unit of work that must be either entirely completed or aborted; no intermediate states are acceptable.

Most real-world database transactions are formed by two or more database requests. A database request is the equivalent of a single SQL statement in an application program or transaction. A transaction that changes the contents of the database must alter the database from one consistent database state to another. To ensure consistency of the database, every transaction must begin with the database in a known consistent state.

A transaction begins with the first executable SQL statement. A transaction ends when it is committed or rolled back, either explicitly with a COMMIT or ROLLBACK statement or implicitly when a DDL statement is issued.
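The all-or-nothing behaviour described above can be tried out with Python's built-in sqlite3 module. The sketch below is only an illustration: the table name account, the account names and the balances are hypothetical, and SQLite stands in for whatever DBMS is actually in use. The transfer either applies both updates and commits, or raises an error and is rolled back as a whole.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates are committed or neither is."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [("savings", 100), ("checking", 20)]
)
conn.commit()

first = transfer(conn, "savings", "checking", 30)    # fits: committed
second = transfer(conn, "savings", "checking", 500)  # overdraws: rolled back
print(first, second, balances(conn))
```

The second transfer leaves no trace: its debit of the savings account is undone together with the rest of the transaction, so the database moves only between consistent states.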

Consider a banking database. When a bank customer transfers money from a savings account to a checking account, the transaction can consist of three separate operations:

• Decrement the savings account
• Increment the checking account
• Record the transaction in the transaction journal

If all three SQL statements can be performed to maintain the accounts in proper balance, the effects of the transaction can be applied to the database. However, if a problem such as insufficient funds, invalid account number, or a hardware failure prevents one or two of the statements in the transaction from completing, the entire transaction must be rolled back so that the balance of all accounts is correct. Figure 2 illustrates the banking transaction example.

Statement Execution and Transaction Control: A SQL statement that runs successfully is different from a committed transaction. Executing successfully means that a single statement was:

• Parsed
• Found to be a valid SQL construction
• Run without error as an atomic unit. For example, all rows of a multirow update are changed.

However, until the transaction that contains the statement is committed, the transaction can be rolled back, and all of the changes of the statement can be undone. A statement, rather than a transaction, runs successfully.

Committing means that a user has explicitly or implicitly requested that the changes in the transaction be made permanent. An explicit request means that the user issued a COMMIT statement. An implicit request can be made, for example, through normal termination of an application or in data definition language. The changes made by the SQL statements of your transaction become permanent and visible to other users only after your transaction has been committed. Only other users' transactions that started after yours will see the committed changes.

You can name a transaction using the SET TRANSACTION ... NAME statement before you start the transaction. This makes it easier to monitor long-running transactions and to resolve in-doubt distributed transactions.

Statement-Level Rollback: If at any time during execution a SQL statement causes an error, all effects of the statement are rolled back. The effect of the rollback is as if that statement had never been run. This operation is a statement-level rollback.

Errors discovered during SQL statement execution cause statement-level rollbacks. An example of such an error is attempting to insert a duplicate value in a primary key. Single SQL statements involved in a deadlock (competition for the same data) can also cause a statement-level rollback. Statements that fail during parsing, for example because of a syntax error, have not yet been run, so they do not cause a statement-level rollback.

A SQL statement that fails causes the loss only of any work it would have performed itself. It does not cause the loss of any work that preceded it in the current transaction. If the statement is a DDL statement, then the implicit commit that immediately preceded it is not undone. The user can also request a statement-level rollback by issuing a ROLLBACK statement. Note that users cannot directly refer to implicit savepoints in rollback statements.
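Statement-level rollback can be observed with a small experiment. The sketch below uses Python's sqlite3 module purely as a stand-in (the text describes a different DBMS, and the table name item is hypothetical). SQLite's default handling of a constraint violation matches the behaviour described: the failing statement is backed out, earlier work in the same transaction is kept, and the transaction stays open.

```python
import sqlite3

# isolation_level=None lets us issue BEGIN and COMMIT ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, label TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO item VALUES (1, 'first')")  # succeeds

try:
    # One multirow statement: its second row duplicates primary key 1,
    # so the whole statement fails, including the row with id 2.
    conn.execute("INSERT INTO item VALUES (2, 'second'), (1, 'duplicate')")
    statement_failed = False
except sqlite3.IntegrityError:
    statement_failed = True

still_open = conn.in_transaction  # the transaction itself was not ended
rows = conn.execute("SELECT id, label FROM item ORDER BY id").fetchall()
conn.execute("COMMIT")
print(statement_failed, still_open, rows)
```

Only the row inserted by the earlier, successful statement survives. The failed statement behaves as if it had never been run, and the application is free to continue or commit.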

Resumable Space Allocation

Some DBMSs provide a means for suspending, and later resuming, the execution of large database operations in the event of space allocation failures. This enables an administrator to take corrective action, instead of the database server returning an error to the user. After the error condition is corrected, the suspended operation automatically resumes. This feature is called resumable space allocation and the statements that are affected are called resumable statements. A statement runs in a resumable mode only when the client explicitly enables resumable semantics for the session using the ALTER SESSION statement.

Resumable space allocation is suspended when one of the following conditions occurs:

• Out of space condition
• Maximum extents reached condition
• Space quota exceeded condition

For nonresumable space allocation, these conditions result in errors and the statement is rolled back. Suspending a statement automatically results in suspending the transaction. Thus all transactional resources are held through a statement suspend and resume. When the error condition disappears (for example, as a result of user intervention or perhaps sort space released by other queries), the suspended statement automatically resumes execution.

Process of Transaction

A DBMS must provide a transaction processing system (TP system) to guarantee that if the transaction executes some updates and then a failure occurs for some reason before the transaction reaches its termination, then those updates will be undone. Therefore the transaction either executes in its entirety or is totally canceled. Transaction processing systems provide tools to help software development for applications that involve querying and updating databases. The term "TP system" is generally taken to mean a complete system, including application generators, one or more database systems, utilities and networking software. Within a TP system, there is a core collection of services, called the TP monitor, that coordinates the flow of transactions through the system. (Fig.)

In order to work properly, a transaction-processing system needs the following system requirements:

• High Availability: The system must be on-line and operational while the enterprise is functioning.
• High Reliability: Correctly tracks state, does not lose data, controlled concurrency.
• High Throughput: Many users => many transactions/sec.
• Low Response Time: On-line => users are waiting.
• Long Lifetime: Complex systems are not easily replaced. They must be designed so they can be easily extended as the needs of the enterprise change.
• Security: Sensitive information must be carefully protected since the system is accessible to many users.

Roles in Design, Implementation, and Maintenance of a TPS

• System Analyst: specifies the system using input from the customer and also provides a complete description of functionality from the customer's and users' point of view.
• Database Designer: specifies the structure of the data that will be stored in the database.
• Application Programmer: implements application programs (transactions) that access data and support enterprise rules.
• Database Administrator: maintains the database once the system is operational: space allocation, performance optimization, database security.
• System Administrator: maintains the transaction-processing system, monitors the interconnection of hardware and software modules, deals with failures and congestion.

A transaction begins when the first executable SQL statement is encountered. An executable SQL statement is a SQL statement that generates calls to an instance, including DML and DDL statements.

When a transaction begins, DBMS assigns the transaction to an available undo tablespace or rollback segment to record the rollback entries for the new transaction. A transaction ends when any of the following occurs:

• A user issues a COMMIT or ROLLBACK statement without a SAVEPOINT clause.
• A user runs a DDL statement such as CREATE, DROP, RENAME, or ALTER. If the current transaction contains any DML statements, DBMS first commits the transaction, and then runs and commits the DDL statement as a new, single statement transaction.
• A user disconnects from DBMS. The current transaction is committed.
• A user process terminates abnormally. The current transaction is rolled back.

After one transaction ends, the next executable SQL statement automatically starts the following transaction. Note that applications should always explicitly commit or roll back transactions before program termination.

Commit Transactions

Committing a transaction means making permanent the changes performed by the SQL statements within the transaction. Before a transaction that modifies data is committed, the following has occurred:

• DBMS has generated rollback segment records in buffers in the SGA that store rollback segment data. The rollback information contains the old data values changed by the SQL statements of the transaction.
• DBMS has generated redo log entries in the redo log buffer of the SGA. The redo log record contains the change to the data block and the change to the rollback block. These changes may go to disk before a transaction is committed.
• The changes have been made to the database buffers of the SGA. These changes may go to disk before a transaction is committed.

Note that the data changes for a committed transaction, stored in the database buffers of the SGA, are not necessarily written immediately to the data files by the database writer (DBWn) background process. This writing takes place when it is most efficient for the database to do so. It can happen before the transaction commits or, alternatively, it can happen some time after the transaction commits.

When a transaction is committed, the following occurs:

• The internal transaction table for the associated rollback segment records that the transaction has committed, and the corresponding unique system change number (SCN) of the transaction is assigned and recorded in the table.
• The log writer process (LGWR) writes redo log entries in the SGA's redo log buffers to the online redo log file. It also writes the transaction's SCN to the online redo log file. This atomic event constitutes the commit of the transaction.
• DBMS releases locks held on rows and tables.
• DBMS marks the transaction complete.

Rollback of Transactions: Rolling back means undoing any changes to data that have been performed by SQL statements within an uncommitted transaction. DBMS uses undo tablespaces or rollback segments to store old values. The redo log contains a record of changes. DBMS lets you roll back an entire uncommitted transaction. Alternatively, you can roll back the trailing portion of an uncommitted transaction to a marker called a savepoint.

All types of rollbacks use the same procedures:

• Statement-level rollback (due to statement or deadlock execution error)
• Rollback to a savepoint
• Rollback of a transaction due to user request
• Rollback of a transaction due to abnormal process termination
• Rollback of all outstanding transactions when an instance terminates abnormally
• Rollback of incomplete transactions during recovery

In rolling back an entire transaction, without referencing any savepoints, the following occurs:

1. DBMS undoes all changes made by all the SQL statements in the transaction by using the corresponding undo tablespace or rollback segment.
2. DBMS releases all the transaction's locks of data.
3. The transaction ends.

Savepoints in Transactions

You can declare intermediate markers called savepoints within the context of a transaction. Savepoints divide a long transaction into smaller parts. Using savepoints, you can arbitrarily mark your work at any point within a long transaction. You then have the option later of rolling back work performed before the current point in the transaction but after a declared savepoint within the transaction. For example, you can use savepoints throughout a long complex series of updates, so if you make an error, you do not need to resubmit every statement.

Savepoints are similarly useful in application programs. If a procedure contains several functions, then you can create a savepoint before each function begins. Then, if a function fails, it is easy to return the data to its state before the function began and re-run the function with revised parameters or perform a recovery action.

After a rollback to a savepoint, DBMS releases the data locks obtained by rolled back statements. Other transactions that were waiting for the previously locked resources can proceed. Other transactions that want to update previously locked rows can do so.

When a transaction is rolled back to a savepoint, the following occurs:

• DBMS rolls back only the statements run after the savepoint.
• DBMS preserves the specified savepoint, but all savepoints that were established after the specified one are lost.
• DBMS releases all table and row locks acquired since that savepoint but retains all data locks acquired previous to the savepoint.

The transaction remains active and can be continued. Note that whenever a session is waiting on a transaction, a rollback to savepoint does not free row locks. To make sure a transaction doesn't hang if it cannot obtain a lock, use FOR UPDATE ... NOWAIT before issuing UPDATE or DELETE statements.
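The effect of rolling back to a savepoint can be seen in a short experiment. The sketch below again uses Python's sqlite3 module as a stand-in for the DBMS discussed in the text; the table name log and the savepoint name before_step_2 are hypothetical. It does not demonstrate the locking behaviour, only which statements are undone.

```python
import sqlite3

# isolation_level=None lets us issue BEGIN, SAVEPOINT and COMMIT ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (step TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO log VALUES ('step 1')")

conn.execute("SAVEPOINT before_step_2")              # intermediate marker
conn.execute("INSERT INTO log VALUES ('step 2 (mistake)')")

# Undo only the work done after the marker; step 1 is untouched
# and the transaction remains active.
conn.execute("ROLLBACK TO SAVEPOINT before_step_2")
active_after_rollback = conn.in_transaction

conn.execute("INSERT INTO log VALUES ('step 2 (fixed)')")
conn.execute("COMMIT")

steps = [r[0] for r in conn.execute("SELECT step FROM log ORDER BY rowid")]
print(active_after_rollback, steps)
```

Only the statement issued after the savepoint is discarded, so the earlier insert does not have to be resubmitted before the corrected statement is run and the transaction committed.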

Transaction Naming

You can name a transaction, using a simple and memorable text string. This name is a reminder of what the transaction is about. Transaction names replace commit comments for distributed transactions, with the following advantages:

• It is easier to monitor long-running transactions and to resolve in-doubt distributed transactions.
• You can view transaction names along with transaction IDs in applications. For example, a database administrator can view transaction names in Enterprise Manager when monitoring system activity.
• Transaction names are written to the transaction auditing redo record.
• Log Miner can use transaction names to search for a specific transaction from transaction auditing records in the redo log.
• You can use transaction names to find a specific transaction in data dictionary tables, such as V$TRANSACTION.

Name a transaction using the SET TRANSACTION ... NAME statement before you start the transaction. When you name a transaction, you associate the transaction's name with its ID. Transaction names do not have to be unique; different transactions can have the same transaction name at the same time by the same owner. You can use any name that enables you to distinguish the transaction.

The Two-Phase Commit Mechanism

In a distributed database, DBMS must coordinate transaction control over a network and maintain data consistency, even if a network or system failure occurs. A distributed transaction is a transaction that includes one or more statements that update data on two or more distinct nodes of a distributed database.

A two-phase commit mechanism guarantees that all database servers participating in a distributed transaction either all commit or all roll back the statements in the transaction. A two-phase commit mechanism also protects implicit DML operations performed by integrity constraints, remote procedure calls, and triggers.

The two-phase commit mechanism is completely transparent to users who issue distributed transactions. In fact, users need not even know the transaction is distributed. A COMMIT statement denoting the end of a transaction automatically triggers the two-phase commit mechanism to commit the transaction.
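The all-commit-or-all-roll-back guarantee can be pictured with a toy model. The following Python sketch is an assumption-laden illustration, not the mechanism of any particular DBMS: the class Participant, the function two_phase_commit and the can_commit flag are hypothetical names, and real implementations must also handle logging, timeouts and node failures. In the first phase every node is asked to prepare and vote; in the second phase all nodes commit only if every vote was yes.

```python
class Participant:
    """Hypothetical database node taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: the node votes and, if it votes yes, promises to commit on request.
        self.state = "prepared" if self.can_commit else "refused"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"


def two_phase_commit(participants):
    """Coordinator: commit everywhere only if every participant voted yes."""
    votes = [p.prepare() for p in participants]  # phase 1: collect all votes
    if all(votes):
        for p in participants:                   # phase 2: global commit
            p.commit()
        return "committed"
    for p in participants:                       # phase 2: global rollback
        p.rollback()
    return "rolled back"


healthy = [Participant("node A"), Participant("node B")]
outcome_ok = two_phase_commit(healthy)

mixed = [Participant("node A"), Participant("node B", can_commit=False)]
outcome_bad = two_phase_commit(mixed)
print(outcome_ok, outcome_bad)
```

A single refusal is enough to roll back the work on every node, including the nodes that were ready to commit, which is what keeps the distributed database consistent.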

No coding or complex statement syntax is required to include distributed transactions within the body of a database application.

The recoverer (RECO) background process automatically resolves the outcome of in-doubt distributed transactions, that is, distributed transactions in which the commit was interrupted by any type of system or network failure. After the failure is repaired and communication is reestablished, the RECO process of each local DBMS server automatically commits or rolls back any in-doubt distributed transactions consistently on all involved nodes.

In the event of a long-term failure, DBMS allows each local administrator to manually commit or roll back any distributed transactions that are in doubt as a result of the failure. This option enables the local database administrator to free any locked resources that are held indefinitely as a result of the long-term failure.

If a database must be recovered to a point in the past, DBMS's recovery facilities enable database administrators at other sites to return their databases to the earlier point in time also. This operation ensures that the global database remains consistent.

Read-Write Operation

Since a transaction is a general program, there are an enormous number of potential operations that a transaction can perform. However, there are only two really important operations:

1. read(A,t) (or read(A) when t is not important): This operation is used to read database element A into local variable t.
2. write(A,t) (or write(A) when t is not important): This operation is used to write the value of local variable t to the database element A.

We will assume that the buffer manager ensures that the database element is in memory. We could make the memory management more explicit by using the following operations:

3. input(A): This operation is used to read database element A into a local memory buffer.
4. output(A): This operation is used to copy the block containing A to disk.

Let us consider an example to understand the use of read and write operations. Suppose that we want to transfer $50 from account A to account B; then the set of operations performed are:

1. read(A,t)
2. t = t - 50

3. write(A,t)
4. read(B,t)
5. t = t + 50
6. write(B,t)

The first step is used to read the amount in account A with the help of the read operation into the local variable t. In step 2, we reduce the value of t by 50. Step 3 is used to write back the updated value to account A with the help of the write operation. In step 4, the value of account B is read into local variable t, which is incremented by 50 in step 5. The value of t is written to account B in step 6 with the help of the write operation.

Transaction Properties (ACID Properties)

A transactional unit of work is one in which the following four fundamental transactional properties are satisfied: atomicity, consistency, isolation, and durability (ACID). We will examine each property in detail.

ATOMICITY

It is common to refer to a transaction as a "unit of work." In describing a transaction as a unit of work, we are describing one fundamental property of a transaction: that the activities within it must be considered indivisible, that is, atomic.

A Flute Bank customer may interact with Flute's ATM and transfer money from a checking account to a savings account. Within the Flute Bank software system, a transfer transaction involves two actions: debit of the checking account and credit to the savings account. For the transfer transaction to be successful, both actions must complete successfully. If either one fails, the transaction fails. The atomic property of transactions dictates that all individual actions that constitute a transaction must succeed for the transaction to succeed, and, conversely, that if any individual action fails, the transaction as a whole must fail.

Results of a transaction's execution are either all committed or all rolled back. All changes take effect, or none do. Any change to system state within a transaction boundary, therefore, has to ensure that the change leaves the system in a stable and consistent state.

As an example consider two transactions:

T1: BEGIN A = A + 100, B = B - 100 END
T2: BEGIN A = 1.06*A, B = 1.06*B END

Intuitively, the first transaction is transferring $100 from B's account to A's account. The second is crediting both accounts with a 6% interest payment. There is no guarantee that T1 will execute before T2 or vice versa, if both are submitted together. However, the net effect must be equivalent to these two transactions running serially in some order.

A possible interleaving (schedule), listed in time order:

T1: A = A + 100
T2: A = 1.06*A
T1: B = B - 100
T2: B = 1.06*B

This is OK. But what about this:

T1: A = A + 100
T2: A = 1.06*A
T2: B = 1.06*B
T1: B = B - 100

The DBMS's view of the second schedule is:

T1: read(A), write(A)
T2: read(A), write(A), read(B), write(B)
T1: read(B), write(B)

Fig. A possible interleaving (schedule)

Consistency

The database is transformed from one valid state to another valid state. This defines a transaction as legal only if it obeys user-defined integrity constraints. Illegal transactions aren't allowed and, if an integrity constraint can't be satisfied, the transaction is rolled back. For example, suppose that you define a rule that, after a transfer of more than $10,000 out of the country, a row is added to an audit table so that you can prepare a legally required report for the IRS. Perhaps for performance reasons that audit table is stored on a separate disk from the rest of the database. If the audit table's disk is off-line and can't be written, the transaction is aborted.

A transaction that changes the data must ensure that the data remains in a consistent state, that is, that data integrity rules are not violated, regardless of whether the transaction succeeded or failed. A database or other persistent store usually defines referential and entity integrity rules to ensure that data in the store is consistent. The data in the store may not be consistent during the duration of the transaction, but the inconsistency is invisible to other transactions, and consistency must be restored when the transaction completes.

Isolation

When multiple transactions are in progress, one transaction may want to read the same data another transaction has changed but not committed. Until the transaction commits, the changes it has made should be treated as transient state, because the transaction could roll back the change. If other transactions read intermediate or transient states caused by a transaction in progress, additional application logic must be executed to handle the effects of some transactions having read potentially erroneous data.

The isolation portion of the ACID properties is needed when there are concurrent transactions. Concurrent transactions are transactions that occur at the same time, such as multiple users accessing shared objects. This situation is illustrated at the top of Figure 5 as activities occurring over time. The safeguards used by a DBMS to prevent conflicts between concurrent transactions are a concept referred to as isolation.

The isolation property of transactions dictates how concurrent transactions that act on the same subset of data behave. That is, the isolation property determines the degree to which effects of multiple transactions, acting on the same subset of application state, are isolated from each other.

At the lowest level of isolation, a transaction may read data that is in the process of being changed by another transaction but that has not yet been committed. If the first transaction is rolled back, the transaction that read the data would have read a value that was not committed. This level of isolation, read uncommitted or "dirty read", can cause erroneous results but ensures the highest concurrency.

An isolation of read committed ensures that a transaction can read only data that has been committed. This level of isolation is more restrictive (and consequently provides less concurrency) than a read uncommitted isolation level and helps avoid the problem associated with the latter level of isolation.

An isolation level of repeatable read signifies that a transaction that read a piece of data is guaranteed that the data will not be changed by another transaction until the transaction completes. The name "repeatable read" for this level of isolation comes from the fact that a transaction with this isolation level can read the same data repeatedly and be guaranteed to see the same value.

The most restrictive form of isolation is serializable. This level of isolation combines the properties of repeatable-read and read-committed isolation levels, effectively ensuring that transactions that act on the same piece of data are serialized and will not execute concurrently.
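To see why serial equivalence matters, the transfer transaction T1 and the interest transaction T2 from earlier in this section can be replayed step by step. The following is only a sketch: the starting balance of 1000 per account, the integer arithmetic and all function names are assumptions made for illustration.

```python
# Sketch only: balances are whole currency units and both accounts are
# assumed to start at 1000 (an illustrative value).

def t1_credit_a(db):      # T1: A = A + 100
    db["A"] += 100

def t1_debit_b(db):       # T1: B = B - 100
    db["B"] -= 100

def t2_interest_a(db):    # T2: A = 1.06 * A (integer arithmetic keeps it exact)
    db["A"] = db["A"] * 106 // 100

def t2_interest_b(db):    # T2: B = 1.06 * B
    db["B"] = db["B"] * 106 // 100

def run(schedule, start=1000):
    """Apply the steps of a schedule in order and return the final balances."""
    db = {"A": start, "B": start}
    for step in schedule:
        step(db)
    return db

# The two serial executions define the only acceptable final states.
serial_outcomes = [
    run([t1_credit_a, t1_debit_b, t2_interest_a, t2_interest_b]),  # T1 then T2
    run([t2_interest_a, t2_interest_b, t1_credit_a, t1_debit_b]),  # T2 then T1
]

# Interleaving in which T1 stays ahead of T2 on both accounts.
ok_schedule = run([t1_credit_a, t2_interest_a, t1_debit_b, t2_interest_b])

# Interleaving in which T2 pays interest on B before T1 has debited it.
bad_schedule = run([t1_credit_a, t2_interest_a, t2_interest_b, t1_debit_b])

print(ok_schedule in serial_outcomes)   # matches the serial order T1, T2
print(bad_schedule in serial_outcomes)  # matches neither serial order
```

In the second interleaving the bank pays interest on money that T1 is about to remove from B, so the final state cannot be produced by any serial order.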

Fig. 5. Concurrently executing transactions

As an example, if two people are updating the same catalog item, it's not acceptable for one person's changes to be "clobbered" when the second person saves a different set of changes. Both users should be able to work in isolation, working as though he or she is the only user. Each set of changes must be isolated from those of the other users.

An important concept to understanding isolation through transactions is serializability. Transactions are serializable when the effect on the database is the same whether the transactions are executed in serial order or in an interleaved fashion. As you can see at the top of the figure, Transaction 1 through Transaction 3 are executing concurrently over time. The effect on the DBMS is that the transactions may execute in serial order based on consistency and isolation requirements. If you look at the bottom of the figure, you can see several ways in which these transactions may execute. It is important to note that a serialized execution does not imply that the first transactions will automatically be the ones that terminate before other transactions in the serial order.

Degrees of Isolation

• Degree 0: A transaction does not overwrite data updated by another user or process ("dirty data") of other transactions.
• Degree 1: Degree 0 plus a transaction does not commit any writes until it completes all its writes (until the end of transaction).
• Degree 2: Degree 1 plus a transaction does not read dirty data from other transactions.
• Degree 3: Degree 2 plus other transactions do not dirty data read by a transaction before the transaction commits.

These were originally described as degrees of consistency by Jim Gray.

For example, let us consider two transactions:

• The first transaction transfers $100 from B's account to A's.
• The second transaction credits both accounts with 6% interest.

Let us assume at first A and B each have $1000. Then what are the legal outcomes of running T1 and T2? There is no guarantee that T1 will execute before T2 or vice versa, if both are submitted together. Consider a possible interleaved schedule.

Durability

Once committed (completed), the results of a transaction are permanent and survive future system and media failures. If the airline reservation system computer gives you seat 22A and crashes a millisecond later, it won't have forgotten that you are sitting in 22A and also give it to someone else. Furthermore, if a programmer spills coffee into a disk drive, it will be possible to install a new disk and recover the transactions up to the coffee spill, showing that you had seat 22A.

The durability property of transactions refers to the fact that the effect of a transaction must endure beyond the life of a transaction and application. That is, state changes made within a transactional boundary must be persisted onto permanent storage media, such as disks, databases, or file systems. If the application fails after the transaction has committed, the system should guarantee that the effects of the transaction will be visible when the application restarts.

Transactional resources are also recoverable: should the persisted data be destroyed, recovery procedures can be executed to recover the data to a point in time (provided the necessary administrative tasks were properly executed). Any change committed by one transaction must be durable until another valid transaction changes the data.

States of Transaction

During its execution, a transaction can be in many states. These states indicate the status of a transaction. The various states in which a transaction can be are:

1. Active: This is the initial state of a transaction. A transaction enters the active state when the first query or update is encountered, and it stays in this state while it is executing. Data is processed in a buffer or on disk.

2. Partially committed: A transaction is partially committed after its final statement has been executed. A transaction enters this state immediately before the "commit work". All operations are completed (in the memory buffer or on disk) and wait to be finalized. A transaction may change its state from active to partially committed.

3. Failed: A transaction enters the failed state after the discovery that normal execution can no longer proceed. A transaction enters this state when it is interrupted by an event such as a program exception or a system failure. A transaction may change its state from active to failed.

4. Aborted: A transaction is aborted after it has been rolled back and the database restored to its prior state before the transaction. A transaction enters this state after a "rollback work" or at system recovery. All updates made by the transaction are rolled back and the database is restored to the state prior to the start of the transaction. There are two options after abort:
• Restart the transaction: This option is selected only if there is no internal logical error.
• Kill the transaction: This option is selected if there is a problem with the transaction itself.

5. Committed: The commit state occurs after successful completion. A transaction enters this state after "commit work". Updates are guaranteed to be permanent.

Terminated may also be considered a transaction state.
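The states listed above form a small state machine, which can be sketched as follows. The class and constant names are hypothetical, and the edge from partially committed to failed is an assumption taken from the usual textbook diagram rather than from the text above.

```python
from enum import Enum

class TxState(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"

# Allowed moves between states. PARTIALLY_COMMITTED -> FAILED is an
# assumption (a failure while the commit is being finalized).
TRANSITIONS = {
    TxState.ACTIVE: {TxState.PARTIALLY_COMMITTED, TxState.FAILED},
    TxState.PARTIALLY_COMMITTED: {TxState.COMMITTED, TxState.FAILED},
    TxState.FAILED: {TxState.ABORTED},
    TxState.ABORTED: set(),      # final: the transaction is restarted or killed
    TxState.COMMITTED: set(),    # final: updates are permanent
}

class Transaction:
    """Tracks the state of one transaction and rejects illegal moves."""

    def __init__(self):
        self.state = TxState.ACTIVE
        self.history = [self.state]

    def move_to(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.value} -> {new_state.value}"
            )
        self.state = new_state
        self.history.append(new_state)

# A transaction that completes normally.
ok = Transaction()
ok.move_to(TxState.PARTIALLY_COMMITTED)
ok.move_to(TxState.COMMITTED)

# A transaction interrupted while active, then rolled back.
failed = Transaction()
failed.move_to(TxState.FAILED)
failed.move_to(TxState.ABORTED)
```

Keeping the legal moves in one table makes it easy to check a diagram against the prose: any move not listed raises an error.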

Fig. Transaction State Diagram

Advantages of Concurrent Execution of Transaction

Concurrent execution of transactions means executing more than one transaction at the same time. Thus, multiple transactions are allowed to run concurrently in the system. The schedule shown in the figure represents an interleaved execution of two transactions. Ensuring transaction isolation while permitting such concurrent execution is difficult.

Advantages of using concurrent execution of transactions are:

• Increased processor utilization: If the system is executing only one transaction, the processor might not always be busy. For example, if the only transaction in the system is waiting for the completion of some I/O operation, the processor is also waiting and thus doing no work. On the other hand, if the system is executing more than one transaction at the same time, the processor might always be busy executing one or the other transaction.
• Increased disk utilization: One transaction can be using the CPU while another is reading from or writing to the disk.
• Better transaction throughput: More transactions are completed in less time.
• Reduced average turnaround time for transactions: Turnaround time is the time interval between transaction submission and transaction completion. As more than one transaction is executing at the same time, there is a reduction in average turnaround time.
• Reduced average wait time for transactions.
• Reduced average response time for transactions: Short transactions need not wait behind long ones.

Q. 0. What are the various Locking Techniques for Concurrency Control?

Ans. A lock is a variable associated with a data item in the database and describes the status of that item with respect to possible access operations to the item.

Locks enable a multi-user DBMS to maintain the integrity of transactions by isolating a transaction from others executing concurrently. Locks are particularly critical in write-intensive and mixed workload (read-write) environments, because they can prevent the inadvertent loss of data or consistency problems with reads.

Figure 10.1 depicts a lost update situation that could occur if a DBMS did not lock data. Two transactions read the same bank account balance, each intending to add money to it. However, because the second transaction bases its update on the original balance, the money deposited by the first is lost. We could have avoided this scenario if the DBMS had appropriately locked the balance on behalf of the first transaction in preparation for its update. The second transaction would have waited, thereby using the updated balance as a basis for its work.

While DBMSs use exclusive locks for writes, share locks are commonly used for reads. Share locks enable other concurrently executing transactions to read the same data but prohibit any transaction from writing the chosen data. As with locking data in preparation for writes, locking data for reads can be important in certain situations, preventing inconsistent analysis of the database.

Consider a situation that might occur without share locks, as shown in Fig 10.2. The first transaction reads the balances of multiple accounts (perhaps through multiple queries) with the intention of calculating an accurate sum. Another transaction transfers money from one account to another during this process. The timing of this work is such that it causes the first transaction to read only part of the effect of this transfer, thus making its sum total inconsistent with what it should be. If share locks were held by the first transaction until transaction commit, this inconsistent analysis would not occur.

Serializability is an important concept associated with locking. It guarantees that the work of concurrently executing transactions will leave the database in the same consistent state as it would have been in if these transactions had executed serially. This requirement is the ultimate criterion for database consistency and is the motivation for the two-phase locking protocol, which dictates that no new locks can be acquired on behalf of a transaction after the DBMS releases a lock held by that transaction. In practice, this protocol generally means that locks are held until commit time.
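The inconsistent analysis described above is easy to reproduce in a few lines. This is a deliberately simplified sketch: the account names, the balances of 100 and the transfer of 50 are invented for illustration, and "holding share locks" is modeled simply by making the writer wait until the reader has finished.

```python
def audit_total(hold_share_locks):
    """Sum three accounts while a transfer arrives in the middle of the scan.

    Returns (total seen by the reader, true total after both transactions).
    """
    accounts = {"A": 100, "B": 100, "C": 100}   # hypothetical balances
    blocked_writers = []                        # writers waiting on share locks

    def transfer():
        # Move 50 from C to A; the overall total stays at 300.
        accounts["C"] -= 50
        accounts["A"] += 50

    seen = 0
    for position, name in enumerate(["A", "B", "C"]):
        seen += accounts[name]
        if position == 0:
            # The transfer arrives just after A has been read.
            if hold_share_locks:
                blocked_writers.append(transfer)  # must wait for the reader
            else:
                transfer()                        # runs in the middle of the scan

    # The reader commits and releases its share locks; waiting writers proceed.
    for writer in blocked_writers:
        writer()

    return seen, sum(accounts.values())

print(audit_total(hold_share_locks=False))  # reader misses part of the transfer
print(audit_total(hold_share_locks=True))   # reader sees a consistent snapshot
```

Without the locks the reader counts the old value of A and the new value of C, so the money in flight is not counted at all.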

Aside from their integrity implications, locks can have a significant impact on performance. While it may benefit a given application to lock a large amount of data (perhaps one or more tables) and hold these locks for a long period of time, doing so inhibits concurrency and increases the likelihood that other applications will have to wait for locked resources. Yet locking only small amounts of data and releasing these locks quickly may be inappropriate for some applications, increasing the overhead associated with transaction processing. In addition, certain integrity problems can arise if a single transaction acquires locks after some have already been released.

Need for Lock

To provide concurrency in a system, a lock is required for the following:

• Need isolation (the "I" of ACID).
• Give each transaction the illusion that there are no concurrent updates.
• Hide concurrency anomalies.
• Do it automatically (the system does not know transaction semantics).
• Goal of locks: execution equivalent to some serial execution of the system. Not a deterministic outcome, just a consistent transformation.

Locks are a popular approach to concurrency control. Transactions request and acquire locks on data items which they wish to access and which they do not want other transactions to update, i.e. a lock locks other transactions out. Transactions cannot access data items unless they have the appropriate lock. Most locking protocols are based on two types of locks:

• WRITE (or exclusive) locks: if a transaction holds a write lock on an item, no other transaction may acquire a read or write lock on that item.
• READ (or shared) locks: if a transaction holds a read lock on an item, no other transaction may acquire a write lock on that item.

Transactions go into a WAIT state till the required lock is available. Acquisition of locks is the responsibility of the transaction management subsystem. For strict schedules, i.e. simple recovery, transactions should hold all exclusive locks until COMMIT or ROLLBACK time.

Thus, no transaction can read or update an item until the last transaction that updated it has committed and released the exclusive lock.

LOCK MANAGEMENT

The part of the DBMS that keeps track of the locks issued to transactions is called the lock manager. The lock manager maintains a lock table, which is a hash table with the data object identifier as the key. The DBMS also maintains a descriptive entry for each transaction in a transaction table and, among other things, the entry contains a pointer to a list of locks held by the transaction.

A lock table entry for an object (which can be a page, a record, and so on, depending on the DBMS) contains the following information: the number of transactions currently holding a lock on the object (this can be more than one if the object is locked in shared mode), the nature of the lock (shared or exclusive), and a pointer to a queue of lock requests.

Implementing Lock and Unlock Requests

According to the Strict 2PL protocol, before a transaction T reads or writes a database object O, it must obtain a shared or exclusive lock on O and must hold on to the lock until it commits or aborts. When a transaction needs a lock on an object, it issues a lock request to the lock manager:

1. If a shared lock is requested, the queue of requests is empty, and the object is not currently locked in exclusive mode, the lock manager grants the lock and updates the lock table entry for the object (indicating that the object is locked in shared mode, and incrementing the number of transactions holding a lock by one).
2. If an exclusive lock is requested, and no transaction currently holds a lock on the object (which also implies the queue of requests is empty), the lock manager grants the lock and updates the lock table entry.
3. Otherwise, the requested lock cannot be immediately granted, and the lock request is added to the queue of lock requests for this object. The transaction requesting the lock is suspended.

When a transaction aborts or commits, it releases all its locks. When a lock on an object is released, the lock manager updates the lock table entry for the object and examines the lock request at the head of the queue for this object. If this request can now be granted, the transaction that made the request is woken up and given the lock. Indeed, if there are several requests for a shared lock on the object at the front of the queue, all of these requests can now be granted together.

Note that if T1 has a shared lock on O, and T2 requests an exclusive lock, T2's request is queued. Now, if T3 requests a shared lock, its request enters the queue behind that of T2, even though the requested lock is compatible with the lock held by T1. This rule ensures that T2 does not starve, that is, wait indefinitely while a stream of other transactions acquire shared locks and thereby prevent T2 from getting the exclusive lock that it is waiting for.

Atomicity of Locking and Unlocking

The implementation of lock and unlock commands must ensure that these are atomic operations. To ensure atomicity of these operations when several instances of the lock manager code can execute concurrently, access to the lock table has to be guarded by an operating system synchronization mechanism such as a semaphore.

To understand why, suppose that a transaction requests an exclusive lock. The lock manager checks and finds that no other transaction holds a lock on the object and therefore decides to grant the request. But in the meantime, another transaction might have requested and received a conflicting lock. To prevent this, the entire sequence of actions in a lock request call (checking to see if the request can be granted, updating the lock table, etc.) must be implemented as an atomic operation.

The DBMS maintains a transaction table, which contains (among other things) a list of the locks currently held by a transaction. This list can be checked before requesting a lock, to ensure that the same transaction does not request the same lock twice. However, a transaction may need to acquire an exclusive lock on an object for which it already holds a shared lock. Such a lock upgrade request is handled specially by granting the write lock immediately if no other transaction holds a shared lock on the object and inserting the request at the front of the queue otherwise. The rationale for favoring the transaction thus is that it already holds a shared lock on the object, and queuing it behind another transaction that wants an exclusive lock on the same object causes both transactions to wait for each other and therefore be blocked forever.

We have concentrated thus far on how the DBMS schedules transactions, based on their requests for locks. This interleaving interacts with the operating system's scheduling of processes' access to the CPU and can lead to a situation called a convoy, where most of the CPU cycles are spent on process switching. The problem is that a transaction T holding a heavily used lock may be suspended by the operating system. Until T is resumed, every other transaction that needs this lock is queued. Such queues, called convoys, can quickly become very long; a convoy, once formed, tends to be stable. Convoys are one of the drawbacks of building a DBMS on top of a general purpose operating system with preemptive scheduling.

In addition to locks, which are held over a long duration, a DBMS also supports short duration latches. Setting a latch before reading or writing a page ensures that the physical read or write operation is atomic; otherwise, two read/write operations might conflict if the objects being locked do not correspond to disk pages (the units of I/O). Latches are unset immediately after the physical read or write operation is completed.
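The three request rules and the release procedure described above can be sketched as a tiny single-threaded lock manager. The class and method names are hypothetical, and the sketch deliberately leaves out lock upgrades, the transaction table and the semaphore that would guard the lock table in a real system.

```python
from collections import deque

class LockManager:
    """Lock table keyed by object id; each entry keeps a FIFO request queue."""

    def __init__(self):
        self.table = {}

    def _entry(self, obj):
        return self.table.setdefault(
            obj, {"mode": None, "holders": set(), "queue": deque()}
        )

    @staticmethod
    def _grant(entry, txn, mode):
        entry["mode"] = mode
        entry["holders"].add(txn)

    def request(self, txn, obj, mode):
        """Return True if the lock is granted now, False if the request is queued."""
        entry = self._entry(obj)
        # Rule 1: shared request, empty queue, object not exclusively locked.
        if mode == "S" and not entry["queue"] and entry["mode"] != "X":
            self._grant(entry, txn, "S")
            return True
        # Rule 2: exclusive request and nobody holds a lock on the object.
        if mode == "X" and not entry["holders"]:
            self._grant(entry, txn, "X")
            return True
        # Rule 3: queue the request; the transaction would be suspended.
        entry["queue"].append((txn, mode))
        return False

    def release(self, txn, obj):
        """Release txn's lock and return the transactions woken up as a result."""
        entry = self._entry(obj)
        entry["holders"].discard(txn)
        if not entry["holders"]:
            entry["mode"] = None
        woken = []
        # Grant from the head of the queue for as long as the head is compatible,
        # so several shared requests at the front are granted together.
        while entry["queue"]:
            next_txn, mode = entry["queue"][0]
            grantable = not entry["holders"] if mode == "X" else entry["mode"] != "X"
            if not grantable:
                break
            entry["queue"].popleft()
            self._grant(entry, next_txn, mode)
            woken.append(next_txn)
        return woken

lm = LockManager()
first = lm.request("T1", "O", "S")    # granted: nothing else holds O
second = lm.request("T2", "O", "X")   # queued: T1 holds a shared lock
third = lm.request("T3", "O", "S")    # queued behind T2, so T2 cannot starve
after_t1 = lm.release("T1", "O")      # T2 is woken and gets the exclusive lock
after_t2 = lm.release("T2", "O")      # T3 is woken next
```

Rule 1 checks that the queue is empty before granting a shared lock, which is exactly what places T3 behind T2 in the usage above.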

TYPES OF LOCKS

Any data that are retrieved by a user for updating must be locked, or denied to other users, until the update is completed or aborted. Locking can be done at different levels. These levels include database, table, record and field.

Data items can be locked in two modes:

1. Exclusive (X) lock: The data item can be both read and written. An X-lock is requested using the lock-X instruction. It prevents another transaction from reading a record until it is unlocked.
2. Shared (S) lock: The data item can only be read. An S-lock is requested using the lock-S instruction. It allows other transactions to read a record or other resource.

Binary Locks

A binary lock has only two states: locked (1) or unlocked (0). If an object is locked by a transaction, no other transaction can use that object. If an object is unlocked, any transaction can lock the object for its use. A transaction must unlock the object after its termination. Every transaction requires a lock and unlock operation for each data item that is accessed. Checking is done before entry is made, waiting is done when the object is found locked, and unlocking is done after use. Binary locks are simple but restrictive.

Share/Exclusive Locks

An exclusive lock exists when access is specially reserved for the transaction that locked the object. The exclusive lock must be used when the potential for conflict exists.

An exclusive lock is issued when a transaction wants to write (update) a data item and no locks are currently held on that data item. There are two basic requirements of locking:

• READ operations (such as SELECT and FETCH) must acquire a SHARE lock before rows can be retrieved.
• WRITE operations (such as UPDATE, INSERT, and DELETE) must acquire an EXCLUSIVE lock before rows can be modified.

A SHARE (S) lock permits reading by other users. No other transaction may modify the data that is locked with an S lock.

• When an S lock is obtained at the table level, the transaction can read all rows in the table. No row or page level locks are acquired when the transaction reads a row (the S lock at the table level covers all of the rows in the table, so additional locks are not necessary).
• When an S lock is obtained at the page level, the transaction can read all rows on the page. No row level locks are acquired when the transaction reads a row (the S lock at the page level covers all of the rows on the page).
• When an S lock is obtained at the row level, the transaction can read the row.

An EXCLUSIVE (X) lock prevents access by any other user. An X lock is the strongest type of lock. No other transaction may read or modify the data that is locked with an X lock. An X lock must be obtained (either at the table, page, or row level) when user data is updated, inserted, or deleted.

• When an X lock is obtained at the table level, the transaction can read and modify all rows in the table. No row or page level locks are acquired when the transaction reads or modifies a row.
• When an X lock is obtained at the page level, the transaction can read and modify all rows on the page. No row level locks are acquired when the transaction reads or modifies a row.
• When an X lock is obtained at the row level, the transaction can read and modify the row.

Disadvantages of Locking

Pessimistic concurrency control has a number of key disadvantages, particularly in distributed systems:

• Overhead: Locks cost, and you pay even if no conflict occurs. Even read-only actions must acquire locks. High overhead forces careful choices about lock granularity.
• Low concurrency: If locks are too coarse, they reduce concurrency unnecessarily. The need for strict 2PL to avoid cascading aborts makes it even worse.

• Low availability: A client cannot make progress if the server or lock holder is temporarily unreachable.
• Deadlock.

TWO PHASE LOCKING PROTOCOL

A locking protocol is a set of rules followed by all transactions while requesting and releasing locks.

Rules for the Two-Phase Locking Protocol are:

• Two transactions cannot have conflicting locks.
• No unlock operation can precede a lock operation in the same transaction.
• No data are affected until all locks are obtained, that is, until the transaction is in its locked point.

The two-phase locking protocol is a protocol which ensures conflict-serializable schedules.

Phase 1: Growing Phase
• Transaction may obtain locks
• Transaction may not release locks

Phase 2: Shrinking Phase
• Transaction may release locks
• Transaction may not obtain locks

The protocol assures serializability. It can be proved that the transactions can be serialized in the order of their lock points (i.e. the point where a transaction acquired its final lock). Two-phase locking does not ensure freedom from deadlocks.

Cascading roll-back is possible under two-phase locking. To avoid this, follow a modified protocol called strict two-phase locking. Here a transaction must hold all its exclusive locks till it commits/aborts. Rigorous two-phase locking is even stricter: here all locks are held till commit/abort. In this protocol transactions can be serialized in the order in which they commit.

There can be conflict serializable schedules that cannot be obtained if two-phase locking is used. However, in the absence of extra information (e.g., ordering of access to data), two-phase locking is needed for conflict serializability in the following sense:

• Given a transaction Ti that does not follow two-phase locking, we can find a transaction Tj that uses two-phase locking, and a schedule for Ti and Tj that is not conflict serializable.

Example of a transaction performing locking:

Begin(T1)
Lock(T1,A,S)
Read(T1,A)
Unlock(T1,A)
Begin(T2)
Lock(T2,B,S)
Read(T2,B)

Unlock(T2,B)
Display(A+B)
Commit(T1)
Commit(T2)

Locking as above is not sufficient to guarantee serializability: if A and B get updated in between the read of A and B, the displayed sum would be wrong.

Two-phase locking with lock conversions:

First Phase:
• can acquire a lock-S on item
• can acquire a lock-X on item
• can convert a lock-S to a lock-X (upgrade)

Second Phase:
• can release a lock-S
• can release a lock-X
• can convert a lock-X to a lock-S (downgrade)

This protocol assures serializability, but still relies on the programmer to insert the various locking instructions.

IMPLEMENTATION OF LOCKING

A lock manager can be implemented as a separate process to which transactions send lock and unlock requests. The lock manager replies to a lock request by sending a lock grant message (or a message asking the transaction to roll back, in case of a deadlock). The requesting transaction waits until its request is answered. The lock manager maintains a data structure called a lock table to record granted locks and pending requests. The lock table is usually implemented as an in-memory hash table indexed on the name of the data item being locked.

Lock Table

Black rectangles indicate granted locks, white ones indicate waiting requests. The lock table also records the type of lock granted or requested. A new request is added to the end of the queue of requests for the data item, and granted if it is compatible with all earlier locks. Unlock requests result in the request being deleted, and later requests are checked to see if they can now be granted. If a transaction aborts, all waiting or granted requests of the transaction are deleted. The lock manager may keep a list of locks held by each transaction, to implement this efficiently.
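The growing/shrinking rule of two-phase locking can be checked mechanically for a single transaction. The sketch below uses a hypothetical representation, a list of (action, item) pairs, and hypothetical names; it ignores lock modes and conversions.

```python
def follows_two_phase_locking(operations):
    """Return True if no lock is acquired after the first unlock.

    `operations` is the ordered list of (action, item) pairs issued by ONE
    transaction, where action is "lock" or "unlock" (an assumed encoding).
    """
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            shrinking = True          # the shrinking phase has started
        elif action == "lock" and shrinking:
            return False              # lock requested after a release
    return True

# A transaction that releases A before locking B, in the style of the
# display(A+B) example: not two-phase.
early_release = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]

# The same work with all locks acquired before any is released: two-phase.
two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]

print(follows_two_phase_locking(early_release))
print(follows_two_phase_locking(two_phase))
```

The first sequence is the pattern that lets another transaction update A and B between the two reads, which is why the displayed sum can be wrong.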

Problems With Two Phase Locking Protocol

Consider the partial schedule

Fig. An example schedule

Neither T3 nor T4 can make progress: executing lock-S(B) causes T4 to wait for T3 to release its lock on B, while executing lock-X(A) causes T3 to wait for T4 to release its lock on A. Such a situation is called a deadlock. To handle a deadlock, one of T3 or T4 must be rolled back and its locks released.

The potential for deadlock exists in most locking protocols. Deadlocks are a necessary evil. Starvation is also possible if the concurrency control manager is badly designed. For example:

• A transaction may be waiting for an X-lock on an item, while a sequence of other transactions request and are granted an S-lock on the same item.
• The same transaction is repeatedly rolled back due to deadlocks.

The concurrency control manager can be designed to prevent starvation.

PRECEDENCE GRAPH

A precedence graph is used for testing serializability of a schedule. There is one node for each transaction in the schedule.

If the precedence graph has a cycle, the schedule is not serializable. If it has no cycle, any ordering of the transactions which obeys the arrows is an equivalent serial schedule, so the schedule is serializable.

Solution of Inconsistency Problem

The problem of inconsistent analysis can be solved with the help of locks. Let us understand this with the help of the following example.

Initial values: A = ==, B = ###BOT_TEXT###lt;==, and C = ==

Note that the above example leads to a deadlock. Still, it is an acceptable solution because the ACID properties are preserved.
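The precedence-graph test described earlier in this answer can be sketched with the standard library. The edge rule used here (an arrow from Ti to Tj when an operation of Ti comes before a conflicting operation of Tj on the same item, with at least one of the two being a write) is the usual conflict rule and is an assumption, since the text does not spell it out. The schedule encoding and function name are also hypothetical. Requires Python 3.9 or later for `graphlib`.

```python
from graphlib import CycleError, TopologicalSorter

def equivalent_serial_order(schedule):
    """Return a serial order consistent with the precedence graph, or None.

    `schedule` is a time-ordered list of (transaction, operation, item)
    tuples, with operation "R" or "W".
    """
    # Map each transaction to the set of transactions that must precede it.
    predecessors = {txn: set() for txn, _op, _item in schedule}
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            conflict = ti != tj and item_i == item_j and "W" in (op_i, op_j)
            if conflict:
                predecessors[tj].add(ti)   # arrow Ti -> Tj
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None                        # cycle: not conflict serializable

# T1 stays ahead of T2 on both A and B.
serializable = [
    ("T1", "R", "A"), ("T1", "W", "A"),
    ("T2", "R", "A"), ("T2", "W", "A"),
    ("T1", "R", "B"), ("T1", "W", "B"),
    ("T2", "R", "B"), ("T2", "W", "B"),
]

# T1 is ahead on A, but T2 is ahead on B.
not_serializable = [
    ("T1", "R", "A"), ("T1", "W", "A"),
    ("T2", "R", "A"), ("T2", "W", "A"),
    ("T2", "R", "B"), ("T2", "W", "B"),
    ("T1", "R", "B"), ("T1", "W", "B"),
]

print(equivalent_serial_order(serializable))
print(equivalent_serial_order(not_serializable))
```

A topological sort is used because, when no cycle exists, any order it returns obeys every arrow and is therefore an equivalent serial schedule.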

Q. !. What is Concurrency Control Based on Timestamp Ordering?

Ans. Concurrency control: Concurrency control is a method used to ensure that database transactions are executed in a safe manner. It is the process of managing operations against a database so that data operations do not interfere with each other in a multi-user environment. One example of a shared database used in an on-line manner is the database for an airline reservation system, which is used by many agents accessing the database from their terminals. A database could also be accessed in batch mode, concurrently with the online mode. Sharing the database for read-only access does not cause any problem, but if one of the transactions running concurrently tries to modify some data item, it could lead to inconsistencies.

Further, if more than one transaction is allowed to simultaneously modify a data item in a database, it could lead to an incorrect value for the data item, and an inconsistent database will be created. For example, suppose that two ticket agents access the online reservation system simultaneously to see if a seat is available on a given flight. If both agents make the reservation against the last available seat on that flight, a message of overbooking will be displayed. This leaves the data in an inconsistent state. We can say that concurrent processing of programs, processes or jobs is similar to multiprogramming, i.e. a number of jobs or programs are processed simultaneously to achieve their independent and different goals according to their own requirements.

Some concurrency problems: when we apply a correct transaction in concurrent processing, we may still see that the database becomes inconsistent after the completion of the transaction. In the case of concurrent operation, where a number of transactions are running and using the database, we cannot make any assumption about the order in which the statements belonging to different transactions will be executed. The order in which these statements are executed is called a schedule. The processing of these statements in a schedule that is used in concurrent operation, and that we cannot change, is called a concurrent schedule. Some problems that occur during scheduling and concurrent processing are:

(a) Lost update problem

Consider the two transactions given below, which access the same data item A. Each of these transactions modifies the data item and writes it back. Then we see that the concurrent processing of the modification of the value of A

(b) Inconsistent read problem: The lost update problem is caused by concurrent modification of the same data item. However, concurrency can also cause problems when only one transaction modifies a given set of data while that set of data is being used by another transaction. For example, suppose two transactions T5 and T6 occur in a schedule, and A and B represent data items having integer values. If both transactions are processed concurrently, one will be reading the data while the other is modifying it. The reading transaction can then see an inconsistent mixture of old and new values, and it is unclear which data is correct to read.

(c) The phantom phenomenon: Consider an organization where parts are purchased and kept in stock. The parts are withdrawn from the stock and used by a number of projects. To check the extent of loss, we want to see whether the current quantity of some part purchased and received is equal to the current quantity of that part in stock plus the current quantity used by the various projects. The phantom problem arises if additional items are added while this check is running: the additional information is reflected in some parts of the query and not in others during the concurrent processing. This problem can be prevented by locking; locking such records also prevents the addition of phantom records.

(d) Semantics of concurrent transactions: When we take two different transactions in different orderings, it is not necessary that the two transactions are commutative. For example, two transactions may have the same syntax of operations, but their semantics differ during concurrent processing.

An important part of concurrency is serial execution, or serializability. When independent transactions in a schedule are ordered so that they execute one after another, this type of execution is called serial execution.
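The lost update problem from item (a) can be reproduced with a small simulation. The sketch below is only illustrative: the transaction names T1 and T2, the amounts they add, and the initial value 100 are assumptions, not values from the text. It compares a serial schedule with an interleaved one.

```python
def run(schedule, initial):
    """Execute (transaction, operation) steps against a single data item A."""
    db = {"A": initial}
    local = {}  # each transaction's private copy of A
    # Hypothetical updates: T1 adds 10 to A, T2 adds 20 to A.
    updates = {"T1": lambda a: a + 10, "T2": lambda a: a + 20}
    for txn, op in schedule:
        if op == "read":
            local[txn] = db["A"]
        elif op == "write":
            db["A"] = updates[txn](local[txn])
    return db["A"]

serial = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]
interleaved = [("T1", "read"), ("T2", "read"), ("T1", "write"), ("T2", "write")]

print(run(serial, 100))       # 130: both updates are applied
print(run(interleaved, 100))  # 120: T1's update is overwritten by T2
```

In the interleaved schedule both transactions read the old value before either writes, so the later write silently discards the earlier one.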

Some problems of concurrent processing are removed by serial execution, that is, by ordering the operations in a particular sequence.

2. Serializability

A non-serial schedule that is equivalent to some serial execution of the transactions is called a serializable schedule. For example, of three given schedules, schedule-3 is a serializable schedule if it is equivalent to schedule-1 and schedule-2. The purpose of serializable scheduling is to find the non-serial schedules that allow the transactions to execute concurrently without interfering with one another, and therefore produce a database state that could be produced by a serial execution. Note that serializability also removes the problem of inconsistency.

Definition: A given interleaved execution of some transactions is said to be serializable if it produces the same results as some serial execution of the transactions. In serializability the ordering of read and write operations is important, to avoid any confusion or inconsistency. The serializability test can be explained and solved with the help of a precedence graph.

Solution to these problems

If all schedules in a concurrent environment are restricted to serializable schedules, the result obtained will be consistent with some serial execution of the transactions and will be considered correct. However, testing a schedule for serializability is not only expensive but sometimes impractical. Thus one of the following concurrency control schemes is applied in a concurrent database environment to ensure that the schedules produced by concurrent transactions are serializable:

(i) Locking scheme
(ii) Timestamp-based ordering
(iii) Optimistic scheduling
(iv) Multiversion technique

(i) Locking

From the point of view of locking, a database can be considered as being made up of a set of data items. A lock is a variable associated with each such data item. Manipulating the value of the lock is called locking. The value of the lock variable is used in the locking scheme to control concurrent access to, and manipulation of, the associated data item. Locking is done by a subsystem of the DBMS called the lock manager. There are two types of lock:

(a) Exclusive lock: An exclusive lock is also called an update or write lock. The intention of this mode of locking is to provide exclusive use of a data item to one transaction. If a transaction T locks a data item Q in exclusive mode, no other transaction can access Q, not even to read it, until the lock is released by T.

(b) Shared lock: A shared lock is also called a read lock. Any number of transactions can concurrently lock and access a data item in shared mode, but none of these transactions can modify the data item. A data item locked in shared mode cannot be locked in exclusive mode until the shared lock is released by all transactions holding it. A data item locked in exclusive mode cannot be locked in shared mode until the exclusive lock on the data item is released.

Two-phase locking: Another method of locking is called two-phase locking. In this method, once a lock is released no additional locks are requested. In other words, the release of locks is delayed until all the locks on all data items required by the transaction have been acquired. It has two phases: a growing phase, in which the number of locks increases from 0 to the maximum for the transaction, and a contracting phase, in which the number of locks held decreases from the maximum to zero. Both phases are monotonic, i.e., the number of locks only increases in the first phase and only decreases in the second phase. Once a transaction starts releasing locks, it is not allowed to request any further locks. In this way a transaction is obliged to request all the locks it may need during its life before it releases any. This gives control over the schedule but lowers the degree of concurrency.

(ii) Timestamp-based ordering

In the timestamp-based method, a serial order is created among the concurrent transactions by assigning to each transaction a unique, non-decreasing number. The usual value assigned to each transaction is the system clock value at the start of the transaction. This is called timestamp ordering. A variation of this scheme, used in a distributed environment, appends the site of the transaction to the system-wide clock value. This value can then be used to decide the order in which a conflict between two transactions is resolved. A transaction with a smaller timestamp value is considered to be an "older" transaction than another transaction with a larger timestamp value.

There are two types of timestamp: (a) write timestamp and (b) read timestamp. A data item X is thus represented by a triple X: {X, Wx, Rx}, where:

X: The value of the data item.
Wx: The write timestamp value, the largest timestamp value of any transaction that was allowed to write a value of X.
Rx: The read timestamp value, the largest timestamp value of any transaction that was allowed to read the current value of X.

(Used for the recovery procedure in a DBMS.)

(iii) Optimistic scheduling

In the optimistic scheduling scheme, the philosophy is to assume that all data items can be successfully updated at the end of a transaction, and to read in the values of data items without any locking. Reading is done when required, and if any data item is found to be inconsistent at the end of a transaction, the transaction is rolled back. In optimistic scheduling each transaction has three phases:

(a) Read phase: This phase starts with the activation of a transaction. All data items are read into local variables, and any modifications are made only to those local copies. This phase ends with commitment.

(b) Validation phase: The modified data items are checked; if the check fails, the transaction is rolled back.

(c) Write phase: When a transaction passes the validation phase, its changes are written to secondary storage.

An optimistic scheme does not use locks, and so it is deadlock free, even though starvation can still occur.

(iv) Multiversion technique

This is also called the time domain addressing scheme. It follows the accounting principle of never overwriting a transaction; any changes are achieved by entering compensating transactions. Here, a change to a data item X is achieved by making a new copy or version of X, hence the name multiversion. In this way a history of the evolution of the value of the data item is recorded in the database. With the multiversion technique, write operations can occur concurrently, since they do not overwrite each other. Also, a read operation can read any version.

Q. 6. What do you mean by Database Recovery Techniques?

Ans. Recovery: A computer system is an electromechanical device subject to failures of various types. The types of failure that a computer system is likely to be subjected to include failures of components or subsystems, software failure, power failure, accidents, and natural or man-made disasters. Database recovery techniques are methods of restoring a database that has been damaged by any failure to its valid, original form. Recovery is the procedure through which data that has been lost during processing, or due to a failure of any type, can be collected, recalled, or accessed again by using different mechanisms. The aim of the recovery scheme is to allow database operations to be resumed after a failure with a minimum loss of information, at an economically justifiable cost.

Recovery schemes can be classified as forward or backward recovery. Database systems use the latter scheme to recover from errors.

(a) Forward error recovery: In this scheme, when a particular error in the system is detected, the recovery system makes an accurate assessment of the state of the system and then makes an appropriate adjustment, based on the anticipated result, which makes the system error free. The aim of the adjustment is to restore the system so that the effects of the error are cancelled and the system can continue to operate. This scheme is not applicable to unanticipated errors.

(b) Backward error recovery: In this scheme no attempt is made to extrapolate to an error-free state. Instead, the system is reset to some previous correct state that is known to be free of any errors.

Recovery in a centralized DBMS

In a centralized DBMS, if there is any failure, some methods are used to recover the data:

(i) Setting a transaction marker for transaction identification, through which we can access the data.
(ii) Applying some operations on records. These operations are insert, delete, and modify.
(iii) Keeping a log of the system's transactions.

In an online database system, for example an airline reservation system, there could be hundreds of transactions handled per minute. The log for this type of database contains a very large volume of information. A scheme called checkpointing is used to limit the volume of log information that has to be handled and processed in the event of a system failure involving the loss of volatile information. The checkpoint scheme is an additional component of the logging scheme described above. A checkpoint operation, performed periodically, copies log information onto stable storage. Backups, for example on floppy disks, can also be used for backward recovery.

Q. 7. What are the various Database Security Issues?

Ans.
1. An unauthorized user can get access to a secure terminal, or to the password of an authorized user, and compromise the database. Such a user could also destroy the database files.
2. Authorized users could pass on sensitive information under pressure or for personal gain.
3. System and application programmers could bypass normal security in their programs by directly accessing database files and making changes and copies for illegal use.
4. An unauthorized person can get access to the computer system, physically or by using a communication channel, and compromise the database.
5. A computer system operator or system programmer can intentionally bypass the normal security and integrity mechanisms, alter or destroy the data in the database, or make unauthorized copies of sensitive data.

Q. 8. What do you mean by the term deadlock?

Ans. Consider the following example: transaction T1 gets an exclusive lock on object A, T2 gets an exclusive lock on B, T1 requests an exclusive lock on B and is queued, and T2 requests an exclusive lock on A and is queued. Now, T1 is waiting for T2 to release its lock and T2 is waiting for T1 to release its lock. Such a cycle of transactions waiting for locks to be released is called a deadlock. Clearly, these two transactions will make no further progress. Worse, they hold locks that may be required by other transactions. The DBMS must either prevent or detect (and resolve) such deadlock situations.

Deadlock Prevention

We can prevent deadlocks by giving each transaction a priority and ensuring that lower priority transactions are not allowed to wait for higher priority transactions (or vice versa). One way to assign priorities is to give each transaction a timestamp when it starts up. The lower the timestamp, the higher the transaction's priority; that is, the oldest transaction has the highest priority.

If a transaction Ti requests a lock and transaction Tj holds a conflicting lock, the lock manager can use one of the following two policies:

Wait-die: If Ti has higher priority, it is allowed to wait; otherwise it is aborted.
Wound-wait: If Ti has higher priority, abort Tj; otherwise Ti waits.

In the wait-die scheme, lower priority transactions can never wait for higher priority transactions. In the wound-wait scheme, higher priority transactions never wait for lower priority transactions. In either case no deadlock cycle can develop. A subtle point is that we must also ensure that no transaction is perennially aborted because it never has a sufficiently high priority. (Note that in both schemes, the higher priority transaction is never aborted.) When a transaction is aborted and restarted, it should be given the same timestamp that it had originally. Reissuing timestamps in this way ensures that each transaction will eventually become the oldest transaction, and thus the one with the highest priority, and will get all the locks that it requires.

The wait-die scheme is nonpreemptive; only a transaction requesting a lock can be aborted. As a transaction grows older (and its priority increases), it tends to wait for more and more younger transactions. A younger transaction that conflicts with an older transaction may be repeatedly aborted (a disadvantage with respect to wound-wait), but on the other hand, a transaction that has all the locks it needs will never be aborted for deadlock reasons (an advantage with respect to wound-wait, which is preemptive).
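The two policies can be written as a single decision function. This is a minimal sketch: the function name `resolve` and the returned action strings are hypothetical, and priorities are taken to be timestamps, with a lower timestamp meaning an older, higher priority transaction as described above.

```python
def resolve(policy, requester_ts, holder_ts):
    """Decide what happens when the requester asks for a lock the holder owns.

    Timestamps act as priorities: lower timestamp = older = higher priority.
    """
    requester_has_priority = requester_ts < holder_ts
    if policy == "wait-die":
        # Only the requester can ever be aborted (nonpreemptive).
        return "requester waits" if requester_has_priority else "abort requester"
    if policy == "wound-wait":
        # An older requester preempts (wounds) the younger holder.
        return "abort holder" if requester_has_priority else "requester waits"
    raise ValueError(f"unknown policy: {policy}")

print(resolve("wait-die", 1, 2))    # requester waits
print(resolve("wait-die", 2, 1))    # abort requester
print(resolve("wound-wait", 1, 2))  # abort holder
print(resolve("wound-wait", 2, 1))  # requester waits
```

In both branches the transaction with the lower timestamp is never the one aborted, which matches the note that the higher priority transaction always survives.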

Deadlock Detection

Deadlocks tend to be rare and typically involve very few transactions. This observation suggests that rather than taking measures to prevent deadlocks, it may be better to detect and resolve deadlocks as they arise. In the detection approach, the DBMS must periodically check for deadlocks. When a transaction Ti is suspended because a lock that it requests cannot be granted, it must wait until all transactions Tj that currently hold conflicting locks release them.

The lock manager maintains a structure called a waits-for graph to detect deadlock cycles. The nodes correspond to active transactions, and there is an arc from Ti to Tj if (and only if) Ti is waiting for Tj to release a lock. The lock manager adds edges to this graph when it queues lock requests and removes edges when it grants lock requests. Observe that the waits-for graph describes all active transactions, some of which will eventually abort. If there is an edge from Ti to Tj in the waits-for graph, and both Ti and Tj eventually commit, there will be an edge in the opposite direction (from Tj to Ti) in the precedence graph (which involves only committed transactions). The waits-for graph is periodically checked for cycles, which indicate deadlock. A deadlock is resolved by aborting a transaction that is on a cycle and releasing its locks; this action allows some of the waiting transactions to proceed.

Fig. Waits-for Graph before and after Deadlock

As an alternative to maintaining a waits-for graph, a simplistic way to identify deadlocks is to use a timeout mechanism: if a transaction has been waiting too long for a lock, we can assume (pessimistically) that it is in a deadlock cycle and abort it.
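The periodic cycle check on the waits-for graph can be sketched with a depth-first search. The representation is an assumption made for illustration: the graph is a dictionary mapping each transaction to the transactions it waits for, and `find_deadlock` is a hypothetical helper name.

```python
def find_deadlock(waits_for):
    """Return the transactions on one cycle of the waits-for graph, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}
    path = []

    def visit(t):
        colour[t] = GREY
        path.append(t)
        for u in waits_for.get(t, ()):
            state = colour.get(u, WHITE)
            if state == GREY:
                # u is already on the current path: the path from u onward is a cycle
                return path[path.index(u):]
            if state == WHITE:
                cycle = visit(u)
                if cycle:
                    return cycle
        path.pop()
        colour[t] = BLACK
        return None

    for t in list(waits_for):
        if colour.get(t, WHITE) == WHITE:
            cycle = visit(t)
            if cycle:
                return cycle
    return None

# T1 waits for T2 and T2 waits for T1, as in the example above; T3 waits for T1.
print(find_deadlock({"T3": ["T1"], "T1": ["T2"], "T2": ["T1"]}))  # ['T1', 'T2']
print(find_deadlock({"T1": ["T2"], "T2": []}))                    # None
```

Any transaction on the returned cycle is a candidate victim; T3 is blocked but is not itself part of the cycle, so aborting it would not resolve the deadlock.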

Q. 9. What is serializability of schedules?

Ans. Serializability: A non-serial schedule that is equivalent to some serial execution of the transactions is called a serializable schedule. For example, of three given schedules, schedule-3 is a serializable schedule if it is equivalent to schedule-1 and schedule-2. The purpose of serializable scheduling is to find the non-serial schedules that allow the transactions to execute concurrently without interfering with one another, and therefore produce a database state that could be produced by a serial execution. Note that serializability also removes the problem of inconsistency.

Definition: A given interleaved execution of some transactions is said to be serializable if it produces the same results as some serial execution of the transactions. In serializability the ordering of read and write operations is important, to avoid any confusion or inconsistency. The serializability test can be explained and solved with the help of a precedence graph.
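The precedence graph test can be sketched as follows. This is an illustrative version that tests conflict serializability; the schedule format, a list of (transaction, operation, item) triples with "R" and "W" operations, and the name `serial_order` are assumptions. An edge Ti -> Tj is added when an operation of Ti conflicts with a later operation of Tj; a topological order of the graph, if one exists, is an equivalent serial schedule.

```python
from collections import defaultdict

def serial_order(schedule):
    """Return an equivalent serial order of transactions, or None if the graph has a cycle."""
    txns = []
    for t, _, _ in schedule:
        if t not in txns:
            txns.append(t)

    # Build the precedence graph: two operations conflict when they come from
    # different transactions, touch the same item, and at least one is a write.
    edges = defaultdict(set)
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            if ti != tj and x == y and "W" in (op_i, op_j):
                edges[ti].add(tj)

    # Topological sort: repeatedly output a transaction with no incoming arrows.
    indegree = {t: 0 for t in txns}
    for t in txns:
        for u in edges[t]:
            indegree[u] += 1
    ready = [t for t in txns if indegree[t] == 0]
    order = []
    while ready:
        t = ready.pop(0)
        order.append(t)
        for u in sorted(edges[t]):
            indegree[u] -= 1
            if indegree[u] == 0:
                ready.append(u)
    return order if len(order) == len(txns) else None

not_serializable = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]
serializable = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"),
                ("T1", "R", "B"), ("T2", "W", "A"), ("T1", "W", "B"), ("T2", "R", "B")]

print(serial_order(not_serializable))  # None: T1 -> T2 and T2 -> T1 form a cycle
print(serial_order(serializable))      # ['T1', 'T2']
```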

Q 10. Define the concept of aggregation. Give two examples of where this concept is useful.

Ans. Selecting the data in groups of records is called aggregation. Data aggregation is a process in which information is gathered and expressed in a summary form, for purposes such as statistical analysis. A common aggregation purpose is to get more information about particular groups based on specific variables such as age, profession, or income.

Q 11. Compare the shadow-paging recovery scheme with log-based recovery schemes.

Ans. Consider a transaction Ti that transfers $50 from account A to account B; the goal is either to perform all database modifications made by Ti or none at all. Several output operations may be required for Ti (to output A and B). A failure may occur after one of these modifications has been made but before all of them are made. Modifying the database without ensuring that the transaction will commit may leave the database in an inconsistent state.

To ensure atomicity despite failures: first output information describing the modifications to stable storage without modifying the database itself; only then start modifying the database. We study two approaches:

1. Log-based recovery
2. Shadow paging

We assume (initially) that transactions run serially, that is, one after the other.

Log-based recovery

A log:
• Is a sequence of log records
• Maintains a record of update activities on the database
• Is kept on stable storage

The log is written as follows:
• When transaction Ti starts, it writes the log record <Ti start>
• Before Ti executes write(X), it writes the log record <Ti, X, V1, V2>, where V1 is the value of X before the write (for undo) and V2 is the value to be written to X
• When Ti commits, it writes the log record <Ti commit>
• When Ti aborts, it writes the log record <Ti abort>

We assume for now that log records are written directly to stable storage (that is, they are not buffered). The two approaches using logs are:

1. Deferred database modification
2. Immediate database modification

Deferred Database Modification

• Records all modifications to the log
• Defers all the writes until after partial commit
• A transaction starts by writing a <Ti start> record to the log
• A write(X) operation results in writing a log record <Ti, X, V>, where V is the new value for X. Note: the old value is not needed for this scheme.
• The write is not performed on X at this time, but is deferred
• When Ti partially commits, <Ti commit> is written to the log
• Finally, the log records are read and used to actually execute the previously deferred writes

During recovery after a crash, a transaction needs to be redone if and only if both <Ti start> and <Ti commit> are in the log. Redoing a transaction Ti (redo Ti) sets the value of all data items updated by the transaction to the new values. Crashes can occur while the transaction is executing the original updates, or while recovery action is being taken.

Example transactions T0 and T1 (T0 executes before T1):

T0: read(A); A := A - 50; write(A); read(B); B := B + 50; write(B)
T1: read(C); C := C - 100; write(C)

Let the original value of A be 1000, that of B be 2000, and that of C be 700.
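The redo rule for deferred modification can be sketched in a few lines. The tuple layout of the log records and the name `recover_deferred` are assumptions made for illustration; the values come from the example transactions T0 and T1 above. The log is cut at different points to imitate crashes.

```python
# Log records as tuples (hypothetical layout): ("start", T), ("write", T, item, new), ("commit", T)
LOG = [
    ("start", "T0"),
    ("write", "T0", "A", 950),
    ("write", "T0", "B", 2050),
    ("commit", "T0"),
    ("start", "T1"),
    ("write", "T1", "C", 600),
    ("commit", "T1"),
]
INITIAL = {"A": 1000, "B": 2000, "C": 700}

def recover_deferred(log, db):
    """Redo every transaction that has both a start and a commit record in the log."""
    started = {rec[1] for rec in log if rec[0] == "start"}
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    to_redo = started & committed
    for rec in log:  # forward scan, so the last new value of each item wins
        if rec[0] == "write" and rec[1] in to_redo:
            _, _, item, new_value = rec
            db[item] = new_value
    return db

print(recover_deferred(LOG[:3], dict(INITIAL)))  # {'A': 1000, 'B': 2000, 'C': 700}
print(recover_deferred(LOG[:6], dict(INITIAL)))  # {'A': 950, 'B': 2050, 'C': 700}
print(recover_deferred(LOG, dict(INITIAL)))      # {'A': 950, 'B': 2050, 'C': 600}
```

Because redo only assigns new values, running the recovery again after a second crash gives the same database state.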

Consider the log at three instances of time (a), (b), (c), and let us handle the cases when crashes occur at each of them:

(a) No redo actions need to be taken.
(b) redo(T0) must be performed, since <T0 commit> is present.
(c) redo(T0) must be performed, followed by redo(T1), since <T0 commit> and <T1 commit> are present.

Immediate Database Modification

• Allows database updates of an uncommitted transaction.
• Undoing may be needed.
• Update log records must have both the old value and the new value.
• The update log record must be written before the database item is written.

We assume that the log record is output directly to stable storage. Output of updated blocks can take place at any time before or after transaction commit. The order in which blocks are output can be different from the order in which they are written.

Immediate Database Modification Recovery

The recovery procedure has two operations instead of one:

• undo(Ti) restores the value of all data items updated by Ti to their old values, going backwards from the last log record for Ti.
• redo(Ti) sets the value of all data items updated by Ti to the new values, going forward from the first log record for Ti.

Both operations must be idempotent; that is, even if the operation is executed multiple times, the effect is the same as if it is executed once. This is needed since operations may get re-executed during recovery.

When recovering after failure:

• Transaction Ti needs to be undone if the log contains the record <Ti start> but does not contain the record <Ti commit>.
• Transaction Ti needs to be redone if the log contains both the record <Ti start> and the record <Ti commit>.

Undo operations are performed first, then redo operations.
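The undo and redo rules above can be sketched as one recovery routine. The record layout ("update", T, item, old, new), the name `recover_immediate`, and the particular on-disk states at the moment of the crash are assumptions made for illustration; the account values follow a T0 that moves 50 from A to B and a T1 that subtracts 100 from C.

```python
# Hypothetical record layout: ("start", T), ("update", T, item, old, new), ("commit", T)
LOG = [
    ("start", "T0"),
    ("update", "T0", "A", 1000, 950),
    ("update", "T0", "B", 2000, 2050),
    ("commit", "T0"),
    ("start", "T1"),
    ("update", "T1", "C", 700, 600),
    ("commit", "T1"),
]

def recover_immediate(log, db):
    """Undo uncommitted transactions (backward scan), then redo committed ones (forward scan)."""
    started = {rec[1] for rec in log if rec[0] == "start"}
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    to_undo = started - committed
    to_redo = started & committed

    for rec in reversed(log):          # undo first, newest record first
        if rec[0] == "update" and rec[1] in to_undo:
            db[rec[2]] = rec[3]        # restore the old value
    for rec in log:                    # then redo, oldest record first
        if rec[0] == "update" and rec[1] in to_redo:
            db[rec[2]] = rec[4]        # reapply the new value
    return db

# Crash before T0 commits; A had already reached disk, B had not.
print(recover_immediate(LOG[:3], {"A": 950, "B": 2000, "C": 700}))
# {'A': 1000, 'B': 2000, 'C': 700}

# Crash after T0 commits but before T1 commits; C had already reached disk.
print(recover_immediate(LOG[:6], {"A": 950, "B": 2000, "C": 600}))
# {'A': 950, 'B': 2050, 'C': 700}
```

Both loops only assign stored values, so repeating the whole routine leaves the database unchanged, which is the idempotence property required above.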

Immediate Database Modification Example

Consider the log at three instances of time (a), (b), (c). The recovery actions in each case are:

(a) undo(T0): B is restored to 2000 and A to 1000.
(b) undo(T1) and redo(T0): C is restored to 700, and then A and B are set to 950 and 2050 respectively.
(c) redo(T0) and redo(T1): A and B are set to 950 and 2050 respectively. Then C is set to 600.

Checkpoints

Problems in the recovery procedure:
• Searching the entire log is time consuming.
• We might unnecessarily redo transactions that have already output their updates to the database.

The recovery procedure is streamlined by periodically performing checkpointing:
• Output all log records currently residing in main memory onto stable storage.
• Output all modified buffer blocks to the disk.
• Write a log record <checkpoint> onto stable storage.

During recovery we need to consider only the most recent transaction Ti that started before the checkpoint, and transactions that started after Ti.

Log based recovery

The log, sometimes called the trail or journal, is a history of actions executed by the DBMS. Physically, the log is a file of records stored in stable storage, which is assumed to survive crashes; this durability can be achieved by maintaining two or more copies of the log on different disks (perhaps in different locations), so that the chance of all copies of the log being simultaneously lost is negligibly small.

The most recent portion of the log, called the log tail, is kept in main memory and is periodically forced to stable storage. This way, log records and data records are written to disk at the same granularity (pages or sets of pages).

Every log record is given a unique id called the log sequence number (LSN). As with any record id, we can fetch a log record with one disk access given the LSN. Further, LSNs should be assigned in monotonically increasing order; this property is required for the ARIES recovery algorithm. If the log is a sequential file, in principle growing indefinitely, the LSN can simply be the address of the first byte of the log record.

For recovery purposes, every page in the database contains the LSN of the most recent log record that describes a change to this page. This LSN is called the pageLSN.

A log record is written for each of the following actions:

1. Updating a page: After modifying the page, an update type record (described later in this section) is appended to the log tail. The pageLSN of the page is then set to the LSN of the update log record. (The page must be pinned in the buffer pool while these actions are carried out.)

2. Commit: When a transaction decides to commit, it force-writes a commit type log record containing the transaction id. That is, the log record is appended to the log, and the log tail is written to stable storage, up to and including the commit record. The transaction is considered to have committed at the instant that its commit log record is written to stable storage. (Some additional steps must be taken, e.g., removing the transaction's entry in the transaction table; these follow the writing of the commit log record.)

3. Abort: When a transaction is aborted, an abort type log record containing the transaction id is appended to the log, and Undo is initiated for this transaction.

4. End: As noted above, when a transaction is aborted or committed, some additional actions must be taken beyond writing the abort or commit log record. After all these additional steps are completed, an end type log record containing the transaction id is appended to the log.

5. Undoing an update: When a transaction is rolled back (because the transaction is aborted, or during recovery from a crash), its updates are undone. When the action described by an update log record is undone, a compensation log record, or CLR, is written.

Every log record has certain fields: prevLSN, transID, and type. The set of all log records for a given transaction is maintained as a linked list going back in time, using the prevLSN field; this list must be updated whenever a log record is added. The transID field is the id of the transaction generating the log record, and the type field obviously indicates the type of the log record.

Additional fields depend on the type of the log record. We have already mentioned the additional contents of the various log record types, with the exception of the update and compensation log record types, which we describe next.

Update Log Records

The pageID field is the page id of the modified page; the length in bytes and the offset of the change are also included. The before-image is the value of the changed bytes before the change; the after-image is the value after the change. An update log record that contains both before- and after-images can be used to redo the change and to undo it. In certain contexts, which we will not discuss further, we can recognize that the change will never be undone (or, perhaps, redone). A redo-only update log record will contain just the after-image; similarly, an undo-only update record will contain just the before-image.

Log File

The log file contains information about all updates to the database:
• Transaction records.
• Checkpoint records.

Transaction records contain:
• Transaction identifier.
• Type of log record (transaction start, insert, update, delete, abort, commit).
• Identifier of the data item affected by the database action (insert, delete, and update operations).
• Before-image of the data item.
• After-image of the data item.

• Log management information.

A technique often used to perform recovery is the transaction log or journal:
1. It records information about the progress of transactions in a log since the last consistent state.
2. The database therefore knows the state of the database before and after each transaction.
3. Every so often the database is returned to a consistent state, and the log may be truncated to remove committed transactions.
4. When the database is returned to a consistent state, the process is often referred to as checkpointing.

Q. 12. What do you understand by a distributed database?

Ans. Distributed database technology is a relatively recent development within the overall database field. A distributed database can be defined as "a system consisting of data with different parts under the control of separate DBMSs running on interconnected computers. Each system has autonomous processing capability and serves local applications." Each system also participates in more global applications. A distributed database system can also be defined as one that is not stored at a single physical location; it is spread across a network of computers that are geographically dispersed and connected by communication links.

For example, consider a banking system in which the customer account database is distributed across the bank's branch offices, such that each individual customer can process his data or records at the local branch. In other words, data is stored at all the locations and any customer can access his data from any location via the communication network. Since customer data is distributed to all the locations, we call it a distributed database.

Distributed databases are handled or controlled by a DDBMS (Distributed DBMS). A distributed database supports sharing of data and offers high availability and reliability. It is capable of handling both local and global transactions. It also allows incremental growth (addition of data) in the distributed system. A query in a distributed database is divided into sub-queries, and all the sub-queries are evaluated in parallel. One more advantage of a distributed database system is that it looks like a centralized system to the user. For example, the Indian Railway reservation system has a distributed database system, which can be accessed at any location by any station.

Challenges to Distributed Systems

• Monotonicity: Once something is published in an open distributed system, it cannot be taken back.
• Pluralism: Different subsystems of an open distributed system include heterogeneous, overlapping, and possibly conflicting information. There is no central arbiter of truth in open distributed systems.
• Unbounded nondeterminism: Asynchronously, different subsystems can come up and go down, and communication links can come in and go out between subsystems of an open distributed system. Therefore the time that it will take to complete an operation cannot be bounded in advance.

A scalable system is one that can easily be altered to accommodate changes in the number of users, resources, and computing entities affected to it. Scalability can be measured in three different dimensions:

• Load scalability: A distributed system should make it easy for us to expand and contract its resource pool to accommodate heavier or lighter loads.
• Geographic scalability: A geographically scalable system is one that maintains its usefulness and usability, regardless of how far apart its users or resources are.
• Administrative scalability: No matter how many different organizations need to share a single distributed system, it should still be easy to use and manage.

Some loss of performance may occur in a system that allows itself to scale in one or more of these dimensions.

A multiprocessor system is simply a computer that has more than one CPU on its motherboard. If the operating system is built to take advantage of this, it can run different processes on different CPUs, or different threads belonging to the same process. Over the years, many different multiprocessing options have been explored for use in distributed computing. Intel CPUs employ a technology called Hyperthreading that allows more than one thread (usually two) to run on the same CPU. The Sun UltraSPARC T1, Athlon 64 X2, and Intel Pentium D processors feature multiple processor cores to also increase the number of concurrent threads they can run.

A multicomputer system is a system made up of several independent computers interconnected by a telecommunications network. Multicomputer systems can be homogeneous or heterogeneous. A homogeneous distributed system is one where all CPUs are similar and are connected by a single type of network. Such systems are often used for parallel computing, which is a kind of distributed computing where every computer is working on different parts of a single problem. In contrast, a heterogeneous distributed system is one that can be made up of all sorts of different computers, possibly with vastly differing memory sizes, processing power and even basic underlying architecture. They are in widespread use today, with many companies adopting this architecture due to the speed with which hardware goes obsolete and the cost of upgrading a whole system simultaneously.

Various hardware and software architectures exist that are usually used for distributed computing. At a lower level, it is necessary to interconnect multiple CPUs with some sort of network, regardless of whether that network is printed onto a circuit board or made up of several loosely-coupled devices and cables. At a higher level, it is necessary to interconnect processes running on those CPUs with some sort of communication system.

• Client-server: Smart client code contacts the server for data, then formats and displays it to the user. Input at the client is committed back to the server when it represents a permanent change.
• 3-tier architecture: Three tier systems move the client intelligence to a middle tier so that stateless clients can be used. This simplifies application deployment. Most web applications are 3-Tier.
• N-tier architecture: N-Tier refers typically to web applications which further forward their requests to other enterprise services. This type of application is the one most responsible for the success of application servers.
• Tightly coupled (clustered): refers typically to a set of highly integrated machines that run the same process in parallel, subdividing the task in parts that are made individually by each one, and then put back together to make the final result.
• Peer-to-peer: an architecture where there is no special machine or machines that provide a service or manage the network resources. Instead all responsibilities are uniformly divided among all machines, known as peers.

• Service oriented: where the system is organized as a set of highly reusable services that can be offered through standardized interfaces.
• Mobile code: based on the architecture principle of moving processing closest to the source of data.
• Replicated repository: where the repository is replicated among the distributed system to support online/offline processing, provided this lag in data update is acceptable.

Distributed computing implements a kind of concurrency. The types of distributed computers are based on Flynn's taxonomy of systems: single instruction, single data (SISD); multiple instruction, single data (MISD); single instruction, multiple data (SIMD); and multiple instruction, multiple data (MIMD).

Q. 13. Write short notes on the following: (a) Multiple Granularity (b) Transaction Processing Systems

Ans. (a) Multiple Granularity

Another specialized locking strategy is called multiple-granularity locking, and it allows us to efficiently set locks on objects that contain other objects. For instance, a database contains several files, a file is a collection of pages, and a page is a collection of records. A transaction that expects to access most of the pages in a file should probably set a lock on the entire file, rather than locking individual pages (or records) as and when it needs them. Doing so reduces the locking overhead considerably. On the other hand, other transactions that require access to parts of the file, even parts that are not needed by this transaction, are blocked. If a transaction accesses relatively few pages of the file, it is better to lock only those pages. Similarly, if a transaction accesses several records on a page, it should lock the entire page, and if it accesses just a few records, it should lock just those records.

The question to be addressed is how a lock manager can efficiently ensure that a page, for example, is not locked by a transaction while another transaction holds a conflicting lock on the file containing the page (and therefore, implicitly, on the page). The idea is to exploit the hierarchical nature of the 'contains' relationship.
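To make the question concrete, here is a rough sketch in Python of a brute-force lock table over the database > file > page hierarchy. It is only an illustration under stated assumptions: the class `NaiveLockTable`, the tuple paths and the transaction names are hypothetical, and only plain S and X locks are used. A request is refused whenever another transaction holds a conflicting lock on the same object, on an object that contains it, or on an object it contains.

```python
def conflicts(mode_a, mode_b):
    """Plain shared/exclusive rule: only two shared (S) locks are compatible."""
    return not (mode_a == "S" and mode_b == "S")


def related(path_a, path_b):
    """True if one object contains the other (or they are the same object)."""
    n = min(len(path_a), len(path_b))
    return path_a[:n] == path_b[:n]


class NaiveLockTable:
    """Hypothetical brute-force lock table; paths look like ("db", "file1", "page3")."""

    def __init__(self):
        self.held = []  # (transaction, path, mode) triples

    def lock(self, txn, path, mode):
        # Scan every held lock: this full scan is the cost a real lock
        # manager wants to avoid.
        for other_txn, other_path, other_mode in self.held:
            if (other_txn != txn and related(path, other_path)
                    and conflicts(mode, other_mode)):
                return False  # an explicit or implicit lock is in the way
        self.held.append((txn, path, mode))
        return True


table = NaiveLockTable()
granted_file = table.lock("T1", ("db", "file1"), "X")          # T1 locks a whole file
denied_page = table.lock("T2", ("db", "file1", "page3"), "S")  # page is implicitly locked by T1
other_file = table.lock("T2", ("db", "file2", "page1"), "S")   # unrelated file, so granted
denied_db = table.lock("T3", ("db",), "X")                     # would cover objects locked by T1 and T2
```

The sketch gives correct answers but has to compare each request against every lock held anywhere in the tree, which is exactly the inefficiency the question is about.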

A database contains a set of files, each file contains a set of pages, and each page contains a set of records. This containment hierarchy can be thought of as a tree of objects, where each node contains all its children. (The approach can easily be extended to cover hierarchies that are not trees, but we will not discuss this extension.) A lock on a node locks that node and, implicitly, all its descendants. (Note that this interpretation of a lock is very different from B+ tree locking, where locking a node does not lock any descendants implicitly.)

In addition to shared (S) and exclusive (X) locks, multiple-granularity locking protocols also use two new kinds of locks, called intention shared (IS) and intention exclusive (IX) locks. IS locks conflict only with X locks. IX locks conflict with S and X locks. To lock a node in S (respectively X) mode, a transaction must first lock all its ancestors in IS (respectively IX) mode. Thus, if a transaction locks a node in S mode, no other transaction can have locked any ancestor in X mode; similarly, if a transaction locks a node in X mode, no other transaction can have locked any ancestor in S or X mode. This ensures that no other transaction holds a lock on an ancestor that conflicts with the requested S or X lock on the node.

A common situation is that a transaction needs to read an entire file and modify a few of the records in it; that is, it needs an S lock on the file and an IX lock so that it can subsequently lock some of the contained objects in X mode. It is useful to define a new kind of lock, called an SIX lock, that is logically equivalent to holding an S lock and an IX lock. A transaction can obtain a single SIX lock (which conflicts with any lock that conflicts with either S or IX) instead of an S lock and an IX lock.

A subtle point is that locks must be released in leaf-to-root order for this protocol to work correctly. To see this, consider what happens when a transaction Ti locks all nodes on a path from the root (corresponding to the entire database) to the node corresponding to some page p in IS mode, locks p in S mode, and then releases the lock on the root node. Another transaction Tj could now obtain an X lock on the root. This lock implicitly gives Tj an X lock on page p, which conflicts with the S lock currently held by Ti.

Multiple-granularity locking must be used with 2PL in order to ensure serializability. 2PL dictates when locks can be released. At that time, locks obtained using multiple-granularity locking can be released and must be released in leaf-to-root order.

Finally, there is the question of how to decide what granularity of locking is appropriate for a given transaction. One approach is to begin by obtaining fine granularity locks (e.g., at the record level) and after the transaction requests a certain

number of locks at that granularity, to start obtaining locks at the next higher granularity (e.g., at the page level). This procedure is called lock escalation.

(b) Transaction Processing Systems. Same answer as Question No. 3.

Q. 14. What are the desirable properties of transactions in a database?

Ans. Same answer as Question No. 3.

Q. 15. What are locking techniques for concurrency control? Explain.

Ans. Concurrency control is a method used to ensure that database transactions are executed in a safe manner; that is, it is the process of managing simultaneous operations against a database so that data operations do not interfere with each other in a multi-user environment.

One example of a shared database used in an online manner is the database for an airline reservation system, which is used by many agents accessing the database from their terminals. A database could also be accessed in batch mode while it is concurrently used in online mode. Sharing the database for read-only access does not cause any problem, but if one of the transactions running concurrently tries to modify the same data item, it could lead to inconsistencies. Further, if more than one transaction is allowed to simultaneously modify a data item in a database, it could lead to incorrect values for the data item, and an inconsistent database will be created. For example, suppose that two ticket agents access the online reservation system simultaneously to see if a seat is available on a given flight; if both agents make the reservation against the last available seat on that flight, then a message of overbooking will be displayed. This will leave the data in an inconsistent state. We can say that concurrent processing of programs, processes or jobs is similar to multiprogramming, i.e. a number of jobs or programs are processed simultaneously to achieve their independent and different goals according to their own requirements.

Some concurrency problems

In the case of concurrent operation, where a number of transactions are running and using the database, we cannot make any assumption about the order in which the statements belonging to different transactions will be executed. The order in which these statements are executed is called a schedule. The processing of the statements in such a schedule during concurrent operation, where we cannot change the schedule, is called a concurrent schedule. Even when correct transactions are applied in concurrent processing, the database may become inconsistent after the completion of the transactions. Some problems that occur during scheduling and concurrent processing are:

(a) Lost update problem: Consider the two transactions given below, which access the same data item A. Each of these transactions modifies the data item and writes it back. We then see that concurrent processing of the modification of the value of A creates a problem: the old value is lost, overwritten by the most recently updated value.

(b) Inconsistent read problem: The lost update problem was caused by concurrent modification of the same data item. However, concurrency can also cause problems when only one transaction modifies a given set of data while that set of data is being used by another transaction. For example, suppose there are two transactions T5 and T6 in a schedule, and A and B represent some data items having integer values. If both transactions are processed concurrently, one will be reading the data while the other is modifying it. This creates an inconsistency for the reading transaction, which cannot tell which data is correct to read and which is incorrect.

(c) The phantom phenomenon: Consider an organization where parts are purchased and kept in stock. The parts are withdrawn from the stock and used by a number of projects. To check the extent of loss, we want to see whether the current quantity of some part purchased and received is equal to the current sum of the quantity of that part in stock plus the current quantity used by various projects. The phantom problem means

if additional items are added, this additional information reflects the transaction and query during the concurrent processing. This problem could be prevented by using the concept of locking, i.e. locking records of such type also prevents the addition of such phantom records.

(d) Semantics of concurrent transactions: When we take two different transactions in different orderings, it is not necessary that the two transactions are commutative. Suppose two transactions

A = (A + 10) + 20
A = (A + 20) + 10

give the same result. But sometimes the operations are not commutative:

Salary = (Salary + 1000) * 1.1
Salary = (Salary * 1.1) + 1000

Here the syntax of the operations is the same, but the semantics of these transactions differ during concurrent processing.

Serializability: An important part of concurrency is serial execution or serializability. When we place some independent transactions in a schedule in an order such that they execute serially, this type of execution is called serial execution. Some problems of concurrent processing are removed by serial execution, by setting or ordering the operations in a particular sequence. A non-serial schedule that is equivalent to some serial execution of transactions is called a serializable schedule. For example, of the three schedules written below, schedule-3 is a serializable schedule and is equivalent to schedule-1 and schedule-2. The purpose of serializable scheduling is to find the non-serial schedules that allow the transactions to execute concurrently without interfering with one another, and therefore produce a database state that could be produced by a serial execution. Note that serializability also removes the problem of inconsistency.
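One standard way to test whether a schedule is equivalent to some serial order is to build a precedence graph and look for a cycle. The sketch below is a minimal illustration, not the three schedules referred to in the text: the two sample schedules and all function names are made up for the example. It checks conflict serializability, which is a sufficient (slightly stricter) condition than equivalence of results.

```python
def precedence_graph(schedule):
    """Return edges (Ti, Tj) for conflicting operations where Ti's comes first.

    schedule is a list of (transaction, action, item), action being "R" or "W".
    Two operations conflict if they belong to different transactions, touch
    the same item, and at least one of them is a write.
    """
    edges = set()
    for i, (t1, a1, x1) in enumerate(schedule):
        for t2, a2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (a1, a2):
                edges.add((t1, t2))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    state = {node: "new" for node in graph}

    def visit(node):
        state[node] = "active"
        for nxt in graph[node]:
            if state[nxt] == "active":
                return True  # back edge: a cycle exists
            if state[nxt] == "new" and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(state[n] == "new" and visit(n) for n in graph)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))


# Lost update: both transactions read A before either writes it.
lost_update = [("T1", "R", "A"), ("T2", "R", "A"),
               ("T1", "W", "A"), ("T2", "W", "A")]

# Interleaved, but T1 always touches each item before T2 does.
interleaved_ok = [("T1", "R", "A"), ("T1", "W", "A"),
                  ("T2", "R", "A"), ("T2", "W", "A"),
                  ("T1", "R", "B"), ("T1", "W", "B"),
                  ("T2", "R", "B"), ("T2", "W", "B")]
```

In the lost-update schedule the graph contains both T1 -> T2 and T2 -> T1, so no serial order can reproduce it. The second schedule has only T1 -> T2 edges, so it is equivalent to running T1 and then T2.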

variable is used in loc. Thus one of the following concurrent control schemes is applied in concurrent database environment to ensure that the schedule produced by concurrent transaction are serializable.ing a database can be considered as being made up of set of data items A loc. &n serializablity the ordering of read and write operations are important before any operation to avoid any type of confusion or in. is called D>(I&1H. %anager. is a variable associated with each such data item %anipulating the value of the loc.consistency. The value of loc. the result obtained will be consistent with some serial execution of the transaction and will be considered correct. "ome concurrency controls schemes used to solve all the problems occurs during the concurrent scheduling are as discussed below) (i).) .ing "cheme (ii) Time stamp based order (iii) >ptimistic scheduling (iv) %ulti version techni ue (i) Doc. The loc. Also testing of serialzability of a schedule is not only expensive but it is impractical some time.ing 'rom the point of view of Doc.ing is done by a subsystem of D!%" and such system is called Doc. There are two types of Doc. Doc.ing scheme to control the concurrent access and manipulation of the associated data item.Definition) The given interleaved execution of some transactions is said to be serializable if it produces the same results as some serial execution of the transactions. "olution to these problems &f all schedules in concurrent environment are restricted to serializable schedule. "erializablity Test can be explained and solved with the help of 2recedence Hraph.

) "hare Doc. There are two types of time stamp) (a) ?rite time stamp (b) +ead time stamp.) #xclusive Doc. +x) where each R is represented as R ) The value of data item. The usual value assigned to each transaction is the system cloc. Any number of transactions can concurrently loc. A transaction with a smaller time stamp value is considered to be an /older7 transaction than another transaction with a larger time stamp value.s on all data items re uired by the transactions have been ac uired. value.e. a serial order is created among the concurrent transaction by assigning to each transaction a uni ue non.ing) Another method of loc. !oth of these phases is monotonic i. &n other words the release of the loc. value at the start of the transaction. &n this once a loc. &t has two phases.ing is called Two 2hase Doc.decreasing number. (ii) Time stamp based order &n time stamp based. it is not allowed to re uest any further loc. A data item loc. no other transaction can access N or not even read N until the loc. This value can then be the system wide cloc. method. is also called a read loc.ed in the exclusive mode until the shared loc. &t is called Time "tamp ordering. This leads to control and lower the degree of (oncurrency. the number of loc.ing is to provide exclusive use of data items to one transaction. is delayed.. This value can then be used on deciding the order in which the conflict between two transactions is resolved.s held decreases from maximum to zero. value.s. Two 2hase Doc.ed in the share mode until the exclusive loc. Data item R is thus represented by triple R set as R) TR.s. is released no additional loc. A data item in the exclusive mode cannot be loc. ?x.s are only increasing in first phase and decreasing in the 6nd phase. are re uested. The intention of their mode of loc. &n this way a transaction is obliged to re uest all loc.s it may need during its life before it releases any.(a) #xclusive Doc.. 
and access a data item in the shared mode but none of these transactions can modify the data item.ed in a shared mode cannot be loc.s increase from = to maximum for the transaction and a contracting phase in which the number of loc. until all the loc.. &f a transaction T loc. is also called update or writes loc.s a data item N in ar exclusive mode. . is released by all transactions holding the loc. on the data item is released. >nce a transaction starts releasing loc. is released by transaction (b) "hared Doc.ing. a growing phase in which the number of loc. A variation of this scheme is used in a distributed environment includes the site of a transaction appendid to the system wide cloc.

"ince they do not overwrite with each other. (b) -alidation phase) &n this when data items are modified it chec. that data after the procedure be rolled bac. Also read operation can read any version. "o it is called %ulti version. which follows the accounting principle of never overwriting a transaction. (c) ?rite phase) ?hen transaction passes the validation phase. (iii) >ptimistic "cheduling &n the optimistic scheduling schema.g. free even through starvation can still occur. then whole transaction be written into secondary storage data. in this R is achieved by ma. write operations can occur concurrently. (iv) %ulti version techni ue &t is also called time domain addressing scheme. &n optimistic scheduling each transaction has three phases) (a) The read phase ) This phase starts with the activation of a transaction and in this all data items are read into local variables and any modification that are made are only to those local copies. and so it is dead loc. Describe the usefu(ness of $rantin$ privi(ed$es to the users. An optimistic scheme does not use loc. the philosophy is to assume that all data items can be successfully updated at the end of a transaction and to read in the values for data item without any loc. +x ) The read time stamp value. Q.ing a new copy or version of data item R. . 1". &n this way a history of evolution of the value of data item is recorded in the database.. Any charge are achieved by entering compensating transaction e. This ends with commitment. ?ith multi version techni ue. the largest timestamp value of any /transaction that was allowed to read the current value R.ing. +eading is done when re uired and if any data item is found to be inconsistent at the end of a transaction then the transaction is rolled bac. (Ased for recovery procedure in D!%"). the largest time stamp value of any transaction that was allowed to write a value of R.?x ) The write time stamp value.

Ans. You use GRANT to assign roles or system privileges to roles or users.

The Syntax for the GRANT Command

The GRANT command takes the following syntax:

SYNTAX:
GRANT role or system privilege [, role or system privilege] TO user or role or PUBLIC [, user or role] [WITH ADMIN OPTION]

The GRANT command can take any number of system privileges and roles and assign them to any number of users or roles. By specifying that you want to grant a role or system privilege to PUBLIC, you are specifying that you want that role or privilege to be granted to all users in the system. The same command works whether you are assigning these roles or system privileges to an individual user or to a role that in turn can be assigned to many users.

The REVOKE command is just the opposite of the GRANT command; it will take a role or system privilege away from a user or role:

REVOKE role or system privilege [, role or system privilege] FROM user or role or PUBLIC [, user or role]

Q. 1*. Define the term Generalization.

Ans. Generalization seems to be a simplification of data, i.e. bringing the data from un-normalized form to normalized form.

• Generalization is a bottom-up design process: combine a number of entity sets that share the same features into a higher-level entity set.
• Specialization and generalization are simple inversions of each other; they are represented in an E-R diagram in the same way.
• The terms specialization and generalization are used interchangeably.

Q. 19. What is cascading roll back?

Ans. A cascading rollback occurs in database systems when a transaction (T1) causes a failure and a rollback must be performed. Other transactions dependent on T1's actions must also be rolled back due to T1's failure, thus causing a cascading effect. That is, one transaction's failure causes many to fail.

Q. 20. Explain various recovery techniques based on deferred update.

Ans. Refer to Q.No. 11.

Q. 21. What is a log record? What fields does it contain? How is it used for database recovery?

Ans. Refer to Q.No. 11.

Q. 22. What does role name signify? Explain with example.

Ans. Oracle provides for easy and controlled privilege management through roles. Roles are named groups of related privileges that you grant to users or other roles. Roles are designed to ease the administration of end-user system and object privileges. However, roles are not meant to be used for application developers, because the privileges to access objects within stored programmatic constructs need to be granted directly. These properties of roles allow for easier privilege management within a database:

• Reduced privilege administration: Rather than explicitly granting the same set of privileges to several users, you can grant the privileges for a group of related users to a role, and then only the role needs to be granted to each member of the group.

• Dynamic privilege management: If the privileges of a group must change, only the privileges of the role need to be modified. The security domains of all users granted the group's role automatically reflect the changes made to the role.
• Selective availability of privileges: You can selectively enable or disable the roles granted to a user. This allows specific control of a user's privileges in any given situation.
• Application awareness: Because the data dictionary records which roles exist, you can design database applications to query the dictionary and automatically enable (and disable) selective roles when a user attempts to execute the application via a given username.
• Application-specific security: You can protect role use with a password. Applications can be created specifically to enable a role when supplied the correct password. Users cannot enable the role if they do not know the password.

Q.1 What is data masking?

Sol:
• Data masking is the process of replacing sensitive data with randomly generated data that is valid and functional for application processing. It ensures that the sensitive data is replaced with realistic but not real data. The result is that sensitive information is not available to users outside of authorized environments.
• Data masking is done while provisioning non-production environments so that copies created to support test, analysis, and development processes are not exposing sensitive information. Without masking, these policies can risk exposing sensitive data.
• De-identification or masking is the process of de-identifying (masking) specific data elements within data stores. dgmasker preserves the application and relational integrity of the data set, and transforms it to meet application business rules. The data may be altered in a number of ways, including encryption, character shuffling, and character or word substitution. Whatever method is chosen, the values must be changed in some way that makes detection or reverse engineering impossible.
• In data masking, the format of data remains the same; only the values are changed.
• Data masking is not the same thing as restricting the visibility of information in production databases from people who are not authorized to see it. In that situation, the data is actually present in the database and is simply not visible to the unauthorized. There are many good and justifiable reasons for taking this approach in a production

system, but adopting a "data is present but hidden" approach to the protection of data in test and development databases is a recipe for trouble. The reason is that strict controls are in place in production databases and these can present a carefully managed view. Test and development systems are different. Typically, they are an environment in which access is usually much wider. Information is visible to more people and those people often have greater privileges and low level access. From a data visibility standpoint, a test or dev system in which the data is present but hidden is a system which sooner or later will expose its data. In general, a reasonable security assumption is that the more people who have access to the information, the greater the inherent risk of the data being compromised. Modifying the existing data in such a way as to remove all identifiable distinguishing characteristics, while leaving it still usable as a test system, can provide a valuable layer of security for test and development databases.

Q.2 Why Mask Data?

Sol:
• Legal Requirements: The regulatory environment surrounding the duties and obligations of a data holder to protect the information they maintain is becoming increasingly rigorous in just about every legal jurisdiction. It is a pretty safe assumption that the standards for the security and maintenance of data will become increasingly strict in the future.
• Loss of Confidence And Public Relations Disasters: It can reasonably be said in most locations that if a data escape happens at your organization, then the formal legal sanctions applied by governmental bodies are not the only problem you will be facing. Possibly they may not even be the biggest of your immediate worries. Inappropriate data exposure, whether accidental or malicious, can have devastating consequences. Often the costs of such an event, both actual and un-quantifiable, can far exceed any fines levied for the violation of the rules. For example, what will it cost the organization if potential customers are

not willing to provide sensitive information to your company because they read an article about a data escape in the newspaper? Dealing with the public relations aftermath of seeing the company's name in the press will not be cheap. It also does not take much imagination to realize that senior management are not going to be happy about having to give a press conference to reassure the public. The public relations costs of a data escape usually far exceed the sanctions levied by governmental organizations.

• Malicious Exposure: Most people think the major risk to the information they hold is external entities (and organized syndicates) out to break in and steal the data. The assumption then follows that protecting the network and firewalls is the appropriate and sufficient response. There is no denying that such protection is necessary; however, it has been shown that in many cases the data is stolen by malicious insiders who have been granted access to the data. No firewall can keep an insider from acquiring data under such circumstances. However, by reducing the number of databases with unmasked information, the overall risk of exposure is mitigated. The external hackers, if they get through the network security, will have far fewer useable targets, and a far greater proportion of the inside personnel will have no access to the real data.

• Accidental Exposure: The risk of accidental exposure of information is often neglected when considering the security risks associated with real test data. Often it is thought that "there is no point in masking the test data because everybody has access to production anyways". Not so: the risks associated with an accidental exposure of the data remain. Often just masking the most sensitive information (credit card numbers, customer email addresses etc.) is enough to somewhat mitigate the damage associated with accidental exposure, and the masked databases remain just as functional.

• Data Masking Architectures: Fundamentally, there are two basic types of architectures which are used in the design of data masking software.

• On the Fly, Server-To-Server Data Masking Architectures

In this architecture the data does not exist in the target database prior to masking. The anonymization rules are applied as part of the process of moving the data from the source to the target. Often this type of masking is integrated into the cloning process which creates the target database.

Q.3: What are the Advantages & Disadvantages of Data Masking?

Sol:
Advantages
• The data is never present in an unmasked form in the target database.

Disadvantages
• Any errors in the process necessarily interrupt the transfer of the data.
• The ability to mask data after the transfer has completed can be troublesome. This might happen in cases where the masked target database has been built and it is subsequently decided that a specific column of information really needs to be masked. In this case, the masking software needs to have In-Situ masking capabilities (see below) or the entire clone and mask operation will need to be repeated.
• The ability to use alternative, perhaps preferred, tools to perform the cloning operation is impacted.

Q.4: Explain "In-Situ Data Masking Architectures"

Sol:

In this style, the clone of the database to be masked is created by other means and the software simply operates on the cloned database. There are two types of in-situ masking: masking rules which are executed and controlled as a standalone entity on the target, and data masking rules which are controlled by a different system which then connects to the target and controls the execution of the rules.

Advantages
• It is possible to apply additional masking operations at any time.
• The masking operations are separate from the copy process, so existing cloning solutions can be used and the data masking rules are possibly simpler to maintain.

Disadvantages
• The data is present in an unmasked state in the target database and hence additional security measures will be required during that time.

Q.5: Explain Data Masking Technique.

Sol:
Substitution

This technique consists of randomly replacing the contents of a column of data with information that looks similar but is completely unrelated to the real details. For example, the surnames in a customer database could be sanitized by replacing the real last names with surnames drawn from a largish random list. Substitution is very effective in terms of preserving the look and feel of the existing data. The downside is that a largish store of substitutable information must be available for each column to be substituted. For example, to sanitize surnames by substitution, a list of random last names must be available. Then to sanitize telephone numbers, a list of phone numbers must be available.
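As a small illustration of substitution, the Python sketch below replaces every value in one column with a value drawn at random from a replacement list. The function name `substitute_column`, the sample rows and the short surname pool are all hypothetical; a real masking run would need the much larger dataset described above.

```python
import random

# Hypothetical replacement pool; in practice this should be a large list.
SURNAME_POOL = ["Archer", "Bellamy", "Corwin", "Dunmore", "Eastman"]


def substitute_column(rows, column, pool, seed=None):
    """Return copies of rows with `column` replaced by random picks from pool.

    The original rows are left untouched and every other column is kept,
    so the masked table keeps the same shape as the real one.
    """
    rng = random.Random(seed)
    return [{**row, column: rng.choice(pool)} for row in rows]


customers = [
    {"id": 1, "surname": "Sharma", "city": "Delhi"},
    {"id": 2, "surname": "Verma", "city": "Pune"},
    {"id": 3, "surname": "Kapoor", "city": "Jaipur"},
]

masked = substitute_column(customers, "surname", SURNAME_POOL, seed=42)
```

Passing a seed only makes the example repeatable. Note that the new surname has no connection to the old one, which is what separates substitution from a reversible transformation.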

Frequently, the ability to generate known invalid data (credit card numbers that will pass the checksum tests but never work) is a nice-to-have feature. Substitution data can sometimes be very hard to find in large quantities; however, any data masking software should contain datasets of commonly required items. When evaluating data masking software, the size, scope and variety of the datasets should be considered. Another useful feature to look for is the ability to build your own custom datasets and add them for use in the masking rules.

Shuffling

Shuffling is similar to substitution except that the substitution data is derived from the column itself. Essentially, the data in a column is randomly moved between rows until there is no longer any reasonable correlation with the remaining information in the row.

There is a certain danger in the shuffling technique. It does not prevent people from asking questions like "I wonder if so-and-so is on the supplier list?" In other words, the original data is still present and sometimes meaningful questions can still be asked of it. Another consideration is the algorithm used to shuffle the data. If the shuffling method can be determined, then the data can be easily "un-shuffled". For example, if the shuffle algorithm simply ran down the table swapping the column data in between every group of two rows, it would not take much work from an interested party to revert things to their un-shuffled state.

Shuffle rules are best used on large tables and leave the look and feel of the data intact. They are fast, but great care must be taken to use a sophisticated algorithm to randomize the shuffling of the rows.

Shuffling is rarely effective when used on small amounts of data. For example, if there are only 5 rows in a table, it probably will not be too difficult to figure out which of the shuffled data really belongs to which row. On the other hand, if a column of numeric data is shuffled, the sum and average of the column still work out to the same amount. This can sometimes be useful.
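A minimal sketch of column shuffling follows. It assumes rows are held as dictionaries; the function name shuffle_column and the sample supplier data are hypothetical. The permutation comes from random.SystemRandom, which uses operating system entropy, so it cannot be replayed from a guessed seed. That is one reasonable reading of the "sophisticated algorithm" requirement, not the only one.

```python
import random


def shuffle_column(rows, column, rng=None):
    """Return copies of rows with the values of one column randomly permuted."""
    # SystemRandom draws from OS entropy, so the permutation cannot be
    # reproduced by guessing a seed (an assumption about what "sophisticated" needs).
    rng = rng or random.SystemRandom()
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [{**row, column: value} for row, value in zip(rows, values)]


# Hypothetical supplier balances standing in for a real table.
suppliers = [{"id": i, "balance": b}
             for i, b in enumerate([120, 450, 75, 980, 310, 640])]

shuffled = shuffle_column(suppliers, "balance")

# The column total is unchanged because only the row assignment moves.
print(sum(r["balance"] for r in suppliers), sum(r["balance"] for r in shuffled))
```

A random permutation can leave some values in their original rows, especially on small tables, which is one more reason shuffling suits large tables better.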

Number and Date Variance

The Number Variance technique is useful on numeric or date data. Simply put, the algorithm involves modifying each number or date value in a column by some random percentage of its real value. This technique has the nice advantage of providing a reasonable disguise for the data while still keeping the range and distribution of values in the column to within existing limits. For example, a column of salary details might have a random variance of ±10% placed on it. Some values would be higher, some lower, but all would be not too far from their original range. Date fields are also a good candidate for variance techniques. Birth dates, for example, could be varied within an arbitrary range of ±120 days, which effectively disguises the personally identifiable information while still preserving the distribution. The variance technique can prevent attempts to discover true records using known date data, or the exposure of sensitive numeric or date data.

Encryption

This technique offers the option of leaving the data in place and visible to those with the appropriate key while remaining effectively useless to anybody without the key. This would seem to be a very good option, yet for anonymous test databases it is one of the least useful techniques.

The advantage of having the real data available to anybody with the key is actually a major disadvantage in a test or development database. The "optional" visibility provides no major advantage in a test system, and the encryption password only needs to escape once and all of the data is compromised. Of course, you can change the key and regenerate the test instances, but outsourced, stored or saved copies of the data are all still available under the old password.

Encryption also destroys the formatting and look and feel of the data. Encrypted data rarely looks meaningful; in fact, it usually looks like binary data. This sometimes leads to character set issues when manipulating encrypted varchar fields. Certain types of encryption impose constraints on the data format as well. In effect, this means that the fields must be extended with a suitable padding character which must then be stripped off at decryption time.

The strength of the encryption is also an issue. Some encryption is more secure than others. According to the experts, most encryption systems can be broken; it is just a matter of time and effort. In other words, not very much will keep the national security agencies of largish countries from reading your files should they choose to do so.
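The padding constraint mentioned above can be illustrated without any real cipher. Block ciphers work on fixed-size blocks, so a field value is extended before encryption and the extension is stripped after decryption. The sketch below uses PKCS#7-style padding with an assumed 16-byte block size. The text does not name a padding scheme or block size, so both are assumptions, and the function names pad and unpad are hypothetical.

```python
BLOCK_SIZE = 16  # assumed block size in bytes; not stated in the text


def pad(data, block_size=BLOCK_SIZE):
    """Extend data to a multiple of block_size using PKCS#7-style padding."""
    n = block_size - len(data) % block_size  # always between 1 and block_size
    return data + bytes([n]) * n


def unpad(padded, block_size=BLOCK_SIZE):
    """Strip PKCS#7-style padding, rejecting malformed input."""
    if not padded or len(padded) % block_size:
        raise ValueError("invalid padded length")
    n = padded[-1]
    if not 1 <= n <= block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]


surname = "Lindqvist".encode("utf-8")  # hypothetical field value, 9 bytes
padded = pad(surname)                   # grows to one full 16-byte block
restored = unpad(padded)                # padding stripped again

print(len(surname), len(padded), restored.decode("utf-8"))
```

The stored value is longer than the original and is no longer printable text, which is why encrypted columns usually need wider, binary-safe fields.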

Mas3in$ . but is rarely useful as the entire data sanitization strategy. This effectively disguises the data content while preserving the same formatting on front end screens and reports. 1ADD7ing or truncating data is useful in circumstances where the data is simply not re uired. one in which the letter /A7 is replaced by /R7 and the letter /!7 by /%7 etc. it is very hard to write and test customer account maintenance forms if the customer name. This may not be a big worry if the re uirement is to protect proprietary business information. use a simplistic encryption scheme designed by amateurs. 1ever.ed to write such programs as assignments. 'or example. li. means replacing certain fields with a mas. besides being the generic term for the process of data anonymization. Asually the test teams need to wor.ut Data %as. Fu((in$ .ing data. address and contact details are all 1ADD values. is trivially easy to decrypt based on letter fre uency probabilities.e) 989E E9<9 ==6= <8FL 99L8 L68K F83< <FKF 96LF K6LE F9LE KF69 and after the mas. 'or example.countries from reading your files should they choose to do so. first year computer science students are often as. ever. &n fact. Anfortunately it is also one of the least desirable options from a test database standpoint. on the data or at least a realistic approximation of it.utJ8runcatin$ "imply deleting a column of data by replacing it with 1ADD values is an effective way of ensuring that it is not inappropriately visible in test environments.ing operation the information would appear as) 989E RRRR RRRR <8FL 99L8 RRRR RRRR <FKF . character (such as an R). a column of credit card numbers might loo. 'or example.

nown chec.ing operation such as) 96LF K6LE F9LE KFRR since the numbers are generated with a specific and well .sum algorithm. and feel.ed to preserve security.ing can be slow. or may not. &f the data is in a specific.ing >ut is a powerful and fast option. extremely complex to administer and can potentially leave some data items inappropriately mas. .96LF RRRR RRRR KF69 The mas. This may. then %as.en not to mas. Ta. &t would not be hard to regenerate the original credit card number from a mas. &f numerous special cases must be dealt with then mas.ing operation such as RRRR RRRR RRRR <8FL would strip the card issuer details from the credit card number.ing characters effectively remove much of the sensitive content from the record while still preserving the loo. be desirable. invariable format. out potentially re uired information. A mas.ed.e care to ensure that enough of the data is mas. Also care must be ta.
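The two points above, masking the middle digits and the weakness of masking too few digits, can be sketched as follows. The checksum is assumed to be the Luhn algorithm commonly used for card numbers, which the text does not name. The function names are hypothetical, and 4111 1111 1111 1111 is a widely used test number chosen for the demonstration, not a value from the text.

```python
def mask_middle(card, keep=4, mask_char="X"):
    """Mask every digit except the first and last `keep`, preserving spaces."""
    total = sum(ch.isdigit() for ch in card)
    out, seen = [], 0
    for ch in card:
        if ch.isdigit():
            seen += 1
            visible = seen <= keep or seen > total - keep
            out.append(ch if visible else mask_char)
        else:
            out.append(ch)
    return "".join(out)


def luhn_valid(number):
    """Luhn check: double every second digit from the right, sum, test mod 10."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def candidates_for_last_two(prefix14):
    """List the two-digit endings that make a 16-digit number pass the Luhn check."""
    endings = (f"{n:02d}" for n in range(100))
    return [prefix14 + e for e in endings if luhn_valid(prefix14 + e)]


masked = mask_middle("4346 6454 0020 5379")

test_card = "4111111111111111"  # well-known test number, assumed for the demo
survivors = candidates_for_last_two(test_card[:14])

print(masked)
print(len(survivors), test_card in survivors)
```

Masking only the last two digits leaves 100 possible endings, and the checksum rules out all but 10 of them, so such a mask hides very little.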