CHAPTER 1
INTRODUCTION

Dr. Kakoli Banerjee

Overview of DBMS (Database Management System)

A DBMS is generally defined as a collection of logically related data and a set of programs to access the data. Strictly speaking, this is the definition of a “Database System”, which comprises two components: (i) the Database and (ii) the DBMS.

[Figure: a DATABASE SYSTEM. User queries pass through the DBMS (Query Processing Software and Storage Management Software) to the DATABASE (Schema Definition and Data).]

DATABASE

A Database is a collection of logically related data that can be recorded. The information stored in the database must have the following implicit properties:-
(a) It must represent some real-world aspect, like a college or a company. The aspect represented by the database is called its “Mini-world”.
(b) It must comprise a logically coherent collection of data with a well-understood inherent meaning (semantics).
(c) The repository of data must be designed, developed and implemented for a specific purpose. There must exist an intended group of users, who have some pre-conceived applications of the data.

A Database System will have the following major organs:-
- Sources of information, from which it derives its data.
- Some related real-world events, which influence its data.
- Some intended users, who would be interested in its data.

For example, in the college database, the sources of information will be students, faculty, labs etc. The real-world events affecting the information in the database will be admissions, exams, results, placements etc. The set of intended users will be faculty, students, admin staff etc.

Database Management System (DBMS)

A Database Management System (DBMS) refers to a set of programs for defining, creating, maintaining and manipulating a database. A DBMS must facilitate the following major functions:-
- Defining of Database Schema: The DBMS must facilitate defining the database structure, i.e.
defining the data types, the relationships amongst the data and the integrity constraints to be enforced on the database. It should also facilitate specifying the access rights of authorized users.
- Manipulation of the Database: The DBMS must facilitate functions like:
  - Insertion of new data into the database.
  - Update of changed information.
  - Deletion of data that has become defunct.
  - Reading of stored information, including generation of reports.
- Sharing of a Database: The DBMS must enable concurrent access to shared data items by multiple users, while preserving the consistency of the database.
- Protection of a Database: The DBMS must protect the database against unauthorized or malicious access.
- Database Recovery: In the event of system failures, the DBMS must facilitate database recovery.

Some Important Concepts in DBMS

Transaction

A Transaction refers to a unit of work in a DBMS. It is basically a set of Data Manipulation statements. The main requirement is that a transaction must be executed atomically (indivisibly), i.e. it must be executed either fully or not at all. If the system fails during execution of a transaction, the transaction must be rolled back to its initial state (i.e. the effect of its partial execution must be undone, as if the transaction had never started).

Suppose there is a banking database table Account (Account_Number, Balance). There can be a transaction “Transfer Rs 1000/= from Account_Number 101 to Account_Number 203”. This can be coded in SQL as:-

    UPDATE Account
    SET    Balance = Balance - 1000
    WHERE  Account_Number = 101;

    UPDATE Account
    SET    Balance = Balance + 1000
    WHERE  Account_Number = 203;

Let the balance of Account_Number 101 be denoted as A and that of Account_Number 203 be denoted as B. Then the above transaction can be expressed as:-

    Read (A);
    A = A - 1000;
    Write (A);
    Read (B);
    B = B + 1000;
    Write (B);

The entire code indicated above forms a single transaction and it has to be executed atomically.
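The all-or-nothing behaviour described above can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS. The helper names (make_bank, transfer, balances), the fail_midway flag used to simulate a crash, and the opening balances of 5000 and 2000 are illustrative assumptions, not part of the notes.

```python
import sqlite3

def make_bank():
    """Create an in-memory Account table holding the two accounts from the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Account (Account_Number INTEGER PRIMARY KEY, Balance INTEGER)")
    # Opening balances are assumed values chosen for the demonstration.
    conn.executemany("INSERT INTO Account VALUES (?, ?)", [(101, 5000), (203, 2000)])
    conn.commit()
    return conn

def transfer(conn, src, dst, amount, fail_midway=False):
    """Run the debit and the credit as one transaction; undo both if anything fails."""
    try:
        conn.execute(
            "UPDATE Account SET Balance = Balance - ? WHERE Account_Number = ?",
            (amount, src))
        if fail_midway:
            # Stand-in for a system failure between the two UPDATE statements.
            raise RuntimeError("simulated failure after the debit")
        conn.execute(
            "UPDATE Account SET Balance = Balance + ? WHERE Account_Number = ?",
            (amount, dst))
        conn.commit()      # both changes become permanent together
        return True
    except RuntimeError:
        conn.rollback()    # the partial debit is undone
        return False

def balances(conn):
    """Return {Account_Number: Balance} for every account."""
    return dict(conn.execute("SELECT Account_Number, Balance FROM Account"))
```

Calling transfer(conn, 101, 203, 1000, fail_midway=True) on a fresh database should leave both balances untouched, while the same call without the flag moves the full amount. Either way, a reader never sees the debit without the matching credit.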
Imagine the consequences if the first part of the transaction is executed (i.e. Rs 1000/= has been debited from Account_Number 101) and the system fails before execution of the second part (i.e. the amount is not credited to Account_Number 203). Such a state of the database is called an inconsistent state, which is not acceptable. Under this condition, the transaction has to be rolled back to its initial state, i.e. the balance of Account_Number 101 has to be reverted to its old value. The rolled-back transaction is then restarted. Such a rollback is feasible only if the old balance of Account_Number 101 has been saved in the system.

Concurrency Control

With a view to improving the utilization of system resources, database systems enable multiple transactions to execute concurrently. This concurrent access to the database can lead to certain anomalies, like the Racing Problem, which have to be controlled.

Racing Problem

This refers to a phenomenon wherein multiple transactions compete with each other to access shared data items, leading to arbitrary end results and leaving the database in an inconsistent state.

Example: Suppose there are two transactions T1 and T2.
T1: Transfer Rs 1000 from A to B
T2: Debit Rs 500 from A
Let the initial values be A = 2000, B = 3000.

Suppose T1 and T2 execute concurrently in the following sequence:

Transaction T1  | Transaction T2 | Database State
Read (A);       |                | A = 2000, B = 3000
                | Read (A);      | A = 2000, B = 3000
A = A - 1000;   |                |
Write (A);      |                | A = 1000, B = 3000
                | A = A - 500;   |
                | Write (A);     | A = 1500, B = 3000
Read (B);       |                |
B = B + 1000;   |                |
Write (B);      |                | A = 1500, B = 4000

At the end, the expected values are A = 500 and B = 4000, whereas in the above sequence of operations the actual values are A = 1500 and B = 4000, which is wrong. T2 computed its result from the value A = 2000 that it had read before T1 wrote A, so T1's debit was overwritten. The data has thus lost its consistency. There is therefore a requirement to control concurrent access to shared data items, and such control is called Concurrency Control.
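The interleaving in the Racing Problem example can be replayed step by step. The sketch below is only an illustration: run_schedule and the tuple encoding of each step are assumed names and conventions, and a plain dictionary stands in for the database. Each transaction works on its own private copy of what it has read, as in the Read/Write notation used above.

```python
def run_schedule(schedule, db):
    """Replay (transaction, operation, item, delta) steps against a dict database."""
    local = {}  # (transaction, item) -> that transaction's private copy
    for txn, op, item, delta in schedule:
        if op == "read":
            local[(txn, item)] = db[item]
        elif op == "add":
            local[(txn, item)] += delta
        elif op == "write":
            db[item] = local[(txn, item)]
    return db

# The interleaved sequence from the example: T2 reads A before T1 writes it.
interleaved = [
    ("T1", "read", "A", 0),
    ("T2", "read", "A", 0),
    ("T1", "add", "A", -1000),
    ("T1", "write", "A", 0),
    ("T2", "add", "A", -500),
    ("T2", "write", "A", 0),
    ("T1", "read", "B", 0),
    ("T1", "add", "B", 1000),
    ("T1", "write", "B", 0),
]

# For comparison, a serial order: all of T1 first, then all of T2.
serial = ([step for step in interleaved if step[0] == "T1"]
          + [step for step in interleaved if step[0] == "T2"])

lost_update = run_schedule(interleaved, {"A": 2000, "B": 3000})
expected = run_schedule(serial, {"A": 2000, "B": 3000})
print(lost_update)  # {'A': 1500, 'B': 4000}
print(expected)     # {'A': 500, 'B': 4000}
```

The two runs differ only in the order of the steps, which is the point of the example: the same operations give the wrong value of A when interleaved and the expected one when run serially.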
One possible solution is that whenever a transaction needs to access a data item in order to update it, the data item is locked in exclusive mode and is unlocked only after the update is completed, as shown below:-

Transaction T1 | Transaction T2 | Remarks on Locking                                                          | Database State
Lock-X (A);    |                | Exclusive lock on A granted to T1                                           | A = 2000, B = 3000
Read (A);      |                |                                                                             |
               | Lock-X (A);    | Not granted, so T2 goes to Wait State                                       |
A = A - 1000;  |                |                                                                             |
Write (A);     |                |                                                                             | A = 1000, B = 3000
Unlock (A);    |                | Now the lock is granted to T2 and it proceeds with its next instruction     |
               | Read (A);      |                                                                             | A = 1000, B = 3000
               | A = A - 500;   |                                                                             |
               | Write (A);     |                                                                             | A = 500, B = 3000
Lock-X (B);    |                |                                                                             | A = 500, B = 3000
Read (B);      |                |                                                                             |
B = B + 1000;  |                |                                                                             |
Write (B);     |                |                                                                             | A = 500, B = 4000

So, with the use of locks, the end result is as expected. Locking of this kind is one form of Concurrency Control supported by a DBMS.

File Processing System

Before the evolution of DBMS, dedicated systems known as “File-Processing Systems” were in vogue to handle the data repositories of organizations. Such systems needed a dedicated set of application programs to add information to the files, to extract information from the files and to update the existing information. The structure of the files used to be hard-coded in the application programs. Normally, such application programs were written by different programmers in different languages, as and when the need arose. Also, the same information used to be stored at multiple places, in different formats, on different machines, which were not even interconnected.

Limitations of a File Processing System

(i) Data Redundancy and Inconsistency: Since the same information is stored at multiple places, it causes data-inconsistency problems during updates.
(ii) Difficulty in Accessing Data: Suppose there exists some information in the files, but the existing set of application programs does not support extraction of that information.
Under such situations, the application programs need to be updated, which is an inconvenient, time-consuming and costly solution.
(iii) Data Isolation: The information is scattered over a large number of files, on a number of stand-alone (not networked) machines, making it very difficult to process queries that need information to be extracted from multiple locations.
(iv) Difficulty in Enforcing Integrity Constraints: Enforcement of integrity constraints has to be handled at the application program level, making the programs very complex. The redundancy of information makes this task all the more difficult.
(v) Atomicity Problems: Since the information needed to roll back a transaction may not be readily available in a file-processing system, ensuring atomicity of transactions is difficult.
(vi) Difficulty in Concurrency Control: It is complex to build concurrency control features at the application program level.
(vii) Security Problems: Since the information is scattered and does not have a centralized access path, user access rights cannot be enforced reliably.

Features of a Database System

A DBMS will support the following features:-

(a) Data Dictionary: A Database System will support a Data Dictionary (or Data Directory or DBMS Catalog), which contains information like the Data Types, the Relationships amongst the data and the Data Constraints of the underlying database. In addition, it contains information about the Authorized Users of the database, such as their Access Rights. Since this information defines the nature of the data stored in the database, it is called metadata (data about data). This information makes the DBMS software independent of its underlying database. When a need arises to change the structure of the data, no changes need to be made to the DBMS software; only the dictionary is updated to reflect the changes. In a file processing system, by contrast, the application programs would need to be changed.
Also, this feature makes the DBMS software generic. The same DBMS can be used for different organizations having entirely different sets of data; the distinguishing feature will be the information stored in the Data Dictionary. This feature of a DBMS is generally referred to as the “Self-Describing Nature of a Database”, since the information stored in the Data Dictionary fully describes the nature of the data stored in the Database.

(b) Storage Management: The DBMS supports a File Manager to manage the allocation of disk space for the DBMS files. It also supports a Buffer Manager to manage the memory buffers used for processing database information. Whenever some information is to be updated, it is first read from the files into a buffer, where it is manipulated, and then the updated information is written back into the files.

(c) Language Interfaces: The DBMS supports language interfaces with 4GL languages like PL/SQL for data manipulation applications.

(d) Transaction Management: The DBMS ensures atomicity of transaction processing. A Transaction, when executed, transforms the database from one consistent state to another consistent state. During its execution, a log of all the operations performed by the Transaction is maintained in a system Log File. If a Transaction fails during its execution, the log file is used to roll back the transaction during recovery of the database. This ensures atomicity of transaction processing.

(e) Concurrency Control: The DBMS will support concurrency control tools for permitting multiple users or application programs to access the database concurrently, while preserving the consistency of the database.

(f) Security Management: The Security Mechanism of a Database System will ensure that only authorized users can access the database, and only to the extent explicitly authorized by the Database Administrator. The authorized Access Rights are explicitly stored in the Data Dictionary.
The access by each user and the type of operations performed on various data will be monitored and controlled by the DBMS. This will protect the database against unauthorized or malicious access.

(g) Database Recovery: Since the DBMS maintains a log of all transactions being executed, it enables recovery of the underlying database in the event of failures. For example, if a Transaction fails during its execution, it is rolled back to its initial state, thus reverting to the consistent state that existed prior to the commencement of the failed transaction. This is made possible by the information stored in the system log file. Also, the DBMS will support taking periodic backups, which are used to recover databases in case of catastrophic failures like a Disk Crash.

DATA MODELS

A Data Model defines the underlying structure of a Database. It comprises a collection of conceptual tools for describing the Data, the Data Relationships, the Data Semantics and the Data Integrity Constraints.

CATEGORIES OF DATA MODELS

Basically, there are three categories of data models: Object Based Logical Models, Record Based Logical Models and Physical Data Models. The two logical categories are described below:-
(a) Object Based Logical Models.
(b) Record Based Logical Models.

(a) Object Based Logical Models. The Object Based Logical Models view the universe as a collection of objects.

(i) Entity-Relationship Model.
- An Entity refers to a real-world “object” or “concept” that is distinguishable from other objects and concepts in the real world. For example, a person, a bank account and a payment are all entities of different kinds.
- An Entity will have a set of properties, known as Attributes; for example, the Entity “Account” may have attributes like “Account-Number”, “Current-Balance” etc.
- Each attribute will have a set of permitted values, called its Domain; for example, the domain of the Balance of an account can be the set of positive real numbers.
- A collection of entities of the same kind, having the same set of attributes, is called an “Entity Set”.
- A Relationship refers to an association amongst entities. For example, in a banking database, an entity “Customer” can have the relationship “Depositor” with another entity “Account”.
- A set of Relationships of the same kind, having the same set of attributes, is called a Relationship Set. The E-R Model represents a database as a collection of Entity Sets and Relationship Sets.
- The E-R Model also specifies certain constraints, like Mapping Cardinalities, i.e. whether a relationship is one-to-one, one-to-many, many-to-one or many-to-many.
- The E-R Diagram below depicts two Entity Sets “STUDENT” and “COURSE” and a Relationship Set “RESULT”, indicating the marks obtained by students in different courses.

[Figure: E-R diagram with entity sets STUDENT and COURSE connected by the relationship set RESULT; S_Name appears as an attribute of STUDENT.]

(ii) Object-Oriented Model. Like the E-R Model, this model also represents a database as a collection of Objects. An Object Body encapsulates Data (Variables) as well as Methods (Functions) to manipulate that Data. Objects that contain the same type of Data Variables and the same type of Functions are grouped together as a Class. Thus, a Class may be viewed as a Type Definition of the Objects. The only way an Object “A” can access the Data Items of another Object “B” is by invoking the Methods of “B”, through B's Interface. The methods defined within an object are made visible to the external world through its Interface.

[Figure: an OBJECT containing Variables and Functions, exposed through an Interface.]

The structure of an object-oriented database is modeled as a set of classes, and the database will comprise objects belonging to those classes.

(b) Record Based Logical Models. These models describe data at the Logical Level as a collection of fixed-format Records of different types. Each Record Type can have a fixed number of Fields (or Attributes) and each Field is usually of fixed length.
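One practical consequence of fixed-format records is that the position of any record can be computed rather than searched for. The sketch below illustrates this with Python's struct module. The record layout (a 4-byte Roll_No, a 20-byte S_Name and a 4-byte Semester), the sample rows and the helper names pack_record and read_record are assumptions made for the illustration.

```python
import struct

# Hypothetical fixed-format record: Roll_No (4-byte int), S_Name (20 bytes),
# Semester (4-byte int). "<" selects a packed layout with no padding.
RECORD = struct.Struct("<i20si")

def pack_record(roll_no, name, semester):
    """Encode one record; struct pads the name with zero bytes to exactly 20 bytes."""
    return RECORD.pack(roll_no, name.encode("ascii"), semester)

def read_record(data, index):
    """Fetch the index-th record by computing its byte offset directly."""
    offset = index * RECORD.size
    roll_no, raw_name, semester = RECORD.unpack_from(data, offset)
    return roll_no, raw_name.rstrip(b"\x00").decode("ascii"), semester

# Three sample records stored back to back, as they might be in a data file.
data = b"".join([
    pack_record(1, "Asha", 3),
    pack_record(2, "Ravi", 3),
    pack_record(3, "Meena", 5),
])
```

Because every record occupies RECORD.size bytes, read_record(data, 2) can jump straight to the third record without scanning the first two.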
Use of fixed-length Records simplifies the Physical Level implementation of a database. The most widely used Record Based Logical Models are:-

(i) Hierarchical Model. This is one of the oldest models, dating back to the 1960s. The first commercial DBMS based on this model was the “Information Management System” (IMS), which IBM began developing in 1966. At one time, it was the most used DBMS. In the Hierarchical Model, the Data is represented as Records, and the Records are organized as a collection of Trees. The relationships among the data are represented by Links, which can be viewed as pointers. The tree structure requires that each record have only one parent record. Thus, it permits modeling of one-to-many relationships only (not many-to-many relationships) amongst the Records.

The following diagram, showing an Academic Database in the Hierarchical Model, represents Records of three types, “Course”, “Teacher” and “Student”. Its links indicate the relationship “Offered By” from “Course” to “Teacher” (the faculty offering a course) and the relationship “Attended By” from “Course” to “Student” (the students attending a course).

[Figure: HIERARCHICAL MODEL. Course is the parent record, linked to Teacher by “Offered By” and to Student by “Attended By”.]

It does not indicate the relationships “What are the courses being offered by a faculty”, “What are the courses being attended by a student”, “Who are the students being taught by a faculty” and “Who are the faculty teaching a student”. This is due to the limitation of the tree structure that a node can have only one parent node; we can therefore represent only one-to-many relationships but not many-to-many relationships.

(ii) Network Model. Like the Hierarchical Model, this Model also represents a database as a collection of Records, but the Records are organized as a collection of arbitrary graphs (or Networks). Thus a Record can have any number of parent records, and the model thus supports many-to-many relationships amongst records.
The relationships among the records are represented as links (pointers). Since this Model supports many-to-many relationships amongst the records, it is considered more versatile than the Hierarchical Model.

The above database can be better modeled in the Network Model, as indicated below. It contains additional information, i.e. the relationship “Offers” from “Teacher” to “Course” and the relationship “Attends” from “Student” to “Course”. Since the Hierarchical Model can strictly model only Tree Structures, it was not possible to depict “Offers” and “Attends” in the Hierarchical Model. The Network Model also depicts the relationships “Teaches” and “Taught By” between “Teacher” and “Student”.

[Figure: NETWORK MODEL. Course, Teacher and Student records linked by “Offered By” / “Offers”, “Attended By” / “Attends” and “Teaches” / “Taught By”.]

(iii) Relational Model. This is the most modern and most commonly used of the Record Based Models, and it has been widely accepted. The Relational Model represents a database as a collection of Tables, which hold both the data and the relationships amongst the data. Each Table is called a Relation and is assigned a unique name. Each Relation has a number of Columns, representing the Fields (or Attributes) of the Relation. Each Field is also uniquely named. A Relation (or Table) can have an unlimited number of Rows and each Row represents an Instance of the Relation. A Row is also termed a Tuple. Each Tuple will be unique in a Relation, so a Relation can be viewed as a set of Tuples of the same type. The relationships amongst the tables are modeled as Foreign Key-Primary Key relationships.

The “Course-Student-Teacher” Database Schema in the Relational Model will be represented by six Tables: three tables to represent entities, i.e. STUDENT (giving details of all students), TEACHER (giving details of all teachers) and COURSE (giving details of all courses); and three tables to represent relationships, i.e.
COURSE-TEACHER (indicating relationships OFFERED BY and OFFERS), COURSE-STUDENT (indicating relationships ATTENDS and ATTENDED BY) and TEACHER-STUDENT (indicating relationships TAUGHT BY and TEACHES).

STUDENT
Roll_No | S_Name | Branch | Semester | Section | S_Address

COURSE
Sub_Code | Sub_Title | Semester | Branch | Contact_Hrs

TEACHER
Fac_Code | Fac_Name | Desig | Dept | Fac_Address

COURSE-TEACHER
Sub_Code | Fac_Code

COURSE-STUDENT
Sub_Code | Roll_No

TEACHER-STUDENT
Fac_Code | Roll_No

The Relational Model has become extremely popular because:-
(a) It is extremely simple and easy to implement.
(b) It has a strong mathematical foundation.
(c) It has been highly standardized.

SCHEMAS AND INSTANCES

Schema. Database Schema refers to the overall structure of a database. Once defined, schemas are rarely changed. A Database System will have several Schemas, partitioned according to its levels of abstraction.

Instance. It refers to the actual collection of data (a Snapshot of the data) existing in the database at a particular moment of time. Since a database will continuously experience insertion of new data, deletion of defunct data and update of changed data, the Instance will be under continuous change.

DATA ABSTRACTION & VARIOUS SCHEMAS OF A DATABASE

There are three levels of data abstraction in a database, and each level is described by a schema, as explained below:-

(a) Physical Level. This is the lowest level of abstraction. At this level, a Physical Schema describes “how data is physically stored”. The Physical Schema may describe complex structures used to store the data, with the sole aim of achieving efficient access to the data.

(b) Logical Level. This is the intermediate level of abstraction. At this level, a Logical Schema (or Conceptual Schema) describes “what data is stored in the database” and “what are the relationships amongst the data”.
This Schema is used by Database Administrators, who decide what information is to be kept in the Database. It describes the logical structure of the database, the data types and the integrity constraints. As compared to the Physical Level, the Database at the Logical Level is described by a relatively small number of simpler structures, but the implementation of these simple structures may be quite complex at the Physical Level. The user operating at the Logical Level need not be aware of the complexities at the Physical Level.

(c) View Level. This is the highest level of abstraction. At this level, there will be many Views, defined for different categories of users. A View for a certain group of users describes “what subset of the database is to be made visible” to that group, namely the subset which that group needs to access. There may be many Views, tailored to the specific needs of various users. At the View Level, the main goal is to provide efficient and user-friendly human interaction with the system, so the interface at this level is made as simple as possible. A user does not have to be aware of the complexities at the conceptual level and the physical level.

DATA INDEPENDENCE

The ability of a DBMS to modify its Schema definition at one level, without affecting the Schema definition at the next higher level, is called Data Independence. There are two levels of Data Independence:-

(a) Physical Data Independence. It is the ability of a DBMS to modify the Physical Schema without causing any changes in the schema at the logical level and at the view level. Modifications at the Physical Level are driven by advancements in hardware technology and by the requirement to upgrade hardware for improving system performance.

(b) Logical Data Independence.
This refers to the ability of a DBMS to modify the Logical Schema without causing any changes in the application programs at the view level. Modifications at the Logical Level are necessitated by the need to alter the Logical Structure of the database. Logical Data Independence is much more difficult to achieve than Physical Data Independence, since the application programs are heavily dependent on the logical structure of the database.

DATABASE LANGUAGES

A DBMS will support two kinds of languages: a Data Definition Language (DDL) to specify the Database Schema, and a Data Manipulation Language (DML) to enable accessing and manipulation of the data stored in the database.

(a) DDL. A database schema is specified by a set of definitions expressed in DDL. In a Relational Database, the result of interpreting DDL statements is a set of Tables that are stored in a special file called the Data Dictionary (or Data Directory or DBMS Catalog). The data stored in the Data Dictionary is called Metadata, i.e. data about data. Whenever the database is to be accessed, the DBMS first refers to the Data Dictionary to determine the structure of the data to be accessed; only then does it access the actual data in the database. Thus the Data Dictionary is accessed during the processing of each query.

The storage structure and access methods used by the database system are specified by a set of definitions in a special type of DDL called the Data Storage and Definition Language. The result of interpreting these definitions is a set of physical schema structures and a set of access methods supported by the system. These details are usually hidden from the database users.

(b) DML. A DML is a language that enables users to access and manipulate the data stored in the database. A DML query is a statement specifying information to be accessed for retrieval or for insert/update/delete.
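The split between DDL and DML can be seen in a few lines. The sketch below is an illustration only: it assumes SQLite (through Python's built-in sqlite3 module) as the DBMS, reuses a few STUDENT column names from the relational schema earlier in these notes, and invents the sample rows. In SQLite, the catalog table that records schema definitions is named sqlite_master.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines structure only; no rows exist yet.
conn.execute("""
    CREATE TABLE STUDENT (
        Roll_No  INTEGER PRIMARY KEY,
        S_Name   TEXT NOT NULL,
        Semester INTEGER
    )
""")

# The definition is recorded as metadata in SQLite's catalog table.
catalog_entry = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE name = 'STUDENT'"
).fetchone()

# DML: insert, update, delete and retrieve the actual data (sample rows are invented).
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?)",
    [(1, "Asha", 3), (2, "Ravi", 3), (3, "Meena", 5)],
)
conn.execute("UPDATE STUDENT SET Semester = 4 WHERE Roll_No = 2")
conn.execute("DELETE FROM STUDENT WHERE Roll_No = 3")
rows = conn.execute(
    "SELECT Roll_No, S_Name, Semester FROM STUDENT ORDER BY Roll_No"
).fetchall()

print(catalog_entry)  # ('table', 'STUDENT')
print(rows)           # [(1, 'Asha', 3), (2, 'Ravi', 4)]
```

The CREATE TABLE statement changes only the catalog, while the INSERT, UPDATE, DELETE and SELECT statements touch only the stored rows.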
The portion of a DML that involves information retrieval is called a query language. The goal of a DML is to provide an efficient and friendly human interface for the following operations on a database:
(i) Retrieval of information stored in the database.
(ii) Insertion of new information into the database.
(iii) Deletion of information from the database.
(iv) Update of information stored in the database.

There are two types of DMLs:-
(i) Procedural DMLs. A query in a procedural DML requires the user to specify not only “what data is required to be extracted from the database” but also “how to extract those data”.
(ii) Non-Procedural DMLs. A query in a Non-Procedural DML requires the user to specify only “what data is needed”, without specifying how to get those data.

Non-Procedural DMLs are easier to learn and to use than Procedural DMLs. However, since Non-Procedural DMLs do not specify “how to get the data”, queries in Non-Procedural DMLs may not generate code as efficient as the equivalent queries in Procedural DMLs. This limitation of Non-Procedural DMLs is overcome by performing query optimization at the System Level.

OVERALL DATABASE STRUCTURE

In a Database System, the OS provides the basic services and the DBMS is built on that base. The functional components of a database system can be broadly divided into:-
(a) Query Processor Components
(b) Storage Manager Components

(a) Query Processor Components. These include:-

(i) DML Compiler. It translates DML statements into low-level instructions that are understood by the Query Evaluation Engine. The compiler also optimizes the DML Queries for efficient execution by the Query Evaluation Engine. One of the inputs for Query Optimization is the statistics gathered from the execution of previous queries. Such statistics are kept in the Data Dictionary.

(ii) Pre-Compiler for Programs in 4GL.
Programs written in Fourth Generation Languages (4GL) like PL/SQL have DML Queries embedded in them. This is done to supplement a Query Language like SQL with the constructs required for implementing loops (While ... Do) etc. The Pre-Compiler compiles the programs and interacts with the DML Compiler to generate object code. This code can be executed directly or saved for later execution as and when required.

(iii) DDL Interpreter. It interprets DDL statements and converts them into a set of tables (called metadata), which are saved in the Data Dictionary.

(iv) Query Evaluation Engine. It executes the low-level instructions generated by the DML Compiler and produces results.

(b) Storage Manager Components. These components provide the interface between the data stored in the database and the query processor. The Storage Manager components include:-

(i) Authorization & Integrity Manager. While processing a query, the system fetches the necessary information regarding Authorization Rights and Integrity Constraints from the Data Dictionary and performs the following functions:-
(a) It ensures that the users have the required access rights; only then are their queries processed.
(b) Before a query is processed, it ensures that execution of the query will not violate any of the existing integrity constraints.

(ii) Transaction Manager. This component ensures that concurrent transactions proceed without conflict and that the database remains in a consistent (correct) state despite system failures.

(iii) File Manager. It manages the allocation of disk space for the storage of DBMS files.

(iv) Buffer Manager. It is responsible for fetching data from the disk into main memory buffers for processing, and then writing the updated data back onto the disk.

In addition, there are the following components of the physical system implementation:-
(i) Data Files, which store the database itself.
(ii) Data Dictionary. It is a metadata file, which stores the database schema. Since it is accessed very frequently, great emphasis needs to be placed on designing it for efficient access to the metadata.
(iii) Indices, which enable fast access to indexed data items.
(iv) Statistical Data. It stores statistical information about the processing of previous queries. This information is used by the Query Processor to optimize queries.

[Figure: OVERALL STRUCTURE OF DBMS. Unskilled Users, Application Programmers and DML Users at the top; the Query Processor (Pre-Compiler, DML Compiler, DDL Interpreter, Query Evaluation Engine) and the Storage Manager (Authorization Manager, Transaction Manager, Buffer Manager, File Manager) in the middle; Disk Storage (Data Files, Indices, Statistics, and the Data Dictionary with Schema and Access Rights) at the bottom.]

Functions of a Database Administrator (DBA)

The DBA is the custodian of the Database System placed under their control and is responsible for the following functions:-
1. Create the Conceptual Schema and update it periodically to adapt to changed requirements.
2. Implement efficient Storage Structures and Access Methods.
3. Liaise with the Users to ensure that the information required by the Users is made available.
4. Ensure system security through Grant and Revoke of Access Rights to the Users. A user must have only as many rights as their role in the organization requires: nothing more, nothing less.
5. Ensure Physical Security of the Database against malicious access and accidents like fire etc.
6. Take periodic backups and keep the archived data safely.
7. Execute immediate recovery procedures in case of failures.
8. Monitor the system performance. In case of degradation in system performance, perform tuning procedures. If necessary, upgrade the system (hardware / software) to meet the changed requirements of the organization.
9. Ensure sufficient Disk Space is always available. If needed, upgrade the Disk Drives to meet the increased requirements.
10. Liaise with the DBMS vendor to obtain necessary technical support and to obtain the necessary tools & software upgrades, whenever made available by the vendor.

Characteristics of a Database System, which distinguish it from a conventional File Processing System

In a traditional file system, each user defines & implements the files needed for a specific application as a part of programming the application itself. Multiple users of the same set of data will create replicated sets of files, specific to their respective applications. This redundancy in the definition & storage of data results in higher storage costs and in data inconsistencies during updates. In a database approach, on the other hand, a single repository of data is maintained, which is defined once and then accessed by the various users of the data.

The main characteristics of a database approach, which distinguish it from a file-processing approach, are:-

(i) Self-Describing Nature of a Database System. A database contains not only the data, but also a complete definition of the data structure, data types & data constraints. This additional information is called meta-data, and it is stored in a file called the Data Dictionary (also called the DBMS Catalog). The information stored in the Data Dictionary is accessible to the DBMS software. This additional information makes the DBMS software independent of its applications. When a need arises to change the structure of the data, no changes need to be made to the DBMS software; only the meta-data in the Data Dictionary needs to be changed. This feature enables the DBMS software to be adapted for any application. The same DBMS will work for a college, a bank or a factory, whereas in a traditional file processing system the application programs would need major changes while shifting from one application to another.
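The self-describing property can be demonstrated with a routine that knows nothing about any particular application and learns the structure entirely from the catalog. This is a sketch under stated assumptions: SQLite (via Python's built-in sqlite3 module) plays the role of the DBMS, its sqlite_master table and PRAGMA table_info command play the role of the Data Dictionary, and the function name describe and the two tiny schemas are invented for the illustration.

```python
import sqlite3

def describe(conn):
    """Return {table: [column names]} using only the catalog; no table name is hard-coded."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    # PRAGMA table_info yields one row per column; index 1 holds the column name.
    return {
        table: [col[1] for col in conn.execute(f'PRAGMA table_info("{table}")')]
        for table in tables
    }

# Two unrelated "organizations" served by the same generic code.
college = sqlite3.connect(":memory:")
college.execute("CREATE TABLE COURSE (Sub_Code TEXT, Sub_Title TEXT)")

bank = sqlite3.connect(":memory:")
bank.execute("CREATE TABLE Account (Account_Number INTEGER, Balance INTEGER)")

print(describe(college))  # {'COURSE': ['Sub_Code', 'Sub_Title']}
print(describe(bank))     # {'Account': ['Account_Number', 'Balance']}

# A structural change updates only the catalog; describe() itself is untouched.
bank.execute("ALTER TABLE Account ADD COLUMN Branch TEXT")
print(describe(bank))     # {'Account': ['Account_Number', 'Balance', 'Branch']}
```

The same describe function serves both databases and picks up the added column without modification, which mirrors the claim that only the meta-data changes when the data structure changes.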
(ii) Data Abstraction
In a traditional file processing system, the structure of the data files is hard-coded in the application programs; thus any change in structure requires the related application programs to be modified accordingly. In a Database System, on the other hand, the application programs are insulated from the way the data is stored in the database. The application programs are only concerned with ‘what data’ is stored in the database and not with ‘how the data is stored’. As long as the contents of the data remain unchanged, the database structure can be changed without affecting the existing application programs. This feature is called Data Abstraction.

(iii) Support for Multiple Views of the Data
Depending on their different needs and different levels of authorization, different users are provided different perspectives of the same data, called Views. A View refers to a subset of the stored data or a set of virtual data, i.e. data derived from the stored data. A View is not explicitly stored in the Database; only its definition is stored in the DBMS Catalog. Whenever a user or a program submits a query to access a View, the View is computed at that moment and presented to the user or the program. When the same View is accessed again, it is recomputed afresh.

(iv) Multi-User Access & Concurrency Control
A Multi-User DBMS allows multiple users to access the same database concurrently. This is achieved by including Concurrency Control Software in the DBMS, which ensures that the database remains consistent despite concurrent access by multiple users.

(v) Effective System Protection through grant of Access Rights
Access Rights are granted to the users, to the extent required for their roles in the organization. These rights are stored in the data dictionary itself. When a query is to be processed, the DBMS will first ensure that the user submitting
the query has sufficient rights for the processing of that query; only then is the query processed.

(vi) Support for efficient Recovery
When a system is restarted after a failure, log-based recovery recovers the database efficiently.

(a) Controlling Redundancy
While designing a database, the Views of different users are integrated into a single database, thus controlling redundancy. This results in reduced effort and reduced storage space. It also ensures database consistency in case of updates.

(b) Restricting unauthorized access
The user access rights are stored in the data dictionary. Whenever a query is received from a user, it is checked for valid access rights. If the access rights exist, the query is processed; else it is rejected as ‘Invalid Query’. This prevents unauthorized access to data.

(c) Providing Multiple User-Interfaces
A DBMS provides various types of user interfaces for various categories of users:-
- Query Languages (like SQL) for skilled users
- Programming Languages (like PL/SQL) for application programmers
- Menus and Forms for naive users
- DDL for the Database Administrator

(d) Enforcing of Data Integrity Constraints
The Data Integrity Constraints are stored in the data dictionary itself. Whenever some data is inserted, updated or deleted, the constraints are automatically applied to the related data items and invalid operations are rejected.

(e) Supporting Concurrent Access
A DBMS supports concurrent access by multiple users. Despite concurrent access by multiple users, database consistency is maintained.

(f) Providing backup & recovery
A DBMS supports data backup & recovery in case of failures.

(g) Reduced Application Development Time
The development time of a new application using a DBMS is of the order of 15 - 25% of the time needed to develop an equivalent application in a traditional file processing system.
(h) Easy Adaptability
A database system can be easily adapted to changed requirements, with minimal time and cost implications.

(i) Potential for enforcing Standards
It permits the Database Administrator (DBA) to define & enforce standards among the database users. The standards can be defined for naming conventions, formats of data items, display formats, report structures etc.

CHAPTER 2
ENTITY-RELATIONSHIP MODELING

The Entity Relationship Model (ER Model) models real-world situations as a collection of entities and relationships amongst the entities.

Entity
An Entity is an object (like a “CAR”) or a concept (like an “ACCOUNT”) from the real world, which is distinguishable from other objects and other concepts. Each Entity is defined by a set of properties (called Attributes). For example, the entity “ACCOUNT” may be defined by Attributes like “ACCOUNT-NUMBER”, “BRANCH-NAME” and “BALANCE”.

Entity Set
An Entity Set refers to a collection of entities of the same kind. Each entity in an Entity Set will have the same set of attributes, and this set of attributes will distinguish the Entity Set from other Entity Sets. No other entity set will have exactly the same set of attributes, although some of the attributes of an entity set may overlap with those of other entity sets.

Relationship
A Relationship refers to an association amongst Entity Sets. For example, there may be a relationship “DEPOSITOR” between Entity Set “CUSTOMER” and Entity Set “ACCOUNT”.

Relationship Set
A Relationship Set refers to the collection of Relationships of the same kind (i.e. having exactly the same set of Attributes). A Relationship Set will inherit some of the Attributes (properties) of the associated Entity Sets. For example, the Relationship Set “DEPOSITOR” between Entity Sets “CUSTOMER” and “ACCOUNT” will inherit the Attribute “CUSTOMER-ID” from “CUSTOMER” and the Attribute “ACCOUNT-NUMBER” from “ACCOUNT”. In addition, a Relationship Set may have some of its own
attributes called “Descriptive Attributes”; for example, the relationship set “DEPOSITOR” may have a descriptive attribute “DATE-OF-OPERATION”, indicating the date on which a customer last operated an account.

Domain of an Attribute
Each attribute has a set of permitted values called its domain or value set; for example, the attribute ‘NAME’ may have a domain that is the set of character strings of a specified maximum length.

A database will consist of a set of Entity Sets and Relationship Sets, each of which will contain a number of entities of the same type or Relationships of the same type. An entity in a database may be described by a set of (attribute, data value) pairs; for example, a student in Entity Set “STUDENT” may be described by {(ROLL-NUMBER, 0990013010), (NAME, ‘Karan Singh’), (DATE-OF-BIRTH, ‘10-DEC-1985’)}.

Attribute Types

(i) Simple Vs Composite Attributes
A Simple attribute is one which is not divisible into sub-parts, like ‘BRANCH’. A Composite attribute, on the other hand, is one which can be divided into sub-parts, like ‘DATE-OF-BIRTH’, which may be divided into ‘birth-date’, ‘birth-month’ & ‘birth-year’.

(ii) Single-Valued Vs Multi-Valued Attributes
An attribute which can assume only one value at a time is called a Single-Valued attribute, like ‘name’ of an EMPLOYEE entity. An attribute which may assume a set of values at a time is called a Multi-Valued attribute, like the attribute ‘dependent’ of the Entity Set “EMPLOYEE”, which may have none, one or multiple values, depending upon the number of dependents of an employee.

(iii) Null Attribute
A null value is assigned to an attribute under any of the following three conditions:-
(a) If the attribute value is not applicable to an entity; for example, SPOUSE-NAME will not be applicable if an employee is unmarried.
(b) If the value is applicable, but not specified; for example TEL#, where an employee may not own a telephone.
(c) If the value is applicable and specified, but not known to the agency entering the information; for example, an employee may own a telephone but the number may not be known to the organization.

A null value can only be assigned to an Attribute if assigning a value to that attribute is optional (not mandatory). Mandatory attributes cannot be assigned a “Null” value.

(iv) Derived Attribute Vs Stored Attribute
A derived attribute is one whose value is not stored in the database, but is derived from the values of other stored attributes; for example, the value of the attribute ‘age’ can be derived from the attribute ‘date-of-birth’ and the current date obtained from the system.

Degree of a Relationship Set
The Degree of a Relationship Set refers to the number of Entity Sets participating in the Relationship. Most relationships are binary.

E-R Diagram Notations
Rectangle          represents an entity set.
Ellipse            represents an attribute.
Diamond            represents a relationship set.
Line               links an attribute to an entity set or an entity set to a relationship set.
Double Line        indicates total participation of an entity set in a relationship set.
Dashed Ellipse     indicates a derived attribute.
Double Ellipse     indicates a multi-valued attribute.
Double Rectangle   indicates a weak entity set.
Double Diamond     indicates a relationship set with participation of some weak entity sets.

Relationship Constraints
- Mapping Cardinalities
- Participation Constraints

Mapping Cardinalities
For a binary relationship set R between entity sets A and B, the mapping cardinality can be one of the following:-

(a) One-to-one. An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A. It is represented in the E-R Model as follows:-
[Figure: one-to-one cardinality is represented by directed lines drawn from R to both A and B.]
(b) One-to-many. One-to-many cardinality from A to B implies that an entity in A is associated with any number (nil, one or many) of entities in B; however, an entity in B is associated with at most one entity in A. It is represented in the E-R Model as follows:-
[Figure: E-R diagram of a one-to-many relationship set R from A to B.]

(c) Many-to-one. Many-to-one cardinality from A to B implies that an entity in A is associated with at most one entity in B; however, one entity in B can be associated with any number of entities in A. It is represented in the E-R Model as follows:-
[Figure: E-R diagram of a many-to-one relationship set R from A to B.]

(d) Many-to-many. Many-to-many cardinality from A to B implies that an entity in A can be associated with any number of entities in B, and one entity in B can be associated with any number of entities in A. It is represented in the E-R Model as follows:-
[Figure: E-R diagram of a many-to-many relationship set R between A and B.]

A One-to-One relationship from CUSTOMER to ACCOUNT implies that each customer can have at most one account and each account has to be “Single” (held by only one customer).
[Figure: CUSTOMER, DEPOSITOR, ACCOUNT (One-to-One Relationship)]

A One-to-Many relationship from CUSTOMER to ACCOUNT implies that each customer can have any number (nil, one or more than one) of accounts, but each account has to be Single.
[Figure: CUSTOMER, DEPOSITOR, ACCOUNT (One-to-Many Relationship)]

A Many-to-One relationship from CUSTOMER to ACCOUNT implies that each customer can have at most one account, but each account can be Joint (held by one or more customers).
[Figure: CUSTOMER, DEPOSITOR, ACCOUNT (Many-to-One Relationship)]

A Many-to-Many relationship from CUSTOMER to ACCOUNT implies that each customer can have any number (nil, one or more than one) of accounts and each account can be Joint (held by one or more customers).
[Figure: CUSTOMER, DEPOSITOR, ACCOUNT (Many-to-Many Relationship)]

Participation Constraints in Relationship Sets
- Total Participation
- Partial Participation

Total Participation
An Entity Set E is said to have total participation in relationship set R if each entity in E participates in at least one relationship through R.
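The mapping cardinalities and the total participation test described above can be checked against sample data with a minimal sketch. The function names and the sample identifiers below are hypothetical. Note that such a check only reports what the current data exhibits; the actual cardinality of a relationship set is a design decision that the data must obey.

```python
def mapping_cardinality(pairs):
    """Classify a binary relationship set, given as (a, b) pairs, from A to B."""
    b_per_a = {}
    a_per_b = {}
    for a, b in pairs:
        b_per_a.setdefault(a, set()).add(b)
        a_per_b.setdefault(b, set()).add(a)
    # Does some entity in A relate to several entities in B, and vice versa?
    a_has_many = any(len(bs) > 1 for bs in b_per_a.values())
    b_has_many = any(len(as_) > 1 for as_ in a_per_b.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"


def has_total_participation(entities, pairs, position):
    """True if every entity appears at `position` (0 for A, 1 for B) of some pair."""
    participating = {pair[position] for pair in pairs}
    return all(entity in participating for entity in entities)


# Hypothetical DEPOSITOR relationships between customers and accounts.
customers = ["C-001", "C-220", "C-310"]
accounts = ["A-203", "A-310", "A-550"]
depositor = [("C-001", "A-310"), ("C-001", "A-550"), ("C-310", "A-203")]

print(mapping_cardinality(depositor))                    # one-to-many
print(has_total_participation(customers, depositor, 0))  # False: C-220 holds no account
print(has_total_participation(accounts, depositor, 1))   # True
```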
In the E-R Diagram, Total Participation is represented by a “Double Line” drawn between the Entity Set symbol and the Relationship Set symbol.

Partial Participation
An Entity Set E is said to have partial participation in relationship set R if some of the entities in E do not participate in any relationship through R. In the E-R Diagram, Partial Participation is represented by a “Single Line” drawn between the Entity Set symbol and the Relationship Set symbol.

Example:- Suppose Entity Sets “CUSTOMER” and “ACCOUNT” are related by Relationship Set “DEPOSITOR”, and Entity Sets “CUSTOMER” and “LOAN” are related by Relationship Set “BORROWER”. Suppose a customer may have only an account, only a loan, or both; then the situation can be modeled as follows:-
[Figure: E-R diagram of CUSTOMER, ACCOUNT and LOAN related through DEPOSITOR and BORROWER, with the connecting lines labelled as Total Participation or Partial Participation.]

Concept of Key

Super Key. A Super Key of an Entity Set or Relationship Set refers to a set of attributes which, when taken collectively, uniquely determine an entity within the Entity Set or a Relationship within the Relationship Set. If K forms a Super Key (SK) of an Entity Set E, then any superset of K will also be a Super Key of E. So, a Super Key may have some extraneous (unnecessary) attributes; if these are removed, the remaining set may still form a Super Key of E.

Example:- Suppose each student in the Entity Set STUDENT (ROLL-NO, NAME, BRANCH, FATHERS-NAME, ADDRESS, DOB, TEL-NO) has a unique value of ROLL-NO. This implies that no two students can have the same ROLL-NO. Then {ROLL-NO, NAME} forms a Super Key of Entity Set STUDENT. In this set, the attribute NAME is extraneous; if it is removed, the remaining set, i.e. {ROLL-NO}, still forms a Super Key of STUDENT.

Candidate Key. A Super Key, no proper subset of which forms a Super Key, is called a Candidate Key. Thus, a Candidate Key is a minimal Super Key (i.e. a Super Key having no extraneous attributes).
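The difference between a Super Key and a Candidate Key can be illustrated with a short sketch. The helper names and the sample STUDENT rows below are hypothetical. A check on sample data can only show that an attribute set is not contradicted by the data; whether it really is a key is a constraint declared by the designer.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if no two rows agree on all of the attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_candidate_key(rows, attrs):
    """A candidate key is a super key none of whose proper subsets is a super key."""
    if not is_super_key(rows, attrs):
        return False
    return not any(
        is_super_key(rows, subset)
        for size in range(1, len(attrs))  # non-empty proper subsets only
        for subset in combinations(attrs, size)
    )


# Hypothetical sample of the STUDENT entity set.
students = [
    {"ROLL-NO": 1, "NAME": "Karan", "BRANCH": "CSE"},
    {"ROLL-NO": 2, "NAME": "Karan", "BRANCH": "IT"},
    {"ROLL-NO": 3, "NAME": "Asha", "BRANCH": "CSE"},
]

print(is_super_key(students, ("ROLL-NO", "NAME")))      # True
print(is_candidate_key(students, ("ROLL-NO", "NAME")))  # False: NAME is extraneous
print(is_candidate_key(students, ("ROLL-NO",)))         # True
print(is_super_key(students, ("NAME",)))                # False: two students share a name
```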
An Entity Set may have more than one Candidate Key.

Example:- The Entity Set STUDENT will have at least two Candidate Keys, i.e. {ROLL-NO} and {NAME, FATHERS-NAME, DOB, ADDRESS}, assuming that this combination of values is unique for every student and that no proper subset of it is.

Primary Key. The Primary Key is the Candidate Key that is designated by the database designers as the primary means of identifying entities within an entity set. In the E-R Diagram, the Primary Key Attributes are underlined with a firm line.

Primary Key of a Relationship Set
Let R be a binary relationship set between Entity Sets E1 and E2. Let K1 and K2 be the respective Primary Keys of E1 and E2. Then the Primary Key of Relationship Set R will depend upon the mapping cardinality of the relationship set, as explained below:-

(i) One-to-One Relationship
PK (R) = PK (E1) or PK (E2)

(ii) One-to-Many Relationship from E1 to E2
Here E2 is called the “Many-Side” Entity Set and E1 is called the “One-Side” Entity Set.
PK (R) = PK (E2), i.e. the Primary Key of the “Many-Side” Entity Set.

(iii) Many-to-One Relationship from E1 to E2
Here E1 is called the “Many-Side” Entity Set and E2 is called the “One-Side” Entity Set.
PK (R) = PK (E1), i.e. the Primary Key of the “Many-Side” Entity Set.

(iv) Many-to-Many Relationship from E1 to E2
PK (R) = PK (E1) U PK (E2)

Example:- For the Relationship Set DEPOSITOR between CUSTOMER and ACCOUNT, with Primary Keys denoted CN and AN respectively:-
(i) One-to-One Relationship: PK (DEPOSITOR) = CN or AN
(ii) One-to-Many Relationship from CUSTOMER to ACCOUNT: PK (DEPOSITOR) = AN, i.e. PK of ACCOUNT
(iii) Many-to-One Relationship from CUSTOMER to ACCOUNT: PK (DEPOSITOR) = CN, i.e. PK of CUSTOMER
(iv) Many-to-Many Relationship: PK (DEPOSITOR) = {CN, AN}, i.e. PK (CUSTOMER) U PK (ACCOUNT)

Concept of Weak Entity Set
An Entity Set is said to be a Weak Entity Set if it does not have sufficient attributes to form its Primary Key. On the other hand, an entity set having a primary key of its own is called a Strong Entity Set. A Weak Entity Set (say E2) will be dependent for its existence on a Strong Entity Set (say E1) to form its Candidate Key.
Then Entity Set E2 is said to be “Existence-Dependent” on E1, and E1 is said to be the “Owner Entity Set” of E2. The relationship R between E2 and E1 is called the “Identifying Relationship”. The Weak Entity Set E2 will have a set of attributes called its “Discriminator”, which together with the Primary Key of E1 will form the Primary Key of E2.
[Figure: Owner Entity Set E1 linked through an Identifying Relationship to Weak Entity Set E2.]

Example:- Suppose an Entity Set EMPLOYEE (EMP_ID, EMP_NAME, SALARY, DEPENDENTS) has an attribute DEPENDENTS which is multi-valued, i.e. an employee may have none, one or more than one dependents. This situation can be best modeled as follows:-
[Figure: EMPLOYEE (Owner Entity Set) linked through an Identifying Relationship to DEPENDENT (Weak Entity Set).]

- The Weak Entity Set DEPENDENT is Existence-Dependent on the Strong Entity Set EMPLOYEE.
- The Weak Entity Set DEPENDENT has a Discriminator Attribute D-NAME which, along with the primary key EMP_ID of EMPLOYEE, forms the Primary Key of the weak entity set DEPENDENT.

In the E-R Diagram, the Discriminator (also called Partial Key) of a weak entity set is marked by underlining it with a broken line.

Special Features of an Identifying Relationship
Normally, a situation modeled by a Weak Entity Set will have the following features:-
(i) The Identifying Relationship will be one-to-many from the Owner Entity Set to the Weak Entity Set.
(ii) The participation of the Owner Entity Set in the Identifying Relationship will be partial, and the participation of the Weak Entity Set in the Identifying Relationship will be total.

In the above example, the Weak Entity Set DEPENDENT can also be modeled as a multi-valued attribute of Entity Set EMPLOYEE. The multi-valued attribute can be used to indicate the names of the dependents of employees.
But suppose we want to record other properties of dependents, like a dependent's relationship with the employee; then the multi-valued attribute approach will not be suitable. In this case, the Weak Entity approach is the better choice, since the weak entity set DEPENDENT can have any number of attributes.

Extended E-R Features

Specialization. An entity set E may include some sub-groups of entities (say E1, E2, ..., En), such that each of these sub-groups has some distinct attributes not shared by the other sub-groups. There will also be some attributes that are common to all sub-groups. The process of designating these sub-groups within an entity set is called Specialization.
[Figure: Higher Level Entity Set (Super Class) E specialized into Lower Level Entity Sets (Sub Classes) E1, E2, ..., En.]

In the above example, an Entity Set E has been specialized into sub-groups designated as E1, E2, ..., En. E is called the “Super Class” or “Higher Level Entity Set”, and the entity sets E1, E2, ..., En are called “Sub Classes” or “Lower Level Entity Sets” of E. The common attributes of all the sub entity sets are represented with the super entity set, and the distinct attributes of each sub entity set are represented with that sub entity set.

The relationship of a Higher Level Entity Set with its Lower Level Entity Sets is called the ISA relationship. It is read as “is a”.

Inheritance of Attributes in Specialization
Each Sub Class will inherit the Attributes of its Super Class; in addition, it will have its own distinct Attributes. In the above case, each lower entity set will inherit attributes A1 and A2 of the Super Class E.

Example:- Consider an entity set ACCOUNT with attributes Account-Number and Balance. The Entity Set ACCOUNT may be specialized into different types of accounts like SAVINGS-ACCOUNT, CURRENT-ACCOUNT, FIXED-DEPOSIT (FD) and RECURRING-DEPOSIT (RD). The SAVINGS-ACCOUNT may have an attribute Interest-Rate and CURRENT-ACCOUNT may have an attribute Over-Draft.
Similarly, FD and RD have distinct attributes of their own.
[Figure: ACCOUNT specialized through an ISA relationship into SAVINGS-ACCOUNT, CURRENT-ACCOUNT, FD and RD.]

Specialization Constraints

Disjoint Vs Overlapping Specialization
Disjoint. An entity does not belong to more than one lower-level entity set, i.e. an account is either a savings-account or a current-account but not both.
Overlapping. An entity may belong to more than one lower-level entity set within a single specialization.

Total Vs Partial Specialization
Total. Each higher-level entity must belong to a lower-level entity set.
Partial. Some higher-level entities may not belong to any lower-level entity set.

Generalization. Specialization is a top-down approach, whereas Generalization is exactly the inverse of it. Generalization refers to the process of fusing several distinct entity sets into a single Higher Level Entity Set, on the basis of the commonality of their attributes. The fused sets then form the sub classes or lower level entity sets. The common attributes of the Lower Level Entity Sets are assigned to the Higher Level Entity Set. Thus, generalization is a process which proceeds in a bottom-up manner, in which multiple entity sets are synthesized into a single higher-level entity set on the basis of their common features. The higher-level entity set is termed the super-class and each lower-level entity set is termed a sub-class. In the E-R Diagram, Specialization and Generalization are represented in exactly the same manner.

Aggregation. One limitation of the E-R Model is that it cannot express relationships among relationship sets, or a relationship between a relationship set on one side and an entity set on the other side. Aggregation provides a solution in this case. Aggregation is an abstraction through which relationships are treated as higher-level entities, which can then participate in relationships with other Entity Sets or with other relationship sets.
For example, consider the relationship between R1 and E3 as indicated below.
[Figure: Relationship Set R1 between Entity Sets E1 and E2, aggregated as a Higher Level Entity Set “R1”, which participates in Relationship R2 with Entity Set E3.]

Here, the Relationship Set R1 between Entity Set E1 and Entity Set E2 has been aggregated as the Higher Level Entity Set “R1”. This Higher Level Entity Set participates in a Relationship R2 with Entity Set E3. Thus, through aggregation, we are able to represent a Relationship between Relationship Set R1 and Entity Set E3.

Example:- Suppose we have Entity Sets “EMPLOYEE”, “BRANCH” and “JOB”, which are related through a Relationship “EBJ” that indicates “which employee” is performing “what jobs” at “which branch”. There will be multiple jobs at each branch, and assume that each employee may be performing multiple jobs at one of the branches. Suppose we want to relate another Entity Set “MANAGER” to indicate:-
(i) The set of Employees managed by a manager.
(ii) The set of jobs managed by a manager.
(iii) The Branches managed by a manager (assume a manager can manage only one branch).

If we represent this scenario without the use of aggregation, then the E-R Diagram will be as follows:-
[Figure: E-R diagram relating BRANCH, EMPLOYEE, JOB and MANAGER without aggregation.]

The above scenario can be better modeled by aggregating the Relationship Set “EBJ” as a higher level Entity Set and then creating a relationship between this higher level entity set and the Entity Set “MANAGER”, as indicated below:-
[Figure: E-R diagram in which the Relationship Set EBJ among EMPLOYEE, BRANCH and JOB is aggregated and related to MANAGER.]

This modeling represents the situation more realistically, wherein the Relationship Set “EBJM” indicates “which combinations of employee-branch-job” are being managed by each manager.

Reduction of E-R Schema to Tables
An E-R Diagram can be reduced to a set of Tables, as explained below:-

(a) Tabular representation of a Strong Entity Set.
A Strong Entity Set E will be represented by a Table named “E”.
The Table will have columns as follows:-

(i) Simple, Single-Valued Attributes
There will be a column for each simple, single-valued attribute of Entity Set E.

(ii) Composite Attributes
There will be a column for each sub-part of a Composite Attribute; no column needs to be assigned for the composite attribute as such. For example, for NAME comprising First Name (FN), Middle Name (MN) and Last Name (LN), there will be three columns for FN, MN and LN. No column needs to be assigned for NAME. If NAME needs to be produced, it can be done by combining the sub-parts.

(iii) Derived Attributes
No column needs to be assigned for the derived attributes, since the values of these attributes are not stored in the database.

(iv) Multi-Valued Attributes
Each Multi-Valued Attribute (say M) will be represented by a separate Table (say named E-M), which will have a column for each of the primary key attributes of E and a column for Attribute M. Each value of the multi-valued attribute will be represented in a separate row in this table.

Let E be a Strong Entity Set with simple single-valued attributes a1, a2, ..., an. This Entity Set will be represented by a Table called E with n distinct columns, each of which will correspond to one of the attributes. Let D1, D2, ..., Dn be the domains of attributes a1, a2, ..., an respectively. The Table E will comprise a set of rows, which will be a subset of the Cartesian Product D1 X D2 X ... X Dn.

Example:-
[Figure: E-R diagram of the Entity Set STUDENT, including the derived attribute Age and the multi-valued attribute Tel-No.]

The derived attribute Age will not be represented in the STUDENT table. When required, its value will be derived from DOB. The attribute Tel-No will be represented in a separate table (say named STUDENT-TEL-NO), which will have a column for the Primary Key of STUDENT, i.e. Roll-No, and a column for Tel-No.
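The column rules above can be tried out with a minimal sketch, assuming SQLite through Python's sqlite3 module. The lower-case table and column names, the telephone numbers and the reference date are hypothetical values chosen for illustration; the roll number, name and date of birth reuse the student example given earlier in the chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simple, single-valued attributes become columns; Age is derived, so it has no column.
    CREATE TABLE student (
        roll_no TEXT PRIMARY KEY,
        name    TEXT NOT NULL,
        dob     TEXT NOT NULL          -- stored as an ISO date string
    );
    -- The multi-valued attribute Tel-No gets its own table: one row per value.
    CREATE TABLE student_tel_no (
        roll_no TEXT NOT NULL REFERENCES student(roll_no),
        tel_no  TEXT NOT NULL,
        PRIMARY KEY (roll_no, tel_no)
    );
""")
conn.execute("INSERT INTO student VALUES ('0990013010', 'Karan Singh', '1985-12-10')")
conn.executemany(
    "INSERT INTO student_tel_no VALUES (?, ?)",
    [("0990013010", "555-0101"), ("0990013010", "555-0102")],  # hypothetical numbers
)

# The roll number appears once per telephone number.
tel_rows = conn.execute(
    "SELECT roll_no, tel_no FROM student_tel_no ORDER BY tel_no"
).fetchall()

# Age is computed from DOB and a reference date instead of being stored.
as_of = "2007-01-01"  # hypothetical reference date, used so the result is reproducible
age = conn.execute(
    """
    SELECT CAST(strftime('%Y', :d) AS INTEGER) - CAST(strftime('%Y', dob) AS INTEGER)
           - (strftime('%m-%d', :d) < strftime('%m-%d', dob))
    FROM student WHERE roll_no = '0990013010'
    """,
    {"d": as_of},
).fetchone()[0]

print(tel_rows)
print(age)  # 21
```

In a live system the reference date would normally be the current date supplied by the DBMS rather than a fixed value.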
If a student has more than one Tel-No, then his Roll-No will appear that many times in this table.

The above E-R Diagram will be reduced to the following two Tables:-

STUDENT
Univ_Roll_No | Name | DOB | H-No | Street | City | Pin

STUDENT-TEL-NO
Univ_Roll_No | Tel_No

(b) Tabular representation of Relationship Sets.
Let R be a Relationship Set and let a1, a2, ..., am be the set of attributes formed by the union of the primary keys of all the Entity Sets participating in R, and let the descriptive attributes of R (if any) be b1, b2, ..., bn. Then the Relationship Set R will be represented by a Table named, say, “R”, which will have (m+n) columns, each column representing one of the attributes from the set {a1, a2, ..., am} U {b1, b2, ..., bn}.

Example:-
[Figure: E-R diagram of CUSTOMER and ACCOUNT related through DEPOSITOR.]

The Relationship Set DEPOSITOR will be represented by a table named DEPOSITOR. The Entity Sets CUSTOMER and ACCOUNT have Primary Keys C-Id and Account-Number respectively, which will also form part of the DEPOSITOR table. In addition, the DEPOSITOR table will have a column for its Descriptive Attribute “Date-of-Operation”.

The above E-R Diagram will be reduced to the following set of tables:-

CUSTOMER
C-Id | C-Name | C-address

ACCOUNT
Account-Number | Balance | Branch-Name

DEPOSITOR
C-Id | Account-Number | Date-of-Operation

Shifting of Descriptive Attributes of a Relationship Set and Merging of the Relationship Set Table with the tables of participating Entity Sets.
Depending on the Mapping Cardinality of the participating Entity Sets, the Descriptive Attributes of the Relationship Set can be shifted to one of the participating Entity Sets. Also, a Relationship Set Table can be combined with the table of one of the participating Entity Sets, as per the following conditions:-

(1) One-to-One Relationship
Suppose there is a One-to-One relationship between two entity sets; then the rows in the Relationship Set table will have one-to-one mapping with the rows in the tables of the participating entity sets.
Under this condition, it is possible to shift the descriptive attributes of the relationship set to any of the participating Entity Sets, and it is also possible to merge the table of the Relationship Set with the table of any of the participating Entity Sets, without loss of any information.

Example:-
[Figure: E-R diagram of CUSTOMER and ACCOUNT related through DEPOSITOR (One-to-One Relationship).]

As indicated above, there is a One-to-One Relationship between CUSTOMER and ACCOUNT, i.e. each customer has at most one account and each account is “Single” (i.e. owned by only one customer).

CUSTOMER
C-Id  | C-Name | C-address
C-001 | Ajay   | 320, Sector-26, Noida
C-220 | Vijay  | 110, Sector-8, RKP
C-310 | Ram    | 120, Sector-25, Noida
C-505 | Shyam  | 303, Sector-22, RKP

ACCOUNT
Account-Number | Balance | Branch-Name
A-101          | 10000   | Sec-18
A-203          | 30000   | Sec-26
A-305          | 50000   | CP
A-310          | 25000   | RKP

DEPOSITOR
C-Id  | Account-Number | Date-of-Operation
C-001 | A-310          | 10-Jan-2007
C-220 | A-101          | 23-Dec-2006
C-310 | A-203          | 03-Feb-2007
C-505 | A-305          | 27-Dec-2007

As is evident, the rows in the DEPOSITOR table have one-to-one mapping with the rows in the CUSTOMER Table and also with the rows in the ACCOUNT Table. That is, the first row of DEPOSITOR maps onto the fourth row of ACCOUNT, the second row of DEPOSITOR maps onto the first row of ACCOUNT, the third row of DEPOSITOR maps onto the second row of ACCOUNT and the last row of DEPOSITOR maps onto the third row of ACCOUNT.

Thus, the descriptive attribute Date-of-Operation of the Relationship Set DEPOSITOR can be shifted to either CUSTOMER or ACCOUNT. Also, the DEPOSITOR Table can be combined either with the CUSTOMER Table or with the ACCOUNT Table, without losing any information. The combined table will have the union of the columns of the two merged tables. Suppose the DEPOSITOR Table is merged with the CUSTOMER Table; then the CUSTOMER Table will also include the attributes Account-Number and Date-of-Operation.
The resulting set of tables will then be:-

CUSTOMER
C-Id  | C-Name | C-address             | Account-Number | Date-of-Operation
C-001 | Ajay   | 320, Sector-26, Noida | A-310          | 10-Jan-2007
C-220 | Vijay  | 110, Sector-8, RKP    | A-101          | 23-Dec-2006
C-310 | Ram    | 120, Sector-25, Noida | A-203          | 03-Feb-2007
C-505 | Shyam  | 303, Sector-22, RKP   | A-305          | 27-Dec-2007

ACCOUNT
Account-Number | Balance | Branch-Name
A-101          | 10000   | Sec-18
A-203          | 30000   | Sec-26
A-305          | 50000   | CP
A-310          | 25000   | RKP

The combined CUSTOMER Table now includes the Primary Key (AN) of ACCOUNT and the descriptive attribute Date-of-Operation of DEPOSITOR.

(2) One-to-Many Relationship
Suppose there is a One-to-Many relationship between CUSTOMER and ACCOUNT, i.e. each customer can have many accounts, but each account has to be single.

Example:-
[Figure: E-R diagram of CUSTOMER and ACCOUNT related through DEPOSITOR, with descriptive attribute Date-of-Operation (One-to-Many Relationship).]

CUSTOMER
C-Id  | C-Name | C-address
C-001 | Ajay   | 320, Sector-26, Noida
C-220 | Vijay  | 110, Sector-8, RKP
C-310 | Ram    | 120, Sector-25, Noida
C-505 | Shyam  | 303, Sector-22, RKP

ACCOUNT
Account-Number | Balance | Branch-Name
A-101          | 10000   | Sec-18
A-203          | 30000   | Sec-26
A-305          | 50000   | CP
A-310          | 25000   | RKP
A-550          | 35000   | CP
A-670          | 60000   | Sec-18

DEPOSITOR
C-Id  | Account-Number | Date-of-Operation
C-001 | A-310          | 10-Jan-2007
C-220 | A-101          | 23-Dec-2006
C-310 | A-203          | 03-Feb-2007
C-505 | A-305          | 27-Dec-2007
C-001 | A-550          | 22-Dec-2006
C-310 | A-670          | 01-Jan-2007

The rows in the DEPOSITOR table have one-to-one mapping onto the rows in the ACCOUNT Table, i.e. the table of the “Many-Side” Entity Set. That is, the first row of DEPOSITOR maps onto the fourth row of ACCOUNT, the second row of DEPOSITOR maps onto the first row of ACCOUNT, the third row of DEPOSITOR maps onto the second row of ACCOUNT, the fourth row of DEPOSITOR maps onto the third row of ACCOUNT, the fifth row of DEPOSITOR maps onto the fifth row of ACCOUNT and the last row of DEPOSITOR maps onto the last row of the ACCOUNT table.

Thus, the descriptive attribute Date-of-Operation can be shifted to ACCOUNT (the “Many-Side” Entity Set) and the DEPOSITOR Table can be merged with the ACCOUNT Table (i.e.
with the table of the “Many-Side” Entity Set), without losing any information. The resultant ACCOUNT table will also include the Primary Key C-Id of the CUSTOMER table and the descriptive attribute Date-of-Operation (DOO) of the DEPOSITOR table.

The resulting set of tables will then be:-

CUSTOMER
C-Id  | C-Name | C-address
C-001 | Ajay   | 320, Sector-26, Noida
C-220 | Vijay  | 110, Sector-8, RKP
C-310 | Ram    | 120, Sector-25, Noida
C-505 | Shyam  | 303, Sector-22, RKP

ACCOUNT
Account-Number | Balance | Branch-Name | Customer_Id | Date_of_Operation
A-101          | 10000   | Sec-18      | C-220       | 23-Dec-2006
A-203          | 30000   | Sec-26      | C-310       | 03-Feb-2007
A-305          | 50000   | CP          | C-505       | 27-Dec-2007
A-310          | 25000   | RKP         | C-001       | 10-Jan-2007
A-550          | 35000   | CP          | C-001       | 22-Dec-2006
A-670          | 60000   | Sec-18      | C-310       | 01-Jan-2007

(3) Many-to-One Relationship
Suppose there is a many-to-one relationship between CUSTOMER and ACCOUNT, which implies that each account can be “Joint” but each customer can hold only one account. In this case, the table DEPOSITOR can be combined with the “Many-Side” Entity Set table CUSTOMER.

Example:-
[Figure: E-R diagram of CUSTOMER and ACCOUNT related through DEPOSITOR (Many-to-One Relationship).]

CUSTOMER
C-Id  | C-Name | C-address
C-001 | Ajay   | 320, Sector-26, Noida
C-220 | Vijay  | 110, Sector-8, RKP
C-310 | Ram    | 120, Sector-25, Noida
C-505 | Shyam  | 303, Sector-22, RKP

ACCOUNT
Account-Number | Balance | Branch-Name
A-101          | 10000   | Sec-18
A-203          | 30000   | Sec-26

DEPOSITOR
C-Id  | Account-Number | Date-of-Operation
C-001 | A-101          | 10-Jan-2007
C-220 | A-203          | 23-Dec-2006
C-310 | A-101          | 03-Feb-2007
C-505 | A-203          | 27-Dec-2007

The rows in the DEPOSITOR table have one-to-one mapping onto the rows in the CUSTOMER Table, i.e. the table of the “Many-Side” Entity Set. Thus, the descriptive attributes of DEPOSITOR can be shifted to the “Many-Side” Entity Set CUSTOMER and the DEPOSITOR Table can be merged with the CUSTOMER Table, without losing any information. The resultant CUSTOMER table will also include the Primary Key Account-Number of the ACCOUNT table and the descriptive attribute DOO of the DEPOSITOR table.
After merging DEPOSITOR into CUSTOMER, the resulting set of tables will be:

CUSTOMER
C-Id  | C-Name | C-Address             | Account_Number | DOO
------|--------|-----------------------|----------------|------------
C-001 | Ajay   | 320, Sector-26, Noida | A-101          | 10-Jan-2007
C-220 | Vijay  | 110, Sector-8, RKP    | A-203          | 23-Dec-2006
C-310 | Ram    | 120, Sector-25, Noida | A-101          | 03-Feb-2007
C-505 | Shyam  | 303, Sector-22, RKP   | A-203          | 27-Dec-2007

ACCOUNT
Account-Number | Balance | Branch-Name
---------------|---------|------------
A-101          | 10000   | Sec-18
A-203          | 30000   | Sec-26

(4) Many-to-Many Relationship

Suppose there is a many-to-many relationship between CUSTOMER and ACCOUNT, which implies that each account can be "joint" and each customer can hold many accounts. In this case, the table DEPOSITOR cannot be combined with either entity set and has to be created as a separate table. If we tried to combine it, we would have to combine it with both entity sets, and that would add unnecessary data redundancy, which is not acceptable.

[Figure: E-R diagram for this example; the recoverable labels are CUSTOMER and ACCOUNT.]

CUSTOMER
C-Id  | C-Name | C-Address
------|--------|----------------------
C-001 | Ajay   | 320, Sector-26, Noida
C-220 | Vijay  | 110, Sector-8, RKP
C-310 | Ram    | 120, Sector-25, Noida
C-505 | Shyam  | 303, Sector-22, RKP

ACCOUNT
Account-Number | Balance | Branch-Name
---------------|---------|------------
A-101          | 10000   | Sec-18
A-203          | 30000   | Sec-26
A-305          | 50000   | CP
A-310          | 25000   | RKP

DEPOSITOR
C-Id  | Account-Number | Date-of-Operation
------|----------------|------------------
      | A-101          | 10-Jan-2007
      | A-203          | 23-Dec-2006
      | A-101          | 03-Feb-2007
      | A-203          | 27-Dec-2007
      | A-305          | 30-Dec-2007
      | A-310          | 02-Jan-2007
[The C-Id values of this table are not legible in the source.]

Now, the rows in the DEPOSITOR table have a one-to-one mapping neither with the CUSTOMER table nor with the ACCOUNT table. So, the DEPOSITOR table can be merged neither with the CUSTOMER table nor with the ACCOUNT table. Thus, there has to be a separate table for DEPOSITOR as indicated above. Also, the descriptive attributes of the relationship set cannot be shifted to the participating entity sets; they have to remain with the relationship set itself.

Tabular representation of Weak Entity Sets

Let A be a weak entity set with descriptive attributes a1, a2, ..., am. Let B be the strong entity set on which A is existence dependent.
Let the primary key of B consist of attributes b1, b2, ..., bn. The entity set A is represented by a table called A with (m+n) columns, each column representing one of the attributes from the set {a1, a2, ..., am} U {b1, b2, ..., bn}.

Example

[Figure: E-R diagram for the LOAN and PAYMENT example.]

There will be tables LOAN and PAYMENT; the PAYMENT table will also include the primary key of LOAN, i.e. Loan-No. The primary key of table PAYMENT will be {Loan-No, Payment-No}, where the attribute Payment-No is called a "Discriminator" or "Partial Key" of the table PAYMENT.

Redundancy of Tables in Weak Entity Sets

The table for the identifying relationship LOAN-PAYMENT is not required, because if we create such a table it will have only two attributes, i.e. Loan-No and Payment-No, which already form part of table PAYMENT. Thus, no table needs to be created for an identifying relationship. In case there exists a descriptive attribute of an identifying relationship, it can be shifted to the "Many-Side" entity set, i.e. the weak entity set.

Tabular representation of Generalization

The steps involved are: create a table each for the higher-level entity set and for each lower-level entity set. The table for a lower-level entity set will include its own attributes plus all the primary-key attributes of its higher-level entity set.

[Figure: generalization hierarchy with higher-level entity set ACCOUNT and lower-level entity sets SAVINGS-ACCOUNT, CURRENT-ACCOUNT, FD and RD.]

For example, in the above case there will be five tables, i.e. ACCOUNT, SAVINGS-ACCOUNT, CURRENT-ACCOUNT, FD and RD. The table ACCOUNT will have columns Account-Number and Balance; the table SAVINGS-ACCOUNT will have columns Account-Number and Interest-Rate; and the table CURRENT-ACCOUNT will have columns Account-Number and Over-Draft. The same is applicable to the tables FD and RD.
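The weak entity set and generalization mappings above can be sketched as table definitions. This is a minimal illustration using Python's standard sqlite3 module, not the notes' own schema: the lower-case identifiers, the amount columns, the loan numbers L-1 and L-2, the interest rate value and the helper is_rejected are all assumptions introduced for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
-- Strong entity set
CREATE TABLE loan (
    loan_no TEXT PRIMARY KEY,
    amount  INTEGER                      -- hypothetical attribute
);
-- Weak entity set: the owner's key plus the discriminator form the primary key
CREATE TABLE payment (
    loan_no    TEXT    NOT NULL REFERENCES loan(loan_no),
    payment_no INTEGER NOT NULL,         -- discriminator (partial key)
    amount     INTEGER,                  -- hypothetical attribute
    PRIMARY KEY (loan_no, payment_no)
);
-- Generalization: one table for the higher-level entity set ...
CREATE TABLE account (
    account_number TEXT PRIMARY KEY,
    balance        INTEGER
);
-- ... and one per lower-level entity set, carrying the higher-level key
CREATE TABLE savings_account (
    account_number TEXT PRIMARY KEY REFERENCES account(account_number),
    interest_rate  REAL
);
CREATE TABLE current_account (
    account_number TEXT PRIMARY KEY REFERENCES account(account_number),
    over_draft     INTEGER
);
""")

conn.executemany("INSERT INTO loan VALUES (?, ?)", [("L-1", 5000), ("L-2", 8000)])
# payment_no may repeat across loans; only (loan_no, payment_no) must be unique.
conn.executemany("INSERT INTO payment VALUES (?, ?, ?)", [
    ("L-1", 1, 500), ("L-1", 2, 500), ("L-2", 1, 800),
])

def is_rejected(sql, params):
    """Return True if SQLite refuses the statement with an integrity error."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False

# Same loan and same discriminator: violates the composite primary key.
duplicate_rejected = is_rejected("INSERT INTO payment VALUES (?, ?, ?)", ("L-1", 1, 500))
# Payment for a loan that does not exist: violates existence dependence.
orphan_rejected = is_rejected("INSERT INTO payment VALUES (?, ?, ?)", ("L-9", 1, 500))

conn.execute("INSERT INTO account VALUES (?, ?)", ("A-101", 10000))
conn.execute("INSERT INTO savings_account VALUES (?, ?)", ("A-101", 4.0))
# A full savings account is read by joining the two levels on the shared key.
savings = conn.execute("""
SELECT a.account_number, a.balance, s.interest_rate
FROM account AS a
JOIN savings_account AS s ON s.account_number = a.account_number
""").fetchall()
print(duplicate_rejected, orphan_rejected, savings)
```

No separate table appears for the identifying relationship LOAN-PAYMENT: its two columns are already the primary key of payment.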
Combining of Tables in Generalization

If a generalization is "Total", which implies that each entity in the super-class (higher-level entity set) is a member of at least one sub-class (lower-level entity set), no table needs to be created for the higher-level entity set. Instead, a table is created for each lower-level entity set, and each such table also includes all the attributes of the higher-level entity set in addition to its own distinct attributes.

For example, the table SAVINGS-ACCOUNT will have columns Account-Number, Balance and Interest-Rate, and the table CURRENT-ACCOUNT will have columns Account-Number, Balance and Over-Draft. The same is applicable to the FD and RD tables.

Tabular representation of Aggregation

Take the following example:

[Figure: E-R diagram with entity sets EMPLOYEE, BRANCH, JOB and MANAGER; the relationship set EBJ is aggregated into a higher-level entity set.]

In the above scenario, there will be tables for the entity sets EMPLOYEE, BRANCH, JOB and MANAGER. There will be one table for the relationship set EBJM, having attributes E#, B#, J# and Mgr-Id. No table is required for the relationship set EBJ, because such a table would be a subset of table EBJM.

EBJM
E# | B# | J# | Mgr-Id

[Figure: E-R diagram of an airline reservation system; recoverable labels: SCHEDULE, AIRCRAFT, NAME.]

[Figure: E-R diagram; recoverable labels: PREMIUM, ACCIDENT, CLAIM_PAYMENT, SURVEYOR REPORT, ASSESSED DAMAGE, REPAIR ITEM, REPAIRS.]
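The total-generalization and aggregation mappings described above can be sketched in the same way. The following is only an illustration under stated assumptions: the lower-case identifiers (e_no, b_no, j_no, mgr_id stand in for E#, B#, J#, Mgr-Id), the choice of primary key for ebjm, the split of accounts between the two sub-classes, and all row values other than the balances are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Total generalization: no ACCOUNT table; each lower-level table
-- repeats the higher-level attributes next to its own.
CREATE TABLE savings_account (
    account_number TEXT PRIMARY KEY,
    balance        INTEGER,
    interest_rate  REAL
);
CREATE TABLE current_account (
    account_number TEXT PRIMARY KEY,
    balance        INTEGER,
    over_draft     INTEGER
);
-- Aggregation: one table for EBJM; the key choice here is an assumption.
CREATE TABLE ebjm (
    e_no   TEXT NOT NULL,
    b_no   TEXT NOT NULL,
    j_no   TEXT NOT NULL,
    mgr_id TEXT NOT NULL,
    PRIMARY KEY (e_no, b_no, j_no, mgr_id)
);
""")

conn.execute("INSERT INTO savings_account VALUES (?, ?, ?)", ("A-101", 10000, 4.0))
conn.execute("INSERT INTO current_account VALUES (?, ?, ?)", ("A-203", 30000, 5000))

# The higher-level entity set can still be listed as the union of its sub-classes.
all_accounts = conn.execute("""
SELECT account_number, balance FROM savings_account
UNION ALL
SELECT account_number, balance FROM current_account
ORDER BY account_number
""").fetchall()

conn.executemany("INSERT INTO ebjm VALUES (?, ?, ?, ?)", [
    ("E1", "B1", "J1", "M1"),
    ("E2", "B1", "J2", "M1"),
    ("E2", "B1", "J2", "M2"),
])

# EBJ needs no table of its own in this sketch: it is the projection of
# EBJM onto the employee, branch and job columns.
ebj = conn.execute("""
SELECT DISTINCT e_no, b_no, j_no
FROM ebjm
ORDER BY e_no, b_no, j_no
""").fetchall()
print(all_accounts, ebj)
```

Deriving EBJ as a projection only works when every employee-branch-job combination has at least one manager recorded in EBJM; a combination without a manager would need its own EBJ row.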
