Introduction and Conceptual Modeling

1.1 Introduction to File and Database Systems

A database is a collection of data. It contains information about one particular enterprise. A database management system (DBMS) consists of a collection of interrelated data and a set of programs to access that data. The primary goal of a DBMS is to provide an environment that is both convenient and efficient to use in retrieving and storing database information.

Database systems are designed to manage large bodies of information. The database system must ensure the safety of the information stored, despite system crashes or attempts at unauthorized access. If the data is to be shared among several users, the system must maintain consistency.

1.1.1 Purpose of Database Systems

A DBMS provides a secure and survivable medium for the storage and retrieval of data. In the real world, data is shared among several users and is persistent. Real-world data also has structure: items are related to one another and are subject to constraints. These features are well represented and can be efficiently managed using a DBMS.

The different users of the data also need to create, access and manipulate it. The DBMS provides mechanisms to achieve these objectives without compromising the security and integrity of the data. Therefore, if the data is shared, if it is persistent, and if the users want it to be secure and easy to access and manipulate, then a database management system is the best available alternative.

1.1.2 Conventional File Processing System

Information can be managed by either a conventional file processing system or a database system. In a conventional file processing system, each subsystem of the information system has its own set of files. As a result, data is duplicated between the various subsystems of the information system.
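The duplication problem described above can be illustrated with a small sketch. It is only an illustration under assumed names: the dictionaries `billing_file` and `shipping_file` and the helper `find_inconsistencies` are hypothetical stand-ins for the private files of two subsystems, not part of any real system.

```python
# Hypothetical private files of two subsystems. Each keeps its own copy
# of the same customer record, so the data is duplicated.
billing_file = {"C101": {"name": "Hari", "city": "Bombay"}}
shipping_file = {"C101": {"name": "Hari", "city": "Bombay"}}


def find_inconsistencies(file_a, file_b, field):
    """Return the keys whose `field` value differs between the two files."""
    return [
        key
        for key in file_a.keys() & file_b.keys()
        if file_a[key][field] != file_b[key][field]
    ]


# The customer moves, but only the billing subsystem records the change.
billing_file["C101"]["city"] = "Pune"

print(find_inconsistencies(billing_file, shipping_file, "city"))  # ['C101']
```

Because each subsystem updates only its own copy, the two files now disagree about the same customer. Keeping a single shared copy removes the possibility of this kind of disagreement.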
But, in database ...

1.1.2.2 Advantages of Database

A database is a way to consolidate and control the operational data centrally. The advantages of having centralized control of data are:

i) Redundancy can be reduced
In non-database systems, each application or department has its own private files, resulting in a considerable amount of redundancy in the stored data. Thus storage space is wasted. With a centralized database, most of this can be avoided.

ii) Inconsistency can be avoided
When the same data is duplicated and a change made at one site is not propagated to the other site, inconsistency arises: the two entries for the same data no longer agree. So, if the redundancy is removed, the chance of having inconsistent data is also removed.

iii) The data can be shared
The data stored for one application can be used by another application. Thus, the data in the database can be shared with new applications.

iv) Standards can be enforced
With central control of the database, the DBA can ensure that all applicable standards are observed in the representation of the data.

v) Security can be enforced
The DBA can define the access paths for accessing the data stored in the database and can define authorization checks to be carried out whenever access to sensitive data is attempted.

vi) Integrity can be maintained
Integrity means that the data in the database is accurate. Centralized control of the data permits the administrator to define integrity constraints on the data in the database.

1.1.3 Data Abstraction

A major purpose of a database system is to provide users with an abstract view of the data. That is, the system hides certain details of how the data are stored and maintained. There are three levels of data abstraction:

i) Physical level: It is the lowest level of abstraction, which describes how the data are actually stored.
The physical level describes complex low-level data structures in detail.

ii) Logical level: It is the next higher level of abstraction, which describes what data are stored in the database and what relationships exist among those data.

iii) View level: It is the highest level of abstraction, which describes only part of the entire database.

Fig. 1.2 shows the relationships among the three levels of abstraction.

Fig. 1.2 The three levels of data abstraction (view level, logical level, physical level)

For example, consider a banking example with the following records:
• account, with fields acc_no and balance
• employee, with fields employee_name and salary
• customer, with fields customer_name, customer_id and address.

At the physical level, a customer, account, or employee record can be described as a block of consecutive storage locations (for example, words or bytes). The language compiler hides this level of detail from programmers.

At the logical level, each record is specified by a type definition. For example, we declare a record as:

    struct customer {
        int customer_id;
        char customer_name[20];
        char customer_address[30];
    };

Programmers using a programming language work at this level of abstraction. Similarly, database administrators usually work at this level of abstraction.

Finally, at the view level, computer users see a set of application programs that hide details of the data types.

1.1.4 Database Languages

A database system provides a data definition language to specify the database schema and a data manipulation language to express database queries and updates.

1.1.4.1 Data Definition Language

We specify a database schema by a set of definitions expressed in a special language, called a data definition language (DDL).
DDL is a set of SQL commands used to create, modify and delete database structures, but not data. These commands are not normally used by a general user, who should be accessing the database via an application. They are normally used by the DBA and, to a limited extent, by a database designer or application developer.

Examples of DDL commands are:
• Create: Creates objects in the database.
  Example: create table account (account_no char(10), balance integer);
• Alter: Alters the structure of the database.
• Drop: Deletes objects from the database.
• Truncate: Removes all records from a table, together with all space allocated for those records.
• Comment: Adds comments to the data dictionary.

1.1.4.2 Data Manipulation Language

Data manipulation is:
• The retrieval of information stored in the database.
• The insertion of new information into the database.
• The deletion of information from the database.
• The modification of information stored in the database.

A data manipulation language (DML) is a language that enables users to access or manipulate data as organized by the appropriate data model. There are two types of DMLs:

i) Procedural DMLs require a user to specify what data are needed and how to get those data.

ii) Declarative DMLs (nonprocedural DMLs) require a user to specify what data are needed without specifying how to get those data.

Declarative DMLs are easier to learn and use than procedural DMLs. The DML component of the SQL language is nonprocedural.

Examples of DML commands are:
• Insert: Inserts data into a table. For example:
  insert into account values ('A101', 1000);
  The above SQL DML statement inserts a record with values account_no = 'A101' and balance = 1000 into the account table.
• Update: Updates existing data within a table.
• Delete: Deletes records from a table (all records, if no condition is given).
• Query: A query is a statement requesting the retrieval of information.
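The DDL and DML commands listed above can be tried out end to end with a short script. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in DBMS; the amount added in the update statement is an arbitrary illustrative value, and only the account table and the 'A101' record come from the text.

```python
import sqlite3

# In-memory SQLite database, used here only as a convenient stand-in DBMS.
conn = sqlite3.connect(":memory:")

# DDL: define the structure of the account table (no data yet).
conn.execute("create table account (account_no char(10), balance integer)")

# DML: insert a record.
conn.execute("insert into account values ('A101', 1000)")

# DML: update existing data (the amount 500 is an arbitrary example value).
conn.execute(
    "update account set balance = balance + 500 where account_no = 'A101'"
)

# DML (query): retrieve the stored information.
rows_after_update = conn.execute(
    "select account_no, balance from account where account_no = 'A101'"
).fetchall()
print(rows_after_update)  # [('A101', 1500)]

# DML: delete records; without a where clause every record is removed.
conn.execute("delete from account")
rows_after_delete = conn.execute("select count(*) from account").fetchone()[0]
print(rows_after_delete)  # 0

# DDL: drop removes the table itself, not just its records.
conn.execute("drop table account")
conn.close()
```

Note the difference between the last two steps: delete is a DML command that removes records but leaves the table in place, whereas drop is a DDL command that removes the table definition.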
The portion of a DML that involves information retrieval is called a query language. The following query in the SQL language displays the account information for account_no = 'A101':

  select account_no, balance
  from account
  where account_no = 'A101';

1.1.5 Database Users and Administrators

People who work with a database can be categorized as:
• Database users
• Database administrators

1.1.5.1 Database Users

There are four different types of database system users, differentiated by the way they interact with the system:

a) Naive users
They are unsophisticated users who interact with the system by invoking one of the application programs that have been written previously. For example, a bank teller who needs to transfer $50 from account A to account B invokes a program called transfer. This program asks the teller for the amount of money to be transferred, the account from which the money is to be transferred, and the account to which the money is to be transferred.

The typical user interface for naive users is a forms interface, where the user can fill in appropriate fields of the form. Naive users may also simply read reports generated from the database.

b) Application programmers
These are computer professionals who write application programs. Application programmers can choose from many tools to develop user interfaces. Rapid application development (RAD) tools enable an application programmer to construct forms and reports without writing a program. Application programmers also use fourth generation languages to facilitate the generation of forms and the display of data on the screen.

c) Sophisticated users
They interact with the system without writing programs. Instead, they form their requests in a database query language.
They submit each such query to a query processor, whose function is to break down DML statements into instructions that the storage manager understands. Analysts who submit queries to explore data in the database fall into this category.

Online analytical processing (OLAP) tools simplify analysts' tasks by letting them view summaries of data in different ways. For instance, an analyst can see total sales by region (for example, North, South, East and West), or by product, or by a combination of region and product (that is, total sales of each product in each region). Another class of tools for analysts is data mining tools, which help them find certain kinds of patterns in data.

d) Specialized users
These are sophisticated users who write specialized database applications that do not fit into the traditional data processing framework. Among these applications are computer aided design systems, knowledge base and expert systems, systems that store data with complex data types (for example, graphics data and audio data) and environment modeling systems.

1.1.5.2 Database Administrator

A person who has central control over the system is called a database administrator (DBA). The functions of a DBA include:
• Schema definition: The DBA creates the original database schema by executing a set of data definition statements in the DDL.
• Storage structure and access method definition.
• Schema and physical organization modification: The DBA carries out changes to the schema and physical organization to reflect the changing needs of the organization, or to alter the physical organization to improve performance.
Fig. 1.3 Database system structure (the query processor, including the query evaluation engine, sits above the storage manager with its authorization and integrity manager, transaction manager, buffer manager and file manager, which in turn sits above disk storage holding data, indices, the data dictionary and statistical data)

1.2.1 Storage Manager

A storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system. The storage manager is responsible for the interaction with the file manager. The storage manager translates the various DML statements into low-level file system commands. Thus, the storage manager is responsible for storing, retrieving, and updating data in the database.

The various components of the storage manager are:
• Authorization and integrity manager: It tests for satisfaction of various integrity constraints and checks the authority of users accessing the data.
• Transaction manager: It ensures that the database remains in a consistent state despite system failures, and that concurrent executions proceed without conflicting.
• File manager: It manages the allocation of space on disk storage and the data structures used to represent information stored on disk.
• Buffer manager: It is responsible for fetching data from disk storage into main memory and deciding what data to cache in main memory.

The storage manager implements several data structures as part of physical system implementation:
• Data files, which store the database itself.
• Data dictionary: It contains metadata, that is, data about data. The schema of a table is an example of metadata. A database system consults the data dictionary before reading and modifying actual data.
• Indices, which provide fast access to data items that hold particular values.

1.2.2 The Query Processor

The query processor is an important part of the database system.
It helps the database system to simplify and facilitate access to data. The query processor components include:
• DDL interpreter, which interprets DDL statements and records definitions in the data dictionary.
• DML compiler, which translates DML statements in a query language into an evaluation plan consisting of low-level instructions that the query evaluation engine understands. A query can be translated into any number of evaluation plans that all give the same result. The DML compiler also performs query optimization, that is, it picks the lowest cost evaluation plan from among the alternatives.
• Query evaluation engine, which executes low-level instructions generated by the DML compiler.

1.3 Data Models

The underlying structure of the database is called the data model. It is a collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints.

Different types of data models are:
• Entity relationship model
• Relational model
• Hierarchical model
• Network model
• Object oriented model
• Object relational model

1.3.1 Entity Relationship Model

The E-R data model considers the real world as consisting of a set of basic objects, called entities, and relationships among these objects. It is intended primarily for the database design process, allowing the specification of an enterprise schema. This schema represents the overall logical structure of the database.

1.3.1.1 Basic Concepts

The E-R data model employs three basic notions: entity sets, relationship sets and attributes.

Entities and Entity Sets
• An entity is a 'thing' or 'object' in the real world that is distinguishable from all other objects. For example, each person in an enterprise is an entity. An entity has a set of properties, and the values for some set of properties may uniquely identify an entity.
For instance, the customer_id property with value C101 uniquely identifies one particular customer.
• An entity may be concrete, such as a person or a book, or it may be abstract, such as a loan or a holiday.
• An entity set is a set of entities of the same type that share the same properties, or attributes. The set of all customers at a given bank can be defined as the entity set customer.

Fig. 1.4 Entity sets customer and account

An entity is represented by a set of attributes. Attributes are descriptive properties possessed by each member of an entity set. The attributes of the customer entity set are customer_id, customer_name and city, and possible attributes of the account entity set are account_no and balance.

Each entity has a value for each of its attributes. For instance, a particular customer entity may have the value C101 for attribute customer_id, Hari for customer_name, and Bombay for city.

For each attribute, there is a set of permitted values, called the domain, or value set, of that attribute. The domain of attribute customer_name might be the set of all text strings of a certain length.

Attributes are classified as:
• Simple
• Composite
• Single-valued
• Multi-valued
• Derived

i) Simple attribute: A simple attribute is an attribute composed of a single component with an independent existence. Simple attributes cannot be further subdivided. Examples of simple attributes include Roll_No and Acc_No.

ii) Composite attribute: An attribute composed of multiple components, each with an independent existence, is called a composite attribute. Examples of composite attributes are:
a) Name, which is composed of attributes like first name, middle name and last name.
b) Address, which is composed of other components like street, city and pincode.
iii) Single-valued attributes: A single-valued attribute is one that holds a single value for a single entity. Examples are Room_No and customer_id. Single-valued attributes are also called atomic attributes.

iv) Multivalued attributes: A multivalued attribute is one that holds multiple values for a single entity. For example, a student entity can have multiple values for the Hobby attribute, such as reading, music and painting.

v) Derived attribute: A derived attribute is one that represents a value that is derivable from the value of a related attribute or set of attributes. For example, the age attribute can be derived from the date of birth attribute.

Relationships and Relationship Sets

A relationship expresses an association among several entities. A relationship set is a set of relationships of the same type. For example, consider two entities Person and Company as shown in Fig. 1.5. The relationship Works-for represents an association between Person and Company. This is a binary relationship set.

Fig. 1.5 An E-R diagram

Entity role: The function that an entity plays in a relationship is called that entity's role. A role is one end of an association.

Fig. 1.9 Ternary relationship (Teaches)

iv) Quaternary relationship: A quaternary relationship exists when there are four entities associated. An example of a quaternary relationship is 'studies', where four entities are involved: student, teacher, subject and course-material. It is shown in Fig. 1.10.

Fig. 1.10 A quaternary relationship (Studies)

1.3.1.2 Constraints

An E-R enterprise schema may define certain constraints to which the contents of a database system must conform. The two main types of constraints are:
• Mapping cardinalities
• Participation constraints

Mapping Cardinalities

Mapping cardinalities express the number of entities to which another entity can be associated via a relationship set. For a binary relationship set R between entity sets A and B, the mapping cardinality must be one of the following:

i) One-to-one: An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A, as shown in Fig. 1.11.

Fig. 1.11 One-to-one mapping cardinality

Example: A customer with a single account at a given branch is shown by a one-to-one relationship as given below.

Fig. 1.12 One-to-one relationship

ii) One-to-many: An entity in A is associated with any number of entities (zero or more) in B. An entity in B, however, can be associated with at most one entity in A.

Fig. 1.13 One-to-many mapping cardinality

Example: A customer having two accounts at a given branch is shown by a one-to-many relationship as given below.

Fig. 1.14 One-to-many relationship (Customer, Depositor, Account)

iii) Many-to-one: An entity in A is associated with at most one entity in B. An entity in B, however, can be associated with any number (zero or more) of entities in A.

Fig. 1.15 Many-to-one mapping cardinality

Example: Many employees work for a company. This relationship is shown as many-to-one below.

Fig. 1.16 Many-to-one relationship

iv) Many-to-many: An entity in A is associated with any number (zero or more) of entities in B, and an entity in B is associated with any number (zero or more) of entities in A.

Example: An employee works on a number of projects, and a project is handled by a number of employees. Therefore, the relationship between employee and project is many-to-many as shown below.

Fig. 1.18 Many-to-many relationship

Participation Constraint

The participation of an entity set E in a relationship set R is said to be total if every entity in E participates in at least one relationship in R. If only some entities in E participate in relationships in R, the participation of entity set E in relationship R is said to be partial.

For example, we expect every loan entity to be related to at least one customer through the borrower relationship. Therefore, the participation of loan in the relationship set borrower is total. A bank customer, however, may or may not have a loan. Therefore, the participation of customer in the borrower relationship is partial.

1.3.1.3 Keys

A key is a set of attributes whose values allow us to distinguish entities from each other. Keys also help to identify relationships uniquely, and thus to distinguish relationships from each other.

Different types of keys are:
• Superkey
• Candidate key
• Primary key
• Foreign key

1) Superkey: A superkey is a set of one or more attributes that allows us to identify uniquely an entity in the entity set. For example, the Roll_No attribute of the entity set 'student' distinguishes one student entity from another.

2) Candidate key: A superkey may contain extraneous attributes, and we are often interested in the smallest superkey. A superkey for which no proper subset is a superkey is called a candidate key.

For example, suppose that Student_Name and Student_street together are sufficient to uniquely identify one particular student. Then Roll_No and {Student_Name, Student_street} are both candidate keys. Although the attributes Roll_No and Student_Name together distinguish student entities, their combination does not form a candidate key, since the attribute Roll_No alone is a candidate key.
3) Primary key: It is a candidate key that is chosen by the database designer as the principal means of identifying entities within an entity set. For example, Roll_No is the primary key of the 'student' entity set.

The primary key should be chosen such that its attributes are never, or very rarely, changed. For instance, the address field of a person should not be part of the primary key, since it is likely to change. Social security numbers, on the other hand, are guaranteed never to change.

4) Foreign key: An attribute, or set of attributes, within one relation that matches the candidate key of some (possibly the same) relation.

For example, the inclusion of branchno in both the branch and staff relations links each branch to the details of the staff working at that branch. In the branch relation, branchno is the primary key. In the staff relation, however, the branchno attribute exists to match staff to the branch office they work in. In the staff relation, branchno is a foreign key.

1.3.1.4 Entity Relationship Diagram

The overall logical structure of a database can be expressed graphically using E-R diagrams. Such a diagram consists of the following major components.

Component name          Description
1) Rectangles           Represent entity sets
2) Ellipses             Represent attributes
3) Diamonds             Represent relationship sets
4) Lines                Link attributes to entity sets and entity sets to relationship sets
5) Double ellipses      Represent multivalued attributes
6) Dashed ellipses      Represent derived attributes
7) Double rectangles    Represent weak entity sets
8) Double lines         Represent total participation of an entity in a relationship set

Fig. 1.19 E-R diagram symbols

Fig. 1.20 shows an E-R diagram for a banking system. It consists of two entity sets, customer and loan, related through the binary relationship set borrower.
The attributes associated with customer are Customer_id, Customer_name and Customer_address. The attributes of loan are Loan_no and amount.

Fig. 1.20 E-R diagram corresponding to customers and loans

Mapping Cardinality

Cardinality in an E-R diagram is represented in two ways:
i) directed line (→)
ii) undirected line (-)

1) One-to-one relationship: Fig. 1.21 represents a one-to-one relationship between two entities, 'Manager' and 'Department', related through the binary relationship 'Manages'.

Fig. 1.21 One-to-one relationship

2) One-to-many relationship: Fig. 1.22 shows a one-to-many relationship between the entities 'Department' and 'Employee', related through the binary relationship 'Has'.

Fig. 1.22 One-to-many relationship

3) Many-to-one relationship: Fig. 1.23 shows a many-to-one relationship between the entity sets 'Students' and 'GFM', related through the binary relationship 'Have'. The interpretation is that many students have one GFM (Guardian Facility Member).

Fig. 1.23 Many-to-one relationship

4) Many-to-many relationship: Fig. 1.24 shows a many-to-many relationship between the entity sets 'Employee' and 'Course', related through the binary relationship 'Joins'.

Fig. 1.24 Many-to-many relationship

Dependency

Existence dependencies

If the existence of entity x depends on the existence of entity y, then x is said to be existence dependent on y. If y is deleted, so is x. Entity y is said to be a dominant entity, and x is said to be a subordinate entity.

Consider the entity set 'loan' and the entity set 'payment', which keeps information about all the payments that were made in connection with a particular loan. The payment entity set is described by the attributes payment_no, payment_date and payment_amount.
We form a relationship set loan_payment between these two entity sets, which is one-to-many from loan to payment. Every payment entity must be associated with a loan entity. If a loan entity is deleted, then all its associated payment entities must also be deleted. In contrast, payment entities can be deleted from the database without affecting any loan. In the loan_payment relationship, the entity set loan is therefore the dominant entity set, also called the strong entity set, and the entity set payment is the subordinate entity set, also called the weak entity set.

Fig. 1.25 (a) shows the dominant entity set 'loan' and the subordinate entity set 'payment' connected by the relationship 'loan_payment'.

Fig. 1.25 (a) Existence dependency

Definition: Strong and weak entity sets

Entities are classified as being of strong or weak entity types. An entity that is existence dependent on some other entity is called a weak entity type. An entity set on which a weak entity set depends is called a strong entity set. For example, Fig. 1.25 (b) shows the weak entity set 'Parent', which depends on the strong entity set 'Employee'.

Fig. 1.25 (b) Strong and weak entities

Representation of Role

The function that an entity plays in a relationship is called its role. Roles are useful when the meaning of a relationship set needs clarification.

For example, the relationship works-for might consist of ordered pairs of employees (the first is the manager, the second is the worker). In the E-R diagram, this can be shown by labelling the lines connecting entities (rectangles) to relationships (diamonds), as shown in Fig. 1.26.

Fig. 1.26 E-R diagram with role indication

1.3.1.5 Extended E-R Model

The E-R model supported with additional semantic concepts is called the extended entity relationship model or EER model.
The EER model includes all iginal E-R model together with the following additional concepts: * Specialization © Generalization « Aggregation Specialization “Specialization is the process of designating subgroupings within an entity set.” Specialization is a top-down process. Consider an entity set person, with attributes name, street, and city. A person may be further classified as one of the following. © customer * employee Each of these person types is described by a set of attributes that includes all the attributes of entity set person plus additional attributes. For example, customer entities may be further described by customer_id and employee entities by employee_code and salary. Fig. 1.33 shows specialization, which is represented by triangle. The lable ISA stands for “is a’, and represents, for example, that customer “is a” person. The ISA relationship may also be referred to as a superclass-subclass relationship. Generalization “Generalization is the process of defining a more general entity type from a set of more specialized entity types.” Generalization is a bottom-up approach. This approach results in the identification of a generalized superclass from the original subclasses. Consider that the attributes of ‘customer’ entity are customer_id, name, street, city and an ‘employee’ entity attributes are employee_code, street, city and salary. Thus, the entity sets employee and customer have several attributes in common. This commonality can be expressed by generalization which is a containment relationship that exists between a higher-level entity set and one or more lower-level entity sets. ‘As shown in Fig. 1.33 person is a higher-level entity set and customer and employee can lower-level entity sets. In other words, person is a superclass if customer and employee are subclasses. > Database Management Systems 1-30 Introduction and Conceptual Modeling Credit_rating ‘Hours_worked Fig. 
1.33 Specialization and generalization Generalization constraints Constraints in specialization and generalization allow us to capture some of the important business rules that apply to the relationships. a) one type of constraint involves determining which entities can be members of a given lower-level entity set. Such membership may be one of the following : * Condition defined : In condition-defined lower-level entity sets, membership is evaluated on the basis of whether or not an entity satisfies an explicit condition or predicate. For example, assume that the higher-level entity set account is having attribute account_type. Only those entities that satisfy the condition account_type = “savings account” are allowed to belong to the lower-level entity set ‘savings-account’. All entities that satisfy the condition account_type = “checking account” are included in checking account. ¢ User defined : These types of constraints are defined by user. For e.g., let us assume that, after 3 months of employment bank employees are assigned to one of four work teams. We therefore represent the teams as four lower-level entity sets of the higher-level employee entity set. A given employee is assigned to one of the four teams by incharge of the teams. — — Mi Database Management Systems 1-31 Introduction and Conceptual Modeling b) A second type of constraint relates to whether or not entities may belong to more than one lower-level entity set within a single generalization. The lower-level entity sets may be one of the following : + Disjoint : A disjointness constraint requires that an entity belong to only one lower-level entity set. For example, an account entity may be either saving-account or checking-account. It satisfies just one condition at a time. * Overlapping : In overlapping generalization, the same entity may belong to more than one lower-level entity set within a single generalization. For example, a single manager may participate in more than one work team. 
As another example, suppose the entity sets customer and employee are derived from person. The generalization is overlapping if an employee can also be a customer.

c) A final constraint is the completeness constraint on a generalization/specialization, which specifies whether or not an entity in the higher-level entity set must belong to at least one of the lower-level entity sets within the generalization/specialization. This constraint may be one of the following:
• Total generalization or specialization: Each higher-level entity must belong to a lower-level entity set.
• Partial generalization or specialization: Some higher-level entities may not belong to any lower-level entity set.

Partial generalization is the default. We can specify total generalization in an E-R diagram by using a double line to connect the box representing the higher-level entity set to the triangle symbol.

For example, employees are assigned to a team only after 3 months on the job, so some employee entities may not be members of any of the lower-level team entity sets. We may characterize the team entity sets more fully as a partial, overlapping specialization of employee. The generalization of checking-account and savings-account into account is a total, disjoint generalization.

Attribute Inheritance

"A crucial property of the higher and lower-level entities created by specialization and generalization is attribute inheritance."

The attributes of the higher-level entity sets are said to be inherited by the lower-level entity sets. For example, customer and employee inherit the attributes of person. Thus, customer is described by name, street and city, with the additional attribute customer_id. Similarly, employee is described by name, street and city, with the additional attributes employee_code and salary.

Aggregation

One limitation of the E-R model is that it cannot express relationships among relationships.
Consider a quaternary relationship manages between employee, branch, job and manager. Using the basic E-R modeling constructs, we obtain the E-R diagram shown in Fig. 1.34.

Fig. 1.34 E-R diagram with redundant relationships

There is redundant information in the resultant figure, since every employee, branch, job combination in manages is also in Works_on.

The best way to model the above situation is to use aggregation. Aggregation is an abstraction through which relationships are treated as higher-level entities. Thus, the relationship set Works_on relating the entity sets employee, branch and job is considered as a higher-level entity set called Works_on. We can then create a binary relationship manages between Works_on and manager to represent who manages what tasks. Fig. 1.35 shows the E-R diagram with aggregation.

Fig. 1.35 E-R diagram with aggregation

Alternative E-R Notations

[Figure: symbol legend. Legible labels: attribute, multivalued attribute, weak entity set, relationship set, total participation of entity set in relationship, primary key, discriminating attribute of weak entity set, many-to-many relationship, many-to-one relationship, one-to-one relationship, cardinality limits, role indicator, ISA (specialization or generalization), disjoint generalization]

Fig. 1.36 Symbols used in the E-R diagram

1.3.1.6 Reduction of an E-R Schema to Tables

We can represent a database that conforms to an E-R database schema by a collection of tables. For each entity set and for each relationship set in the database, there is a unique table to which we can assign the name of the corresponding entity set or relationship set. Each table has multiple columns, each of which has a unique name.
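The rule of one table per entity set and one per relationship set can be tried out with a small sketch. It uses Python's built-in sqlite3 module and the employee, branch, job and Works_on names from the aggregation example; the column names (employee_id, branch_name, title and so on) are hypothetical, chosen only for illustration.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# One table per entity set and one for the relationship set works_on.
# All column names here are illustrative assumptions, not from the text.
conn.executescript("""
    CREATE TABLE employee (employee_id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE branch   (branch_name TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE job      (title TEXT PRIMARY KEY, level INTEGER);
    CREATE TABLE works_on (
        employee_id TEXT REFERENCES employee(employee_id),
        branch_name TEXT REFERENCES branch(branch_name),
        title       TEXT REFERENCES job(title),
        PRIMARY KEY (employee_id, branch_name, title)
    );
""")

# Each table carries the name of its entity set or relationship set.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# Column names within a table are unique; PRAGMA table_info lists them in order.
works_on_cols = [row[1] for row in conn.execute("PRAGMA table_info(works_on)")]

print(tables)
print(works_on_cols)
```

The works_on table holds only the primary keys of the participating entity sets, which is the shape the later discussion of relationship sets arrives at.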
Both the E-R model and the relational database model are logical representations of a real-world enterprise. As both models use similar design principles, we can convert an E-R design into a relational design. The constraints specified in an E-R diagram, such as primary keys and cardinality constraints, are mapped to constraints on the tables generated from the E-R diagram.

Tabular Representation of Strong Entity Sets

Let E be a strong entity set with descriptive attributes a1, a2, ..., an. We represent this entity set by a table called E with n distinct columns, each of which corresponds to one of the attributes of E. Each row in this table corresponds to one entity of the entity set E.

For example, consider an entity set Student with attributes Roll_No and Name. We represent this entity set by a table Student with two columns, Roll_No and Name, as shown in Fig. 1.37.

Roll_No    Name
2001       John

Fig. 1.37 The student table

The row (2001, John) in the table Student represents a student with Roll_No 2001 and name John. We can add new rows and also delete or update existing rows.

Tabular Representation of Weak Entity Sets

Let A be a weak entity set with attributes a1, a2, ..., am. Let B be the strong entity set on which A depends. Let the primary key of B consist of attributes b1, b2, ..., bn. We represent the entity set A by a table called A with one column for each attribute of the set:

{a1, a2, ..., am} ∪ {b1, b2, ..., bn}

For example, consider the weak entity set Transaction with attributes Transaction-date, Debit-amount, Credit-amount and Balance. It depends on the entity set Account with the primary key Account-no. Thus the Transaction table is represented by the columns Account-no, Transaction-date, Debit-amount, Credit-amount and Balance, as shown in Fig. 1.38.

Account-no    Transaction-date    Debit-amount    Credit-amount    Balance
A001          20/12/2000          100             0                2100
A002          12/10/2001          0               200              1800
A009          01/01/2002          200             0                1500
A101          [illegible]         0               300              2000

Fig.
1.38 A Transaction table

Tabular Representation of Relationship Sets

Let R be a relationship set. Let a1, a2, ..., am be the set of attributes formed by the union of the primary keys of each of the entity sets participating in R, and let the descriptive attributes (if any) of R be b1, b2, ..., bn. We represent this relationship set by a table called R with one column for each attribute of the set:

{a1, a2, ..., am} ∪ {b1, b2, ..., bn}

For example, consider the relationship set depositor in Fig. 1.27, with descriptive attribute access-date. This relationship set involves two entity sets:
• Customer, with the primary key customer-id
• Account, with the primary key account-no

Thus, the depositor table has three columns, as shown in Fig. 1.39.

Customer_id    Account_no    Access_date
C0001          A0001         10/12/2000
C0002          A0002         [illegible]/10/2001
C0003          A0003         09/10/2004
C0004          A0004         12/12/2001

Fig. 1.39 The depositor table

Redundancy of Tables

Consider two entity sets loan and payment, which are linked to each other by the relationship loan-payment, which has no descriptive attributes. The weak entity set payment depends on the entity set loan. The primary key of payment is {loan-no, payment-no}. Since loan-payment has no descriptive attributes, its table would have two columns, {loan-no, payment-no}. Every {loan-no, payment-no} combination in loan-payment would also be present in the payment table, and vice versa. Thus, the loan-payment table is redundant. In general, the table for the relationship set linking a weak entity set to its corresponding strong entity set is redundant and does not need to be present in a tabular representation of an E-R diagram.

Composite Attributes

We handle composite attributes by creating a separate column for each of the component attributes; we do not create a separate column for the composite attribute itself. Suppose address is a composite attribute of entity set customer, and the components of address are street and city.
The table generated from customer would then contain columns address_street and address_city; there is no separate column for address.

Multivalued Attributes

Consider an entity set employee with the multivalued attribute dependent_name. For such a multivalued attribute, create a separate table dependent_info with attributes dependent_name and employee_id, where employee_id is the primary key of the employee entity set.

Tabular Representation of Generalization

There are two different methods for transforming an E-R diagram that includes generalization to tabular form. Consider the generalization shown in Fig. 1.40, where account is a higher-level entity set, while saving-account and checking-account are lower-level entity sets.

Fig. 1.40 E-R with generalization

1. Create a table for the higher-level entity set. For each lower-level entity set, create a table that includes a column for each of the attributes of that entity set plus a column for each attribute of the primary key of the higher-level entity set. Thus, for the E-R diagram in Fig. 1.40 we have three tables:
i. Account, with attributes account_no and balance.
ii. Saving_account, with attributes acc_no and interest_rate.
iii. Checking_account, with attributes acc_no and overdraft_amount.

2. If the generalization is disjoint and complete, that is, if no entity is a member of two lower-level entity sets directly below a higher-level entity set, and if every entity in the higher-level entity set is also a member of one of the lower-level entity sets, then do not create a table for the higher-level entity set. Instead, create a separate table for each lower-level entity set that includes a column for each of the attributes of that entity set plus a column for each attribute of the higher-level entity set. Then for the E-R diagram in Fig. 1.40, we have two tables:
• Saving_account, with attributes acc_no, balance and interest_rate.
• Checking_account, with attributes acc_no, balance and overdraft_amount.

Tabular Representation of Aggregation

For the E-R diagram in Fig. 1.35, the table for the relationship set manages between the aggregation of works-on and the entity set manager includes a column for each attribute in the primary keys of the entity set manager and the relationship set works-on. It would also include the descriptive attributes of the relationship set, if any exist.

1.3.2 Relational Model

The relational model was introduced by Dr. E. F. Codd in 1970 and has evolved since then through a series of writings. The relational model represents data in the form of two-dimensional tables. Each table represents some real-world entity, or thing. The organization of data into relational tables is known as the logical view of the database.

A basic understanding of the relational model is necessary to effectively use relational database software such as Oracle, Microsoft SQL Server and Sybase, which are based on the relational model.

Another source of referential integrity constraints is weak entity sets. The relation scheme for a weak entity set must include the primary key of the entity set on which it depends. Thus, the relation scheme for each weak entity set includes a foreign key that leads to a referential integrity constraint.

Referential integrity in SQL: Using SQL, primary keys, candidate keys and foreign keys are defined as part of the create table statement, as given below.

Example:

    create table deposit
        (branch_name  char(15),
         acc_no       char(10),
         cust_name    char(20) not null,
         balance      integer,
         primary key (acc_no, cust_name),
         foreign key (branch_name) references branch,
         foreign key (cust_name) references customer);

Nulls

"Null represents a value for an attribute that is currently unknown or is not applicable for this tuple."

A null can be taken to mean the logical value unknown.
It can also mean that a value is not applicable to a particular tuple. Nulls are a way to deal with incomplete or exceptional data. However, a null is neither a zero numeric value nor a text string filled with spaces.

Other integrity constraints are entity integrity and enterprise constraints.

Entity Integrity

"In a base relation, no attribute of a primary key can be null."

A primary key is used to identify tuples uniquely, and it is minimal: no proper subset of the primary key is sufficient to provide unique identification of tuples. If any part of the primary key could be null, that part would not be needed for identification, which contradicts minimality. Therefore, no attribute of a primary key may be null.

For example, as branch_no is the primary key of the Branch relation, we should not be able to insert a tuple into the Branch relation with a null for the branch_no attribute.

Enterprise Constraints

These are additional rules specified by the users or database administrators of a database. It is also possible for users to specify additional constraints that the data must satisfy. For example, if the limit on the number of staff working at a branch is 20, then the user must be able to specify it and expect the DBMS to enforce it. In this case, it should not be possible to add a new member of staff at a branch that already has 20 staff.

Data Manipulation

The manipulative part of the relational model consists of a set of operators known collectively as the relational algebra, together with the relational calculus.

Advantages of Relational Model

The major advantages of the relational model are:
• Structural independence: When it is possible to make changes to the database structure without affecting the DBMS's capability to access data, we can say that structural independence has been achieved. In a relational database, changes in the database structure do not affect data access. So a relational database has structural independence.
• Conceptual simplicity: The relational database model is simpler at the conceptual level.
Since the relational data model frees the designer from the physical data storage details, designers can concentrate on the logical view of the database.
• Design, implementation, maintenance and usage ease: The relational database model achieves both data independence and structural independence, making database design, maintenance, administration and usage much easier than with the other models.
• Good for ad hoc requests.
• It is simpler to navigate.
• Greater flexibility.

Disadvantages of Relational Model

• Significant hardware and software overheads.
• Not as good for transaction process modeling as hierarchical and network models.
• May have slower processing times than hierarchical and network models.

1.3.3 Hierarchical Model

A hierarchical database is a kind of database management system that links records together in a tree data structure such that each record type has only one owner, e.g. an order is owned by only one customer.

Hierarchical structures were widely used in the first mainframe database management systems.

Fig. 1.44 shows a sample hierarchical database model.

Fig. 1.44 A sample hierarchical database

Advantages

• High speed of access to large datasets.
• Ease of updates.
• Simplicity: The design of a hierarchical database is simple.
• Data security: The hierarchical model was the first database model that offered data security provided and enforced by the DBMS.
• Efficiency: The hierarchical database model is very efficient when the database contains a large number of 1 : n relationships and when the users require a large number of transactions using data whose relationships are fixed.

Disadvantages

• Implementation complexity: Although the hierarchical database model is conceptually simple and easy to design, it is quite complex to implement.
• Database management problems: If you make any changes in the database structure, then you need to make the necessary [...]
• Lack of structural independence: [...] storage paths to navigate [...]

1.4.1 The Relational Algebra

i) Procedural language: In a procedural language, the user instructs the system to perform a sequence of operations on the database to compute the desired result.
ii) Nonprocedural language: In a nonprocedural language, the user describes the desired information without giving a specific procedure for obtaining that information.

The relational algebra is a procedural query language. It consists of a set of operations that take one or two relations as input and produce a new relation as their result.

Formal definition of the relational algebra: A basic expression in the relational algebra consists of one of the following:
• A relation in the database
• A constant relation

A constant relation is written by listing its tuples within { }, for example {(A-101, Downtown, 500) (A-215, Mianus, 700)}.

A general expression in relational algebra is constructed out of smaller subexpressions. Let E1 and E2 be relational algebra expressions. Then the following are all relational algebra expressions:
• E1 ∪ E2
• E1 - E2
• E1 × E2
• σ_P(E1), where P is a predicate on attributes in E1
• Π_S(E1), where S is a list consisting of some of the attributes in E1
• ρ_x(E1), where x is the new name for the result of E1

The relational algebraic operations are divided into two groups:

i) The first group includes the set operations. Since each relation is defined as a set of tuples, the set operations are applicable to the relational data model. Set operations include the following:
• Union
• Intersection
• Set difference
• Cartesian product

ii) The second group of relational algebraic operations is developed specially for relational databases. Some of the operations of this group are:
• Select
• Project
• Rename
• Join
• Division
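The formal definition above can be made concrete with a minimal sketch in Python. It assumes a relation is modeled as a tuple of attribute names plus a set of tuples; the Relation class, its method names, and the attribute names account_no, branch_name and balance are hypothetical and are not taken from the text. The sketch covers union, difference, Cartesian product, select and project; rename is left out for brevity.

```python
from itertools import product


class Relation:
    """Hypothetical helper: a relation as attribute names plus a set of tuples."""

    def __init__(self, attrs, rows):
        self.attrs = tuple(attrs)
        # A set removes duplicate tuples, as the relational model requires.
        self.rows = frozenset(tuple(r) for r in rows)

    def union(self, other):        # E1 ∪ E2
        return Relation(self.attrs, self.rows | other.rows)

    def difference(self, other):   # E1 - E2
        return Relation(self.attrs, self.rows - other.rows)

    def times(self, other):        # E1 × E2
        return Relation(self.attrs + other.attrs,
                        (a + b for a, b in product(self.rows, other.rows)))

    def select(self, predicate):   # σ_P(E1): keep tuples satisfying P
        return Relation(self.attrs,
                        (r for r in self.rows
                         if predicate(dict(zip(self.attrs, r)))))

    def project(self, attrs):      # Π_S(E1): keep only the listed attributes
        positions = [self.attrs.index(a) for a in attrs]
        return Relation(attrs,
                        (tuple(r[i] for i in positions) for r in self.rows))


# The constant relation from the text, with assumed attribute names.
account = Relation(("account_no", "branch_name", "balance"),
                   {("A-101", "Downtown", 500), ("A-215", "Mianus", 700)})

# Every operation returns a Relation, so expressions compose freely.
rich = account.select(lambda t: t["balance"] > 600).project(["account_no"])
print(sorted(rich.rows))  # [('A-215',)]
```

Because each method returns a new Relation, the result of one operation can be the input of the next, which is the closure property that lets general expressions be built from smaller subexpressions.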
1.4.1.1 Union, Intersection and Difference

Union, intersection and difference operations require that the tables involved be union compatible. Two relations are said to be union compatible if the following conditions are satisfied:
• The two relations/tables must contain the same number of columns (have the same degree).
• Each column of the first relation/table must be either the same data type as the corresponding column of the second relation/table or convertible to the same data type as the corresponding column of the second.

Consider the following two relations.

depositor
customer_name    city
Hayes            Pune
Johnson          Mumbai
Jones            Solapur
Lindsay          Nashik
Smith            Pune
Turner           Mumbai

borrower
customer_name    city
Adams            Mumbai
Curry            [illegible]
Hayes            Pune
Jackson          [illegible]
Jones            Solapur
Smith            Pune
Williams         Kolhapur

Fig. 1.46 A depositor and borrower relation

The two relations are union compatible. Now we will see four set-oriented operations.

Union

The result of the union operation, denoted by depositor ∪ borrower, is the relation that includes all tuples that are in depositor, in borrower, or in both. Duplicates are eliminated.

Query: Find the names of all bank customers who have an account or a loan or both.

The result of the union is:

[Table: depositor ∪ borrower, with columns customer_name and city; rows not recoverable from the source]

Intersection

The result of the intersection operation is a relation that includes all tuples that are in both depositor and borrower. The intersection operation is denoted by depositor ∩ borrower. The result of the intersection is:

depositor ∩ borrower
customer_name    city
Hayes            Pune
Jones            Solapur
Smith            Pune

Fig. 1.48 Result of intersection operation

Difference

The difference operation is denoted by depositor - borrower.
The result of the difference operation is the relation that contains all tuples in depositor but not in borrower.

depositor - borrower
customer_name    city
Johnson          Mumbai
Lindsay          Nashik
Turner           Mumbai

Fig. 1.49 Result of difference operation

Both union and intersection are commutative and associative operations. This means that the following are true:
• A ∪ B = B ∪ A and A ∩ B = B ∩ A
• A ∪ (B ∪ C) = (A ∪ B) ∪ C and A ∩ (B ∩ C) = (A ∩ B) ∩ C

The difference operation is not commutative. This means that A - B is not the same as B - A, or in other words A - B ≠ B - A.

Cartesian Product

The cartesian product is also known as CROSS PRODUCT or CROSS JOIN. It is denoted by '×'. The cartesian product of two relations A and B is denoted by A × B.

The cartesian product of two relations that have X and Y columns is a relation that has X + Y columns. The resulting relation has one tuple for each combination of tuples, one from each participating relation. Hence, if the relations have n and m tuples respectively, then the cartesian product will have n * m tuples.

Consider the following two relations, Publisher_Info and Book_Info.

Publisher_Info
Publisher_ID    Name
P0001           McGraw-Hill
P0002           PHI
P0003           Pearson

Book_Info
Book_ID    Title
B0001      DBMS
B0002      Compiler

Fig. 1.50 Publisher_Info and Book_Info relation

The relation Publisher_Info has 2 columns and 3 tuples. The relation Book_Info has 2 columns and 2 tuples. So the cartesian product has 4 columns (2 + 2) and 6 tuples (3 * 2). The cartesian product of Publisher_Info and Book_Info is given in Fig. 1.51.

Publisher_Info × Book_Info
Publisher_ID    Name           Book_ID    Title
P0001           McGraw-Hill    B0001      DBMS
P0002           PHI            B0001      DBMS
P0003           Pearson        B0001      DBMS
P0001           McGraw-Hill    B0002      Compiler
P0002           PHI            B0002      Compiler
P0003           Pearson        B0002      Compiler

Fig. 1.51 Cartesian product of relations Publisher_Info and Book_Info

1.4.1.2 The Select Operation

The select operation selects tuples that satisfy a given predicate.

The select operation is represented as follows: