Chapter 7
The Relational Database Model

There are four types of database models: relational, hierarchical, network and object-oriented. The relational database model is, however, by far the most widely used data model today (although the object-oriented model has also been gaining popularity in recent years). The relational model was first introduced in the early 1970s by E.F. Codd. It was the result of an attempt to produce a precise, formal mathematical representation of data.

The term relational model (also called relational data model or relational database model) refers to the arrangement of data as a set of tables, while the term relational database management system (RDBMS) refers to the software that manages those tables. What exactly is a relational model? This chapter explains the basic concepts of the relational data model.

7.1 The Relational Model

The relational database model represents data in the form of tables (or relations), and a mathematical language called relational algebra is used to manipulate the tables.

The relational data model has the following characteristics:

1. Data structure. Data are stored in the form of tables or relations.
2. Data manipulation. Powerful operations are used to manipulate the data stored in the relations.
3. Data integrity. Facilities are included to specify business rules that maintain the integrity of data when they are manipulated.

7.2 Relations

A relation is a named, two-dimensional table of data. It consists of a set of named columns and an arbitrary number of unnamed rows. Each column (e.g., student name) in the relation corresponds to an attribute of that relation. Each row corresponds to a record that contains data values for an entity (e.g., student).

The table below shows an example of a relation named Student. The relation contains the following attributes describing students: StudId, StudName, Age, Sex and Address. There are seven rows in the table, corresponding to seven students.
Adding or deleting rows does not change the structure of the relation.

Student
StudId | StudName | Age | Sex | Address
444    | Marina   | 25  | F   | 44 Lake Avenue, 60000 Kuala Lumpur
222    | Anthony  | 27  | M   | 22 Forest View, 47000 Petaling Jaya
555    | Zainal   | 26  | M   | 55 Beach Road, 33000 Penang
111    | Mei Ling | 25  | F   | 11 Orchard Road, 22000 Johor
333    | Ozem     | 30  | M   | 33 Mountain View, 72000 Langkawi
777    | Shoba    | 24  | F   | 77 Ocean View, 45000 Kuantan
666    | Moti Lal | 28  | M   | 66 Garden Road, 50000 Kuala Lumpur

You can represent the structure of the above relation using the following shorthand notation:

Student (StudId, StudName, Age, Sex, Address)
         ------

Here Student is the relation name and the items in parentheses are the field names. Note that the key field (StudId) is underlined in the shorthand notation. (Key fields are used for linking tables and for searching for information in a database.)

7.3 Properties of Relations

We have defined relations as two-dimensional tables; however, not all tables are relations. Relations have certain properties, and only those tables that have these properties are relations. A relation has the following properties:

1. Entries in columns are atomic. There is only one entry at the intersection of a row and a column. That means there can be no multi-valued attributes or repeating groups in a relation. The above table meets this definition.

2. Entries in columns are from the same domain. The set of values that can appear in a column or field is called its domain. For example, the domains for the above table may be defined as follows:

Field    | Domain
StudId   | Three digits in the range 000 to 999.
StudName | A string of alphabetic characters (including blanks) up to 20 characters long.
Age      | An integer in the range 15 to 35.
Sex      | The characters F, f, M or m.
Address  | A string of characters up to 30 characters long.

The domain of an attribute is the set of all possible values for that attribute.
It is important to know the domain of an attribute in order to determine whether a given data item is valid for that attribute. For example, the value of Sex must be either F (f) or M (m). Anything else is invalid.

3. Each row is unique. No two rows in a relation are identical. The above table meets this requirement. This property ensures that each row in the table is meaningful and that a user can easily locate the desired data. The primary key guarantees the uniqueness of the rows in a relation. The key can be a single field or a combination of fields. The primary key values cannot be null (empty), as they would then not identify a row. This is known as the entity integrity constraint.

4. The sequence of columns is insignificant. The columns of a relation can be interchanged without changing the meaning or use of the relation. That means they can be arranged and stored in any sequence. The columns must, however, be referenced by name and not by position within the table (since position is not significant).

5. The sequence of rows is insignificant. As with columns, the rows of a relation can be interchanged. That means you can insert a new record anywhere in the table: at the beginning, end or middle.

6. If a value in one relation refers to a row in another relation, then that row must exist. This is known as the referential integrity constraint.

7.4 Well-Structured Relations

What is a well-structured relation? Simply put, a well-structured relation contains a minimum number of redundancies and allows users to insert, modify and delete the rows in a table without errors or inconsistencies. The Student table above is a well-structured relation. Each record describes one student. Any modification is confined to one row of the table.

Redundancies in a table may result in errors or inconsistencies when a user attempts to update the data in the table. These errors are called anomalies.
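
Several of the properties of relations listed earlier (domains, row uniqueness, entity integrity and referential integrity) can be handed to the database software to enforce. The following is a minimal sketch, assuming SQLite through Python's standard sqlite3 module. The Enrolment table and the try_insert helper are hypothetical names introduced only for illustration, and the CHECK clauses follow the example domains given for the Student relation (Address is left out for brevity).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked to

conn.executescript("""
CREATE TABLE Student (
    -- NOT NULL states entity integrity; the CHECKs state the example domains.
    StudId   TEXT NOT NULL PRIMARY KEY CHECK (StudId GLOB '[0-9][0-9][0-9]'),
    StudName TEXT NOT NULL CHECK (length(StudName) <= 20),
    Age      INTEGER CHECK (Age BETWEEN 15 AND 35),
    Sex      TEXT CHECK (Sex IN ('F', 'f', 'M', 'm'))
);
-- Hypothetical second relation, used only to show referential integrity.
CREATE TABLE Enrolment (
    StudId   TEXT NOT NULL REFERENCES Student (StudId),
    CourseId TEXT NOT NULL,
    PRIMARY KEY (StudId, CourseId)
);
""")


def try_insert(table, values):
    """Insert one row and report whether the database accepted it."""
    placeholders = ", ".join("?" for _ in values)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(f"INSERT INTO {table} VALUES ({placeholders})", values)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected ({exc})"


attempts = [
    ("Student", ("444", "Marina", 25, "F")),    # valid row
    ("Student", ("444", "Mei Ling", 25, "F")),  # duplicate primary key
    ("Student", (None, "Zainal", 26, "M")),     # null primary key
    ("Student", ("222", "Anthony", 27, "X")),   # Sex outside its domain
    ("Enrolment", ("444", "IS300")),            # refers to an existing student
    ("Enrolment", ("999", "IS300")),            # refers to a missing student
]
for table, values in attempts:
    print(table, values, "->", try_insert(table, values))
```

Under these assumptions only the first Student row and the first Enrolment row are accepted; each of the other rows breaks one of the rules and is refused by the database rather than by application code.
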
There are three types of anomalies: insertion, deletion and modification anomalies. To illustrate these anomalies, let's use the following table (which is not a well-structured relation):

StudId | StudName | Tel  | Major | CourseId | CourseTitle | TeacherName | TeacherRoom | Grade
111    | Marina   | 1234 | BU    | BU100    | Bus Org     | Wong        | A1          | B
111    | Marina   | 1234 | BU    | BU200    | Econs       | Aziz        | A2          | A
111    | Marina   | 1234 | BU    | IS300    | Database    | Shoba       | B3          | C
222    | Anthony  | 2345 | IS    | IS200    | Info Sys    | Allen       | B4          | B
222    | Anthony  | 2345 | IS    | IS300    | Database    | Shoba       | B3          | A

Insertion anomaly. Suppose that we want to insert a new course. We cannot insert it into the table until a student has registered for that course.

Deletion anomaly. Suppose that student number 222 is deleted from the table. This will result in losing the information on course IS200.

Modification anomaly. Suppose that student number 222 changes his telephone number. We must record this change in several rows of the table.

The above anomalies indicate that the table given above is not a well-structured relation. The problem with this table is that it contains data on more than one entity, e.g., Student and Course. The solution to these anomalies is normalization.

7.5 Normalization

Normalization is a process of converting complex data structures into simpler, stable data structures. It is based on the analysis of functional dependence (discussed below). The process of normalization is often accomplished in stages, each of which corresponds to a normal form. The dependencies are removed successively by applying a few simple rules.

The steps in normalization are as follows:

1. Remove repeating groups. This produces a single value at the intersection of each row and column of the table (explained in the example below). The resulting table is in First normal form (1NF).
2. Remove partial functional dependencies. This results in Second normal form (2NF).
3. Remove transitive dependencies.
This results in Third normal form (3NF).
4. Remove remaining anomalies resulting from functional dependencies. This results in Boyce-Codd normal form (BCNF).
5. Remove multi-valued dependencies. This results in Fourth normal form (4NF).
6. Remove remaining anomalies. This results in Fifth normal form (5NF).

A functional dependency is a particular relationship between two attributes. For any relation R, attribute B is functionally dependent on attribute A if, for every valid occurrence of A, that value of A uniquely determines the value of B. The functional dependence of B on A is represented using the arrow symbol (→) as follows: A → B. An attribute may also be functionally dependent on two or more attributes.

If A → B and B → C, then A → C. That is, if B depends on A and C depends on B, then C depends on A. This is called transitive dependency.

Although five normal forms are listed in the literature, there is seldom a need to go beyond the Third normal form (3NF). So here we will only discuss normalization up to 3NF. To illustrate normalization, let's use the table given below.

Grade
StudId | StudName | Tel  | Major | CourseId | CourseTitle | TeacherName | TeacherRoom | Grade
111    | Marina   | 1234 | BU    | BU100    | Bus Org     | Wong        | A1          | B
       |          |      |       | BU200    | Econs       | Aziz        | A2          | A
       |          |      |       | IS300    | Database    | Shoba       | B3          | C
222    | Anthony  | 2345 | IS    | IS200    | Info Sys    | Allen       | B4          | B
       |          |      |       | IS300    | Database    | Shoba       | B3          | A

The above table contains repeated entries: the course data is repeated for each student. As a result, there are multiple values at the intersection between certain rows and columns. For example, there are three values for CourseId (BU100, BU200 and IS300) for Marina. To normalize the above table, use the following steps:

Step 1: Remove repeating groups. This is done easily by copying the information above downwards, as shown in the Grade2 table below. The table (Grade2) is now in 1NF.
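
The copying-down in Step 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: blank cells of the repeating group are represented as None, and fill_down is a hypothetical helper name, not part of any library.

```python
# The unnormalized Grade table; None marks a cell left blank in a repeating group.
grade = [
    ("111", "Marina", "1234", "BU", "BU100", "Bus Org", "Wong", "A1", "B"),
    (None, None, None, None, "BU200", "Econs", "Aziz", "A2", "A"),
    (None, None, None, None, "IS300", "Database", "Shoba", "B3", "C"),
    ("222", "Anthony", "2345", "IS", "IS200", "Info Sys", "Allen", "B4", "B"),
    (None, None, None, None, "IS300", "Database", "Shoba", "B3", "A"),
]


def fill_down(rows):
    """Copy each blank cell from the nearest filled cell above it."""
    filled = []
    previous = None
    for row in rows:
        if previous is not None:
            row = tuple(prev if cell is None else cell
                        for cell, prev in zip(row, previous))
        filled.append(row)
        previous = row
    return filled


grade2 = fill_down(grade)
for row in grade2:
    print(row)
```

After the call, every row of grade2 carries its own student data, so each row and column intersection holds exactly one value.
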

Grade2 (1NF)
StudId | StudName | Tel  | Major | CourseId | CourseTitle | TeacherName | TeacherRoom | Grade
111    | Marina   | 1234 | BU    | BU100    | Bus Org     | Wong        | A1          | B
111    | Marina   | 1234 | BU    | BU200    | Econs       | Aziz        | A2          | A
111    | Marina   | 1234 | BU    | IS300    | Database    | Shoba       | B3          | C
222    | Anthony  | 2345 | IS    | IS200    | Info Sys    | Allen       | B4          | B
222    | Anthony  | 2345 | IS    | IS300    | Database    | Shoba       | B3          | A

As explained, the above relation suffers from the insertion, deletion and modification anomalies. That means the relation is not well-structured; it requires further normalization.

Step 2: Remove partial functional dependencies. To apply step 2, we must first analyze the functional dependencies in the above relation and then select a primary key for the relation. The above relation reveals the following dependencies:

1. StudId → StudName, Tel, Major
2. CourseId → CourseTitle, TeacherName, TeacherRoom
3. StudId, CourseId → Grade (dependent on two fields)
4. TeacherName → TeacherRoom

In (1), StudName, Tel and Major are functionally dependent on StudId. Similarly, in (2), CourseTitle, TeacherName and TeacherRoom are functionally dependent on CourseId. And so on.

A candidate key for a relation is a set of attributes that uniquely determines all the other attributes and is non-redundant (no attribute can be dropped from it). Careful examination shows that there is one such key, containing both StudId and CourseId. It is called a composite primary key: a primary key that contains more than one attribute. The relation is shown as follows:

GRADE2 (StudId, StudName, Tel, Major, CourseId, CourseTitle,
        ------                        --------
        TeacherName, TeacherRoom, Grade)

The composite key (StudId, CourseId) is underlined.

Next we examine the functional dependencies on this composite key. Three attributes (StudName, Tel and Major) are functionally dependent on part of the key (StudId), as shown in Figure 7.1. Similarly, the attributes CourseTitle, TeacherName and TeacherRoom are functionally dependent on another part of the key (CourseId).
These six attributes are partially functionally dependent on the primary key. Only the attribute Grade is functionally dependent on both StudId and CourseId. That means we must know both the StudId and the CourseId taken in order to identify Grade. Since partial dependencies exist, the above relation is not in 2NF.

Figure 7.1: Functional dependencies

To convert to 2NF, we must remove the partial dependencies. Close examination shows that we can do this by creating three new relations as follows:

Student (StudId, StudName, Tel, Major)
         ------
CourseTeacher (CourseId, CourseTitle, TeacherName, TeacherRoom)
               --------
Registration (StudId, CourseId, Grade)
              ------  --------

The three new relations are as follows:

Student
StudId | StudName | Tel  | Major
111    | Marina   | 1234 | BU
222    | Anthony  | 2345 | IS

CourseTeacher
CourseId | CourseTitle | TeacherName | TeacherRoom
BU100    | Bus Org     | Wong        | A1
BU200    | Econs       | Aziz        | A2
IS300    | Database    | Shoba       | B3
IS200    | Info Sys    | Allen       | B4

Registration
StudId | CourseId | Grade
111    | BU100    | B
111    | BU200    | A
111    | IS300    | C
222    | IS200    | B
222    | IS300    | A

The above relations are in 2NF, as each non-key attribute in the relations is fully dependent on the key for that relation.

Step 3: Remove transitive dependencies. The Student and Registration relations are already in 3NF. However, CourseTeacher is still only in 2NF. That means it is still subject to the anomalies discussed above. For example, if we want to change the location of the teacher Shoba from room B3 to A3, we must make this change in every row for a course that she teaches. Similarly, if we delete the course IS200 and there is only one row for IS200, then we may lose the information that the teacher Allen is located in room B4.

The anomalies in the CourseTeacher relation exist because data concerning the entity Teacher are hidden within CourseTeacher.
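
The dependencies listed in Step 2, and the TeacherName to TeacherRoom dependency behind the anomalies just described, can be checked mechanically against the sample rows. The sketch below is a hedged illustration: holds and project are hypothetical helper names, and a check on sample data can only refute a dependency, never prove that it holds for all future data.

```python
COLUMNS = ("StudId", "StudName", "Tel", "Major", "CourseId",
           "CourseTitle", "TeacherName", "TeacherRoom", "Grade")

ROWS = [
    ("111", "Marina", "1234", "BU", "BU100", "Bus Org", "Wong", "A1", "B"),
    ("111", "Marina", "1234", "BU", "BU200", "Econs", "Aziz", "A2", "A"),
    ("111", "Marina", "1234", "BU", "IS300", "Database", "Shoba", "B3", "C"),
    ("222", "Anthony", "2345", "IS", "IS200", "Info Sys", "Allen", "B4", "B"),
    ("222", "Anthony", "2345", "IS", "IS300", "Database", "Shoba", "B3", "A"),
]
GRADE2 = [dict(zip(COLUMNS, row)) for row in ROWS]


def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate rows."""
    result = []
    for row in rows:
        item = tuple(row[a] for a in attrs)
        if item not in result:
            result.append(item)
    return result


# Dependencies on part of the key (partial) and on the whole key.
print(holds(GRADE2, ["StudId"], ["StudName", "Tel", "Major"]))
print(holds(GRADE2, ["CourseId"], ["CourseTitle", "TeacherName", "TeacherRoom"]))
print(holds(GRADE2, ["StudId", "CourseId"], ["Grade"]))
# Dependency between two non-key attributes (the source of the transitivity).
print(holds(GRADE2, ["TeacherName"], ["TeacherRoom"]))
# Grade needs both parts of the key: StudId alone does not determine it.
print(holds(GRADE2, ["StudId"], ["Grade"]))

for row in project(GRADE2, ["CourseId", "CourseTitle", "TeacherName", "TeacherRoom"]):
    print(row)
```

The first four checks succeed on this data and the fifth fails, which matches the analysis in the text: Grade depends on the whole key, while the other attributes depend on only part of it or on another non-key attribute.
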
The functional dependencies in this relation are:

CourseId → CourseTitle, TeacherName, TeacherRoom
TeacherName → TeacherRoom

As TeacherRoom is functionally dependent on TeacherName (a non-key attribute), there is a transitive dependency in the relation. Therefore, the relation CourseTeacher is not yet in 3NF. To remove this transitive dependency from CourseTeacher, we split it into two relations, Course and Teacher, as follows:

Course
CourseId | CourseTitle | TeacherName
BU100    | Bus Org     | Wong
BU200    | Econs       | Aziz
IS300    | Database    | Shoba
IS200    | Info Sys    | Allen

Teacher
TeacherName | TeacherRoom
Wong        | A1
Aziz        | A2
Shoba       | B3
Allen       | B4

The relation Course contains the following attributes: CourseId (key), CourseTitle and TeacherName. The relation Teacher contains two attributes: TeacherName (key) and TeacherRoom. Thus TeacherName becomes the primary key in the new relation Teacher and a foreign key in the new Course relation. A foreign key is an attribute that appears as a non-key attribute in one relation and as a primary key in another relation.

We have transformed the original table into a set of four relations in 3NF through a series of simple steps. The complete set is as follows:

Student (StudId, StudName, Tel, Major)
         ------

StudId | StudName | Tel  | Major
111    | Marina   | 1234 | BU
222    | Anthony  | 2345 | IS

Course (CourseId, CourseTitle, TeacherName)
        --------

CourseId | CourseTitle | TeacherName
BU100    | Bus Org     | Wong
BU200    | Econs       | Aziz
IS300    | Database    | Shoba
IS200    | Info Sys    | Allen

Teacher (TeacherName, TeacherRoom)
         -----------

TeacherName | TeacherRoom
Wong        | A1
Aziz        | A2
Shoba       | B3
Allen       | B4

Registration (StudId, CourseId, Grade)
              ------  --------

StudId | CourseId | Grade
111    | BU100    | B
111    | BU200    | A
111    | IS300    | C
222    | IS200    | B
222    | IS300    | A

The above relations are free from the insertion, deletion and modification anomalies. As each entity is described in a separate relation, we can insert or delete data on each entity without reference to other entities. Modification to an entity can also be done easily, since any change is confined to a single row. No information has been lost through the normalization process.

As mentioned previously, the relational database management system (RDBMS) is the most widely used type of database system today. Some of the main reasons for this are:

1. The RDBMS is well documented in the literature.
2. It is well understood.
3. It is widely taught in schools, colleges, universities and training centers. That means it is easier (and probably less expensive) to recruit database personnel who are familiar with the RDBMS than with the hierarchical or network database systems.
4. It is also fast, as it uses powerful database engines such as the Jet Database Engine.
5. It is user-friendly, as it uses a graphical user interface (GUI).
6. It provides good backup and security features such as passwords, encryption and views.
7. It has Fourth Generation Language (4GL) features. For example, it comes with an easy-to-use Structured Query Language (SQL) that allows end-users to retrieve information quickly without having to write code (programs).
8. Some RDBMS also provide documentation features. This not only simplifies and reduces software development and maintenance time, but also improves software quality.
9. It supports rich data types such as currency, date, logical, memo and general (for images, sound and video).
10. It provides many more features than the other database models.
11. It has the support of powerful software vendors (such as Microsoft and Oracle). To retain their customers, they usually provide an easy upgrade path.
That means customers do not have to redo their work when they change to a newer version of the database software, operating system or hardware.
12. It supports network applications, including concurrency and locking mechanisms. That means many users can work on an application simultaneously without running into coordination or synchronization problems.
13. It supports the popular client/server architecture.
14. It can be linked to Internet applications.

7.6 Database Design Considerations

A good database should exhibit the following characteristics:

1. Minimal data redundancy. Duplication of data in a database should be minimal. Although there will be some redundancies in a database (e.g., the key fields), such redundancies are controlled.

2. Data consistency. If the same data (or information) is stored in multiple tables, then the copies should be identical. For example, if the occupational code of an employee is modified, then all other occurrences of that information in the database must also be updated. Otherwise, the data will be inconsistent.

3. Data integration. Data stored in one table should be easily accessible from other related tables. For example, if the employee record has a job code, then it should be possible to obtain his or her job title.

4. Data sharing. Often there will be many users who share the same database. Each user may be given a view of the database. A view roughly corresponds to a subset of the database fields.

5. Security, privacy and integrity controls. There must be proper controls for accessing, updating and protecting data. This often requires setting standards and procedures.

6. Data accessibility and responsiveness. Users should be able to access the information stored in a database using simple commands, such as SQL statements.

7. Data independence. Data should be separated from the application programs that use the data.
In other words, the data stored in a database can change without necessitating changes in the application programs. This calls for some kind of data dictionary or repository to store information on the data stored in the database.

8. Simplified application development and maintenance. Developing and documenting applications should be easy and fast. This implies the use of 4GLs or CASE (computer-aided software engineering) features.

7.7 Relational Database Management Systems

The term relational database management system (RDBMS) refers to software that supports the relational data model. There are several RDBMS currently available in the market. These include Access, FoxPro, dBASE, Paradox, Clipper, Informix, Oracle, DB400 and DB2. Some of them run only on mainframes and minis (e.g., DB2, DB400); some only on PCs (e.g., Access, FoxPro); while others may run on both types (e.g., Oracle and Informix).

The use of an RDBMS greatly simplifies the design of databases. Most RDBMS today provide many sophisticated features, such as graphical user interfaces (GUI), a variety of data types (e.g., text, date/time, number, currency, memo, general/OLE (for objects)), automatic code generation and even diagramming support. They also provide support for designing forms, queries and reports. Many also provide query language support (e.g., Structured Query Language (SQL)). All these features greatly simplify the task of designing databases. Sometimes, the term Fourth Generation Language (4GL) or Computer-Aided Software Engineering (CASE) is used to refer to these RDBMS.

7.8 The RDBMS Components

The RDBMS environment includes the following components:

1. CASE tools. These are automated tools used to design databases and application programs.
2. Repository. Stores information about the database, such as data definitions, screen formats and report formats.
3. RDBMS.
The relational database software for creating, calculating, maintaining and accessing information in the database. It also manages the information repository.
4. Database. A shared collection of logically related data, designed to meet the information needs of multiple users in an organization.
5. Application programs. Programs written using the database software for creating and manipulating databases.
6. User interface. Languages, menus, icons and other features by which users interact with the system components such as the application programs, DBMS, CASE tools and repository.
7. Data administrators. People who are responsible for the overall planning and management of the data resource.
8. System developers. People (e.g., system analysts, programmers) who develop new applications.
9. End users. People who actually use the system: those who add, modify and delete information as well as those who receive information from the system.

Figure 7.2 shows the above components and their relationship with one another.

Figure 7.2: Components of a database environment (data administrators, end users, CASE tools, application programs, DBMS, repository, database)

1. Define the term relational data model. List the main characteristics of this data model.
2. Distinguish between a table and a relation.
3. Explain the terms (a) primary key, (b) foreign key and (c) composite key.
4. What is a well-structured relation?
5. Define the term normalization. Describe the steps for normalizing an ill-structured relation.
6. Given the following table:

Order No    | Order Date  | Customer Name | Item Code   | Item Quantity | Item Description | Unit Price | Vendor Code | Vendor Name
[illegible] | [illegible] | Yasmin        | [illegible] | 20            | Pen              | 7.00       | VP          | Penco
[illegible] | [illegible] | Yasmin        | [illegible] | 15            | Stapler          | 3.00       | VS          | Staples
[illegible] | [illegible] | Yasmin        | [illegible] | 10            | Ruler            | 0.50       | VR          | Rulerco

Normalize the above table as far as necessary so that the resulting tables represent simple relations.

7. A real estate agency rents houses to customers on behalf of the house owners. The table gives information about houses, customers and owners:

Cust No | Cust Name | House No | House Address | Start Date  | End Date    | Rent | Owner No | Owner Name
111     | Wong      | H1       | 5 New St, KL  | [illegible] | [illegible] | 1000 | W1       | Karen
        |           | H2       | 6 Old St, KL  | [illegible] | [illegible] | 900  | W1       | Karen
        |           | H3       | [illegible]   | 7/10/97     | [illegible] | 1200 | W2       | Zainal
222     | Yasmin    | H4       | [illegible]   | 5/5/95      | 4/5/97      | 1100 | W1       | Karen
        |           | H5       | [illegible]   | [illegible] | [illegible] | 1000 | W2       | Zainal

Normalize the above table as far as necessary so that the resulting tables represent simple relations.

8. List and describe the characteristics of a good database design. Describe the major components of a relational database environment.
