
International Journal of Wisdom Based Computing, Vol. 1(3), December 2011

A Web Based Relational Database Design Tool to Perform Normalization

Radhakrishna Vangipuram
Associate Professor of CSE Tallapadmavathi College of Engineering Warangal, A.P, INDIA e-mail:

Raju Velpula
Department of CSE Ramappa Engineering College Warangal, INDIA email:

Department of CSE TallaPadmavathi College of Engg Warangal, INDIA

Abstract- Several textbooks and technical papers have been published with the aim of explaining normalization. Most of them, however, restrict themselves to defining the various normal forms and leave the students or readers to normalize the relations on their own. As a result, many readers fail to understand the database design process, which is essential for any student of a CSE/IT/IS curriculum. Some authors have introduced e-learning tools, but their work was restricted to a small number of functional dependencies, say 5 to 10. In this work we present an interactive web based normalization tool that can currently handle as many as 30 redundant attributes in the FDs and more than 50 complex functional dependencies. The database design tool that we have developed can be an asset to faculty and students, and can also help database design engineers in industry to verify their work.

Keywords- Normalization, Functional dependency, Normal forms, lossless and lossy decompositions, web-based tools

I. INTRODUCTION

All the computing disciplines specified by the IEEE/ACM/AIS curricula provide general guidelines for a database design course [1], but no guidance is given on how to meet those guidelines. How to take students through all the stages of normalization is also an important problem. In the literature, several books and papers have been published on databases. The classical database normalization technique has often relied on the definitions of the various normal forms. Understanding the normalization process requires an extensive background in relational algebra that most CSE/IS/IT students lack. Most systems analysis and design textbooks simply state the definitions of the first, second and third normal forms (1NF, 2NF and 3NF) and hope that students will be able to apply the definitions and normalize a set of tables. This approach may not always help students to understand the concept of database normalization effectively, and as a result they fail to understand the process involved in normalization. Some database textbooks include normalization algorithms that find the canonical cover by removing redundant (spurious) attributes from the functional dependencies (FDs) and then convert each FD in the canonical cover to a relation/table [3,6].

In this paper, we propose a web based relational database design tool that shows each and every step in the normalization process. The user only has to input the name of the relation, the number of attributes and the set of functional dependencies. The tool then computes the set of all possible candidate keys, the primary key, and the prime and non-prime attributes. The functional dependencies are also classified into full and partial FDs, and finally the relation is decomposed into 3NF.

II. BASIC CONCEPTS

The first step before performing normalization is to obtain the candidate keys of the given relation schema, say R. For this reason we start by explaining the procedure for obtaining candidate keys.

Step 1: Computing candidate keys from the given set of FDs
Step a: Obtain the attributes which are present only on the LHS of the FDs.
Step b: Obtain the attributes which are present only on the RHS of the FDs.
Step c: Obtain the attributes which are present on both the LHS and the RHS of the FDs.
Step d: Obtain the candidate keys (CKs) by using the following rules.


Rule 1: Take all the attributes of R obtained in step (a) and call this set S. Compute the closure S+ of S. If S+ contains all the attributes of the given relation schema R, then S is the candidate key (CK) of R.
Rule 2: Never include any attribute obtained in step (b) when finding a candidate key of R.
Rule 3: If Rule 1 does not give a CK of R, add each possible combination of the attributes obtained in step (c) to the set of attributes obtained in step (a), and compute the closure of each resulting set. Keep only the combinations whose closure contains all the attributes of R, and call them possible keys.
Rule 4: From the set of possible keys obtained above, choose the keys having the minimum number of attributes.
Step e: Finding the primary key
In this step, choose as the primary key (PK) the CK with the minimum number of attributes.

Step 2: Classify the functional dependencies into full and partial dependencies and place the relation in 2NF.
Full functional dependency: A dependency X → Y is said to be a full FD if, after removing any attribute or set of attributes A from X, the dependency no longer holds.
Partial functional dependency: A dependency X → Y is said to be a partial FD if the dependency still holds after removing some attribute or set of attributes A from X.

Step 3: Decompose the relation into 2NF by using Heath's theorem.
Definition: A relation schema R is said to be in 2NF if each non-prime attribute of R is fully functionally dependent on the candidate keys of R. In other words, there should not be any partial FDs in R.
Decomposing to 2NF: Once the partial and full FDs have been found, take each FD that violates the definition of 2NF and decompose the relation into two sub-relations using Heath's theorem, repeating until no remaining FD in the FD set violates 2NF.
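The full/partial distinction defined in Step 2 can be checked mechanically with attribute closures. The following is a minimal sketch, not the tool's actual code: the helper names `closure` and `is_partial` and the small FD set on R(A, B, C, D) are assumptions made only for illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD once its whole left-hand side has been derived.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_partial(lhs, rhs, fds):
    """True if some proper, non-empty subset of `lhs` already determines `rhs`."""
    for size in range(1, len(lhs)):
        for subset in combinations(sorted(lhs), size):
            if rhs <= closure(subset, fds):
                return True
    return False


# Hypothetical FD set on R(A, B, C, D), chosen only for illustration.
fds = [
    ({"A", "B"}, {"C"}),
    ({"A"}, {"D"}),
    ({"A", "B"}, {"D"}),
]

for lhs, rhs in fds:
    kind = "partial" if is_partial(lhs, rhs, fds) else "full"
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)), "is", kind)
```

In this example AB → D is reported as partial because A alone already determines D, while AB → C is full because neither A nor B determines C on its own.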
Heath's theorem: Let R be a relation schema whose attributes are partitioned into the sets X, Y and Z, denoted R(X,Y,Z), with the FD X → Y holding on it. Then R can be decomposed into two sub-relations
i) R1(X,Y), with X → Y holding on R1, and
ii) R2(X,Z).
Such a decomposition is said to be lossless.

Step 4: Decompose the relation into 3NF
A relation schema R is said to be in 3NF if no non-prime attribute of R is transitively dependent on the prime attributes of R. In other words, there must be no transitive dependencies in R; any transitive dependencies that exist in R have to be eliminated.

III. CASE STUDY AND SCREEN SHOTS

Consider the relation schema R(A,B,C,D,E,F,G,H,I,J) with the following set of FDs holding on R:
FD = { AB → C, A → DE, B → F, F → GH, D → IJ }
The steps followed to perform normalization are explained below.
Step 1: Compute the candidate keys. Here the only possible primary key is PK = CK1 = {A,B}.
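The key computation for this schema can be reproduced with a short sketch of Step 1 and its rules. This is an illustration under stated assumptions rather than the tool's implementation: the function names are hypothetical, attributes that appear in no FD are added to the starting set (a case the rules do not mention), and a combination is skipped when it contains a key already found, which is one reading of Rule 4.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Find candidate keys following steps (a) to (d) of the procedure."""
    all_attrs = set(attributes)
    lhs_attrs = set().union(*(lhs for lhs, _ in fds))
    rhs_attrs = set().union(*(rhs for _, rhs in fds))
    # Step (a): LHS-only attributes. Attributes in no FD are included too
    # (an added assumption: nothing derives them, so every key needs them).
    core = all_attrs - rhs_attrs
    # Step (c): attributes on both sides are the only optional key members.
    both = sorted(lhs_attrs & rhs_attrs)
    # Step (b) attributes (RHS only) are never tried, as Rule 2 requires.
    keys = []
    for size in range(len(both) + 1):  # size 0 is Rule 1, larger sizes Rule 3
        for extra in combinations(both, size):
            candidate = core | set(extra)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return keys


# FD set of the case study.
fds = [
    ({"A", "B"}, {"C"}),
    ({"A"}, {"D", "E"}),
    ({"B"}, {"F"}),
    ({"F"}, {"G", "H"}),
    ({"D"}, {"I", "J"}),
]

keys = candidate_keys("ABCDEFGHIJ", fds)
primary_key = min(keys, key=len)
print([sorted(k) for k in keys], sorted(primary_key))
```

For the case study, A and B are the only LHS-only attributes and their closure already covers R, so Rule 1 alone yields the single key {A,B}.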

Fig 1 : Screenshot showing the computation of all the possible CKs, PKs, Prime and Non-Prime attributes


Step 2: Classify the FDs into full and partial FDs. Here A → DE and B → F are the partial FDs, as the LHS of each of these FDs is a proper subset of the CK {A,B}. So the given relation schema R is not in 2NF.

Fig 2 : Computation of PFDs and FFDs

Step 3: Placing the relation in 2NF. To place the relation in 2NF, remove the partial FDs A → DE and B → F. After eliminating the partial FDs, the new sets of relations are
Solution 1: R1(A,D,E,I,J) R2(B,F,G,H) R3(A,B,C)
Solution 2: R1(A,D,E) R2(B,F) R3(A,B,C,G,H,I,J)

Step 4: Placing the relation in 3NF. To place the relation in 3NF, remove the transitive FDs. After eliminating the transitive FDs, the final set of relations in 3NF is
R1(A,D,E) R2(D,I,J) R3(B,F) R4(F,G,H) R5(A,B,C)

Fig 3: Sample screen showing decomposition of given relation R into 3NF
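The decompositions of Steps 3 and 4 can be retraced with Heath's theorem in its standard form, which splits R(X,Y,Z) on X → Y into (X,Y) and (X,Z). The sketch below is an assumption-laden illustration, not the tool's code: `heath_split` is a hypothetical helper that takes Y to be everything in the relation that X derives, and the determinant used at each stage is chosen by hand.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def heath_split(relation, determinant, fds):
    """Split `relation` on X -> Y and return the pair (X + Y, X + Z).

    Y is taken to be every other attribute of `relation` derived by X.
    """
    x = set(determinant)
    y = (closure(x, fds) & relation) - x
    return x | y, relation - y


# FD set and schema of the case study.
fds = [
    ({"A", "B"}, {"C"}),
    ({"A"}, {"D", "E"}),
    ({"B"}, {"F"}),
    ({"F"}, {"G", "H"}),
    ({"D"}, {"I", "J"}),
]
r = set("ABCDEFGHIJ")

# 2NF: split off what each part of the key {A, B} determines on its own.
r_a, rest = heath_split(r, "A", fds)
r_b, r_key = heath_split(rest, "B", fds)

# 3NF: split each sub-relation again on its transitive determinant.
r_dij, r_ade = heath_split(r_a, "D", fds)
r_fgh, r_bf = heath_split(r_b, "F", fds)

for rel in (r_ade, r_dij, r_bf, r_fgh, r_key):
    print("".join(sorted(rel)))
```

The two 2NF splits give the relations of Solution 1, and the two further splits give the five relations listed as the final 3NF result.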

The relational database design tool for performing normalization is a web based application developed using JDK 1.6.0 and Apache Tomcat server 5.0, with MS-Access 2007 as its database, and it has been extensively tested. The tool takes as input a) the name of the relation, say R, b) the number of attributes in R, and c) the set of functional dependencies of R. A provision is made to find the set of all possible candidate keys of R, the primary key and the super key. The tool computes the prime and non-prime attributes of R based on the CKs generated.
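How the prime and non-prime attributes follow from the candidate keys can be sketched in a few lines. Everything here is hypothetical: the paper does not describe the tool's input syntax, so the textual FD format `AB->C, A->DE` with single-letter attribute names and the function names are assumptions made for illustration.

```python
def parse_fds(text):
    """Parse 'AB->C, A->DE' into a list of (lhs, rhs) attribute sets.

    Assumes single-letter attribute names; this format is hypothetical.
    """
    fds = []
    for part in text.split(","):
        lhs, rhs = part.split("->")
        fds.append((set(lhs.strip()), set(rhs.strip())))
    return fds


def prime_and_non_prime(attributes, keys):
    """Prime attributes belong to at least one candidate key; the rest are non-prime."""
    prime = set().union(*keys)
    return prime, set(attributes) - prime


def is_superkey(attrs, keys):
    """A super key is any attribute set that contains some candidate key."""
    return any(key <= set(attrs) for key in keys)


fds = parse_fds("AB->C, A->DE, B->F, F->GH, D->IJ")
keys = [{"A", "B"}]  # the candidate key of the case study
prime, non_prime = prime_and_non_prime("ABCDEFGHIJ", keys)
print(sorted(prime), sorted(non_prime), is_superkey("ABC", keys))
```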


One important feature of this tool is that it generates all possible sets of normalized relations, since more than one solution may exist. A provision is made to show the decomposition of the given relation schema R into 2NF and 3NF separately and to obtain all the possible solutions. The step-by-step procedure for evaluating the candidate key can also be viewed in the tool, which makes the concept of finding keys very clear. A provision is also made to check whether the decomposition is lossless or dependency preserving.

V. CONCLUSION

In this work we have developed an interactive relational database design tool that makes the normalization process user friendly by showing each step carried out in the process. The tool was evaluated by students and faculty of our organization and was found to be very helpful. Students' responses to the tool were mostly favorable. The students indicated that they found the tool easy to use and that the step-by-step feature helped them gain an understanding of the database normalization process in less time. Presently the tool can normalize any relation whose set of FDs contains up to 30 redundant attributes, which has not been done in any earlier e-learning tool, and it can handle as many as 50 complex functional dependencies and compute all the possible solutions that exist. Above all, the tool is designed so that the user can interact with it and solve the problem by applying his or her own decision at each stage of the normalization process, which makes this tool unique of its kind.

REFERENCES
C. J. Date, An Introduction to Database Systems, 8th Edition, Addison-Wesley (2004).
T. Connolly and C. Begg, Database Systems: A Practical Approach to Design, Implementation and Management, 4th Edition, Addison-Wesley (2005).
A. Silberschatz, H. Korth and S. Sudarshan, Database System Concepts, 5th Edition, McGraw-Hill (2005).
R. Elmasri and S. B. Navathe, Fundamentals of Database Systems, 5th Edition, Addison-Wesley (2007).
S. Sumathi and S. Esakkirajan, Fundamentals of RDBMS, Springer International Edition (2008).