To properly normalize a database, there are a number of processes we need to go through.
To fully understand these processes, we first need to understand the terms used around normalization, so that we can go through and correctly normalize our database.
The first one is an entity.
An entity is a thing: it could be a person, it could be an order. We represent entities in databases as tables.
You may think of an entity as a single individual, but here we treat it as a group of individual records (individual people, individual orders), and that group becomes a table in our database design.
So an entity ultimately corresponds to a table.
The entities, or tables, have attributes.
If this were a spreadsheet, the attributes would be the columns: it could be a person’s name, it could be the value of an order, it could be a date.
All of those things are descriptions, or attributes, of our entity.
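As a minimal sketch of the entity-to-table idea, the snippet below uses Python's built-in sqlite3 module. The table name `customer_order`, its column names, and the sample rows are made up for illustration and do not come from the talk.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# The "order" entity becomes a table; each attribute becomes a column.
# (Hypothetical names: customer_name, order_value, order_date.)
conn.execute(
    "CREATE TABLE customer_order (customer_name TEXT, order_value REAL, order_date TEXT)"
)

# Each individual order is one record (row) of the entity.
conn.execute("INSERT INTO customer_order VALUES ('Alice', 20.0, '2024-01-05')")
conn.execute("INSERT INTO customer_order VALUES ('Bob', 35.5, '2024-01-06')")

# PRAGMA table_info returns one row per column; index 1 is the column name.
attributes = [row[1] for row in conn.execute("PRAGMA table_info(customer_order)")]
record_count = conn.execute("SELECT COUNT(*) FROM customer_order").fetchone()[0]

print(attributes)
print(record_count)
```

The list of column names is the set of attributes, and the row count is the number of individual records that make up the entity.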
We need to uniquely identify each of the records within our entity, and for that we have a thing called a primary key.
The primary key is the unique identifier: the core value, or values, that identify an individual record of that entity.
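A small sketch of what "unique identifier" means in practice, again with sqlite3. The `employee` table and its rows are assumptions for illustration; the point is only that the database refuses a second record with the same primary key value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# employee_id is declared as the primary key of the (hypothetical) employee table.
conn.execute("CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO employee VALUES (1, 'Alice')")


def try_insert(employee_id, name):
    """Return True if the row was stored, False if the key was rejected."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", (employee_id, name))
        return True
    except sqlite3.IntegrityError:
        # Raised when the primary key value already exists.
        return False


duplicate_accepted = try_insert(1, "Bob")  # same employee_id as Alice
new_key_accepted = try_insert(2, "Bob")    # a fresh employee_id

print(duplicate_accepted, new_key_accepted)
```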
The primary key can be made up of multiple columns. It may be a single column, such as an employee ID, but it could also be several columns.
Often we will have a scenario where no single value is unique on its own, but a combination of values is unique.
Something plus a date makes it unique; an order plus the customer, in combination, might make it unique.
When two or more columns together provide the primary key, that is called a composite key.
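The composite key idea can be sketched as follows. The assumption here, made purely for illustration, is that a customer places at most one order per day, so `customer_id` plus `order_date` is unique even though neither column is unique alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Neither column is unique by itself; the pair is declared as the primary key.
conn.execute(
    """
    CREATE TABLE customer_order (
        customer_id INTEGER,
        order_date  TEXT,
        order_value REAL,
        PRIMARY KEY (customer_id, order_date)
    )
    """
)

# customer_id 1 repeats and the date 2024-01-05 repeats, but no pair repeats.
rows = [(1, "2024-01-05", 20.0), (1, "2024-01-06", 35.0), (2, "2024-01-05", 12.5)]
conn.executemany("INSERT INTO customer_order VALUES (?, ?, ?)", rows)

try:
    # Same (customer_id, order_date) pair as the first row.
    conn.execute("INSERT INTO customer_order VALUES (1, '2024-01-05', 99.0)")
    repeated_pair_accepted = True
except sqlite3.IntegrityError:
    repeated_pair_accepted = False

stored = conn.execute("SELECT COUNT(*) FROM customer_order").fetchone()[0]
print(repeated_pair_accepted, stored)
```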
When we are designing these tables, we start with things called candidate keys, because often there will be several different columns you might think are suitable as a primary key.
You can only have one primary key, so we have to look at the candidates and then justify which one of them is the best primary key for our data.
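One rough way to shortlist candidate keys is to check which minimal sets of columns are unique in some sample data. The helper names and the employee rows below are hypothetical. Uniqueness in a sample only suggests a candidate; whether a column is really a key is a design decision about what the data means.

```python
from itertools import combinations


def is_unique(rows, columns):
    """True if no two rows share the same values for the given columns."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows, columns):
    """Minimal column sets whose values are unique in this sample."""
    found = []
    for size in range(1, len(columns) + 1):
        for combo in combinations(columns, size):
            # A superset of a key already found is not minimal, so skip it.
            if any(set(k) <= set(combo) for k in found):
                continue
            if is_unique(rows, combo):
                found.append(combo)
    return found


# Made-up sample data: two columns are unique, one is not.
employees = [
    {"employee_id": 1, "email": "a@example.com", "department": "Sales"},
    {"employee_id": 2, "email": "b@example.com", "department": "Sales"},
    {"employee_id": 3, "email": "c@example.com", "department": "IT"},
]

keys = candidate_keys(employees, ["employee_id", "email", "department"])
print(keys)
```

With this sample, `employee_id` and `email` both come back as candidates, and picking one of them as the primary key is the justification step described above.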
Now, of course, if we have multiple candidate keys, there are multiple suitable columns, or attributes, that could have been used as the primary key, and if we retain those values we will have alternate keys as well.
An alternate key is a candidate key that was not chosen as the primary key: it is another column that is unique but is not the primary key, and when we look at the design we want to check that the normalization rules work with the alternate keys as well.
An example of that would be a car, which has a vehicle identification number and also a license plate number. Both are unique identifiers for the car, and both are values you want to store, but only one of them is the primary key.
When we apply the normalization rules, we should check that they still hold if we switch and use the alternate key instead, because normalization should work with that as well.
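The car example can be sketched with a UNIQUE constraint on the identifier that is not chosen as the primary key. Choosing the VIN as the primary key is an arbitrary assumption here, and the values are placeholders rather than real identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# vin is the primary key; license_plate is kept and must also be unique.
conn.execute(
    """
    CREATE TABLE car (
        vin           TEXT PRIMARY KEY,
        license_plate TEXT NOT NULL UNIQUE,
        colour        TEXT
    )
    """
)
conn.execute("INSERT INTO car VALUES ('VIN-A', 'PLT-1', 'red')")

try:
    # Different VIN, but a license plate that is already stored.
    conn.execute("INSERT INTO car VALUES ('VIN-B', 'PLT-1', 'blue')")
    duplicate_plate_accepted = True
except sqlite3.IntegrityError:
    duplicate_plate_accepted = False

# Either identifier finds the same single car.
by_vin = conn.execute("SELECT colour FROM car WHERE vin = 'VIN-A'").fetchone()
by_plate = conn.execute(
    "SELECT colour FROM car WHERE license_plate = 'PLT-1'"
).fetchone()

print(duplicate_plate_accepted, by_vin, by_plate)
```

Because both columns identify exactly one row, any rule checked against the primary key can be checked again with the other identifier swapped in.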
That covers some of the key terms around an individual table.
Now, when we are looking at normalization, the attributes that we have can have dependencies.
They can depend on each other, and when we go through and perform our normalization processes, what we are looking for a lot of the time is these dependencies.
A dependency is where the value of one attribute can be determined from another attribute.
For example, I can work out the employee’s name because the table has the employee ID.
In an Order table, the salesperson ID is in there.
If the table held the salesperson name as well, that would be a dependency: the salesperson name is dependent on the salesperson ID.
I give you the ID, you can tell me the name. That is a dependency, and it is one of the key concepts we are going to use when we do normalization.
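A dependency like this can be tested against sample rows: it holds if each value of the determining attribute always appears with the same value of the dependent attribute. The function name `holds` and the order rows are assumptions for illustration, with two different salespeople who happen to share a name.

```python
def holds(rows, determinant, dependent):
    """True if every determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        key, value = row[determinant], row[dependent]
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Made-up Order rows: salespeople 10 and 11 are both called Alice.
orders = [
    {"order_id": 1, "salesperson_id": 10, "salesperson_name": "Alice"},
    {"order_id": 2, "salesperson_id": 10, "salesperson_name": "Alice"},
    {"order_id": 3, "salesperson_id": 11, "salesperson_name": "Alice"},
]

id_determines_name = holds(orders, "salesperson_id", "salesperson_name")
name_determines_id = holds(orders, "salesperson_name", "salesperson_id")

print(id_determines_name, name_determines_id)
```

The direction matters: given the ID you can tell me the name, but given the name "Alice" you cannot tell me which ID is meant.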
Those are some of the key terms that we use when we look at normalizing a database.
