A-Level Information Technology 9626

1.4 Key concept: Impact of Information Technology


Coding, encoding and encrypting data

 describe the coding of data (including: M for male, F for female) and more intricate
codes (including: clothing type, sizes and colour of garment)

 discuss the advantages and disadvantages of the coding of data

1. Introduction to coding of data


When you are designing a database system to hold data, one of the first decisions that
you will need to make is about how the data will be collected and stored.

GIGO

GIGO stands for "Garbage In, Garbage Out." It is a computer science acronym that
implies bad input will result in bad output.

Because computers operate using strict logic, invalid input may produce unrecognizable
output, or "garbage."

For example, if a program asks for an integer and you enter a string, you may get an
unexpected result. Similarly, if you try to open a binary file in a text editor, it may display
unreadable content.
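
As a minimal sketch of the integer example, assuming a hypothetical helper called parse_age (it is not taken from any particular system), Python's built-in int() rejects text that is not a whole number, so a program can catch the bad input instead of passing it on:

```python
def parse_age(text):
    """Return the age as an integer, or None if the input is not a whole number."""
    try:
        return int(text.strip())
    except ValueError:
        # Bad input is rejected here rather than passed on as "garbage".
        return None


print(parse_age("42"))     # a valid integer
print(parse_age("forty"))  # not an integer, so None is returned
```

Rejecting bad input at the point of entry is one way of limiting "garbage in".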

One of the things that you can consider is to code some or all of your data in order to
improve the efficiency of your system.

Over the next few pages we will examine why it can be useful to code data. We will also
look at some of the drawbacks.

What is coding of data?


Any system will need to have data collected, entered and stored.

One method of storing data is to assign codes to it. This usually means shortening the
original data in an agreed manner. The agreement is between the users of the system.
This coding scheme could be part of the training of how to use the system, and it could
also be documented within the system for new users.

www.mrsaem.com

If the coding is completely obvious then there is no need for formal documentation.
For example, if a field called 'Gender' has only two values, 'M' and 'F', it should be
obvious from the field name that these refer to Male and Female.

Example 1

Original data: Monday; Tuesday; Wednesday; Thursday; Friday


Coded data: Mon; Tues; Wed; Thurs; Fri

Example 2

Original data: Extra Large; Large; Medium; Small


Coded data: XL; L; M; S

Example 3

The above codes are fairly easy for anyone to recognise and understand. Some codes
however are more complicated. What do you think the following codes might represent?

RG935LR

CV183TP

The above examples could be postcodes. They represent a street name, a particular
part of the street and the town where the street is located.

Example 4

How about:

SH12BN

TR14GN

Let's propose that these are codes for clothes in an online shop.
These might be a little bit more difficult because the code is made up from different
representative parts. Let's have a closer look.
The first part represents a piece of clothing, so 'SH' represents 'Shirt' and 'TR'
represents 'Trousers'.
The middle part of the code is the dress size.
The final part of the code represents a colour, so 'BN' represents 'Brown' and 'GN'
represents 'Green'.
You should be able to see from that information that the first code is a size 12 brown
shirt. What piece of clothing would the second code represent?
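
The three-part structure described above can be sketched in code. This is only an illustration: the lookup tables, the pattern and the function name decode_clothing are assumptions, and they contain only the codes mentioned in the example.

```python
import re

# Hypothetical lookup tables, using only the codes given in the example.
GARMENTS = {"SH": "Shirt", "TR": "Trousers"}
COLOURS = {"BN": "Brown", "GN": "Green"}

# Two letters (garment), one or more digits (size), two letters (colour).
CODE_PATTERN = re.compile(r"([A-Z]{2})(\d+)([A-Z]{2})")


def decode_clothing(code):
    """Split a code such as 'SH12BN' into garment, size and colour."""
    match = CODE_PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"Not a valid clothing code: {code!r}")
    garment, size, colour = match.groups()
    if garment not in GARMENTS or colour not in COLOURS:
        raise ValueError(f"Unknown garment or colour in code: {code!r}")
    return GARMENTS[garment], int(size), COLOURS[colour]


print(decode_clothing("SH12BN"))  # ('Shirt', 12, 'Brown')
```

A code that does not fit the agreed pattern is rejected, which is one reason a fixed coding scheme makes checking easier.
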
It is common for much of the data collected and entered into a system to have some
degree of repetition and redundancy i.e. extra information that does not add anything.

Advantages
This pattern of repetition is why it is efficient to code the data in some way.

Speeding up data entry

Let's take the example of collecting data about a person's gender. People can be either
'Male' or 'Female'.

Whilst these two options are easily understood by all, imagine having to enter the words
'Male' or 'Female' into a system many hundreds of times. It is a waste of time and
effort because no extra information is contained in the full words compared to a single
letter.

Increase accuracy of data entry

The other issue is that no matter how accurate a person is at data entry, at some stage
they are likely to make a mistake and might spell 'Male' as 'Mail' or 'Female' as 'Femal'.
This type of mistake will make any results from your database queries unreliable.
Instead of entering 'Male' or 'Female' you could code the data and instead enter it as 'M'
or 'F'.
Simply having to enter one letter instead of a possible six will speed up data entry. It will
also cut down on the risk of mistakes being made with spelling.

Use of validation

In our example, the words 'Male' or 'Female' have been coded so that they become 'M'
or 'F'.
When data has been coded it makes it easier to use validation to check if the data
entered is sensible. With the example above, the person entering the data could still
make a mistake and enter 'S' instead of 'M' or 'F'.
But if you set up validation so that the field will only accept the letters 'M' or 'F' and
absolutely nothing else then that should further cut down on possible mistakes.
Note that validation can only check if the data is sensible and within reasonable limits; it
cannot check whether the data is accurate. Somebody could still enter 'F' instead of 'M'.
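
A validation rule of this kind might look like the following sketch. The names ALLOWED_GENDER_CODES and is_valid_gender_code are made up for illustration; a real database package would normally let you set the rule on the field itself.

```python
# The only values the field will accept.
ALLOWED_GENDER_CODES = {"M", "F"}


def is_valid_gender_code(value):
    """Accept only the exact codes 'M' or 'F' and nothing else."""
    return value in ALLOWED_GENDER_CODES


for entry in ["M", "F", "S", "Male"]:
    print(entry, is_valid_gender_code(entry))
```

The check rejects 'S' and 'Male', but it would still accept 'F' typed in for a male customer, which is the limit of validation described above.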

Less storage space required

Every letter that you store in your database system will take at least one byte of storage.
If you store 'Female' as 'F' then you will save five bytes of storage space. If the system
belongs to a large organisation, there might be many thousands or millions of records
stored - simply by coding one field, a huge amount of hard disk storage can be saved.
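
The saving can be worked out with a short calculation. This sketch assumes one byte per character (true for plain ASCII text) and a made-up table of one million records; real database systems add their own overheads, so the figure is only a rough guide.

```python
def bytes_saved(full_value, code, record_count):
    """Bytes saved by storing `code` instead of `full_value` in every record,
    assuming one byte per character (plain ASCII text)."""
    per_record = len(full_value.encode("ascii")) - len(code.encode("ascii"))
    return per_record * record_count


# Hypothetical table of one million records, all holding 'Female'.
print(bytes_saved("Female", "F", 1_000_000))  # 5000000 bytes, about 5 MB
```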

Faster searching for data

It stands to reason that the smaller the size of your database, the faster it will be to
search and produce results.

Thus, by coding data and keeping the size of the system to a minimum, you can save
time in the long run when running queries.

Summary: Reasons for coding


 Speeding up data entry
 Increase accuracy of data entry
 Use of validation
 Less storage space required
 Faster searching for data

Further examples of data coding


In our everyday lives we come across many examples of how coding is used to
represent data. Here are just a few more ideas:

Country names

The name of a country can be shortened to a two-letter code.

Examples:
Great Britain - GB
France - FR
Canada - CA
What do you think these two country codes represent?
CN and US

Airline flight codes

When you fly you may have noticed that your flight is given a code.
This code consists of two letters to identify the airline that you are flying with. The letters
are usually followed by numbers to represent a particular route.

Examples:
A British Airways flight from Heathrow to Oslo might be coded as BA766.
A flight operated by the airline Emirates which departs from Dubai and arrives
at Heathrow might be coded as EK029.

Problems caused by coding data


Whilst coding data can bring many benefits it can also lead to some problems.

Coarsening of data

This means that during the coding process some of the subtle details in the data are
lost.
Look at the image below:

The colours of the houses could be classed as:


Light pink, pale blue, black and mid blue
However, when these colours are coded they may become:
PK (pink), B (blue), BK (black), BE (blue)
In this case, no allowance has been made for shades of colour so the results from the
above coding would end up as this:

The fine detail has been lost. This is what is meant by 'coarsening of data'.
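
The loss of detail can be shown by coding each colour and then reading the code back. The two tables below use the codes from the example; matching pale blue to B and mid blue to BE is an assumption based on the order in which they are listed.

```python
# Coding scheme from the example: colour description -> code.
ENCODE = {"light pink": "PK", "pale blue": "B", "black": "BK", "mid blue": "BE"}
# What each code means when it is read back.
DECODE = {"PK": "pink", "B": "blue", "BK": "black", "BE": "blue"}


def round_trip(colour):
    """Code a colour, then decode it again."""
    return DECODE[ENCODE[colour]]


for colour in ENCODE:
    print(colour, "->", round_trip(colour))
```

Both shades of blue come back simply as 'blue', and 'light pink' comes back as 'pink', so the original shade cannot be recovered from the coded data.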

Coding can obscure the meaning of the data

A reader seeing the 'gender' data as M/F is very likely to know that it means
Male/Female.
But some codes are more obscure. For example, the country code for Switzerland is
CHE. Many people might not recognise what this code represents.
If you were given the code 244/5838, would you know what it represented? Have a
search on the Argos site to see if you can find this product.
In order for the code to be useful, you need to be given a complete list of possibilities.

Coding of Value Judgments

When you are collecting data about people's opinions it might be difficult to code their
answers with accuracy.
For example, you might ask the question, "Was that curry too spicy?". Your plan is to
give their answers a code from 1 to 4, with 1 being mild and 4 being 'blow your head
off'. However, what is spicy to one person will be mild to another. The code they give
will depend on their individual opinion.
Coding of value judgments will inevitably lead to coarsening of the data since there will
be a wide range of opinions that could be held and only a limited number of codes
available.

Pros and cons of coding

Encoding
What does Encoding mean?
Encoding is the process of converting data into a format required for a number of
information processing needs, including:
 Program compiling and execution
 Data transmission, storage and compression/decompression
 Application data processing, such as file conversion
Encoding can have two meanings:
 In computer technology, encoding is the process of applying a specific code,
such as letters, symbols and numbers, to data for conversion into an equivalent
cipher.
 In electronics, encoding refers to analog to digital conversion.

Encoding is also used to reduce the size of audio and video files. Each audio and video
file format has a corresponding coder-decoder (codec) program that is used to code it
into the appropriate format and then decode it for playback.

Encoding should not be confused with encryption, which hides content. Both techniques
are used extensively in the networking, software programming, wireless communication
and storage fields.

One type of code used for converting characters is known as American Standard
Code for Information Interchange (ASCII), a widely used encoding
scheme for files that contain text. ASCII contains printable and nonprintable
characters that represent uppercase and lowercase letters, symbols, punctuation
marks and numbers. A unique number is assigned to each character.
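
As a small illustration of ASCII, the sketch below uses Python's built-in ord() and chr() functions to show the number assigned to each character. The sample word "Cat" is just an example.

```python
text = "Cat"

# ord() gives the number assigned to a character; chr() reverses it.
codes = [ord(ch) for ch in text]
print(codes)                           # [67, 97, 116]
print("".join(chr(n) for n in codes))  # Cat

# str.encode("ascii") produces the bytes that would be stored in a file.
print(text.encode("ascii"))            # b'Cat'
```

Note that uppercase 'C' and lowercase 'c' have different numbers, so they are treated as different characters.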

What is Data Encryption?


Data encryption is the act of changing electronic information into an unreadable state by
using algorithms or ciphers. Originally, data encryption was used for passing
government and military information electronically. Over time as the public has begun to
enter and transmit personal, sensitive information over the internet, data encryption has
become more widespread. Nowadays web browsers will automatically encrypt text
when connecting to a secure server. You can tell you are on a secure, encrypted
website when the URL begins with "https", meaning Hypertext Transfer Protocol
Secure.

What is considered sensitive data?


Whenever you enter sensitive data on the internet you want to make sure you are on a
secure website that encrypts the information. Some sensitive data you should make
sure you are entering on a secure, encrypted website are:
 Full name
 Social Security Number
 Credit and Debit card numbers
 Billing and shipping addresses
 Bank account and bank account log in information
 Financial/salary information
 Driver's license number
 Date of birth
 Health and patient information
 Student information

Ensuring that you use data encryption software or use a secure website
beginning with "https" when you enter personal, sensitive data will help prevent
identity theft.

How we use encryption today


Devices like modems, set-top boxes, smartcards and SIM cards all use encryption or
rely on protocols like SSH, S/MIME, and SSL/TLS to encrypt sensitive data. Encryption
is used to protect data in transit sent from all sorts of devices across all sorts of
networks, not just the Internet; every time someone uses an ATM or buys something
online with a smartphone, makes a mobile phone call or presses a key fob to unlock a
car, encryption is used to protect the information being relayed. Digital rights
management systems, which prevent unauthorized use or reproduction of copyrighted
material, are yet another example of encryption protecting data.

Symmetric Key Encryption


Symmetric key, or shared secret key, encryption is the process of encrypting and
decrypting data with the same key. Assuming the symmetric key stays safe, this type of
encryption is very secure. The encrypted data can be transmitted or stored without fear
of unapproved disclosure. Once the data is encrypted, anyone viewing it would see only
unintelligible data and would have no way of determining its true contents.

This method of encryption is very secure, but it has one major drawback. Each step of
the encryption and decryption process requires the same symmetric key. It is essential
to ensure that all approved users have access to the correct key prior to viewing the
data. There are many ways to pass this key, such as printing a hard copy of the key or
using a thumb drive to pass a soft copy. A serious consideration for symmetric keys is
that the larger the number of people that have access to the key, the more likely it is
that there will be a breach. Due to this limitation, “symmetric key encryption is
particularly useful when encrypting your own information as opposed to when sharing
encrypted information”.
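
The idea that the same key both encrypts and decrypts can be shown with a toy example. This sketch uses a simple XOR operation with a random key as long as the message. It is an illustration only and not how real systems should be built; in practice an established algorithm such as AES from a trusted library would be used. The function name xor_bytes and the sample message are made up.

```python
import secrets


def xor_bytes(data, key):
    """XOR each byte of data with the matching byte of key.
    Applying the same key a second time restores the original data."""
    if len(key) != len(data):
        raise ValueError("This sketch needs a key as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"Meet at noon"
key = secrets.token_bytes(len(message))  # the shared secret key

ciphertext = xor_bytes(message, key)     # encrypt
recovered = xor_bytes(ciphertext, key)   # decrypt with the SAME key

print(recovered)  # b'Meet at noon'
```

Anyone who obtains the key can run the same decrypt step, which is why passing the key safely to each approved user matters so much.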

Asymmetric Key Encryption

Asymmetric key, or public key, encryption is a system that uses a pair of keys, one public
and one private, in the encryption and decryption process. Data encrypted with one key of
the pair can only be decrypted with the other key. For example, Bill can send Sally a
message encrypted with his private key. Then Sally, or anyone with Bill’s public key,
can decrypt the message, which shows that it came from Bill but does not keep it secret.
When Sally wants to respond to Bill, she can use his public
key to encrypt the message and Bill can use his private key to decrypt that message.
Only Bill’s private key can be used to decrypt messages encrypted with his public key.

A limitation of asymmetric encryption is speed. The computational resources required
for asymmetric encryption are much higher than those of symmetric encryption.
Additionally, a certificate authority is often required for distributing the public key.
Certificate authorities are third party entities that are used to verify the identities of
parties on the Internet.
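
The public and private key idea can be sketched with a toy version of RSA, one well-known asymmetric algorithm. The numbers below are the small textbook values often used to explain it; real keys use primes that are hundreds of digits long, and the function name apply_key is made up for this illustration.

```python
# Toy key pair built from two small primes (illustration only, not secure).
p, q = 61, 53
n = p * q                # 3233, part of both keys
phi = (p - 1) * (q - 1)  # 3120
e = 17                   # public exponent
d = 2753                 # private exponent, chosen so that (e * d) % phi == 1
assert (e * d) % phi == 1

public_key = (e, n)
private_key = (d, n)


def apply_key(number, key):
    """Raise number to the key's exponent, modulo n."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


message = 65                                    # a message encoded as a number below n
ciphertext = apply_key(message, public_key)     # anyone can encrypt with the public key
recovered = apply_key(ciphertext, private_key)  # only the private key decrypts it
print(ciphertext, recovered)  # 2790 65
```

Each step involves raising numbers to large powers, which hints at why asymmetric encryption needs more computation than symmetric encryption.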
