
Data and File Management

Information and Data


Data:
Raw facts that are to be processed. Before a computer can process information, the
information is coded into a form that the computer can accept. This coded information is
called DATA.

e.g.
- a letter stored on a disk,
- a telephone conversation converted into electrical signals for transmission,
- a date encoded as a six-figure number.
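
As a rough illustration of the last example, a date can be coded as a six-figure number and turned back again. The DDMMYY layout, the function names and the default century are assumptions made for this sketch; the notes do not say which layout is meant.

```python
from datetime import date


def encode_date(d: date) -> str:
    """Code a date as six figures in DDMMYY order (an assumed layout)."""
    return d.strftime("%d%m%y")


def decode_date(code: str, century: int = 2000) -> date:
    """Turn a six-figure DDMMYY code back into a date.

    The century is not stored in the code, so it has to be assumed.
    """
    day, month, year = int(code[0:2]), int(code[2:4]), int(code[4:6])
    return date(century + year, month, day)


encoded = encode_date(date(2003, 5, 12))
print(encoded)               # the data held by the computer
print(decode_date(encoded))  # the information a person reads
```

The six figures on their own are data; they only become information once a person knows the layout used to code them.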

Information:
Consists of facts and items of knowledge: anything that is meaningful to people.
Information can be expressed in words, numbers, pictures, sound or measurements.

e.g.
- a list of names and addresses,
- contents of a letter,
- what is said in a telephone conversation,
- song words or even a map.

Types of Data
Analogue Data

Analogue data is data represented by a quantity that varies continuously. The value of the data item
at a given time is represented by the size of the quantity, measured on a fixed scale.

Examples of analogue systems are:

- Some watches have an analogue display, where the hands move continuously
round a dial. The time is represented by the position of the hands on the dial.
- Conversations travel on old telephone circuits as an analogue signal. The size
of the signal depends on the loudness of the speech. The words spoken show up
as changes in the frequency of the signal.

Digital Data

Data is digital if some quantity in it can be set to a number of different separate values or
states. The combinations of these values represent data. Digital devices are usually binary
and the data is represented as a succession of 0s and 1s.

Examples of digital systems are

- An electronic calculator. The display is digital with the numbers in decimal. Each
digit can have any of the ten separate states (that is the numbers 0, 1, 2, 3, 4, 5, 6,
7, 8, 9). The circuits inside are digital with binary digits represented by 0 volts
and 5 volts.
- Some watches have a digital display – digits shown on a little display screen
represent the time.
- Telephone conversations can be digital.

Characteristics of analogue and digital devices

Analogue
- The quantity used to represent data gets bigger or smaller depending on the
size of the data itself.
- Any value can be represented because the quantity can take any value in
the range used.

Digital
- The quantity used by the device can take only a few different values,
usually only two.
- The data is held as a code.

Data Conversion

Data has to be converted into information by data converters.

Data
  -> Conversion (at the keyboard or by mouse etc.)
  -> Data being input
  -> Process (calculating, sorting, storing etc.)
  -> Data output
  -> Conversion (at the screen, printer, plotter)
  -> Information

Converting information to data (Encoding and Decoding)

Encode

To encode data means to convert data into a form ready for processing.

Information about an item is encoded into the bar codes, which are then printed on item
labels. This data can be input into a computer via a laser scanner on a POS terminal at the
till.

Decode

To decode means to convert data back to a form in which it can be understood.

- On a school data file, the names of the teachers are stored. For this, two (2) letters
of the surname are used. Thus, Mr. Gaongalelwe can be stored as GN, Miss
Mmepe as MP and Mr. Williams as WI. The computer has a reference file of
these codes. To print out the name the computer uses the reference file to decode
the two letters. It can then print out the full name.

- An electronic circuit can be made to decode binary numbers into decimal
numbers.
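
The teacher-code example can be sketched with a small lookup table standing in for the reference file. The dictionary and function names below are made up for illustration; only the three codes come from the notes.

```python
# Stand-in for the reference file: two-letter code -> full name.
REFERENCE_FILE = {
    "GN": "Mr. Gaongalelwe",
    "MP": "Miss Mmepe",
    "WI": "Mr. Williams",
}


def decode_teacher(code: str) -> str:
    """Decode a two-letter code back into the full name."""
    return REFERENCE_FILE[code]


def encode_teacher(name: str) -> str:
    """Encode a full name as its two-letter code by searching the reference file."""
    for code, full_name in REFERENCE_FILE.items():
        if full_name == name:
            return code
    raise KeyError(f"no code stored for {name!r}")


print(decode_teacher("MP"))
print(encode_teacher("Mr. Williams"))
```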

Computers work with data, which, to them, is no more than a string of symbols. These
symbols can be letters, numbers and other characters, and can also make up pictures and
graphs. Inside the computer, all this data is represented in a form the computer will
understand (in the computer’s own codes). These codes are based on two symbols only: the
digits 0 and 1. The digits 0 and 1 are also used in the base two (binary) number system.
They are called bits, short for binary digits.

Computers use a common code called the American Standard Code for Information
Interchange (ASCII). ASCII is a 7-bit character code. A convenient grouping
of bits inside a computer is in sets of 8 bits. A set of eight bits is called a byte. A byte can
store an ASCII character with one bit left over.
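
A short sketch of how a character and its binary code correspond, using Python's built-in `ord`, `chr`, `format` and `int`. The helper names are chosen for this example only, and each code is padded to eight bits so that it fills one byte.

```python
def char_to_bits(ch: str) -> str:
    """Return the code of one character as a string of eight bits."""
    return format(ord(ch), "08b")


def bits_to_char(bits: str) -> str:
    """Return the character whose code is the given string of bits."""
    return chr(int(bits, 2))


def encode_text(text: str) -> str:
    """Encode text as space-separated bytes written in binary."""
    return " ".join(char_to_bits(ch) for ch in text)


def decode_text(bits: str) -> str:
    """Decode space-separated binary bytes back into text."""
    return "".join(bits_to_char(group) for group in bits.split())


print(char_to_bits("A"))
print(encode_text("Hi"))
print(decode_text("01001000 01101001"))
```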

ASCII characters and their binary equivalent are shown below.

Binary code   ASCII character   Binary code   ASCII character   Binary code   ASCII character
00100000      SPACE             01000000      @                 01100000      `
00100001      !                 01000001      A                 01100001      a
00100010      "                 01000010      B                 01100010      b
00100011      #                 01000011      C                 01100011      c
00100100      $                 01000100      D                 01100100      d
00100101      %                 01000101      E                 01100101      e
00100110      &                 01000110      F                 01100110      f
00100111      '                 01000111      G                 01100111      g
00101000      (                 01001000      H                 01101000      h
00101001      )                 01001001      I                 01101001      i
00101010      *                 01001010      J                 01101010      j
00101011      +                 01001011      K                 01101011      k
00101100      ,                 01001100      L                 01101100      l
00101101      -                 01001101      M                 01101101      m
00101110      .                 01001110      N                 01101110      n
00101111      /                 01001111      O                 01101111      o
00110000      0                 01010000      P                 01110000      p
00110001      1                 01010001      Q                 01110001      q
00110010      2                 01010010      R                 01110010      r
00110011      3                 01010011      S                 01110011      s
00110100      4                 01010100      T                 01110100      t
00110101      5                 01010101      U                 01110101      u
00110110      6                 01010110      V                 01110110      v
00110111      7                 01010111      W                 01110111      w
00111000      8                 01011000      X                 01111000      x
00111001      9                 01011001      Y                 01111001      y
00111010      :                 01011010      Z                 01111010      z
00111011      ;                 01011011      [                 01111011      {
00111100      <                 01011100      \                 01111100      |
00111101      =                 01011101      ]                 01111101      }
00111110      >                 01011110      ^                 01111110      ~
00111111      ?                 01011111      _                 01111111      DEL


Exercise

1. Why is data represented in binary form?

2. One bit can have two possible values: 0 or 1. Two bits can have four values: 00, 01,
10, 11. How many possible values can the following have:

a) three bits,

b) four bits,

c) one byte?

3. Decode the following sentence, which is in ASCII code:

01000011 01101111 01101101 01110000 01110101 01110100 01100101 01110010
00100000 01010011 01110100 01110101 01100100 01101001 01100101 01110011
00100000 01101001 01110011 00100000 01101101 01111001 00100000 01100010
01100101 01110011 01110100 00100000 01110011 01110101 01100010 01101010
01100101 01100011 01110100 00101110

Answer:

Data Types

Data type is the term used to describe the kind of data used, e.g. whether it is a number or a letter.

Character: One of the symbols used to make up data, e.g. a letter (A…Z), a punctuation
mark or any digit of a number (0…9). Any symbol that can be typed at a keyboard is a character.

A string: A group of characters is called a string, e.g. “This is a string of letters”.

Alphanumeric data: made up of letters and numbers e.g. B 363 AHE.

Numeric data: Only numbers – both whole and fractional numbers.

Integers: complete numbers (whole numbers) either positive or negative.

Character set: this is a set of letters, digits and other symbols used for representing data.
These include numeric characters (digits), alphabetical characters (letters) and even
special characters (punctuation marks, mathematical symbols, etc).

Different data types are used in databases.
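
As a loose illustration of the data types above, the sketch below sorts a keyed-in value into one of four of them. The function name, the labels it returns and the exact rules (for example, that alphanumeric data may contain spaces) are assumptions made for this example.

```python
import re


def classify(value: str) -> str:
    """Guess which of the data types described above a keyed-in value belongs to."""
    if re.fullmatch(r"[+-]?\d+", value):
        return "integer"
    if re.fullmatch(r"[+-]?\d+\.\d+", value):
        return "numeric (fractional)"
    has_letter = any(ch.isalpha() for ch in value)
    has_digit = any(ch.isdigit() for ch in value)
    only_alnum = all(ch.isalnum() or ch.isspace() for ch in value)
    if has_letter and has_digit and only_alnum:
        return "alphanumeric"
    return "string"


for sample in ["42", "-7", "3.14", "B 363 AHE", "This is a string of letters"]:
    print(sample, "->", classify(sample))
```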

Data Capture/Collection

Data capture means collecting information for a computer.

Examples are:

- Asking people to fill in questionnaires,
- Making measurements and keying them in at a keyboard,
- Collecting documents which have been filled in, and preparing them for a
keyboard operator,
- Taking pictures using a digital camera.

Methods of Data Capture

1. Automated Data Capture

This is obtaining data directly by an input device.

For example:
- Using a document reader,
- Scanning pictures and text from documents,
- Using sensors for data logging,
- Scanning coded data such as bar codes and magnetic stripes.

Advantages

- No data has to be keyed.
- There are few errors or inaccuracies.

Disadvantages

- Many automated data entry/capture systems are expensive to set up. Therefore a
small shop may decide not to use such a system.

Data Capture Forms

A data capture form is a form designed to have data for computer input written or filled in on it.

e.g.
- A membership subscription form,
- A questionnaire,
- A turnaround document.

Advantages

- Data is standardised: all records are set out in the same way.
- People collecting the data know exactly what data is required.

Disadvantages
- It can be slow to enter data.
- Transcription (data entry) errors can occur.
- Handwriting recognition can be unreliable.

Turnaround Documents

A turnaround document is a form that is produced by a computer, has more data added to
it, and is then input to the computer again for processing.

E.g.
- At Omang offices, an ID renewal form can have the ID number already
printed on it.
- In a club membership system, the computer will print the person’s membership
number on the renewal form.

Advantages

- Data which is already known to the computer does not have to be written or
keyed again.
- The computer can recognise each individual document, using information it has
already printed on it.

Data Verification

It is the checking of data that has been copied from one place to another, to see
that it still represents the original data.

Example:

- In a computer bureau, data is being encoded onto a disc. A keyboard operator
reads the data from a source document and keys it in at a key station, the data being
recorded on disc.
- A second operator then verifies this data by re-keying it all. The computer
controlling the key station checks the data stored against the data now being
typed and reports any discrepancies/differences, so that any errors can be
corrected.
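
The double-keying check in this example can be sketched as a comparison of two lists of fields. The function name and the sample record are invented for illustration; a real key station would compare the data as it is typed.

```python
from itertools import zip_longest


def verify(first_keying, second_keying):
    """Compare two keyings of the same source document.

    Returns a list of (field position, first value, second value) for every
    field where the two operators disagree.
    """
    discrepancies = []
    pairs = zip_longest(first_keying, second_keying, fillvalue=None)
    for position, (first, second) in enumerate(pairs):
        if first != second:
            discrepancies.append((position, first, second))
    return discrepancies


first = ["B 123 JKL", "J. Selomo", "Daihatsu", "Red"]
second = ["B 123 JKL", "J. Selono", "Daihatsu", "Red"]
for position, a, b in verify(first, second):
    print(f"field {position}: {a!r} does not match {b!r}")
```

Note that verification only shows the two keyings differ; someone still has to look at the source document to decide which one is right.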

Data Validation
It is the checking of data at the time of input. The software carries out the checking. The
check is to ensure that the data is reasonable.

Note: Validation does not make sure that data is correct, only that it is reasonable. Verification
checks that data has been copied accurately. Validation catches many data entry errors, but not all.

Data Entry Checks

Range Check
The software can be set to check that data falls within certain limits.

Examples
- On a job application form, the date of birth can be validated to ensure that the age
of the applicant is greater than 17 and less than 61.
- The readings taken from someone’s water meter can be validated to make sure
that they are within reasonable limits. This could prevent customers getting
huge bills because of an operator’s error.
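
The job application example can be sketched as a range check on the applicant's age. The function names are chosen for this example, and the date the form is processed is passed in rather than read from the clock so that the check gives the same answer every time it is run.

```python
from datetime import date


def age_on(date_of_birth: date, today: date) -> int:
    """Age in completed years on the given day."""
    years = today.year - date_of_birth.year
    # Take one year off if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def age_in_range(date_of_birth: date, today: date, low: int = 17, high: int = 61) -> bool:
    """Range check: the age must be greater than low and less than high."""
    return low < age_on(date_of_birth, today) < high


processing_day = date(2024, 6, 14)
print(age_in_range(date(1990, 6, 15), processing_day))  # aged 33
print(age_in_range(date(2010, 1, 1), processing_day))   # aged 14
```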

Length Check
A field may have been set up to hold only a certain number of characters. The software
can prevent the operator from entering too many.

Presence Check
Some fields must not be left blank. For example, an application to sign up to an Internet
chat room may require a user name and an e-mail address.

Type Check
This makes sure that the data type is as expected. If someone accidentally enters a
digit in someone’s name, such as Nic9las Cage, the software can easily pick this up.
Also, if someone enters the letter ‘o’ where a digit is required, this can also be noticed by
a type check.
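
The length, presence and type checks described above are each only a line or two of code. This is a minimal sketch; the function names and the rule that a name must not contain digits are assumptions for illustration.

```python
def length_check(value: str, max_length: int) -> bool:
    """Length check: the value must fit in the space set up for the field."""
    return len(value) <= max_length


def presence_check(value: str) -> bool:
    """Presence check: the field must not be left blank."""
    return value.strip() != ""


def name_type_check(value: str) -> bool:
    """Type check for a name: no digits are allowed."""
    return not any(ch.isdigit() for ch in value)


def number_type_check(value: str) -> bool:
    """Type check for a whole number: every character must be a digit."""
    return value.isdigit()


print(name_type_check("Nic9las Cage"))  # a digit has slipped into the name
print(number_type_check("1o5"))         # the letter 'o' typed instead of zero
print(presence_check("   "))            # nothing but spaces counts as blank
```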

Check digit

A check digit is an extra digit added on to a reference number.

e.g.
- Bank account numbers, the ISBN of a book and scanned bar codes contain check
digits.
- An ATM card number may contain a check digit.
- Bar codes for items sold in a supermarket.
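
The notes do not say how a check digit is worked out, and different numbering schemes use different rules. As one well-known example, the sketch below uses the rule for 13-digit bar codes and ISBNs: the first twelve digits are weighted 1, 3, 1, 3 and so on, and the check digit brings the total up to a multiple of 10. The function names are chosen for this example.

```python
def ean13_check_digit(first_twelve: str) -> int:
    """Work out the check digit for the first twelve digits of a 13-digit code."""
    total = 0
    for position, digit in enumerate(first_twelve):
        weight = 3 if position % 2 else 1  # weights run 1, 3, 1, 3, ...
        total += int(digit) * weight
    return (10 - total % 10) % 10


def is_valid_ean13(code: str) -> bool:
    """Check a full 13-digit code against its own check digit."""
    if len(code) != 13 or not code.isdigit():
        return False
    return ean13_check_digit(code[:12]) == int(code[12])


print(ean13_check_digit("978030640615"))
print(is_valid_ean13("9780306406157"))
print(is_valid_ean13("9780306406158"))  # last digit mistyped
```

When the code is scanned or keyed, the computer repeats the calculation and rejects the number if the result does not match the check digit.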

Presenting Information
Presenting information can be done using a word processor or desktop publishing.

Word Processing

Word processing means producing text such as letters and reports using IT. A piece of
text produced by a word processor is called a document.

What is needed for word processing

A system for word processing consists of a PC with:

1. a word processing program, stored on ROM or hard disc,
2. a disk unit to store the documents. This may be a hard disc, or the computer may
be on a network,
3. a good quality screen with high resolution,
4. a good printer, usually a laser printer.

Entering, editing and improving data

To enter data into a word processor the user simply types it. The user can end a
paragraph by pressing the ‘ENTER’ key. At the end of a line the word processor goes
to the start of the next line automatically.

Word Wrapping is the process of moving the cursor on to the new line automatically
when the next word will not fit on the present one.
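
Word wrapping can be sketched as a simple rule: keep adding words to the current line, and start a new line when the next word will not fit. The function name and the line width used here are assumptions for illustration; a real word processor measures width on the page rather than counting characters.

```python
def word_wrap(text: str, width: int) -> list:
    """Break text into lines of at most `width` characters, splitting between words."""
    lines = []
    current = ""
    for word in text.split():
        if current and len(current) + 1 + len(word) > width:
            # The next word will not fit, so move on to a new line.
            lines.append(current)
            current = word
        else:
            current = word if not current else current + " " + word
    if current:
        lines.append(current)
    return lines


for line in word_wrap("the quick brown fox jumps", 10):
    print(line)
```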

Editing text

When one wants to change the text, he or she can do it in two ways:

1. Overtype: each character you type replaces the character at the cursor.
Overtype mode is usually switched on or off by pressing the ‘Insert’ key.
2. Insert: the letters you type are inserted and all the rest of the text moves to
make room for them.

Appearance and Style

A word processor allows you to change how the text is displayed and printed. Common
features are:
1. underlining text,
2. making text bold,
3. centering,
4. italics,
5. double line spacing.

Fonts

Many word processors offer a choice of fonts, i.e. character designs.

A font is a set of printed characters of a particular size, style and design.

Examples

1. This line is Times New Roman 12 pt,


2. This line is Courier New 8 pt,

3. AND THIS LINE IS IN VERDANA 12 PT SMALL CAPS.

Spelling

A Spell Checker is a program that checks the spelling of words against those in a
dictionary.

Notes:
1. The dictionary is a file of words stored with the spell checker to which you can
add words that are not in the main dictionary.

2. Usually if the word is not found in the dictionary, you are given a choice. You can
ignore or skip the word.

Tabs

The TAB key is usually at the left-hand side of the keyboard and is marked with the word ‘TAB’ or with
two arrows pointing in opposite directions. When the TAB key is pressed the cursor
jumps across the page several positions at a time, usually about 5 character spaces.

Margins and indents

A word processor can change the way the text fits on to a page.

The margins are the limits which have been set for text near the edges of the page. They
can be changed so that the text is nearer the edge or further away from it.

Moving the left or right margin makes the text wider or narrower. Moving the top or
bottom margin makes the text longer or shorter.

The indent is the distance text is moved in from the margin. You can indent part of the
text without moving the actual margin.

To justify the text means to keep the letters in a straight line at the edge of the page.

[Figure: screen showing indented, justified text within the page margins, with the decrease indent and increase indent buttons labelled.]

Other ways of Presenting Information

Search and Replace

A word processor allows you to:

1. search for a word or words (sometimes called ‘find’). You simply key in the
word and the cursor moves to the first place it occurs in the document,

2. search and replace. The computer searches for a word and replaces the word
with another one, wherever it finds it. Usually you have the choice of deciding
whether to replace it or not when it is found.
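
Both operations can be sketched with Python's built-in `str.find`. The function names are chosen for this example, and the `confirm` argument is a stand-in for the word processor asking the user whether to replace each occurrence.

```python
def search(document: str, word: str) -> int:
    """Return the position of the first occurrence of word, or -1 if it is absent."""
    return document.find(word)


def search_and_replace(document, old, new, confirm=lambda position: True):
    """Replace occurrences of old with new, asking confirm() about each one."""
    if not old:
        return document
    pieces = []
    position = 0
    while True:
        found = document.find(old, position)
        if found == -1:
            pieces.append(document[position:])
            break
        pieces.append(document[position:found])
        pieces.append(new if confirm(found) else old)
        position = found + len(old)
    return "".join(pieces)


text = "the cat sat on the mat"
print(search(text, "cat"))
print(search_and_replace(text, "the", "a"))
# Skip the occurrence at the very start of the document.
print(search_and_replace(text, "the", "a", confirm=lambda position: position > 0))
```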

Mail Merge
Many word processors can produce a set of letters by adding to them the name and
address of each person on a mailing list.

A standard letter is a letter which an organisation stores on file because it is used
frequently. A personalized letter is a letter which is made to look like a personal letter by
adding the recipient’s name, address and possibly other details.

A mail merge is the operation of producing a set of personalized letters by merging the
personal details with the standard letter.

Example

Standard letter

Dear <<title>> <<surname>>

You have been selected to take part in an art competition to win a <<prize>>. Just buy
clothing worth P 500 or more and you will receive an automatic entry to the
competition. Drawing date is the <<date>>

Personal details (Data source)

Title   Surname       Prize      Date
Mr      Gaongalelwe   Car        12/05/03
Mrs     Dikala        House      19/10/03
Mr      Nko           P100000    31/12/03

Merged letters

Dear Mr Gaongalelwe

You have been selected to take part in an art competition to win a Car. Just buy
clothing worth P 500 or more and you will receive an automatic entry to the
competition. Drawing date is the 12/05/03

Dear Mrs Dikala

You have been selected to take part in an art competition to win a House. Just buy
clothing worth P 500 or more and you will receive an automatic entry to the
competition. Drawing date is the 19/10/03

Dear Mr Nko

You have been selected to take part in an art competition to win a P 100000. Just buy
clothing worth P 500 or more and you will receive an automatic entry to the
competition. Drawing date is the 31/12/03
Notes:

Other names for this technique include mail shot and mass mailing.
The following are needed to carry out the mail merge:

- a file of names and addresses;
- the standard letter with gaps;
- the instructions on how to merge them. These may be codes within the standard
letter.
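
The three ingredients listed above can be sketched in a few lines: a data source, a standard letter with gaps, and a merge instruction. The `<<field>>` markers follow the example in these notes; the variable and function names are invented for illustration.

```python
STANDARD_LETTER = (
    "Dear <<title>> <<surname>>\n"
    "You have been selected to take part in an art competition to win a <<prize>>. "
    "Drawing date is the <<date>>"
)

# The data source: one record of personal details per recipient.
DATA_SOURCE = [
    {"title": "Mr", "surname": "Gaongalelwe", "prize": "Car", "date": "12/05/03"},
    {"title": "Mrs", "surname": "Dikala", "prize": "House", "date": "19/10/03"},
    {"title": "Mr", "surname": "Nko", "prize": "P100000", "date": "31/12/03"},
]


def mail_merge(standard_letter, records):
    """Produce one personalized letter per record by filling the <<field>> gaps."""
    letters = []
    for record in records:
        letter = standard_letter
        for field, value in record.items():
            letter = letter.replace(f"<<{field}>>", value)
        letters.append(letter)
    return letters


for letter in mail_merge(STANDARD_LETTER, DATA_SOURCE):
    print(letter)
    print()
```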

Advantages:

- Mail merge allows an organization to produce a large quantity of letters quickly
and cheaply.

Disadvantages:

- A mail merge sometimes makes it too easy to produce letters which people do not
want. They may be regarded as ‘junk mail’.

Desktop Publishing (DTP)

Desktop publishing is the use of a computer system to produce page layouts of high
quality for printing or publication.

Characteristics

- Ability to divide the page into columns.
- A wide range of fonts, sizes and styles of text.
- Guides to position text and graphics.
- Flowing of text around an object and from page to page.

- Options within the package to produce text and graphics.
- Facilities for moving pictures and pieces of text on the page and adjusting their
size to fit spaces.

Application of DTP

- Produce leaflets and posters.
- Produce newsletters and magazines.

File Organisation

File

The term file is used to describe any data or program stored on a backing store such as a
tape or a disc.

Examples of files:

When they have been saved, any of the following can be regarded as a file:

- A computer program.
- A piece of text stored by a word processor.
- A spreadsheet.
- A computer drawing.

A data file is an organised collection of data. It usually consists of a number of separate
parts called records.

A record is a subdivision of a data file. It consists of a set of items of data, which together
can be treated as a unit. These items all relate to one person or object. Each record in the
file is similar to the others in the way it is set out.

Examples of data files


School records

- A student file. Each record in this file holds the data on one student.
- A teacher file. Each record in this file contains all the data about one teacher.

A field is an area of a record reserved for one particular type of data item. Each field
contains one data item. An item of data here means the smallest piece of data that would
be dealt with separately – a single name or a single number etc.

Regist_no    Owner           Make       Model       Colour
B 666 HHH    M. Malaakatse   Toyota     Fong kong   White
B 123 JKL    J. Selomo       Daihatsu   Cuore       Red
B 789 WQR    S. Disiile      Kia        Sportage    Navy blue
B 001 AAB    S. Leso         BMW        318i        Silver

Each row is a record, each column is a field, and each single entry is an item.
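
One way to picture files, records, fields and items in code is a list of dictionaries: the list is the file, each dictionary is a record, each key is a field name and each value is an item. This is only a sketch; the variable names are invented, and the first three rows of the car table are used as sample data.

```python
FIELDS = ("Regist_no", "Owner", "Make", "Model", "Colour")

# The file: a collection of records, all set out in the same way.
car_file = [
    dict(zip(FIELDS, ("B 666 HHH", "M. Malaakatse", "Toyota", "Fong kong", "White"))),
    dict(zip(FIELDS, ("B 123 JKL", "J. Selomo", "Daihatsu", "Cuore", "Red"))),
    dict(zip(FIELDS, ("B 789 WQR", "S. Disiile", "Kia", "Sportage", "Navy blue"))),
]

record = car_file[1]   # one record: all the data about one car
item = record["Make"]  # one item: the contents of a single field

print(record)
print(item)
```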

Storage of Files

To create a file means to organize data into a file, e.g. when the fields are set up in a
database and the records are keyed into it.

To save a file means to copy all the records of the file from the main store to the backing
store.
To load a file means to read all the records of the file from the backing store into the
main store.
To open a file means to prepare it so that data can be read from it or written to it.
To close a file means to carry out the procedure that is necessary when the user has
finished using the file.
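
The operations above map onto a few lines of Python. This is a sketch under stated assumptions: the records are saved as a CSV file in a temporary folder, and the file name and sample records are invented for the example.

```python
import csv
import os
import tempfile

# Create: organise the data into records held in the main store (memory).
records = [
    ["B 666 HHH", "M. Malaakatse", "Toyota"],
    ["B 123 JKL", "J. Selomo", "Daihatsu"],
]

path = os.path.join(tempfile.mkdtemp(), "cars.csv")

# Open the file for writing, then save: copy the records to the backing store.
with open(path, "w", newline="") as handle:
    csv.writer(handle).writerows(records)
# Leaving the 'with' block closes the file.

# Open the file for reading, then load: read the records back into the main store.
with open(path, newline="") as handle:
    loaded = list(csv.reader(handle))

print(loaded)
```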

Directories

A directory is a small file on a disc, which is used by the operating system to locate the
other files on the disc.
The directory contains a list of names of files and the information needed to access the
files on the disc. The information given in a directory can include:

- The size of the file in bytes,
- The time and date the file was written or used.
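
The information listed above can be read back from the operating system. The sketch below uses Python's `os.scandir` on a temporary folder containing one small file; the folder, the file name and the function name are invented for the example.

```python
import os
import tempfile
from datetime import datetime


def list_directory(path):
    """Return (name, size in bytes, time last written) for each file in a directory."""
    entries = []
    with os.scandir(path) as directory:
        for entry in directory:
            if entry.is_file():
                info = entry.stat()
                written = datetime.fromtimestamp(info.st_mtime)
                entries.append((entry.name, info.st_size, written))
    return sorted(entries)


# Build a small directory to look at.
folder = tempfile.mkdtemp()
with open(os.path.join(folder, "letter.txt"), "wb") as handle:
    handle.write(b"hello")

listing = list_directory(folder)
for name, size, written in listing:
    print(name, size, "bytes, written", written)
```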

A directory can also mean the area of a disc where files are stored. The main directory on a
disc is called the root directory. A sub-directory is a directory held inside another directory, such as the root directory.

[Figure: screen shot showing part of the directory structure on a PC, with the root directory and sub-directories labelled.]

Under folders, there is a directory and sub-directories. In the ‘my document’
sub-directory, there are files created by database, word processing and spreadsheet application
packages.

The operating system provides a means of organizing files into directories or folders so
that they can be easily managed. Most operating systems use a hierarchical or tree
directory structure, where folders are stored in other folders. This is shown in the diagram
below.

Hard disk
    Windows
    Applications
        Word processor
        Database
        Spreadsheet
    Data files
        Personal
        Work

Types of Files
There are four different types of files. These are:

- Serial files
Serial files can be used in batch processing systems to hold transaction data before
it is sorted.

- Direct (random) files
Direct files are useful in transaction processing systems where a number of
terminals have access to the same file.

- Sequential files
An example is a club membership file where the order depends on the
membership number.

- Indexed sequential files

The method of organisation of a file refers to:

- The way in which the records are arranged within the file;
- The method of working out where each record is stored in the file.
The method of access to a file refers to the way in which a program reads data from a file
or writes data to it.

Methods of file access

A serial access file has data stored on it in the order in which it was written. Each new
record goes to the end of the file. To read a record it is necessary to read through all the
preceding records first.

A sequential access file has data stored on it in the order of the data in a primary key.
A file of stolen cars can be a sequential access file.

A direct access file is one where any record can be accessed without having to access
other records first. This is also known as random access.
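
The difference between serial and direct access can be sketched with fixed-length records held in an in-memory file. The record size, the sample surnames and the function names are assumptions for this example; real systems work out record positions in various ways.

```python
import io

RECORD_SIZE = 8  # every record takes up the same number of bytes

surnames = ["Dikala", "Nko", "Selomo"]
data = b"".join(name.encode("ascii").ljust(RECORD_SIZE) for name in surnames)
data_file = io.BytesIO(data)


def serial_read(handle, record_number):
    """Serial access: read through all the preceding records to reach the one wanted."""
    handle.seek(0)
    record = b""
    for _ in range(record_number + 1):
        record = handle.read(RECORD_SIZE)
    return record.decode("ascii").rstrip()


def direct_read(handle, record_number):
    """Direct access: work out where the record is stored and jump straight to it."""
    handle.seek(record_number * RECORD_SIZE)
    return handle.read(RECORD_SIZE).decode("ascii").rstrip()


print(serial_read(data_file, 2))
print(direct_read(data_file, 1))
```

Both functions return the same record; the difference is how much of the file has to be read to get there.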

Notes:

- Files stored on a tape cartridge are always serial or sequential access. Direct
access would involve too much movement of the tape forward and backward.
Direct access files can only be stored on a direct access medium (such as a
magnetic disc).

Advantages of direct access over sequential:

- Selected records can be accessed far more quickly with direct access.
- Records can be accessed in any chosen order.
- Records do not have to be put into any particular order before the file is created.

Advantages of sequential over direct:

- Sequential files can be stored on magnetic tape as well as on discs.
- It is usually easier to write programs to handle sequential files.

Reasons for choosing different methods of access

The choice of access depends on:

- The number of records to be accessed.
If not many records are to be accessed, direct access should be used.

- The size of the file.
For large files sequential searches take a long time and direct access is better.
For a small file the time delay is not important and sequential access is
acceptable.

- The type of storage medium being used.
On magnetic tape, files have to be serial or sequential; direct access to tape files
is not practical.

- Whether or not the application is interactive.
Sequential access is often suitable for batch processing.
On-line applications such as information retrieval usually need direct access.

Updating files

To update a file means to alter it with new information.

Updating can involve:

- Insertion: adding a new record to a file;
- Deletion: removing a record from the file;
- Amendment: changing the items within the existing records.
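
The three kinds of update can be sketched on a small file held as a dictionary keyed by membership number. The membership numbers, field names and function names are invented for this example.

```python
members = {
    101: {"surname": "Gaongalelwe", "prize": "Car"},
    102: {"surname": "Dikala", "prize": "House"},
}


def insert_record(file, key, record):
    """Insertion: add a new record to the file."""
    if key in file:
        raise KeyError(f"record {key} already exists")
    file[key] = record


def delete_record(file, key):
    """Deletion: remove a record from the file."""
    del file[key]


def amend_record(file, key, **changes):
    """Amendment: change items within an existing record."""
    file[key].update(changes)


insert_record(members, 103, {"surname": "Nko", "prize": "P100000"})
delete_record(members, 102)
amend_record(members, 101, prize="House")
print(members)
```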
