
Data, information, knowledge and processing
Data, Information & Knowledge
KEYWORDS
o Data – Raw facts without meaning
o Information – Data items with context and meaning
o Knowledge – Understanding of what information is about: understanding information in such a way that it can be applied for a specific purpose.

Data has no meaning; it becomes information through context and meaning.
Data

• Data are raw facts and figures that on their own have no meaning

• These can be any alphanumeric characters, i.e. text, numbers, symbols
Data Examples
• Yes, Yes, No, Yes, No, Yes, No,
Yes
• 42, 63, 96, 74, 56, 86
• 111192, 111234

• None of the above data sets have any meaning until they are given a CONTEXT and PROCESSED into a useable form
Data Into Information
• To achieve its aims the organisation
will need to process data into
information.
• Data needs to be turned into
meaningful information and
presented in its most useful format
• Data must be processed in a context
in order to give it meaning
Information

• Data that has been processed within a context to give it meaning

OR

• Data that has been processed into a form that gives it meaning
Examples

• In the next 3 examples explain how the data could be processed to give it meaning

• What information can then be derived from the data?
Example 1

Raw Data: Yes, Yes, No, Yes, No, Yes, No, Yes, No, Yes, Yes
Context: Responses to the market research question “Would you buy brand x at price y?”
Processing: ???
Information: ???
Example 2

Raw Data: 42, 63, 96, 74, 56, 86
Context: Jayne’s scores in the six AS/A2 ICT modules
Processing: ???
Information: ???
Example 3

Raw Data: 111192, 111234
Context: The previous and current readings of a customer’s gas meter
Processing: ???
Information: ???
Suggested answers to examples
• Example 1
• We could add up the yes and no responses and calculate the percentage of customers who
would buy product X at price Y. The information could be presented as a chart to make it
easier to understand.
• Example 2
• Adding Jayne’s scores would give us a mark out of 600 that could then be converted to an A
level grade. Alternatively we could convert the individual module results into grades.
• Example 3
• By subtracting the previous reading from the current reading we can work out how many units of gas the
consumer has used. This can then be multiplied by the price per unit to determine the
customer’s gas bill.
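The three suggested answers can be sketched in a few lines of Python. This is only an illustration of "data + context + processing = information": the raw values are the ones from the examples, but the price per unit (`PRICE_PER_UNIT`) is a made-up figure, and the conversion of Jayne's total to a grade is left out because the slides do not give grade boundaries.

```python
from collections import Counter

# Example 1: survey responses (raw data) -> percentage who would buy
responses = ["Yes", "Yes", "No", "Yes", "No", "Yes",
             "No", "Yes", "No", "Yes", "Yes"]
counts = Counter(responses)
percent_yes = 100 * counts["Yes"] / len(responses)

# Example 2: six module scores -> total mark out of 600
scores = [42, 63, 96, 74, 56, 86]
total_mark = sum(scores)

# Example 3: meter readings -> units used -> bill
previous_reading, current_reading = 111192, 111234
units_used = current_reading - previous_reading
PRICE_PER_UNIT = 0.50  # hypothetical price in pounds, not from the slides
bill = units_used * PRICE_PER_UNIT

print(f"{percent_yes:.1f}% of respondents would buy")
print(f"Jayne scored {total_mark} out of {len(scores) * 100}")
print(f"{units_used} units used, bill = {bill:.2f}")
```

In each case the numbers only become information once the context (what the values represent) is known and the processing step has been applied.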
Knowledge
• Knowledge is the understanding of rules needed to interpret
information
• Information to which human experience has been applied

“…the capability of understanding the relationship between pieces of information and what to actually do with the information”
- Debbie Jones – www.teach-ict.com
Knowledge Examples
• Using the 3 previous examples:
• A Marketing Manager could use this information to decide whether to raise or
lower price y

• Jayne’s teacher could analyse the results to determine whether it would be worth
her re-sitting a module

• Looking at the pattern of the customer’s previous gas bills may show that the
figure is abnormally low, which could suggest that the customer is tampering with the gas meter
Knowledge Workers
• Knowledge workers have specialist knowledge
that makes them “experts”
• Based on formal and informal rules they
have learned through training and
experience

• Examples include doctors, managers, librarians, scientists…
Summary

Information = Data + Context + Meaning

Processing
Data – raw facts and figures

Information – data that has been processed (in a context) to give it meaning
Sources of data
• Direct data sources
• Indirect data sources
Direct and Indirect Data Sources
• Direct data source (original data): data that is collected for the purpose for which it will be used
• Indirect data source (secondary source): data that has been collected for a particular reason but is then used for something else
Direct Data Sources
What is it?
• Gathered for a specific purpose or task (e.g. questionnaires or data logging)
• Gathered without having to go to a third party: ‘original source data’
Methods of collecting direct data (sources)

• Questionnaires

• Interviews

• Observations

• Data logging
Questionnaires
• Questionnaires are a commonly used method of collecting
data from people. They are simple to administer and most
respondents are familiar with filling them in, either in a
paper format or online.

• Questionnaires make it easy to collect information in a standardised way and, so long as the questions have been
carefully thought out, are straightforward to analyse.
Face to face interviews
• Interviews allow you to
collect a greater depth of
data and understanding from
people than is possible by
just using a questionnaire.
Observation
• The data gatherer observes
what is happening during a
process or event and
produces some kind of data
file as a result.
Data Logging
An automated method of gathering physical data by using sensors.
Example
• The local council have given you the task of adding a traffic-calming feature in a village. So
before you decide on the best method you will need some original data.
Data required:
• Traffic flow through the village
• How traffic behaves
• What do residents think about traffic in their village
• What is their preference in the method of traffic calming
(LED warning message, speed hump, swerve obstacle etc.)

How will you use each of these methods?


• Questionnaire
• Face to face interviews
• Observation
• Data logging
Answers part 1

Traffic flow: method used - Data logger
• A vehicle sensor is laid across the road. This sensor is attached to a roadside data-logger.
As cars and lorries drive over the sensor, their speed, time of day, size and
frequency are logged.
• The advantage of a data logger is that it gathers physical data automatically.
Traffic behaviour: method used - Observation
• You spend a day watching how traffic behaves as it approaches the village. Do drivers
have any problems seeing existing speed signs? Do they approach fast then brake? Do
they bunch close together or do they keep well apart?
• The advantage of observation for original data gathering is that it can capture data that
a data-logger cannot, such as human behaviour.
Answers part 2
Resident opinion: method used - face to face interviews
• Face to face interviews allow the project to gather personal opinions, views and
attitudes. What are their main concerns? And so on.
• The advantage of interviews is that they may glean some unexpected data and also
capture general attitudes that a simple questionnaire cannot. But they do take time.
Resident preference: method used - Questionnaire
• Every house in the village is sent a questionnaire to fill in. It asks straightforward
questions such as 'How satisfied are you with existing traffic calming?' on a scale of 1 - 10.
• The advantage of questionnaires is that you get answers to very specific questions and
you can do so fairly quickly. The disadvantage is they can be difficult to fill in for more
complex questions such as attitude. You may also only get a few back, so the sample
size may be too small.
Advantages of direct data sources

• How much? How little data?
• Can we sell it?
• Where? Reliable?
• Addresses specific issues: control method to fit need
Disadvantages of direct data sources

Cost
Time

Sample size may be small


Indirect data sources
• Data obtained from a third party, not necessarily related to the current task

• Data that was collected for a particular reason but is then used for something else

• It often occurs when one organisation collects data about individuals and then sells this data to another organisation.
Sources of Indirect Data
• Electoral register
• Businesses collecting data from
third parties
• Weather data
Example 1 - Online shop

• An online shop stores your email address and details of items you have bought. They use this data to help
with their stock control and to send you an order confirmation. This is direct or original source data.

• But then they might sell your data (email address) to a similar company (with your permission).

• This second company might then email you with a list of related items you might be interested in.

• For example, you buy a computer game from one online company, then an email arrives from a
different company asking if you would like to buy a strategy book for the game.
Example 2 – Weather data
• Data loggers are set up all over the country to measure local weather conditions.

• All this data is gathered together by the 'Met Office' to allow weather forecasts to be made. (Direct or original source data)

• But this 'data set' may also be purchased by a local business that
wants to see how sales of its ice-cream relate to the weather.

• This information is used to plan ice-cream production ahead of time. (Indirect data)
Example 3 - Electoral Register
• By law, all residents must provide details of who is living in the house/flat, how old they are and
their gender. The purpose of this data gathering is to allow local authorities to handle voting in
political elections. (Direct or original source data)
• But (with the person’s permission) some of the list can be sold to commercial firms such as a
marketing company planning a new mail-shot campaign. (Indirect data)
Advantages of indirect data sources
• Larger data set for less time and money
• Can be done at a relatively low cost
• Larger sample size
• No need for physical access
• Easily accessible location
• Information can be of a higher quality
• Sorted and cleaned
Disadvantage of indirect data sources

• Initial purposes may differ from need: data needs to be filtered out
• Missing data: not required in initial source
• Sampling bias
First-, Second- & Third-Party Data (extra)
• First party (direct)
• Second party (extra, not in syllabus)
• Third party
Quality of Information
• accuracy
• relevance
• age (up-to-date, out-of-date)
• level of detail
• completeness.
Accuracy, Accuracy, Accuracy
Any raw, inaccurate data introduced will devalue all further work
Validation
• Validation is one way of trying to reduce errors in data entered into a system.

• Validation is performed by the computer at the point when you enter data. It is the process of checking
the data against the set of validation rules which you set up when developing your new database or
spreadsheet system.

DEFINITION: Validation aims to make sure that data is sensible, reasonable, complete and within acceptable boundaries.
• So while validation can help to reduce the number of errors when entering data, it cannot stop them; be
very clear about that.

• NB: validation does NOT check that the data is correct.


Accuracy - Validation
• Presence Check
• Range Check
• Limit Check
• Type Check
• Length Check
• Format Check
• Lookup Check
• Consistency Check
• Check Digit
Range check
• A range check is commonly used when you are working with data which consists of numbers, currency or
dates/times.
• A range check allows you to set suitable boundaries:
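As a minimal sketch of the idea, a range check can be written as a function that tests a value against a lower and an upper boundary. The function name `range_check` and the example rules (marks from 0 to 100, dates within the year 2024) are illustrative assumptions, not rules taken from the slides.

```python
from datetime import date

def range_check(value, minimum, maximum):
    """Return True if value lies within the inclusive boundaries."""
    return minimum <= value <= maximum

# Hypothetical rule: an exam mark must be between 0 and 100
print(range_check(74, 0, 100))    # inside the boundaries
print(range_check(130, 0, 100))   # outside the boundaries

# The same check works for dates (hypothetical rule: orders placed in 2024)
print(range_check(date(2024, 6, 1), date(2024, 1, 1), date(2024, 12, 31)))
```

Note that a mark of 47 typed in place of 74 would still pass: the check only confirms the value is within the boundaries, not that it is correct.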
Type check
• When you begin to set up your new system you will choose the most appropriate data type for each field.
• A type check will ensure that the correct type of data is entered into that field.
• For example, in a clothes shop, dress sizes may range from 8 to 18. A number data type would be a suitable
choice for this data. By setting the data type as number, only numbers could be entered e.g. 10, 12, 14 and
you would prevent anyone trying to enter text such as ‘ten’ or ‘ten and a half’.
• Some data types can perform an extra type check.
• For example, a date data type will ensure that a date you have entered can actually exist e.g. it would not
allow you to enter the date 31/02/07.
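The two type checks described above can be sketched in Python. A database or spreadsheet would normally do this for you once the field's data type is set; the helper names `is_whole_number` and `is_real_date` are made up for this illustration, and the date format `DD/MM/YY` is assumed from the example date.

```python
from datetime import datetime

def is_whole_number(text):
    """Type check: accept only text that can be read as a whole number."""
    try:
        int(text)
        return True
    except ValueError:
        return False

def is_real_date(text):
    """Date type check: accept only dates that can actually exist (DD/MM/YY)."""
    try:
        datetime.strptime(text, "%d/%m/%y")
        return True
    except ValueError:
        return False

print(is_whole_number("12"))       # a dress size typed as digits
print(is_whole_number("ten"))      # text is rejected
print(is_real_date("28/02/07"))    # a date that exists
print(is_real_date("31/02/07"))    # 31 February does not exist
```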
Check Digit
• This is used when you want to be sure that a range of numbers has been entered correctly. There are many
different schemes (algorithms) for creating check digits.
• For example, the ISBN-10 numbering system for books makes use of 'Modulo-11' division. In modulo
division, the answer is the remainder of the division. For example
• 8 Mod 3 = 2 i.e. the remainder of dividing 8 by 3 is 2.
• Consider the ISBN number:
• ISBN 1 84146 201 2
• The check digit is the final number in the sequence, so in this example it is the final ‘2’.
• The computer will perform a complex calculation on all of the numbers and then compare the answer to the
check digit. If both match, it means the data was entered correctly.
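The calculation for ISBN-10 is less complex than it sounds. In the standard ISBN-10 scheme each of the ten digits is multiplied by a weight from 10 down to 1, the results are added up, and the total must divide exactly by 11 (a final 'X' stands for the value 10). The sketch below applies this to the ISBN from the example; the function name `isbn10_is_valid` is an assumption made for this illustration.

```python
def isbn10_is_valid(isbn):
    """Check an ISBN-10 using the Modulo-11 weighted sum."""
    chars = [c for c in isbn.upper() if c not in " -"]  # ignore spaces and hyphens
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char == "X" and position == 9:
            digit = 10                      # 'X' is only allowed as the check digit
        elif char in "0123456789":
            digit = int(char)
        else:
            return False
        total += (10 - position) * digit    # weights 10, 9, ..., 1
    return total % 11 == 0

print(isbn10_is_valid("1 84146 201 2"))   # the ISBN from the example
print(isbn10_is_valid("1 84146 210 2"))   # two digits swapped by mistake
```

For 1 84146 201 2 the weighted sum is 187, which is 17 × 11, so the remainder is 0 and the number is accepted. Swapping the 0 and the 1 changes the sum to 188, so the typing error is caught.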
Length Check
• Sometimes you may have a set of data which always has the same number of characters.
• For example a UK landline telephone number has 11 characters.
• A length check could be set up to ensure that exactly 11 numbers are entered into the field. This type of
validation cannot check that the 11 numbers are correct but it can ensure that 10 or 12 numbers aren't
entered.
• A length check can also be set up to allow characters to be entered within a certain range.
• For example, postcodes can be in the form of:
• CV45 2RE (7 without a space or 8 with a space) or
• B9 3TF (5 without a space or 6 with a space).
• So you could set a length check for postcode to accept data which has a minimum number of 5 characters
and a maximum number of 8.
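Both kinds of length check (an exact length and a length within a range) can be sketched with one small function. The name `length_check` and the sample telephone number are illustrative assumptions; the postcodes are the ones from the example.

```python
def length_check(text, minimum, maximum=None):
    """Return True if the number of characters is within the allowed range.

    If only one length is given, the text must have exactly that many characters.
    """
    if maximum is None:
        maximum = minimum
    return minimum <= len(text) <= maximum

# Exact length: an 11-character telephone number (made-up number)
print(length_check("01234567890", 11))
print(length_check("0123456789", 11))    # only 10 characters

# Length within a range: postcodes of 5 to 8 characters
print(length_check("CV45 2RE", 5, 8))
print(length_check("B93TF", 5, 8))
```

As the slide says, this only checks how many characters were entered, so a wrong number of the right length would still be accepted.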
Accuracy - Verification

• Visual Verification
• Double Data Entry
• Hash Total
• Control Total
• Parity check
• Checksum
Visual verification
Double Entry
Hash Total
Control Total
Parity check
Checksum
Encryption - protocol / Cipher text

• VPN (virtual private network): used to connect to a company's WAN or LAN

• SSH (Secure Shell): used for remote connection to a computer

• SSL (Secure Sockets Layer)

• TLS (Transport Layer Security): secures communication with a website and protects personal data
Symmetric encryption
Plaintext → Encryption algorithm (shared secret key) → Ciphertext (encrypted) → Decryption algorithm (same shared secret key) → Plaintext
Asymmetric encryption
Plaintext → Encryption algorithm (recipient's public key) → Ciphertext (encrypted) → Decryption algorithm (recipient's private key) → Plaintext
Batch processing

• Sets of data are processed all at one time without user interaction

Master file
• A table in a database containing information about one set of things, e.g. employees
• Permanent
• Updated periodically
• The existing master file is used with a transaction file to produce a new updated master file

Transaction file
• Temporary file
• Contains all ongoing transactions in a batch-processing system
• Holds the changes to be made to the master file (additions, deletions and updates)
• Data that is used to update the master file

Master file + Transaction file → New updated master file
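The master file / transaction file update can be sketched as follows. This is a simplified illustration rather than how any particular system does it: real batch systems usually work on sorted files, whereas here the master file is a Python dictionary keyed by a made-up employee number, and the record contents and the function name `apply_batch` are all hypothetical.

```python
def apply_batch(master, transactions):
    """Apply every transaction in one run and return a new updated master file."""
    # Copy the records so the old master file is left unchanged
    new_master = {key: dict(record) for key, record in master.items()}
    for action, key, data in transactions:
        if action == "add":
            new_master[key] = dict(data)
        elif action == "update":
            new_master[key].update(data)
        elif action == "delete":
            del new_master[key]
        else:
            raise ValueError(f"Unknown action: {action}")
    return new_master

# Permanent master file: one record per employee (made-up data)
master_file = {
    101: {"name": "Ali", "hours": 35},
    102: {"name": "Bea", "hours": 40},
}

# Temporary transaction file: changes collected since the last run
transaction_file = [
    ("update", 101, {"hours": 38}),
    ("add", 103, {"name": "Cy", "hours": 20}),
    ("delete", 102, None),
]

new_master_file = apply_batch(master_file, transaction_file)
print(new_master_file)
```

All three kinds of change (addition, deletion and update) are applied in a single automated run with no user interaction, and the result is a new master file while the old one is kept as it was.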


Advantages and disadvantages of batch processing

Advantages
• Single, automated process requiring little human input, which reduces costs
• Can be scheduled when there is little demand on computer resources (e.g. at night)
• No transcription and update errors produced by humans, because processes are automated
• Fewer repetitive tasks for human operators

Disadvantages
• Delay, because data is not processed until a specific time period
• Only data of the same type can be processed, because an identical automated process is being applied to all data
• Errors cannot be corrected until the batch process is complete
