About this book
1.1 The characteristics of contemporary processors, input, output and storage devices
1.1.1 Structure and function of the processor
1.1.2 Types of processor
1.1.3 Input, output and storage
1.2 Software and software development
1.2.1 Systems Software
1.2.2 Applications Generation
1.2.3 Software Development
1.2.4 Types of Programming Language
1.3 Exchanging data
1.3.1 Compression, Encryption and Hashing
1.3.2 Databases
1.3.3 Networks
1.3.4 Web Technologies
1.4 Data types, data structures and algorithms
1.4.1 Data Types
1.4.2 Data Structures
1.4.3 Boolean Algebra
1.5 Legal, moral, cultural and ethical issues
1.5.1 Computing related legislation
2.2 Problem solving and programming
2.2.1 Programming techniques
2.2.2 Computational methods
2.3 Algorithms
2.3.1 Algorithms

As Salamu Alaykum Wa Rahmatullahi Wa Barakatuh World i.e. whoever has this
revision guide (yep, that’s you)!
This revision guide (KEYWORD “revision”) is intended to provide you with
(arguably) the absolute best preparation possible for your OCR A level
Computer Science H046/H446 spec exams.
By no means does this book cover the specification completely (at least
not this version) and this book should only be used for revision purposes.
That means it probably isn’t the best idea to learn the course content
through this book because it’s extremely concise and is only really meant
for revision purposes, hence the term “revision guide”.
For those of you who don’t know, The Muslim CompSci (me) is not a teacher
or an examiner – I’m just a student trying to do my best to create
resources for OCR A level Computer Science that are of good quality, but
more importantly, are for free. That’s why this revision guide, along with
all the other resources on my blog are for free so please take full
advantage of them.
Anyway, let’s actually start talking about this book now. The contents
page has a fairly detailed list which maps to the OCR spec exactly to show
which parts are covered. As aforementioned this version (v1.0) is missing
a few things so I guess I should point them out now. Here’s a list of all
the spec points not covered in this version:

• 1.2.3 (c) Writing and following algorithms.


• 1.3.4 (a) HTML, CSS and JavaScript.
• 1.4.1 (b) Represent positive integers in binary.
• 1.4.1 (c) Use of sign and magnitude and two’s complement to
represent negative numbers in binary.
• 1.4.1 (d) Addition and subtraction of binary integers.
• 1.4.1 (e) Represent positive integers in hexadecimal.
• 1.4.1 (f) Convert positive integers between binary hexadecimal and
denary.
• 1.4.1 (g) Representation and normalisation of floating point numbers
in binary.
• 1.4.1 (h) Floating point arithmetic, positive and negative numbers,
addition and subtraction.
• 1.4.1 (i) Bitwise manipulation and masks: shifts, combining with
AND, OR, and XOR.
• 1.4.2 (c) How to create, traverse, add data to and remove data from
the data structures mentioned above.
• 1.4.3 (a) Define problems using Boolean logic.
• 1.4.3 (b) Manipulate Boolean expressions, including the use of
Karnaugh maps to simplify Boolean expressions.
• 1.4.3 (c) Use the following rules to derive or simplify statements
in Boolean algebra: De Morgan’s Laws, distribution, association,
commutation, double negation.

• 1.4.3 (d) Using logic gate diagrams and truth tables.
• 1.4.3 (e) The logic associated with D type flip flops, half and full
adders.
• 1.5.2 The individual moral, social, ethical and cultural
opportunities and risks of digital technology.
• 2.1 Elements of computational thinking
• 2.2.1 (f) Use of object oriented techniques.
• 2.2.2 (a) Analysis and design of algorithms for a given situation.
• The Programming Project
Looking at this list, you may be thinking that this book is missing a lot
but if I made a list of all the stuff that is covered, then the list would
be like 3 times longer. Anyway, version 1.0 covers about 75% of the A
level specification and version 2.0 will cover all of it InShaAllah. All
the theoretical parts of the spec are fully covered here, it’s just a few
technical/practical topics that aren’t.

Moving on to all the other pages after the contents, each section begins
with the spec point that it’s addressing. In other words, the title of the
section is the spec point that it’s covering. Each section, or chapter
should I say, has a different colour scheme so you can easily find the
right chapter quickly. Speaking of colour schemes, the A level only parts
of the spec are highlighted in dark red.
I guess I should also mention that this revision guide took basically a
week or so to write and nearly summarises the entire spec in 25 odd pages
(which is quite impressive if you consider all the other revision guides
available for this course). I guess that’s all the info you need to know
about the book itself.
In regard to version 2.0, I already have a fairly good idea of what I’m
going to add to it so let me just give you a little taste of what’s to
come InShaAllah:

• Cover the entire specification (duh)


• Have brief chapter overviews to emphasise the importance of
computer science to the real world because let’s face it, most
people have no idea why they have to learn this stuff if they’re
not going to use it in their life.
• Each chapter will have a review exercise with questions from past
papers to help you consolidate your learning.
• For the mathematical parts of the spec i.e. Boolean algebra and
binary, worked examples of questions (probably from past papers)
will be provided to illustrate model solutions to common exam
problems.
• This last one is a bit ambitious but at the end of the book (along
with an index and a glossary) there will be 2 examination style
papers; one for each component.

Now that I think about it, I may not be able to implement all of this in the
next version but I definitely plan to over the next few versions InShaAllah. As
always, I hope this revision guide helps!

1.1 The characteristics of contemporary processors, input, output and storage devices

1.1.1 Structure and function of the processor


The Arithmetic and Logic Unit (ALU), Control Unit (CU) and Registers
CPU is a general-purpose processor that executes instructions in a computer system through the FDE cycle.
ALU: Carries out arithmetic calculations and logical decisions. The calculation results are stored in the ACC. Acts as a conduit and gateway to the processor through which all I/O to computer is done.
CU: Manages execution of instructions using control signals to coordinate how the processor works and how data moves between the CPU and memory. Synchronises actions using inbuilt clock. Controls FDE cycle and decodes instructions.
Registers are memory locations in the processor used for a specific purpose during the FDE cycle when frequent and faster access than RAM is needed to temporarily store data or control info.
GPR temporarily stores data being used by processors instead of sending to/from slower memory.
PC controls sequence of instructions retrieved and executed. Stores address of next instruction processed and passed to the MAR. Incremented by 1 after being read each FDE cycle. Is altered to the address from instruction in CIR if operation jump.
MAR contains memory location address of next instruction sent from the PC or of next data sent from the operand part of the instruction in the CIR to be accessed in memory.
ACC is temporary storage of intermediate arithmetic results in ALU that holds data currently being processed during calculations. Is used as a buffer to deal with I/O data in the processor.
MDR acts as a buffer temporarily storing contents of the memory location address specified in the MAR. Stores instructions/data when being transferred between memory and processor. Copied to CIR if instruction or into address if data used with instruction.
CIR holds the most recently fetched instruction while being decoded and executed. Contents of the MDR are copied to the CIR if it is an instruction and then splits contents into 2 component parts: 1st part is the opcode held while it is decoded so the CU knows what to do. Rest of the contents are the address of the data used in the operation or the actual data if an immediate operand is used which is then copied to the MAR if address for accessing data or MDR if data. Sends address to PC for jump instruction and determines type of addressing used.

Buses
Buses are parallel groups of communication channel wires through which data is transmitted in groups of
bits together from one register to another in the processor.
Data Bus: Carries data transmitted from one register to another between the processor and memory.
Address Bus: Carries memory location address of the register where data is being read from/written to.
Control Bus: Transmits control signals from the CU to allow synchronisation of signals to rest of the processor.
The Fetch-Decode-Execute (FDE) Cycle
Fetch:
1. Instruction is fetched from the memory into the processor
2. PC holds address of next instruction and copies contents to the MAR, which is sent via the Address Bus
3. PC is incremented and the fetch signal is sent via the Control Bus
4. Contents of memory location are sent from memory to processor via Data Bus and stored in MDR
5. Instruction in the MDR is copied to the CIR
Decode: Instruction is decoded by the CU in CIR
Execute: Processor carries out instruction (e.g. for an addition, contents of the MDR and the ACC are sent to the ALU and the result is stored back in the ACC) and the FDE cycle is repeated
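The register traffic in the cycle above can be traced with a toy simulator. This is only a sketch: the mnemonics (LDA, ADD, STA, BRA, HLT) are borrowed from Little Man Computer style assembly and the (opcode, operand) tuple format is an assumption made for illustration, not a real instruction set.

```python
def run(memory):
    """Run a toy program until HLT and return the final ACC value."""
    pc = 0   # Program Counter: address of the next instruction
    acc = 0  # Accumulator: holds the results of calculations
    while True:
        # Fetch: PC -> MAR, PC incremented, memory[MAR] -> MDR -> CIR
        mar = pc
        pc += 1
        mdr = memory[mar]
        cir = mdr
        # Decode: split the instruction into opcode and operand
        opcode, operand = cir
        # Execute
        if opcode == "LDA":    # load a value from memory into the ACC
            mar = operand
            mdr = memory[mar]
            acc = mdr
        elif opcode == "ADD":  # ALU adds MDR to ACC, result goes back in ACC
            mar = operand
            mdr = memory[mar]
            acc = acc + mdr
        elif opcode == "STA":  # store the ACC in memory
            mar = operand
            memory[mar] = acc
        elif opcode == "BRA":  # jump: PC is altered to the operand address
            pc = operand
        elif opcode == "HLT":  # stop the program
            return acc
        else:
            raise ValueError(f"unknown opcode {opcode!r}")


# Addresses 0-3 hold instructions, addresses 4-6 hold data.
program = [("LDA", 4), ("ADD", 5), ("STA", 6), ("HLT", 0), 7, 5, 0]
print(run(program))  # adds the values stored at addresses 4 and 5
```

Instructions and data share one memory list here, which matches the Von Neumann idea of storing both together.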
The factors affecting the performance of the CPU (Central Processing Unit)
Clock Speed: Clock controls process of executing instructions and fetching data. Overclocking gives more cycles per second so more instructions are executed so program takes less time to run. Increases performance but produces heat so additional cooling necessary to prevent CPU damage.
Cores: Distinct processing units on the CPU can run different apps when multitasking or multiple cores can speed up small problems.
Cache Memory: Small memory works faster than main memory by anticipating data/instructions likely to be regularly accessed, overall speed at which processor operates is increased as more space for data/instructions. Access also quicker for future use so RAM is accessed less frequently.
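To see why a bigger cache means RAM is accessed less often, here is a rough sketch that counts cache hits for a list of memory addresses. The "least recently used" replacement rule is an assumption picked for the example (real processors use a variety of policies), and `count_hits` is a made-up helper name.

```python
from collections import OrderedDict


def count_hits(addresses, cache_size):
    """Count how many accesses are served from the cache instead of RAM."""
    cache = OrderedDict()  # keys are cached addresses, oldest first
    hits = 0
    for addr in addresses:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # evict the least recently used
            cache[addr] = True             # fetched from RAM, now cached
    return hits


accesses = [1, 2, 1, 3, 1, 2, 4, 1]
for size in (2, 3):
    print(size, "slots:", count_hits(accesses, size), "hits out of", len(accesses))
```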
The use of pipelining in a processor to improve efficiency
Pipelining: while one instruction is being executed, the next one is being decoded and the one after is being
fetched.
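A quick way to see the saving is to count clock cycles. The sketch below assumes every stage takes exactly one cycle and that there are no jumps forcing the pipeline to be flushed, which is a simplification.

```python
STAGES = ["Fetch", "Decode", "Execute"]


def cycles_without_pipelining(n):
    """Each instruction finishes all three stages before the next starts."""
    return n * len(STAGES)


def cycles_with_pipelining(n):
    """The first instruction takes three cycles, then one finishes per cycle."""
    if n == 0:
        return 0
    return len(STAGES) + (n - 1)


def schedule(n):
    """For each clock cycle, show which instruction number is in which stage."""
    rows = []
    for cycle in range(cycles_with_pipelining(n)):
        row = {}
        for position, stage in enumerate(STAGES):
            instruction = cycle - position
            if 0 <= instruction < n:
                row[stage] = instruction + 1
        rows.append(row)
    return rows


for cycle, row in enumerate(schedule(3), start=1):
    print("cycle", cycle, row)
```

In the third cycle all three stages are busy at once, which is where the efficiency gain comes from.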
Von Neumann, Harvard and contemporary processor architecture
Von Neumann
Single processor CU manages program control. 1 instruction is processed at a time in a linear sequence
using the FDE cycle. Data/instructions are stored together in the same memory format. Simple OS and
easier to program but slower processing large sets of data.
Harvard
Data/instructions are stored in separate memory units with separate buses so while data is being read
from/written to the data memory, the next instruction can be read from the instruction memory.
1.1.2 Types of processor
The differences between and uses of CISC and RISC processors
CISC (Complex Instruction Set Computer) | RISC (Reduced Instruction Set Computer)
Complex processor design | Simple processor design
Complex instructions take multiple machine cycles | Simple instructions take one machine cycle
Larger number of instructions available | Limited number of instructions available
Instructions perform complex tasks | Instructions perform simple tasks
Doesn’t allow pipelining | Allows pipelining
Single register set | Many register sets
Many addressing modes | Fewer addressing modes
Instructions have variable format | Instructions don’t have variable format
Programs run more slowly | Programs run more quickly
Expensive due to integrated circuitry | Cheaper due to simple circuitry
Used by desktops and laptop computers | Used by smartphones and tablets
RISC only: Complex tasks only performed by combining multiple instructions taking multiple machine cycles and GPRs reduce need to constantly send data to and from memory.

GPUs (Graphics Processing Unit) and their uses
GPU: Designed specifically for graphics so has built in circuitry and instruction sets for common graphics
operations. Cost efficient way of tackling problems as large number of cores can run on highly parallelisable
problems where able to perform SIMD e.g. transforming points in polygon or shading pixels so can perform
transformations to onscreen graphics quickly. Has uses outside graphics e.g. modelling physical systems,
audio processing, breaking passwords, machine learning and in science/engineering problems.
Multicore and Parallel systems
Parallel Processing is where a computer carries out multiple computations simultaneously to solve a given problem by SIMD and MIMD. Not suited to all problems as most only partially parallelisable and writing algorithms is more challenging.
Multicore processors have more than one processor incorporated into a single chip to distribute the workload across multiple cores to achieve a higher performance.

SIMD: Same instruction applied to multiple data simultaneously.
MIMD: Different instructions applied to different data concurrently.
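The difference between the two patterns can be sketched in a few lines. Plain Python runs these loops one item at a time, so this only shows the shape of the work, not real parallel hardware; the function names are made up for the example.

```python
def simd(instruction, data):
    """SIMD pattern: the same instruction is applied to every data item."""
    return [instruction(item) for item in data]


def mimd(instructions, data):
    """MIMD pattern: each data item gets its own instruction."""
    return [instruction(item) for instruction, item in zip(instructions, data)]


def move_right(point):
    """Shift a point 10 units along the x axis."""
    x, y = point
    return (x + 10, y)


# SIMD style: transform every point of a polygon with one operation.
triangle = [(0, 0), (4, 0), (2, 3)]
print(simd(move_right, triangle))

# MIMD style: three different operations on three different values.
print(mimd([abs, len, str], [-5, "hello", 3.5]))
```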
1.1.3 Input, output and storage
How different input, output and storage devices can be applied to the solution of different problems
Input Device: Hardware used to put data in computer e.g. keyboard, mouse, scanner and microphone
Output Device: Hardware used to get info from computer e.g. printer, speaker and monitor
Storage Device: Non-volatile peripheral that stores data e.g. HDDs, CDs and SSDs
The uses of magnetic, flash and optical storage devices
Magnetic: Uses magnetisable material and works by reading magnetic patterns off platters that mechanically spin at high speeds. Can be noisy and susceptible to damage. High capacity at low cost e.g. HDD and magnetic tape used to backup servers.
Flash Memory: Solid State technology where data stored using memory chips. Contents can be erased and overwritten when electrical charge applied. No moving parts so requires less space and portable. R/W speed faster and consumes less power than magnetic or optical media but more expensive.
Optical: Works by using a laser to look at its reflection. Cheap to distribute and resilient e.g. CD, DVD, Blu ray discs.
RAM (Random Access Memory) and ROM (Read Only Memory)
RAM – Volatile, editable, large. Stores apps software, OS, user files currently in use and allows the user to alter contents. Offers fast and direct access to data. Faster R/W speed than storage devices and reduces buffering.
ROM – Non-volatile, unalterable, small. Stores BIOS bootstrap program so immediately available when computer turned on as is already present in memory.
Virtual Storage
Virtual Storage is the combination of physical storage devices into a single remote virtual storage
software/device that is accessible anywhere so no need to hire specialist staff, back up or security.
1.2 Software and software development
1.2.1 Systems Software
The need for, function and purpose of operating systems (OS)
Software is the set of programs that run on a computer to make the hardware work.
OS is the software designed to control the hardware and access of a system through resource, task and
memory management, system software and device drivers.
Job scheduling provides fair access to the processor according to a set of rules.
The OS provides a platform for apps to run, a UI with the operator to allow communication between the user
and hardware e.g. CLI, utilities to carry out housekeeping tasks to maintain hardware and security for user
files e.g. through password system.
It also handles and governs communications between devices using protocols e.g. across a network,
translation of code from High/Low level language to machine code using translators, interrupts and deals
with software issues e.g. file handling.
Memory Management
Memory Management organises use of main memory to enable sharing by partitioning, allocating and reallocating memory when necessary. Allows programs larger than main memory and separate processes to run at the same time. Ensures no space is wasted by converting logical addresses to physical addresses. Protects OS and programs from accessing each other’s memory unless required.
Paging – Partitions memory into fixed size physical divisions made to fit sections of memory to allow programs to run despite insufficient memory. Used for virtual memory.
Segmentation – Partitions memory into variable sized logical divisions holding complete sections of programs according to their function.
Virtual Memory holds the part of the program not currently in use to allow large programs to run when insufficient memory available. Uses a backing store as additional memory and swaps pages between the main memory and an area of a secondary storage device to make space for pages needed. May cause disk thrashing when more time spent transferring pages than processing causing computer to hang.
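Converting a logical address to a physical one under paging is just a divide and a lookup. The sketch below assumes a page size of 1024 bytes and uses a plain dictionary as the page table; both are illustrative choices, not how any particular OS stores them.

```python
PAGE_SIZE = 1024  # assumed size of each fixed size page, in bytes


def translate(logical_address, page_table):
    """Convert a logical address to a physical address using a page table."""
    page, offset = divmod(logical_address, PAGE_SIZE)
    frame = page_table.get(page)
    if frame is None:
        # The page is in the backing store, so the OS would have to
        # swap it into main memory before the access could continue.
        raise LookupError(f"page fault: page {page} is not in main memory")
    return frame * PAGE_SIZE + offset


# Pages 0 and 1 are in main memory; page 2 has been swapped out.
page_table = {0: 5, 1: 2}
print(translate(1030, page_table))  # page 1, offset 6, held in frame 2
```

If the program keeps touching pages that are not in the table, the time goes on swapping rather than processing, which is the disk thrashing problem described above.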
Interrupts, the role of interrupts and Interrupt Service Routines (ISR), role within the FDE Cycle.
Interrupt: Signal to processor causes a break in the execution of a current routine to obtain processor time, indicates a process or a device needs servicing for a higher priority task to take precedence to avoid delays and data loss and use the processor efficiently.
ISR: Interrupt Register checked when the current FDE cycle is completed by comparing the priority of the incoming interrupt and the current task with the Interrupt Register. Contents of registers are stored in a LIFO stack in the memory. Location of the relevant ISR is loaded by loading the relevant value into the PC. When the ISR is complete, flags are reset to an inactive state, further interrupts are checked and serviced if necessary and the previous state is popped from the stack and loaded back into registers to resume processing.
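The save, service and restore sequence can be sketched like this. It assumes a bigger number means a higher priority and models the registers as a dictionary; `service_interrupt` and `keyboard_isr` are hypothetical names used only for the example.

```python
def service_interrupt(registers, stack, current_priority, interrupt):
    """Service an interrupt if it outranks the current task.

    interrupt is a (priority, isr_address, isr_function) tuple.
    Returns True if the ISR ran, False if the interrupt had to wait.
    """
    priority, isr_address, isr = interrupt
    if priority <= current_priority:
        return False                    # current task carries on
    stack.append(dict(registers))       # push register contents (LIFO)
    registers["PC"] = isr_address       # load location of the ISR into the PC
    isr(registers)                      # run the Interrupt Service Routine
    saved = stack.pop()                 # pop the previous state
    registers.clear()
    registers.update(saved)             # resume where the task left off
    return True


def keyboard_isr(registers):
    """Pretend ISR that is free to overwrite the registers."""
    registers["ACC"] = 99


registers = {"PC": 10, "ACC": 3}
stack = []
print(service_interrupt(registers, stack, 1, (5, 200, keyboard_isr)))
print(registers)  # same as before the interrupt
```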
Scheduling
Scheduler is a program that processes the maximum number of jobs in the least possible time and ensures
all jobs are processed fairly so long jobs don’t monopolise the processor. Changes priorities where
necessary and maximises interactive users with fast response times and no apparent delay. Utilises
resources and processor time efficiently depending on priorities.

Round Robin: Pre-emptive scheduler allocates each user a small time slice in turn and at the end of each slice it moves to the next user and jobs go to the back of the queue. Repeats for all users. Order may depend on user priorities but users are unaware of any delays.

First Come First Served: Jobs processed in order of arrival so others are queued up for their turn. Can waste processor time if a job is using a slow resource.

Multilevel Feedback Queues: Uses a number of queues each with different priorities. Algorithm can move jobs between queues depending on the job’s behaviour.
Shortest Job First: Picks job with the shortest time and runs until finished.
Shortest Remaining Time: Scheduler estimates the length of each process then picks the one with the least amount of time and runs that but if job added with shorter time it switches to that one.
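Three of these algorithms are simple enough to compare directly. The sketch below assumes every job arrives at time 0 with a known run time, which is a simplification of what a real scheduler deals with.

```python
from collections import deque


def first_come_first_served(jobs):
    """jobs is a list of (name, run_time). Returns the completion order."""
    return [name for name, _ in jobs]


def shortest_job_first(jobs):
    """Run the shortest job to completion, then the next shortest."""
    return [name for name, _ in sorted(jobs, key=lambda job: job[1])]


def round_robin(jobs, time_slice):
    """Give each job one time slice in turn until all have finished."""
    queue = deque(jobs)
    finished = []
    while queue:
        name, remaining = queue.popleft()
        remaining -= time_slice
        if remaining > 0:
            queue.append((name, remaining))  # back of the queue
        else:
            finished.append(name)
    return finished


jobs = [("A", 5), ("B", 2), ("C", 3)]
print("FCFS:", first_come_first_served(jobs))
print("SJF: ", shortest_job_first(jobs))
print("RR:  ", round_robin(jobs, time_slice=2))
```

With Round Robin the long job A no longer holds up B and C, which is the "long jobs don't monopolise the processor" point from the scheduler description.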

Distributed, embedded, multi-tasking, multi-user and Real Time operating systems


Distributed: Shares processing between multiple computers to work together as a single system.
Embedded: Built into device with limited resources.
Multitasking: Runs multiple programs apparently simultaneously using separate windows for each task. Each receives a slice of processor time before going onto the next task. User can switch between programs.
Multiuser: One computer with multiple terminals each given a small time slice in turn. Allows multiple users to access computer resources at the same time using flags and priorities and separating user data and rights.
Real Time: Data processed immediately with no delay guaranteeing response within a given time frame.
BIOS (Basic Input/Output System)
Gets computer up and running when first switched on. Processor PC points to BIOS memory which checks if
the computer is functional, memory installed and accessible and processor working. Stored on flash memory
so can be updated and allows settings e.g. boot order of disks to be changed and saved by the user.
Device drivers
Device drivers are software supplied with devices which contain instructions to the OS using the peripheral
to enable communication and configure hardware.
Virtual machines
Theoretical computer provides environment in which translator available. Uses interpreter to run intermediate
code which can then run off any computer with virtual machine. Slower than compiler and has limited access
to low level features e.g. GPU access. Used to run an OS inside another when testing program compatibility.
Intermediate Code
Partly translated simplified code between high level and machine code produced by a compiler. Runs on any
computer using an interpreter improving portability between machines. Same code can be obtained from
different High Level Languages. Allows sections of code to be written in different languages by different
programmers. Suitable for specific tasks as is error free and protects intellectual property. Needs translation
and additional software each time to run so slower than exe code.
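Python itself works this way: the standard CPython implementation compiles source code to bytecode (an intermediate code) which its virtual machine then interprets. The sketch below uses the built-in `compile` function and the standard `dis` module to look at that bytecode; the variable names are made up for the example.

```python
import dis

source = "total = price * quantity"

# Compile the high level source into a code object holding bytecode.
code = compile(source, "<example>", "exec")

# Show the intermediate instructions. The exact list differs between
# Python versions, so no particular output is assumed here.
dis.dis(code)

# The virtual machine then runs the bytecode.
namespace = {"price": 3, "quantity": 4}
exec(code, namespace)
print(namespace["total"])
```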
1.2.2 Applications Generation
The nature of applications

Applications Software: Collection of compatible software complete with electronic or hard copy user guides
that allow the user to perform a task or produce something useful e.g. word processors, spreadsheet
packages, presentations software, desktop publishers, image editors, web browsers.

Utilities

Utilities are relatively small housekeeping programs that perform specific tasks to do with the maintenance
of the system e.g. antivirus programs, disk defrag, compression, file managers, backup utilities.

Open source vs closed source
Open Source Software (OSS): Software with the source code freely available allowing others to make amended versions to contribute to the program’s development e.g. Linux, Libre Office, Firefox
Closed Source Software (CSS): Proprietary software with the source code not available but provides automatic updates and support e.g. Mac OS, Microsoft Office, Google Chrome
Translators
Translator converts code from one language to another i.e. from source code to object code and detects
errors in source code.
Interpreter: Interprets and runs high level code by translating one line then runs before the translation of the next. Reports one error then stops to indicate the position. Must be present each time the program runs so slow due to translation but is machine independent. Used with virtual machines and during program development. Source code is visible so can be easily copied and amended but can be obfuscated.
Compiler: Translates the whole High Level Language program to machine code as a unit. Creates exe program when completed so quicker. Easier to get access to lower level features e.g. GPU access. Protects program from malicious use and preserves intellectual property. Compiled code is not human readable and is architecture specific. Gives a list of errors at the end of compilation which may be spurious. Optimises to improve program speed and is no longer needed once the exe code is produced.
Assembler: Program translates assembly code into machine code by reserving storage for data/instructions, replacing mnemonic opcodes by machine code and symbolic addresses by numeric addresses, creates a symbol table to match labels to addresses, checks syntax, gives a list of errors and offers diagnostics at the end.
Stages of compilation
Lexical Analysis: Converts source code into a series of tokens which are fixed length strings of binary digits created from individual symbols and reserved words in the program. Redundant code, white spaces and comments are removed. A symbol table is used to store info about variable names and subroutines. Error diagnostics are given to prepare the code for syntax analysis.
Syntax Analysis: Compiler accepts the output from lexical analysis, checks program is syntactically correct against the rules about the structure of programming language, adds further detail to the symbol table e.g. data type/scope/address. An abstract syntax tree is built and diagnostics are given and if no errors the code is passed to code generation.
Code Generation: Abstract syntax tree is converted to object code by producing the machine code equivalent to source program. Variables are given addresses and relative addresses are calculated.
Optimisation: Object code checked and made as efficient as possible increasing processing speed by reducing the number of instructions. Programmer can choose between size and speed.
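The lexical analysis stage is the easiest one to try out. Below is a minimal sketch of a lexer for a made-up mini language: the keyword list and token names are assumptions, and the tokens are kept as readable (kind, text) pairs rather than the fixed length binary strings a real compiler would produce.

```python
import re

KEYWORDS = {"if", "then", "else", "while", "print"}  # assumed reserved words

TOKEN_PATTERN = re.compile(r"""
    (?P<COMMENT>\#.*)                # comments are thrown away
  | (?P<NUMBER>\d+)
  | (?P<NAME>[A-Za-z_]\w*)
  | (?P<OP>==|<=|>=|[-+*/=<>()])
  | (?P<SPACE>\s+)                   # white space is thrown away
  | (?P<ERROR>.)                     # anything else is a lexical error
""", re.VERBOSE)


def tokenise(source):
    """Return (tokens, symbol_table) for a string of source code."""
    tokens = []
    symbol_table = {}
    for match in TOKEN_PATTERN.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind in ("SPACE", "COMMENT"):
            continue
        if kind == "ERROR":
            raise SyntaxError(f"unexpected character {text!r}")
        if kind == "NAME":
            if text in KEYWORDS:
                kind = "KEYWORD"
            else:
                # record each variable name once, in order of appearance
                symbol_table.setdefault(text, len(symbol_table))
        tokens.append((kind, text))
    return tokens, symbol_table


tokens, symbols = tokenise("total = price + 42  # add on the fee")
print(tokens)
print(symbols)
```

The token list is what would be handed on to syntax analysis, and the symbol table is the part that later stages add data types and addresses to.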

Linkers and loaders and use of libraries

Linker: Links compiled program code with compiled library code all into one exe program.
Loader: Part of the OS that loads the exe program and associated libraries into the memory and handles addresses before the program runs.
Libraries: Compiled software performs common tasks e.g. searching or sorting as it is already available and tested so error free, reliable, reduces repeated code saving amount of work and time, can be used multiple times, written in different source languages allowing programmers to use others’ expertise and modules can be shared to make programs shorter.
1.2.3 Software Development
Software development
Feasibility Study – Analysts carry out initial enquiries to decide if the problem is realistically solvable
before production by considering parameters e.g. budget, time, work force, economic, environmental, legal,
social and technical feasibility to revise plan if the study highlights problems.
Requirements Specification – Specification document developed between the client and the software
developers that unambiguously states everything the new system is expected to do e.g. Input/Output,
processing, client agreement and hardware/software requirements.
User Documentation – Ensures the user can use the system so contains info e.g. contents, index, glossary,
troubleshooting, help, contacts, warranties, FAQs etc.
Technical Documentation – Explains how the system works allowing it to be maintained and further
developed in the future e.g. Data Flow Diagrams, System Flow Charts, Flow Charts, ERDs.
Waterfall Lifecycle – Consists of a series of linear stages presented in order that need to be completed to
produce the working system. Requirements are established in the early stages and are focussed on by the
subsequent stages. Each stage feeds info into the next. May be necessary to return to one or more stages to
collect more info or check on data collected. After returning, all intervening steps are revisited to improve the
solution.
Waterfall Lifecycle Pros
• Tends to suit large scale projects with stable requirements
• Ideal for supporting inexperienced teams
• Orderly sequence ensures quality documentation with reliability and maintainability of system
• Progress easily measurable
• Progresses forward with slight ‘splash back’
Waterfall Lifecycle Cons
• Inflexible as dependent on clear requirements
• Produces excessive documentation so time consuming
• Missing components discovered during design and coding
• System performance can’t be tested until system is almost fully coded
Agile Development is a group of methods designed to cope with changing requirements through producing
software in iterative versions each building on the previous and increasing requirements met so, if seeing a
version the user realises a requirement is not fully considered, they can add it in a future iteration.
XP is an agile iterative approach where program is coded, tested and improved repeatedly. Representative
of the customer becomes part of team to help decide requirements and tests used to ensure they are
correctly implemented and answer any questions about any problem areas the programmers might have.
Extreme Programming (XP) Pros
• New requirements can be adopted as end user integral throughout
• PP and SP standards used to ensure code in each iteration is of good enough quality
• Final product well tested, efficient and robust
• Created quickly so modules become available for use by client
Extreme Programming (XP) Cons
• Client must ensure they are represented on the team to accept completed code and discuss any potential changes
• Emphasis on coding instead of design so lack of documentation
• Unsuitable for larger projects

Spiral Model progresses by evaluating and dealing with risk. Analyst begins by collecting data followed by
other stages leading to evaluation leading to a return to data collection to modify results. Riskiest parts of
project identified and dealt with first. Different stages refined with each spiral iteration until project complete.
Spiral Model Pros
• Large risk analysis ensures issues are addressed early in development
• Used where risks involved e.g. mission critical projects, games
Spiral Model Cons
• Risk analysis needs highly skilled team
• Costs high due to number of prototypes created and increased customer collaboration
RAD is where a prototype design with reduced functionality is produced to a set deadline and is tested and
evaluated to refine the design of the next prototype using feedback from users. Cycle is repeated with each
iteration improving the program until the prototype is accepted and the final product is produced.
Rapid Application Development (RAD) Pros
• Something can be seen working early in the project so cheaper
• End user more involved
• Can change requirements as product becomes clearer
• Concentrates on essentials needed for user so overall development time quicker than alternative methods
• Used where requirements not well defined and development team authorised to make design decisions without
need for detailed consultation with senior managers
Rapid Application Development (RAD) Cons
• Speed may impact overall system quality and consistency of designs as lacks administration and documentation
• Not suitable for safety critical systems
Black Box Testing – Sets of different possible inputs are tested against the expected output according to the
design so need to test all situations without considering how the program works.
White Box Testing – Uses the source code to test the actual steps of the algorithm to ensure all parts work as
intended by checking every possible path and condition statements with dry runs and trace tables.
Alpha Testing – Testing is carried out by programmers within the software company playing the role of the user
during development to find bugs in the program.
Beta Testing – Nearly complete program is tested by 3rd party users under normal operating conditions to aim to
find any bugs the programmer overlooked to make the final version more robust.
1.2.4 Types of Programming Language
Procedural languages
Procedural Languages are 3rd Generation high level imperative languages which use sequence, selection
and iteration. The program states what to do and how to do it with a series of statements in a specific order.
Breaks down the solution into subroutine blocks which are rebuilt and combined to form the program. Logic
of the program is given as a series of procedure calls.
Assembly language
Assembly Code is a machine specific Low Level Language (LLL). It is at a higher level and is easier to write
than machine code but is more difficult than High Level Languages (HLLs). Uses descriptive names for data
stores, mnemonics for instructions and labels to allow selection. Each instruction is translated into 1 machine
code instruction. May use macros.

LMC is a fictional processor designed to illustrate the principles of how processors and assembly code work.
Mnemonic         Function            Example Instruction  Explanation
ADD              Add                 ADD n                Add the contents of n to the ACC
SUB              Subtract            SUB n                Subtract the contents of n from the ACC
STA, STO         Store               STA n                Store the contents of the ACC in location n
LDA, LOAD        Load                LDA n                Load the contents of n into the ACC
BRA, BR          Branch always       BRA number           Unconditional jump to number label
BRZ, BZ          Branch if zero      BRZ number           Jump to number label if ACC contents is zero
BRP, BP          Branch if positive  BRP number           Jump to number label if ACC contents is zero or positive
INP, IN, INPUT   Input               INP                  Prompt for a number to be input
OUT              Output              OUT                  Outputs the contents of the ACC
HLT, COB, END    End Program         HLT                  Stops program execution
DAT              Data Location       n DAT 10             Creates data location n and stores the number 10 in it
Modes of addressing memory
Immediate – Data in operand is the value used by the operator and is added to the value in the ACC. Used in
assembly language.
Direct – Simplest method of addressing uses data in operand without modification as the address of the data and
adds it to the value in ACC. Memory locations that can be addressed are limited by the size of the address field
as the code is not relocatable. Used in assembly language.
Indirect – Uses the address field as a vector to the address to be used. Used to access library routines as it
increases the size of the address that can be used allowing a wider range of memory locations to be accessed.
Indexed – Modifies the address given by adding the number from the Index Register to the address in the
instruction. Allows efficient access to a range of memory locations by incrementing the value in the Index
Register e.g. used to iterate through an array.
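The four addressing modes can be compared with a small Python sketch. It is only an illustration: the 16-cell memory, its contents, the index register value and the function name fetch_operand are all assumptions made for this example, not part of any real instruction set.

```python
# Toy memory and index register; every value here is an illustrative assumption.
memory = [0] * 16
memory[5] = 9        # location 5 holds the address 9 (used by indirect addressing)
memory[7] = 40
memory[9] = 70
index_register = 2


def fetch_operand(mode, operand):
    """Return the value an instruction would work on for the given addressing mode."""
    if mode == "immediate":
        return operand                            # the operand is the data itself
    if mode == "direct":
        return memory[operand]                    # the operand is the address of the data
    if mode == "indirect":
        return memory[memory[operand]]            # the operand is the address of the address
    if mode == "indexed":
        return memory[operand + index_register]   # address = operand + index register
    raise ValueError(f"unknown addressing mode: {mode}")
```

With the same operand 5, each mode reaches a different value: 5 itself, the contents of location 5, the contents of location 9, and the contents of location 7.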

Object oriented languages


OOP is where programs are built by considering components as objects that interact with each other.
Class – Template to construct a set of objects that have state and behaviour; defines the methods and attributes
an object should have.
Object – Self-contained instance of a class made from attributes and methods.
Method – Subroutine that forms the actions an object can perform.
Constructor – Method that describes how an object is created.
Attribute – Value stored in variable associated with an object.
Inheritance is when a class takes all the methods and attributes of its parent class, may override some of these
and has additional extra methods and attributes of its own. The class code is used as a base for similar objects
to save time.
Encapsulation is the process of hiding data within objects to keep attributes private. Private attributes are
only accessed and changed via public methods defined for class to maintain data integrity. Objects only interact
in the way intended and prevents unexpected changes to attributes having unforeseen consequences.
Polymorphism applies the same method to objects of different types as they are treated in the same way. Code
written is able to handle different objects in the same way to reduce the volume produced e.g. apply same method
to array with objects of different classes.
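A short Python sketch may help tie these terms together. The Account and SavingsAccount classes, their attributes and the sample data are hypothetical names chosen for illustration only.

```python
class Account:
    """Parent class: the balance is kept private (encapsulation)."""

    def __init__(self, owner, balance=0):      # constructor
        self.owner = owner
        self.__balance = balance               # name-mangled, so treated as private

    def deposit(self, amount):
        """Public method: the only way to change the private balance."""
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.__balance += amount

    def get_balance(self):
        return self.__balance

    def describe(self):
        return f"Account for {self.owner}"


class SavingsAccount(Account):
    """Child class: inherits everything from Account and overrides describe()."""

    def __init__(self, owner, balance=0, rate=0.02):
        super().__init__(owner, balance)
        self.rate = rate                       # extra attribute of the child class

    def describe(self):                        # overridden method
        return f"Savings account for {self.owner} at {self.rate:.0%}"


# Polymorphism: the same call works on objects of different classes.
accounts = [Account("Ada", 100), SavingsAccount("Alan", 50)]
descriptions = [account.describe() for account in accounts]
```

The list comprehension calls describe() on every object without checking its class, which is the array example mentioned under polymorphism.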

1.3 Exchanging Data


1.3.1 Compression, Encryption and Hashing
Lossy vs Lossless compression
Compression is necessary to reduce file sizes and time taken for data transmission by considering available
bandwidth, expected processing power and storage requirements of the user’s computer.
Lossy Compression is where the algorithm makes the file size smaller by removing data. Lost data is not
recoverable and the accuracy of the data represented is reduced as it assumes there is enough data remaining to
be acceptable. Used for sound and image files e.g. JPEG, MPEG, MP3.
Lossless Compression is where files are stored and transmitted intact. The algorithm used retains all info in
the file while reducing the size so the original can be reconstructed from the data e.g. ZIP, GIF, PNG, program
code.
Run length encoding and dictionary coding for lossless compression
Run Length Encoding is where consecutive repeated items e.g. pixels, words, bits are stored once together with
the number of times they occur in the run. Used in TIFF, BMP files.
Dictionary Coding is where a compression algorithm searches through a text to find suitable entries in a
known/own dictionary and translates the message accordingly. Used in ZIP, GIF, PNG files.
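As a minimal sketch of run length encoding, the Python below stores each run as an (item, count) pair and rebuilds the original from those pairs. The function names rle_encode and rle_decode are assumptions for this example; real file formats pack the runs into bytes rather than Python tuples.

```python
from itertools import groupby


def rle_encode(text):
    """Replace each run of identical characters with an (item, run length) pair."""
    return [(item, len(list(run))) for item, run in groupby(text)]


def rle_decode(pairs):
    """Rebuild the original text; nothing is lost, so the compression is lossless."""
    return "".join(item * count for item, count in pairs)
```

Encoding only saves space when the data contains long runs, which is why it suits simple images better than ordinary text.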
Symmetric and asymmetric encryption
Encryption is used when transmitting info on networks as data can be intercepted. Important in VPNs as numerous
users share a physical network.
Encryption Keys are long random numbers needed to encrypt and decrypt a message. The public key is available to
all but the private key is confidential to its owner.
Symmetric is where the same key is used to encrypt and decrypt. Requires both parties to have a copy of the key.
The key can’t be safely transmitted over the Internet as an eavesdropper monitoring the message may see it.
Asymmetric is where different keys are used to encrypt and decrypt using public/private keys so it is more
secure.
Different uses of hashing
Hashing Algorithm is used to transform data into a fixed-length value e.g. network passwords are stored in
abbreviated (hashed) form. Has a low chance of collision, is quicker to calculate and compare than a bitwise
comparison and provides a smaller output than input. It is difficult to regenerate the original from the hash
value but easy to check a value against it, although it is still vulnerable to brute force attacks.
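The password use case can be sketched with Python's standard hashlib module. This is an illustration only: the salt value, the sample password and the helper names are assumptions, and SHA-256 is used here for simplicity where a production system would normally choose a deliberately slow password hashing scheme.

```python
import hashlib

SALT = "x9"   # hypothetical salt, stored alongside the hash


def hash_password(password, salt=SALT):
    """Return a fixed-length hexadecimal digest of the salted password."""
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()


stored_hash = hash_password("computer")   # only the hash is stored, never the password


def check_password(attempt):
    """Hash the attempt and compare it with the stored hash."""
    return hash_password(attempt) == stored_hash
```

Checking an attempt is quick, but recovering "computer" from stored_hash would mean trying candidate passwords one by one, which is the brute force attack mentioned above.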

1.3.2 Databases
RDB, FFDB, PK, FK, SK, entity relationship modelling, normalisation and indexing
• DB is a structured non-volatile store of data for ease of processing by allowing data to be retrieved
quickly, updated easily and filtered for different views.
• RDB is based on linked tables to avoid data duplication, inconsistency, redundancy and improve data
integrity and security. Easier to add and change data, data format and control access.
• FFDB is a simple data structure that is easy to maintain but of limited use due to data inconsistency,
duplication and redundancy
• PK is a unique identifier used to identify every record in a table.
• FK is when a PK in one table is used as an attribute in another. Provides links between tables and
represents 1-m relationship to avoid data duplication.
• SK is an alternative index that takes up extra space in a database. It allows tables to be sorted and
searched quickly and differently from the PK based on different attributes. When data tables changed,
indexes rebuilt.
• ERD is necessary when planning a relational database using diagrams to show relationships between
tables. Helpful in reducing redundancy.

One-to-One (1-1) relationship One-to-Many (1-m) relationship Many-to-Many (m-m) relationship

Methods of capturing, selecting, managing and exchanging data


Serial File is where records are stored chronologically and new records are always appended to existing records.
Simple short file easy to implement as adding and searching records easy but slow as all preceding records must
be searched to access a record.
Sequential File is a serial file where data is sorted according to a key field.
Indexed Sequential File is where records are sorted according to a PK. A separate index is kept allowing groups
of records to be accessed directly and quickly. Data needs to be inserted in the correct position and indexes
updated to keep in sync with data. The file is difficult to manage but accessing specific records faster as
sequential search from the beginning is not needed. More space efficient so suited to large files.
DBMS is software that:
• Creates, maintains and handles complexities of managing a database
• Provides UI, different views of data for different users, security features
• Means to create queries, views, tables, interfaces, outputs
• Finds, adds and updates data
• Maintains indexes, enforces data integrity rules, manages access rights and uses SQL to communicate
with other programs.
Query – Isolates and displays subset of data e.g. QBE
Normalisation to 3NF (Third Normal Form)
Normalisation is a formal and methodical process that resolves many-many relationships to design data
tables optimally by going through distinct stages leading to at least 3NF. Minimises repetition to reduce data
redundancy and ensure all attributes in a table depend on one another to avoid the need to update multiple
data entries when changing a single attribute to reduce chances of mistakes.
1NF: Separates out multiple items/sets of data in a row.
2NF: Removes data items occurring in multiple rows into a new table linked by repeated fields.
3NF: Non-key dependencies removed to their own linked tables so every field/attribute definable by key.

SQL (Structured Query Language)


SQL is a declarative database language that allows the creation, interrogation and alteration of a database.
Structure  Example Instruction     Explanation
SELECT     SELECT "X"              Extracts data from field "X"
FROM       FROM "tblX"             Lists data from table "tblX"
WHERE      WHERE "Y" = 'X'         Adds a condition so only rows where "Y" equals 'X' are listed
LIKE       "Y" LIKE 'X'            Matches "Y" when it is similar to the pattern 'X'
AND        "X" AND "Y" < 1         Both "X" and "Y" must be < 1
OR         "X" OR "Y" > 0          Either "X" or "Y" must be > 0
DELETE     DELETE FROM "tblX"      Removes rows of data from table "tblX"
INSERT     INSERT INTO "tblX"      Adds data to table "tblX"
DROP       DROP TABLE "tblX"       Removes table "tblX"
JOIN       JOIN "tblX" ON "tblY"   Joins tables "tblX" and "tblY" to return linked data
*          SELECT * FROM "tblY"    Extracts everything from table "tblY"
%          "X" LIKE '%Y'           Wildcard for zero or more characters, so matches values of "X" ending in Y
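The statements in the table can be tried out with Python's built-in sqlite3 module. The sketch below is illustrative only: the tables tblClass and tblStudent, their fields and the sample rows are hypothetical, and SQLite's dialect may differ in small ways from other database systems.

```python
import sqlite3

con = sqlite3.connect(":memory:")   # temporary in-memory database
cur = con.cursor()

# Two linked tables: ClassID is the PK of tblClass and an FK in tblStudent.
cur.execute("CREATE TABLE tblClass (ClassID INTEGER PRIMARY KEY, Tutor TEXT)")
cur.execute(
    "CREATE TABLE tblStudent (StudentID INTEGER PRIMARY KEY, Name TEXT, "
    "ClassID INTEGER REFERENCES tblClass(ClassID))"
)
cur.executemany("INSERT INTO tblClass VALUES (?, ?)",
                [(1, "Mr Hopper"), (2, "Ms Lovelace")])
cur.executemany("INSERT INTO tblStudent VALUES (?, ?, ?)",
                [(1, "Sam", 1), (2, "Sara", 2), (3, "Tom", 1)])

# SELECT ... FROM ... WHERE ... LIKE with the % wildcard.
names_starting_sa = [row[0] for row in cur.execute(
    "SELECT Name FROM tblStudent WHERE Name LIKE 'Sa%' ORDER BY Name")]

# JOIN returns linked data from both tables.
class_one = cur.execute(
    "SELECT tblStudent.Name, tblClass.Tutor FROM tblStudent "
    "JOIN tblClass ON tblStudent.ClassID = tblClass.ClassID "
    "WHERE tblClass.ClassID = 1 ORDER BY tblStudent.Name").fetchall()

# DELETE removes matching rows.
cur.execute("DELETE FROM tblStudent WHERE Name = 'Tom'")
remaining = cur.execute("SELECT COUNT(*) FROM tblStudent").fetchone()[0]
```

Each query states what data is wanted rather than how to find it, which is what makes SQL declarative.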
Referential integrity
Referential Integrity – Keeping a database in a consistent state. Enforced by DBMS so data changed in
one table considers data in linked tables e.g. can’t delete data linked to existing data in another table.

Transaction processing, ACID, record locking and redundancy

Transactions are changes in the state of a database e.g. addition, deletion, alteration of data. They must
conform to the ACID rules.
• Atomicity: Succeed or fail but never partially succeed.
• Consistency: Only changes database according to rules of database.
• Isolation: Each transaction protected against others concurrently being processed.
• Durability: Change preserved no matter what happens.
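Atomicity and consistency can be demonstrated with a small sqlite3 sketch in Python. The table tblAccount, the account names and the rule that a balance may not go negative are assumptions invented for this example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Database rule (consistency): a balance may never be negative.
con.execute("CREATE TABLE tblAccount (Name TEXT PRIMARY KEY, "
            "Balance INTEGER CHECK (Balance >= 0))")
con.executemany("INSERT INTO tblAccount VALUES (?, ?)", [("A", 100), ("B", 0)])
con.commit()


def transfer(amount):
    """Move money from A to B as one transaction; return True if it committed."""
    try:
        with con:   # commits if the block succeeds, rolls back if it raises
            con.execute("UPDATE tblAccount SET Balance = Balance + ? "
                        "WHERE Name = 'B'", (amount,))
            con.execute("UPDATE tblAccount SET Balance = Balance - ? "
                        "WHERE Name = 'A'", (amount,))
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(con.execute("SELECT Name, Balance FROM tblAccount"))


first_ok = transfer(30)     # both updates succeed
second_ok = transfer(500)   # second update breaks the rule, so the first is undone too
```

After the failed transfer the balances are the same as before it started: the credit to B does not survive on its own, so the transaction never partially succeeds.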

Record Locking – Prevents simultaneous access to objects in a database to prevent updates being lost or
inconsistencies in data. Record locked whenever a user retrieves it for editing or updating so others are denied
access until the transaction is completed or cancelled. Can cause deadlocking.
Data Redundancy – Unnecessary repetition of data that leads to inconsistencies and space wasted but allows data
to be recoverable if part of the database is lost. Provided by RAID setup or mirroring servers.
1.3.3 Networks
Characteristics of networks and the importance of protocols and standards
Bus Topology – Nodes attached to a single backbone. Vulnerable to breakages and prone to data collisions so
uncommon.
Ring Topology – Each node connects to exactly 2 other nodes. Data frames are sent in one direction minimising
collisions but is easily disrupted.
Star Topology – Layout for most networks characterised by a separate physical link from each node to a switch or
hub so resilient.
Protocol is a set of rules governing network communication between devices e.g. TCP/IP developed for the
Internet.
Private Network
Pros: Security, control of access, confidence of availability
Cons: Need specialist staff, organised backups, security
MAC Address is a 48-bit address which is a unique identifier associated with the network interface. Provides
addressing capability in a network usually assigned by the manufacturer.
IP Address is a numerical address made of 4 numbers each between 0 and 255 (IPv4) or 32 hexadecimal digits
(IPv6) that uniquely identifies a device on a network and is a logical identifier used to route messages.
• Static Addressing – IP addresses assigned permanently to a device.
• DHCP – IP addresses automatically assigned as needed.
• Subnet – Have own IP addresses to conserve addresses.
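Python's standard ipaddress module can be used to inspect addresses and subnets. In this sketch the specific addresses are arbitrary examples taken from private and documentation ranges, not addresses mentioned in these notes.

```python
import ipaddress

v4 = ipaddress.ip_address("192.168.1.10")   # IPv4: four numbers, each 0 to 255
v6 = ipaddress.ip_address("2001:db8::1")    # IPv6: written in hexadecimal

octets = [int(part) for part in str(v4).split(".")]

# exploded writes out every group in full: 8 groups of 4 hex digits.
hex_digits = v6.exploded.replace(":", "")

# A subnet groups a range of addresses under one network prefix.
subnet = ipaddress.ip_network("192.168.1.0/24")
in_subnet = v4 in subnet
```

The /24 prefix fixes the first three numbers, leaving 256 possible addresses in the subnet.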
The internet structure

DNS – Hierarchical system for naming resources on a network providing human readable equivalents to IP
addresses. Domain name sent to DNS servers which map to IP address and if server can’t resolve it passes request
recursively to another server which sends IP address to browser so it can retrieve website hosted from server.
TCP/IP – Suite of protocols cover data formatting, addressing, routing and receiving. Equivalent to layers
7,4,3,2 of OSI model.

Layering is a form of abstraction that divides a complex system into its component parts to allow work to
be done piecemeal. Efficient in problem solving as each problem is dealt with in isolation. Used in networks
as each layer communicates only with the adjacent layers.
OSI Model – Non-proprietary network model provides 7 layers where top layers are closer to the user and
the bottom layers are closer to physical transmission.
Layer  Name          Function
7      Application   Collecting, packaging and delivering data to and from users
6      Presentation  Data conversions
5      Session       Manages connections
4      Transport     Packetizing, checking, establishing and terminating connections via routers
3      Network       Transmission of packets, routing from sender to recipient, IP addressing and direction of datagrams
2      Data Link     Access control, error detection and correction, passing datagrams to physical devices
1      Physical      Network devices and media that provides connections
WAN is a network over a geographically remote distance. Uses a modem to act as a gateway and different forms of
communication media supplied by a 3rd party e.g. telecoms. Data is subject to interception and computers have
own peripherals instead of sharing.
LAN is a group of shared computer devices connected over small geographical area using hardwired/wireless
communication. Infrastructure is owned by the network owner so more secure and requires no extra communication
devices.
SAN is block level storage where devices are consolidated so not visible to users.
PAN is linked personal devices.
Data Packet is an equal sized block of data where the structure is defined by the protocol used. 3 parts:
Header – sender/receiver IP address, protocol, packet number and order. Payload - Data transmitted. Trailer
– End of packet marker and error correcting codes.
Packet Switching is connectionless mode. The message is divided into packets which take the most convenient
route as no established route from source to destination. More secure as the whole message is hard to intercept
when packets take different routes and avoids message failure if route disrupted. Allows efficient use of a
network as each channel only used for a short time. Packets may arrive out of order so must be reordered at
destination and message only as fast as slowest packet. Error checking promotes successful transmission so used
by TCP/IP.
Circuit Switching has 3 phases: connection establishment, data transmission and connection termination.
Connection mode so devices remain connected for duration of data transmission. Route established so all packets
sent down circuit on same route but message can be intercepted if route tapped into. Packets remain in correct
order but must be reassembled. Ties up large areas of a network so no other data can use any part of the circuit
until transmission complete.
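The reordering step in packet switching can be sketched in Python. The dictionary layout of a packet (number, total, payload) and the function names are assumptions for illustration; real packets also carry addresses and error checking data, and this sketch does not pad the final packet to equal size.

```python
import random


def to_packets(message, size):
    """Split a message into numbered packets of at most `size` characters."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [{"number": n, "total": len(chunks), "payload": chunk}
            for n, chunk in enumerate(chunks)]


def reassemble(packets):
    """Sort packets by their number from the header and join the payloads."""
    ordered = sorted(packets, key=lambda packet: packet["number"])
    if not ordered or len(ordered) != ordered[0]["total"]:
        raise ValueError("missing packets")
    return "".join(packet["payload"] for packet in ordered)


packets = to_packets("Packets may arrive out of order", 8)
random.Random(1).shuffle(packets)   # simulate packets taking different routes
received = reassemble(packets)
```

Because every packet carries its own number, the destination can rebuild the message whatever order the packets arrive in.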
Network security and threats, use of firewalls, proxies and encryption

Firewall – Application controls traffic in and out of a network to prevent access to system by unauthorised
sources.
Proxy – Computer placed between network and remote resource that intercepts traffic and isolates network from
the outside world.
SSL – Protocol that enables encrypted links between computer systems to stop 3rd party access.
Network hardware
NIC – Hardware that generates and receives electrical signals. Works at physical and data link layers.
Router – Hardware that connects networks and forwards data packets. Works at network layer.
WAP – Hardware usually connected to router. Works at data link layer.
Client-server and peer to peer (P2P)
Client Server – Client computer requests service from high end computer server that provides services e.g. file
and print, web, email, data processing and storage. Client code less complex so can be implemented on multiple
platforms. Servers can be upgraded to fix security problems and provide more features.
P2P – All computers have equal status so can act as client or server or both. Useful on Internet as traffic can
avoid servers so is private. Doesn’t rely on company servers’ connection to the Internet so faster. No need to
buy expensive hardware or bandwidth so cheaper and system is more fault tolerant.
1.3.4 Web Technologies
HTML (Hypertext Mark-up Language), CSS (Cascading Style Sheets) and JavaScript
HTML is the standard for making webpages and text files using tags to attach to items affecting how they are
rendered.
CSS determines how tags affect objects. Promotes consistent and simpler HTML as an external CSS file is
used to store style info instead of embedding it into a static HTML file directly. Can keep the content and the
formatting separate to standardise the appearance and behaviour of webpages. Affects the whole site when
changes are made saving time as they are not rewritten for every page ensuring consistency, used in
multiple HTML files, cached by browser so the site is quicker to access, changed for different themes or
devices and allows different display characteristics of the same webpage on different platforms.
JavaScript is a scripting programming language that requires a runtime environment e.g. a browser to
provide the necessary objects and methods. Embedded in HTML with the script tag to add functionality and
interactivity to a page e.g. validation, animation, loading new content. Is interpreted as it is likely to run on a
variety of machines with different architectures, so the high level code is machine independent but runs slowly.
Used on the client side to reduce unnecessary load on a server but can be amended and circumvented so
checks are also carried out on the server side.

Search Engine Indexing (SEI)
Search Engine – Software program finds info on the web, builds and searches indexes using content e.g. metatags
and various algorithms, supports many languages and improvements made have allowed success with misspelled
searches.
Browser – Software that renders/displays HTML pages. Is useful to find web resources by accepting URLs and
following links. Used on private networks.
SEI – Process of collecting and storing data from websites so search engines can quickly match content against
search terms.

PageRank Algorithm (PRA)


PRA ranks pages according to their usefulness by considering the number of sites linking to the site, the
PageRank of the linking sites and the number of outward links from those linking sites. Iteratively calculates
the importance of each website so a link from a website with high importance counts for more than a link from
one of low importance. The more links to a site, the higher it ranks.
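The iterative calculation can be sketched in Python. This is a simplified illustration, not the algorithm as any search engine runs it: the damping factor of 0.85, the fixed 50 iterations, the function name pagerank and the four-page link graph are all assumptions, and every page in the example has at least one outward link.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    rank = {page: 1 / len(pages) for page in pages}   # start with equal importance
    for _ in range(iterations):
        new_rank = {}
        for page in pages:
            # Each linking page shares its own rank equally between its outward links.
            incoming = sum(rank[other] / len(links[other])
                           for other in pages if page in links[other])
            new_rank[page] = (1 - damping) / len(pages) + damping * incoming
        rank = new_rank
    return rank


# Hypothetical web of four pages: A, B and D all link to C; C links back to A.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
```

Page C ends up with the highest rank because three pages link to it, while D, which nothing links to, ends up with the lowest.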
Server and client side processing
Server Side processing takes place on the webserver.
Server Side Pros
• Essential for data security as data is sent from the browser to the server which processes it and sends the
output back to the browser
• Takes away the reliance of a browser having the correct interpreter
• Hides the code from the user to protect copyright and avoid it being amended
Server Side Cons
• Puts extra load on the server costing the company hosting the website
Is best used where processing is integral e.g. generating content and accessing data including secure data
so any data passed must be checked carefully.
Client Side processing takes place in the web browser.
Client Side Pros
• Reduces data traffic and load on the server so it can do more processing
• Gives quick feedback to the user
• Sends better quality data to the server
• Doesn’t require the data to be sent back and forth
• Code is more responsive
Client Side Cons
• Code is visible so can be copied
• Browser may not run the code as it doesn’t have the capability or the user intentionally disabled client code

Is best used when no critical code runs and where a quick response is needed e.g. games, validation.

1.4 Data types, data structures and algorithms


1.4.1 Data Types
Primitive data types
Type                 Description                                   Example
Integer              Whole number values with no decimal part      1, -7, 152
Real/Floating Point  Numbers with decimal or fractional parts      1.23, -18.63, 3.14159
Character            Single letter, digit, symbol or control code  R, f, 6, @
String               A string of alphanumeric characters           Cat, Gh8, ^6*(k
Boolean              One of two values                             True or False
How character sets (ASCII and UNICODE) are used to represent text
Character Set
Normally equates to the symbols on the keyboard that are represented by the computer by unique binary
numbers and may include control codes. Number of bits used for one character is 1 byte, number of
characters tend to be a power of 2 and uses more bits for an extended set.

ASCII is where each character of the alphabet, special symbols and control codes are represented by agreed
binary patterns. Standard ASCII uses 7 bits (128 characters) and extended ASCII uses 8 bits, so the number of
characters is limited to 256 and it is impossible to display a wide range for other alphabets or symbol sets.
UNICODE was originally a 16-bit coding system that is now updated so it continuously grows as it is not a fixed
size set. Assigns a unique code to all the possible symbols available in the world using a series of code pages
representing the chosen language symbols. Original ASCII representations are included with the same numeric
values.
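Python's built-in ord, chr and encode functions show how characters map to numbers. The sample characters and the helper name fits_in_ascii are chosen purely for illustration.

```python
code_a = ord("A")                           # numeric code for "A"
bits_a = format(code_a, "08b")              # the same code as an 8-bit binary pattern
ascii_codes = list("Cat".encode("ascii"))   # one byte per character

euro_code_point = ord("€")                  # Unicode code point, well outside ASCII
euro_utf8 = "€".encode("utf-8")             # UTF-8 needs several bytes for it


def fits_in_ascii(text):
    """Return True if every character in text has an ASCII code."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False
```

The letter A has the same numeric value in ASCII and Unicode, while the euro sign only exists in the larger Unicode set.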
1.4.2 Data Structures
Arrays, records, lists, tuples
Array is a static data structure of the same data type grouped under a single identifier. Individual items can be
accessed directly using an index. Contents are stored contiguously in memory and can be multidimensional.
Record is a data store organised by attributes.
List is a data store organised by an index.
Tuple is an immutable list that cannot be modified once set up.
linked-lists, graphs, stacks, queues, trees, binary search trees (BSTs), hash tables

Linked List – Dynamic data structure uses index values and pointers to sort list in specific way and organise
on more than one category. Data is added in next available space and pointers updated accordingly. Item
removed by setting the pointer of the previous item to the pointer value of the item being removed, effectively
bypassing the removed item. Needs to be traversed until desired element found. Contents may not be stored
contiguously.
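A minimal singly linked list in Python illustrates adding, traversing and bypassing a removed item. The class and method names are assumptions for this sketch, and Python object references stand in for the pointers.

```python
class Node:
    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node       # pointer to the next node, or None at the end


class LinkedList:
    def __init__(self):
        self.head = None            # start pointer

    def add(self, data):
        """Follow the pointers to the last node and attach a new node there."""
        new_node = Node(data)
        if self.head is None:
            self.head = new_node
            return
        current = self.head
        while current.next is not None:
            current = current.next
        current.next = new_node

    def remove(self, data):
        """Bypass the first node holding data; return True if one was removed."""
        previous, current = None, self.head
        while current is not None and current.data != data:
            previous, current = current, current.next
        if current is None:
            return False            # item not found
        if previous is None:
            self.head = current.next        # removing the first node
        else:
            previous.next = current.next    # previous now points past the removed node
        return True

    def to_list(self):
        """Traverse from the head, collecting each item in order."""
        items, current = [], self.head
        while current is not None:
            items.append(current.data)
            current = current.next
        return items
```

Removing an item only changes one pointer; no other data has to move, which is the advantage over deleting from the middle of an array.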
Graph – Collection of data nodes with edges between them that can be directional/bi-directional,
directed/undirected or weighted/unweighted. Represented by an adjacency matrix.
Stack – Dynamic LIFO data structure uses 2 pointers: top and bottom. Top is the stack pointer and data is
added and removed from the stack using PUSH and POP commands.
Queue – Dynamic FIFO data structure uses 2 pointers: start and end. Start is the queue pointer; data is
added to the end with PUSH (enqueue) and removed from the start with POP (dequeue).
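The difference between LIFO and FIFO behaviour can be seen with a Python list used as a stack and a collections.deque used as a queue. The variable names and sample items are illustrative only.

```python
from collections import deque

# Stack: items are pushed onto and popped from the same end (the top).
stack = []
for item in ["a", "b", "c"]:
    stack.append(item)       # PUSH
top = stack.pop()            # POP removes the most recently added item

# Queue: items are added at the end and removed from the start.
queue = deque()
for item in ["a", "b", "c"]:
    queue.append(item)       # add to the end
first = queue.popleft()      # remove from the start: the oldest item
```

The same three items go in, but the stack gives back the last one added and the queue gives back the first.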
Tree – Branching data structure consists of nodes that have children. Root node is at the start of the tree
and the children are further down with branches joining the nodes. Depth first requires less memory than
breadth first and is quicker if looking at the deep parts of the tree. Depth first isn’t guaranteed to find the
quickest solution and may not find one at all if precautions are not taken to avoid revisiting previously
visited states.
BST – Each node has at most 2 children. If the child node is less than the parent node it goes to the left of the
parent and if it is greater it goes to the right. Data is added by placing it at the end of the list in the first
available space and added to the tree following the same rules. Depth first (post-order) visits all nodes to the
left of the root then the right of the root and then the root node itself and repeats for each node visited.
Breadth first visits the root then its children then all the children of the children and repeats for each child
visited.
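The insertion rule and both traversals can be sketched in Python. The names BSTNode, insert, post_order and breadth_first and the sample values are assumptions for this example; sending equal values to the right is also an assumption, since the notes only cover less than and greater than.

```python
from collections import deque


class BSTNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(node, value):
    """Insert value below node: smaller values go left, others go right."""
    if node is None:
        return BSTNode(value)
    if value < node.value:
        node.left = insert(node.left, value)
    else:
        node.right = insert(node.right, value)
    return node


def post_order(node):
    """Depth first: left subtree, then right subtree, then the node itself."""
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]


def breadth_first(root):
    """Visit the root, then its children, then their children, level by level."""
    order = []
    queue = deque([root] if root is not None else [])
    while queue:
        node = queue.popleft()
        order.append(node.value)
        for child in (node.left, node.right):
            if child is not None:
                queue.append(child)
    return order


root = None
for value in [50, 30, 70, 20, 40, 60, 80]:
    root = insert(root, value)
```

The depth first traversal here uses recursion (an implicit stack) while the breadth first traversal needs an explicit queue holding a whole level at a time, which is where the difference in memory use comes from.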
Hash Table – Enables access to data not stored in a structured manner.
Hash Function – Generates address in a table for data that can be recalculated to locate that data.
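A small hash table can be sketched in Python to show the hash function recalculating an address. Everything here is illustrative: the class name, the table size of 8, the simple character-sum hash function and the use of lists to hold colliding items are assumptions, not a description of how Python's own dictionaries work.

```python
class HashTable:
    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]   # each address holds a list of items

    def _address(self, key):
        """Hash function: the same key always produces the same table address."""
        return sum(ord(ch) for ch in key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._address(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)           # key already stored: overwrite
                return
        bucket.append((key, value))                # new key (may share the address)

    def get(self, key):
        """Recalculate the address and search only that bucket."""
        for existing_key, value in self.buckets[self._address(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)
```

The keys "cat" and "act" contain the same letters, so this simple hash function sends them to the same address; storing both in that bucket is one way of handling the collision.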

1.4.3 Boolean Algebra


Boolean algebra
The AND gate (Conjunction) – The output is true if both inputs are true. Common notation: A ∧ B
The OR gate (Disjunction) – The output is true if either or both inputs are true. Common notation: A ∨ B
The NOT gate (Negation) – The output is the negative of the input. Common notation: ¬A
The NAND gate (Negative Conjunction) – The output is true if one or both inputs are false. Common notation: ¬(A ∧ B)
The NOR gate (Negative Disjunction) – The output is true if both inputs are false. Common notation: ¬(A ∨ B)
The XOR gate (Exclusive Disjunction) – The output is true if only one of the inputs is true. Common notation: A ⊻ B
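The gate definitions can be checked by generating their truth tables in Python. The GATES dictionary and the function names are assumptions made for this sketch.

```python
from itertools import product

# Each two-input gate as a function of Boolean inputs a and b.
GATES = {
    "AND": lambda a, b: a and b,
    "OR": lambda a, b: a or b,
    "NAND": lambda a, b: not (a and b),
    "NOR": lambda a, b: not (a or b),
    "XOR": lambda a, b: a != b,
}


def not_gate(a):
    return not a


def truth_table(gate):
    """Return (a, b, output) rows for every combination of two inputs."""
    return [(a, b, GATES[gate](a, b))
            for a, b in product([False, True], repeat=2)]
```

Comparing the tables shows that NAND and NOR are the AND and OR outputs inverted, and that XOR differs from OR only when both inputs are true.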

1.5 Legal, moral, ethical and cultural issues


1.5.1 Computing related legislation
The Data Protection Act (DPA) 1998
The purpose of the Act is to control the storage of data about individuals. There are 8 provisions that include
requirements that data should be:
1. Processed fairly and lawfully
2. Only used for the purpose specified to the Information Commissioner and should not be disclosed to other
parties without the necessary permission
3. Relevant and not excessive
4. Accurate and up to date
5. Only kept for as long as necessary
6. Allowed to be accessed by the data subjects to check and update it if necessary
7. Secured by the data controller to prevent unauthorised access to it
8. Not transferred outside the EU unless the country has adequate data protection legislation

Any data processed in relation to the following is exempt from the Act:
• National security
• Crime and taxation
• Domestic purposes

The Computer Misuse Act (CMA) 1990
Under this Act it is an offence to make unauthorised access to computer material:
• with intent to commit or facilitate commission of further offences
• with intent to impair, or with recklessness as to impairing, operation of computer e.g. distributing viruses

This law is aimed at illegal hackers who exploit weaknesses and attack systems to access data e.g.
usernames, passwords, personal info. Denial of service attacks send requests from multiple users or
bots to disrupt a service for political reasons or to blackmail the service owner. Features such as digital
signatures or certificates use encrypted messages to confirm the identity of the sender.
The Copyright, Designs and Patents Act (CDPA) 1988
This Act protects intellectual property so it is illegal to copy, modify or distribute software or other intellectual
property without permission from the copyright holder as it prevents the developer from receiving some or all
earnings for their work.
The Regulation of Investigatory Powers Act (RIPA) 2000
This Act is about criminal and terrorist use of the internet so it gives certain organisations the right to:
• Demand ISPs provide access to customer communications
• Have mass surveillance of communications
• Demand ISPs fit equipment to facilitate surveillance
• Demand access be granted to protected information
• Monitor an individual’s internet activities
• Prevent existence of such interception activities being revealed in court.
Communications Act 2003 makes it illegal to steal Wi-Fi access or send offensive messages or posts.
Equality Act 2010 makes it illegal to discriminate against individuals by not providing a means of access to
a service for a section of the public so web service providers must make websites more accessible by
including features e.g. screen readers, magnifier options, larger fonts, image tagging, contrasting colours,
transcripts of sound tracks or subtitles.

2.2 Problem solving and programming


2.2.1 Programming Techniques
Programming constructs
Sequence – All instructions executed once in the order in which they appear.
Sequence example
    Input X
    Input Y
    Z = X - Y
Iteration – Group of instructions repeated for set number of times or until condition.

FOR-NEXT LOOP – Controlled by an automatic counter so number of iterations are fixed according to the start and
end values of a variable set at the beginning. Can count in steps other than 1 or even backwards.
Example
    for i=0 to 5
        print(i)
    next i
WHILE-ENDWHILE LOOP – Controlled at entry point so condition is tested before each iteration and statements will
execute if true. The statements may never execute if the condition is false at the start. Useful when number of
iterations needed are unknown.
Example
    answer=""
    while answer!="computer"
        answer=input("Enter your password?")
    endwhile
REPEAT-UNTIL LOOP – Controlled at exit point so must execute at least once.
Example
    count=1
    repeat
        print(count)
        count=count+1
    until count=5
Selection – Condition used to determine which statements are executed so as a result some may not.
SELECT CASE – Allows branching on multiple values of same variable. Can have a default option to make
code easier to read and write. Can avoid multiple nested IFs to avoid numerous repeats of similar conditions.
IF statement example
    if entry=="A" then
        print("You chose A")
    elseif entry=="B" then
        print("You chose B")
    else
        print("Unrecognised choice")
    endif
SELECT CASE example
    select entry:
        case "A":
            print("You chose A")
        case "B":
            print("You chose B")
        default:
            print("Unrecognised choice")
    endselect
Recursion
Function calls itself from within so the original call is halted until the subsequent calls return. Must eventually reach a stopping condition (base case). Code is shorter so can be more natural to read and quicker to write. Some functions are naturally recursive so are suited to certain problems e.g. traversing trees. Reduces the size of the problem with each call but each call creates new variables so uses memory inefficiently. Can run out of stack space causing it to crash; this can be avoided with tail recursion where the translator optimises tail calls. Difficult to trace as each frame on the stack has its own set of variables. Usually slower than iterative methods due to stack maintenance.
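A minimal Python sketch of the recursion points above, assuming factorial as the example problem (the function names are illustrative and not from these notes). The recursive version reduces the problem size on each call until the stopping condition is reached; the iterative version gives the same result without building up stack frames.

```python
def factorial_recursive(n: int) -> int:
    # Stopping condition (base case): no further calls are made
    if n <= 1:
        return 1
    # Each call works on a smaller problem and waits for the next call to return
    return n * factorial_recursive(n - 1)


def factorial_iterative(n: int) -> int:
    # Same result using a loop, so only one set of variables is needed
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result
```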
Global and local variables
Global Variable
Declared outside subprograms so visible throughout program. Allows data to be shared between modules. Value is accessible from various parts of the program as it is the same wherever accessed e.g. today's date, VAT rate, π. Difficult to integrate as program complexity increases and may cause conflicts with names in other modules that may be accidentally altered so is dangerous.
Local Variable
Declared inside and only visible inside subprogram. Can be used as parameters. Destroyed when subprogram exits. Helps make function reusable. Overrides global variable with same identifier and same variable name may be used in different modules e.g. loop counter without causing errors.
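A short Python sketch of the difference, assuming a VAT rate stored as a whole-number percentage and prices in pence (all names and values here are illustrative). It shows a global being read inside a subprogram and a local with the same identifier overriding it without altering the global.

```python
VAT_PERCENT = 20  # global: visible throughout the program


def price_with_vat(net_pence: int) -> int:
    # vat is local: it only exists while this function runs
    vat = net_pence * VAT_PERCENT // 100
    return net_pence + vat


def shadow_demo() -> int:
    # A local with the same identifier overrides the global inside this function only
    VAT_PERCENT = 5
    return VAT_PERCENT
```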
Modularity, functions and procedures, parameter passing by value (ByVal) and by reference (ByRef)
Subroutine is a subprogram that makes a program modular by accepting parameters so it can be used multiple times with different data. Each module is a small part of the problem so is easy to solve, test/debug independently and maintain. Part of the system can be updated without affecting the rest as the program will be well structured with clearly defined interfaces. Development can be shared between a team of programmers so the program is developed faster. Easier to monitor progress. Modules can be allocated according to expertise improving quality of final product. Different modules can be programmed in different languages suitable for the application. Reduces amount of code produced as code is reusable or standard library modules are used reducing the time of development.

Function is a subroutine which performs a specific task, uses local variables, returns a single value, is called
inline using an identifier as part of an expression and value returned replaces the function call.

Procedure is a subroutine which performs a specific task, uses local variables, may return a value, receives
parameter values, is executed when called from the parent program to be used as a statement and control is
passed back when complete. Most programming languages nowadays use functions.
Parameter is a description of an item of data used as a local variable which is supplied to a subroutine ByRef or ByVal. Is given an identifier when the subroutine is defined and is substituted by an actual value when called.
ByVal
Uses new memory space to make a copy of the variable passed into the procedure so changes only happen to the copy. No unforeseen effects occur in other modules.
ByRef
Uses existing memory space as the address of the value is passed into the procedure so changes are made to the original data.
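Python has no ByVal/ByRef keywords, so the sketch below only imitates the two behaviours with a list: making an explicit copy stands in for ByVal, and changing the caller's list in place stands in for ByRef. The function names are illustrative assumptions.

```python
def add_item_by_val(items: list, item) -> list:
    # Work on a copy, so the caller's list is left unchanged (ByVal-like)
    local_copy = list(items)
    local_copy.append(item)
    return local_copy


def add_item_by_ref(items: list, item) -> None:
    # Change the list the caller passed in (ByRef-like)
    items.append(item)


scores = [1, 2]
copied = add_item_by_val(scores, 3)   # scores is still [1, 2] here
add_item_by_ref(scores, 4)            # scores itself now has 4 appended
```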
Use of an IDE (Integrated Development Environment) to develop/debug a program
IDE is a program used for developing programs made from components that provides features for editing,
program building, version control, debugging tools, testing, compilation, translator diagnostics, breakpoints,
stepping and variable watches.
1. Debugging Tools - Allow inspection of variable values to allow run-time detection of errors. Code can
be examined as it is running allowing logical errors to be pinpointed. Can produce a crash dump
showing state of variables at point where an error occurs. Displays stack contents showing sequencing
through modules.
2. Break Points - Cause program to halt in execution to test it at strategic points, inspect state of current
values of variables and flow of control to determine logic errors and undeclared identifiers.
3. Stepping - Executing code one statement at a time observing path of execution, changes to variables to
determine runtime error positions and reasons. Can be used with break points or watches.
4. Translator Diagnostics - Reports syntax errors and suggests solutions to programmer who can then
correct error and translate again but sometimes error messages are in the incorrect place.
5. Variable Watch - Monitors status of variables and objects as it steps through code and causes program
to halt in execution if condition met e.g. variable changing.
2.2.2 Computational Methods
Computational methods
Stepwise Refinement/Problem Decomposition – Each module is defined in simple terms and then is split
into smaller sub-modules which are successively split until each is small enough to be programmed. Allows
divide and conquer and order of execution needs to be considered as data may need to be processed by
one module before another can use it and some may need to be accessible in an unpredictable way.
Abstraction – Process of separating ideas from reality by removing unnecessary complexity and detail, and using symbols to show real-life features, to make a representation of reality that allows a computational solution. Saves memory and computational resources, which increases response speeds, and does not detract from the main purpose of the program.
Backtracking – Strategy moves systematically towards solution by looking at progression through stages of
solving problem. If the pathway fails at some point it goes back to the last successful stage e.g. logic
problems, Prolog and repair strategies for fixing computers.
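A minimal Python sketch of backtracking, assuming a subset-sum puzzle with positive whole numbers as the example (the problem and the name find_subset are illustrative, not from these notes). Each number is tried in turn; if the pathway fails, the last choice is undone and the search resumes from the last successful stage.

```python
def find_subset(numbers, target, start=0, chosen=None):
    """Return a list of numbers (in original order) that add up to target, or None."""
    if chosen is None:
        chosen = []
    if target == 0:
        return list(chosen)          # solution reached
    for i in range(start, len(numbers)):
        if numbers[i] <= target:     # only follow pathways that could still work
            chosen.append(numbers[i])
            result = find_subset(numbers, target - numbers[i], i + 1, chosen)
            if result is not None:
                return result
            chosen.pop()             # pathway failed: go back to the last successful stage
    return None
```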
Data Mining – Process of searching through vast quantities of unconnected data which may be from
different databases for relationships between components not immediately obvious. Includes pattern
matching, anomaly detection, cluster and regression analysis. There may be no predetermined matching
criteria and brute force approach is possible with high speed computers. Used to plan for future eventualities,
business modelling, identify spam for email filters and purchasing habits.
Heuristics – Rule of thumb approach is used when unfeasible to analyse all eventualities. Leads to a good
enough result but is not reliable for life and death scenarios as is only based on experience. Useful with
many ill-defined variables and used to assess potential malware by behaviour.

Performance Modelling – abstraction performs virtual actions to predict behaviour before implementation if
it is expensive or dangerous. Makes use of existing data to make predictions and randomness built in where
real-life parameters not fully understood e.g. climate modelling.
Visualisation – Computer process presents data in an easy-to-grasp way as trends and patterns are better
comprehended in visual displays. Allows more creative models to be produced e.g. graphs and maps.

2.3 Algorithms
2.3.1 Algorithms
Big O notation
Shows highest order component with any constants removed to evaluate the complexity and worst-case scenario of an algorithm. Shows how time increases as data size increases to show limiting behaviour. Complexity measures how the time or space needed for an algorithm increases as data size increases.
• O(1) – constant complexity e.g. printing first letter of string.
• O(n) – linear complexity e.g. finding largest number in list.
• O(n^k) – polynomial complexity e.g. bubble sort, which is O(n^2).
• O(k^n) – exponential complexity e.g. travelling salesman problem.
• O(log n) – logarithmic complexity e.g. binary search.
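To make the growth rates concrete, the Python sketch below counts steps instead of timing anything. The three functions are illustrative stand-ins (assumptions, not algorithms from these notes) for linear, quadratic and logarithmic behaviour.

```python
def linear_steps(n: int) -> int:
    # One step per item: O(n)
    steps = 0
    for _ in range(n):
        steps += 1
    return steps


def quadratic_steps(n: int) -> int:
    # One step per pair of items: O(n^2)
    steps = 0
    for _ in range(n):
        for _ in range(n):
            steps += 1
    return steps


def halving_steps(n: int) -> int:
    # Problem size halves each time: O(log n)
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps
```

For 1024 items the linear count is 1024 steps but the halving count is only 10, while the quadratic count already reaches 1024 steps at just 32 items.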
Algorithms for the main data structures
Stack PUSH Operation
PROCEDURE AddToStack (item):
    IF top == max THEN
        stackFull = True
    ELSE
        top = top + 1
        stack[top] = item
    ENDIF
ENDPROCEDURE
Stack POP Operation
PROCEDURE DeleteFromStack (item):
    IF top == min THEN
        stackEmpty = True
    ELSE
        item = stack[top]
        top = top - 1
    ENDIF
ENDPROCEDURE
Queue PUSH Operation
//linear queue: front points to the first item and rear to the last item
PROCEDURE AddToQueue (item):
    IF rear == max THEN
        queueFull = True
    ELSE
        rear = rear + 1
        queue[rear] = item
    ENDIF
ENDPROCEDURE
Queue POP Operation
PROCEDURE DeleteFromQueue (item):
    IF front > rear THEN
        queueEmpty = True
    ELSE
        item = queue[front]
        front = front + 1
    ENDIF
ENDPROCEDURE
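The same operations as a runnable Python sketch, assuming fixed-size arrays with 0-based indexing and a simple linear (not circular) queue. The class and method names are illustrative; returning False or None stands in for setting the full and empty flags.

```python
class ArrayStack:
    def __init__(self, capacity: int):
        self.items = [None] * capacity
        self.top = -1                      # -1 means the stack is empty

    def push(self, item) -> bool:
        if self.top == len(self.items) - 1:
            return False                   # stack full
        self.top += 1
        self.items[self.top] = item
        return True

    def pop(self):
        if self.top == -1:
            return None                    # stack empty
        item = self.items[self.top]
        self.top -= 1
        return item


class ArrayQueue:
    def __init__(self, capacity: int):
        self.items = [None] * capacity
        self.front = 0                     # index of the first item
        self.rear = -1                     # index of the last item

    def enqueue(self, item) -> bool:
        if self.rear == len(self.items) - 1:
            return False                   # queue full (space is not reused)
        self.rear += 1
        self.items[self.rear] = item
        return True

    def dequeue(self):
        if self.front > self.rear:
            return None                    # queue empty
        item = self.items[self.front]
        self.front += 1
        return item
```

A stack returns items last in, first out; a queue returns them first in, first out.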
Search For Item In Linked List
FUNCTION SearchForItemInLinkedList ():
    Ptr = start value
    REPEAT
        Go to node(Ptr value)
        IF data at node == search item THEN
            OUTPUT AND STOP
        ELSE
            Ptr = value of next item Ptr at node
        ENDIF
    UNTIL Ptr = 0
    OUTPUT data item not found
ENDFUNCTION
Output Linked List In Order
FUNCTION OutputLinkedListInOrder ():
    Ptr = start value
    REPEAT
        Go to node(Ptr value)
        OUTPUT data at node
        Ptr = value of next item Ptr at node
    UNTIL Ptr = 0
ENDFUNCTION
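A Python sketch of both linked list routines, assuming the list is stored in two parallel arrays with index 0 left unused so that a pointer of 0 marks the end of the list, as in the pseudocode. The sample data and function names are illustrative.

```python
# Index 0 is unused so that a pointer value of 0 can mean "no next node"
data = [None, "cat", "ant", "dog"]
next_ptr = [0, 3, 1, 0]
start = 2  # logical order: ant -> cat -> dog


def traverse(start, data, next_ptr):
    """Return the data items in linked order."""
    output = []
    ptr = start
    while ptr != 0:
        output.append(data[ptr])
        ptr = next_ptr[ptr]      # follow the pointer to the next node
    return output


def search(start, data, next_ptr, target):
    """Return the index of the node holding target, or 0 if not found."""
    ptr = start
    while ptr != 0:
        if data[ptr] == target:
            return ptr
        ptr = next_ptr[ptr]
    return 0
```

A WHILE loop is used here instead of REPEAT-UNTIL so that an empty list (start of 0) is handled without visiting a node.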

Depth First (Post Order) Graph Traversal


FUNCTION dfs(graph, start):
    markAllVertices (notVisited)
    createStack()
    pushIntoStack(start)
    WHILE StackIsEmpty() == false
        currentNode = popFromStack()
        IF isVisited(currentNode) == false THEN
            markAsVisited(currentNode)
            OUTPUT currentNode
            //following sub-routine pushes all nodes connected to
            //currentNode AND that are unvisited
            pushUnvisitedAdjacents(currentNode)
        ENDIF
    ENDWHILE
ENDFUNCTION

Breadth First Graph Traversal


FUNCTION bfs(graph, start):
    markAllVertices (notVisited)
    createQueue()
    markAsVisited(start)
    pushIntoQueue(start)
    WHILE QueueIsEmpty() == false
        currentNode = popFromQueue()
        OUTPUT currentNode
        //following sub-routine marks as visited and pushes all nodes
        //connected to currentNode AND that are unvisited
        pushUnvisitedAdjacents(currentNode)
    ENDWHILE
ENDFUNCTION
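A Python sketch of both traversals, assuming the graph is held as a dictionary mapping each node to a list of its neighbours (the sample graph and function names are illustrative). The depth first version uses a stack and the breadth first version uses a queue; otherwise the structure is the same.

```python
from collections import deque

graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}


def dfs(graph, start):
    visited = []
    stack = [start]
    while stack:
        current = stack.pop()
        if current not in visited:
            visited.append(current)
            # Push in reverse so the first listed neighbour is explored first
            for neighbour in reversed(graph[current]):
                if neighbour not in visited:
                    stack.append(neighbour)
    return visited


def bfs(graph, start):
    visited = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in visited:
                visited.append(neighbour)   # mark when queued so it is not queued twice
                queue.append(neighbour)
    return visited
```

On the sample graph, depth first goes as far as it can along one branch (A, B, D, C) while breadth first visits all neighbours of A before moving further away (A, B, C, D).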

Standard algorithms

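Bubble Sort – Moves through the data repeatedly in a linear way, comparing adjacent items and using a temp element to swap them if they are in the wrong order. Is intuitive and easier to program, and suits small data sets as few changes are needed. Fast on nearly sorted data but worst-case O(n^2) so inefficient for large data sets.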

PROCEDURE BubbleSort (items):
    swapMade = True
    WHILE swapMade == True
        swapMade = False
        FOR position = 0 TO length(items) - 2
            IF items[position] > items[position + 1] THEN
                temp = items[position]
                items[position] = items[position + 1]
                items[position + 1] = temp
                swapMade = True
            ENDIF
        NEXT position
    ENDWHILE
    PRINT(items)
ENDPROCEDURE
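The same procedure as a runnable Python sketch (the function name is illustrative). It sorts the list in place and returns it instead of printing, and uses Python's tuple swap in place of the temp variable.

```python
def bubble_sort(items: list) -> list:
    swap_made = True
    while swap_made:                      # stop after a pass with no swaps
        swap_made = False
        for position in range(len(items) - 1):
            if items[position] > items[position + 1]:
                # Swap the adjacent pair that is in the wrong order
                items[position], items[position + 1] = items[position + 1], items[position]
                swap_made = True
    return items
```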
Insertion Sort – List of sorted numbers is built up with one number at a time being inserted into the correct position until all items are checked. It is a simple algorithm with best case O(n) and worst O(n^2) so is less efficient for large sets of data.

PROCEDURE InsertionSort (list):
    item = length(list)
    FOR index = 1 TO item - 1
        currentvalue = list[index]
        position = index
        WHILE position > 0 AND list[position - 1] > currentvalue
            list[position] = list[position - 1]
            position = position - 1
        ENDWHILE
        list[position] = currentvalue
    NEXT index
ENDPROCEDURE
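A direct Python translation of the procedure, as a sketch (the function name is illustrative). It sorts in place and returns the list for convenience.

```python
def insertion_sort(items: list) -> list:
    for index in range(1, len(items)):
        current_value = items[index]
        position = index
        # Shift larger items one place right to open a gap for current_value
        while position > 0 and items[position - 1] > current_value:
            items[position] = items[position - 1]
            position -= 1
        items[position] = current_value
    return items
```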
Binary Search – Discards half the data at each step so is faster as fewer items are checked and more efficient for large files but data must be in order to allow the appropriate items to be discarded. Doesn't benefit from an increase in speed with additional processors and can perform better on large data sets with one processor than linear search with many processors. Best case O(1), worst O(log2 n), average about log2(n) - 1 comparisons which is also O(log n).

FUNCTION BinarySearchRecursive (list, value, leftPtr, rightPtr):
    IF rightPtr < leftPtr THEN
        RETURN error message
    ENDIF
    mid = (leftPtr + rightPtr) DIV 2
    IF list[mid] > value THEN
        RETURN BinarySearchRecursive (list, value, leftPtr, mid-1)
    ELSEIF list[mid] < value THEN
        RETURN BinarySearchRecursive (list, value, mid+1, rightPtr)
    ELSE
        RETURN mid
    ENDIF
ENDFUNCTION

FUNCTION BinarySearchIterative (list, value, leftPtr, rightPtr):
    //loop ends when the pointers cross, so a missing item cannot loop forever
    WHILE leftPtr <= rightPtr
        mid = (leftPtr + rightPtr) DIV 2
        IF list[mid] > value THEN
            rightPtr = mid - 1
        ELSEIF list[mid] < value THEN
            leftPtr = mid + 1
        ELSE
            RETURN mid
        ENDIF
    ENDWHILE
    RETURN error message
ENDFUNCTION
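A Python sketch of the iterative version, assuming a list already sorted in ascending order. Returning -1 in place of the error message is an illustrative choice, as is the function name.

```python
def binary_search(items: list, value) -> int:
    left, right = 0, len(items) - 1
    while left <= right:
        mid = (left + right) // 2        # integer division gives the middle index
        if items[mid] > value:
            right = mid - 1              # discard the upper half
        elif items[mid] < value:
            left = mid + 1               # discard the lower half
        else:
            return mid
    return -1                            # pointers crossed: value is not present
```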
Linear Search – Slower for large sets of data as all data must be checked from the beginning until the item is found or the end of the list is reached. Doesn't need an ordered list and can have multiple processors searching different areas at the same time. Best O(1), worst O(n), average n/2 comparisons which is still O(n).

FUNCTION LinearSearch (list, value):
    Ptr = 0
    WHILE Ptr < length(list) AND list[Ptr] != value
        Ptr = Ptr + 1
    ENDWHILE
    IF Ptr >= length(list) THEN
        PRINT("Item is not in the list")
    ELSE
        PRINT("Item is at location "+Ptr)
    ENDIF
ENDFUNCTION
Dijkstra SPA – Finds shortest path between 2 nodes on a graph. Works by keeping track of the shortest
distance to each node from the starting node and continues until the destination node is found.

FUNCTION Dijkstra ():
    start node distance from itself = 0
    all other nodes distance from start node = infinity
    WHILE destination node = unvisited
        current node = closest unvisited node to start node
        // initially this will be the start node itself
        FOR every unvisited node connected to current node:
            distance = distance to current node + distance of edge to unvisited node
            IF distance < currently recorded shortest distance THEN
                currently recorded shortest distance = distance
            ENDIF
        NEXT connected node
        current node = visited
    ENDWHILE
ENDFUNCTION
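A Python sketch of the same idea, assuming the graph is a dictionary of dictionaries holding non-negative edge weights and that every node appears as a key. Using a priority queue (heapq) to pick the closest unvisited node is an implementation choice not stated in these notes; the sample graph and names are illustrative.

```python
import heapq

graph = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 1},
    "C": {"B": 2, "D": 5},
    "D": {},
}


def dijkstra(graph, start, destination):
    """Return the shortest distance from start to destination, or None if unreachable."""
    distances = {node: float("inf") for node in graph}
    distances[start] = 0
    visited = set()
    queue = [(0, start)]                      # (distance so far, node)
    while queue:
        dist, current = heapq.heappop(queue)  # closest unvisited node
        if current in visited:
            continue                          # stale entry for an already visited node
        visited.add(current)
        if current == destination:
            return dist
        for neighbour, weight in graph[current].items():
            new_dist = dist + weight
            if new_dist < distances[neighbour]:
                distances[neighbour] = new_dist   # record the new shortest distance
                heapq.heappush(queue, (new_dist, neighbour))
    return None
```

On the sample graph the shortest route from A to D goes A, C, B, D with a total distance of 4, beating the direct-looking A, C, D (6) and A, B, D (5).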

A* Algorithm – Improvement on Dijkstra SPA by using heuristic values to speed up the process of finding a solution. Decides which node to take next by f(x) = g(x) + h(x), where g(x) is the distance from the start and h(x) is the heuristic estimate of the distance left, which affects all adjoining nodes from the new node taken. Other nodes are compared again in future checks. It assumes the chosen node gives the shorter distance, but the adjoining nodes may not be on the shortest path so it may need to backtrack to previous nodes.

FUNCTION AStarSearch ():
    current node = start node
    WHILE destination node = unvisited
        FOR each unvisited node directly connected to the current node
            Add to the list of open nodes
            g = distance from the start
            h = heuristic estimate of the distance left
            f = g + h
        NEXT connected node
        current node = visited
        current node = open node with lowest f value
    ENDWHILE
ENDFUNCTION
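A Python sketch of A*, assuming a dictionary-of-dictionaries graph, a separate dictionary of heuristic estimates, and a heuristic that is consistent (it never overestimates the cost of a step). The sample values, the use of heapq for the open list and all names are illustrative assumptions.

```python
import heapq

graph = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 1},
    "C": {"B": 2, "D": 5},
    "D": {},
}
heuristic = {"A": 3, "B": 1, "C": 2, "D": 0}   # estimated distance left to D


def a_star(graph, heuristic, start, destination):
    """Return (path, cost) for the route found, or (None, None) if there is none."""
    g = {start: 0}                              # best known distance from the start
    previous = {}                               # used to rebuild the path
    open_nodes = [(heuristic[start], start)]    # (f value, node)
    visited = set()
    while open_nodes:
        _, current = heapq.heappop(open_nodes)  # open node with the lowest f
        if current in visited:
            continue
        if current == destination:
            path = [current]
            while current in previous:
                current = previous[current]
                path.append(current)
            return path[::-1], g[destination]
        visited.add(current)
        for neighbour, weight in graph[current].items():
            new_g = g[current] + weight
            if new_g < g.get(neighbour, float("inf")):
                g[neighbour] = new_g
                previous[neighbour] = current
                f = new_g + heuristic[neighbour]
                heapq.heappush(open_nodes, (f, neighbour))
    return None, None
```

With every heuristic value set to 0 this behaves like Dijkstra; a good heuristic lets it reach the destination while examining fewer nodes.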
Merge Sort – Uses divide and conquer to divide the data in a list into sub lists and sorts the split data as it puts it back together. Fast on all data sets but more efficient with a larger volume to sort. Is recursive so may require more memory space. Worst case O(n log n) so scales up well, unlike algorithms with O(k^n) complexity.

PROCEDURE MergeSort (listA, listB):
    //merge step: combines two sorted lists into one sorted list
    a = 0
    b = 0
    n = 0
    WHILE a < length(listA) AND b < length(listB)
        IF listA[a] < listB[b] THEN
            newlist[n] = listA[a]
            a = a + 1
        ELSE
            newlist[n] = listB[b]
            b = b + 1
        ENDIF
        n = n + 1
    ENDWHILE
    //copy any items left over in either list
    WHILE a < length(listA)
        newlist[n] = listA[a]
        a = a + 1
        n = n + 1
    ENDWHILE
    WHILE b < length(listB)
        newlist[n] = listB[b]
        b = b + 1
        n = n + 1
    ENDWHILE
ENDPROCEDURE
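A Python sketch showing the merge step together with the recursive splitting that the description mentions. The function names are illustrative; a new list is returned and the input is left unchanged.

```python
def merge(list_a: list, list_b: list) -> list:
    """Merge two sorted lists into one sorted list."""
    new_list = []
    a = b = 0
    while a < len(list_a) and b < len(list_b):
        if list_a[a] < list_b[b]:
            new_list.append(list_a[a])
            a += 1
        else:
            new_list.append(list_b[b])
            b += 1
    new_list.extend(list_a[a:])   # copy whatever is left over
    new_list.extend(list_b[b:])
    return new_list


def merge_sort(items: list) -> list:
    if len(items) <= 1:
        return list(items)        # a list of 0 or 1 items is already sorted
    mid = len(items) // 2
    # Divide into two halves, sort each, then merge the results
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))
```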

Quick Sort – Uses divide and conquer with 2 pointers and compares the numbers at the pointers then swaps if they are in wrong order and moves one pointer at a time. Alternative method uses a pivot. Quick for large sets of data but initial arrangement of data affects time taken and is harder to code. Average O(n log n), worst O(n^2).

PROCEDURE QuickSort (list, start, end):
    IF start >= end THEN
        RETURN
    ENDIF
    leftPtr = start
    rightPtr = end
    WHILE leftPtr != rightPtr
        WHILE list[leftPtr] <= list[rightPtr] AND leftPtr != rightPtr
            leftPtr = leftPtr + 1
        ENDWHILE
        temp = list[leftPtr]
        list[leftPtr] = list[rightPtr]
        list[rightPtr] = temp
        WHILE list[leftPtr] <= list[rightPtr] AND leftPtr != rightPtr
            rightPtr = rightPtr - 1
        ENDWHILE
        temp = list[leftPtr]
        list[leftPtr] = list[rightPtr]
        list[rightPtr] = temp
    ENDWHILE
    //item where the pointers meet is now in its final position
    QuickSort (list, start, leftPtr - 1)
    QuickSort (list, leftPtr + 1, end)
ENDPROCEDURE
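A Python sketch of the two-pointer method, assuming an in-place sort with the recursive calls made on the sub lists either side of the point where the pointers meet. The function name and default arguments are illustrative. The comparison uses <= so that equal items do not cause the pointers to stall.

```python
def quick_sort(items: list, start: int = 0, end: int = None) -> list:
    if end is None:
        end = len(items) - 1
    if start >= end:
        return items                      # 0 or 1 items: nothing to sort
    left, right = start, end
    while left != right:
        # Move the left pointer while the pair is in the right order
        while items[left] <= items[right] and left != right:
            left += 1
        items[left], items[right] = items[right], items[left]
        # Now move the right pointer while the pair is in the right order
        while items[left] <= items[right] and left != right:
            right -= 1
        items[left], items[right] = items[right], items[left]
    # The item where the pointers meet is in its final position
    quick_sort(items, start, left - 1)
    quick_sort(items, left + 1, end)
    return items
```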

Programming Project Content Coming Out With Version 2.0…