
UNIVERSITY OF BOSASO

FACULTY OF COMPUTER SCIENCE AND TECHNOLOGY

DEPARTMENT OF COMPUTER SCIENCE

SEMESTER FOUR

COMPUTER ORGANIZATION AND ARCHITECTURE

LECTURER: ANAS ABDISALAM SAID

FEB – MAY, 2022

Unit One – Introduction to Digital Computers

Basics of Digital Components

Integrated Circuit (IC)
Complex digital circuits are constructed with integrated circuits. An IC is a small silicon
semiconductor crystal, called a chip, containing the electronic components for the digital gates.
The various gates are interconnected inside the chip to form the required circuit. The chip is
mounted in a ceramic or plastic container, and the connections are welded to the external pins to
form an IC. The number of pins varies from 14 to several thousand, and each pin is identified
by a unique number printed on the IC body.

Categories of Integrated Circuits

1. SSI (Small-Scale Integration Device)

It contains several independent gates in a single package. The inputs and outputs of gates are
connected directly to the pins in the package. The number of gates is usually less than 10.

2. MSI (Medium-Scale Integration Device)

It contains 10 to 200 gates in a single package. They perform elementary digital functions such
as decoders, adders, registers.

3. LSI (Large-Scale Integration Device)

It contains between 200 and a few thousand gates in a single package. These include digital
systems such as processors, memory chips, etc.

4. VLSI (Very-Large-Scale Integration Device)

It contains thousands of gates within a single package, such as a microcomputer chip.

5. ULSI (Ultra-Large-Scale Integration Device)

It contains hundreds of thousands of gates within a single package, such as a microcomputer chip.

Logic Gates used in Digital Computers
Binary information is represented in digital computers by physical quantities called signals.
Electrical signals such as voltages exist throughout the computer in either one of the two
recognizable states. The two states represent a binary variable that can be equal to 1 or 0.
For example, a particular digital computer may employ a signal of 3 volts to represent
binary 1 and 0.5 volts to represent binary 0. The input terminals of its digital circuits will then
accept signals of only 3 and 0.5 volts, corresponding to binary 1 and 0, respectively.
So we now know that, at the core level, a computer communicates in the form of 0s and 1s, which
are nothing but low and high voltage signals.
But how are different operations performed on these signals? That is done using different
logic gates.

What are Gates?


Binary logic deals with binary variables and with operations that assume a logical meaning. It is
used to describe, in algebraic or tabular form, the manipulation done by logic circuits
called gates.
Gates are blocks of hardware that produce signals of binary 1 or 0 when their input logic
requirements are satisfied. Each gate has a distinct graphic symbol, and its operation can be
described by means of an algebraic expression. The input-output relationship of the binary
variables for each gate can be represented in tabular form by a truth table.
The most basic logic gates are AND and inclusive OR with multiple inputs, and NOT with a
single input.
Each gate with more than one input is sensitive to particular input values, generating its output
according to its function. For example, a multi-input AND gate is sensitive to logic 0 on any one
of its inputs: a 0 on any input forces the output to 0, irrespective of the values at the other inputs.
The various logical gates are:

1. AND
2. OR
3. NOT
4. NAND
5. NOR
6. XOR
7. XNOR

AND Gate
The AND gate produces the AND logic function, that is, the output is 1 if input A and input B
are both equal to 1; otherwise the output is 0.
The algebraic symbol of the AND function is the same as the multiplication symbol of ordinary
arithmetic.

OR Gate
The OR gate produces the inclusive-OR function; that is, the output is 1 if input A or input B or
both inputs are 1; otherwise, the output is 0.

Inverter (NOT) Gate


The inverter circuit inverts the logic sense of a binary signal. It produces the NOT, or
complement, function.

NAND Gate
The NAND function is the complement of the AND function, as indicated by the graphic
symbol, which consists of an AND graphic symbol followed by a small circle.
The designation NAND is derived from the abbreviation of NOT-AND.

NOR Gate
The NOR gate is the complement of the OR gate and uses an OR graphic symbol followed by a
small circle.

Exclusive-OR Gate
The exclusive-OR gate has a graphic symbol similar to the OR gate except for the additional
curved line on the input side.
The output of the gate is 1 if any input is 1 but excludes the combination when both inputs are 1.
It is similar to an odd function; that is, its output is 1 if an odd number of inputs are 1.

Exclusive-NOR Gate
The exclusive-NOR is the complement of the exclusive-OR, as indicated by the small circle in
the graphic symbol.
The output of this gate is 1 only if both the inputs are equal to 1 or both inputs are equal to 0.
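These gate functions can be imitated in software with bitwise operators. The following C sketch (an illustration added for these notes, not a hardware description) prints the truth table of the gates described above for one-bit inputs:

#include <stdio.h>

/* Each function models one gate for one-bit inputs (0 or 1).
   NOT is the single-input inverter; the others take two inputs. */
int and_gate (int a, int b) { return a & b; }
int or_gate  (int a, int b) { return a | b; }
int not_gate (int a)        { return !a; }
int nand_gate(int a, int b) { return !(a & b); }
int nor_gate (int a, int b) { return !(a | b); }
int xor_gate (int a, int b) { return a ^ b; }
int xnor_gate(int a, int b) { return !(a ^ b); }

int main(void) {
    /* Print one truth-table row for each of the four input combinations. */
    printf("A B  AND OR NAND NOR XOR XNOR\n");
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            printf("%d %d   %d  %d   %d   %d   %d   %d\n",
                   a, b, and_gate(a, b), or_gate(a, b), nand_gate(a, b),
                   nor_gate(a, b), xor_gate(a, b), xnor_gate(a, b));
    return 0;
}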

ASCII code
ASCII stands for American Standard Code for Information Interchange (pronounced 'as-key').
It is a standard set of characters understood by all computers, consisting mostly of letters and
numbers plus a few basic symbols such as $ and %. ASCII employs the 128 possible 7-bit
integers to encode the 52 uppercase and lowercase letters and 10 numeric digits of the Roman
alphabet, plus punctuation characters and some other symbols. The fact that almost everyone
agrees on ASCII makes it relatively easy to exchange information between different programs,
different operating systems, and even different computers.

Computer memory saves all data in digital form; there is no way to store characters directly.
Each character has a digital code equivalent, called its ASCII code. Basic ASCII represents each
character with 7 bits, giving 128 possible characters, numbered from 0 to 127.
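Since each character is stored as a number, its ASCII code can be inspected directly. A minimal C sketch for illustration:

#include <stdio.h>

int main(void) {
    /* In C, a char is just a small integer: printing it with %d or %X
       reveals the ASCII code that memory actually stores. */
    char ch = 'A';
    printf("Character: %c  Decimal: %d  Hexadecimal: %X\n", ch, ch, ch);
    /* Prints: Character: A  Decimal: 65  Hexadecimal: 41 */
    return 0;
}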

Character ASCII Code Hexadecimal Code


NUL (Null) 0 00
SOH (Start of heading) 1 01
STX (Start of text) 2 02
ETX (End of text) 3 03
EOT (End of transmission) 4 04
ENQ (Enquiry) 5 05
ACK (Acknowledge) 6 06

BEL (Bell) 7 07
BS (Backspace) 8 08
TAB (Horizontal tabulation) 9 09
LF (Line Feed) 10 0A
VT (Vertical tabulation) 11 0B
FF (Form feed) 12 0C
CR (Carriage return) 13 0D
SO (Shift out) 14 0E
SI (Shift in) 15 0F
DLE (Data link escape) 16 10
DC1 (Device control 1) 17 11
DC2 (Device control 2) 18 12
DC3 (Device control 3) 19 13
DC4 (Device control 4) 20 14
NAK (Negative acknowledgement) 21 15
SYN (Synchronous idle) 22 16
ETB (End of transmission block) 23 17
CAN (Cancel) 24 18
EM (End of medium) 25 19
SUB (Substitute) 26 1A
ESC (Escape) 27 1B
FS (File separator) 28 1C
GS (Group separator) 29 1D
RS (Record separator) 30 1E
US (Unit separator) 31 1F
SP (Space) 32 20
! 33 21
" 34 22
# 35 23
$ 36 24
% 37 25
& 38 26

' 39 27
( 40 28
) 41 29
* 42 2A
+ 43 2B
, 44 2C
- 45 2D
. 46 2E
/ 47 2F
0 48 30
1 49 31
2 50 32
3 51 33
4 52 34
5 53 35
6 54 36
7 55 37
8 56 38
9 57 39
: 58 3A
; 59 3B
< 60 3C
= 61 3D
> 62 3E
? 63 3F
@ 64 40
A 65 41
B 66 42
C 67 43
D 68 44
E 69 45
F 70 46

G 71 47
H 72 48
I 73 49
J 74 4A
K 75 4B
L 76 4C
M 77 4D
N 78 4E
O 79 4F
P 80 50
Q 81 51
R 82 52
S 83 53
T 84 54
U 85 55
V 86 56
W 87 57
X 88 58
Y 89 59
Z 90 5A
[ 91 5B
\ 92 5C
] 93 5D
^ 94 5E
_ 95 5F
` (Grave accent) 96 60
a 97 61
b 98 62
c 99 63
d 100 64
e 101 65
f 102 66

g 103 67
h 104 68
i 105 69
j 106 6A
k 107 6B
l 108 6C
m 109 6D
n 110 6E
o 111 6F
p 112 70
q 113 71
r 114 72
s 115 73
t 116 74
u 117 75
v 118 76
w 119 77
x 120 78
y 121 79
z 122 7A
{ 123 7B
| 124 7C
} 125 7D
~ 126 7E
DEL (Delete) 127 7F

A Brief History of Computers


1940 – 1956:  First Generation – Vacuum Tubes

These early computers used vacuum tubes for circuitry and magnetic drums for memory. As a
result they were enormous, literally taking up entire rooms and costing a fortune to run. Vacuum
tubes were inefficient components that consumed huge amounts of electricity and generated a
lot of heat, which caused ongoing breakdowns.

These first generation computers relied on ‘machine language’ (which is the most basic
programming language that can be understood by computers). These computers were limited to
solving one problem at a time. Input was based on punched cards and paper tape. Output came
out on print-outs. The two notable machines of this era were the UNIVAC and ENIAC machines;
the UNIVAC was the first ever commercial computer, purchased in 1951 by the US Census
Bureau.

1956 – 1963: Second Generation – Transistors

The replacement of vacuum tubes by transistors saw the advent of the second generation of
computing. Although first invented in 1947, transistors weren’t used significantly in computers
until the end of the 1950s. They were hugely superior to vacuum tubes, despite still subjecting
computers to damaging levels of heat, making computers smaller, faster, cheaper and less heavy
on electricity use. These machines still relied on punched cards for input and printouts for output.

Programming languages evolved from cryptic binary code to symbolic ('assembly') languages,
which meant programmers could create instructions in words. At about the same time, high-level
programming languages were being developed (early versions of COBOL and FORTRAN).
Transistor-driven machines were the first computers to store instructions in their memories,
moving from magnetic drum to magnetic core technology. The early versions of these
machines were developed for the atomic energy industry.

1964 – 1971: Third Generation – Integrated Circuits

By this phase, transistors were now being miniaturized and put on silicon chips (called
semiconductors). This led to a massive increase in speed and efficiency of these machines. 
These were the first computers where users interacted using keyboards and monitors which
interfaced with an operating system, a significant leap up from the punch cards and printouts.
This enabled these machines to run several applications at once using a central program which
functioned to monitor memory.

As a result of these advances which again made machines cheaper and smaller, a new mass
market of users emerged during the ‘60s.

1972 – 2010: Fourth Generation – Microprocessors

This revolution can be summed in one word: Intel. The chip-maker developed the Intel 4004 chip in 1971,
which positioned all computer components (CPU, memory, input/output controls) onto a single chip.
What filled a room in the 1940s now fit in the palm of the hand. The Intel chip housed thousands of
integrated circuits. The year 1981 saw the first ever computer (IBM) specifically designed for home use

and 1984 saw the Macintosh introduced by Apple. Microprocessors even moved beyond the realm of
computers and into an increasing number of everyday products.

Fifth Generation – Artificial Intelligence

Computer devices with artificial intelligence are still in development, but some of these technologies are
beginning to emerge and be used such as voice recognition.

AI is a reality made possible by using parallel processing and superconductors. Looking to the future,
computers will be radically transformed again by quantum computation, molecular and nanotechnology.

The essence of the fifth generation will be using these technologies to ultimately create machines which
can process and respond to natural language, and have the capability to learn and organise themselves.

Four Decades of Computing

Features     | Batch          | Time-sharing     | Desktop       | Network
Decade       | 1960s          | 1970s            | 1980s         | 1990s
Location     | Computer room  | Terminal room    | Desktop       | Mobile
Users        | Experts        | Specialists      | Individuals   | Groups
Data         | Alphanumeric   | Text, numbers    | Fonts, graphs | Multimedia
Objective    | Calculate      | Access           | Present       | Communicate
Interface    | Punched card   | Keyboard & CRT   | See & point   | Ask & tell
Operation    | Process        | Edit             | Layout        | Orchestrate
Connectivity | None           | Peripheral cable | LAN           | Internet
Owners       | Corporate computer centers | Divisional IS shops | Departmental end-users | Everyone

Unit Two- Introduction to Computer Organization and Architecture

Computer Organization and Computer Architecture


Computer architecture covers the architectural attributes — such as the physical memory
addressing and the CPU — and how these components should be designed and made to coordinate
with each other, keeping future demands and goals in mind.

Computer organization is how the operational attributes are linked together and contribute to
realizing the architectural specifications.

Computer architecture comes before computer organization. It is like building a house: producing
the design and architecture takes the most time, and the organization is then building the house
with bricks or with the latest technology, keeping the basic layout and architecture in mind.

Computer Architecture | Computer Organization
Describes what the computer does. | Describes how the computer does it.
Deals with the functional behavior of the computer system. | Deals with the structural relationships among components.
Deals with high-level design issues. | Deals with low-level design issues.
Architecture indicates its hardware. | Organization indicates its performance.
When designing a computer, its architecture is fixed first. | Organization is decided after its architecture.

Von Neumann Architecture

Historically there have been 2 types of Computers:

1. Fixed-program computers – Their function is very specific and they cannot be
programmed, e.g. calculators.
2. Stored-program computers – These can be programmed to carry out many different
tasks; applications are stored on them, hence the name.
Modern computers are based on the stored-program concept introduced by John von
Neumann. In this stored-program concept, programs and data are stored in the same memory unit
and are treated the same way. This novel idea meant that a computer built with this
architecture would be much easier to reprogram.

The basic structure is shown below. It is also known as the IAS computer and has three basic units:

1. The Central Processing Unit (CPU)
2. The Main Memory Unit
3. The Input/Output Device

Von Neumann bottleneck


Whatever we do to enhance performance, we cannot get away from the fact that instructions can
only be executed one at a time and can only be carried out sequentially. Both of these factors hold
back the competence of the CPU. This is commonly referred to as the 'Von Neumann
bottleneck'. We can provide a Von Neumann processor with more cache, more RAM, or faster
components, but if real gains are to be made in CPU performance, a fundamental re-examination
of the CPU configuration needs to take place.

This architecture is very important and is used in our PCs and even in Super Computers.

Harvard Architecture

It is a computer architecture with physically separate storage and signal pathways for program
instructions and data. Unlike the Von Neumann architecture, which employs a single bus to both
fetch instructions from memory and transfer data from one part of a computer to another, the
Harvard architecture has separate memory spaces for data and instructions.

Both concepts are similar except in the way they access memory. The idea behind the Harvard
architecture is to split the memory into two parts – one for data and another for programs. The
term comes from the original Harvard Mark I relay-based computer, which employed a system
that allowed both data transfers and instruction fetches to be performed at the same time.

Real-world computer designs are actually based on the modified Harvard architecture, which is
commonly used in microcontrollers and DSPs (digital signal processors).

The Von Neumann architecture has only one bus that is used for both instruction fetches and data
transfers, and the operations must be scheduled because they cannot be performed at the same
time. The Harvard architecture, on the other hand, has separate memory spaces for instructions
and data, with physically separate signals and storage for code and data memory, which in turn
makes it possible to access each memory system simultaneously.

Structure and Function

Structure indicates the way in which the components are interrelated.

The simplest possible depiction of a computer is given below.

 The computer interacts with its external environment using peripheral devices or
communication lines.

The top level structure of a computer is as follows.

The internal structure of the computer contains four main structural components. They are,

Central processing unit (CPU): Controls the operation of the computer and performs its data
processing functions; often simply referred to as processor.

Main memory: Stores data.

I/O: Moves data between the computer and its external environment.

System interconnection: Some mechanism that provides for communication among CPU, main
memory, and I/O.

And the major structural components of CPU are,

Control unit: Controls the operation of the CPU and hence the computer.

Arithmetic and logic unit (ALU): Performs the computer’s data processing functions.

Registers: Provide storage internal to the CPU.

CPU interconnection: Some mechanism that provides for communication among the control
unit, ALU, and registers.

Function indicates the operation of each individual component as part of the structure of the
computer systems. The functional view of a computer is given below.

The four basic functions that a computer can perform are:

a. Data Processing
b. Data Storage
c. Data Movement
d. Control

Possible computer operations are illustrated in the diagrams below. The computer can function as a
data movement device, i.e., simply transferring data from one peripheral or communications line
to another. It can also function as a data storage device, with data transferred from the external
environment to computer storage (read) and vice versa (write). The final two diagrams show
operations involving data processing, on data either in storage or en route between storage and
the external environment.

Computer Components

All types of computers follow the same basic logical structure and perform the following five
basic operations for converting raw input data into information useful to their users.

S/N | Operation | Description
1 | Take input | The process of entering data and instructions into the computer system.
2 | Store data | Saving data and instructions so that they are available for processing as and when required.
3 | Process data | Performing arithmetic and logical operations on data in order to convert them into useful information.
4 | Output information | The process of producing useful information or results for the user, such as a printed report or visual display.
5 | Control the workflow | Directing the manner and sequence in which all of the above operations are performed.

Unit Three - Central Processing Unit

CPU Structure and Function


[Figure: block diagram of a basic CPU connected to memory - general registers (R0-R3), ALU with two input registers and an output register, program counter, instruction register with instruction decoder, and control unit, linked to memory over the address bus, data bus, and control (read/write) bus.]

A basic microprocessor requires certain elements to perform an operation:

 Registers
 Arithmetic and Logic Unit (ALU)
 Control Logic
 Instruction register
 Program counter
 Bus.

Arithmetic and Logic Unit:


It is the computational unit of the microprocessor. It performs arithmetic and logical operations
on various data. Whenever there is a need to perform an operation on data, the data is sent to the
ALU to perform the necessary function.

Registers:

Registers can be called the internal storage devices of the processor. Input data, output data and
various other binary data are stored in this unit for further processing.

Control Unit:
The control unit, as the name specifies, controls the flow of data and signals in the microprocessor.
It generates the necessary control signals for the various data that are fed to the microprocessor.

Instruction Register:
All the instructions that are fetched from memory are placed in the instruction register. The
instruction register is thus used to store the information that the microprocessor requires in order
to carry out an operation.

Program Counter (PC):


Program counter stores the address of the next instruction to be executed. It is usually denoted as
PC.

Arithmetic and Logic Unit


An arithmetic logic unit (ALU) is a digital circuit used to perform arithmetic and logic
operations. It represents the fundamental building block of the central processing unit (CPU) of
a computer. Modern CPUs contain very powerful and complex ALUs. In addition to ALUs,
modern CPUs contain a control unit (CU).
Most of the operations of a CPU are performed by one or more ALUs, which load data from
input registers. A register is a small amount of storage available as part of a CPU. The control
unit tells the ALU what operation to perform on that data, and the ALU stores the result in an
output register. The control unit moves the data between these registers, the ALU, and memory.
The Arithmetic and Logic Unit is like a calculator for the computer. The ALU performs all
arithmetic operations along with decision-making functions. In modern CPUs or microprocessors,
there can be more than one integrated ALU to speed up arithmetical and logical operations, such
as an integer unit, a floating-point unit, etc.

Organization of ALU
Various circuits are required to process data or perform arithmetical operations, and these are
connected to the microprocessor's ALU. The accumulator and data buffer store data temporarily.
These data are processed as per the control instructions to solve problems such as addition,
multiplication, etc.

Functions of ALU:
Functions of ALU or Arithmetic & Logic Unit can be categorized into following 3 categories

1. Arithmetic Operations:
Additions, multiplications, etc. are examples of arithmetic operations. Finding greater-than,
smaller-than, or equality between two numbers by using subtraction is also a form of arithmetic
operation.

2. Logical Operations:
Operations like AND, OR, NOR, NOT etc. using logical circuitry are examples of logical
operations.

3. Data Manipulations:
Operations such as flushing a register are an example of data manipulation. Shifting
binary numbers is also an example of data manipulation.
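As a rough software analogy (an illustrative sketch only; the operation codes below are invented, and a real ALU is combinational hardware, not a program), an ALU can be pictured as a unit that selects an operation according to a control code supplied by the control unit:

#include <stdio.h>

/* Hypothetical operation codes; a real ALU's control encoding differs. */
enum { ALU_ADD, ALU_SUB, ALU_AND, ALU_OR, ALU_NOT, ALU_SHL };

int alu(int op, int a, int b) {
    switch (op) {
        case ALU_ADD: return a + b;    /* arithmetic operation */
        case ALU_SUB: return a - b;    /* subtraction, also used for comparisons */
        case ALU_AND: return a & b;    /* logical operations */
        case ALU_OR:  return a | b;
        case ALU_NOT: return ~a;
        case ALU_SHL: return a << 1;   /* data manipulation: shift left */
        default:      return 0;
    }
}

int main(void) {
    printf("%d\n", alu(ALU_ADD, 2, 3));        /* 5 */
    printf("%d\n", alu(ALU_SUB, 7, 7) == 0);   /* equality test via subtraction: 1 */
    return 0;
}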

Buses
A bus is a physical group of signal lines that have a related function. Buses allow for the transfer
of electrical signals between different parts of the computer system and thereby transfer
information from one device to another. For example, the data bus is the group of signal lines
that carry data between the processor and the various subsystems that comprise the computer.
The “width” of a bus is the number of signal lines dedicated to transferring information. For
example, an 8-bit-wide bus transfers 8 bits of data in parallel.

The majority of microprocessors available today (with some exceptions) use the three-bus
system architecture. The three buses are the address bus, the data bus, and the control bus.

Figure: The three-bus system

The data bus is bidirectional, the direction of transfer being determined by the processor. The
address bus carries the address, which points to the location in memory that the processor is
attempting to access. It is the job of external circuitry to determine in which external device a
given memory location exists and to activate that device. This is known as address decoding .
The control bus carries information from the processor about the state of the current access, such
as whether it is a write or a read operation. The control bus can also carry information back to the
processor regarding the current access, such as an address error. Different processors have
different control lines, but there are some control lines that are common among many processors.
The control bus may consist of output signals such as read, write, valid address, etc. A processor
usually has several input control lines too, such as reset, one or more interrupt lines, and a clock
input.

There are different types of bus systems such as data, address, control etc. Here below the
functions of some buses: 

Data Bus

Data bus is the most common type of bus. It is used to transfer data between different
components of computer. The number of lines in data bus affects the speed of data transfer

between different components. The data bus may consist of 8, 16, 32 or 64 lines. A
64-line data bus can transfer 64 bits of data at one time.

Address Bus

Many components are connected to one another through buses. Each component is assigned a
unique ID, called the address of that component. If a component wants to communicate with
another component, it uses the address bus to specify the address of that component. The address
bus is unidirectional: it can carry information in only one direction, carrying the address of a
memory location from the microprocessor to main memory.

Control Bus

Control bus is used to transmit different commands or control signals from one component to
another component. A control signal contains the timing information and command signal (type
of operation to be performed).

AGP (Advanced Graphic Port) Bus

This is a 32-bit bus designed specifically for a video card. It is required for very fast video
performance on computers, supporting high-performance video such as 3D graphics and
full-motion video. It runs at 66 MHz, 133 MHz, 266 MHz or 533 MHz, and it creates a connection
between the CPU and the video card.

PCI (Peripheral Component Interconnect) Bus

This is usually a 33 MHz, 32-bit bus found in virtually all systems. It connects the CPU, memory
and peripherals to a wider and faster data pathway. High-speed peripherals such as
network cards, video cards and more can be plugged into PCI bus slots. PCI-X and PCI-Express
are faster developments of the PCI bus.

PCI-X Bus

PCI-X is a second generation development of the PCI Bus that provides faster speeds than PCI.
It is used primarily in workstation and server installations. PCI-X supports
64-bit slots.

PCI-Express

PCI-Express bus is a third generation development of the PCI bus. PCI-Express is a differential
signaling bus that can be generated by either the North Bridge or the South Bridge. The speed of
PCI-Express is described in terms of lanes.

USB (Universal Serial Bus)

This is an external bus standard; the original version supports data transfer rates of 12 Mbps.
A single USB port can be used to connect up to 127 peripheral devices, including mice, modems,
keyboards, printers, digital cameras, etc. USB here has two versions: USB 1.x and USB 2.x.

Registers
A register is a very small amount of very fast memory that is built into the CPU (central
processing unit) in order to speed up its operations by providing quick access to commonly used
values. Registers are semiconductor devices whose contents can be accessed (i.e., read and
written to) at extremely high speeds but whose contents are held only temporarily (i.e., while in
use, or only as long as the power supply remains on).

Registers are the top of the memory hierarchy and are the fastest way for the system to
manipulate data. Registers are normally measured by the number of bits they can hold, for
example, an 8-bit register can store 8 bits of data, and a 32-bit register can store 32 bits of data.

Registers are used to store data temporarily during the execution of a program. Some of the
registers are accessible to the user through instructions. Data and instructions must be put into
the system. So we need registers for this. 
Register Symbol | Register Name | Number of Bits | Description
AC | Accumulator | 16 | Processor register
DR | Data Register | 16 | Holds memory data
TR | Temporary Register | 16 | Holds temporary data
IR | Instruction Register | 16 | Holds instruction code
AR | Address Register | 12 | Holds memory address
PC | Program Counter | 12 | Holds address of next instruction
INPR | Input Register | 8 | Holds input data
OUTR | Output Register | 8 | Holds output data

Instructions
Computer instructions are a set of machine language instructions that a particular processor
understands and executes. A computer performs tasks on the basis of the instruction provided.

Examples of instructions

 ADD - Add two numbers together.
 COMPARE - Compare numbers.
 IN - Input information from a device, e.g., keyboard.
 JUMP - Jump to designated RAM address.
 JUMP IF - Conditional statement that jumps to a designated RAM address.
 LOAD - Load information from RAM to the CPU.
 OUT - Output information to a device, e.g., monitor.
 STORE - Store information to RAM.

An instruction is made up of groups of bits called fields. These fields include:

o The Operation code (Opcode) field which specifies the operation to be performed.
o The Address field which contains the location of the operand, i.e., register or memory
location.
o The Mode field which specifies how the operand will be located.

Instruction Formats
Instruction format describes the internal structures (layout design) of the bits of an instruction,
in terms of its constituent parts.

An instruction format must include an opcode; whether it includes an address depends on the
availability of particular operands.

The format can be implicit or explicit which will indicate the addressing mode for each operand.

A basic computer has three instruction code formats which are:

1. Memory - reference instruction


2. Register - reference instruction
3. Input-Output instruction

Memory - reference instruction

In a memory-reference instruction, 12 bits of the instruction are used to specify an address and
one bit to specify the addressing mode 'I'.
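Assuming the 16-bit layout described in these notes (bit 15 = mode bit I, bits 14-12 = opcode, bits 11-0 = address), the fields can be extracted with shifts and masks. An illustrative C sketch:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint16_t instr = 0x9ABC;                  /* an example bit pattern */
    unsigned i_bit   = (instr >> 15) & 0x1;   /* addressing mode bit I */
    unsigned opcode  = (instr >> 12) & 0x7;   /* 3-bit operation code */
    unsigned address = instr & 0xFFF;         /* 12-bit memory address */
    printf("I=%u opcode=%u address=0x%03X\n", i_bit, opcode, address);
    /* Prints: I=1 opcode=1 address=0xABC */
    return 0;
}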

Register - reference instruction

The Register-reference instructions are represented by the Opcode 111 with a 0 in the leftmost
bit (bit 15) of the instruction.

A Register-reference instruction specifies an operation on, or a test of, the AC (Accumulator)
register.

Input-output instruction

Just like the Register-reference instruction, an Input-Output instruction does not need a reference
to memory and is recognized by the operation code 111 with a 1 in the leftmost bit of the
instruction. The remaining 12 bits are used to specify the type of the input-output operation or
test performed.

Instruction Set Architecture


The instruction set, also called the ISA (instruction set architecture), is the part of a computer that
pertains to programming, which is basically machine language. The instruction set provides
commands to the processor, to tell it what it needs to do. The instruction set consists of
addressing modes, instructions, native data types, registers, memory architecture, interrupt and
exception handling, and external I/O.

An example of an instruction set is the x86 instruction set, which is commonly found in
computers today. Different computer processors can use almost the same instruction set while
still having very different internal designs. Both the Intel Pentium and AMD Athlon processors
use nearly the same x86 instruction set. An instruction set can be built into the hardware of the
processor, or it can be emulated in software, using an interpreter. The hardware design is more
efficient and faster for running programs than the emulated software version.

The number of instructions that a particular CPU can have is limited and the collection of all those
instructions is called the Instruction Set.

The instruction set is very important. High-level programming languages are designed based on
the instruction set, and a proper design of the hardware and the instruction set can determine how
fast the CPU is.

CPU Performance

The performance of a CPU is the number of programs it can run in a given time: the more
programs it can run in that time, the faster the CPU is.

The performance is determined by the number of instructions in a program: the more
instructions, the more time needed to perform them. It also depends upon the number of clock
cycles per instruction.

This means that there are only two ways to improve the performance: either minimize the number
of instructions per program, or reduce the number of cycles per instruction.

We cannot easily do both, as there is a trade-off: optimizing one tends to sacrifice the other. And
the optimizations that we have to make are embedded deep in the instruction set and the hardware
of the CPU.

It is because of this that the CPU industry is divided between two very big camps, each backing
one of the two techniques. While many Intel CPUs are based on the CISC architecture, Apple
CPUs and ARM devices have RISC architectures under the hood.

CISC architecture

CISC is the shorthand for Complex Instruction Set Computer. The CISC architecture tries to
reduce the number of Instructions that a program has, thus optimizing the Instructions per
Program part of the above equation. This is done by combining many simple instructions into a
single complex one.

Characteristics of CISC processors

 Complex instructions, hence complex instruction decoding.
 Instructions take more than a single clock cycle to execute.
 Uses many data types and complex addressing modes.

RISC ARCHITECTURE

On the other hand, Reduced Instruction Set Computer or RISC architectures use simpler
instructions, so a program may contain more of them, but they reduce the number of cycles that
each instruction takes to perform. Generally, a single instruction in a RISC machine will take
only one CPU cycle.

Characteristic of RISC processors:

 One cycle execution time: RISC processors have a CPI (clock per instruction) of one
cycle. This is due to the optimization of each instruction on the CPU and a technique
called pipelining.
 Pipelining: a technique that allows for simultaneous execution of parts, or stages, of
instructions to more efficiently process instructions.
 Large number of registers: the RISC design philosophy generally incorporates a larger
number of registers to reduce the amount of interaction with memory.

Differences between CISC and RISC

Architectural Characteristics | Complex Instruction Set Computer (CISC) | Reduced Instruction Set Computer (RISC)
Instruction size and format | Large set of instructions with variable formats (16-64 bits per instruction). | Small set of instructions with fixed format (32 bits).
Data transfer | Memory to memory. | Register to register.
CPU control | Mostly microcoded using control memory (ROM), but modern CISC also uses hardwired control. | Mostly hardwired, without control memory.
Instruction type | Not register-based instructions. | Register-based instructions.
Memory access | More memory access. | Less memory access.
Clocks | Includes multi-clock instructions. | Includes single-clock instructions.
Instruction nature | Instructions are complex. | Instructions are reduced and simple.

Moore’s Law and Microprocessor Evolution
Moore’s law states that “the number of transistors in a dense integrated circuit doubles
approximately every two years”. This basically means that engineers are able to cram more
and more of the building blocks (transistors) into the same amount of space as time goes on.

More transistors mean more units available to execute instructions, which equates to a faster
processor with more features. Another factor which greatly affects how quickly a processor can
decode that HD video stream, or brute-force crack your neighbor’s WiFi password (we
recommend asking them nicely first), is the clock speed.

Early processors had a clock speed of just a few kHz (thousands of on-off cycles per second),
while the most modern processors operate at GHz speeds (billions of cycles per second).

Processor speed has historically been measured in terms of 'instructions per second'. New
processors are so fast that their speed is measured on the order of millions or billions of
instructions per second. The following graphs show how mainstream processor speeds have
climbed over the decades.

Processor statistics in numbers – A glimpse of Past and Present

Year of Release Processor No. of Transistors Clock Speed

1971 Intel 4004 2300 740 KHz

1972 Intel 8008 6000 800 KHz

1978 Intel 8086 29000 5 MHz

1993 Intel Pentium 3.1 million 60 MHz

2000 Intel Pentium 4 42 million 1.5 GHz

2017 AMD Ryzen 4.2 billion 4.4 GHz

Graph 1: Year-wise increase in processor clock frequencies

Graph 2: Year-wise increase in the number of transistors in the processor

Size of Transistors in the Processors over the Years

Year Size of Transistors in µm

1971 10

1972 10

1978 3

1993 0.8

2000 0.13

2017 0.014

Computer Performance

Computer performance is the amount of work accomplished by a computer system. The word
performance in computer performance means “How well is the computer doing the work it is
supposed to do?” It basically depends on response time, throughput and execution time of a
computer system.

Response time is the time from start to completion of a task. This also includes:
 Operating system overhead.
 Waiting for I/O and other processes
 Accessing disk and memory
 Time spent executing on the CPU or execution time.

Throughput is the total amount of work done in a given time.

CPU execution time is the total time a CPU spends computing on a given task, excluding time
spent on I/O or on running other programs. This is also referred to as simply CPU time.
Performance is determined by execution time as performance is inversely proportional to
execution time.
Performance = (1 / Execution time)

And,
(Performance of A / Performance of B)

= (Execution Time of B / Execution Time of A)

If processor A is faster than processor B, then the execution time of A is less than the execution
time of B. Therefore, the performance of A is greater than the performance of B.
Example –

Machine A runs a program in 100 seconds, Machine B runs the same program in 125 seconds

(Performance of A / Performance of B)
= (Execution Time of B / Execution Time of A)
= 125 / 100 = 1.25

That means machine A is 1.25 times faster than Machine B.


And, the time to execute a given program can be computed as:
Execution time = CPU clock cycles x clock cycle time

Since clock cycle time and clock rate are reciprocals, so,
Execution time = CPU clock cycles / clock rate

The number of CPU clock cycles can be determined by,


CPU clock cycles
= (No. of instructions / Program ) x (Clock cycles / Instruction)
= Instruction Count x CPI
Which gives,
Execution time
= Instruction Count x CPI x clock cycle time
= Instruction Count x CPI / clock rate

The units for CPU execution time are:
Seconds / Program = (Instructions / Program) x (Clock cycles / Instruction) x (Seconds / Clock cycle)
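These relationships can be checked numerically. The following C sketch uses made-up figures purely for illustration:

#include <stdio.h>

int main(void) {
    /* Hypothetical figures for illustration only. */
    double instruction_count = 2e9;  /* instructions in the program */
    double cpi = 1.5;                /* average clock cycles per instruction */
    double clock_rate = 3e9;         /* 3 GHz, i.e. 3e9 cycles per second */

    /* Execution time = Instruction Count x CPI / clock rate */
    double exec_time = instruction_count * cpi / clock_rate;
    printf("Execution time = %.2f seconds\n", exec_time);   /* 1.00 */
    return 0;
}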

How to Improve Performance?

To improve performance you can either:

 Decrease the CPI (clock cycles per instruction) by using new hardware.
 Decrease the clock time, or increase the clock rate, by reducing propagation delays or by
using pipelining.
 Decrease the number of required cycles by improving the ISA or the compiler.

Multicore processor
A multicore processor is a single computing component composed of two or more CPU cores
that read and execute program instructions. The individual cores can execute multiple
instructions in parallel, increasing the performance of software that has been written to take
advantage of the architecture.

The first multicore processors were produced by Intel and AMD in the early 2000s. Since then,
processors have been created with two cores ("dual core"), four cores ("quad core"), six cores
("hexa core"), eight cores ("octa core"), etc. Processors have also been made with as many as
100 physical cores, as well as 1000 effective independent cores by using FPGAs (Field-
Programmable Gate Arrays).

Unit Four – Control Unit
The Control Unit is the part of the computer’s central processing unit (CPU) which directs the
operation of the processor. It was included as part of the Von Neumann architecture by John von
Neumann. It is the responsibility of the Control Unit to tell the computer’s memory,
arithmetic/logic unit and input and output devices how to respond to the instructions that have
been sent to the processor. It fetches the instructions of a program from main memory into the
processor’s instruction register and, based on this register’s contents, generates control signals
that supervise the execution of these instructions.

Functions of the Control Unit –


1. It coordinates the sequence of data movements into, out of, and between a processor’s
many sub-units.
2. It interprets instructions.
3. It controls data flow inside the processor.
4. It receives external instructions or commands, which it converts into sequences of control
signals.
5. It controls many execution units (i.e. ALU, data buffers and registers) contained within a
CPU.
6. It also handles multiple tasks, such as fetching, decoding, execution handling and storing
results.

Instruction Cycle

A program residing in the memory unit of a computer consists of a sequence of instructions.


These instructions are executed by the processor by going through a cycle for each instruction.

In a basic computer, each instruction cycle consists of the following phases:

1. Fetch instruction from memory.


2. Decode the instruction.
3. Read the effective address from memory.
4. Execute the instruction.

Instructions are processed under the direction of the control unit in a step-by-step manner.

There are four fundamental steps in the instruction cycle:

1. Fetch the instruction: The next instruction is fetched from the memory address that is
currently stored in the Program Counter (PC) and stored in the Instruction Register (IR). At
the end of the fetch operation, the PC points to the next instruction that will be read in the
next cycle.

2. Decode the instruction: The decoder interprets the instruction. During this cycle the
instruction inside the IR (instruction register) gets decoded.

3. Execute: The Control Unit of the CPU passes the decoded information as a sequence of
control signals to the relevant function units of the CPU to perform the actions required by
the instruction, such as reading values from registers, passing them to the ALU to perform
mathematical or logic functions on them, and writing the result back to a register. If the ALU
is involved, it sends a condition signal back to the CU.

4. Store result: The result generated by the operation is stored in main memory or sent to
an output device. Based on any feedback from the ALU, the Program Counter may be
updated to a different address from which the next instruction will be fetched.
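The cycle can be sketched in C for a toy machine. The two-instruction set below is invented for illustration and does not correspond to any real processor:

#include <stdio.h>
#include <stdint.h>

enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2 };   /* invented opcodes */

int main(void) {
    /* Each memory word: high byte = opcode, low byte = operand address. */
    uint16_t memory[16] = {
        (OP_LOAD << 8) | 14,    /* AC <- memory[14] */
        (OP_ADD  << 8) | 15,    /* AC <- AC + memory[15] */
        (OP_HALT << 8)
    };
    memory[14] = 10;
    memory[15] = 32;

    uint16_t pc = 0, ir = 0, ac = 0;
    for (;;) {
        ir = memory[pc++];                        /* 1. fetch; PC now points to the next instruction */
        uint8_t op = ir >> 8, addr = ir & 0xFF;   /* 2. decode */
        if (op == OP_LOAD)                        /* 3. execute */
            ac = memory[addr];
        else if (op == OP_ADD)
            ac += memory[addr];
        else
            break;                                /* HALT */
    }
    printf("AC = %u\n", ac);                      /* 4. store/output the result: 42 */
    return 0;
}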

Control Unit Types

1. Hardwired Control Unit

2. Microprogrammed Control Unit

The comparison between a hardwired control unit and a microprogrammed control unit is given
in the table below:

Sr. No. | Hardwired Control Unit | Microprogrammed Control Unit
1. | It is a hardware control unit. | It lies between the software and the hardware.
2. | It has a high error rate. | Its error rate is comparatively low.
3. | It is difficult to design, test and implement. | It is easy to design, test and implement.
4. | Modifications are not flexible. | Modifications are flexible.
5. | It uses a finite state machine to generate signals. | It generates signals using control lines.
6. | Example: RISC processors. | Example: CISC processors.
7. | It uses flags, decoders, logic gates and other digital circuits. | It uses sequences of microinstructions (a microprogram).
8. | Output is generated on the basis of the input signal. | It generates a set of control signals on the basis of the control lines.
9. | It has a faster mode of operation. | It has a slower mode of operation.
10. | It is expensive. | It is cheaper.

Unit Five – Memory
A memory unit is the collection of storage units or devices together. The memory unit stores the
binary information in the form of bits. Generally, memory/storage is classified into 2 categories:

 Volatile Memory: This loses its data when power is switched off.
 Non-Volatile Memory: This is a permanent storage and does not lose any data when
power is switched off.

Memory Hierarchy Design

In the Computer System Design, Memory Hierarchy is an enhancement to organize the memory
such that it can minimize the access time. The Memory Hierarchy was developed based on a
program behavior known as locality of references. The figure below clearly demonstrates the
different levels of memory hierarchy:

This Memory Hierarchy Design is divided into 2 main types:
1. External Memory or Secondary Memory –
Comprising magnetic disk, optical disk and magnetic tape, i.e. peripheral storage devices
which are accessible by the processor via an I/O module. - Discussion
2. Internal Memory or Primary Memory –
Comprising main memory, cache memory and CPU registers. This is directly accessible
by the processor. - Discussion

Characteristics of Memory System


We can infer the following characteristics of Memory Hierarchy Design from above figure:
1. Capacity:
It is the global volume of information the memory can store. As we move from top to
bottom in the Hierarchy, the capacity increases.
2. Access Time:
It is the time interval between the read/write request and the availability of the data. As we
move from top to bottom in the Hierarchy, the access time increases.
3. Performance:
Earlier when the computer system was designed without Memory Hierarchy design, the
speed gap increases between the CPU registers and Main Memory due to large difference
in access time. This results in lower performance of the system and thus, enhancement was
required. This enhancement was made in the form of Memory Hierarchy Design because of
which the performance of the system increases. One of the most significant ways to
increase system performance is minimizing how far down the memory hierarchy one has to
go to manipulate data.

4. Cost per bit:


As we move from bottom to top in the Hierarchy, the cost per bit increases i.e. Internal
Memory is costlier than External Memory.

Cache Memory
Cache Memory is a special, very high-speed memory. It is used to speed up the CPU and
synchronize with it. Cache memory is costlier than main memory or disk memory but more
economical than CPU registers. It is an extremely fast memory type that acts as a buffer between
RAM and the CPU. It holds frequently requested data and instructions so that they are
immediately available to the CPU when needed.
Cache memory is used to reduce the average time to access data from main memory. The
cache is a smaller and faster memory which stores copies of the data from frequently used main
memory locations. There are various independent caches in a CPU, which store instructions
and data.

Levels of memory:

 Level 1 or Registers
These hold the data that is immediately stored in and accessed by the CPU. The most
commonly used registers are the accumulator, program counter, address register, etc.

 Level 2 or Cache memory
This is a very fast memory with a short access time, where data is temporarily stored for
faster access.

 Level 3 or Main Memory
This is the memory on which the computer currently works. It is small in size, and once
power is off, data no longer stays in this memory.

 Level 4 or Secondary Memory
This is external memory, which is not as fast as main memory, but in which data stays
permanently.

Memory Measurement:

When you use a RAM, ROM, Floppy disk or hard disk the data is measured using some unit. In
computer terminology, they are called nibble, Bit, Byte, Kilobyte, Megabyte, Gigabyte, etc.

Bit: It stands for Binary Digit, which is either 0 or 1.

Byte (B): A byte is approximately one character (letter 'a', number '1', symbol '?', etc.). A
group of 8 bits is called a byte.

Nibble: 4 bits make one nibble.

Kilobyte (KB): In memory, a group of 1024 bytes is called a kilobyte.

Megabyte (MB): In memory, a group of 1024 kilobytes is called a megabyte.

Gigabyte (GB): In memory, a group of 1024 megabytes is called a gigabyte. It is sometimes
used, less precisely, to mean 1 billion bytes or 1000 MB. A number of companies manufacture
memory chips measured in megabytes or gigabytes, such as 64 MB, 128 MB, 256 MB, 1.2 GB, etc.

Terabyte (TB): A terabyte is approximately a trillion bytes.

Petabyte (PB): One petabyte of information is equal to 1000 terabytes, or 10^15 bytes.

Exabyte (EB): One exabyte of information is equal to 1000 petabytes, or 10^18 bytes.

1 Bit = Binary Digit
8 Bits = 1 Byte = 2 Nibbles
1024 Bytes = 1 KB (Kilobyte)
1024 KB = 1 MB (Megabyte)
1024 MB = 1 GB (Gigabyte)
1024 GB = 1 TB (Terabyte)
1024 TB = 1 PB (Petabyte)
1024 PB = 1 EB (Exabyte)
1024 EB = 1 ZB (Zettabyte)
1024 ZB = 1 YB (Yottabyte)
1024 YB = 1 BB (Brontobyte)
1024 BB = 1 Geopbyte

 Bit is the smallest memory measurement unit.
 Geopbyte is the largest memory measurement unit.
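The 1024-multiple ladder can be applied in code. A small illustrative C sketch that converts a byte count into a human-readable unit:

#include <stdio.h>

int main(void) {
    /* Walk a byte count down the 1024-multiple ladder. */
    const char *units[] = { "B", "KB", "MB", "GB", "TB" };
    double size = 3221225472.0;      /* example: 3 x 1024 x 1024 x 1024 bytes */
    int u = 0;
    while (size >= 1024.0 && u < 4) {
        size /= 1024.0;
        u++;
    }
    printf("%.2f %s\n", size, units[u]);   /* 3.00 GB */
    return 0;
}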

Unit Six – Input Output
The I/O subsystem of a computer provides an efficient mode of communication between the
central system and the outside environment. It handles all the input-output operations of the
computer system.

Peripheral Devices
Input or output devices that are connected to the computer are called peripheral devices. These
devices are designed to read information into or out of the memory unit upon command from the
CPU and are considered to be part of the computer system. These devices are also
called peripherals.
For example: Keyboards, display units and printers are common peripheral devices.
There are three types of peripherals:

1. Input peripherals: Allows user input, from the outside world to the computer. Example:
Keyboard, Mouse etc.
2. Output peripherals: Allows information output, from the computer to the outside world.
Example: Printer, Monitor etc
3. Input-output peripherals: Allow both input (from the outside world to the computer) as
well as output (from the computer to the outside world). Example: touch screen etc.

I/O Interface

The method that is used to transfer information between internal storage and external I/O devices
is known as the I/O interface. Peripherals connected to a computer system communicate with the
CPU through special communication links, which are used to resolve the differences between the
CPU and the peripherals. There are special hardware components between the CPU and the
peripherals, called input-output interface units, that supervise and synchronize all the input and
output transfers.

The communication between CPU and input/output devices is implemented using an interface
unit. In a computer system, data is transferred from an input device to the processor and from the
processor to an output device. Each input and output device is provided with a device controller,
which is used to manage the working of various peripheral devices. Actually, the CPU
communicates with the device controllers for performing the I/O operations.

In the computer system, the interface unit works as an intermediary between the processor and
the device controllers of various peripheral devices. The interface unit accepts the control
commands from the processor and interprets the commands so that they can be easily understood
by the device controllers for performing the required operations. Hence, the interface unit is

responsible for controlling the input and output operations. The processor-to-I/O-device
communication involves two important operations: I/O read and I/O write.

I/O Processor
The input/output processor or I/O processor is a processor that is separate from the CPU and
is designed to handle only input/output processes for a device or the computer.

The I/O processor is capable of performing actions without interruption or intervention from the
CPU. The CPU only needs to initiate the I/O processor by telling it what activity to perform.
Once the necessary actions are performed, the I/O processor provides the results to the
CPU. Performing these actions allows the I/O processor to act as a bus to the CPU, like a CPU
bus, carrying out activities by directly interacting with memory and other devices in the computer.
A more advanced I/O processor may also have memory built into it, allowing it to perform actions
and activities more quickly.

For example, without an I/O processor, a computer would require the CPU to perform all actions
and activities, reducing overall computer performance. However, a computer with an I/O
processor allows the CPU to hand some activities to the I/O processor. While the I/O
processor is performing the necessary actions for those activities, the CPU is free to carry out
other activities, making the computer more efficient and increasing performance.

I/O Read and I/O Write Operations

The I/O read operation helps the CPU to read data from an input device. The sequence of steps
performed while transferring data from an input device to the CPU is as follows:

1. The input device places the data that needs to be transferred on the data bus, which
transfers a single byte of data at a time.
2. The input device then issues a data valid signal through the device control bus to the
data register, showing that the data is present on the data bus.
3. After the data register of the interface unit accepts the data, it issues a data accepted
signal through the device control bus as an acknowledgement to the input device,
showing that the data has been received. Then the input device disables the data valid
signal.
4. The flag bit of the status register is set to 1 because the data register holds the data.
5. Now the CPU issues an I/O read signal to the data register in the interface unit.
6. The data register then places the data on the data bus connected to the CPU. When the
data is received, the CPU sends an acknowledgement signal to the input device, showing
that the data has been received.

The I/O write operation helps the processor to write data to an output device. The sequence
of steps performed while transferring data from the CPU to an output device is as follows:

1. The CPU places the data that needs to be transferred on the data bus connected to the data
register of the interface unit.
2. The CPU also places the address of the output device on the device address bus.
3. The CPU then issues the I/O write signal, which writes the data into the data register. The
data register holds the data, so the flag bit of the status register is set to 1.
4. Now the data register issues a data accepted signal through the control bus to the CPU,
showing that the data has been received.
5. Then the interface unit places the data on the data bus connected to the device controller of
the output device.
6. The output device receives the data and sends an acknowledgement signal to the CPU
through the interface unit, showing that the desired data has been received.

Modes of Transfer
Data transfer between the central unit and I/O devices can be handled in generally three types of
modes which are given below:

1. Programmed I/O
2. Interrupt Initiated I/O
3. Direct Memory Access

Programmed I/O
Programmed I/O instructions are the result of I/O instructions written in a computer program.
Each data item transfer is initiated by an instruction in the program.
Usually the program controls the data transfer to and from the CPU and the peripheral.
Transferring data under programmed I/O requires constant monitoring of the peripherals by the
CPU.
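In software, programmed I/O typically appears as a polling loop: the CPU repeatedly reads a device status register until the device signals it is ready. The C sketch below is a generic illustration; the register addresses and bit mask are invented, since the real values depend entirely on the hardware:

#include <stdint.h>

/* Hypothetical memory-mapped device registers (addresses invented). */
#define DEV_STATUS (*(volatile uint8_t *)0x40001000)
#define DEV_DATA   (*(volatile uint8_t *)0x40001004)
#define READY_BIT  0x01

uint8_t read_byte_polled(void) {
    while ((DEV_STATUS & READY_BIT) == 0)
        ;                     /* busy-wait: the CPU does nothing else meanwhile */
    return DEV_DATA;          /* read the byte once the device is ready */
}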

Interrupt Initiated I/O


In the programmed I/O method, the CPU stays in a program loop until the I/O unit indicates
that it is ready for data transfer. This is a time-consuming process because it keeps the processor
busy needlessly.
This problem can be overcome by using interrupt-initiated I/O: when the interface determines
that the peripheral is ready for data transfer, it generates an interrupt. After receiving the
interrupt signal, the CPU stops the task it is processing, services the I/O transfer, and then
returns to its previous processing task.

Direct Memory Access
Removing the CPU from the path and letting the peripheral device manage the memory buses
directly improves the speed of transfer. This technique is known as DMA.
In DMA, the interface transfers data to and from memory through the memory bus. A DMA
controller manages the data transfer between peripherals and the memory unit.
Many hardware systems use DMA, such as disk drive controllers, graphics cards, network cards,
sound cards, etc. It is also used for intra-chip data transfer in multicore processors. In DMA, the
CPU initiates the transfer, does other operations while the transfer is in progress, and receives
an interrupt from the DMA controller when the transfer has been completed.
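From the CPU's point of view, a DMA transfer amounts to programming the controller and then moving on to other work until the completion interrupt arrives. The sketch below is purely illustrative; all register names and addresses are invented, as real DMA controllers differ widely:

#include <stdint.h>

/* Hypothetical DMA controller registers (names and addresses invented). */
#define DMA_SRC   (*(volatile uint32_t *)0x40002000)
#define DMA_DST   (*(volatile uint32_t *)0x40002004)
#define DMA_LEN   (*(volatile uint32_t *)0x40002008)
#define DMA_CTRL  (*(volatile uint32_t *)0x4000200C)
#define DMA_START 0x01

volatile int dma_done = 0;    /* set to 1 by the completion interrupt handler */

void start_dma(uint32_t src, uint32_t dst, uint32_t len) {
    DMA_SRC  = src;           /* where the data comes from */
    DMA_DST  = dst;           /* where the data goes */
    DMA_LEN  = len;           /* how many bytes to move */
    DMA_CTRL = DMA_START;     /* the controller now transfers without the CPU */
    /* The CPU is free to do other work until dma_done becomes 1. */
}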

Input Output Channels


A channel is an independent hardware component that coordinates all I/O to a set of controllers.
Computer systems that use I/O channels have special hardware components that handle all I/O
operations.
Channels use separate, independent and low-cost processors, called channel processors, for
their functioning.
Channel processors are simple, but contain sufficient memory to handle all I/O tasks. When an
I/O transfer is complete or an error is detected, the channel controller communicates with the
CPU using an interrupt, informing the CPU about the error or the task completion.
Each channel supports one or more controllers or devices. Channel programs contain list of
commands to the channel itself and for various connected controllers or devices. Once the
operating system has prepared a list of I/O commands, it executes a single I/O machine
instruction to initiate the channel program, the channel then assumes control of the I/O
operations until they are completed.

IBM 370 I/O Channel


The I/O processor in the IBM 370 computer is called a Channel. A computer system
configuration includes a number of channels which are connected to one or more I/O devices.

Categories of I/O Channels


Following are the different categories of I/O channels:
Multiplexer
The multiplexer channel can be connected to a number of slow and medium-speed devices. It is
capable of operating several I/O devices simultaneously.
Selector
This channel can handle only one I/O operation at a time and is used to control one high speed
device at a time.

Block-Multiplexer
It combines the features of both multiplexer and selector channels.

The CPU can communicate directly with the channels through control lines. The following
diagram shows the word format of a channel operation.

The computer system may have a number of channels, and each is assigned an address. Each
channel may be connected to several devices, and each device is assigned an address.

Unit Seven – Parallel Processing
Introduction to parallel architectures

The traditional architecture for computers follows the conventional, Von Neumann serial
architecture. Computers based on this form usually have a single, sequential processor. The main
limitation of this form of computing architecture is that the conventional processor is able to
execute only one instruction at a time. Algorithms that run on these machines must therefore be
expressed as a sequential problem. A given task must be broken down into a series of sequential
steps, each to be executed in order, one at a time.

Many problems that are computationally intensive are also highly parallel. An algorithm that is
applied to a large data set characterizes these problems. Often the computation for each element
in the data set is the same and is only loosely reliant on the results from computations on
neighboring data. Thus, speed advantages may be gained from performing calculations in
parallel for each element in the data set, rather than sequentially moving through the data set and
computing each result in a serial manner. Machines with multitudes of processors working on a
data structure in parallel often far outperform conventional computers in such applications.

The grain of the computer is defined as the number of processing elements within the machine.
A coarsely grained machine has relatively few processors, whereas a finely grained machine may
have tens of thousands of processing elements. Typically, the processing elements of a finely
grained machine are much less powerful than those of a coarsely grained computer. The
processing power is achieved through the brute-force approach of having such a large number of
processing elements.

There are several different forms of parallel machine. Each architecture has its own advantages
and limitations, and each has its share of supporters.

Pipelining – Discussion

Multicore computers and Multithreading – Discussion

Flynn's Classification of Computers

M.J. Flynn proposed a classification for the organization of a computer system by the number of
instructions and data items that are manipulated simultaneously.

The sequence of instructions read from memory constitutes an instruction stream.

The operations performed on the data in the processor constitute a data stream.

Parallel processing may occur in the instruction stream, in the data stream, or both.
Flynn's classification divides computers into four major groups that are:
1. Single instruction stream, single data stream (SISD)
2. Single instruction stream, multiple data stream (SIMD)
3. Multiple instruction stream, single data stream (MISD)
4. Multiple instruction stream, multiple data stream (MIMD)

Single-instruction, single-data (SISD) systems

An SISD computing system is a uniprocessor machine capable of executing a single
instruction, operating on a single data stream. In SISD, machine instructions are processed in a
sequential manner and computers adopting this model are popularly called sequential computers.
Most conventional computers have SISD architecture. All the instructions and data to be
processed have to be stored in primary memory.

The speed of the processing element in the SISD model is limited by (i.e., dependent on) the rate
at which the computer can transfer information internally. Dominant representative SISD systems
are IBM PCs and workstations.

Single-instruction, multiple-data (SIMD) systems

An SIMD system is a multiprocessor machine capable of executing the same instruction on all
the CPUs but operating on different data streams. Machines based on the SIMD model are well
suited to scientific computing, since it involves lots of vector and matrix operations. So that
information can be passed to all the processing elements (PEs), the data elements of vectors can
be divided into multiple sets (N sets for N-PE systems), and each PE can process one data set.
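On x86 hardware, the SIMD idea is exposed through vector instructions. The following C sketch (assuming an x86 compiler with SSE support) shows one instruction applying the same addition to four data elements at once:

#include <stdio.h>
#include <xmmintrin.h>   /* SSE intrinsics (x86) */

int main(void) {
    float a[4] = {1, 2, 3, 4}, b[4] = {10, 20, 30, 40}, c[4];
    /* One SIMD instruction applies the same ADD to four data elements. */
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(c, _mm_add_ps(va, vb));
    for (int i = 0; i < 4; i++)
        printf("%.0f ", c[i]);       /* prints: 11 22 33 44 */
    printf("\n");
    return 0;
}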

Multiple-instruction, single-data (MISD) systems

An MISD computing system is a multiprocessor machine capable of executing different
instructions on different PEs, but with all of them operating on the same data set.

Example: Z = sin(x) + cos(x) + tan(x)

The system performs different operations on the same data set. Machines built using the MISD
model are not useful in most applications; a few machines have been built, but none of them
are available commercially.

Multiple-instruction, multiple-data (MIMD) systems

An MIMD system is a multiprocessor machine capable of executing multiple
instructions on multiple data sets. Each PE in the MIMD model has separate instruction and data
streams; therefore, machines built using this model are capable of handling any kind of
application. Unlike SIMD and MISD machines, the PEs in MIMD machines work asynchronously.
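An everyday approximation of MIMD is a multicore machine running POSIX threads: each thread is a separate instruction stream working on its own data, asynchronously. A brief illustrative C sketch:

#include <stdio.h>
#include <pthread.h>

/* Each thread sums its own half of the array: separate instruction and
   data streams that may run asynchronously on different cores. */
static int data[8] = {1, 2, 3, 4, 5, 6, 7, 8};
static long sums[2];

static void *partial_sum(void *arg) {
    long id = (long)arg;
    for (int i = (int)id * 4; i < (int)id * 4 + 4; i++)
        sums[id] += data[i];
    return NULL;
}

int main(void) {
    pthread_t t[2];
    for (long id = 0; id < 2; id++)
        pthread_create(&t[id], NULL, partial_sum, (void *)id);
    for (int id = 0; id < 2; id++)
        pthread_join(t[id], NULL);
    printf("total = %ld\n", sums[0] + sums[1]);   /* 36 */
    return 0;
}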

