UNIT-3
Basic Multiprocessor Architecture
Flynn’s classification
UMA, NUMA
Loosely Coupled and Tightly Coupled System
Centralized Shared Memory Architecture and Distributed Shared Memory Architecture
Array Processor
Vector Processor
A multiprocessor system is defined as "a system with more than one processor", and, more precisely, "a
number of central processing units linked together to enable parallel processing to take place".
The key objective of a multiprocessor is to boost a system's execution speed.
Stream:
A sequence or flow of items, either instructions or data, operated on by the computer.
It is of two types:
o Instruction stream
o Data stream
Instruction Stream:
In the complete cycle of instruction execution, a flow of instructions from main memory to the CPU is
established. This flow of instructions is called instruction stream.
Data Stream:
The bidirectional flow of operands between the processor and memory is called the data stream.
In other words, the sequence of instructions executed by the CPU forms the instruction stream, and the sequence of
data (operands) required to execute those instructions forms the data stream.
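Counting these two kinds of streams is the basis of Flynn's classification listed in the unit outline. The following is a minimal sketch of that idea; the function name `flynn_class` is hypothetical and chosen only for illustration.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn category for the given number of streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("a machine needs at least one stream of each kind")
    # "S" for a single stream, "M" for multiple streams
    i = "S" if instruction_streams == 1 else "M"
    d = "S" if data_streams == 1 else "M"
    return f"{i}I{d}D"


print(flynn_class(1, 1))  # one instruction stream, one data stream
print(flynn_class(1, 8))  # one instruction stream, many data streams
print(flynn_class(4, 4))  # many instruction streams, many data streams
```

The three calls correspond to the SISD, SIMD and MIMD models discussed in these notes.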
Parallel computing is a form of computing in which a job is broken into discrete parts that can be executed
concurrently.
Each part is further broken down into a series of instructions.
Instructions from each part execute simultaneously on different CPUs.
Parallel systems deal with the simultaneous use of multiple computer resources that can include a single
computer with multiple processors, a number of computers connected by a network to form a parallel processing
cluster or a combination of both.
Parallel systems are more difficult to program than computers with a single processor because parallel
architectures vary widely and the processes running on multiple CPUs must be coordinated and
synchronized.
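The idea of breaking a job into parts, running the parts concurrently, and then coordinating the results can be sketched as follows. This is only an illustration under stated assumptions: the helper names `split` and `parallel_sum_of_squares` are hypothetical, and a thread pool from Python's standard library stands in for multiple CPUs. In standard CPython, threads show the structure of the approach rather than a real speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def split(job, parts):
    """Break a list into `parts` contiguous pieces of nearly equal size."""
    size, extra = divmod(len(job), parts)
    pieces, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        pieces.append(job[start:end])
        start = end
    return pieces


def parallel_sum_of_squares(job, workers=4):
    """Sum x*x over `job`, with each worker handling one piece."""
    pieces = split(job, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker processes its own piece concurrently.
        partials = list(pool.map(lambda p: sum(x * x for x in p), pieces))
    # Coordination step: combine the partial results.
    return sum(partials)


print(parallel_sum_of_squares(list(range(10))))
```

The final `sum(partials)` is the point where the separate parts must be coordinated, which is the extra programming burden the text refers to.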
The speed of the processing element in the SISD (Single Instruction, Single Data) model is limited by the rate at
which the computer can transfer information internally. Representative SISD systems include the IBM PC and
conventional workstations.
An SIMD system is a multiprocessor machine capable of executing the same instruction on all the
CPUs, with each CPU operating on a different data stream.
Machines based on the SIMD model are well suited to scientific computing, since such workloads involve many
vector and matrix operations.
So that the data can be distributed across all the processing elements (PEs), the elements
of a vector are divided into multiple sets (N sets for an N-PE system), and each PE processes one
data set.
A representative SIMD system is Cray's vector processing machine.
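The division of a vector into N data sets, one per PE, can be simulated in a few lines. This is a sequential sketch only, not real SIMD hardware; the function name `simd_apply` and the interleaved assignment of elements to PEs are assumptions made for illustration.

```python
def simd_apply(op, vector, num_pes):
    """Apply one operation to a vector split across `num_pes` PEs."""
    # Data set k holds elements k, k+N, k+2N, ... (one set per PE).
    data_sets = [vector[k::num_pes] for k in range(num_pes)]
    # Every PE executes the same operation on its own data set.
    results = [[op(x) for x in data_set] for data_set in data_sets]
    # Write each PE's results back to the original element positions.
    out = [None] * len(vector)
    for k, res in enumerate(results):
        out[k::num_pes] = res
    return out


print(simd_apply(lambda x: x * 2, [1, 2, 3, 4, 5], num_pes=2))
```

On a real SIMD machine the inner list comprehensions would run at the same time on separate PEs under one control unit.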
In the shared memory MIMD model (tightly coupled multiprocessor systems), all the PEs are connected to a
single global memory and they all have access to it.
Communication between PEs in this model takes place through the shared memory: a modification of the
data stored in the global memory by one PE is visible to all other PEs. Representative shared
memory MIMD systems are Silicon Graphics machines and Sun/IBM's SMP (Symmetric Multi-Processing) systems.
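Communication through a single shared memory can be illustrated with threads that all update one shared location. This is a sketch under assumptions: threads stand in for PEs, a dictionary stands in for global memory, and the name `run_shared_memory` is hypothetical. The lock is there because concurrent updates to the same shared location must be synchronized.

```python
import threading


def run_shared_memory(num_pes=4, increments=1000):
    """Let several 'PEs' update one location in a shared memory."""
    shared = {"counter": 0}   # stands in for the single global memory
    lock = threading.Lock()   # serializes updates to the shared location

    def pe():
        for _ in range(increments):
            with lock:
                shared["counter"] += 1  # visible to every other PE

    threads = [threading.Thread(target=pe) for _ in range(num_pes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["counter"]


print(run_shared_memory())
```

No data is ever sent between the PEs explicitly; each one simply reads and writes the same memory.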
In distributed memory MIMD machines (loosely coupled multiprocessor systems), each PE has its own local
memory. Communication between PEs in this model takes place through the interconnection network (the
inter-process communication channel, or IPC). The network connecting the PEs can be configured as a tree, a mesh,
or any other topology the application requires.
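By contrast with the shared memory model, PEs here exchange data only by sending messages. The sketch below simulates this with standard-library queues acting as the network links; threads stand in for PEs, and the name `run_message_passing` is hypothetical.

```python
import queue
import threading


def run_message_passing(data, num_pes=3):
    """Sum `data` using PEs that only communicate through messages."""
    inboxes = [queue.Queue() for _ in range(num_pes)]  # one link per PE
    results = queue.Queue()                            # link back to the sender

    def pe(rank):
        local_memory = inboxes[rank].get()  # receive a private copy of the data
        results.put((rank, sum(local_memory)))  # send the partial result back

    threads = [threading.Thread(target=pe, args=(r,)) for r in range(num_pes)]
    for t in threads:
        t.start()
    for rank in range(num_pes):
        # Each PE gets its own copy of its share; nothing is shared.
        inboxes[rank].put(list(data[rank::num_pes]))
    for t in threads:
        t.join()
    partials = dict(results.get() for _ in range(num_pes))
    return sum(partials.values())


print(run_message_passing(list(range(10))))
```

Each PE works only on the copy held in its `local_memory`, which mirrors the loosely coupled organisation described above.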
MULTIPROCESSOR
The attached array processor is intended to improve the performance of the host computer in specific numeric
computations.
The ILLIAC IV computer, manufactured by the Burroughs Corporation, is the best-known SIMD array
processor. Single Instruction Multiple Data (SIMD) processors are highly specialized computers. They are only
good for numerical problems that can be stated as vectors or matrices; they are not suitable for other kinds of
computations.
Configurations of SIMD
1. Array processors that use RAM (Random Access Memory), also known as Dedicated Memory
Organisation. Examples:
o ILLIAC-IV
o CM-2
o MP-1
2. Associative processors that use content-addressable memory, also known as Global Memory Organisation. Example:
o BSP
Usage of Array Processors
Array processors increase the overall instruction-processing speed of the system.
Most array processors are designed to optimize repetitive arithmetic operations, making them
faster at vector arithmetic than the host CPU. Since most array processors run asynchronously from the
host CPU, the system's overall capacity is also improved.
Array processors have their own local memory, which provides additional memory to the system.
This is an essential consideration for systems with limited physical memory or address
space.
Array processors are extremely useful for problems that exhibit a high degree of parallelism. However,
they do require a change in programming methodology. Converting conventional (sequential) programs to
run on array processors is complex, and different (parallel) algorithms may be needed to match the parallel
approach.
Vector Processor
A vector processor is a central processing unit that can operate on an entire vector of input with
a single instruction.
More specifically, it is a complete unit of hardware resources that processes a sequence of similar
data items in memory using a single instruction.
The elements of a vector are stored in order, at successive memory addresses.
This is why we say that it processes the data sequentially.
It has a single control unit but multiple execution units that perform the same operation on different data
elements of the vector.
Unlike scalar processors, which operate on only a single pair of operands at a time, a vector processor operates on
multiple pairs of operands. Scalar code can be converted into vector code; this conversion process is known as
vectorization. So, vector processing allows a single instruction to operate on multiple data
elements.
These instructions are called single instruction multiple data (SIMD) or vector instructions. Recent CPUs
make use of vector processing because it offers advantages over scalar processing.
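The difference between scalar and vector code can be made concrete by counting how many instructions each style would issue for the same element-wise addition. This is a simplified model, not real hardware: the names `scalar_add` and `vector_add` and the vector length of 4 are assumptions for illustration.

```python
def scalar_add(a, b):
    """Model a scalar processor: one add instruction per pair of operands."""
    result, issued = [], 0
    for x, y in zip(a, b):
        result.append(x + y)
        issued += 1
    return result, issued


def vector_add(a, b, vector_length=4):
    """Model a vector processor: one instruction per group of operand pairs."""
    result, issued = [], 0
    for start in range(0, len(a), vector_length):
        chunk_a = a[start:start + vector_length]
        chunk_b = b[start:start + vector_length]
        # One modeled vector instruction handles the whole chunk.
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
        issued += 1
    return result, issued


a, b = list(range(10)), [10] * 10
print(scalar_add(a, b)[1], "scalar instructions")
print(vector_add(a, b)[1], "vector instructions")
```

Both functions produce the same result vector; only the number of modeled instructions differs, which is the saving that vectorization aims for.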
Architecture and Working
The figure below represents the typical diagram showing vector processing by a vector computer:
Because the vector computer has several functional pipelines, it can execute instructions over multiple operands.
Both data and instructions are stored in memory at their respective locations. The instruction processing unit
(IPU) fetches the instruction from memory.
Once the instruction is fetched, the IPU determines whether it is scalar or vector in nature. If
it is scalar, the instruction is transferred to the scalar register and scalar processing is then
performed.
When the instruction is vector in nature, it is fed to the vector instruction controller. The vector
instruction controller first decodes the vector instruction and then determines the address of the vector
operand in memory.
It then signals the vector access controller that the operand is required. The vector
access controller fetches the desired operand from memory. Once the operand is fetched, it is
provided to the instruction register so that it can be processed by the vector processor.
When multiple vector instructions are present, the vector instruction controller passes them
to the task system. If the task system finds that a vector task is very
long, the processor divides the task into subvectors.
These subvectors are fed to the vector processor, which uses several pipelines to execute the
instruction over the operands fetched from memory at the same time.
The various vector instructions are scheduled by the vector instruction controller.
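The flow described above (decide scalar or vector, fetch the operand, split long vectors into subvectors) can be traced with a small simulation. Everything here is a hypothetical model for illustration: the instruction format `(kind, operation, address)`, the names `execute` and `split_into_subvectors`, and the maximum vector length of 4 are assumptions, not details taken from any particular machine.

```python
MAX_VECTOR_LENGTH = 4  # assumed hardware limit on one vector task


def split_into_subvectors(operand, max_len=MAX_VECTOR_LENGTH):
    """Divide a long vector operand into subvectors of at most `max_len`."""
    return [operand[i:i + max_len] for i in range(0, len(operand), max_len)]


def execute(program, memory):
    """Route each instruction as the text describes and return a trace."""
    trace = []
    for kind, op, address in program:
        operand = memory[address]  # operand fetched from memory
        if kind == "scalar":
            trace.append(("scalar unit", op, operand))
        elif kind == "vector":
            # A long vector task is divided into subvectors.
            for sub in split_into_subvectors(operand):
                trace.append(("vector pipeline", op, sub))
        else:
            raise ValueError(f"unknown instruction kind: {kind}")
    return trace


memory = {"x": 7, "v": [1, 2, 3, 4, 5, 6]}
program = [("scalar", "neg", "x"), ("vector", "neg", "v")]
for step in execute(program, memory):
    print(step)
```

The six-element vector is longer than the assumed limit, so it appears in the trace as two subvector tasks.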
Classification of Vector Processor
The classification of vector processors relies on the ability to form vectors as well as the presence of vector
instructions for processing. Depending on these criteria, vector processing is classified as follows: