Computer Architecture Unit 9

facilitate high-speed vector processing. The vector processor control is
contained in the stream unit. The string and all logical operations are
performed in the string unit. The memory interface provides the read and
write ports of central memory for the scalar and vector processors. Each
port contains a one-SWORD (512-bit Super Word) buffer to facilitate high
transfer rates. The CPU processes input and output by issuing relatively
simple high-level messages to high-speed peripheral stations or a front-end
processor connected to the input/output ports.
9.3.2 Vector register architecture
In a vector-register processor, all vector operations except load and store
operate among the vector registers. Such architectures are the vector
equivalent of a load-store architecture. Since the late 1980s, all major
vector computers have used a vector-register architecture, including the
Cray Research processors (Cray-1, Cray-2, X-MP, Y-MP, C90, T90 and SV1),
Japanese supercomputers (NEC SX/2 through SX/5, Fujitsu VP200 through
VPP5000, and the Hitachi S820 and S-8300), and the mini-supercomputers
(Convex C-1 through C-4).
In a memory-memory vector processor, all vector operations are memory to
memory. The earliest vector computers, including CDC's vector computers,
were of this kind. Vector-register architectures have several benefits over
memory-memory vector architectures. A memory-memory architecture must write
all intermediate results to memory and later read them back. A
vector-register architecture can keep intermediate results in the vector
registers, close to the vector functional units, which reduces temporary
storage requirements, inter-instruction latency and memory bandwidth
requirements.
If a vector result is needed by several other vector instructions, a
memory-memory architecture must read it from memory each time, while a
vector-register machine can reuse the value held in the vector registers,
further reducing memory bandwidth requirements. For these reasons,
vector-register machines have proved more effective in practice.
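The bandwidth argument can be made concrete by counting memory accesses. The Python sketch below is illustrative only: the expression D = (A + B) * C, the function names and the cost model (one memory access per vector element read or written) are assumptions chosen for the example, not figures from any particular machine.

```python
def memory_memory_traffic(n):
    """Element accesses for D = (A + B) * C on a memory-memory machine."""
    # T = A + B: read A, read B, write the intermediate T to memory.
    add_traffic = 3 * n
    # D = T * C: read T back, read C, write D.
    mul_traffic = 3 * n
    return add_traffic + mul_traffic


def vector_register_traffic(n):
    """Element accesses for the same expression on a vector-register machine."""
    loads = 3 * n   # load A, B and C into vector registers once
    stores = n      # store only the final result D
    # The intermediate A + B stays in a vector register: no memory traffic.
    return loads + stores


n = 64  # assumed vector length
print(memory_memory_traffic(n))    # 384
print(vector_register_traffic(n))  # 256
```

Under these assumptions the saving comes entirely from the intermediate result, which is written and re-read in the memory-memory case but never leaves the registers in the vector-register case.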
Components of a vector register processor: The major components of
the vector unit of a vector register machine are as given below:

Manipal University of Jaipur B1648 Page No. 201


1. Vector registers: There are several vector registers, so that different
vector operations can be overlapped. Each vector register is a
fixed-length bank that holds one vector of multiple elements, each
64 bits long. There are also several read and write ports. A pair of
crossbars connects these ports to the inputs and outputs of the
functional units.
2. Scalar registers: The scalar registers are also linked to the functional
units through the pair of crossbars. They are used for purposes such as
computing addresses to pass to the vector load/store unit and buffering
input data for the vector registers.
3. Vector functional units: These units are generally floating-point units
that are fully pipelined. They can start a new operation on every clock
cycle. They include all the operation units used by the vector
instructions.
4. Vector load and store unit: This unit can also be pipelined and can
perform overlapped but independent transfers to or from the vector registers.
5. Control unit: This unit decodes instructions and coordinates the
functional units. It detects data hazards as well as structural hazards.
Data hazards are conflicts in register accesses, while structural hazards
(also called functional hazards) are conflicts over functional units.
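The benefit of a fully pipelined functional unit, one that can start a new operation on every clock cycle, can be sketched with a simple cycle count. This is an idealised model: the pipeline depth of 6 stages and the function names are hypothetical, and start-up overhead, memory stalls and hazards are ignored.

```python
def pipelined_cycles(n_elements, depth):
    """Cycles to process n elements on a fully pipelined unit."""
    if n_elements == 0:
        return 0
    # The first result appears after `depth` cycles; each later element
    # completes one cycle after the previous one.
    return depth + (n_elements - 1)


def unpipelined_cycles(n_elements, depth):
    """Cycles if each operation had to finish before the next one starts."""
    return depth * n_elements


print(pipelined_cycles(64, 6))    # 69
print(unpipelined_cycles(64, 6))  # 384
```

For long vectors the pipelined count approaches one result per cycle, which is why the functional units are built this way.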
Figure 9.1 shows the functional units of a vector processor described
above.

[Figure: block diagram with the following labelled blocks: Main Memory; Vector Load/Store; Vector Registers; Scalar Registers; functional units FP add/subtract, FP multiply, FP divide, Integer and Logical]

Figure 9.1: Vector Register Architecture


Types of Vector Instructions: The various types of vector instructions for a
register-register vector processor are:
(a) Vector-scalar instructions
(b) Vector-vector instructions
(c) Vector-memory instructions
(d) Gather and scatter instructions
(e) Masking instructions
(f) Vector reduction instructions
Let us discuss these.
(a) Vector-scalar instructions: Using these instructions, a scalar operand
can be combined with a vector operand. If A and B are vector registers and
f is a function that performs some operation on each element of one or
two vector operands, a vector-scalar operation can be defined as follows:
Ai := f(scalar, Bi)
(b) Vector-vector instructions: Using these instructions, one or two
vector operands are fetched from their respective vector registers and the
result is produced in another vector register. If A, B and C are three
vector registers, a vector-vector operation can be defined as follows:
Ai := f(Bi, Ci)
(c) Vector-memory instructions: These instructions correspond to vector
load or vector store. The vector load can be defined as follows:
A := f(M), where M is a memory register
The vector store can be defined as follows:
M := f(A)
(d) Gather and scatter instructions: Gather is an operation that fetches
the non-zero elements of a sparse vector from memory, as defined below:
A × V0 := f(M)
Here V0 is an index vector holding the positions of the non-zero elements.
Scatter stores the elements of a vector into a sparse vector in memory, as
defined below:
M := f(A × V0)


(e) Masking instructions: These instructions use a mask vector to
expand or compress a vector as defined below:
V := f(A × VM), where VM is a mask vector
(f) Vector reduction instructions: These instructions accept one or two
vectors as input and produce a scalar as output.
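The six instruction types (a) to (f) can be modelled in a few lines of Python, with lists standing in for vector registers and for memory. This is a behavioural sketch only: all function names are hypothetical, gather and scatter are assumed to take an explicit list of memory positions (an index vector), and nothing here reflects the instruction encoding of a real machine.

```python
import operator


def vector_scalar(f, scalar, b):
    """(a) Ai := f(scalar, Bi)"""
    return [f(scalar, bi) for bi in b]


def vector_vector(f, b, c):
    """(b) Ai := f(Bi, Ci)"""
    return [f(bi, ci) for bi, ci in zip(b, c)]


def vector_load(memory, base, length):
    """(c) Load `length` consecutive memory words into a vector register."""
    return memory[base:base + length]


def vector_store(memory, base, a):
    """(c) Store a vector register into consecutive memory words."""
    memory[base:base + len(a)] = a


def gather(memory, index_vector):
    """(d) Fetch only the memory words named by the index vector."""
    return [memory[i] for i in index_vector]


def scatter(memory, index_vector, a):
    """(d) Write each element of `a` to the position named by the index vector."""
    for i, ai in zip(index_vector, a):
        memory[i] = ai


def compress(a, mask):
    """(e) Keep the elements of `a` whose mask bit is set."""
    return [ai for ai, m in zip(a, mask) if m]


def expand(packed, mask, fill=0):
    """(e) Spread `packed` back out to the positions whose mask bit is set."""
    it = iter(packed)
    return [next(it) if m else fill for m in mask]


def reduce_sum(a):
    """(f) Reduce one vector to a scalar."""
    return sum(a)


def dot(b, c):
    """(f) Reduce two vectors to a scalar."""
    return reduce_sum(vector_vector(operator.mul, b, c))


memory = [0, 5, 0, 0, 7, 0, 9, 0]   # a sparse vector held in memory
b, c = [1, 2, 3, 4], [10, 20, 30, 40]

print(vector_scalar(operator.mul, 2, b))     # [2, 4, 6, 8]
print(vector_vector(operator.add, b, c))     # [11, 22, 33, 44]
print(vector_load(memory, 4, 4))             # [7, 0, 9, 0]
print(gather(memory, [1, 4, 6]))             # [5, 7, 9]
print(compress([7, 0, 9, 0], [1, 0, 1, 0]))  # [7, 9]
print(expand([7, 9], [1, 0, 1, 0]))          # [7, 0, 9, 0]
print(dot(b, c))                             # 300
```

Note that gather followed by scatter with the same positions restores the sparse layout, and compress followed by expand with the same mask does the same for a masked vector.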
Vector processor implementation (CRAY-1): CRAY-1 is one of the earliest
processors to implement vector processing and is often regarded as the
world's first vector supercomputer. It was introduced in 1975 by
Seymour Cray. It is essentially a register-oriented, RISC-like machine that
requires all operands to be in registers. It has five kinds of registers:
(a) A registers: A set of 8 registers, each 24 bits wide
(b) B registers: A set of 64 registers, each 24 bits wide
(c) S registers: A set of 8 registers, each 64 bits wide
(d) T registers: A set of 64 registers, each 64 bits wide
(e) Vector registers: A set of 8 floating-point registers, each holding 64 elements
There are 12 functional units in CRAY-1:
(a) 2 units, each 24 bits wide, for address calculation
(b) 4 integer scalar units, each 64 bits wide, for integer operations
(c) 6 deeply pipelined units for vector operations
CRAY-1 uses 16-bit instructions, and every vector operation can be expressed
in a single 16-bit instruction. The block diagram of the CRAY-1 architecture
is shown in Figure 9.2.
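To get a feel for the relative sizes of the five register sets listed above, the short Python sketch below totals their storage. The table name and helper function are hypothetical, and each vector element is assumed to be 64 bits wide, matching the element size given earlier for vector registers in general.

```python
# name: (number of registers, elements per register, bits per element)
CRAY1_REGISTERS = {
    "A": (8, 1, 24),
    "B": (64, 1, 24),
    "S": (8, 1, 64),
    "T": (64, 1, 64),
    "V": (8, 64, 64),  # assumption: 64-bit vector elements
}


def storage_bits(count, elements, bits):
    """Total bits held by one register set."""
    return count * elements * bits


sizes = {name: storage_bits(*spec) for name, spec in CRAY1_REGISTERS.items()}
for name, bits in sizes.items():
    print(name, bits)

print("total", sum(sizes.values()))  # total 39104
```

Under these assumptions the eight vector registers account for 32768 of the 39104 bits, far more than all the scalar and address registers combined.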


[Figure: block diagram with the following labelled blocks: Vector Unit; VM; VL; FP Units; Scalar Unit; RTC; T00-T63; B00-B63; Address Registers; Address Unit; 16-way Interleaved Memory; PC; Instruction Buffers; NIP; CIP; LIP]

Figure 9.2: Architecture of CRAY-1

Self Assessment Questions


3. _________________ is a modern shared-memory multiprocessor
version of the CDC Cyber 205 _________________.
4. Memory-memory vector processors can prove to be quite efficient
if the vectors are sufficiently long. (True/False)
5. The scalar registers are linked to the functional units with the help of a
pair of ____________________.
6. ______________ correspond to vector load or vector store.
7. Functional hazards are the conflicts in register accesses. (True/False)
