
Module 2: Parallel Architectures

Flynn’s Taxonomy
• SISD

• SIMD
– Vector Processors
• MISD
– Systolic arrays
• MIMD
– Multi-core, many-core
SIMD (Single Instruction Multiple Data) Approach
MISD (Systolic Arrays)
• A systolic array is a homogeneous network of tightly coupled data processing units (DPUs), called cells or nodes
• Systolic arrays are often hard-wired for a specific operation, such as "multiply and accumulate", to perform massively parallel integration
SIMT Approach
• Single Instruction Multiple Thread
SIMD, SMT, SIMT???
[Figure: pipeline diagrams comparing a simple OoO core, SMT, SIMD, and a GPU]
What is Common Between SIMD and SIMT?

• Both approach parallelism by broadcasting the same instruction to multiple execution units

• Both share the same fetch/decode hardware
What’s the Difference then?
• SIMT has the following, which SIMD does not:
– Single instruction, multiple register sets

– Single instruction, multiple addresses

– Single instruction, multiple flow paths

Single instruction, multiple register sets
• E.g.: add two vectors of numbers

• In SIMD, the above can be performed by:
– Breaking the data into short vectors
– Looping over them with SIMD instructions
• E.g., using ARM NEON intrinsics:
#include <arm_neon.h>

/* Assumes n is a multiple of 4. */
void add(uint32_t *a, uint32_t *b, uint32_t *c, int n)
{
    for (int i = 0; i < n; i += 4)
    {   /* compute c[i], c[i+1], c[i+2], c[i+3] */
        uint32x4_t a4 = vld1q_u32(a + i);
        uint32x4_t b4 = vld1q_u32(b + i);
        uint32x4_t c4 = vaddq_u32(a4, b4);
        vst1q_u32(c + i, c4);
    }
}
How can the previous example be solved with SIMT?

__global__ void add(float *a, float *b, float *c)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    c[i] = a[i] + b[i];
    // no loop!
}
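For completeness, a hedged sketch of how a kernel like this might be launched from the host. The grid/block sizes are illustrative assumptions (one thread per element, n a multiple of the block size), the pointers are assumed to be device allocations, and error checking is omitted:

```cuda
int threadsPerBlock = 256;
int blocks = n / threadsPerBlock;                  // one thread per array element
add<<<blocks, threadsPerBlock>>>(d_a, d_b, d_c);   // the grid replaces the loop
cudaDeviceSynchronize();                           // wait for the kernel to finish
```

The loop that SIMD code writes explicitly is expressed here in the launch configuration instead: the hardware schedules one thread per element.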
Single instruction, multiple addresses

• Each thread is allowed to access different addresses
Single instruction, multiple flow paths

• Control flow of different threads can thus diverge
