
PARALLEL ARCHITECTURES

A “close to the hardware” perspective


Parallel architectures. Flynn’s taxonomy

According to Flynn, computer architectures can be categorized according to:
• The number of instruction streams [PROCESSORS] (Single Instruction, SI, vs. Multiple Instruction, MI)
• The number of data streams (Single Data, SD, vs. Multiple Data, MD)
There are four different possibilities:
• SISD
• SIMD
• MISD
• MIMD
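
As a minimal sketch of how the two criteria combine, the four classes can be derived from the number of instruction streams and data streams. The function name `flynn_class` is only an illustration and is not part of the taxonomy itself.

```python
def flynn_class(instruction_streams, data_streams):
    """Return the Flynn class (SISD, SIMD, MISD or MIMD) for the given counts."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    # One stream means "Single" (S), more than one means "Multiple" (M).
    instructions = "S" if instruction_streams == 1 else "M"
    data = "S" if data_streams == 1 else "M"
    return instructions + "I" + data + "D"
```

For instance, one instruction stream applied to many data streams gives `flynn_class(1, 1000) == "SIMD"`.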
SISD

SISD: The common Von Neumann model used in single-processor computers. No parallelism is exploited in either the instruction or the data streams.
SIMD

SIMD: A single instruction stream is concurrently broadcast to multiple processors, each one operating on its own data stream.

Exploiting this parallelism is often done by the compiler (loop parallelization...)

Examples: vector processors specialized in signal processing (e.g. image processing) and GPUs (Graphics Processing Units).
MISD

MISD: Multiple instructions operate on a single data stream. This is a very uncommon architecture, mostly cited for the sake of completeness.
MIMD

MIMD: Each processing element has its own stream of instructions operating on its own data.

Some type of inter-processor communication may be required.

The vast majority of modern parallel systems fall into this category.
MIMD

MIMD systems can be further categorized according to memory organization:

• Shared-memory systems: all processes share a single address space (memory).
  - Symmetric multiprocessors (SMPs): memory is common and access speed is uniform for all processors.
  - Non-uniform memory access systems (NUMA systems): memory is common, but some blocks are physically closer to certain processors.
• Distributed-memory systems: each process has its own address space and communicates with other processes by sending and receiving messages (message passing).
  - Massively parallel processors (MPP): processors and network infrastructure are tightly coupled.
  - Clusters: off-the-shelf computers interconnected by an off-the-shelf network.
• Hybrid systems: e.g. clusters of nodes, where each node contains several processors that share memory (clusters of SMPs).
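
The difference between the two programming styles can be sketched in a few lines. This is only an illustration under loud assumptions: both functions use Python threads, which always share one address space, so `message_passing_sum` merely imitates the message-passing discipline (workers touch no common variable and only send results). A real distributed-memory system would use separate processes or machines. The function names are hypothetical.

```python
import queue
import threading


def shared_memory_sum(chunks):
    """Shared-memory style: every worker updates one common variable."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)
        with lock:  # the shared variable must be protected from concurrent updates
            total += partial

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def message_passing_sum(chunks):
    """Message-passing style: workers keep their data private and send results."""
    mailbox = queue.Queue()

    def worker(chunk):
        mailbox.put(sum(chunk))  # the only interaction is sending a message

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The coordinator receives exactly one message per worker.
    return sum(mailbox.get() for _ in range(len(chunks)))
```

Both functions return the same result for the same input; what changes is whether the workers coordinate through a protected shared variable or through explicit messages.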
MIMD/SHARED MEMORY/SMP
Image and text taken from Wikipedia

Most common multiprocessor systems today use an SMP architecture. In the case of multi-core processors, the SMP architecture applies to the cores, treating them as separate processors.
HOW DOES A CORE WORK?
HYPERTHREADING?
MIMD/SHARED MEMORY/NUMA
Keep close to you what you use most often.

Some blocks of memory may be physically closer to some processors than to others. This reduces the memory bandwidth bottleneck, but the access time from a processor to a memory location can differ significantly depending on how "close" the memory location is to the processor. To mitigate the effects of non-uniform access, each processor may have a cache.
MIMD/DISTRIBUTED MEMORY

If the interconnection network and the processors are tightly coupled, the system may be regarded as an MPP; otherwise it may be regarded as a cluster.
The Bingxing problem

Once upon a time, long before there were computers, an intelligent Chinese princess wanted to get married. She announced to the generals of her father, then the emperor of China, that the general who was younger than 30 and could find all the prime factors of 368788194383 within one month would become the prince. Before the deadline, a young general called Bingxing (meaning "parallel") brought her 7, 17, 23, 257 and 524287.

She was very happy because these numbers are prime and
7 x 17 x 23 x 257 x 524287 = 368788194383

Then, she asked the general how he had found the numbers…
The Bingxing problem

1. He first calculated 607280 = CEILING(SQRT(368788194383)). He then gave each of the numbers between 2 and 607280 to one of his 607279 soldiers, and ordered them to check whether the numbers they got were factors of 368788194383.
2. 20 minutes later, the soldiers who got 7, 17, 23, 119, 161, 257, 391, 1799, 2737, 4369, 5911, 30583, 41377, 100487 and 524287 reported that their numbers were factors.
3. Then General Bingxing himself purged that list, removing the non-prime numbers.
4. So he came to the princess with these numbers.

5. The end of the story is left to your imagination.
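
The general's procedure can be replayed as a short serial sketch. The soldiers are simulated one after another by a plain loop, and the helper `is_prime` is an assumption added for the purge step; the story does not say how the general tested primality.

```python
import math

N = 368788194383


def is_prime(n):
    """Trial-division primality test, good enough for the small factors here."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


# Step 1: CEILING(SQRT(N)) and one candidate number per soldier.
limit = math.isqrt(N)
if limit * limit < N:
    limit += 1                      # 607280
candidates = range(2, limit + 1)    # 607279 numbers, one per soldier

# Step 2: every soldier checks whether their own number divides N.
factors = [c for c in candidates if N % c == 0]

# Step 3: the general purges the non-prime numbers from the list.
prime_factors = [f for f in factors if is_prime(f)]  # [7, 17, 23, 257, 524287]

# Step 4: the princess's check that the primes multiply back to N.
assert math.prod(prime_factors) == N
```

Note that searching only up to the square root finds every prime factor here because all five happen to lie below 607280; in general one prime factor of a number may exceed its square root and would have to be recovered by division.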


Which was the architecture?

Yes, General Bingxing “used” a SIMD architecture.

The same instruction was given to each soldier: determine whether your number is a factor or not (SI).
But each soldier received a different number (MD).
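
The "same instruction, different data" pattern can be sketched by handing one function to several workers, each with its own slice of the numbers. This is a hedged illustration only: the names `check_range` and `find_factors` and the choice of 8 workers are assumptions, and threads in a standard CPython build do not execute this arithmetic truly in parallel, so the sketch shows the structure of the work rather than a speed-up.

```python
from concurrent.futures import ThreadPoolExecutor

N = 368788194383
LIMIT = 607280  # CEILING(SQRT(N)), as computed by the general


def check_range(start, stop):
    """The single instruction: report which numbers in [start, stop) divide N."""
    return [c for c in range(start, stop) if N % c == 0]


def find_factors(workers=8):
    """Give every worker the same instruction but a different block of numbers."""
    count = LIMIT - 1                      # candidates 2..LIMIT
    step = -(-count // workers)            # ceiling division: block size
    starts = list(range(2, LIMIT + 1, step))
    stops = [min(s + step, LIMIT + 1) for s in starts]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps the blocks in order, so the result is sorted.
        parts = list(pool.map(check_range, starts, stops))
    return [f for part in parts for f in part]
```

Calling `find_factors()` returns the same 15 factors the soldiers reported, regardless of how many workers share the numbers.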
