
Advanced Computer Architecture

Dr. Saima Farhan

Fall 2018 Semester


Course Outline
Parallel processing:
Basic concepts, Types and levels of parallelism, Classification of parallel architectures, Basic
parallel techniques

ILP processors:
Evolution, Dependencies, Scheduling, Preservation, Speed-up

Pipelined processors:
Basic concepts, Design space of pipelines, Overview of pipelined instruction processing

VLIW processors:
Architectures, Basic Principles

Superscalar Processors:
Introduction, Parallel decoding, Superscalar instruction issue, Register renaming, Parallel
execution, Preserving sequential consistency of instruction execution

Processing of control transfer instructions:
Introduction, Basic approaches to branch handling, Delayed branching and branch processing, Multiway branching

Parallel computing and Cache coherence:
Why parallel architecture, Convergence of parallel architectures, Fundamental design issues

Parallel Programs:
The Parallelization Process, Parallelization of an Example program

Shared memory multiprocessors:
Cache coherence, Memory consistency, Design space for snooping protocols, Synchronization

System Interconnect Architectures:
Network properties and routing, Static and dynamic connection networks, Multiprocessor system interconnect

Data parallel architectures:
Introduction, Connectivity

SIMD architectures:
Fine-grained SIMD, Coarse-grained architectures, Multithreaded architectures: Computational models, Data flow architectures

Recent architectural trends:
Multi-core system organization, Multi-core memory issues
Introduction to Parallel Processing

• Basic concepts
• Types and levels of parallelism
• Classification of parallel architectures
• Basic parallel techniques

Basic concepts

• The concept of program
  – ordered set of instructions (programmer's view)
  – executable file (operating system's view)
The concept of process

• From the OS's view, a process relates to the execution of a program

• Process creation
  – setting up the process description
  – allocating an address space
  – loading the program into the allocated address space, and
  – passing the process description to the scheduler

• Process states
  – ready to run
  – running
  – wait
Process spawning (independent processes)

[Figure: a process tree in which a parent process spawns independent child processes B, C, D and E]
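
A minimal POSIX sketch of process spawning (an illustrative addition, not from the slides): fork() creates an independent child process with its own address space, which the parent later reaps.

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void) {
        pid_t pid = fork();            /* spawn an independent child process */
        if (pid == 0) {
            /* child: runs in its own address space with its own process
               description, scheduled independently of the parent */
            printf("child %d running\n", getpid());
            _exit(EXIT_SUCCESS);
        } else if (pid > 0) {
            waitpid(pid, NULL, 0);     /* parent waits for the child to end */
            printf("parent %d reaped child\n", getpid());
        }
        return 0;
    }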
The concept of thread

• smaller chunks of code (lightweight)
• threads are created within, and belong to, a process
• for parallel thread processing, scheduling is performed on a per-thread basis
• finer granularity, hence less overhead when switching from thread to thread
A process may be single-threaded or contain multiple (dependent) threads.

[Figure: thread tree, with a process at the root and its threads branching beneath it]
Three basic methods for creating and terminating threads

1. unsynchronized creation and unsynchronized termination
   • calling library functions: CREATE_THREAD, START_THREAD
2. unsynchronized creation and synchronized termination
   • FORK and JOIN
3. synchronized creation and synchronized termination
   • COBEGIN and COEND
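
Method 2 maps directly onto POSIX threads: pthread_create() plays the role of FORK (unsynchronized creation) and pthread_join() the role of JOIN (synchronized termination). A minimal sketch, assuming a POSIX system:

    #include <pthread.h>
    #include <stdio.h>

    /* the activity started by the FORK */
    static void *worker(void *arg) {
        printf("worker %ld running\n", (long)arg);
        return NULL;
    }

    int main(void) {
        pthread_t t;
        pthread_create(&t, NULL, worker, (void *)1L);  /* FORK */
        /* the creating thread continues concurrently with the worker */
        pthread_join(t, NULL);            /* JOIN: wait for termination */
        return 0;
    }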
Processes and threads in languages

• Black box view (T: thread)

[Figure: (a) FORK-JOIN graph: T0 forks T1 and T2, which later JOIN back into T0; (b) COBEGIN-COEND graph: threads T0 ... Tn start together at COBEGIN and all terminate at COEND]
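
As a modern analogue of COBEGIN/COEND, OpenMP's sections construct starts its sections together and completes only when all of them have finished. This sketch is an illustrative addition, not from the slides (compile with -fopenmp):

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        #pragma omp parallel sections   /* COBEGIN: sections may start in parallel */
        {
            #pragma omp section
            printf("section A on thread %d\n", omp_get_thread_num());
            #pragma omp section
            printf("section B on thread %d\n", omp_get_thread_num());
        }                               /* COEND: implicit barrier here */
        return 0;
    }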
The concepts of concurrent and parallel execution

[Figure: N-client 1-server model with concurrent execution: the server handles clients one at a time over time t (sequential nature), in contrast to parallel execution, where clients are served simultaneously (simultaneous nature)]

Main aspects of the scheduling policy

A scheduling policy has two components:
• Pre-emption rule: whether servicing a client can be interrupted and, if so, on what occasions
• Selection rule: how the next client to be serviced is chosen from among the competing clients

Basic pre-emption schemes
• non pre-emptive
• pre-emptive
  – time-shared
  – prioritized (priority-based)

[Figure: client-server timelines contrasting non pre-emptive, time-shared and prioritized servicing]
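
As a concrete illustration of the prioritized pre-emptive scheme, POSIX exposes it via sched_setscheduler(): under SCHED_FIFO, a higher-priority ready task pre-empts the running one. A minimal sketch, assuming a Linux/POSIX system and sufficient privileges:

    #include <sched.h>
    #include <stdio.h>

    int main(void) {
        struct sched_param sp = { .sched_priority = 10 };
        /* SCHED_FIFO: prioritized, pre-emptive; usually needs root rights */
        if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
            perror("sched_setscheduler");
        /* the calling process now runs under the real-time policy */
        return 0;
    }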


The concepts of concurrent and parallel execution
N-client N-server model
• synchronous (lock-step) operation
• asynchronous operation

[Figure: client-server timelines contrasting synchronous (lock-step) and asynchronous execution]
Concurrent and parallel programming languages

Classification of programming languages

Languages        1-client    N-client    1-client    N-client
                 1-server    1-server    N-server    N-server
                 model       model       model       model
Sequential          +           -           -           -
Concurrent          -           +           -           -
Data-parallel       -           -           +           -
Parallel            -           +           -           +
Types and levels of parallelism

• Available and utilized parallelism
  – available: present in a program or in the problem solution
  – utilized: actually exploited during execution

• Types of available parallelism
  – functional parallelism: arises from the logic of a problem solution
  – data parallelism: arises from data structures
Levels of available functional parallelism

• Parallelism at the instruction level (fine-grained parallelism)
• Parallelism at the loop level (middle-grained parallelism)
• Parallelism at the procedure level (middle-grained parallelism)
• Parallelism at the program level (coarse-grained parallelism)
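
Loop-level parallelism is the easiest to show in code: when iterations are independent, a compiler or runtime may distribute them across threads. A minimal OpenMP sketch (an illustrative addition, not from the slides):

    #define N 1000000

    int main(void) {
        static double a[N], b[N], c[N];
        /* iterations are independent, so the runtime may execute them
           in parallel on several threads */
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            c[i] = a[i] + b[i];
        return 0;
    }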
Available and utilized levels of functional parallelism

Available levels          Utilized levels
User (program) level      User level          (2)
Procedure level           Process level       (2)
Loop level                Thread level        (1)
Instruction level         Instruction level   (1)

(1) exploited by architectures
(2) exploited by means of operating systems
Utilization of functional parallelism

• Available parallelism can be utilized by
  – the architecture: instruction-level parallel architectures (ILP architectures)
  – compilers: parallel optimizing compilers
  – the operating system: multitasking
Concurrent execution models

• User level --- multiprogramming, time sharing
• Process level --- multitasking
• Thread level --- multi-threading

(the level of granularity becomes finer from user level down to thread level)
Utilization of data parallelism

• by using a data-parallel architecture
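
For illustration, a data-parallel loop written with the OpenMP simd directive, which asks the compiler to process several elements per step on SIMD hardware; saxpy is a conventional textbook kernel, not taken from the slides:

    #include <stddef.h>

    /* y[i] = alpha * x[i] + y[i] for every i: the same operation applied
       element-wise, i.e. data parallelism arising from the array structure */
    void saxpy(size_t n, float alpha, const float *x, float *y) {
        #pragma omp simd
        for (size_t i = 0; i < n; i++)
            y[i] = alpha * x[i] + y[i];
    }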


Classification of parallel architectures

• Flynn's classification
  – SISD (Single Instruction, Single Data)
  – SIMD (Single Instruction, Multiple Data)
  – MISD (Multiple Instruction, Single Data)
  – MIMD (Multiple Instruction, Multiple Data)
Proposed Classification
Parallel architectures (PAs)
• Data-parallel architectures (DPs)
  – vector architecture
  – associative and neural architecture
  – SIMDs
  – systolic architecture
• Function-parallel architectures
  – instruction-level PAs (ILPs): pipelined processors, VLIWs, superscalar processors
  – thread-level PAs
  – process-level PAs (MIMDs): distributed memory MIMD (multi-computers), shared memory MIMD (multi-processors)
Basic parallel techniques

• Pipelining (time)
  – a number of functional units are employed in sequence to perform a single computation
  – each computation is broken into a number of steps
• Replication (space)
  – a number of functional units perform multiple computations simultaneously
  – more processors, more memory, more I/O, more computers
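
A small sketch contrasting the two techniques on a stream of items; stage1 and stage2 are hypothetical placeholder steps, and the replicated variant assumes OpenMP:

    #include <stddef.h>

    static int stage1(int x) { return x + 1; }  /* hypothetical first step  */
    static int stage2(int x) { return x * 2; }  /* hypothetical second step */

    /* Pipelining (time): each item passes through the stages in sequence;
       in hardware, successive items overlap in the different stages. */
    void pipelined(const int *in, int *out, size_t n) {
        for (size_t i = 0; i < n; i++)
            out[i] = stage2(stage1(in[i]));
    }

    /* Replication (space): several workers each apply the whole computation
       to different items at the same time. */
    void replicated(const int *in, int *out, size_t n) {
        #pragma omp parallel for
        for (size_t i = 0; i < n; i++)
            out[i] = stage2(stage1(in[i]));
    }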
Pipelining and replication in parallel computer architecture

                          Pipelining    Replication
Vector processors             +
Systolic arrays               +             +
SIMD (array) processors                     +
Associative processors                      +
Pipelined processors          +
VLIW processors                             +
Superscalar processors        +             +
Multi-threaded machines       +             +
Multicomputers                +             +
Multiprocessors                             +
