
Advanced Computer Architecture

Dr. Saima Farhan

Fall 2018 Semester


Course Outline
Parallel processing:
Basic concepts, Types and levels of parallelism, Classification of parallel architectures, Basic
parallel techniques

ILP processors:
Evolution, Dependencies, Scheduling, Preservation, Speed-up

Pipelined processors:
Basic concepts, Design space of pipelines, Overview of pipelined instruction processing

VLIW processors:
Architectures, Basic Principles

Superscalar Processors:
Introduction, Parallel decoding, Superscalar instruction issue, Register renaming, Parallel
execution, Preserving sequential consistency of instruction execution

Processing of control transfer instructions:
Introduction, Basic approaches to branch handling, Delayed branching and branch processing, Multiway branching

Parallel computing and Cache coherence:
Why parallel architecture, Convergence of parallel architectures, Fundamental design issues

Parallel Programs:
The Parallelization Process, Parallelization of an Example program

Shared memory multiprocessors:
Cache coherence, Memory consistency, Design space for snooping protocols, Synchronization

System Interconnect Architectures:
Network properties and routing, Static and dynamic connection networks, Multiprocessor system interconnect

Data parallel architectures:
Introduction, Connectivity

SIMD architectures:
Fine-grained SIMD, Coarse-grained architectures, Multithreaded architectures: Computational models, Data flow architectures

Recent architectural trends:
Multi-core system organization, Multi-core memory issues
Introduction to Parallel Processing

• Basic concepts
• Types and levels of parallelism
• Classification of parallel architectures
• Basic parallel techniques

Basic concepts

• The concept of program
  – ordered set of instructions (programmer's view)
  – executable file (operating system's view)
The concept of process

• From the OS's view, a process relates to the execution of a program

• Process creation
  – setting up the process description
  – allocating an address space
  – loading the program into the allocated address space, and
  – passing the process description to the scheduler

• Process states
  – ready to run
  – running
  – wait
Process spawning (independent processes)

[Figure: a process tree in which a parent process spawns independent child processes B, C, D and E]
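
A minimal POSIX sketch of process spawning (an illustrative addition, not from the slides): fork() creates an independent child process with its own address space, which the parent later reaps.

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void) {
        pid_t pid = fork();            /* spawn an independent child process */
        if (pid == 0) {
            /* child: runs in its own address space with its own process
               description, scheduled independently of the parent */
            printf("child %d running\n", getpid());
            _exit(EXIT_SUCCESS);
        } else if (pid > 0) {
            waitpid(pid, NULL, 0);     /* parent waits for the child to end */
            printf("parent %d reaped child\n", getpid());
        }
        return 0;
    }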
The concept of thread

• smaller chunks of code (lightweight)
• threads are created within, and belong to, a process
• for parallel thread processing, scheduling is performed on a per-thread basis
• finer granularity, hence less overhead when switching from thread to thread
A process may be single-threaded or contain multiple (dependent) threads.

[Figure: thread tree, with a process at the root and its threads branching beneath it]
Three basic methods for creating and terminating threads

1. unsynchronized creation and unsynchronized termination
   • calling library functions: CREATE_THREAD, START_THREAD
2. unsynchronized creation and synchronized termination
   • FORK and JOIN
3. synchronized creation and synchronized termination
   • COBEGIN and COEND
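
Method 2 maps directly onto POSIX threads: pthread_create() plays the role of FORK (unsynchronized creation) and pthread_join() the role of JOIN (synchronized termination). A minimal sketch, assuming a POSIX system:

    #include <pthread.h>
    #include <stdio.h>

    /* the activity started by the FORK */
    static void *worker(void *arg) {
        printf("worker %ld running\n", (long)arg);
        return NULL;
    }

    int main(void) {
        pthread_t t;
        pthread_create(&t, NULL, worker, (void *)1L);  /* FORK */
        /* the creating thread continues concurrently with the worker */
        pthread_join(t, NULL);            /* JOIN: wait for termination */
        return 0;
    }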
Processes and threads in languages

• Black box view (T: thread)

[Figure: (a) FORK-JOIN graph: T0 forks T1 and T2, which later JOIN back into T0; (b) COBEGIN-COEND graph: threads T0 ... Tn start together at COBEGIN and all terminate at COEND]
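
As a modern analogue of COBEGIN/COEND, OpenMP's sections construct starts its sections together and completes only when all of them have finished. This sketch is an illustrative addition, not from the slides (compile with -fopenmp):

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        #pragma omp parallel sections   /* COBEGIN: sections may start in parallel */
        {
            #pragma omp section
            printf("section A on thread %d\n", omp_get_thread_num());
            #pragma omp section
            printf("section B on thread %d\n", omp_get_thread_num());
        }                               /* COEND: implicit barrier here */
        return 0;
    }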
The concepts of concurrent and parallel execution

[Figure: N-client 1-server model with concurrent execution: the server handles clients one at a time over time t (sequential nature), in contrast to parallel execution, where clients are served simultaneously (simultaneous nature)]

Main aspects of the scheduling policy

A scheduling policy has two components:
• Pre-emption rule: whether servicing a client can be interrupted and, if so, on what occasions
• Selection rule: how the next client to be serviced is chosen from among the competing clients

Basic pre-emption schemes
• non pre-emptive
• pre-emptive
  – time-shared
  – prioritized (priority-based)

[Figure: client-server timelines contrasting non pre-emptive, time-shared and prioritized servicing]
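
As a concrete illustration of the prioritized pre-emptive scheme, POSIX exposes it via sched_setscheduler(): under SCHED_FIFO, a higher-priority ready task pre-empts the running one. A minimal sketch, assuming a Linux/POSIX system and sufficient privileges:

    #include <sched.h>
    #include <stdio.h>

    int main(void) {
        struct sched_param sp = { .sched_priority = 10 };
        /* SCHED_FIFO: prioritized, pre-emptive; usually needs root rights */
        if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
            perror("sched_setscheduler");
        /* the calling process now runs under the real-time policy */
        return 0;
    }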


The concepts of concurrent and parallel execution
N-client N-server model
• synchronous (lock-step) operation
• asynchronous operation

[Figure: client-server timelines contrasting synchronous (lock-step) and asynchronous execution]
Concurrent and parallel programming languages

Classification of programming languages

Languages        1-client    N-client    1-client    N-client
                 1-server    1-server    N-server    N-server
                 model       model       model       model
Sequential          +           -           -           -
Concurrent          -           +           -           -
Data-parallel       -           -           +           -
Parallel            -           +           -           +
Types and levels of parallelism

• Available and utilized parallelism
  – available: present in a program or in the problem solution
  – utilized: actually exploited during execution

• Types of available parallelism
  – functional parallelism: arises from the logic of a problem solution
  – data parallelism: arises from data structures
Levels of available functional parallelism

• Parallelism at the instruction level (fine-grained parallelism)
• Parallelism at the loop level (middle-grained parallelism)
• Parallelism at the procedure level (middle-grained parallelism)
• Parallelism at the program level (coarse-grained parallelism)
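
Loop-level parallelism is the easiest to show in code: when iterations are independent, a compiler or runtime may distribute them across threads. A minimal OpenMP sketch (an illustrative addition, not from the slides):

    #define N 1000000

    int main(void) {
        static double a[N], b[N], c[N];
        /* iterations are independent, so the runtime may execute them
           in parallel on several threads */
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            c[i] = a[i] + b[i];
        return 0;
    }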
Available and utilized levels of functional parallelism

Available levels          Utilized levels
User (program) level      User level          (2)
Procedure level           Process level       (2)
Loop level                Thread level        (1)
Instruction level         Instruction level   (1)

(1) exploited by architectures
(2) exploited by means of operating systems
Utilization of functional parallelism

• Available parallelism can be utilized by
  – the architecture: instruction-level parallel architectures (ILP architectures)
  – compilers: parallel optimizing compilers
  – the operating system: multitasking
Concurrent execution models

• User level --- multiprogramming, time sharing
• Process level --- multitasking
• Thread level --- multi-threading

(the level of granularity becomes finer from user level down to thread level)
Utilization of data parallelism

• by using a data-parallel architecture
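
For illustration, a data-parallel loop written with the OpenMP simd directive, which asks the compiler to process several elements per step on SIMD hardware; saxpy is a conventional textbook kernel, not taken from the slides:

    #include <stddef.h>

    /* y[i] = alpha * x[i] + y[i] for every i: the same operation applied
       element-wise, i.e. data parallelism arising from the array structure */
    void saxpy(size_t n, float alpha, const float *x, float *y) {
        #pragma omp simd
        for (size_t i = 0; i < n; i++)
            y[i] = alpha * x[i] + y[i];
    }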


Classification of parallel architectures

• Flynn's classification
  – SISD (Single Instruction, Single Data)
  – SIMD (Single Instruction, Multiple Data)
  – MISD (Multiple Instruction, Single Data)
  – MIMD (Multiple Instruction, Multiple Data)
Proposed Classification
Parallel architectures (PAs)
• Data-parallel architectures (DPs)
  – vector architecture
  – associative and neural architecture
  – SIMDs
  – systolic architecture
• Function-parallel architectures
  – instruction-level PAs (ILPs): pipelined processors, VLIWs, superscalar processors
  – thread-level PAs
  – process-level PAs (MIMDs): distributed memory MIMD (multi-computers), shared memory MIMD (multi-processors)
Basic parallel techniques

• Pipelining (time)
  – a number of functional units are employed in sequence to perform a single computation
  – each computation is broken into a number of steps
• Replication (space)
  – a number of functional units perform multiple computations simultaneously
  – more processors, more memory, more I/O, more computers
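
A small sketch contrasting the two techniques on a stream of items; stage1 and stage2 are hypothetical placeholder steps, and the replicated variant assumes OpenMP:

    #include <stddef.h>

    static int stage1(int x) { return x + 1; }  /* hypothetical first step  */
    static int stage2(int x) { return x * 2; }  /* hypothetical second step */

    /* Pipelining (time): each item passes through the stages in sequence;
       in hardware, successive items overlap in the different stages. */
    void pipelined(const int *in, int *out, size_t n) {
        for (size_t i = 0; i < n; i++)
            out[i] = stage2(stage1(in[i]));
    }

    /* Replication (space): several workers each apply the whole computation
       to different items at the same time. */
    void replicated(const int *in, int *out, size_t n) {
        #pragma omp parallel for
        for (size_t i = 0; i < n; i++)
            out[i] = stage2(stage1(in[i]));
    }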
Pipelining and replication in parallel computer architecture

                          Pipelining    Replication
Vector processors             +
Systolic arrays               +             +
SIMD (array) processors                     +
Associative processors                      +
Pipelined processors          +
VLIW processors                             +
Superscalar processors        +             +
Multi-threaded machines       +             +
Multicomputers                +             +
Multiprocessors                             +
