What is Parallel Computing?
Parallel computing is a process where a large problem is broken into smaller parts
and solved simultaneously using multiple processors.
Unlike serial programming, where one instruction is executed at a time, parallel
computing executes many instructions at the same time.
Limitations of Serial Programming:
o Only one task runs at a time, which wastes hardware resources and increases
execution time.
Why Parallel Programming?
Handles dynamic and large problems more efficiently.
Enables faster computations and resource savings.
Makes complex simulations and data processing possible (e.g., weather forecasting,
real-time systems).
Needed due to limits in improving processor speeds (e.g., clock speed limits, heat
issues).
Moore's Law
Predicts that the number of transistors on a chip doubles every two years.
This increases computational power.
But physical limits mean we need to focus on multiple cores rather than faster single
cores.
Performance Factors
Processor speed and memory speed both matter.
Clock rates have historically improved by roughly 40% per year, while DRAM (memory) speeds have improved by only about 10% per year.
Cache memory is used to reduce the gap between processor and memory speeds.
Types of Parallelism
1. Pipelining:
o Breaks tasks into stages so multiple instructions are processed in different
stages at the same time.
2. Superscalar Execution:
o Processor executes multiple instructions per clock cycle using multiple
execution units.
3. Very Long Instruction Word (VLIW):
o Compiler schedules multiple instructions in advance, packing them into one long instruction word.
o Simplifies the hardware but shifts the scheduling complexity to the compiler.
4. SIMD (Single Instruction, Multiple Data):
o One instruction operates on multiple data elements.
o Used in GPUs, image processing, matrix operations.
5. MIMD (Multiple Instruction, Multiple Data):
o Different processors execute different instructions on different data.
Memory Architectures
1. Shared Memory:
o All processors access the same memory.
o Easier to program but harder to scale; implemented as UMA or NUMA.
UMA: Equal access time to all memory.
NUMA: Access time varies depending on memory location.
2. Message Passing:
o Each processor has its own memory.
o Processors communicate by sending messages (e.g., using MPI or PVM).
o More scalable but harder to program.
Cache Coherency
Ensures that multiple copies of the same memory block (cached in different
processors) stay consistent.
PRAM (Parallel Random Access Machine) Models
1. EREW – Exclusive Read, Exclusive Write (no two processors access the same memory cell at the same time).
2. CREW – Concurrent Read, Exclusive Write.
3. ERCW – Exclusive Read, Concurrent Write.
4. CRCW – Concurrent Read and Write (most powerful).
Threads
A thread is a single sequence of execution within a program.
Multi-threading allows multiple threads to run in parallel.
Benefits of Using Threads
1. Portable software across systems.
2. Load balancing – Distribute work evenly.
3. Better resource usage – Avoids delays due to I/O or waiting.
OpenMP – A Tool for Parallel Programming
Used in C/C++ and Fortran.
Uses pragmas/directives to manage parallel code.
Basic Directives:
#pragma omp parallel: Starts parallel region.
#pragma omp parallel for: Splits loop among threads.
Thread Functions:
omp_get_thread_num(): Gets thread ID.
omp_get_num_threads(): Gets total threads.
omp_set_num_threads(N): Sets number of threads.
omp_get_wtime(): Gets wall-clock time.
Data Handling:
private: Each thread has its own copy.
shared: All threads share the variable.
firstprivate: Copy with initial value.
lastprivate: Saves the final value from the last iteration.
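A minimal illustrative sketch of these clauses in C with OpenMP (compile with something like gcc -fopenmp); the variable names and loop bound are made up for the example:

#include <stdio.h>

int main(void) {
    int x = 10, last = -1;

    /* firstprivate: every thread starts with its own x initialized to 10.
       lastprivate: after the loop, 'last' holds the value written by the
       logically last iteration (i == 7). */
    #pragma omp parallel for firstprivate(x) lastprivate(last)
    for (int i = 0; i < 8; i++) {
        x += i;        /* updates the thread's private copy only */
        last = i;
    }

    printf("x = %d, last = %d\n", x, last);   /* original x is still 10; last is 7 */
    return 0;
}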
Synchronization:
#pragma omp critical: Allows one thread at a time.
#pragma omp atomic: Fast, thread-safe increment.
#pragma omp barrier: Synchronizes all threads.
Loop Scheduling:
Static: Equal chunks to threads.
Dynamic: Assigns as threads become free.
Guided: Starts with large chunks, decreases over time.
Parallel Sections:
#pragma omp parallel sections: Different tasks run in parallel.
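A minimal sketch of parallel sections, where two independent blocks may run on different threads at the same time (compile with OpenMP support, e.g. gcc -fopenmp):

#include <stdio.h>
#include <omp.h>

int main(void) {
    #pragma omp parallel sections
    {
        #pragma omp section
        printf("Section A on thread %d\n", omp_get_thread_num());

        #pragma omp section
        printf("Section B on thread %d\n", omp_get_thread_num());
    }
    return 0;
}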
Performance Metrics
FLOPS (Floating Point Operations Per Second): Measures computing power.
GFLOPS = Billion FLOPS.
Conclusion
Parallel computing improves speed, efficiency, and makes complex problems solvable by:
Breaking tasks into subtasks.
Running them across multiple processors/threads.
Using smart scheduling, memory handling, and architecture-aware programming.
✅ UNIT 1: Basics of Parallel Programming
1. What is Parallel Computing?
Parallel computing means solving a large problem by dividing it into smaller tasks and
running them simultaneously using multiple processors or threads. This helps in faster
execution.
2. What is Concurrency?
Concurrency means that multiple tasks are in progress, but not necessarily at the same
time. It focuses more on task management than actual simultaneous execution.
3. Why do we need Parallel Programming?
To solve problems faster, use hardware efficiently, and handle large, complex, or real-time
problems like simulations, weather forecasting, etc.
4. Difference Between Serial and Parallel Execution
Serial: One task at a time (step-by-step).
Parallel: Many tasks run together using multiple cores or threads.
✅ UNIT 2: OpenMP Basics
5. What is OpenMP?
OpenMP is an API in C, C++, and Fortran used to write parallel programs easily using
pragmas (e.g., #pragma omp parallel).
6. How to Create Threads in OpenMP?
Use:
#pragma omp parallel
This starts a parallel region where multiple threads execute the block of code.
7. OpenMP for Loop Parallelism
#pragma omp parallel for
for (int i = 0; i < N; i++) { ... }
This splits loop iterations among threads.
8. Critical and Atomic
#pragma omp critical: Only one thread at a time in this block (for complex updates).
#pragma omp atomic: Faster, for simple updates like sum += a[i].
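A short illustrative sketch contrasting the two: atomic for the single-statement sum, critical for a multi-statement update (the array contents are made up):

#include <stdio.h>

#define N 1000

int main(void) {
    int a[N], sum = 0, max = 0;
    for (int i = 0; i < N; i++) a[i] = i % 7;

    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        #pragma omp atomic          /* cheap: protects one simple update */
        sum += a[i];
    }

    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        #pragma omp critical        /* needed: the update spans a read, a compare, and a write */
        {
            if (a[i] > max) max = a[i];
        }
    }

    printf("sum = %d, max = %d\n", sum, max);
    return 0;
}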
9. Private, Shared, Firstprivate, Lastprivate
private: Each thread gets its own copy (uninitialized).
shared: All threads use the same variable.
firstprivate: Thread gets a copy with the original value.
lastprivate: The value from the last iteration is saved.
10. What is Synchronization?
It means controlling access to shared data to prevent conflicts. Tools:
critical
atomic
barrier
lock
11. What is Schedule in OpenMP?
It defines how loop iterations are divided:
static: Fixed-size chunks
dynamic: Assigned as threads finish
guided: Decreasing chunk size over time
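A small sketch of the schedule clause; the dynamic chunk size of 4 is an arbitrary choice, and the inner loop just simulates iterations of very different cost:

#include <stdio.h>
#include <omp.h>

int main(void) {
    /* Later iterations do more work, so dynamic scheduling keeps threads busy. */
    #pragma omp parallel for schedule(dynamic, 4)
    for (int i = 0; i < 100; i++) {
        volatile double x = 0.0;
        for (int j = 0; j < i * 1000; j++) x += j;
    }
    printf("finished with up to %d threads\n", omp_get_max_threads());
    return 0;
}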
✅ UNIT 3: MPI Basics
12. What is MPI?
MPI (Message Passing Interface) is used in distributed memory systems to communicate
between processors.
13. Important MPI Functions
MPI_Send / MPI_Recv: Send and receive messages (blocking)
MPI_Isend / MPI_Irecv: Non-blocking versions
MPI_Bcast: Broadcast to all processes
MPI_Gather / MPI_Scatter: Collect or distribute data
MPI_Reduce: Combines values using operations (sum, max, etc.)
MPI_Barrier: All processes wait here (synchronization)
14. MPI_Comm_rank & MPI_Comm_size
MPI_Comm_rank: Gives process ID
MPI_Comm_size: Total number of processes
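A minimal illustrative MPI program combining these calls (assumes an MPI installation; compile with mpicc and run with at least two processes, e.g. mpirun -np 2):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's ID */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

    if (rank == 0 && size > 1) {
        int msg = 42;                       /* arbitrary example value */
        MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);   /* blocking send to rank 1 */
    } else if (rank == 1) {
        int msg;
        MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 of %d received %d from rank 0\n", size, msg);
    }

    MPI_Finalize();
    return 0;
}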
✅ UNIT 4: Architectures and Concepts
15. Shared vs Distributed Memory
Shared: All processors access same memory (OpenMP).
Distributed: Each has own memory (MPI), uses message passing.
16. UMA vs NUMA
UMA (Uniform): Equal memory access time
NUMA (Non-uniform): Memory access time varies based on location
17. Superscalar and VLIW
Superscalar: Executes multiple instructions in one cycle using multiple ALUs
VLIW: Compiler schedules multiple instructions in one long word
18. Cache Coherence
Ensures all processors see consistent values of shared variables in their caches.
✅ UNIT 5: Algorithms & Case Studies
19. PRAM Models
EREW: Exclusive Read, Exclusive Write
CREW: Concurrent Read, Exclusive Write
ERCW: Exclusive Read, Concurrent Write
CRCW: Concurrent Read, Concurrent Write
20. Odd-Even Sort (MPI)
Alternates between comparing and swapping even-indexed pairs and odd-indexed pairs of elements
Processes exchange and compare values using MPI_Sendrecv
21. BFS in OpenMP
Traverse levels of a graph using parallel loops
Protect next-level queue using critical or atomic
22. Histogram Sorting and Race Condition
Race condition occurs when threads update the same variable (e.g., histogram bin)
Fix using: #pragma omp atomic or locks
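An illustrative sketch of the atomic fix, counting a made-up data array into 256 bins; without the atomic, concurrent increments of the same bin could be lost:

#include <stdio.h>

#define N    (1 << 20)
#define BINS 256

int main(void) {
    static unsigned char data[N];
    int hist[BINS] = {0};

    for (int i = 0; i < N; i++) data[i] = (unsigned char)(i % BINS);   /* fake input */

    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        #pragma omp atomic          /* serializes only the single increment */
        hist[data[i]]++;
    }

    printf("hist[0] = %d\n", hist[0]);   /* expect N / BINS = 4096 */
    return 0;
}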
23. Block Matrix Multiplication (MPI)
Divide matrices into blocks
Each process computes its part
Use MPI to share and collect results
Time complexity: O(n³/p) + communication cost
24. Fork-Join Model
Fork: Master creates threads (parallel region)
Join: Threads finish and sync before continuing
🌟 Introduction to Parallel Programming (Simplified Notes)
✅ OpenMP – Shared Memory Programming
🔹 Basic Directives
#pragma omp parallel: Creates multiple threads.
#pragma omp parallel for: Splits loop among threads.
#pragma omp parallel
{
    printf("Thread %d\n", omp_get_thread_num());
}
🔹 Synchronization
critical: Only one thread enters at a time.
atomic: Fast, for simple operations (like sum++).
barrier: Waits until all threads reach this point.
nowait: Removes the implicit barrier at the end of a work-sharing construct (e.g., a parallel loop).
flush: Makes updates visible to all threads.
ordered: Maintains original loop order inside a parallel loop.
🔹 Data Sharing Clauses
private: Each thread gets its own uninitialized variable.
shared: Variable is shared among all threads.
firstprivate: Private, but initialized with original value.
lastprivate: After loop, saves value from last iteration.
threadprivate: Global variable, but private to each thread.
🔹 Reduction
Combines thread results (e.g., sum, max).
#pragma omp parallel for reduction(+:sum)
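A complete illustrative example of the reduction clause (the array contents are made up); each thread accumulates into a private copy of sum, and the copies are combined with '+' at the end of the loop:

#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N];
    double sum = 0.0;

    for (int i = 0; i < N; i++) a[i] = 1.0;

    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %.1f\n", sum);   /* 1000000.0 */
    return 0;
}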
🔹 Loop Scheduling
static: Fixed chunks.
dynamic: Chunks given as threads finish.
guided: Large to small chunks.
auto: Compiler decides.
🔹 Tasks
task: Packages a block of work that any thread in the team may run later.
taskwait: Waits for the child tasks of the current task to finish.
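A minimal sketch of tasks; the single directive (standard OpenMP, not listed above) makes one thread create the tasks while any thread in the team may execute them:

#include <stdio.h>
#include <omp.h>

int main(void) {
    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task
        printf("task A on thread %d\n", omp_get_thread_num());

        #pragma omp task
        printf("task B on thread %d\n", omp_get_thread_num());

        #pragma omp taskwait    /* wait for both child tasks to finish */
        printf("both tasks done\n");
    }
    return 0;
}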
🔹 Locks
omp_lock_t lock;              /* lock object shared by all threads */
omp_init_lock(&lock);         /* initialize once before use */
omp_set_lock(&lock);          /* acquire; other threads block here */
// critical work
omp_unset_lock(&lock);        /* release so another thread can enter */
omp_destroy_lock(&lock);      /* free the lock when no longer needed */
🔹 Useful OpenMP Functions
omp_set_num_threads(n)
omp_get_num_threads()
omp_get_thread_num()
omp_get_wtime() → returns wall time
🔹 OpenMP Env Variables
OMP_NUM_THREADS: set thread count
OMP_SCHEDULE: schedule type
OMP_DYNAMIC: auto-adjust threads
OMP_PROC_BIND: fix threads to CPUs
✅ MPI – Distributed Memory Programming
🔹 Basic MPI Communication
MPI_Send / MPI_Recv: Blocking send/receive
MPI_Isend / MPI_Irecv: Non-blocking
MPI_Wait, MPI_Test: Wait for non-blocking completion
MPI_Sendrecv: Send and receive together
🔹 Collective Communication
MPI_Bcast: Broadcast to all
MPI_Gather: Collect to root
MPI_Allgather: All collect
MPI_Scatter: Divide from root
MPI_Reduce: Combine values (sum, max)
MPI_Allreduce: All get combined result
MPI_Barrier: Wait for all to sync
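An illustrative sketch combining MPI_Bcast and MPI_Reduce (the broadcast value 100 is arbitrary); compile with mpicc and run with mpirun:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int n = 0;
    if (rank == 0) n = 100;                        /* root chooses the value */
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);  /* now every rank has n == 100 */

    int local = rank + 1, total = 0;
    MPI_Reduce(&local, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("n = %d, sum over ranks of (rank + 1) = %d\n", n, total);

    MPI_Finalize();
    return 0;
}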
🔹 MPI Info & Utilities
MPI_Comm_rank(): Get process ID
MPI_Comm_size(): Total processes
MPI_COMM_WORLD: All processes group
MPI_Status: Holds message info
✅ Core Concepts
🔹 Clock Concepts
Clock cycle: Smallest unit of CPU time.
Clock speed: How many cycles/second (GHz = billions/sec).
🔹 Superscalar vs VLIW
Superscalar: Hardware runs many instructions per cycle.
VLIW: Compiler packs many instructions into one long word.
🔹 Memory Concepts
Cache coherence: Keep cache data consistent across cores.
Spatial locality: Accessing nearby memory addresses.
Temporal locality: Reusing same address soon.
Strided access: Accessing memory with a fixed step between consecutive accesses (e.g., a[0], a[2], a[4], ...).
🔹 Memory Architecture
Shared memory: One memory, many threads (OpenMP).
Distributed memory: Each processor has its own memory (MPI).
UMA: Uniform access to memory.
NUMA: Access speed varies by location.
🔹 SIMD vs MIMD vs SPMD
SIMD: One instruction, many data (used in GPUs).
MIMD: Each core does different instructions.
SPMD: Same code, different data (common in MPI).
🔹 Blocking vs Non-blocking MPI
Blocking: The call waits until the message buffer is safe to reuse.
Non-blocking: The call returns immediately; completion is checked later with MPI_Wait or MPI_Test.
✅ Algorithms
🔹 Dijkstra’s Algorithm (MPI)
Find shortest path.
Use MPI_Allreduce to share minimum distance info.
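A minimal sketch of that selection step only: each rank proposes its local closest unvisited vertex, and MPI_Allreduce with MPI_MINLOC on value/index pairs picks the global winner on every rank (the local distances below are made-up placeholders):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    struct { double dist; int vertex; } local_min, global_min;
    local_min.dist   = 10.0 + rank;   /* placeholder: best distance in this rank's slice */
    local_min.vertex = 5 * rank;      /* placeholder: the corresponding vertex id */

    MPI_Allreduce(&local_min, &global_min, 1, MPI_DOUBLE_INT,
                  MPI_MINLOC, MPI_COMM_WORLD);

    if (rank == 0)
        printf("next vertex to settle: %d at distance %.1f\n",
               global_min.vertex, global_min.dist);

    MPI_Finalize();
    return 0;
}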
🔹 Odd-Even Sort (MPI)
Even phase: sort pairs (0,1), (2,3)...
Odd phase: sort pairs (1,2), (3,4)...
Use MPI_Sendrecv to swap values.
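A toy sketch with one value per process: in each phase a process exchanges its value with a partner via MPI_Sendrecv, then the lower rank keeps the smaller value and the higher rank the larger (initial values are deliberately in reverse order):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int value = size - rank;          /* reverse order: rank 0 holds the largest */

    for (int phase = 0; phase < size; phase++) {
        /* even phase pairs (0,1),(2,3)...; odd phase pairs (1,2),(3,4)... */
        int partner = (phase % 2 == rank % 2) ? rank + 1 : rank - 1;
        if (partner < 0 || partner >= size) continue;

        int other;
        MPI_Sendrecv(&value, 1, MPI_INT, partner, 0,
                     &other,  1, MPI_INT, partner, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        if (rank < partner) value = (value < other) ? value : other;
        else                value = (value > other) ? value : other;
    }

    printf("rank %d holds %d\n", rank, value);   /* sorted: rank i holds i + 1 */
    MPI_Finalize();
    return 0;
}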
🔹 BFS (OpenMP)
Traverse graph by levels.
Use parallel for to explore neighbors.
Use critical/atomic to update next level safely.
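A compact level-synchronous sketch on a made-up 6-vertex graph; appends to the next-level queue are protected with a critical section (per-thread buffers would scale better, but this is the simplest safe version):

#include <stdio.h>

#define N 6   /* vertices in the toy graph */

int main(void) {
    /* adjacency matrix of a small undirected graph, invented for the example */
    int adj[N][N] = {
        {0,1,1,0,0,0},
        {1,0,0,1,0,0},
        {1,0,0,1,1,0},
        {0,1,1,0,0,1},
        {0,0,1,0,0,1},
        {0,0,0,1,1,0},
    };
    int dist[N], frontier[N], next[N];
    for (int i = 0; i < N; i++) dist[i] = -1;

    frontier[0] = 0;          /* start BFS from vertex 0 */
    dist[0] = 0;
    int fsize = 1, level = 0;

    while (fsize > 0) {
        int nsize = 0;
        #pragma omp parallel for
        for (int f = 0; f < fsize; f++) {
            int u = frontier[f];
            for (int v = 0; v < N; v++) {
                if (adj[u][v]) {
                    #pragma omp critical   /* protect the visited check and the append */
                    {
                        if (dist[v] == -1) {
                            dist[v] = level + 1;
                            next[nsize++] = v;
                        }
                    }
                }
            }
        }
        for (int i = 0; i < nsize; i++) frontier[i] = next[i];
        fsize = nsize;
        level++;
    }

    for (int i = 0; i < N; i++) printf("dist[%d] = %d\n", i, dist[i]);
    return 0;
}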
🔹 Histogram Sorting
Count frequencies (e.g., in images).
Problem: race condition if threads update same bin.
Fix: use critical, atomic, or locks.
🔹 Block Matrix Multiplication (MPI)
Divide matrices into smaller blocks.
Each process computes a block.
Time: O(n³/p) + communication cost
🔹 Fork-Join Model
Fork: Start parallel threads.
Join: Wait for threads to finish.
🔹 PRAM Types
EREW: Exclusive read, exclusive write
CREW: Concurrent read, exclusive write
ERCW: Exclusive read, concurrent write
CRCW: Concurrent read, concurrent write
🔹 K-d Mesh & Hypercube
K-d Mesh: k-dimensional grid network; each node has up to 2 neighbors per dimension
Hypercube: 2^d nodes; two nodes are linked when their binary labels differ in exactly one bit