
HPC - Report on the MPI assignment

Andrea Bricola

January 2, 2023
Chapter 1

Report

Background on the number π

[Figure 1.1]

Pi is a mathematical constant, equal to the ratio of the circumference of a circle to its diameter. Pi can be computed as the integral of the following function over the interval [0, 1]:
$$\pi = \int_0^1 f(x)\,dx, \qquad f(x) = \frac{4}{1+x^2}$$
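This identity holds because the integrand has the antiderivative $4\arctan(x)$:
$$\int_0^1 \frac{4}{1+x^2}\,dx = 4\arctan(x)\Big|_0^1 = 4\cdot\frac{\pi}{4} = \pi.$$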

Pi computation
The computational problem consists of dividing the integral into $10^{10}$ intervals and calculating the Riemann sum, which approximates the value of the integral with adequate precision. The Riemann-sum method divides the domain of a function into many subintervals, computes the area over each subinterval, and finally approximates the integral as the sum of all these areas.
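A minimal serial sketch of this computation in C, assuming the midpoint rule as the choice of sample point (the interval count here is reduced for a quick run; names and constants are illustrative):

    #include <stdio.h>

    /* f(x) = 4 / (1 + x^2); its integral over [0,1] equals pi */
    static double f(double x) {
        return 4.0 / (1.0 + x * x);
    }

    int main(void) {
        const long long n = 10000000LL;    /* subinterval count (illustrative; the real run uses 10^10) */
        const double h = 1.0 / (double)n;  /* width of each subinterval */
        double sum = 0.0;

        /* midpoint rule: each rectangle has area h * f(midpoint) */
        for (long long i = 0; i < n; i++) {
            double x = ((double)i + 0.5) * h;
            sum += f(x);
        }
        printf("pi ~= %.15f\n", sum * h);
        return 0;
    }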

[Figure 1.2]

The program computes an approximation of π with precision to the 8th decimal digit.

Integral computation with MPI

[Figure 1.3]

The idea for making the computation faster is to distribute the intervals of the Riemann sum among processing units that work in parallel, so that each processor obtains the sum over a subset of the intervals; the global sum is then obtained by aggregating the subset sums. In order to aggregate the local sums I used the procedure MPI_Reduce(): each process sends a message with its local sum to the root process, and the root process computes the total integral.
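A simplified sketch of this scheme (the round-robin assignment of intervals to ranks is one possible choice, not necessarily the exact layout used in the experiments):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const long long n = 1000000000LL;   /* illustrative interval count */
        const double h = 1.0 / (double)n;
        double local_sum = 0.0;

        /* each rank sums the subintervals i = rank, rank + size, rank + 2*size, ... */
        for (long long i = rank; i < n; i += size) {
            double x = ((double)i + 0.5) * h;
            local_sum += 4.0 / (1.0 + x * x);
        }

        /* every rank sends its local sum; rank 0 receives the grand total */
        double global_sum = 0.0;
        MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("pi ~= %.15f\n", global_sum * h);

        MPI_Finalize();
        return 0;
    }

Such a program is built with mpicc and launched with, e.g., mpiexec -n 256 ./pi (file name and process count illustrative).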

Experiments with MPI


Hardware
The experiments were conducted on the OCAPIE computer cluster (Ottimizzazione di CAlcolo Parallelo Intensivo Applicate a problematiche in ambito Energetico). There are 8 available nodes, and each machine has a Xeon Phi processor with 64 cores that supports hyperthreading, for a total of 256 logical cores per node.

1.0.1 Measuring performance


An MPI program involves the initialization of processes inside the local operating system and possibly on remote computers, so it requires a certain amount of time before the processes are initialized and can read their own rank. After compiling a multiprocess program with mpicc, I executed it with mpiexec and observed a delay of a few seconds before the processes started and could print their rank to standard output. This delay was around 12 seconds when I initialized 64 processes (one per core) on each of the 8 available nodes of the computer cluster. The initialization time is a parallelization overhead which cannot be removed.

In order to measure the computation time of the integral, I inserted in my code a chronometer which starts before the parallel computation of the intervals and stops once the root process has received the local sums and computed the total integral.
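A minimal sketch of this chronometer, assuming MPI_Wtime() as the timing call (the specific timer is an assumption) and dropping it into the MPI program sketched earlier:

    /* start the chronometer just before the parallel interval computation */
    double t_start = MPI_Wtime();

    /* ... each rank computes local_sum over its subintervals ... */
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    /* stop once the root has the total integral */
    double t_end = MPI_Wtime();
    if (rank == 0)
        printf("elapsed: %.3f s, pi ~= %.15f\n", t_end - t_start, global_sum * h);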

Results
• 32 processes on one node: 239 seconds
• 64 processes on one node: 119 seconds
• 128 processes on one node: 61 seconds
• 256 processes on one node: 37 seconds
• 256 processes distributed across the cluster: 29 seconds
• 500 processes distributed across the cluster: 15.7 seconds
• 700 processes distributed across the cluster: 11.2 seconds
• 800 processes distributed: 10.9 seconds
• 900 processes: 9.8 seconds
• 1000 processes: 8.9 seconds
• 1200 processes: 14.9 seconds (cluster busy?)

To do: check whether performance improves with more than 64 processes on a single node, and then decide how to organize the remote processes.

Experiments with MPI and OMP
Measured precision: 12 decimal digits. (A sketch of the hybrid MPI+OpenMP scheme follows the results below.)

• 64 threads on one process: 121.1 seconds
• 128 threads on one process: 62.7 seconds
• 256 threads on one process: 34.5 seconds
• 512 threads distributed: 19.5 seconds
• 768 threads distributed: 13.6 seconds
• 1024 threads: 11.3 seconds
• 2048 threads: 10 seconds
• 4096 threads: 9 seconds
• 8192 threads: 9.2 seconds
• 16384 threads:
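A minimal sketch of the hybrid arrangement, assuming each MPI process splits its subintervals among OpenMP threads with a reduction (compiled with mpicc -fopenmp; the exact work distribution is an assumption):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, size;
        /* all MPI calls happen outside parallel regions, so plain MPI_Init suffices */
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const long long n = 1000000000LL;   /* illustrative interval count */
        const double h = 1.0 / (double)n;
        double local_sum = 0.0;

        /* this rank's subintervals are split among OpenMP threads;
           the reduction clause combines the per-thread partial sums */
        #pragma omp parallel for reduction(+ : local_sum)
        for (long long i = rank; i < n; i += size) {
            double x = ((double)i + 0.5) * h;
            local_sum += 4.0 / (1.0 + x * x);
        }

        double global_sum = 0.0;
        MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("pi ~= %.15f\n", global_sum * h);

        MPI_Finalize();
        return 0;
    }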
