BECOA157 Parallel Matrix Multiplication

Uploaded by

mysql mysql


Parallel Matrix Multiplication using CUDA

#include <bits/stdc++.h>
using namespace std;

// Kernel function for matrix multiplication
__global__
void GPUmatmul(int N, double *x, double *y, double *ans)
{
//calculates unique thread ID in the block
int t= (blockDim.x*blockDim.y)*threadIdx.z+(threadIdx.y*blockDim.x)+(threadIdx.x);
//calculates unique block ID in the grid
int b= (gridDim.x*gridDim.y)*blockIdx.z+(blockIdx.y*gridDim.x)+(blockIdx.x);
//block size (this is redundant though)
int T= blockDim.x*blockDim.y*blockDim.z;
//grid size (this is redundant though)
int B= gridDim.x*gridDim.y*gridDim.z;

/*
* Rows are distributed across blocks and columns across the threads
* of a block, so each cell is assigned to exactly one thread.
* Each thread does O(N * number of assigned cells) computation.
* The cells assigned to different threads do not overlap,
* so no synchronization is needed.
*/

for (int i=b;i<N;i+=B)

{
for(int j=t;j<N;j+=T)
{
for(int k=0;k<N;k++)
{
ans[i*N+j]+=(x[i*N+k]*y[k*N+j]);
}
}
}
}

void CPUmatmul(int N, double *x, double *y, double *ans)

{
for(int i=0;i<N;i++)
{
for(int j=0;j<N;j++)
{
for(int k=0;k<N;k++)
{
ans[i*N+j]+=(x[i*N+k]*y[k*N+j]);
}
}
}
}

bool check(int N,double *ans)

{
for(int i=0;i<N;i++)
{
for(int j=0;j<N;j++)
{
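// x is all 5s and y is the identity, so each run adds 5 to every cell.
// ans is not reset between the 4 runs, so the expected value is 20.
if(ans[i*N+j]!=20.0)return false;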
}
}
return true;
}

int main(void)
{
//size of matrix
int N = 1<<9;

double *x, *y, *ans;

// Allocate Unified Memory – accessible from CPU or GPU

cudaMallocManaged(&x, N*N*sizeof(double));
cudaMallocManaged(&y, N*N*sizeof(double));
cudaMallocManaged(&ans, N*N*sizeof(double));

// initialize x,y and ans arrays on the host

for (int i = 0; i < N; i++)
{
for(int j=0;j<N;j++)
{
x[i*N+j]=5;
y[i*N+j]=(i==j?1:0);
ans[i*N+j]=(double)0.000000000000;
}
}

clock_t t;
double avg=0;
cout<<"Starting CPU computation"<<endl;
for(int i=0;i<=3;i++)
{
t=clock();
CPUmatmul(N, x, y,ans);
t = clock() - t;
if(i)avg+=t; //we will ignore the first run
printf ("It took CPU-%d %f ms.\n",i,(((double)t)/CLOCKS_PER_SEC)*1000);
}
avg/=3;
avg/=CLOCKS_PER_SEC;
avg*=1000;
printf ("It took %lf ms on avg.\n",avg);
if(check(N,ans))cout<<"RUN OK."<<endl;
else cout<<"RUN NOT OK."<<endl;

// initialize x,y and ans arrays on the host

for (int i = 0; i < N; i++)
{
for(int j=0;j<N;j++)
{
x[i*N+j]=5;
y[i*N+j]=(i==j?1:0);
ans[i*N+j]=(double)0.000000000000;
}
}
avg=0;
cout<<"Starting GPU computation"<<endl;
// Run kernel on GPU
for(int i=0;i<=3;i++)
{
t=clock();
GPUmatmul<<<dim3(16,16,16), dim3(16,8,8)>>>(N, x, y,ans);
cudaDeviceSynchronize();
t = clock() - t;
if(i)avg+=t; //we will ignore the first run
printf ("It took GPU-%d %f ms.\n",i,(((double)t)/CLOCKS_PER_SEC)*1000);
}
avg/=3;
avg/=CLOCKS_PER_SEC;
avg*=1000;
printf ("It took %lf ms on avg.\n",avg);
if(check(N,ans))cout<<"RUN OK."<<endl;
else cout<<"RUN NOT OK."<<endl;

// Free memory
cudaFree(x);
cudaFree(y);
cudaFree(ans);
return 0;
}
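
The kernel's comment claims that the cells assigned to different threads do not overlap. The following is a minimal host-only sketch, in plain C++ so that it needs no GPU, that replays the same index arithmetic and counts how often each cell is visited. The names Dim3, flatten, volume, countVisits and coveredExactlyOnce are hypothetical helpers introduced for this illustration; they are not part of the CUDA API or of the listing above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CUDA's dim3 / threadIdx / blockIdx triples.
struct Dim3 { int x, y, z; };

// Flatten a 3D index inside a 3D extent (x varies fastest),
// using the same formula as the kernel's t and b.
int flatten(Dim3 idx, Dim3 dim) {
    return (dim.x * dim.y) * idx.z + idx.y * dim.x + idx.x;
}

// Total number of elements in a 3D extent (the kernel's T and B).
int volume(Dim3 d) { return d.x * d.y * d.z; }

// Replay the kernel's two outer loops for every (block, thread) pair
// and count how many times each cell (i, j) of an N x N matrix is visited.
std::vector<int> countVisits(int N, Dim3 grid, Dim3 block) {
    assert(N > 0 && volume(grid) > 0 && volume(block) > 0);
    std::vector<int> visits(static_cast<std::size_t>(N) * N, 0);
    const int B = volume(grid);   // number of blocks in the grid
    const int T = volume(block);  // number of threads per block
    for (int bz = 0; bz < grid.z; ++bz)
    for (int by = 0; by < grid.y; ++by)
    for (int bx = 0; bx < grid.x; ++bx) {
        const int b = flatten(Dim3{bx, by, bz}, grid);
        for (int tz = 0; tz < block.z; ++tz)
        for (int ty = 0; ty < block.y; ++ty)
        for (int tx = 0; tx < block.x; ++tx) {
            const int t = flatten(Dim3{tx, ty, tz}, block);
            for (int i = b; i < N; i += B)      // rows stride over blocks
                for (int j = t; j < N; j += T)  // columns stride over threads
                    ++visits[static_cast<std::size_t>(i) * N + j];
        }
    }
    return visits;
}

// True when every cell is handled by exactly one (block, thread) pair,
// which is the condition under which the kernel needs no synchronization.
bool coveredExactlyOnce(int N, Dim3 grid, Dim3 block) {
    for (int v : countVisits(N, grid, block))
        if (v != 1) return false;
    return true;
}
```

Calling coveredExactlyOnce(512, Dim3{16, 16, 16}, Dim3{16, 8, 8}) replays the launch configuration used in main. With that configuration only the first 512 of the 4096 blocks, and the first 512 of the 1024 threads in each block, receive any work, so a smaller grid and block would likely do the same job.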
