Numerical Methods Implementation on CUDA

01/16/2013

A PROJECT REPORT
on
Numerical Methods Implementation on CUDA
submitted for partial fulfillment for the degree of Bachelor of Technology
in
Department of Computer Engineering
(2007-11)

Supervisor: Dr. Vijay Laxmi

Ankur Sharma (2007UCP132)
Nihar Amin (2007UCP161)
Praveen Khokher (2007UCP157)
Shehjad Khan (2007UCP113)

MALAVIYA NATIONAL INSTITUTE OF TECHNOLOGY, JAIPUR
May 2011

Contents
1.1 Introduction
1.2 Thread Level Hierarchy
1.3 Memory Level Hierarchy

2.1 Introduction
2.2 Matrix proves to be advantageous in the implementation of following logics
2.3 Sequential matrix-multiplication
2.4 Parallel matrix-multiplications on CUDA
2.4.1 Implementation
2.5 Kernel Specifications
2.6 Salient Features
2.7 Limitations
2.8 Observations
2.9 Conclusions

3.1 Introduction
3.2 Sequential Prefix-sum algorithm
3.3 Parallel Prefix-Sum On CUDA
3.3.1 Implementation
3.4 Kernel Specifications
3.5 Salient Features
3.6 Limitations
3.7 Observations
3.8 Conclusions

4.1 Introduction
4.2 Parallel Bitonic-Sort On CUDA
4.3 Salient Features