GPU Computing With Spark and Python
Afif A. Iskandar
(AI Research Engineer)
My Bio
Afif A. Iskandar
Artificial Intelligence Research Engineer & Educator, AI Enthusiast
Bachelor's Degree in Mathematics @ Universitas Indonesia
Master's Degree in Computer Science @ Universitas Indonesia
Overview
● Why Python?
● Numba: Python JIT Compiler for CPU and GPU
● PySpark: Distributed Programming in Python
● Hands-On Tutorial
● Conclusion
Why Python?
Python is Fast
for writing, testing and developing code
Python is Fast
because it’s interpreted, dynamically typed and high level
Python is Slow
for repeated execution of low-level tasks
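A minimal sketch of this claim (illustrative, not from the slides): the same numeric loop, first interpreted, then JIT-compiled with Numba's @njit decorator, which is covered in the next sections.

```python
# Illustrative sketch: an interpreted loop vs. the same loop compiled by Numba.
import numpy as np
from numba import njit

def py_sum_of_squares(arr):
    total = 0.0
    for x in arr:           # interpreted: per-iteration bytecode overhead
        total += x * x
    return total

@njit
def nb_sum_of_squares(arr):
    total = 0.0
    for x in arr:           # compiled to machine code on first call
        total += x * x
    return total

data = np.random.rand(1_000_000)
nb_sum_of_squares(data)     # first call pays a one-time JIT compilation cost
```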
Python is Slow, Because
[Figure from https://databricks.com/blog/2015/06/22/understanding-your-spark-application-through-visualization.html]
How Does Spark Scale?
● All cluster scaling is about minimizing I/O. Spark does this in several ways (see the sketch below):
○ Keep intermediate results in memory with rdd.cache()
○ Move computation to the data whenever possible (functions are small and data is big!)
○ Provide computation primitives that expose parallelism and minimize communication between workers: map, filter, sample, reduce, …
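A minimal PySpark sketch of these points (illustrative; assumes a local Spark installation): cache an intermediate RDD, then apply small functions to the distributed data with map, filter, and reduce.

```python
# Illustrative sketch of the scaling primitives listed above.
from pyspark import SparkContext

sc = SparkContext("local[*]", "scaling-demo")

rdd = sc.parallelize(range(1_000_000)).cache()   # keep intermediate results in memory

evens = rdd.filter(lambda x: x % 2 == 0)         # small function moves to the data
total = evens.map(lambda x: x * x) \
             .reduce(lambda a, b: a + b)         # parallel, minimal communication
print(total)
```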
Python and Spark
● Numba lets you create compiled CPU and CUDA functions right inside your Python applications.
● Numba can be used with Spark to easily distribute and run your code on Spark workers with GPUs.
● There is room for improvement in how Spark interacts with the GPU, but things do work.
● Beware of accidentally multiplying fixed initialization and compilation costs (see the sketch below).
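A hedged sketch of the pattern (illustrative, not the speaker's exact code; function names are hypothetical). Compiling inside mapPartitions means the fixed compilation cost is paid once per partition rather than once per record:

```python
# Illustrative sketch: run a Numba-compiled function on Spark workers.
# On GPU workers, @numba.cuda.jit or @vectorize(target='cuda') would
# replace @njit; the partition-level structure stays the same.
from numba import njit
from pyspark import SparkContext

sc = SparkContext("local[*]", "numba-on-spark")

def process_partition(values):
    @njit
    def kernel(x):                   # compiled once per partition,
        return x * x + 1.0           # not once per record
    for v in values:
        yield float(kernel(v))

result = (sc.parallelize([0.0, 1.0, 2.0, 3.0], numSlices=2)
            .mapPartitions(process_partition)
            .collect())
print(result)   # [1.0, 2.0, 5.0, 10.0]
```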
Thank You