
GPU Computing with

Spark and Python

Afif A. Iskandar
(AI Research Engineer)
My Bio

Afif A. Iskandar
Artificial Intelligence Research Engineer & Educator

- Artificial Intelligence Research Engineer @ Unicorn Startup
- Content Creator & Educator @ NgodingPython

Afif A. Iskandar
AI Enthusiast
Bachelor's Degree in Mathematics @ Universitas Indonesia
Master's Degree in Computer Science @ Universitas Indonesia
Overview

● Why Python?
● Numba: Python JIT Compiler for CPU and GPU
● PySpark: Distributed Programming in Python
● Hands-On Tutorial
● Conclusion
Why Python?
Python is Fast
for writing, testing and developing code
Python is Fast
because it’s interpreted, dynamically typed and high level
Python is Slow
for repeated execution of low-level tasks
Python is Slow, Because

● Python is a high-level, interpreted and dynamically-typed language
● Each Python operation comes with a small
type-checking overhead
● With many repeated small operations (e.g. in a
loop), this overhead becomes significant!
The paradox ...

what makes Python fast for development

is what makes Python slow for code execution
Is there another way?

- Switching languages for speed in your projects can be a little clunky:
- Sometimes tedious boilerplate for translating data types across the language barrier
- Generating compiled functions for the wide range of data types can be difficult
- How can we use cutting edge hardware, like GPUs?
Numba
Compiling Python

● Numba is an open-source, type-specializing compiler for Python functions
● Can translate Python syntax into machine code if all type information can be deduced when the function is called.
● Implemented as a module. Does not replace the Python interpreter!
● Code generation done with:
○ LLVM (for CPU)
○ NVVM (for CUDA GPUs).
How Does Numba Work?
Numba on the CPU
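The original slide shows code on screen; as a stand-in, here is a minimal sketch of Numba on the CPU (the function name and data are illustrative, not the slide's exact code). The @jit decorator compiles the function to machine code the first time it is called.

from numba import jit
import numpy as np

@jit(nopython=True)
def sum_of_squares(arr):
    # This loop runs as compiled machine code, not interpreted Python
    total = 0.0
    for x in arr:
        total += x * x
    return total

data = np.arange(1_000_000, dtype=np.float64)
print(sum_of_squares(data))  # first call triggers compilation; later calls reuse the compiled code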
CUDA Kernels in Python
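These slides show kernel code; as a stand-in, a minimal hedged sketch of a CUDA kernel written with numba.cuda (the kernel name and element-wise addition are illustrative assumptions):

from numba import cuda

@cuda.jit
def add_kernel(x, y, out):
    i = cuda.grid(1)      # absolute index of this thread within the 1D grid
    if i < x.size:        # guard threads that fall past the end of the array
        out[i] = x[i] + y[i]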
Calling the Kernel from Python
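A hedged example of launching the add_kernel sketched on the previous slide: the launch configuration goes in square brackets, as kernel[blocks_per_grid, threads_per_block](args).

import numpy as np

n = 1_000_000
x = np.arange(n, dtype=np.float32)
y = 2 * x
out = np.empty_like(x)

threads_per_block = 128
blocks_per_grid = (n + threads_per_block - 1) // threads_per_block

# Passing NumPy arrays directly makes Numba copy them to and from the GPU around the launch
add_kernel[blocks_per_grid, threads_per_block](x, y, out)
print(out[:5])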
Handling Device Memory Directly
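A sketch, under the same assumptions, of managing device memory explicitly so arrays are not copied on every kernel launch:

from numba import cuda
import numpy as np

n = 1_000_000
x = np.arange(n, dtype=np.float32)
y = 2 * x

d_x = cuda.to_device(x)              # copy host -> device once
d_y = cuda.to_device(y)
d_out = cuda.device_array_like(d_x)  # allocate the output on the device, no host copy

threads_per_block = 128
blocks_per_grid = (n + threads_per_block - 1) // threads_per_block
add_kernel[blocks_per_grid, threads_per_block](d_x, d_y, d_out)

out = d_out.copy_to_host()           # copy device -> host only when the result is needed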
Higher Level Tools: GPU ufuncs
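A hedged sketch of a GPU ufunc (the function name and formula are illustrative): @vectorize with target='cuda' turns a scalar function into a NumPy-style ufunc that runs element-wise on the GPU, handling allocation, broadcasting, and the kernel launch for you.

import math
import numpy as np
from numba import vectorize

@vectorize(['float32(float32, float32)'], target='cuda')
def gpu_dist(x, y):
    return math.sqrt(x * x + y * y)

a = np.random.rand(1_000_000).astype(np.float32)
b = np.random.rand(1_000_000).astype(np.float32)
c = gpu_dist(a, b)   # looks like a normal NumPy ufunc call, but executes on the GPU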
GPU ufuncs Performance
PySpark
What is Apache Spark

● An API and an execution engine for distributed computing on a cluster
● Based on the concept of Resilient Distributed Datasets (RDDs)
○ Dataset: Collection of independent elements (files, objects, etc.) in memory from previous calculations, or originating from some data store
○ Distributed: Elements in RDDs are grouped into partitions and may be stored on different nodes
○ Resilient: RDDs remember how they were created, so if a node goes down, Spark can recompute the lost elements on another node
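A small hedged PySpark example of the three RDD properties above (the data and application name are made up for illustration):

from pyspark import SparkContext

sc = SparkContext(appName="rdd-demo")

# Dataset: a collection of elements, here created from a Python range
# Distributed: split into 8 partitions that may live on different nodes
rdd = sc.parallelize(range(1_000_000), numSlices=8)

# Resilient: Spark records the lineage (parallelize -> map -> filter), so a lost
# partition can be recomputed on another node instead of being reloaded
even_squares = rdd.map(lambda x: x * x).filter(lambda x: x % 2 == 0)
print(even_squares.count())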
Computation DAGs

Fig from:
https://databricks.com/blog/2015/06/22/understanding-your-spark-application-through-visualization.html
How Does Spark Scale?

● All cluster scaling is about minimizing I/O. Spark does this in several ways:
○ Keep intermediate results in memory with rdd.cache()
○ Move computation to the data whenever possible (functions are small and data is big!)
○ Provide computation primitives that expose parallelism and minimize communication between workers: map, filter, sample, reduce, …
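A hedged sketch of those primitives in PySpark (assumes an existing SparkContext sc; the data is illustrative):

data = sc.parallelize(range(10_000_000))

# Keep intermediate results in memory so later actions do not recompute them
multiples_of_three = data.filter(lambda x: x % 3 == 0).cache()

# map/reduce move small functions to the data and minimize worker-to-worker communication
total = multiples_of_three.map(lambda x: x * x).reduce(lambda a, b: a + b)
sample = multiples_of_three.sample(withReplacement=False, fraction=0.001)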
Python and Spark

● Spark is implemented in Java & Scala on the JVM
● Full API support for Scala, Java, and Python (+ limited support for R)
● How does Python work, since it doesn't run on the JVM (not counting IronPython)?
Tutorial
Notebook Link: TBA
Conclusion
PySpark and Numba for GPU Clusters

● Numba lets you create compiled CPU and CUDA functions right inside your Python applications.
● Numba can be used with Spark to easily distribute and run your code on Spark workers with GPUs (sketched below)
● There is room for improvement in how Spark interacts with the GPU,
but things do work.
● Beware of accidentally multiplying fixed initialization and compilation
costs.
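A hedged sketch of combining the two (assumes an existing SparkContext sc and Spark workers with CUDA GPUs): defining the GPU ufunc inside mapPartitions means the compilation cost is paid once per partition rather than once per element, which is the fixed-cost pitfall mentioned above.

import numpy as np

def process_partition(iterator):
    from numba import vectorize          # import and compile on the worker

    @vectorize(['float32(float32)'], target='cuda')
    def gpu_square(x):
        return x * x

    arr = np.fromiter(iterator, dtype=np.float32)
    return gpu_square(arr).tolist()

result = sc.parallelize(range(1_000_000), numSlices=4).mapPartitions(process_partition)
print(result.take(5))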
Thank You
