
Apache Spark with Docker

Achmad Ginanjar @2022


To do list

Install WSL2

Install Docker

Install Apache Spark


WSL 2 Installation
• Manual installation steps for older versions of WSL | Microsoft Docs:
https://docs.microsoft.com/en-us/windows/wsl/install-manual#step-4---download-the-linux-kernel-update-package
• Follow the guide up to step 4, which installs the Windows Subsystem for Linux.
• At step 4, install Docker; choose whichever option you prefer.
• Once the Docker Desktop installation finishes, return to the WSL 2 installation steps.
Docker up and running

Run Docker
Open a command prompt and make sure the command above runs as shown on the slide.
• Windows: “cmd”
• Linux: “terminal”
Go to Docker Hub:

https://hub.docker.com/r/jupyter/all-spark-notebook/
Installation

docker pull jupyter/all-spark-notebook

This installs Scala, Spark, and Jupyter Notebook.
Run the Spark container

docker run --rm -p 4040:4040 -p 8888:8888 -p 8080:8080 -v C:\Users\mambo\Documents\mengajar\sparkFolder:/home/jovyan/work -e GRANT_SUDO=yes --user root jupyter/all-spark-notebook

• -p publishes container ports to the host: 4040 is the Spark UI, 8888 is the Jupyter server
• -v mounts the local sparkFolder into /home/jovyan/work inside the container
• -e GRANT_SUDO=yes together with --user root enables sudo inside the container
• --rm removes the container when it stops
To access the server, open this file in a browser:
• file:///home/jovyan/.local/share/jupyter/runtime/jpserver-8-open.html
• Or copy and paste one of these URLs:
http://b53b3edc1e9b:8888/lab?token=dd845b81e8cb158faee76c1b849c955b34327236b5caea8e
• or
http://127.0.0.1:8888/lab?token=dd845b81e8cb158faee76c1b849c955b34327236b5caea8e
Load the Spark kernel

%%init_spark
# Spark configuration for local mode ("local[*]" would use all CPU cores)
launcher.master = "local"

Run the following code:

val rdd = sc.parallelize(0 to 100)


rdd.sum()
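As a quick sanity check, the same sum can be computed in plain Scala without Spark; `rdd.sum()` above should return the same value (as a Double):

```scala
// Cross-check the Spark result locally: sum of the integers 0..100.
val localSum = (0 to 100).sum
println(localSum)  // 5050
```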
Open the Spark console

http://localhost:4040

val rdd = sc.parallelize(0 to 10000)
rdd.sum()
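The larger sum can also be verified with the closed-form formula: the sum of 0..n is n(n+1)/2. A quick check in plain Scala:

```scala
// Closed form for the sum 0..n, using Long to avoid Int overflow.
val n = 10000L
val expected = n * (n + 1) / 2
println(expected)  // 50005000
```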
Thank you
