This document provides instructions for installing Apache Spark using Docker on Windows. It includes the following steps:
1. Install WSL2 and Docker on your Windows system.
2. Pull the jupyter/all-spark-notebook Docker image, which bundles Scala, Spark, and Jupyter Notebook, from Docker Hub.
3. Run the Spark container with ports exposed and a local folder mounted, and obtain the Jupyter notebook URL.
4. Use the Jupyter notebook to initialize Spark and run simple Spark code on a test RDD to verify the installation works locally.
WSL 2 Installation
• Manual installation steps for older versions of WSL (Microsoft Docs): https://docs.microsoft.com/en-us/windows/wsl/install-manual#step-4---download-the-linux-kernel-update-package
• Follow the guide through step 4.
• Install the Linux subsystem at step 4.

Install Docker
• Choose the options you want.
• Docker Desktop installation finished.
• Return to the WSL 2 installation section.

Docker up and running

Run Docker
• Open a command prompt and make sure the command above runs as shown on the slide.
• Windows: "cmd"
• Linux: "terminal"

Go to:
the jupyter/all-spark-notebook image on Docker Hub

Run spark container
    docker run --rm -p 4040:4040 -p 8888:8888 -p 8080:8080 -v C:\Users\mambo\Documents\mengajar\sparkFolder:/home/jovyan/work -e GRANT_SUDO=yes --user root jupyter/all-spark-notebook

When the container starts, it prints the Jupyter URLs:
    To access the server, open this file in a browser:
    • file:///home/jovyan/.local/share/jupyter/runtime/jpserver-8-open.html
    • Or copy and paste one of these URLs: http://b53b3edc1e9b:8888/lab?token=dd845b81e8cb158faee76c1b849c955b34327236b5caea8e
    • or http://127.0.0.1:8888/lab?token=dd845b81e8cb158faee76c1b849c955b34327236b5caea8e
From a browser on the Windows host, use the 127.0.0.1 URL. The first URL uses the container's hostname.

Load spark kernel
    %%init_spark
    # configure Spark for local mode
    launcher.master="local"

Run the following code:
    val rdd = sc.parallelize(0 to 100)
    rdd.sum()

Open spark console
• http://localhost:4040
    val rdd = sc.parallelize(0 to 10000)
    rdd.sum()
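As a sanity check on the two notebook results, the sum of the integers from 0 to n has the closed form n(n+1)/2, so the expected values can be worked out without Spark. The sketch below is plain Scala and should run in the same notebook. `sumUpTo` is a hypothetical helper name introduced here for illustration and is not part of Spark. It returns a Double because `rdd.sum()` on a numeric RDD also returns a Double.

```scala
// Plain Scala, no SparkContext required.
// sumUpTo is a hypothetical helper (not a Spark API): closed form for 0 + 1 + ... + n.
def sumUpTo(n: Int): Double = n.toLong * (n + 1L) / 2.0

// Cross-check the closed form against a direct sum over the same range.
assert(sumUpTo(100) == (0 to 100).map(_.toDouble).sum)

println(sumUpTo(100))    // 5050.0     expected from sc.parallelize(0 to 100).sum()
println(sumUpTo(10000))  // 5.0005E7   expected from sc.parallelize(0 to 10000).sum()
```

If the notebook shows 5050.0 for the first RDD and 5.0005E7 (that is, 50,005,000) for the second, Spark is computing correctly in local mode.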