WebAssembly For AI Infra - A Lightweight, Fast, and Secure Alternative Approach
@MileyFu
https://github.com/WasmEdge/WasmEdge
Content
Give talks
Germany
○ Docker Container
Speedups from performance engineering a program that multiplies two 4096-by-4096 matrices.
● Performance Bottlenecks
● Parallelism: the GIL ensures that only one thread executes Python bytecode at a time in a single process
● Memory Management
How about Python + C/C++/Rust?
Portability Issue and Complex Integration
● Maintenance Cost
Some text is quoted from https://wasmedge.org/wasm_linux_container
Image source: https://medium.com/@shivraj.jadhav82/webassembly-wasm-docker-vs-wasm-275e317324a1
Rust + WebAssembly
● Performance and memory safety
● Concurrency
● Powerful and expressive type system
● Cargo, a modern package management tool
● Rapidly growing ecosystem: ndarray, llm, candle, burn, …
WebAssembly: Lighter, faster & more secure
Image source: https://medium.com/@shivraj.jadhav82/webassembly-wasm-docker-vs-wasm-275e317324a1
https://wasmedge.org/wasm_linux_container/
The technology path of virtualization
What’s next?
Application containers
(e.g., Docker)
Hypervisor VM and microVMs
Browser IoT
AI Inference
SaaS Plugin Microservices
Run LLM on your Mac/ Across Devices
Begin with a single command to install the WasmEdge runtime with LLM support.
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugin wasi_nn-ggml
Or you can download and copy the WasmEdge install files manually, following the installation guide here.
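After installation, it helps to confirm that the `wasmedge` binary is reachable from the current shell. The following is a minimal sketch, not part of the official instructions: `wasmedge_ready` is a hypothetical helper, and it assumes the installer places an `env` file under `$HOME/.wasmedge` that adds the runtime to `PATH`.

```shell
# Hypothetical helper: make sure the wasmedge binary is reachable in this shell.
wasmedge_ready() {
  # Assumption: the installer writes an env file under $HOME/.wasmedge.
  if ! command -v wasmedge >/dev/null 2>&1 && [ -f "$HOME/.wasmedge/env" ]; then
    . "$HOME/.wasmedge/env"
  fi
  command -v wasmedge >/dev/null 2>&1
}

if wasmedge_ready; then
  wasmedge --version
else
  echo "wasmedge not found; open a new shell or re-run the installer" >&2
fi
```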
2. Download an LLM Chat App in Wasm
Next, get the ultra-small 2 MB cross-platform binary: the LLM chat application. It requires no other dependencies and runs across various environments. This small Wasm file is compiled from Rust. To build your own, check out the llama-utils repo.
curl -LO https://github.com/second-state/llama-utils/raw/main/chat/llama-chat.wasm
3. Download the Llama2 7b Chat Model
curl -LO https://huggingface.co/wasmedge/llama2/resolve/main/llama-2-7b-chat-q5_k_m.gguf
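A quick sanity check can catch a bad download (for instance, an HTML page saved in place of the model) before the runtime tries to load it. The sketch below relies on the fact that GGUF files begin with the four ASCII bytes `GGUF`; the helper name `is_gguf` is hypothetical and not part of WasmEdge or llama-utils.

```shell
# Hypothetical helper: succeed only if the file starts with the GGUF magic bytes.
is_gguf() {
  [ -f "$1" ] || return 1
  # Read the first 4 bytes; strip NUL bytes so the shell can hold the result.
  magic=$(head -c 4 "$1" | tr -d '\000')
  [ "$magic" = "GGUF" ]
}

# Default to the file name used in step 3; pass another path as the first argument.
model="${1:-llama-2-7b-chat-q5_k_m.gguf}"
if is_gguf "$model"; then
  echo "$model looks like a GGUF model file"
else
  echo "$model is missing or is not a GGUF file; try downloading it again" >&2
fi
```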
Now that you have everything set up, you can start chatting with your Llama2
7b chat-powered LLM using the command line.
wasmedge --dir .:. --nn-preload default:GGML:AUTO:llama-2-7b-chat-q5_k_m.gguf llama-chat.wasm
After that, you can ask Llama2 7b chat any question.
That’s it! You can use the same llama-chat.wasm file to run other LLMs,
like OpenChat, CodeLlama, Mistral, etc.
https://www.secondstate.io/run-llm/
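Because only the model file changes between runs, the run command can be wrapped in a small function. This is a sketch under stated assumptions: `chat_command` is a hypothetical name, it reuses only the flags shown in the slide, and some models may need extra options (such as a prompt template) that are not covered here. It prints the command instead of executing it, so it is safe to try without a model on disk.

```shell
# Hypothetical helper: print the wasmedge command line for a given GGUF model.
# $1 = model file, $2 = Wasm app (defaults to llama-chat.wasm from step 2).
chat_command() {
  printf 'wasmedge --dir .:. --nn-preload default:GGML:AUTO:%s %s\n' \
    "$1" "${2:-llama-chat.wasm}"
}

# Print the command for the model downloaded in step 3.
chat_command llama-2-7b-chat-q5_k_m.gguf
```

To start a chat, copy the printed line into the terminal, or run `eval "$(chat_command your-model.gguf)"` for file names without spaces.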
LLaMa-2 and WasmEdge on RHEL9
1. Install the NVIDIA driver and the CUDA tools and libraries (official repo)
2. dnf install gcc-toolset-12 ninja-build
3. git clone https://github.com/WasmEdge/WasmEdge.git
4. cd WasmEdge
5. Drop into a shell with the proper paths for the optional toolchain: scl enable gcc-toolset-12 bash
6. Build: cmake -GNinja -Bbuild -DCMAKE_BUILD_TYPE=Release -DWASMEDGE_PLUGIN_WASI_NN_GGML_LLAMA_CUBLAS="yes" -DWASMEDGE_PLUGIN_WASI_NN_BACKEND="ggml" && cmake --build build
7. Install: sudo cmake --install build
8. Now you can download the model and the chat app, and run them according to the original instructions
https://pushf.substack.com/p/llama-2-and-wasmedge-on-rhel9-aka
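Before starting the build in step 6, a short preflight check can confirm that the required tools are on `PATH`. This is only a sketch: `missing_tools` is a hypothetical helper, and the binary names (`ninja` for the ninja-build package, `nvcc` for the CUDA compiler) are assumptions about a typical setup, where `nvcc` in particular may live outside the default `PATH`. Run it inside the shell opened in step 5.

```shell
# Hypothetical helper: print every tool from the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Tools the build steps rely on (binary names are assumptions, see above).
missing=$(missing_tools git cmake ninja gcc nvcc)
if [ -n "$missing" ]; then
  # Left unquoted on purpose so the names print on a single line.
  echo "Not found on PATH:" $missing >&2
else
  echo "All build tools found"
fi
```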
Beyond Language AI
Not limited to LLM tasks: vision and audio as well. Besides the ggml backend, WasmEdge Runtime supports the PyTorch, TensorFlow, and OpenVINO AI frameworks.
Discover how you can apply vision and audio AI with projects like
mediapipe-rs.
• Mediapipe-rs: a Rust library for MediaPipe tasks
• Source code: Mediapipe-rs GitHub
• Tutorial: Mediapipe solutions
https://github.com/WasmEdge/mediapipe-rs
Build Serverless AI Apps
- LLM libraries for Rust developers, covering ChatGPT, Claude, and the Llama2 series
- SaaS tools like GitHub, Discord, Telegram, Slack, etc.
Serverless:
☑ business logic
❎ no compilation and deployment of Rust functions
Learn Rust Bot (RAG-based)
https://flows.network/learn-rust
https://www.cncf.io/blog/2023/06/06/a-chatgpt-powered-code-reviewer-bot-for-open-source-projects/
Code Review Bot
Keep in Touch
https://github.com/WasmEdge/WasmEdge