Professional Documents
Culture Documents
Building Tomorrow’s
Networking Data Plane
3
IPv6 could address both scale and
performance requirements
4
“The Cilium Experiment”
5
“The Cilium Experiment”
Source: https://www.slideshare.net/ThomasGraf5/cilium-fast-ipv6-container-networking-with-bpf-and-xdp 6
“The Cilium Experiment”
7
“The Cilium Experiment”
8
“The Cilium Experiment”
9
“The Cilium Experiment”
(2016)
Source: https://www.slideshare.net/ThomasGraf5/cilium-fast-ipv6-container-networking-with-bpf-and-xdp 10
“The Cilium Experiment”
11
K8s and IPv6
12
K8s and IPv6
13
K8s and IPv6
14
K8s and IPv6
15
K8s and IPv6
17
18
K8s and IPv6
19
K8s and IPv6
Source: https://whynoipv6.com/ 20
K8s and IPv6
21
K8s and IPv6
Source: https://github.com/community/community/discussions/10539 22
K8s and IPv6
23
K8s and IPv6
25
K8s, IPv6 and Cilium
26
K8s, IPv6 and Cilium: Stateful NAT46
A 1.2.3.4 DNS
ext. node
SYN 1.2.3.4 K8s LB svc: [f00d::1]:y
SYN f00d::1
IPv4 / Internet
NAT46
1.2.3.4:x -> [f00d::1]:y
27
(dual stack)
K8s, IPv6 and Cilium: Stateful NAT46
A 1.2.3.4 DNS
ext. node
SYN 1.2.3.4 K8s LB svc: [f00d::1]:y
SYN f00d::1
$ cilium service list
IPv4 / Internet ID Frontend Service Type Backend
NAT46
22 [f00d::1]:80 LoadBalancer 1 => [2001:db8:42::30b8]:80 (active)
1.2.3.4:x -> [f00d::1]:y
2 => [2001:db8:42::a07f]:80 (active)
28
(dual stack)
K8s, IPv6 and Cilium: Stateful NAT46
30
K8s, IPv6 and Cilium: Stateful NAT46
31
K8s, IPv6 and Cilium: Stateful NAT46
32
K8s, IPv6 and Cilium: Stateful NAT46
34
K8s, IPv6 and Cilium: Stateful NAT46
36
K8s, IPv6 and Cilium: Stateless NAT46
A 1.2.3.4 DNS
ext. node
SYN 1.2.3.4 K8s LB svc: [64:ff9b::1.2.3.4]:x
SYN 64:ff9b::1.2.3.4
IPv4 / Internet
NAT46
1.2.3.4 -> [64:ff9b::1.2.3.4]
37
(Dual Stack)
K8s, IPv6 and Cilium: Stateless NAT46
A 1.2.3.4 DNS
ext. node
SYN 1.2.3.4 K8s LB svc: [64:ff9b::1.2.3.4]:x
SYN 64:ff9b::1.2.3.4
$ cilium service list
IPv4 / Internet ID Frontend Service Type Backend
NAT46
22 [64:ff9b::102:304]:80 LoadBalancer 1 => [2001:db8:42::30b8]:80 (active)
1.2.3.4 -> [64:ff9b::1.2.3.4]
2 => [2001:db8:42::a07f]:80 (active)
38
(Dual Stack)
K8s, IPv6 and Cilium: Stateless NAT46
39
K8s, IPv6 and Cilium: Stateless NAT46
RFC6052: https://www.rfc-editor.org/rfc/rfc6052#section-2.1 40
K8s, IPv6 and Cilium: Stateless NAT46
42
K8s, IPv6 and Cilium: Stateless NAT46
43
K8s, IPv6 and Cilium: Stateless NAT46
44
45
K8s, IPv6 and Cilium
46
K8s, IPv6 and Cilium: Stateless NAT64 + DNS64
47
K8s, IPv6 and Cilium: Stateless NAT64 + DNS64
48
K8s, IPv6 and Cilium: Stateless NAT64 + DNS64
49
K8s, IPv6 and Cilium: Stateless NAT64 + DNS64
140.82.121.4
(Embedded IPv4 address)
51
K8s, IPv6 and Cilium: Stateless NAT64 + DNS64
https://coredns.io/plugins/dns64/
54
K8s, IPv6 and Cilium: Stateless NAT64
55
K8s, IPv6 and Cilium: Stateless NAT64
56
K8s, IPv6 and Cilium: Stateless NAT64
57
K8s, IPv6 and Cilium
58
59
60
K8s and IPv6
61
K8s and IPv6
62
IPv6, Cilium: BIG TCP
65
IPv6, Cilium: BIG TCP
66
IPv6, Cilium: BIG TCP App
TCP/UDP TCP/UDP
Routing Routing
tc eBPF tc eBPF
GRO Qdiscs
XDP/eBPF GSO
1.5k
NIC NIC
LRO TSO 1.5k
67
IPv6, Cilium: BIG TCP
68
IPv6, Cilium: BIG TCP App
TCP/UDP TCP/UDP
Routing Routing
69
IPv6, Cilium: BIG TCP
70
IPv6, Cilium: BIG TCP
Ethernet header
TCP header
Payload (aggregated)
TCP/UDP TCP/UDP
tc eBPF tc eBPF
GRO Qdiscs
XDP/eBPF GSO
1.5k
NIC NIC
LRO TSO 1.5k
74
IPv6, Cilium: BIG TCP
75
IPv6, Cilium: BIG TCP
Back to back: AMD Ryzen 9 3950X @ 3.5 GHz, 128G RAM @ 3.2 GHz, PCIe 4.0, ConnectX-6 Dx, mlx5 driver 78
netperf -t TCP_RR -H <remote pod> -- -r 80000,80000 -O MIN_LATENCY,P90_LATENCY,P99_LATENCY,THROUGHPUT
IPv6, Cilium: BIG TCP
Back to back: AMD Ryzen 9 3950X @ 3.5 GHz, 128G RAM @ 3.2 GHz, PCIe 4.0, ConnectX-6 Dx, mlx5 driver 79
netperf -t TCP_RR -H <remote pod> -- -r 80000,80000 -O MIN_LATENCY,P90_LATENCY,P99_LATENCY,THROUGHPUT
IPv6, Cilium: BIG TCP
Back to back: AMD Ryzen 9 3950X @ 3.5 GHz, 128G RAM @ 3.2 GHz, PCIe 4.0, ConnectX-6 Dx, mlx5 driver, LRO on 80
IPv6, Cilium: BIG TCP
Back to back: AMD Ryzen 9 3950X @ 3.5 GHz, 128G RAM @ 3.2 GHz, PCIe 4.0, ConnectX-6 Dx, mlx5 driver, LRO on 81
IPv6, Cilium: BIG TCP
83
Cilium: Tomorrow’s Data Plane
Key features of Cilium’s high-scale, opinionated dataplane (extract):
86
Cilium: meta devices for eBPF
Upper stack
(IP, netfilter,
routing, …)
veth veth
87
Cilium: meta devices for eBPF
Upper stack
(IP, netfilter,
routing, …)
bpf_redirect_peer()
veth veth
bpf_redirect_peer()
veth veth
Upper stack
(IP, netfilter,
routing, …)
bpf_redirect_neigh()
veth veth
bpf_redirect_neigh()
veth veth
Upper stack
(IP, netfilter,
routing, …)
veth veth
Upper stack
(IP, netfilter,
routing, …)
meta meta
98
Cilium: meta devices for eBPF Pod with
meta:
Internals for veth (today):
99
Cilium: meta devices for eBPF Pod with
meta:
Internals for meta (new):
Pod with veth:
meta_xmit()
- scrubs packet meta data
- switches netns to host
- Cilium’s eBPF prog called for meta
- Redirect to phys dev directly
without backlog queue
100
Cilium: meta devices for eBPF
Back to back: AMD Ryzen 9 3950X @ 3.5 GHz, 128G RAM @ 3.2 GHz, PCIe 4.0, ConnectX-6 Dx, mlx5 driver 101
netperf -t TCP_RR -H <remote pod> -- -O MIN_LATENCY,P90_LATENCY,P99_LATENCY,THROUGHPUT
Cilium: meta devices for eBPF
Back to back: AMD Ryzen 9 3950X @ 3.5 GHz, 128G RAM @ 3.2 GHz, PCIe 4.0, ConnectX-6 Dx, mlx5 driver 102
netperf -t TCP_RR -H <remote pod> -- -O MIN_LATENCY,P90_LATENCY,P99_LATENCY,THROUGHPUT
Cilium: meta devices for eBPF
throughput on par
with host
Back to back: AMD Ryzen 9 3950X @ 3.5 GHz, 128G RAM @ 3.2 GHz, PCIe 4.0, ConnectX-6 Dx, mlx5 driver 103
netperf -t TCP_STREAM -H <remote pod> -l 60
Cilium: Tomorrow’s Data Plane - Summary
Kube-proxy replacement
Bandwidth Manager
IPv6 BIG TCP meta devices
meta meta
eBPF host routing
104
Acknowledgements
➔ Eric Dumazet
➔ Coco Li
➔ Yuchung Cheng
➔ Martin Lau
➔ John Fastabend
➔ K8s, Cilium, BPF & netdev kernel community
105
Thank you! Questions?
107