
Node Resource Interface

Extensible resource interface for containers

@crosbymichael - Apple
The Problem…
Resource Management
Cgroups and Topology

• Workload Performance Requirements

• Batch

• Latency Sensitive

• Customer Requirements

• SLA/SLO

• My workload is P1!
Resource Management
Cgroups and Topology

• CPU

• Schedule across cores

• Hyperthreads

• NUMA

• Allocation of an entire node

• L3 Cache

• Hugepages
Resource Management
Cgroups and Topology

• DPDK

• VM Isolation

• Proximity

• GPU

• Network
Large Matrix
Current Solutions
Kubelet

• CPU Manager

• A few KEPs propose improvements and extensions

• Weird UX

• Requests == limits OK!

• Off by default

• Topology Manager

• Hint providers
Current Solutions
Intel CPU Manager for Kube

• CMK cli

• Manages topology and pinning of resources

• CRI Resource Manager

• Implements CRI to create an interface for Kube

• Intel specific labels


QoS is hard
Everyone has different requirements
Focus on APIs not implementations!
Networking
Container Network Interface

• Simple

• Elegant

• Extensible

• Composable

• No controversy in the design that I know of


Let's make “CNI” for Resources
NRI
Because CRI was already taken :(

• Kubelet is not the right abstraction for this

• The lines between kubelet and CRI are getting too blurry

• Hook into the lifecycle of containers at the CRI level

• CRI implementations like containerd are robust and know how to interface with the underlying host systems

• CRIs already support CNI for networking


NRI
Config

{
  "version": "0.1",
  "plugins": {
    "konfine": {
      "systemReserved": [0, 1]
    }
  }
}
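As a rough illustration of how a plugin might read this config, here is a minimal Go sketch. The struct names (Config, KonfineConfig) and the helper parseKonfine are assumptions made for this example and are inferred only from the JSON keys shown above; they are not taken from the NRI or konfine source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config mirrors the top-level layout of the example config.
// Hypothetical type: field names are inferred from the JSON keys only.
type Config struct {
	Version string `json:"version"`
	// Each plugin section is kept raw so every plugin can decode its own options.
	Plugins map[string]json.RawMessage `json:"plugins"`
}

// KonfineConfig is a hypothetical typed view of the "konfine" section.
type KonfineConfig struct {
	SystemReserved []int `json:"systemReserved"`
}

// example is the config from the slide.
const example = `{
  "version": "0.1",
  "plugins": {
    "konfine": {
      "systemReserved": [0, 1]
    }
  }
}`

// parseKonfine decodes the document and extracts the konfine section.
func parseKonfine(data []byte) (string, *KonfineConfig, error) {
	var c Config
	if err := json.Unmarshal(data, &c); err != nil {
		return "", nil, fmt.Errorf("decode config: %w", err)
	}
	raw, ok := c.Plugins["konfine"]
	if !ok {
		return c.Version, nil, fmt.Errorf("no konfine section in config")
	}
	var k KonfineConfig
	if err := json.Unmarshal(raw, &k); err != nil {
		return c.Version, nil, fmt.Errorf("decode konfine section: %w", err)
	}
	return c.Version, &k, nil
}

func main() {
	version, k, err := parseKonfine([]byte(example))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("version:", version, "reserved CPUs:", k.SystemReserved)
}
```

Keeping each plugin section as raw JSON means the top-level parser does not need to know any plugin's options, which fits the "extensible" goal.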
NRI
Skeleton

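package main

import (
	"context"
	"fmt"
	"os"

	"github.com/containerd/containerd/pkg/nri/skel"
	// Presumably used by the plugin implementation, which this slide omits.
	"github.com/sirupsen/logrus"
)

func main() {
	ctx := context.Background()

	// konfine is the plugin type; its definition is not shown on this slide.
	if err := skel.Run(ctx, &konfine{}); err != nil {
		fmt.Fprintf(os.Stderr, "%s", err)
		os.Exit(1)
	}
}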
NRI
Integration in a CRI

if _, err := nri.Invoke(ctx, task, "create"); err != nil {
	task.Delete(ctx, containerd.WithProcessKill)
	container.Delete(ctx, containerd.WithSnapshotCleanup)
	return errors.Wrap(err, "nri invoke")
}

defer func() {
	if _, err := nri.Invoke(ctx, task, "delete"); err != nil {
		fmt.Println(err)
	}
}()

if err := task.Start(ctx); err != nil {
	return err
}
konfine
Dynamic Topology and QoS Management Plugin
konfine

• Builds a dynamic node topology

• Dynamic placement of workloads based on QoS class

• NUMA Support

• Supports default (batch) and latency sensitive
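To make the QoS-based placement idea concrete, here is a minimal Go sketch. It is not konfine's implementation: the Allocator type, its methods, and the policy (batch workloads share a pool, latency-sensitive workloads get exclusive CPUs carved out of that pool) are assumptions chosen for illustration, and the sketch ignores NUMA entirely.

```go
package main

import (
	"fmt"
	"sort"
)

// QoS is a hypothetical workload class: default (batch) or latency sensitive.
type QoS int

const (
	Batch QoS = iota
	LatencySensitive
)

// Allocator is an illustrative stand-in for a topology-aware placer.
type Allocator struct {
	shared []int            // CPUs shared by all batch workloads
	pinned map[string][]int // exclusive CPUs, keyed by container ID
}

// NewAllocator builds the shared pool from all CPUs except the reserved ones.
func NewAllocator(numCPU int, systemReserved []int) *Allocator {
	reserved := make(map[int]bool)
	for _, c := range systemReserved {
		reserved[c] = true
	}
	a := &Allocator{pinned: make(map[string][]int)}
	for c := 0; c < numCPU; c++ {
		if !reserved[c] {
			a.shared = append(a.shared, c)
		}
	}
	return a
}

// Place returns the CPUs a container may run on.
// Batch containers get a copy of the shared pool (want is ignored);
// latency-sensitive containers get `want` exclusive CPUs taken from it.
func (a *Allocator) Place(id string, class QoS, want int) ([]int, error) {
	if class == Batch {
		return append([]int(nil), a.shared...), nil
	}
	if want <= 0 {
		return nil, fmt.Errorf("container %s: want must be positive", id)
	}
	// Always leave at least one CPU in the shared pool for batch work.
	if want >= len(a.shared) {
		return nil, fmt.Errorf("container %s: not enough free CPUs", id)
	}
	cut := len(a.shared) - want
	cpus := append([]int(nil), a.shared[cut:]...)
	a.shared = a.shared[:cut]
	a.pinned[id] = cpus
	return cpus, nil
}

// Release returns a container's exclusive CPUs to the shared pool.
func (a *Allocator) Release(id string) {
	cpus, ok := a.pinned[id]
	if !ok {
		return
	}
	delete(a.pinned, id)
	a.shared = append(a.shared, cpus...)
	sort.Ints(a.shared)
}

func main() {
	a := NewAllocator(8, []int{0, 1}) // CPUs 0 and 1 are system reserved
	db, _ := a.Place("db", LatencySensitive, 2)
	job, _ := a.Place("job", Batch, 0)
	fmt.Println("db:", db, "job:", job)
	a.Release("db")
	fmt.Println("shared after release:", a.shared)
}
```

A real plugin would also have to shrink the cpusets of batch containers that are already running when CPUs leave the shared pool; the sketch only tracks the bookkeeping.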


konfine

• No need to wait for kube release cycle

• Can be updated independently

• Chain multiple plugins together

• Keeps each plugin small, doing one thing and doing it well

• If it does not work for you

• Fork it, change it, make it your own

• Build more plugins for your needs
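Chaining small plugins can be pictured with a short Go sketch. Everything here is hypothetical: the Request, Result, and Plugin shapes, the stubPlugin type, and InvokeChain are simplified stand-ins invented for this example, not the NRI types or spec. The only idea carried over from the slides is that several single-purpose plugins run in order for one lifecycle event.

```go
package main

import (
	"context"
	"fmt"
)

// Request is a hypothetical, simplified lifecycle event for one container.
type Request struct {
	ContainerID string
	State       string // for example "create" or "delete"
}

// Result records which plugins ran successfully, in order.
type Result struct {
	Applied []string
}

// Plugin is a hypothetical interface: one name, one hook.
type Plugin interface {
	Type() string
	Invoke(ctx context.Context, r *Request, prev *Result) error
}

// stubPlugin is a toy plugin that either succeeds or fails on demand.
type stubPlugin struct {
	name string
	fail bool
}

func (p stubPlugin) Type() string { return p.name }

func (p stubPlugin) Invoke(ctx context.Context, r *Request, prev *Result) error {
	if p.fail {
		return fmt.Errorf("cannot handle %s for %s", r.State, r.ContainerID)
	}
	return nil
}

// InvokeChain runs the plugins in order and stops at the first error,
// returning what was applied so far so the caller can roll back.
func InvokeChain(ctx context.Context, plugins []Plugin, r *Request) (*Result, error) {
	res := &Result{}
	for _, p := range plugins {
		if err := p.Invoke(ctx, r, res); err != nil {
			return res, fmt.Errorf("plugin %s: %w", p.Type(), err)
		}
		res.Applied = append(res.Applied, p.Type())
	}
	return res, nil
}

// runDemo invokes a three-plugin chain; the second plugin can be made to fail.
func runDemo(failSecond bool) (*Result, error) {
	plugins := []Plugin{
		stubPlugin{name: "cpu"},
		stubPlugin{name: "hugepages", fail: failSecond},
		stubPlugin{name: "network"},
	}
	req := &Request{ContainerID: "c1", State: "create"}
	return InvokeChain(context.Background(), plugins, req)
}

func main() {
	res, err := runDemo(false)
	fmt.Println(res.Applied, err)
	res, err = runDemo(true)
	fmt.Println(res.Applied, err)
}
```

Stopping at the first failure and reporting what already ran mirrors the create-then-clean-up pattern in the CRI integration snippet, where a failed invoke leads to the task and container being deleted.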


Next Steps

• Formal Spec Proposal

• Demo plugins

• Containerd implementation for ctr and CRI


Thanks!
