[Figure: the network owner's requirements ("That's how I want to build my network!") drive the control plane protocol stacks (OSPF, BGP, PIM-SM, LLDP, ???) running on the NOS; for the data plane the owner adds "That's how packets should be processed!"]
• Efficiency. Reduce energy consumption and expand scale by doing only what you need
• Exclusivity and Differentiation. No need to share your IP with the chip vendor
• Consortium of academic and industry members
• Membership is free: contributions are welcome
• Independent, set up as a California nonprofit
• Open source, evolving, domain-specific language
• Permissive Apache license, code on GitHub today
[Figure: P4.org member categories: Operators/End Users, Systems, Targets, Solutions/Services, Academia/Research]
Copyright © 2017 – P4.org
P4_16 Design Goals
• Logical Evolution of P4_14
◦ Same basic building blocks and concepts
◦ More expressive and convenient to use
◦ More formally defined semantics
■ Strong Type System
◦ Support for good software engineering practices
• Target-Independence
◦ Support for a variety of targets (ASICs, FPGAs, NICs, software)
■ Language/Architecture separation
■ Flexible data plane model
• Non-goals
◦ New constructs
◦ General-purpose programming
PISA in Action
• Packet is parsed into individual headers (parsed representation)
• Headers and intermediate results can be used for matching and actions
• Headers can be modified, added or removed
• Packet is deparsed (serialized)
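The four steps above can be sketched in a few lines of Python. This is only an illustrative model, not how a PISA target is implemented: the function names (`parse`, `match_action`, `deparse`, `process`), the single Ethernet header, and the table keyed on EtherType are all assumptions made for the example.

```python
import struct

def parse(packet):
    """Split the packet into a parsed representation (headers) and the payload."""
    dst, src, ether_type = struct.unpack("!6s6sH", packet[:14])
    headers = {"ethernet": {"dst": dst, "src": src, "ether_type": ether_type}}
    return headers, packet[14:]

def swap_macs(headers):
    """Example action: modify header fields in place."""
    eth = headers["ethernet"]
    eth["dst"], eth["src"] = eth["src"], eth["dst"]

def no_op(headers):
    """Example default action: leave the headers untouched."""

def match_action(headers, table):
    """One match+action stage: match on a header field, then run the action."""
    action = table.get(headers["ethernet"]["ether_type"], no_op)
    action(headers)

def deparse(headers, payload):
    """Serialize the (possibly modified) headers back in front of the payload."""
    eth = headers["ethernet"]
    return struct.pack("!6s6sH", eth["dst"], eth["src"], eth["ether_type"]) + payload

def process(packet, table):
    headers, payload = parse(packet)
    match_action(headers, table)
    return deparse(headers, payload)

# Hypothetical frame: dst aa:.., src bb:.., EtherType 0x0800, then a payload.
frame = bytes.fromhex("aaaaaaaaaaaa" "bbbbbbbbbbbb" "0800") + b"payload"
out = process(frame, {0x0800: swap_macs})
```

The payload is never touched by the match+action stage; only the parsed headers are visible to it, which mirrors the split between parsed representation and packet body described above.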
Mapping a Simple L3 Data Plane Program on PISA
[Figure: the L2, IPv4, IPv6 and ACL tables mapped onto PISA match+action stages]
Mapping a More Complex Data Plane Program on PISA
[Figure: the L2 (Ethernet MAC address table), MPLS, IPv4 address, IPv6 address and ACL rules tables mapped onto PISA match+action stages]
P4_14 Switch Model
• Ingress Pipeline
• Egress Pipeline
• Traffic Manager
◦ N:1 Relationships: Queueing, Congestion Control
◦ 1:N Relationships: Replication
◦ Scheduling
[Figure: switch model with a programmable parser, an implicitly programmable deparser, and packet queuing, replication and scheduling between the pipelines]
• Multiple targets can support the same architecture
• A portable operating system or library can run on different CPUs
[Figure: community-developed vs. vendor-supplied components around an architecture with a programmable parser, a programmable deparser, and packet queuing, replication and scheduling]
• Parsers: state machine, bitfield extraction
• Controls: tables, actions, control flow statements
• Expressions: basic operations and operators
• Data Types: bitstrings, headers, structures, arrays
struct ingress_input_t {
    portid_t    ingress_port;
    timestamp_t ingress_timestamp;
}
struct ingress_output_t {
    portid_t egress_port;
    mgid_t   mcast_group;
    bit<1>   drop_flag;
    qid_t    egress_queue;
}
struct my_metadata_t {
    /* Nothing yet */
}
[Figure: P4 platform passing parser_input_t, ingress_input_t, ingress_output_t, meta and standard_metadata between blocks]
/* Parser Declaration */
parser MyParser(packet_in packet,
                out my_headers_t hdr,
                inout my_metadata_t meta,
                inout standard_metadata_t standard_metadata)
{
    . . .
    state parse_arp {
        packet.extract(hdr.arp);
        transition select(hdr.arp.htype,
                          hdr.arp.ptype,
                          hdr.arp.hlen,
                          hdr.arp.plen) {
            (ARP_HTYPE_ETHERNET,
             ARP_PTYPE_IPV4,
             ARP_HLEN_ETHERNET,
             ARP_PLEN_IPV4) : parse_arp_ipv4;
            default : accept;
        }
    }
    state parse_arp_ipv4 {
        packet.extract(hdr.arp_ipv4);
        transition accept;
    }
}

Internet Protocol (IPv4) over Ethernet ARP packet
Octet offset  Field                                                Header
0             Hardware type (HTYPE) (Ethernet = 0x0001)            hdr.arp
2             Protocol type (PTYPE) (IPv4 = 0x0800)                hdr.arp
4             Hardware address length (HLEN) |
              Protocol address length (PLEN)                       hdr.arp
6             Operation (OPER)                                     hdr.arp
8             Sender hardware address (SHA) (first 2 bytes)        hdr.arp_ipv4
10            (next 2 bytes)                                       hdr.arp_ipv4
12            (last 2 bytes)                                       hdr.arp_ipv4
14            Sender protocol address (SPA) (first 2 bytes)        hdr.arp_ipv4
16            (last 2 bytes)                                       hdr.arp_ipv4
18            Target hardware address (THA) (first 2 bytes)        hdr.arp_ipv4
20            (next 2 bytes)                                       hdr.arp_ipv4
22            (last 2 bytes)                                       hdr.arp_ipv4
24            Target protocol address (TPA) (first 2 bytes)        hdr.arp_ipv4
26            (last 2 bytes)                                       hdr.arp_ipv4
. . .
state parse_ipv4 {
    packet.extract(hdr.ipv4);
    verify(hdr.ipv4.version == 4, error.IPv4BadHeader);
    verify(hdr.ipv4.ihl >= 5, error.IPv4BadHeader);
    transition select(hdr.ipv4.ihl) {
        5: accept;
        default: parse_ipv4_options;
    }
}
. . .
state guess_mpls_payload {
transition select(packet.lookahead<ip46_t>().version) {
4 : parse_inner_ipv4;
6 : parse_inner_ipv6;
default : parse_inner_ethernet;
}
}
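The lookahead in guess_mpls_payload peeks at the next bits without consuming them. A rough Python equivalent is sketched here, assuming that ip46_t begins with a 4-bit version field (as both IPv4 and IPv6 headers do); the function name and the returned state names are reused from the P4 fragment for readability only.

```python
def guess_mpls_payload(payload: bytes) -> str:
    """Pick the next parser state from the first nibble after the MPLS stack."""
    if not payload:
        # A P4 parser would signal an error on a too-short packet; a plain
        # exception stands in for that here.
        raise ValueError("packet too short for lookahead")
    version = payload[0] >> 4  # peek only: the payload itself is not consumed
    if version == 4:
        return "parse_inner_ipv4"
    if version == 6:
        return "parse_inner_ipv6"
    return "parse_inner_ethernet"

# An IPv4 header starts with 0x45 (version 4, IHL 5); IPv6 starts with 0x6_.
next_state = guess_mpls_payload(bytes([0x45, 0x00]))
```

The guess can be wrong: an inner Ethernet frame whose destination MAC begins with nibble 4 or 6 would be misread as IP, which is why this is a heuristic rather than a definitive classification.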
apply {
    tmp = hdr.ethernet.dstAddr;
    hdr.ethernet.dstAddr = hdr.ethernet.srcAddr;
    hdr.ethernet.srcAddr = tmp;
    standard_metadata.egress_spec =
        standard_metadata.ingress_port;
}
}
[Figure: hdr and standard_metadata flowing through MyIngress(), the packet replication engine and MyEgress(); fields shown: ingress_port, egress_spec, egress_port, resubmit_flag, mcast_grp, egress_rid, clone_spec, clone_id, ingress_timestamp, packet_length, drop_flag, egress_queue, deq_qdepth]
action reflect_to_other_port() {
standard_metadata.egress_spec =
standard_metadata.ingress_port ^ 1;
}
action set_ipmcv4_mac_da_2() {
hdr.ethernet.dstAddr[47:24] = 0x01005E;
hdr.ethernet.dstAddr[23:23] = 0;
hdr.ethernet.dstAddr[22:0] = hdr.ipv4.dstAddr[22:0];
}
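The bit manipulation in the two actions above can be checked with a small Python model. This is a sketch with integers standing in for bit<9>, bit<32> and bit<48> values; the function names mirror the P4 actions, and the sample addresses are chosen only for illustration.

```python
import ipaddress

def reflect_to_other_port(ingress_port: int) -> int:
    """Flip the lowest bit of a 9-bit port number: 0<->1, 2<->3, 4<->5, ..."""
    return (ingress_port ^ 1) & 0x1FF

def ipmcv4_mac_da(ipv4_dst: int) -> int:
    """Build the IPv4 multicast destination MAC.

    Bits 47..24 are the fixed prefix 0x01005E, bit 23 is 0, and
    bits 22..0 are copied from the low 23 bits of the IPv4 address.
    """
    return (0x01005E << 24) | (ipv4_dst & 0x7FFFFF)

group = int(ipaddress.IPv4Address("239.255.255.250"))
mac = ipmcv4_mac_da(group)
mac_text = ":".join(f"{b:02x}" for b in mac.to_bytes(6, "big"))
```

Because only 23 of the 28 variable multicast address bits are copied, 32 different IPv4 groups share each MAC address.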
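[Figure: match-action table: headers and metadata form the lookup key; the lookup reports hit/miss; on a hit the entry's action selector picks the action code and the entry's action data supplies the directionless parameters, while directional parameters come from the data plane]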
Lookup key: 192.168.1.4

Key              Action          Action Data
192.168.1.1      l3_switch       port= mac_da= mac_sa= vlan=
192.168.1.2      l3_switch       port= mac_da= mac_sa= vlan=…
192.168.1.3      l3_drop
192.168.1.254    l3_l2_switch    port=
192.168.1.0/24   l3_l2_switch    port=
10.1.2.0/22      l3_switch_ecmp  ecmp_group=

• Data Plane
◦ Performs the lookup
◦ Executes the chosen action
• Control Plane (IP stack, Routing protocols)
◦ Populates table entries with specific information
■ Based on the configuration
■ Based on automatic discovery
■ Based on protocol calculations
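The division of labour between the two planes can be modelled as follows. This is a minimal sketch, not a description of any real target: the longest-prefix-match policy, the treatment of host keys as /32 prefixes, the miss action name, and every action-data value (port numbers, ECMP group) are assumptions, since the example table leaves those values blank.

```python
import ipaddress

table = []  # list of (network, action_name, action_data)

def add_entry(prefix, action, **action_data):
    """Control plane: populate the table with an entry."""
    # strict=False tolerates prefixes with host bits set, such as 10.1.2.0/22.
    table.append((ipaddress.ip_network(prefix, strict=False), action, action_data))

def lookup(dst):
    """Data plane: perform the lookup and return the chosen action and its data."""
    addr = ipaddress.ip_address(dst)
    hits = [e for e in table if addr in e[0]]
    if not hits:
        return ("NoAction", {})  # miss: fall back to a default action
    _, action, data = max(hits, key=lambda e: e[0].prefixlen)
    return (action, data)

# Hypothetical action data; the deck does not give these values.
add_entry("192.168.1.1/32", "l3_switch", port=1)
add_entry("192.168.1.3/32", "l3_drop")
add_entry("192.168.1.254/32", "l3_l2_switch", port=2)
add_entry("192.168.1.0/24", "l3_l2_switch", port=3)
add_entry("10.1.2.0/22", "l3_switch_ecmp", ecmp_group=7)

result = lookup("192.168.1.4")
```

With these entries the key 192.168.1.4 matches no host entry and falls through to the 192.168.1.0/24 prefix, so the data plane runs l3_l2_switch with the action data the control plane installed for that entry.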
• Action parameters:
◦ Directional (from the Data Plane)
◦ Directionless (from the Control Plane)
• Actions that are called directly:
◦ Only use directional parameters
• Actions used in tables:
◦ Typically use directionless parameters
◦ May sometimes use directional parameters too

action l3_switch(bit<9>  port,
                 bit<48> new_mac_da,
                 bit<48> new_mac_sa,
                 bit<12> new_vlan)
{
    /* Forward the packet to the specified port */
    standard_metadata.egress_spec = port;

    /* L2 Modifications */
    hdr.ethernet.dstAddr = new_mac_da;
    hdr.ethernet.srcAddr = new_mac_sa;
    hdr.vlan_tag[0].vlanid = new_vlan;

    /* IP header modification (TTL decrement) */
    hdr.ipv4.ttl = hdr.ipv4.ttl - 1;
}

action l3_l2_switch(bit<9> port) {
    standard_metadata.egress_spec = port;
}

action l3_drop() {
    mark_to_drop();
}
• Action parameters:
◦ Directional (from the Data Plane)
◦ Directionless (from the Control Plane)
• Actions that are called directly:
◦ Only use directional parameters
• Actions used in tables:
◦ Typically use directionless parameters
◦ May sometimes use directional parameters too
■ If present, directional parameters always go first

const bit<1> NEXTHOP_TYPE_SIMPLE = 0;  /* assumed value; declaration not shown */
const bit<1> NEXTHOP_TYPE_ECMP = 1;

struct l3_metadata_t {
    bit<1>  nexthop_type;
    bit<16> nexthop_index;
}

action l3_switch_nexthop(out l3_metadata_t l3_meta,
                         bit<16> nexthop_index)
{
    l3_meta.nexthop_type = NEXTHOP_TYPE_SIMPLE;
    l3_meta.nexthop_index = nexthop_index;
}

action l3_switch_ecmp(out l3_metadata_t l3_meta,
                      bit<12> ecmp_group)
{
    l3_meta.nexthop_type = NEXTHOP_TYPE_ECMP;
    /* P4_16 has typecasts too! */
    l3_meta.nexthop_index = (bit<16>)ecmp_group;
}
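One way to picture the two kinds of parameters is sketched below in Python. It assumes a table entry stores only the action name and the directionless action data, while the directional argument (here a metadata dictionary) is supplied by the data plane when the action runs. The `run_entry` helper, the dictionary representation, and the value 0 for NEXTHOP_TYPE_SIMPLE are assumptions made for this example.

```python
NEXTHOP_TYPE_SIMPLE = 0  # assumed value
NEXTHOP_TYPE_ECMP = 1

def l3_switch_nexthop(l3_meta, nexthop_index):
    """Directional parameter first (l3_meta), then the action data."""
    l3_meta["nexthop_type"] = NEXTHOP_TYPE_SIMPLE
    l3_meta["nexthop_index"] = nexthop_index & 0xFFFF

def l3_switch_ecmp(l3_meta, ecmp_group):
    l3_meta["nexthop_type"] = NEXTHOP_TYPE_ECMP
    # Models (bit<16>)ecmp_group: a 12-bit value widened to 16 bits.
    l3_meta["nexthop_index"] = ecmp_group & 0x0FFF

ACTIONS = {
    "l3_switch_nexthop": l3_switch_nexthop,
    "l3_switch_ecmp": l3_switch_ecmp,
}

def run_entry(entry, l3_meta):
    """Data plane: bind the directional argument, then apply the entry's data."""
    action_name, action_data = entry
    ACTIONS[action_name](l3_meta, **action_data)

# Control plane installs only the action name and the directionless data.
entry = ("l3_switch_ecmp", {"ecmp_group": 42})
meta = {}
run_entry(entry, meta)
```

Keeping the directional parameters first matches the rule stated in the bullets above and lets the table bind them once, for every entry, in its actions list.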
table ipv4_host {
    key = {
        meta.ingress_metadata.vrf : exact;
        hdr.ipv4.dstAddr          : exact;
    }
    /* These are the only possible actions.
       Each particular entry can have only ONE of them. */
    actions = {
        l3_switch; l3_l2_switch;
        l3_drop; NoAction;
    }
    default_action = NoAction();
    size = 65536;
}

/* Defined in core.p4 */
action NoAction() { }
table ipv4_lpm {
    /* Ternary keys: entries carry an explicitly specified priority */
    key = {
        meta.ingress_metadata.vrf : ternary;
        hdr.ipv4.dstAddr          : ternary;
    }
    actions = {
        l3_switch;
        l3_l2_switch;
        l3_switch_nexthop(meta.l3.nexthop_info);
        l3_switch_ecmp(meta.l3.nexthop_info);
        l3_drop;
        NoAction;
    }
    const default_action = l3_l2_switch(CPU_PORT);
    size = 16384;
}
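Ternary matching with an explicitly specified priority can be illustrated with the following Python sketch. It assumes each entry is a (priority, value, mask, action) tuple and that the numerically larger priority wins; real targets differ on that convention, so treat it as an assumption. The entries and the `CPU_PORT` default string are illustrative only.

```python
import ipaddress

def ip(text):
    return int(ipaddress.IPv4Address(text))

def ternary_lookup(entries, key, default_action):
    """Return the action of the highest-priority entry whose masked value matches."""
    hits = [e for e in entries if (key & e[2]) == (e[1] & e[2])]
    if not hits:
        return default_action
    return max(hits, key=lambda e: e[0])[3]

# Emulating longest-prefix match: the prefix length is used as the priority.
entries = [
    (16, ip("192.168.0.0"), 0xFFFF0000, "l3_switch"),
    (24, ip("192.168.1.0"), 0xFFFFFF00, "l3_l2_switch"),
    (32, ip("192.168.1.3"), 0xFFFFFFFF, "l3_drop"),
]

action = ternary_lookup(entries, ip("192.168.1.4"), "l3_l2_switch(CPU_PORT)")
```

Several entries can match the same key (here both the /16 and the /24 patterns match 192.168.1.4), which is why a ternary table needs a priority to pick exactly one.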
/* Code */
apply {
assign_vrf.apply();
if (hdr.ipv4.isValid()) {
ipv4_host.apply();
}
}
}
match_kind {
regexp,
range,
fuzzy,
telepathy
}
/* Layer 4 */
packet.emit(hdr.icmp);
packet.emit(hdr.tcp);
packet.emit(hdr.udp);
}
}
table port_vlan {
    key = {
        standard_metadata.ingress_port : ternary;
        hdr.vlan_tag[0].isValid()      : ternary;
        hdr.vlan_tag[0].vid            : ternary;
    }
    actions = { set_bd; mark_for_drop; }
    default_action = mark_for_drop();
}

apply {
    port_vlan.apply();
    . . .
}
}

[Figure: table port_vlan (columns Key, Action, Action Data; keys ABCD_0123 ... BA8E_F007). Every entry runs set_bd with action data bd and bd_stat_index; the bd_stat_index of the matched entry selects one of the pkt/byte counts in counter ingress_bd_stats (indices 0 to 8191)]
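The relationship between table entries and an indexed counter can be sketched in Python as follows. This is an illustrative model only: the `Counter` class, the signature of `set_bd`, and the sample bd and index values are assumptions, with the size of 8192 taken from the index range shown in the figure.

```python
class Counter:
    """An indexed counter array keeping packet and byte counts per index."""

    def __init__(self, size):
        self.packets = [0] * size
        self.bytes = [0] * size

    def count(self, index, packet_length):
        self.packets[index] += 1
        self.bytes[index] += packet_length

ingress_bd_stats = Counter(8192)

def set_bd(meta, packet_length, bd, bd_stat_index):
    """Action: record the bridge domain and bump the counter chosen by the entry."""
    meta["bd"] = bd
    ingress_bd_stats.count(bd_stat_index, packet_length)

# Two hypothetical entries sharing counter index 10, one using index 20.
meta = {}
set_bd(meta, 64, bd=100, bd_stat_index=10)
set_bd(meta, 1500, bd=101, bd_stat_index=10)
set_bd(meta, 128, bd=102, bd_stat_index=20)
```

Because the index is action data, the control plane decides which entries share a counter cell, as the figure shows with several entries pointing at the same cell.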
Definition Usage
/* Definition in v1model.p4 */ typedef bit<2> meter_color_t;
table ip_acl {
    key = {
        ipv4_metadata.lkp_ipv4_sa : ternary;
        ipv4_metadata.lkp_ipv4_da : ternary;
        l3_metadata.lkp_ip_proto  : ternary;
        l3_metadata.lkp_l4_sport  : ternary;
        l3_metadata.lkp_l4_dport  : ternary;
    }
    actions = {
        acl_permit;
        acl_deny;
    }
    size = 4096;
    /* Tables can have architecture-specific properties */
    counters = acl_counters;
}

[Figure: table ip_acl with counters = acl_counters (columns Match Fields, Action Sel, Action Data, Counter); entry ABCD_xxxx_0123 runs acl_deny and owns counter A, the matched entry runs acl_permit and updates its own pkt/byte counts, entry BA8E_F007_xxxx runs nop and owns counter Z]
/* Verification: recompute the checksum and drop the packet on a mismatch */
apply {
    if (hdr.ipv4.isValid()) {
        ck = ipv4_checksum.get(
            {
                hdr.ipv4.version, hdr.ipv4.ihl,
                hdr.ipv4.diffserv, hdr.ipv4.totalLen,
                hdr.ipv4.identification, hdr.ipv4.flags,
                hdr.ipv4.fragOffset, hdr.ipv4.ttl,
                hdr.ipv4.protocol, hdr.ipv4.srcAddr,
                hdr.ipv4.dstAddr
            });
        if (hdr.ipv4.hdrChecksum != ck) {
            mark_to_drop();
        }
    }
}
}

/* Update: recompute the checksum and store it in the header */
apply {
    if (hdr.ipv4.isValid()) {
        hdr.ipv4.hdrChecksum = ipv4_checksum.get(
            {
                hdr.ipv4.version, hdr.ipv4.ihl,
                hdr.ipv4.diffserv, hdr.ipv4.totalLen,
                hdr.ipv4.identification, hdr.ipv4.flags,
                hdr.ipv4.fragOffset, hdr.ipv4.ttl,
                hdr.ipv4.protocol, hdr.ipv4.srcAddr,
                hdr.ipv4.dstAddr
            });
    }
}
}
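For reference, the arithmetic behind the two checksum blocks can be written out in Python. The sketch assumes that ipv4_checksum computes the standard 16-bit ones' complement Internet checksum over the listed fields; leaving hdrChecksum out of the field list is equivalent to summing the header with that field set to zero. The function names and the sample header are illustrative.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def update_ipv4_checksum(header: bytes) -> bytes:
    """Recompute the checksum (bytes 10..11) over the rest of the header."""
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    return header[:10] + struct.pack("!H", internet_checksum(zeroed)) + header[12:]

def verify_ipv4_checksum(header: bytes) -> bool:
    """True when the stored checksum matches the recomputed one."""
    return update_ipv4_checksum(header) == header

# 20-byte sample header: 192.168.0.1 -> 192.168.0.199, UDP, TTL 64.
good = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
bad = good[:8] + b"\x3f" + good[9:]  # TTL changed without a checksum update
```

A forwarding program that decrements the TTL therefore needs the update step before deparsing, otherwise the next hop's verification would drop the packet.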
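[Figure: workflow overview: p4c-bm2-ss compiles the P4 program into test.json, which is loaded into simple_switch (BMv2: parser, ingress, PRE, egress, deparser, with logging and the P4 debugger); the port interface attaches to veth0..n in the Linux kernel, with a packet generator and a packet sniffer on the other side]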
Step 1: P4 Program Compilation
[Figure: p4c-bm2-ss compiles the P4 program into test.json]
[Figure: veth pairs in the Linux kernel: veth 0, 2, 4, ..., 2n paired with veth 1, 3, 5, ..., 2n+1]
[Figure: BMv2 loaded with test.json (parser, ingress, PRE, egress, deparser, logging), attached to veth0..n in the Linux kernel]
Step 4: Starting the CLI
$ simple_switch_CLI
[Figure: the BMv2 CLI (program-independent CLI and client) connects over a TCP socket (Thrift) to the program-independent control server inside BMv2, which runs test.json compiled from test.p4 (parser, ingress, PRE, egress, deparser, logging); the port interface attaches to veth0..n in the Linux kernel]
Step 5: Sending and Receiving Packets
Packet Generator
• scapy
  p = Ether()/IP()/UDP()/"Payload"
  sendp(p, iface="veth0")
• Ethereal, etc.

Packet Sniffer
• scapy
  sniff(iface="veth9", prn=lambda x: x.show())
• Wireshark, tshark, tcpdump

[Figure: the packet generator and the packet sniffer attach to the veth pairs (0, 2, 4, ..., 2n and 1, 3, 5, ..., 2n+1) in the Linux kernel, which connect through the port interface to BMv2 (parser, ingress, PRE, egress, deparser, logging, program-independent control server) running test.json]
1.0.1 05/27/2017 VAG Minor Fixes. Added varbit parsing (IPv4 options)