
Your app lives on a network

Networking for web developers

Wim Godden
Cu.be Solutions
@wimgtr
Who am I ?
Wim Godden (@wimgtr)
Founder of Cu.be Solutions (https://cu.be)
Founder of Techpath Training Services (https://techpath.eu)
Open Source developer since 1997
Developer of PHPCompatibility, OpenX, ...
Speaker at PHP and Open Source conferences
Who are you ?
Developers ?

System engineers ?

Network engineers ?

Do you know how the Internet works ?


We’re dev/devops/sysops, not network engineers !
Know enough to build new stuff
Know enough to maintain existing stuff
What if...

Customer → Support desk → Dev/devops


Do you know these ?

DNS
Routing table
UDP
TCP
BGP
Source port
Destination port
IPv4
IPv6
IP
SYN
ACK
Default gateway
MAC address
Basics : OSI model

Layer 7   Application    HTTP, DNS, SMTP, ...
Layer 6   Presentation   Serialization, data translation
Layer 5   Session        TLS, L2TP, SOCKS, PPTP, ...
Layer 4   Transport      TCP, UDP, ports, ...
Layer 3   Network        IP addressing
Layer 2   Data Link      Data protocol (ethernet, ...)
Layer 1   Physical       Wires, network card, wireless interface
Basics : packets

Physical cable or wireless : 01011010111010
Basics : packets
Packets always consist of :
Header
Contents
Packets contain other packets :
Packet type #1 header
Packet type #1 contents
Packet type #2 header
Packet type #2 contents
Packet type #3 header
etc.
Basics : packets
Part 1 : Ethernet frame
Destination MAC (6 bytes) | Source MAC (6 bytes) | Type (2 bytes) | Payload (46 – 1500 bytes) | CRC (4 bytes)

Part 2 : IPv4 header (min. 160 bits = 20 bytes)
Bit   Fields
0     Version (bits 0-3) | Header length (4-7) | DSCP (8-13) | ECN (14-15) | Total length (16-31)
32    Identification (0-15) | Flags (16-18) | Fragment Offset (19-31)
64    Time To Live (0-7) | Protocol (8-15) | Header Checksum (16-31)
96    Source IP Address
128   Destination IP Address
160   Options (if required)
< Contents of the packet >

Part 3 : TCP/UDP/… header and data

Basics : TCP packet

Bit   Fields
0     Source port (bits 0-15) | Destination port (16-31)
32    Sequence number
64    Acknowledgment number
96    Data offset (0-3) | Flags (4-15, including reserved bits) | Window size (16-31)
128   Checksum (0-15) | Urgent pointer (16-31)
160   Options (if required)
< Contents of the packet >
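The header layout above can be decoded with PHP's pack() and unpack(). This is only a sketch: the function name and the array keys are made up for illustration, and it reads just the fixed 20-byte part of the header, ignoring options.

```php
<?php
/**
 * Decodes the fixed 20-byte part of a TCP header (options are ignored).
 * Hypothetical helper: the field names are illustrative, not an official API.
 */
function parse_tcp_header(string $bytes): array
{
    // n = 16-bit big-endian, N = 32-bit big-endian ("network byte order")
    $h = unpack(
        'nsrc_port/ndst_port/Nseq/Nack/noffset_flags/nwindow/nchecksum/nurgent',
        substr($bytes, 0, 20)
    );

    $h['data_offset'] = $h['offset_flags'] >> 12;         // header length in 32-bit words
    $h['syn']         = (bool) ($h['offset_flags'] & 0x02); // SYN flag bit
    $h['ack_flag']    = (bool) ($h['offset_flags'] & 0x10); // ACK flag bit
    unset($h['offset_flags']);

    return $h;
}

// Build a sample SYN header: port 5000 → 443, sequence number 1000, window 8192.
$sample = pack('nnNNnnnn', 5000, 443, 1000, 0, (5 << 12) | 0x02, 8192, 0, 0);
print_r(parse_tcp_header($sample));
```

In practice tcpdump or Wireshark does this decoding for you; the point is only that every field sits at a fixed bit offset.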


Sending on a local network Layer 1
Pure forwarding of packets using a hub

Problem :
Multiple devices sending at same time
→ network collision
→ packet retransmitted after a random backoff (CSMA/CD)
Sending on a local network Layer 2
Each network device (port) has a MAC address
Assigned by manufacturer
Can be overwritten (for VM or failover)
Same physical network → send packet to MAC address
Switch knows MAC address(es) of devices and forwards traffic
Sending IP traffic on local network Layer 3
Requires IP addresses
Where to send ? We need to know MAC address
Uses ARP (Address Resolution Protocol) for lookup
16:58:56.933019 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.0.15 tell 192.168.0.12, length 28
16:58:56.938019 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.0.15 is-at 00:50:56:8b:6a:b7, length 46

Stores IP ↔ MAC relation in ARP table


What’s “local” ?
→ Same IP subnet
OK, what’s a subnet ?
IP addressing (IPv4)
IPv4 addressing = dotted-decimal notation
xxx.xxx.xxx.xxx where 0 <= xxx <= 255
0.0.0.0 → 255.255.255.255

In reality :
8 bits     8 bits     8 bits     8 bits
11000000   00000100   00100000   00000001
192      . 4        . 32       . 1

Total amount of IP addresses available :
256 * 256 * 256 * 256 = 2^8 * 2^8 * 2^8 * 2^8 = 2^32 ≈ 4.3 billion
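The link between the dotted notation and the underlying 32 bits can be checked with PHP's built-in ip2long() and long2ip(). A minimal sketch, assuming a 64-bit PHP build:

```php
<?php
// 192.4.32.1 is just a readable way of writing one 32-bit number.
$long = ip2long('192.4.32.1');        // integer value of the address
$bits = sprintf('%032b', $long);      // the same value as 32 binary digits

// Split into 4 groups of 8 bits to match the dotted notation.
$octets = str_split($bits, 8);

echo implode(' ', $octets), PHP_EOL;  // binary form, one group per octet
echo long2ip($long), PHP_EOL;         // back to dotted notation
```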

IP networking requires :
IP address
Subnet mask
Subnet mask
Defines the range to which the IP belongs
IPs within the same range can talk to each other directly (local)

IP range : 194.50.97.0 – 194.50.97.255


Subnet mask : 255.255.255.0
or
Subnet mask : /24

→ 194.50.97.5 and 194.50.97.20 are on the same local network


Subnet mask
Typical notation uses a “mask” :
192.168.0.0 → 192.168.0.255 = 192.168.0.0/24
IPv4 provides 2^32 addresses
A /24 mask gives 2^(32-24) or 2^8 addresses = 256 addresses
Local network ranges :
10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16
Given a range 194.7.1.0/24
If you want 8 addresses for servers
/28
2^(32-28) = 2^4 = 2 * 2 * 2 * 2 = 16
Each subnet has 1 network address and 1 broadcast address
Each subnet needs a default gateway
16 – 3 = 13 usable addresses
Subnet = 194.7.1.0/28 or 194.7.1.0/255.255.255.240
This subnet doesn't have to be at the beginning :
194.7.1.16/28, 194.7.1.32/28, etc.
Subnets always start at a multiple of their number of addresses
Combinations make perfect sense too
194.7.1.0/25   = 2^(32-25) = 2^7 = 128   194.7.1.0   - 194.7.1.127
194.7.1.128/27 = 2^(32-27) = 2^5 = 32    194.7.1.128 - 194.7.1.159
194.7.1.160/28 = 2^(32-28) = 2^4 = 16    194.7.1.160 - 194.7.1.175
194.7.1.176/28 = 2^(32-28) = 2^4 = 16    194.7.1.176 - 194.7.1.191
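The first and last address of a subnet follow directly from the mask. The sketch below uses a hypothetical helper, cidr_range(), and assumes a valid "address/prefix" string and a 64-bit PHP build:

```php
<?php
/**
 * Returns [first address, last address, number of addresses] for a CIDR range.
 * Hypothetical helper: no input validation, 64-bit PHP assumed.
 */
function cidr_range(string $cidr): array
{
    [$address, $prefix] = explode('/', $cidr);
    $prefix = (int) $prefix;

    // Subnet mask as a 32-bit number, e.g. /28 → 255.255.255.240
    $mask = (0xFFFFFFFF << (32 - $prefix)) & 0xFFFFFFFF;

    $network   = ip2long($address) & $mask;          // all host bits set to 0
    $broadcast = $network | (~$mask & 0xFFFFFFFF);   // all host bits set to 1

    return [long2ip($network), long2ip($broadcast), 2 ** (32 - $prefix)];
}

print_r(cidr_range('194.7.1.160/28'));
```

Because the network address is the IP with all host bits cleared, a subnet can only start at a multiple of its own size.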
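A little gem : is an IP inside a range ?

function ip_in_network($ip, $net_addr, $net_mask)
{
    // ip2long() returns false for anything that is not a valid IPv4 address
    $ip_long  = ip2long($ip);
    $net_long = ip2long($net_addr);
    if ($ip_long === false || $net_long === false || $net_mask <= 0 || $net_mask > 32) {
        return false;
    }
    // Compare the first $net_mask bits of both 32-bit binary strings
    $ip_bin_string  = sprintf("%032b", $ip_long);
    $net_bin_string = sprintf("%032b", $net_long);
    return substr_compare($ip_bin_string, $net_bin_string, 0, $net_mask) === 0;
}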
IP addressing
“I think there is a world market for maybe five computers”
“640K is more memory than anyone will ever need”
“4.3 billion IP addresses is more than enough”
IP addressing (IPv6)
Created to solve lack of IP addresses (4.3 billion in IPv4)
Standard created in 90s (published in 1998)
Deployed on most major sites, but smaller sites lag behind
Addresses :
IPv4 address : 192.168.0.1
IPv6 address : 2001:0db8:0000:0000:0000:0000:0370:7334
Abbreviated : 2001:db8::370:7334
IPv4 and IPv6 can’t talk to each other !
Address space :
2^128 ≈ 340,282,366,920,938,463,463,374,607,431,770,000,000
Client deployment rates (source : Google) :
Global : 22.24% (13.12% in June 2017)
US : 35.32% (29.78% in June 2017)
Canada : 23.27% (16.58% in June 2017)
Belgium : 53.28% (48.42% in June 2017)
Should you use it ? YES ! (But don’t forget about firewalling !)
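The abbreviation rules for IPv6 addresses (drop leading zeros in each group, collapse one run of all-zero groups into "::") do not need to be applied by hand. A small sketch with PHP's inet_pton() and inet_ntop():

```php
<?php
// inet_pton() turns the text form into 16 raw bytes (128 bits);
// inet_ntop() turns those bytes back into the shortest text form.
$full   = '2001:0db8:0000:0000:0000:0000:0370:7334';
$packed = inet_pton($full);

echo strlen($packed) * 8, PHP_EOL;   // number of bits in the address
echo inet_ntop($packed), PHP_EOL;    // abbreviated notation
```

Normalising addresses like this before comparing or storing them avoids treating two spellings of the same address as different.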
Sending IP traffic on local network

Client (192.168.0.15/24) asks : "MAC for 192.168.0.2 ?"
Server (192.168.0.2/24) answers : "AA:BB:CC:DD:EE:FF"
Let’s talk !
How do IP packets find their way ? → Routing !
Each (Layer 3) network node has a routing table
Kernel IP routing table
Destination     Gateway           Genmask          Flags  Metric  Ref  Use  Iface
0.0.0.0         10.0.0.1          0.0.0.0          UG     204     0    0    eth1
10.0.0.0        0.0.0.0           255.255.0.0      U      204     0    0    eth1
10.0.64.0       192.168.201.101   255.255.192.0    UG     0       0    0    eth0
192.168.201.0   0.0.0.0           255.255.255.0    U      202     0    0    eth0

Can be viewed easily :


Linux : route or route -n
Windows : route print
Flags :
U = Up
G = Gateway
Non-G routes are routes defined by the network interface
Sending IP traffic to remote device Layer 3
Requires IP addresses
Where to send ?
Can not use ARP : MAC addresses are not shared beyond local network
Uses routing table
Matching route ? Send to the gateway specified
No matching route ? Send to default gateway
Provided by DHCP or
Set statically
Must be on same subnet → address found in ARP table
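How a node picks a route can be sketched in a few lines. The routes below are copied from the routing table shown earlier; find_route() is a hypothetical helper that returns the matching route with the longest mask, so the 0.0.0.0 entry (the default gateway) only wins when nothing more specific matches. A 64-bit PHP build is assumed.

```php
<?php
// [destination, genmask, gateway, interface]
$routes = [
    ['0.0.0.0',       '0.0.0.0',       '10.0.0.1',        'eth1'],
    ['10.0.0.0',      '255.255.0.0',   '0.0.0.0',         'eth1'],
    ['10.0.64.0',     '255.255.192.0', '192.168.201.101', 'eth0'],
    ['192.168.201.0', '255.255.255.0', '0.0.0.0',         'eth0'],
];

/** Hypothetical helper: picks the matching route with the longest mask. */
function find_route(string $ip, array $routes): ?array
{
    $best = null;
    $bestMask = -1;
    foreach ($routes as $route) {
        [$destination, $genmask] = $route;
        $mask = ip2long($genmask);
        // A route matches when the masked IP equals the route's destination.
        if ((ip2long($ip) & $mask) === ip2long($destination) && $mask > $bestMask) {
            $best = $route;
            $bestMask = $mask;
        }
    }
    return $best; // null would mean "no route to host"
}

print_r(find_route('10.0.65.3', $routes));
```

A gateway of 0.0.0.0 in the result means the destination is on the local network, so the packet goes straight to the MAC address found through ARP.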

Client (192.168.0.15) → Router (192.168.0.1) → Internet (98.12.31.42) → Server (194.7.1.4)

Client asks : "MAC for default gateway 192.168.0.1 ?" → "AA:BB:CC:DD:EE:FF"

Ethernet frame sent by the client :
Destination : AA:BB:CC:DD:EE:FF
Contents : TCP packet to 194.7.1.4

See ARP table : arp -a
See default gateway : route -n (Lin)
route print (Win)
Establishing a TCP connection

Client → Server : SYN      Sequence no = 1000
Server → Client : SYN ACK  Sequence no = 9000, Acknowledge no = 1001
Client → Server : ACK      Sequence no = 1001, Acknowledge no = 9001
Client → Server : Data

Establishing a TCP connection : latency
One-way delay : Brussels – Montreal 45ms, Brussels – London 10ms

Step                     Brussels – Montreal   Brussels – London
Client sends SYN         0 ms                  0 ms
Server sends SYN ACK     45 ms                 10 ms
Client sends ACK + data  90 ms                 20 ms
Data reaches server      135 ms                30 ms
TCP Window Size

Brussels (Client) ↔ Montreal (Server)

SYN
SYN ACK : rwnd = 8192
ACK : rwnd = 8192
DATA : rwnd = 16384

sysctl net.ipv4.tcp_window_scaling
TCP Slow Start
[Diagram : Brussels (client) ↔ Montreal (server), 45ms one way ; timeline marks at 0, 45, 90, 135, 180 and 225 ms]
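New vs existing connection
Brussels (Client) ↔ Montreal (Server) : 45ms one way

New connection :
0 ms     Client → SYN
45 ms    Server → SYN ACK
90 ms    Client → ACK + GET /url
135 ms   Server : processing request
235 ms   Server → DATA (x4)
280 ms   Client → ACK (x4)
325 ms   Server → DATA (x8)
370 ms   Client → ACK (x8)
415 ms   Server receives ACK (x8)

Existing connection :
0 ms     Client → DATA : GET /url
45 ms    Server : processing request
145 ms   Server → DATA (x12)
180 ms   Client → ACK (x12)
225 ms   Server receives ACK (x12)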
TCP Performance
Upgrade to latest Linux kernel or OS
Check window size
Reduce latency (move servers closer to client)
Reuse already established connections
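The gain from reusing a connection is simple arithmetic on the one-way delay. A sketch with a hypothetical helper, assuming a fixed delay and no packet loss, as in the timelines above:

```php
<?php
/**
 * Milliseconds until the first request byte reaches the server.
 * Hypothetical helper: assumes a fixed one-way delay and no packet loss.
 */
function request_arrival_ms(int $oneWayMs, bool $reuseConnection): int
{
    // New connection: SYN, SYN ACK, then ACK + request = 3 one-way trips.
    // Existing connection: the request goes out immediately = 1 trip.
    $trips = $reuseConnection ? 1 : 3;

    return $trips * $oneWayMs;
}

echo request_arrival_ms(45, false), PHP_EOL; // Brussels – Montreal, new connection
echo request_arrival_ms(45, true), PHP_EOL;  // Brussels – Montreal, existing connection
echo request_arrival_ms(10, false), PHP_EOL; // Brussels – London, new connection
```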
SSL/TLS
Client ↔ Server : 45ms one way

0 ms     Client → SYN
45 ms    Server → SYN ACK
90 ms    Client → ACK + ClientHello
135 ms   Server → ServerHello, Certificate, ServerHelloDone
180 ms   Client → ClientKeyExchange, ChangeCipherSpec, Finished
225 ms   Server → ChangeCipherSpec, Finished
270 ms   Client → DATA
315 ms   Server receives DATA

SSL/TLS with Session Resumption

0 ms     Client → SYN
45 ms    Server → SYN ACK
90 ms    Client → ACK + ClientHello
135 ms   Server → ServerHello, ChangeCipherSpec, Finished
180 ms   Client → ChangeCipherSpec, Finished
225 ms   DATA
TLS → HSTS
HSTS = HTTP Strict Transport Security
Remembers that a site is HTTPS-only
Prevents users from going to http:// first and only then being redirected to https://
Prevents leaking of session cookies over unsecured wifi
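HSTS is switched on with a single response header, Strict-Transport-Security. A minimal sketch in PHP; hsts_header() is a hypothetical helper and the max-age of one year is an example value, not a recommendation taken from these slides:

```php
<?php
/** Hypothetical helper: builds the HSTS response header line. */
function hsts_header(int $maxAgeSeconds, bool $includeSubDomains = true): string
{
    $value = 'max-age=' . $maxAgeSeconds;
    if ($includeSubDomains) {
        $value .= '; includeSubDomains';
    }
    return 'Strict-Transport-Security: ' . $value;
}

// Only send the header over HTTPS; browsers ignore it on plain HTTP.
if (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off') {
    header(hsts_header(31536000));
}
```

The header is usually set in the web server configuration instead, so that it also covers static files.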
UDP
User Datagram Protocol
Unreliable Datagram Protocol
Connectionless
→ No 3-way handshake required
Simple packet structure
Bit   Fields
0     Source port (bits 0-15) | Destination port (16-31)
32    Length (0-15) | Checksum (16-31)
< Contents of the packet >

Packets might not arrive


Packets might arrive out of order
Ideal for streaming, gaming, ...
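The connectionless nature of UDP is easy to see with PHP streams. The sketch below sends one datagram over the loopback interface: there is no handshake, and the receiver uses a timeout because nothing guarantees the datagram will arrive. Port 0 (let the OS pick a free port) and the 1500-byte read size are arbitrary choices for this example.

```php
<?php
// Bind a UDP "server" socket on loopback; port 0 lets the OS pick a free port.
$server = stream_socket_server('udp://127.0.0.1:0', $errno, $errstr, STREAM_SERVER_BIND);
if ($server === false) {
    die("Could not bind: $errstr ($errno)\n");
}
$address = stream_socket_get_name($server, false); // local "ip:port"

// The client sends a datagram straight away: no SYN / SYN ACK / ACK first.
$client = stream_socket_client("udp://$address", $errno, $errstr, 1);
fwrite($client, 'ping');

// Wait at most 1 second; a lost datagram would otherwise block forever.
$read = [$server];
$write = null;
$except = null;
$received = null;
if (stream_select($read, $write, $except, 1) > 0) {
    $received = stream_socket_recvfrom($server, 1500, 0, $peer);
}

echo $received === null ? 'nothing arrived' : "got '$received' from $peer", PHP_EOL;
```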
TCP/UDP ports
Both the TCP header and the UDP header start with the same two fields :
Bit 0-15 : Source port | Bit 16-31 : Destination port
Source and Destination ports
Destination port : defined by service
HTTP : TCP port 80
HTTPS : TCP port 443
DNS : UDP port 53
Source port : for identification of a connection
Client port 5000 → Server port 80
Client port 5001 → Server port 80
Client port 5002 → Server port 80

See active connections with source/destination ports :
netstat -n
Fetching a website
Need to fetch https://cu.be
TCP doesn’t know what cu.be is
→ needs an IP address
Looks up IP address through DNS
Open a socket
Connect to IP address on port 443
Send HTTPS request over the connection
Get data back
Get images, CSS, javascript over the same connection
Close the connection
Show the webpage
DNS lookups
Through a DNS server
Authoritative : in charge of the domain name
Recursive : asks the authoritative server, then caches for a while
→ Cache time is defined by TTL

Usually you will use a recursive server (owned by your provider)


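1. Client → recursive DNS server : "IP for cu.be ?"
2. Recursive DNS server → root DNS server : "IP for cu.be ?" → "Ask the .be DNS server"
3. Recursive DNS server → .be DNS server : "IP for cu.be ?" → "Ask the cu.be DNS server"
4. Recursive DNS server → cu.be DNS server : "IP for cu.be ?" → "194.50.97.38"
5. Recursive DNS server → client : "194.50.97.38"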
DNS lookups
Actual lookups depend on type of DNS record
DNS holds lots of things :
A record = pointer to IPv4 addresses
AAAA record = pointer to IPv6 addresses
CNAME records = aliases for other names
MX records = mail servers
NS records = DNS servers
TXT = various stuff (anti-spam mostly)

2 tools to debug DNS :


dig
nslookup
DNS fallback
Each domain has (should have) at least 2 DNS servers
Order is not important (round robin)
DNS = UDP based (port 53)
→ no acknowledgment
→ timeout after x seconds
→ tries other DNS server(s)
→ Can also work on TCP, but less often used
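The fallback behaviour can be sketched as a loop over the configured servers. Here resolve_with_fallback() and the $query callable are hypothetical: $query stands in for "send a UDP query and wait x seconds", and the server names are placeholders.

```php
<?php
/**
 * Tries each DNS server in turn until one answers.
 * $query is a hypothetical callable (server, name) that returns an IP string,
 * or null when the server did not answer within its timeout.
 */
function resolve_with_fallback(string $name, array $servers, callable $query): ?string
{
    foreach ($servers as $server) {
        $answer = $query($server, $name);
        if ($answer !== null) {
            return $answer;   // first server that replies wins
        }
        // No reply (UDP gives no acknowledgment): move on to the next server.
    }
    return null;              // every server timed out
}

// Simulated servers: the first one is down, the second one answers.
$fakeQuery = function (string $server, string $name): ?string {
    return $server === 'ns2.example' ? '194.50.97.38' : null;
};
echo resolve_with_fallback('cu.be', ['ns1.example', 'ns2.example'], $fakeQuery), PHP_EOL;
```

Every dead server in the list adds a full timeout to the lookup, which is why a broken DNS server shows up as a slow site rather than a failing one.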
Sockets
The layer between your application and TCP, UDP, ...
Abstracts syntax
Makes it easy to switch between protocols
Provides an easy interface
No need to know implementation
Send a stream of data → split up in packets
Receive lots of data → converted from packets to string
See open sockets ?
→ netstat (-n)
Packets over the Internet

Client (192.168.0.15) → Router (192.168.0.1) → Internet → Server (194.7.1.4)

BGP = Border Gateway Protocol


BGP decides how packets are routed between networks
Each public network has AS (Autonomous System) number
AS3356 = Level3
AS39628 = Cu.be
Each AS announces its subnets over BGP to its uplink providers :
“AS39628 here… you can reach 194.50.97.0/24 through me”
BGP routing
[Diagram : a client in AS 1 reaches a server in AS 52 over several possible paths through routers in AS 2, AS 5 and AS 10]
BGP routing
Looks up the IP range of destination → AS number
Looks at shortest number of AS hops in BGP routing table
If multiple routes found → calculate based on preference settings
Send packet to BGP gateway
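The "shortest number of AS hops" step can be sketched as follows. This is a deliberate simplification: real BGP routers apply preference settings and further tie-breakers, while this hypothetical helper keeps the first route found when path lengths are equal. The AS paths are invented for the example.

```php
<?php
/**
 * Picks the route with the fewest AS hops.
 * Simplification: a tie simply keeps the first route found.
 */
function shortest_as_path(array $asPaths): array
{
    $best = null;
    foreach ($asPaths as $path) {
        if ($best === null || count($path) < count($best)) {
            $best = $path;
        }
    }
    return $best ?? [];
}

// Hypothetical AS paths from the client's network (AS 1) to the server's network (AS 52).
$paths = [
    [1, 5, 10, 52],
    [1, 2, 52],
    [1, 5, 52],
];
print_r(shortest_as_path($paths));
```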
The problem with mobile devices
Mobile devices switch between towers
Good mobile network → no problem
Poor mobile network → IP changes, lost packets, …
Three-way handshake is time consuming for slow connections
→ Use HTTP/2
→ Keep connections active
Apache :
KeepAlive on
KeepAliveTimeout 15
Nginx :
keepalive_timeout 60
Latency + jitter
HTTP
It’s what we use every day ;-)
There’s a “new” version : HTTP/2
Based on SPDY, which was developed by Google
Designed for speed
Multiple simultaneous requests/responses in 1 connection
Binary format (pro : more efficient – con : harder to debug)
TLS/SSL encryption is standard
Built-in prioritization
Server Push
Header compression
Try it out
Deploy it !
HTTP/2 – get it running
Apache (v2.4+)
Needs mod_http2
Add “Protocols h2 http/1.1” either globally or to a VirtualHost
Choose a strong SSLCipherSuite !
Nginx (v1.9.5+)
Add “http2” to the listen line
Make sure “ssl_prefer_server_ciphers” is set to on
Make sure the “ssl_ciphers” are set correctly
See IP information
ip addr : shows IPs, MAC addresses, port status, etc.
ifconfig : similar output, but includes packet and byte count
route (-n) : shows routing table
netstat (-n) : shows active connections
netstat -l -p : shows listening ports and processes
tcpdump : command-line based Wireshark
Network trouble example
Customer X
150.000 visits/day

News ticker :
XML feed from other site (owned by same customer)
Cached for 15 min
Customer X – fetching the feed

if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
    unlink(APP_DIR . '/tmp/cacheFile.xml');
    file_put_contents(
        APP_DIR . '/tmp/cacheFile.xml',
        file_get_contents('http://www.scrambledsitename.be/xml/feed.xml')
    );
}
$xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');

What's wrong with this code ?


Customer X – no feed without the source

Feed source
Customer X : timeout
default_socket_timeout : 60 sec by default
Each visitor : 60 sec wait time
People keep hitting refresh → more load
More active connections → more load
Apache hits maximum connections → entire site down
Customer X : timeout fix
$context = stream_context_create(
    array(
        'http' => array(
            'timeout' => 5
        )
    )
);
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
    unlink(APP_DIR . '/tmp/cacheFile.xml');
    file_put_contents(
        APP_DIR . '/tmp/cacheFile.xml',
        file_get_contents(
            'http://www.scrambledsitename.be/xml/feed.xml',
            false,
            $context
        )
    );
}
$xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
Customer X : don't delete from cache
$context = stream_context_create(
    array(
        'http' => array(
            'timeout' => 5
        )
    )
);
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
    file_put_contents(
        APP_DIR . '/tmp/cacheFile.xml',
        file_get_contents(
            'http://www.scrambledsitename.be/xml/feed.xml',
            false,
            $context
        )
    );
}
$xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');


Customer X : don't delete from cache

$context = stream_context_create(
    array(
        'http' => array(
            'timeout' => 5
        )
    )
);
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
    $feed = file_get_contents(
        'http://www.scrambledsitename.be/xml/feed.xml',
        false,
        $context
    );
    // Only overwrite the cache when the fetch succeeded
    if ($feed !== false) {
        file_put_contents(
            APP_DIR . '/tmp/cacheFile.xml',
            $feed
        );
    }
}
Customer X : process early

$context = stream_context_create(
    array(
        'http' => array(
            'timeout' => 5
        )
    )
);
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
    $feed = file_get_contents(
        'http://www.scrambledsitename.be/xml/feed.xml',
        false,
        $context
    );
    // Parse before caching, so a broken feed never replaces a good cache file
    if ($feed !== false) {
        file_put_contents(
            APP_DIR . '/tmp/cacheFile.xml',
            ParseXmlFeed($feed)
        );
    }
}
Customer X : file_[get|put]_contents atomicity

if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
    $feed = file_get_contents(
        'http://www.scrambledsitename.be/xml/feed.xml',
        false,
        $context
    );
    if ($feed !== false) {
        file_put_contents(
            APP_DIR . '/tmp/cacheFile.xml',
            ParseXmlFeed($feed)
        );
    }
}
Relying on user → concurrent requests → possible data corruption
Better : run every 15min through cronjob
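Whether the refresh runs from a cronjob or from a request, the write itself can be made safe for concurrent readers by writing to a temporary file and renaming it over the target. A sketch with a hypothetical helper, atomic_write(), assuming a POSIX filesystem where rename() within one filesystem replaces the file in a single step:

```php
<?php
/**
 * Writes $contents to $path so that readers never see a half-written file.
 * Hypothetical helper: write to a temporary file in the same directory,
 * then rename() it over the target.
 */
function atomic_write(string $path, string $contents): bool
{
    $tmp = tempnam(dirname($path), 'tmp');
    if ($tmp === false) {
        return false;
    }
    if (file_put_contents($tmp, $contents) === false || !rename($tmp, $path)) {
        @unlink($tmp);   // clean up the temporary file on failure
        return false;
    }
    return true;
}
```

Note that tempnam() creates the file with restrictive permissions, so a chmod() may be needed if another user has to read the cache file.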
Network resources
Use timeouts for all :
fopen
curl
SOAP

Data source trusted ?
→ set up a webservice
→ let them push updates when their feed changes
→ less load on data source
→ no timeout issues
Logging → early detection
Dealing with timeouts
Possible options :
Show an error to the user, then bail out
Retry the request
(and bail out if it fails again)
Ignore the timeout if you can
Fall back to a cached version
Don’t show the data you were trying to collect

None of these are perfect, but all of them are better than waiting 60 seconds and then
showing an unhandled error !
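Two of these options, retrying and falling back to a cached version, combine naturally. In the sketch below fetch_with_fallback() is a hypothetical helper and $fetch is any callable that returns the data, or false on timeout or failure (for example a wrapper around file_get_contents() with a timeout context):

```php
<?php
/**
 * Retries a fetch, then falls back to the cached copy.
 * Hypothetical helper: returns the data, or null when nothing is available.
 */
function fetch_with_fallback(callable $fetch, string $cacheFile, int $attempts = 2)
{
    for ($i = 0; $i < $attempts; $i++) {
        $data = $fetch();
        if ($data !== false) {
            return $data;
        }
        error_log('Fetch failed, attempt ' . ($i + 1));   // logging → early detection
    }
    // All attempts failed: serve the stale cache if there is one.
    return is_readable($cacheFile) ? file_get_contents($cacheFile) : null;
}
```

A null result is the cue to leave out the data you were trying to collect, instead of showing an error for the whole page.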
Sending HTTP requests : rights and wrongs
Right :
Use a library
Examples : guzzle/guzzle, rmccue/requests, kriswallsmith/buzz (also available for React), nategood/httpful

Sort-of-ok :
Using curl

Wrong :
file_get_contents (or similar) on a URL
fsockopen to port 80, then sending ‘GET / HTTP/1.0’, …
Connecting to services
Always handle failures on connection
$link = mysql_connect() or die(mysql_error());
Connecting to services
Always handle failures on connection
Fallback to cache
Fallback to secondary service
At least show a nice error message
Did I mention logging and alerting ?
Another example :
$connection = new AMQPStreamConnection('localhost', 5672, 'guest', 'guest');
$channel = $connection->channel();

try {
    $connection = new AMQPStreamConnection('localhost', 5672, 'guest', 'guest');
    // Only open a channel when the connection succeeded
    $channel = $connection->channel();
} catch (AMQPTimeoutException $e) {
    // Do something nice for the user… they’re your user after all
}
Async for multiple or slow requests
Need multiple pieces of data → handle them asynchronously
PHP has amazing asynchronous libraries
Pthreads
ReactPHP
Icicle
Amp
...
Slow requests → asynchronous again or queue them
RabbitMQ
Zeromq
...
Tools to simulate bad networks - Wanem
Tools to simulate bad networks - Linux
IPTables
iptables -A INPUT -m statistic --mode random --probability 0.1 -j DROP
iptables -A OUTPUT -m statistic --mode random --probability 0.1 -j DROP

TC (Traffic Control)
tc qdisc add dev eth0 root netem delay 50ms 20ms distribution normal
tc qdisc change dev eth0 root netem reorder 0.02 duplicate 0.05 corrupt 0.01

Comcast (https://github.com/tylertreat/comcast)
“Simulating shitty network connections so you can build better systems”
Uses IPTables + TC in an intelligent way
If your data room looks like...
It can be done
Failover and disaster recovery are great...
… if they work !

Should be tested at least once per year


If it doesn’t work, top priority to fix it
Includes :
Network failover
Network configuration recovery from backup
System failover
System restore from backup
Questions ?
Contact
Twitter @wimgtr
Slides http://www.slideshare.net/wimg
E-mail wim@cu.be

Thanks !
