Winsock Networking Tutorial (C++)

Networking introduction
First I will give you an introduction to basic networking principles and terms. Anyone with internet
access will have some knowledge about networks, servers, clients, but to ensure you know enough
to program with it I've included this chapter. You won't need all the details mentioned here when
programming winsock, but it's good to know something about the underlying techniques.
1. Networks and protocols
You probably already know what a network is: a collection of computers connected to each
other so they can exchange data. There are several types of networks, such as LANs (Local Area
Network), WANs (Wide Area Network) and of course the internet. To ensure that all traffic is going
smoothly, networks rely on protocols:
Protocol
A protocol is a set of rules describing the format in which data is transmitted
over a network.
As stated in the information box above, a protocol describes how to communicate over a network.
It can be compared with a human language: at the lowest level nearly everyone can make and
hear sounds (compare: electronic signals) but people won't understand each other unless they
speak according to a specific language they both understand (compare: protocol).
2. Ethernet
Networks rely on several protocol layers, each one having its own task in the communication
process. A very commonly used configuration is the ethernet LAN with TCP/IP. In ethernet LANs,
computers can be connected using coaxial, twisted pair (UTP) or optic fiber cables. Nowadays, for
most networks, UTP cables are used. WANs and the internet (partly a combination of many WANs)
use many of the techniques used in ethernet LANs, so I will discuss ethernet LAN technology first.
MAC
The lowest layer of ethernet is the hardware level, called the Media Access Control layer, or MAC for
short. This layer can be a network card, for example, which contains the serial network interface
and controller that take care of converting the raw data into electronic signals and sending it to
the right place.
Packets that are sent over a network of course need to reach their destination. So there has to
be some kind of addressing. Various levels of the ethernet interface have different addressing
methods, as you will see later. At the lowest MAC level, addressing is done with MAC numbers.
MAC number
48-bit identifier that is hardcoded into each network interface unit. The
allocation of these numbers is done by the IEEE Registration Authority so each
ethernet chip has a world wide unique number (that is, if the manufacturer
didn't mess up :). MAC numbers are often noted as colon-separated hex
numbers: 14:74:A0:17:95:D7.
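To make the notation concrete, here is a minimal sketch that converts between the colon-separated text form and the six raw bytes of a MAC number. The names MacAddress, parseMac and formatMac are made up for this illustration and are not part of winsock.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// A MAC number is 48 bits, i.e. six bytes.
typedef std::array<std::uint8_t, 6> MacAddress;

// Value of one hex digit, or -1 if the character is not a hex digit.
int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Parse text such as "14:74:A0:17:95:D7". Returns false on malformed input.
bool parseMac(const std::string& text, MacAddress& out)
{
    // Six groups of two hex digits plus five colons make 17 characters.
    if (text.size() != 17)
        return false;

    MacAddress mac = {};
    for (std::size_t i = 0; i < 6; ++i)
    {
        const std::size_t pos = i * 3;
        const int hi = hexValue(text[pos]);
        const int lo = hexValue(text[pos + 1]);
        if (hi < 0 || lo < 0)
            return false;
        if (i < 5 && text[pos + 2] != ':')
            return false;
        mac[i] = static_cast<std::uint8_t>(hi * 16 + lo);
    }
    out = mac;
    return true;
}

// Format six bytes back into the colon-separated notation (upper-case hex).
std::string formatMac(const MacAddress& mac)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string text;
    for (std::size_t i = 0; i < mac.size(); ++i)
    {
        if (i != 0)
            text += ':';
        text += digits[mac[i] >> 4];
        text += digits[mac[i] & 0x0F];
    }
    return text;
}
```

Calling parseMac("14:74:A0:17:95:D7", mac) fills mac with the bytes 0x14, 0x74, 0xA0, 0x17, 0x95, 0xD7, and formatMac(mac) gives the original text back.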
To send a packet to another network interface, the packet needs to include its MAC number. LANs
use a very simple method to send the packets to the right interface: broadcasting. This means
that your network card just shouts the package to every other interface it can reach. Each
2010 by Thomas Bleeker (MadWizard)
receiving interface looks at the destination MAC number of the packet, and only buffers it if it
matches its own MAC number. While this method is easy to implement and quite effective on LANs,
bigger networks (WANs, internet) don't use this method for obvious reasons; you wouldn't want
everyone on the internet to send packets to everyone else on the internet. WANs use better
routing mechanisms, which I won't discuss here. Just remember that at the lowest level, addressing
is done with MAC numbers. Ethernet packets also include a CRC for error detection.
IP
Just above the hardware level is the IP level. IP simply stands for Internet Protocol. Just like the
MAC layer, IP too has its own way of addressing:
IP number
The numbers used to address at the IP level of the network interface. IPv4,
the version most widely used, uses 32-bit values, noted in the well-known
dotted format: 209.217.52.4. Unlike MAC numbers, IP numbers are not
hardcoded into the hardware, they are assigned to it at software level.
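The relation between the dotted format and the 32-bit value can be sketched in a few lines. The helpers parseIPv4 and formatIPv4 are hypothetical names used only for this illustration; the value is kept as a plain number here, with the first part of the address in the most significant byte.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Parse dotted notation such as "209.217.52.4" into a 32-bit value whose most
// significant byte is the first part. Returns false on malformed input.
bool parseIPv4(const std::string& text, std::uint32_t& out)
{
    std::uint32_t value = 0;
    std::size_t i = 0;
    for (int parts = 1; parts <= 4; ++parts)
    {
        // Read one decimal part: 1 to 3 digits, at most 255.
        unsigned int part = 0;
        int digits = 0;
        while (i < text.size() && text[i] >= '0' && text[i] <= '9' && digits < 3)
        {
            part = part * 10 + static_cast<unsigned int>(text[i] - '0');
            ++i;
            ++digits;
        }
        if (digits == 0 || part > 255)
            return false;
        value = (value << 8) | part;

        // The first three parts must be followed by a dot.
        if (parts < 4)
        {
            if (i >= text.size() || text[i] != '.')
                return false;
            ++i;
        }
    }
    if (i != text.size())
        return false; // trailing characters after the fourth part
    out = value;
    return true;
}

// Turn the 32-bit value back into dotted notation.
std::string formatIPv4(std::uint32_t value)
{
    return std::to_string((value >> 24) & 0xFF) + "." +
           std::to_string((value >> 16) & 0xFF) + "." +
           std::to_string((value >> 8) & 0xFF) + "." +
           std::to_string(value & 0xFF);
}
```

For 209.217.52.4 the parts are 0xD1, 0xD9, 0x34 and 0x04, so the 32-bit value is 0xD1D93404.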
IP numbers shouldn't be something strange to you. The internet uses them to uniquely identify a
specific computer. IP addresses can be assigned to a network interface using software. Doing this
associates the IP number with the MAC address of the network interface. To address using IP
numbers, the associated MAC number needs to be resolved. This is done with the ARP (Address
Resolution Protocol). Each host maintains a list with pairs of IP and MAC numbers. If an IP is used
without a matching MAC number, the host sends out a query packet to the rest of the LAN. If any
of the other computers in the LAN recognizes its own IP number, it sends back the corresponding MAC
number. If no matching MAC number can be found, the packet is sent to the gateway, a computer
that forwards packets to external networks. The IP to MAC conversion is actually done at the
data link layer (MAC layer).
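The lookup described above can be modelled roughly as follows. This is a simplified simulation, not real ARP code: the ArpCache type and its members are invented for the example, addresses are kept as text, and the broadcast query is replaced by a lookup in a second table that stands in for the other hosts on the LAN.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified model of the IP to MAC resolution.
struct ArpCache
{
    std::map<std::string, std::string> ipToMac;  // the host's list of known pairs
    std::map<std::string, std::string> lanHosts; // stands in for hosts answering a query
    std::string gatewayMac;                      // MAC number of the gateway

    // Returns the MAC number a packet for 'ip' should be sent to.
    std::string resolve(const std::string& ip)
    {
        // 1. Already known? Use the stored pair.
        std::map<std::string, std::string>::const_iterator cached = ipToMac.find(ip);
        if (cached != ipToMac.end())
            return cached->second;

        // 2. Not known: send a query to the LAN. The answer is simulated
        //    here by looking the IP up in lanHosts.
        std::map<std::string, std::string>::const_iterator reply = lanHosts.find(ip);
        if (reply != lanHosts.end())
        {
            ipToMac[ip] = reply->second; // remember the answer for next time
            return reply->second;
        }

        // 3. Nobody on the LAN answered: hand the packet to the gateway.
        return gatewayMac;
    }
};
```

With 192.168.0.4 present in lanHosts, resolve("192.168.0.4") returns that host's MAC number and stores the pair; an address outside the LAN, such as 209.217.52.4, resolves to gatewayMac instead.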
The IP protocol adds the source and destination address (IP numbers) to the packet, as well as
some other package properties such as the TTL hops (time to live hops), the protocol version
used, header checksum, sequence count and some more fields. They are not important to us so I
won't explain them in detail.
TCP
The next layer is the TCP layer (or alternatively, the UDP layer). This layer is very close to the
network application and deals with many things. As final addition to the addressing, TCP adds a
port number to the package:
Port number
While IP numbers are used to address a specific computer or network device,
port numbers are used to identify which process running on that device should
receive the package. Port numbers are 16-bit, and thus limited to 65536
numbers. A process can register to receive packets sent to a specific port
number ('listening'). A notation often used when addressing a port number on
a device is 'IP:portnumber', eg. 209.217.52.4:80. Both sides of a connection
use a port number, but not necessarily the same.
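As a small illustration of the 'IP:portnumber' notation, the sketch below splits such a string into its two parts and checks that the port fits in 16 bits. Endpoint and parseEndpoint are hypothetical names chosen for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper type: one side of a connection.
struct Endpoint
{
    std::string ip;     // dotted notation, kept as text here
    std::uint16_t port; // 16 bits, so 0..65535
};

// Parse text such as "209.217.52.4:80". Returns false on malformed input.
bool parseEndpoint(const std::string& text, Endpoint& out)
{
    // There must be a colon with something on both sides of it.
    const std::size_t colon = text.rfind(':');
    if (colon == std::string::npos || colon == 0 || colon + 1 == text.size())
        return false;

    unsigned long port = 0;
    for (std::size_t i = colon + 1; i < text.size(); ++i)
    {
        if (text[i] < '0' || text[i] > '9')
            return false;
        port = port * 10 + static_cast<unsigned long>(text[i] - '0');
        if (port > 65535)
            return false; // does not fit in a 16-bit port number
    }

    out.ip = text.substr(0, colon);
    out.port = static_cast<std::uint16_t>(port);
    return true;
}
```

The sketch does not validate the IP part; it only separates it from the port number.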
Many port numbers are WKP (Well Known Ports), that is they are commonly associated with a
specific service. For example, the WWW uses port 80 by default, FTP uses port 21, e-mail uses 25
(SMTP) and 110 (POP). Although these are the ports usually used for those services, nobody
prevents you from using different ports. However, it's good practice to use port numbers of 1024
and higher for other, custom services.
While the IP layer doesn't care about the success of transmissions, TCP does. The TCP layer
ensures data does arrive, and correctly. It also lets the receiver control the data flow, ie. the
receiver can decide when to receive data. If a package is lost during the way to its destination,
TCP resends the package. TCP also reorders the packages if they arrive in an order different from
the original order. This makes the programmer's life easy, since the program can safely assume the data that is
sent is received and in the right order. UDP, an alternative for TCP, does not have these features
and cannot guarantee the arrival of packages. TCP is connection-oriented, and the best choice for
continuous data streams. UDP on the other hand is connectionless, and packet oriented. I won't
deal with UDP in this tutorial.
Software
Finally, above the TCP layer is the network software. In windows, your application does not
directly access the TCP layer but uses the WinSock API. The software layer provides a very
convenient way of dealing with networking. Thanks to all the underlying layers, you don't need to
worry about packets, packet size, data corruption, resending of lost packets etc.
3. The ethernet interface stack
The image above shows the encapsulation of each protocol in the ethernet interface stack. It
all starts with the software layer, which has a piece of data that it wants to send over the
network. Even this data usually has a format (eg. HTTP, FTP protocols), although not shown in the
image. The user data first gets a TCP header including the source and destination port number.
Then the IP header is added, containing the source and destination IP address. Finally the data link
layer adds the ethernet header, which specifies the MAC numbers of the source and destination.
This is the data that is actually sent over the wires. As you can see there's a lot of overhead in a
TCP/IP packet. The overhead can be minimized by choosing a large enough data size for the
package. Luckily winsock will arrange this for you.
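To get a feeling for the overhead, the following sketch adds up the header sizes for one packet. The sizes are assumptions that are not stated in this tutorial: 14 bytes for the ethernet header (without preamble and trailing CRC) and 20 bytes each for an IP and a TCP header without options. The function names are made up for the example.

```cpp
#include <cstddef>

// Assumed minimum header sizes in bytes (see the note above the code).
const std::size_t kEthernetHeader = 14; // destination MAC + source MAC + type
const std::size_t kIpHeader       = 20; // IPv4 header without options
const std::size_t kTcpHeader      = 20; // TCP header without options

// Size of all headers plus 'payload' bytes of user data.
std::size_t frameSize(std::size_t payload)
{
    return kEthernetHeader + kIpHeader + kTcpHeader + payload;
}

// Fraction of the packet that is header overhead, between 0 and 1.
double overheadFraction(std::size_t payload)
{
    const std::size_t total = frameSize(payload);
    return static_cast<double>(total - payload) / static_cast<double>(total);
}
```

Under these assumptions a single byte of user data travels in a 55-byte packet (about 98% overhead), while 1000 bytes of user data need 1054 bytes (about 5% overhead).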
Networking continued
Now that you know the basic layers of the network interface, I will continue with some other
principles concerning hostnames, connections and software level protocols.
1. DNS
DNS stands for Domain Name System, which accounts for the conversion of hostnames to and
from IP numbers. Because IP numbers are not easy to remember (well not many at
least), another more convenient naming system was created. Now, instead of an IP number, you
could use a hostname alternatively. Examples of hostnames are: madwizard.org,
somepc.someuniversity.edu, www.google.com, etc. Anyone browsing the internet has used them.
When connecting to a website, its IP is needed. So if you enter a hostname like
www.google.com, it first needs to look up the corresponding IP number of google. This is where
DNS comes in. Your PC sends out a hostname lookup request to the DNS server your provider has set up
in its network. If the DNS server can resolve the hostname, it sends back the corresponding IP to you.
DNS servers are organized in a hierarchical way, forwarding unresolvable hostnames to a DNS server at a
higher level, until the hostname is resolved.
2. Connections
TCP/IP is a connection-oriented protocol. The connection is always between two devices, and
each side uses its own IP and port number. Usually, one side is called the client, the other side
the server.
The client is the one that requests something, the server responds accordingly. For example,
when opening a website, the browser is the client, the webserver is the server. The browser
initiates the connection with the server and requests a specific resource. The server then sends
back a response and the data requested.
The server is continually waiting for incoming connections. This is called listening, which is
always done on a certain IP and port number. The client is only active when necessary, as the
client is always the initiator of a connection and the one that requests information. To create a
connection, the client needs to know both the IP and port number the server is listening on. A
connection is made to that server and hopefully accepted by the server. While communication
over a TCP/IP connection is two-way, many protocols (HTTP, FTP, etc) let the client and server
interact in turn.
Both the server and client side use an IP and port number, but the IP and port number of the
server are usually fixed. The standard port for the WWW is 80 (using HTTP). Google for example,
is a webserver that runs on port 80 and IP 216.239.39.101 (at the moment of writing). Each
client (read: anyone google-ing :) connects to this IP and port. So the webserver can have many
connections on the same port. This is no problem, since all traffic on that port is for the same
process. On the client side, the port number doesn't matter. Any port can be used. Some people
think that the port number used in a connection needs to be the same on both sides. This is not
true. Just open a website and quickly run 'netstat -an' in a command line. You might see a line
like this:
TCP xxx.xxx.xxx.xxx:2894 216.239.39.101:80 ESTABLISHED
xxx.xxx.xxx.xxx was my IP, 216.239.39.101 is google's IP. The number after the colon is the port
number. As you can see, the server side uses port 80, while the client uses a random (read: some
free) port number like 2894. Each connection a client makes to the same server address needs a
different port number on the client side, so that the connections can be told apart.
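One way to picture why many connections can share server port 80 is to treat the combination of both endpoints as the key that identifies a connection. The sketch below is only a model of that idea; ConnectionKey, ConnectionTable and makeKey are hypothetical names, and the client addresses in the usage note are made up.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// A connection is identified by both endpoints together:
// (client IP, client port, server IP, server port).
typedef std::tuple<std::string, std::uint16_t, std::string, std::uint16_t> ConnectionKey;

// Hypothetical table mapping each connection to a label describing it.
typedef std::map<ConnectionKey, std::string> ConnectionTable;

ConnectionKey makeKey(const std::string& clientIp, std::uint16_t clientPort,
                      const std::string& serverIp, std::uint16_t serverPort)
{
    return std::make_tuple(clientIp, clientPort, serverIp, serverPort);
}
```

Storing entries for (10.0.0.5, 2894), (10.0.0.5, 2895) and (10.0.0.9, 2894), all talking to 216.239.39.101:80, yields three distinct keys: the server side is identical each time, and the client IP and port tell the connections apart.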
Client
The program that initiates the connection, and requests information.
Server
The program that listens for incoming connections, accepts them and
responds according to the received requests. The IP and port number of the
server need to be known by the client to connect to it.
3. Protocols again
In the previous chapters I have shown several protocols at the different levels of a network
interface. The protocols I didn't discuss yet are the protocols that work at software level.
Examples of these are HTTP, FTP, POP3, SMTP. Most of them work in a client-server way, ie. the
client makes requests, the server responds. The exact format of the requests and responses is
described in these protocols. I won't discuss them further right now, but I will later when you
know the winsock basics to actually implement them.
Sockets and winsock
Winsock ('Windows Sockets') is the Windows API that deals with networking. Many functions are
implemented in the same way as the Berkeley socket functions used in BSD Unix.
1. Sockets
So what's a socket?
Socket
As explained in the previous chapter, you will work with two-way connections.
The endpoints of this connection are the sockets. Both the client and the
server have a socket. A socket is associated with a certain IP and port
number.
Almost all winsock functions operate on a socket, as it's your handle to the connection. Both
sides of the connection use a socket, and they are not platform-specific (ie. a Windows and Unix
machine can talk to each other using sockets). Sockets are also two-way, data can be both sent
and received on a socket.
There are two common types for a socket, one is a streaming socket (SOCK_STREAM), the other
is a datagram socket (SOCK_DGRAM). The streaming variant is designed for applications that
need a reliable connection, often using continuous streams of data. The protocol used for this
type of socket is TCP. I will only use this type in my tutorial as it's most commonly used for the
well known protocols like HTTP, FTP, SMTP, POP3 etc.
Datagram sockets use UDP as underlying protocol, are connectionless, and have a maximum
buffer size. They are intended for applications that send data in small packages and that do not
require perfect reliability. Unlike streaming sockets, datagram sockets do not guarantee data will
reach its destination nor that it comes in the right order. Datagram sockets can be slightly faster
and useful for applications like streaming audio or video, where reliability is not as high on the
priority list as speed and latency. Where the reliability is required, streaming sockets are used.
2. Binding sockets
Binding a socket means associating a specific address (IP & port number) with a given socket.
This can be done manually using the bind function, but in some cases winsock will automatically
bind the socket. This will become clear in the next paragraphs.
3. Connecting
The way you use a socket depends on whether you are on the client side or the server side. The
client side initiates a connection by creating a socket, and calling the connect function with the
specified address information. Before the socket is connected, it is not bound yet to an IP or port
number. Because the client side can use any IP and port number for the connection with the
server (provided that the network the IP number is part of can reach the network of the destination
IP), often many usable combinations are possible.
When connect is called, winsock will choose the IP and port number to use for the connection
and bind the socket to it before actually connecting it. The port number can be anything that is
free at the moment, the IP number needs a bit more care. PCs may have more than one IP. For
example, a PC connected to both the internet and a local network has at least three IPs (the
external IP for use with the internet, the local network IP (192.168.x.x, 10.0.x.x etc.) and the
loop back address (127.0.0.1)). Here, it does matter to which IP the socket is bound as it also
determines the network you are using for the connection. If you want to connect to the local PC
192.168.0.4, you cannot do that using the network of your internet provider, as that IP is never
used in the internet and will not be found. So you would have to bind the socket to your IP in the
same network (192.168.0.1 for example). Similarly, when you bind the socket to the local loop
back address (127.0.0.1), you can only connect to that same address, as no other address exists
in that 'network'.
Fortunately, winsock will choose a local IP it can use for the IP you want to connect to
automatically. Nothing stops you from binding the socket yourself, but remember that you need
to take the situations above into consideration.
Note that the bind function gives the user the option to set the IP or port number to zero. In this
case, zero means 'let winsock choose something for me'. This is useful when you do want to
connect using a specific IP on the client side, but do not care about the port number used.
4. Listening
Things are different on the server side. A server has to wait for incoming connections and clients
will need to know both the IP and port number of the server to be able to connect to it. To make
things easy, servers almost always use a fixed port number (often the default port number for the
used protocol).
Waiting for incoming connections on a specified address is called listening:
Listening
A socket is listening when it is in a state where it will 'listen' for incoming
connections. Usually, this is done on a socket bound to a specific address
known to the client.
As you can see from the definition above, a socket is often bound to an address before it is put
in the listening state. When the port number of this address is set to a fixed number, the server
will listen for incoming connections on that port number specifically. For example, port 80 (the
default for HTTP) is listened on by most web servers. The socket can be bound to a specific IP as
well but when zero is chosen it will listen on any addresses available, effectively allowing
connections from all networks. It may be set to a fixed IP, for example the IP of the local network
interface, so computers from the local network can connect to the server but not the ones
connected via the internet.
When a client requests a connection to a listening server, the server will accept it (or not) and
spawn another socket which will be the endpoint of the connection. This way the listening socket
is not used for any data transfer on the connection and can continue listening for more
incoming connections.
5. Connections: an example
Here's a graphical example of a webserver that can handle multiple connections.
1. The server socket is created
The server creates a new socket. When it's just created it is not yet bound to an IP or port
number.
2. The server socket is bound
Because the server is a webserver, it will be bound to port number 80, the default for HTTP.
However, the IP number is set to zero, indicating the server is willing to receive incoming
connections from all IPs available for the machine it runs on. In this example, we assume the
server has three IPs, one external (216.239.39.101), one internal (192.168.0.8) and of course the
loop back address (127.0.0.1).
3. The server is listening
After the socket is bound, it is put into the listening state, waiting for incoming connections on
port 80.
4. A client creates a socket
Assume a client in the same local network as the server (192.168.x.x) wants to request a
webpage from the server. To do the data transfer it needs a socket so it creates one.
5. The client socket tries to connect
The client socket is left unbound and tries to connect to the webserver.
6. The server accepts the request
The listening socket sees some client wants to make a connection. It accepts it by creating a
new socket (on the bottom right) bound to the one of the server's own IPs which can be reached by
the client (ie. they are in the same network, being 192.168.x.x) and the server port (80). From
this point, the client socket and the server connection socket just created will do the data
transfers, while the listening socket will keep listening for other connections. Note that the client
socket is now bound to an IP and port since it's connected. The dotted gray line shows the
separation of the client and server side.
7. Another client connects
If another client (from the external network) connects, the server will again create a new socket
to deal with the second connection. Note that the IP the socket on the server side is bound to is
different than the one from the first connection. This is possible because the listening server
socket was not bound to any IP. If it had been bound to 192.168.0.8, the second connection
would not be possible.
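Steps 1 to 6 can be tried out with a short, hedged sketch. It deviates from the example above in two ways, both assumptions made to keep it harmless to run: the listening socket is bound to the loop back address instead of IP zero, and to port zero (any free port) instead of port 80. Outside Windows the sketch maps a few winsock names onto the Berkeley socket headers; the function name runLoopbackExample is made up, and the calls used here (bind, listen, accept, connect, getsockname) are shown without the error handling a real program needs.

```cpp
#ifdef _WIN32
  #include <winsock2.h>
  #include <ws2tcpip.h>   // socklen_t
#else
  // Assumption: on other systems the Berkeley socket API is used and a few
  // winsock names are mapped onto it.
  #include <arpa/inet.h>
  #include <netinet/in.h>
  #include <sys/socket.h>
  #include <unistd.h>
  typedef int SOCKET;
  const SOCKET INVALID_SOCKET = -1;
  inline int closesocket(SOCKET s) { return close(s); }
#endif
#include <cstring>

// Walks through steps 1 to 6 on the loop back address. Returns the port
// number the client socket ended up bound to, or 0 if anything failed.
unsigned short runLoopbackExample()
{
#ifdef _WIN32
    WSADATA wsaData;
    if (WSAStartup(MAKEWORD(2, 0), &wsaData) != 0)
        return 0;
#endif
    unsigned short clientPort = 0;

    SOCKET hListen = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP); // step 1
    SOCKET hClient = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP); // step 4
    SOCKET hConn   = INVALID_SOCKET;

    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); // 127.0.0.1 only
    addr.sin_port        = htons(0);               // zero: any free port
    socklen_t addrLen = sizeof(addr);

    if (hListen != INVALID_SOCKET && hClient != INVALID_SOCKET &&
        bind(hListen, (sockaddr*)&addr, addrLen) == 0 &&          // step 2
        listen(hListen, 1) == 0 &&                                // step 3
        getsockname(hListen, (sockaddr*)&addr, &addrLen) == 0 &&  // which port was chosen?
        connect(hClient, (sockaddr*)&addr, addrLen) == 0)         // step 5, client left unbound
    {
        sockaddr_in peer;
        socklen_t peerLen = sizeof(peer);
        hConn = accept(hListen, (sockaddr*)&peer, &peerLen);      // step 6
        if (hConn != INVALID_SOCKET)
            clientPort = ntohs(peer.sin_port); // port the client socket was bound to
    }

    if (hConn != INVALID_SOCKET)   closesocket(hConn);
    if (hClient != INVALID_SOCKET) closesocket(hClient);
    if (hListen != INVALID_SOCKET) closesocket(hListen);
#ifdef _WIN32
    WSACleanup();
#endif
    return clientPort;
}
```

A non-zero return value is the port number that was picked automatically for the client socket when it connected, which matches the remark in step 6 that the client socket is bound once it is connected. On Windows the program has to be linked against ws2_32.lib.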
6. Blocking
The original functions in the Berkeley unix implementation of sockets were blocking functions. This
means that they will just wait when the operation requested cannot be completed immediately.
For example, when connecting to a server using the connect function, it did not return until the
connection had been made (or failed), thus making the program hang for a while. This is not really
a problem when dealing with a single connection using a console mode application but in the
Windows environment, this behavior is rarely acceptable. Any program with a window has a
window procedure that has to be kept running. Stalling it would delay user input, window
painting, notifications, and any other messages resulting in an application that seems to be
hanging while it's using socket functions.
To deal with this problem, winsock can set sockets into blocking or non-blocking mode. The
former (blocking mode) is the original way of using sockets, ie. not returning from the API before
the operation has finished (it will literally block the application). The latter (non-blocking mode) is
the mode you usually use when dealing with a real windows application (ie. not a console
application). When calling a function on a socket that is in non-blocking mode, the function will
always return as soon as possible, even when the operation to be performed could not be
completed immediately. Instead, a notification of some sort will be sent to the program when the
operation is finished, allowing the program to execute in the normal manner while the operation is
unfinished.
Winsock provides several methods of notification for non-blocking sockets, including window
messages and event objects. These methods will be discussed in detail later, for now just
remember the difference between blocking and non-blocking.
7. Winsock versions
The most commonly used winsock version is version 2.x, usually just called winsock 2 as there are
only minor differences. The latest version before version 2 was version 1.1. Some people say you
should use this version for compatibility reasons, as Windows 95 and NT 3 only ship version 1.1.
However, all later windows versions (98, ME, NT4, 2000 and XP) have version 2 by default and for
Windows 95 an update is available. So I recommend you just start with winsock 2, it adds a lot of
nice features and windows machines without winsock 2 are getting rare.
The two major versions of winsock reside in two different DLLs, wsock32.dll and ws2_32.dll,
being version 1.1 and version 2.x respectively. The libraries to use
are wsock32.lib and ws2_32.lib. The MASM32 package has most winsock constants in its
windows.inc, for C++ programs including windows.h suffices, it will include the winsock 2
definitions if the _WIN32_WINNT constant is at least 0x400 (NT version 4). The winsock 2 API
includes the full 1.1 API (with some minor changes), wsock32.dll is even just a wrapper for the
actual winsock ws2_32.dll.
This tutorial will assume you are using winsock 2.
8. Winsock architecture
Winsock provides two interfaces, the Application Programming Interface (API) and the Service
Provider Interface (SPI). This tutorial is about the API, it contains all the functions you need to
communicate using the well-known protocols. The SPI is an interface to add Data Transport
Providers (like TCP/IP or IPX/SPX) or Name Space Service Providers (like DNS). These extensions
are transparent to the user of the API.
Basic winsock functions
In this chapter of the winsock tutorial, I will show you the basic winsock functions that operate
on sockets. It is important to remember that this chapter is only an introduction to the socket
functions, so you will be able to follow the next tutorials. Do not start coding immediately after
you've read this chapter, the next chapters are just as important.
The basic functionality of each function is relatively simple, but things like the blocking mode
make it more complicated than it looks at first sight. The next chapters will cover the details, but
first you need to be familiar with the functions.
This chapter is quite long and you might not remember everything but that's okay. Just read it
carefully so you know what I'm talking about in the next chapters, you can always look back here
and use it as a quick reference.
1. WSAStartup & WSACleanup
int WSAStartup(WORD wVersionRequested, LPWSADATA lpWSAData);
int WSACleanup();
Before calling any winsock function, you need to initialize the winsock library. This is done with
WSAStartup. It takes two parameters:
wVersionRequested
Highest version of Windows Sockets support that the caller can use. The high-order byte
specifies the minor version (revision) number; the low-order byte specifies the major
version number.
lpWSAData
Pointer to the WSADATA data structure that is to receive details of the Windows Sockets
implementation.
As explained in the introduction, I will use winsock 2. This means you need to set the low byte of
wVersionRequested to 2, the high byte can be zero (the revision number is not important). The
WSADATA structure specified with the lpWSAData parameter will receive some information about
the winsock version installed.
The function returns zero if it succeeded, otherwise you can call WSAGetLastError to see what
went wrong. WSAGetLastError is the winsock equivalent of the Win32 API's GetLastError, it
retrieves the code of the last occurred error.
It is important to note that you might not get the version you requested in
the wVersionRequested parameter. This parameter specifies the highest winsock version your
application *supports*, not 'requires'. Winsock will try hard to give you the version you requested
but if that is not possible, it uses a lower version. This version is available after the call, in
the wVersion member of the WSADATA structure. You should check this version after the call to
see if you really got the winsock version you wanted. There is also a member
called wHighVersion that gives the highest winsock version supported by the system. In short:
wVersionRequested parameter: The highest winsock version your application supports.
wHighVersion in WSADATA: The highest winsock version the system supports.
wVersion in WSADATA: min(wVersionRequested, wHighVersion).
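The min() in this summary has to compare versions, not raw 16-bit values, because the major number sits in the low byte. The sketch below models that rule with stand-in helpers so it compiles without windows.h; makeWord, lowByte, highByte, isOlder and negotiateVersion are hypothetical names, and the code only illustrates the rule of thumb given above rather than what winsock does internally.

```cpp
#include <cstdint>

// Stand-ins for the Windows WORD type and the MAKEWORD, LOBYTE and HIBYTE
// macros (hypothetical names, so this compiles without windows.h).
typedef std::uint16_t Word;

Word makeWord(unsigned low, unsigned high)
{
    return static_cast<Word>((low & 0xFF) | ((high & 0xFF) << 8));
}
unsigned lowByte(Word w)  { return w & 0xFF; }        // major version number
unsigned highByte(Word w) { return (w >> 8) & 0xFF; } // minor version number

// True if version a is older than version b. The major number is compared
// first, the minor number only decides between equal major numbers.
bool isOlder(Word a, Word b)
{
    if (lowByte(a) != lowByte(b))
        return lowByte(a) < lowByte(b);
    return highByte(a) < highByte(b);
}

// Models wVersion = min(wVersionRequested, wHighVersion).
Word negotiateVersion(Word requested, Word highestSupported)
{
    return isOlder(requested, highestSupported) ? requested : highestSupported;
}
```

For example, version 2.0 is the word 0x0002 and version 1.1 is 0x0101. Comparing the raw words would rank 2.0 below 1.1, whereas isOlder correctly treats 1.1 as the older version, so an application requesting 2.0 on a system that only supports 1.1 ends up with 1.1.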
Each call to WSAStartup has to match a call to WSACleanup, which cleans up the winsock library.
Although useless, WSAStartup may be called more than once, as long as WSACleanup is called
the same number of times.
An example of initializing and cleaning up winsock:
const int iReqWinsockVer = 2; // Minimum winsock version required
WSADATA wsaData;
if (WSAStartup(MAKEWORD(iReqWinsockVer,0), &wsaData)==0)
{
    // Check if major version is at least iReqWinsockVer
    if (LOBYTE(wsaData.wVersion) >= iReqWinsockVer)
    {
        /* ------- Call winsock functions here ------- */
    }
    else
    {
        // Required version not available
    }
    // Cleanup winsock
    if (WSACleanup()!=0)
    {
        // cleanup failed
    }
}
else
{
    // startup failed
}
2. socket
SOCKET socket(int af, int type, int protocol);
The socket function creates a new socket and returns a handle to it. The handle is of type
SOCKET and is used by all functions that operate on the socket. The only invalid socket handle
value is INVALID_SOCKET (defined as ~0), all other values are legal (this includes the value
zero!). Its parameters are:
af
The address family to use. Use AF_INET to use the address family of TCP & UDP.
type
The type of socket to create. Use SOCK_STREAM to create a streaming socket (using
TCP), or SOCK_DGRAM to create a datagram socket (using UDP). For more information on
socket types, see the previous chapter.
protocol
The protocol to be used, this value depends on the address family. You can specify
IPPROTO_TCP here to create a TCP socket.
The return value is a handle to the new socket, or INVALID_SOCKET if something went wrong.
The socket function can be used like this:
SOCKET hSocket;
hSocket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
if (hSocket==INVALID_SOCKET)
{
    // error handling code
}
3. closesocket
int closesocket(SOCKET s);
Closesocket closes a socket. It returns zero if no error occurs, SOCKET_ERROR otherwise. Each
socket you created with socket has to be closed with an appropriate closesocket call.
s
Handle to the socket to be closed. Do not use this socket handle after you called this
function.
The use of closesocket is pretty straightforward:
closesocket(hSocket);
However, in real situations some more operations are necessary to close the socket properly. This
will be discussed later in the tutorial.
4. sockaddr and byte ordering
Because winsock was made to be compatible with several protocols including ones that might be
added later (using the SPI) a general way of addressing has to be used. TCP/IP uses an IP and
port number to specify an address, but other protocols might do it differently. If winsock forced a
certain way of addressing, adding other protocols may not have been possible. The first version
of winsock solved this with the sockaddr structure:
struct sockaddr
{
    u_short sa_family;
    char sa_data[14];
};
In this structure, the first member (sa_family) specifies the address family the address is for. The
data stored in the sa_data member can vary among different address families. We will only use
the internet address family (TCP/IP) in this tutorial, winsock has defined a
structure sockaddr_in that is the TCP/IP version of the sockaddr structure. They are
essentially the same structure, but the second is obviously easier to manipulate.
struct sockaddr_in
{
    short sin_family;
    u_short sin_port;
    struct in_addr sin_addr;
    char sin_zero[8];
};
The last 8 bytes of the structure are not used but are padded (with sin_zero) to give the
structure the right size (the same size as sockaddr).
Before proceeding, it is important to know about the network byte order. In case you don't know,
byte ordering is the order in which values that span multiple bytes are stored. For example, a 32-
bit integer value like 0x12345678 spans four 8-bit bytes. Intel x86 machines use the 'little-endian'
order, which means the least significant byte is stored first. So the value 0x12345678 would be
stored as the byte sequence 0x78, 0x56, 0x34, 0x12. Most machines that don't use little-endian
use big-endian, which is exactly the opposite: the most significant byte is stored first. The same
value would then be stored as 0x12, 0x34, 0x56, 0x78. Because protocol data can be transferred
between machines with different byte ordering, a standard is needed to prevent the machines
from interpreting the data the wrong way.
Network byte ordering
Because protocols like TCP/IP have to work between different types of systems
with different types of byte ordering, the standard is that values are stored
in big-endian format, also called network byte order. For example, a port
number (which is a 16-bit number) like 12345 (0x3039) is stored with its most
significant byte first (ie. first 0x30, then 0x39). A 32-bit IP address is stored in
the same way, each part of the IP number is stored in one byte, and the first
part is stored in the first byte. For example, 216.239.51.100 is stored as the
byte sequence '216,239,51,100', in that order.
Apart from the sin_family value of sockaddr and sockaddr_in, which is not part of the protocol
but tells winsock which address family to use, all the values in both structures have to be in
network byte order. Winsock provides several functions to deal with the conversion between the
byte order of the local host and the network byte order:
// Convert a u_short from host to TCP/IP network byte order.
u_short htons(u_short hostshort);
// Convert a u_long from host to TCP/IP network byte order.
u_long htonl(u_long hostlong);
// Convert a u_short from TCP/IP network order to host byte order.
u_short ntohs(u_short netshort);
// Convert a u_long from TCP/IP network order to host byte order.
u_long ntohl(u_long netlong);
You might question why we should need four API functions for such simple operations as
swapping the bytes of a short or long (as that's enough to convert from little-endian (intel) to
big-endian (network)). This is because these APIs will work even if you are running your program
on a machine with other byte ordering than an intel machine (that is, the APIs are platform
independent), like Windows CE on a handheld using a big-endian processor. Whether you use
these APIs or your own macros/functions is up to you. Just know that the API way is guaranteed
to work on all systems.
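To make the byte sequences above a bit more tangible, here is a small sketch that produces them by hand. It does not use winsock at all; the function names toNetworkBytes16, toNetworkBytes32 and hostIsLittleEndian are hypothetical helpers invented for this illustration, not winsock APIs. Because the bytes are extracted with shifts, the result is the same on little-endian and big-endian hosts.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: the two bytes of a 16-bit value in network (big-endian) order.
std::array<std::uint8_t, 2> toNetworkBytes16(std::uint16_t value)
{
    return { static_cast<std::uint8_t>(value >> 8),       // most significant byte first
             static_cast<std::uint8_t>(value & 0xFF) };
}

// Hypothetical helper: the four bytes of a 32-bit value in network order.
std::array<std::uint8_t, 4> toNetworkBytes32(std::uint32_t value)
{
    return { static_cast<std::uint8_t>(value >> 24),
             static_cast<std::uint8_t>((value >> 16) & 0xFF),
             static_cast<std::uint8_t>((value >> 8) & 0xFF),
             static_cast<std::uint8_t>(value & 0xFF) };
}

// Hypothetical helper: true when the machine running this code stores the
// least significant byte first (like Intel x86 does).
bool hostIsLittleEndian()
{
    const std::uint16_t probe = 1;
    return *reinterpret_cast<const unsigned char*>(&probe) == 1;
}
```

For the port number 12345 (0x3039) this gives the bytes 0x30, 0x39, and for 0x12345678 it gives 0x12, 0x34, 0x56, 0x78, matching the network byte order described above. In real code you would normally just call htons and htonl.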
Back to the sockaddr_in structure: as said above, all members except for sin_family have to be
in network byte order. For sin_family use AF_INET. sin_port is the port number of the address
(16-bit), sin_addr is the IP address (32-bit), declared as a union so you can manipulate the full 32-bit
word, the two 16-bit parts or each byte separately. sin_zero is not used.
Here are several examples of initializing sockaddr_in structures:
sockaddr_in sockAddr1, sockAddr2;
// Set address family
sockAddr1.sin_family = AF_INET;
/* Convert port number 80 to network byte order and assign it to
the right structure member. */
sockAddr1.sin_port = htons(80);
/* inet_addr converts a string with an IP address in dotted format to
a long value which is the IP in network byte order.
sin_addr.S_un.S_addr specifies the long value in the address union */
sockAddr1.sin_addr.S_un.S_addr = inet_addr("127.0.0.1");
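// Set address of sockAddr2 (also 127.0.0.1) by setting the 4 byte parts:
sockAddr2.sin_addr.S_un.S_un_b.s_b1 = 127;
sockAddr2.sin_addr.S_un.S_un_b.s_b2 = 0;
sockAddr2.sin_addr.S_un.S_un_b.s_b3 = 0;
sockAddr2.sin_addr.S_un.S_un_b.s_b4 = 1;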
The inet_addr function in the example above can convert an IP address in dotted string format
to the appropriate 32-bit value in network byte order. There is also a function called inet_ntoa,
which does exactly the opposite.
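To show what such a conversion involves, here is a rough sketch of a dotted-string parser. It is not how inet_addr is implemented: parseDottedQuad is a hypothetical helper written for this example, and it is stricter than the real function (it only accepts four decimal parts, while inet_addr understands a few more notations and reports failure by returning INADDR_NONE).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper, not part of winsock: parses "a.b.c.d" (decimal parts only)
// into four bytes in network order, first part first. Returns false on bad input.
bool parseDottedQuad(const std::string& text, std::array<std::uint8_t, 4>& out)
{
    std::size_t pos = 0;
    for (int part = 0; part < 4; ++part)
    {
        if (pos >= text.size() || text[pos] < '0' || text[pos] > '9')
            return false;                       // each part needs at least one digit
        unsigned value = 0;
        while (pos < text.size() && text[pos] >= '0' && text[pos] <= '9')
        {
            value = value * 10 + static_cast<unsigned>(text[pos] - '0');
            if (value > 255)
                return false;                   // each part must fit in one byte
            ++pos;
        }
        out[part] = static_cast<std::uint8_t>(value);
        if (part < 3)
        {
            if (pos >= text.size() || text[pos] != '.')
                return false;                   // parts are separated by dots
            ++pos;
        }
    }
    return pos == text.size();                  // no trailing characters allowed
}
```

Parsing "216.239.51.100" this way yields the bytes 216, 239, 51, 100 in that order, which is exactly the in-memory layout of the 32-bit value in network byte order.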
As a side note, winsock 2 does not require that the structure used to address a socket is the
same size as sockaddr, only that the first short is the address family and that the right structure
size is passed to the functions using it. This allows new protocols to use larger structures. The
sockaddr structure is provided for backwards compatibility. However, since we will only use
TCP/IP in this tutorial, the sockaddr_in structure is all we need.
5. connect
int connect(SOCKET s, const struct sockaddr *name, int namelen);
The connect function connects a socket with a remote socket. This function is used on the client
side of a connection, as you are the one initiating it. A short description of its parameters:
s
The unconnected socket you want to connect.
name
Pointer to a sockaddr structure that contains the name (address) of the remote socket to
connect to.
namelen
Size of the structure pointed to by name.
The first parameter s is the client socket used for the connection. For example, a socket you've
just created with the socket function. The other two parameters, name and namelen are used
to address the remote socket (the server socket that is listening for incoming connections). This
is done by using a sockaddr structure (or sockaddr_in for TCP/IP), as described in the previous
section.
A possible use of this function is connecting to a webserver to request a page. To address the
server, you can use sockaddr_in structure and fill it with the server's IP and port number. You
might wonder how you get the IP of a hostname like www.madwizard.org, I will show you how to
do that later. For now, just assume you know the server's IP number.
Assuming a webserver is running on a local network PC with IP number 192.168.0.5, using the
default HTTP port 80, this would be the code to connect to the server:
/* This code assumes a socket has been created and its handle
is stored in a variable called hSocket */
sockaddr_in sockAddr;
sockAddr.sin_family = AF_INET;
sockAddr.sin_port = htons(80);
sockAddr.sin_addr.S_un.S_addr = inet_addr("192.168.0.5");
// Connect to the server
if (connect(hSocket, (sockaddr*)(&sockAddr), sizeof(sockAddr))!=0)
{
    // connect failed: handle the error here (WSAGetLastError returns the error code)
}
/* Note: the (sockaddr*) cast is necessary because connect requires a
sockaddr type variable and the sockAddr variable is of the sockaddr_in
type. It is safe to cast it since they have the same structure, but the
compiler naturally sees them as different types. */
6. bind
int bind(SOCKET s, const struct sockaddr *name, int namelen);
Binding a socket has been explained in the previous chapter. By binding a socket you assign an
address to a socket. Bind's parameters are:
s
The unbound socket you want to bind.
name
Pointer to a sockaddr structure that contains the address to assign to the socket.
namelen
Size of the structure pointed to by name.
For TCP/IP, the sockaddr_in structure can be used as usual. Let's look at an example first:
sockaddr_in sockAddr;
sockAddr.sin_family = AF_INET;
sockAddr.sin_port = htons(80);
sockAddr.sin_addr.S_un.S_addr = INADDR_ANY; // use default
// Bind socket to port 80
if (bind(hSocket, (sockaddr*)(&sockAddr), sizeof(sockAddr))!=0)
{
    // bind failed: handle the error here (WSAGetLastError returns the error code)
}
As you can see, a sockaddr_in structure is filled with the necessary information. The address
family is AF_INET for TCP/IP. In the example, we bind the socket to port number 80, but not to
a specific IP number. By specifying the INADDR_ANY value as IP address, you let winsock use any
appropriate local address, so a listening socket accepts connections on all of the machine's
addresses. This can be very useful for PCs with multiple network adapters (and thus multiple
IPs). If you do want to bind to a specific IP, just convert the IP to a DWORD in network byte
order and put it in the structure. Something similar is possible with the port number; when you
specify 0 as the port number winsock will assign a unique port from its dynamic port range
(between 1024 and 5000 on older Windows versions; Windows Vista and later use 49152 to 65535
by default). However, most of the time you want to bind to a specific port number.
Binding is usually done before putting the socket in a listening state, to make the socket listen on
the right port number (and optionally an IP number). Although you can also bind a socket before
connecting it, this is not commonly done because the address of the socket on the client side is
not important most of the time.
7. listen
int listen(SOCKET s, int backlog);
The listen function puts a socket in the listening state, that is it will be listening for incoming
connections. It has two parameters:
s
The bound, unconnected socket you want to set into the listening state.
backlog
Maximum length of the queue of pending connections.
The backlog parameter can be set to specify the length of the queue of pending connections that
have not yet been accepted. Usually, you can use the default value SOMAXCONN, allowing the
underlying service provider to choose a reasonable value.
Before listen is called, the socket must have been bound to an address, as shown in the previous
section. For example, if you bind a socket to port 80 and then call listen on the socket, all
incoming connections on port 80 will be routed to your application. To actually accept the
connection, another function called accept is available; it will be explained in the next section.
The following code snippet shows how to call the listen function on a socket that has been bound
already:
/* This code assumes the socket specified by
hSocket is bound with the bind function */
if (listen(hSocket, SOMAXCONN)!=0)
{
    // listen failed: handle the error here (WSAGetLastError returns the error code)
}
8. accept
SOCKET accept(SOCKET s, struct sockaddr *addr, int *addrlen);
When the socket is in the listening state and an incoming connection arrives, you can accept it
with the accept function.
s
The socket that has been placed in a listening state with the listen function.
addr
Optional pointer to a buffer that receives the address of the remote socket. This parameter
is a pointer to a sockaddr structure, but its exact structure is determined by the address
family.
addrlen
Optional pointer to an integer that contains the length of addr. Before calling the function,
the value should be the size of the buffer pointed to by addr. On return, the value is the
size of the data returned in the buffer.
As you know, when a connection is accepted a new socket is created on the server side. This
new socket is connected to the client socket, all operations on that connection are done with
that socket. The original listening socket is not connected, but instead listens for more incoming
connections.
sockaddr_in remoteAddr;
int iRemoteAddrLen;
SOCKET hRemoteSocket;
iRemoteAddrLen = sizeof(remoteAddr);
hRemoteSocket = accept(hSocket, (sockaddr*)&remoteAddr, &iRemoteAddrLen);
if (hRemoteSocket==INVALID_SOCKET)
{
    // accept failed: handle the error here (WSAGetLastError returns the error code)
}
If accept succeeds, a connection is established and the return value is a new socket handle that
is the server side of the new connection. Optionally, you can set the addr and addrlen
parameters that will receive a sockaddr structure containing the remote address information (IP &
port number).
9. send and recv
int send(SOCKET s, const char *buf, int len, int flags);
s
The connected socket to send data on.
buf
Pointer to a buffer containing the data to send
len
Length of the data pointed to by buf.
flags
Specifies the way in which the call is made.
int recv(SOCKET s, char *buf, int len, int flags);
s
The connected socket to receive data from.
buf
Pointer to a buffer that will receive the data.
len
Length of the buffer pointed to by buf.
flags
Specifies the way in which the call is made.
To transfer data on a connection, you use the send and recv functions. Send sends the data in
the buffer on the socket and returns the number of bytes sent. Recv receives the data that is
currently available at the socket and stores it in the buffer. The flags parameter can usually be set
to zero for both recv and send.
In blocking mode, send will block until all data has been sent (or an error occurred) and recv will
return as much information as is currently available, up to the size of the buffer specified.
Although these functions may seem simple at first, they become more complicated in non-blocking
mode. When a socket is in non-blocking mode, these functions cannot block until the operation is
finished so they may not perform the operation fully (ie. not all data is sent), or not at all. The
next chapter will explain these issues in great detail; I won't discuss them here since this is only a
function overview.
This example of recv and send on a connected socket in blocking mode will just send back all data
it receives.
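char buffer[128];
while(true)
{
    // Receive data
    int bytesReceived = recv(hRemoteSocket, buffer, sizeof(buffer), 0);
    if (bytesReceived==0) // connection closed
    {
        break;
    }
    else if (bytesReceived==SOCKET_ERROR)
    {
        // recv failed: handle the error (see WSAGetLastError) and stop echoing
        break;
    }
    // Send received data back
    if (send(hRemoteSocket, buffer, bytesReceived, 0)==SOCKET_ERROR)
    {
        // send failed: handle the error (see WSAGetLastError) and stop echoing
        break;
    }
}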
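Because recv only returns what is currently available, a single call may give you fewer bytes than you were hoping for. When you need a fixed number of bytes, you can keep calling it until you have them all. The sketch below shows that loop. recvExactly is a hypothetical helper, and it takes the receive operation as a parameter (recvFn, also an invented name) so the logic can be tried without a real socket.

```cpp
// Hypothetical helper: keeps calling recvFn until exactly `length` bytes are stored
// in `buffer`. recvFn(ptr, len) is assumed to behave like recv(hSocket, ptr, len, 0)
// on a blocking socket: it returns the number of bytes stored, 0 when the
// connection was closed, or a negative value (SOCKET_ERROR) on failure.
template <typename RecvFn>
bool recvExactly(RecvFn recvFn, char* buffer, int length)
{
    int total = 0;
    while (total < length)
    {
        // Ask only for the part that is still missing.
        int received = recvFn(buffer + total, length - total);
        if (received <= 0)
            return false;       // closed or failed before all bytes arrived
        total += received;
    }
    return true;
}
```

With winsock you could pass a lambda that forwards to recv, for example [&](char* p, int n) { return recv(hRemoteSocket, p, n, 0); }. This only makes sense for blocking sockets; in non-blocking mode the loop would need to deal with WSAEWOULDBLOCK instead of treating it as a failure.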
10. Usage
As stated in this chapter's introduction, this was only an overview of the main winsock functions.
Just knowing how the functions work is not enough to program correctly with winsock. The next
chapters will tell you how to use them correctly, which I/O strategies exist and how blocking and
non-blocking mode works.
I/O models
In chapter 3 I briefly touched on blocking and non-blocking sockets, which play a role in the
available winsock I/O models. An I/O model is the method you use to control the program flow of
the code that deals with the network input and output. Winsock provides several functions to
design an I/O strategy, I will discuss them all here in short to get an overview. Later in the tutorial
I will deal with most models separately and show some examples of them.
1. The need for an I/O model
So why do you need an I/O model? We don't have infinite network speed, so when you send or
receive data the operation you asked for may not be completed immediately. This is especially true
for networks, which are slow compared to 'normal', local operations. How do you handle this? You
could choose to do other things while you're waiting to try again, or let your program wait until the
operation is done, etc. The best choice depends on the structure and requirements of your
program.
Originally, Berkeley sockets used the blocking I/O model. This means that all socket functions
operate synchronously, ie. they will not return before the operation is finished. This kind of
behavior is often undesirable in the Windows environment, because often user input and output
should still be processed even while network operations might occur (I explained this earlier in
chapter 3). To solve this problem, non-blocking sockets were introduced.
2. Non-blocking model
A socket can be set into non-blocking mode using ioctlsocket (with FIONBIO as its cmd
parameter). Some functions used in I/O models implicitly set the socket in non-blocking mode (more
on this later). When a socket is in non-blocking mode, winsock functions that operate on it will
never block but always return immediately, possibly failing because there simply wasn't any time to
perform the operation. Non-blocking sockets introduce a new winsock error code which - unlike
other errors - is not exceptional. For now, keep the following in mind:
WSAEWOULDBLOCK
This constant is the error code a winsock function sets when it cannot
immediately perform an operation on a non-blocking socket. You get this
error code when you call WSAGetLastError after a winsock function failed.
Its name literally says 'error, would block', meaning that the function would
have to block to complete. Since a non-blocking socket should not block, the
function cannot do what you ask at that moment and fails instead.
Note that this isn't really an error. It can occur all the time when using non-
blocking sockets. It just says: I can't do that right now, try again later. The
I/O model usually provides a way to determine what's the best time to try
again.
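In code, this means a failed call on a non-blocking socket has to be split into 'would block' and 'real error'. The sketch below shows one way to do that for recv. The names setNonBlocking, lastErrorWouldBlock, tryRecv and RecvStatus are hypothetical, made up for this example. The branch for other platforms is only a stand-in (an assumption that Berkeley sockets are available) so the sketch can be tried without winsock; the winsock branch is the one that matters here.

```cpp
#ifdef _WIN32
#include <winsock2.h>               // link with ws2_32.lib
typedef SOCKET socket_t;

bool setNonBlocking(socket_t s)
{
    u_long enable = 1;              // non-zero switches non-blocking mode on
    return ioctlsocket(s, FIONBIO, &enable) == 0;
}
bool lastErrorWouldBlock() { return WSAGetLastError() == WSAEWOULDBLOCK; }
#else
// Stand-in for other platforms, only so the sketch can be tried without winsock.
#include <cerrno>
#include <fcntl.h>
#include <sys/socket.h>
typedef int socket_t;

bool setNonBlocking(socket_t s)
{
    int flags = fcntl(s, F_GETFL, 0);
    return flags != -1 && fcntl(s, F_SETFL, flags | O_NONBLOCK) == 0;
}
bool lastErrorWouldBlock() { return errno == EWOULDBLOCK || errno == EAGAIN; }
#endif

enum RecvStatus { RECV_DATA, RECV_CLOSED, RECV_WOULD_BLOCK, RECV_ERROR };

// Hypothetical helper: one receive attempt on a non-blocking socket.
// `received` is only meaningful when RECV_DATA is returned.
RecvStatus tryRecv(socket_t s, char* buffer, int length, int& received)
{
    received = static_cast<int>(recv(s, buffer, length, 0));
    if (received > 0)
        return RECV_DATA;
    if (received == 0)
        return RECV_CLOSED;         // the other side closed the connection
    // The call failed; find out whether it is the harmless 'try again later' case.
    return lastErrorWouldBlock() ? RECV_WOULD_BLOCK : RECV_ERROR;
}
```

When tryRecv reports RECV_WOULD_BLOCK nothing is wrong with the connection; the I/O model you use decides when it makes sense to call it again.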
3. I/O models
I've made several attempts to find a categorical description of the several I/O models but I haven't
really found a good one, mainly because the models' properties overlap and terms like
(a)synchronous have slightly different meanings or apply to different things for each model. So I
decided to just create a table with all the models to show the differences and explain the details
later.
Model                                 Blocking mode  Notification method
                                                     none   on network event   on completion
Blocking sockets                      blocking       x
Polling                               non-blocking   x
Select                                both                  blocking select
WSAAsyncSelect                        non-blocking          window message
WSAEventSelect                        non-blocking          event objects
Overlapped I/O: blocking              N/A                                      blocking call
Overlapped I/O: polling               N/A            x
Overlapped I/O: completion routines   N/A                                      callback function
Overlapped I/O: completion ports      N/A                                      completion port
I/O models - MadWizard.org
The first five models are commonly used and fairly easy to use. The last four actually use the same
model (overlapped I/O), but use different implementation methods. Actually, you don't really need
overlapped I/O unless you're writing network programs that should be able to handle thousands of
connections. Most people won't write such programs but I included them because good information
and tutorials about the overlapped I/O model are not easy to find on the web. If you're not
interested in overlapped I/O you can safely skip the future chapters about them.
One way to divide the I/O models is based on the blocking mode they use. The blocking sockets
model naturally uses blocking mode, while the others use non-blocking mode (select may be used
for both). The blocking mode is not applicable to overlapped I/O because these operations always
operate asynchronously (the blocking mode cannot affect this nor the other way around).
Another way to divide them is using their differences in the notification method used (if any).
There are three subtypes:
None
There is no notification of anything, an operation simply fails or succeeds (optionally blocking).
On network event
A notification is sent on a specific network event (data available, ready to send, incoming
connection waiting to be accepted, etc.). Operations fail if they cannot complete immediately,
the network event notification can be used to determine the best time to try again.
On completion
A notification is sent when a pending network operation has been completed. Operations either
succeed immediately, or fail with an 'I/O pending' error code (assuming nothing else went wrong).
You will be notified when the operation does complete, eliminating the need to try the operation
again.
Blocking mode doesn't use any notifications, the call will just block until the operation has finished.
WSAAsyncSelect is an example of a network event notification model as you will be notified by a
window message when a specific network event occurred. The completion notification method is
solely used by overlapped I/O, and is far more efficient. Completion notifications are bound directly
to the operations; the big difference between the network event and completion notification is that
a completion notification will be about a specific operation you requested, while a network event
can happen for any reason. Also, overlapped I/O operations can - as the name says - overlap. That
means multiple I/O requests can be queued.
In the next section I will show you the details of each model separately. To give you a more
intuitive view of the models, I've created timeline images and used a conversation between the
program and winsock as an analogy to how the model works.
Note
In many of these timelines I've assumed the winsock operation fails (in a WSAEWOULDBLOCK way)
because that is the interesting case. The function might as well succeed and return immediately if
the operation has been done already. I've left this case out in most of the timelines in favor of
clarity.
4. Blocking sockets
Blocking sockets are the easiest to use, they were already used in the first socket
implementations. When an operation performed on a blocking socket cannot complete immediately,
the socket will block (i.e. halt execution) until it is completed. This implies that when you call a
winsock function like send or recv, it might take quite a while (compared to other API calls) before
it returns.
This is the timeline for a blocking socket:
As you can see, as soon as the main thread calls a winsock function that couldn't be completed
immediately, the function will not return until it is completed. Naturally this keeps the program flow
simple, since the operations can be sequenced easily.
By default, a socket is in blocking mode and behaves as shown above. As I said earlier, I will also
show each I/O model in the form of a conversation between the program and winsock. For blocking
sockets, it's very simple:
program: send this data
winsock: okay, but it might take some time
...
...
...
done!
5. Polling
Polling is actually a very bad I/O model in windows, but for completeness' sake I will describe it.
Polling is an I/O model for non-blocking sockets, so the socket first has to be put into non-blocking
mode. This can be done with ioctlsocket. Polling in general is repeating something until its status
is the desired one, in this case repeating a winsock function until it returns successfully:
Because the socket is non-blocking, the function will not block until the operation is finished. If it
cannot perform the operation it has to fail (with WSAEWOULDBLOCK as error code). The polling I/O
model just keeps calling the function in a loop until it succeeds:
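program: send this data
winsock: sorry can't do that right now, I would block
program: send this data
winsock: sorry can't do that right now, I would block
program: send this data
winsock: done!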
As I said, this is a really bad method because its effect is the same as a blocking function, except
that you have some control inside the loop so you could stop waiting when some variable is set, for
example. This style of synchronization is called 'busy waiting', which means the program is
continuously busy with waiting, wasting precious CPU time. Blocking sockets are far more efficient
since they use an efficient wait state that requires nearly no CPU time until the operation
completes.
Now you know how the polling I/O model works, forget about it immediately and avoid it by all
means :)
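Just so you can recognize the pattern when you see it, this is roughly what such a polling loop looks like. It is a sketch with made-up names (OpResult, pollUntilDone, tryOperation, keepWaiting), not winsock code: tryOperation stands for any non-blocking winsock call that has been wrapped to report 'done', 'would block' or 'error'.

```cpp
enum OpResult { OP_DONE, OP_WOULD_BLOCK, OP_ERROR };

// Hypothetical helper: repeats tryOperation until it stops reporting 'would block'.
// Returns the number of attempts that were needed, or -1 when the operation
// failed or keepWaiting asked to give up.
template <typename Operation, typename KeepWaiting>
int pollUntilDone(Operation tryOperation, KeepWaiting keepWaiting)
{
    int attempts = 0;
    for (;;)
    {
        ++attempts;
        OpResult result = tryOperation();
        if (result == OP_DONE)
            return attempts;
        if (result == OP_ERROR)
            return -1;
        if (!keepWaiting())
            return -1;          // the one advantage over blocking: a way out
        // Nothing waits here: the loop immediately tries again, and that
        // is exactly the busy waiting that wastes CPU time.
    }
}
```

The attempt counter is only there to make the waste visible: on a real socket it can easily run into the thousands while a blocking call would simply have slept.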
6. Select
Select provides you a more controlled way of blocking. Although it can be used on blocking sockets
too, I will only focus on the non-blocking socket usage. This is the timeline for select:
And the corresponding conversation:
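program: send this data
winsock: sorry can't do that right now, I would block
program: okay, tell me when's the best time to try again (the select call)
winsock: sure, hang on a minute
...
...
try again now!
program: send this data
winsock: done!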
You might have noticed that the select call looks suspiciously similar to the blocking socket
timeline. This is because the select function does block. The first call tries to perform the winsock
operation. In this case, the operation would block but the function can't so it returns. Then at one
point, select is called. Select will wait until the best time to retry the winsock operation. So
instead of blocking winsock functions, we now have only one function that blocks, select.
If select blocks, why use it for non-blocking sockets then? Select is more powerful than blocking
sockets because it can wait on multiple events. This is the prototype of select:
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
const struct timeval *timeout);
Select determines the status of one or more sockets, performing synchronous I/O if necessary. The
nfds parameter is ignored by winsock; it is only there for compatibility with the original Berkeley
sockets select function. The timeout parameter can be used to specify an optional timeout for the function.
The other three parameters all specify a set of sockets.
readfds is a set of sockets that will be checked for readability
writefds is a set of sockets that will be checked for writability
exceptfds is a set of sockets that will be checked for errors
Readability means that data has arrived on a socket and that a call to recv after select is likely to
receive data. Writability means it's a good time to send data since the receiver is probably ready to
receive it. Exceptfds is used to catch errors from a non-blocking connect call as well as out-of-
band data (which is not discussed in this tutorial).
So while select may block you have more control over it since you can specify more than one
socket to wait on for a specific event, and multiple types of events (data waiting, ready to send or
some error that has occurred). Select will be explained in more detail in later chapters.
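As a first taste, here is a minimal sketch of waiting for readability on a single socket with a timeout. waitReadable and socket_t are names invented for this example. The branch for other platforms is an assumption that the Berkeley version of select is available there, included only so the sketch can be tried without winsock.

```cpp
#ifdef _WIN32
#include <winsock2.h>               // link with ws2_32.lib
typedef SOCKET socket_t;
#else
// Assumption: elsewhere the Berkeley version of select is used as a stand-in.
#include <sys/select.h>
#include <sys/socket.h>
typedef int socket_t;
#endif

// Hypothetical helper: waits until `s` is readable or the timeout expires.
// Returns 1 when readable, 0 on timeout and -1 when select itself failed.
int waitReadable(socket_t s, long timeoutMs)
{
    fd_set readSet;
    FD_ZERO(&readSet);              // start with an empty set
    FD_SET(s, &readSet);            // and add the one socket we care about

    timeval timeout;
    timeout.tv_sec = timeoutMs / 1000;
    timeout.tv_usec = (timeoutMs % 1000) * 1000;

    // Winsock ignores the first parameter; Berkeley sockets need highest handle + 1.
    int ready = select(static_cast<int>(s) + 1, &readSet, nullptr, nullptr, &timeout);
    if (ready < 0)
        return -1;
    return (ready > 0 && FD_ISSET(s, &readSet)) ? 1 : 0;
}
```

To wait on several sockets at once you would add each of them to the set with FD_SET before the call and test each one with FD_ISSET afterwards; that is where select becomes more useful than a plain blocking call.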
7. Windows messages (WSAAsyncSelect)
Many windows programs have some kind of window to get input from and give information to the
user. Winsock provides a way to integrate the network event notification with a window's
message handling. The WSAAsyncSelect function will register notification for the specified
network events in the form of a custom window message.
int WSAAsyncSelect(SOCKET s, HWND hWnd, u_int wMsg, long lEvent);
This function requires a custom message (wMsg) that the user chooses and the window procedure
should handle. lEvent is a bit mask that selects the events to be notified about. The timeline is as
follows:
Let's say the first message wants to write some data to the socket using send. Because the
socket is non-blocking, send will return immediately. The call might succeed immediately, but here
it didn't (it would need to block). Assuming WSAAsyncSelect was setup to notify you about the
FD_WRITE event, you will eventually get a message from winsock telling you a network event has
happened. In this case it's the FD_WRITE event which means something like: "I'm ready again, try
resending your data now". So in the handler of that message, the program tries to send the data
again, and this is likely to succeed.
The conversation between the program and winsock is much like the one with select, the
difference is in the method of notification: a window message instead of a synchronous select call.
While select blocks waiting until an event happens, a program using WSAAsyncSelect can continue
to process windows messages as long as no events happen.
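program registers for network event notification via window messages
program: send this data
winsock: sorry can't do that right now, I would block
program handles some message
program handles some other message
program gets a notification window message from winsock
program: send this data
winsock: done!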
WSAAsyncSelect provides a more 'Windows natural' way of event notification and is fairly easy to
use. For low traffic servers (i.e. < 1000 connections) it is efficient enough as well. The drawback is
that window messages aren't really fast and that you'll need a window in order to use it.
8. Event objects (WSAEventSelect)
WSAAsyncSelect's brother is WSAEventSelect, which works in a very similar way but uses event
objects instead of window messages. This has some advantages, including a better separation of
the network code and normal program flow and better efficiency (event objects work faster than
window messages).
Have a good look at the timeline and conversation, it looks a bit complicated but it really isn't:
program registers for network event notification via event objects
program: send this data
winsock: sorry can't do that right now, I would block
program waits for the event object to signal
program: send this data
winsock: done!
It's hard to draw a timeline for this function since event objects are a very powerful mechanism
that can be used in many ways. I chose a simple example here as this I/O model will be
explained in great detail later in this tutorial.
At first, this model seems a lot like blocking: you wait for an event object to be signaled. This is
true, but you can also wait for multiple events at the same time and create your own event
objects. Event objects are part of the windows API, winsock uses the same objects. Winsock does
have special functions to create the event objects but they are just wrappers around the usual
functions.
All that winsock does with this model is signal an event object when a winsock event happens.
How you use this notification method is up to you. That makes it a very flexible model.
The function used to register for network events is WSAEventSelect. It is much like
WSAAsyncSelect:
int WSAEventSelect(SOCKET s, WSAEVENT hEventObject, long lNetworkEvents);
WSAAsyncSelect will send you a custom message with the network event that happened
(FD_READ, FD_WRITE, etc.). Unlike WSAAsyncSelect, WSAEventSelect has only one way of
notification: signaling the event object. When the object is signaled, one or more events may
have happened. Which events exactly can be found out with WSAEnumNetworkEvents.
9. Use with threads
Before starting with the overlapped I/O models I first want to explain some things about the use of
threads. Some of the models explained can show different behavior when threads come into play.
For example, blocking sockets in a single threaded application will block the whole application. But
when the blocking sockets are used in a separate thread, the main thread continues to run while
the helper thread blocks. For low traffic servers (let's say 10 connections or so), an easy to
implement method is to use the select model with one thread per client. Each running thread is
bound to a specific connection, handling requests and responses for that particular connection.
Other ways of using threads are possible too, like handling multiple connections per thread to limit
the number of threads (this is useful for servers with many connections), or just one main thread
to handle the user input/GUI and one worker thread that deals with all the socket I/O.
The same thing holds for the other models, although some combine better with threads than
others. For example, WSAAsyncSelect uses window messages. You could use threads but you
somehow have to pass the received messages to the worker threads. Easier to use is
WSAEventSelect, since threads can wait on events (even multiple) so notifications can be directly
acted on in the thread. Pure blocking sockets can be used as well, but it's hard to get some
control over a thread that is blocked on a winsock function (select has the same problem). With
events, you can create a custom event (not winsock related) and use that to notify the thread
about something that hasn't got to do with socket I/O like shutting down the server.
As you can see, threads can be very powerful and change the abilities of an I/O model radically.
Many servers need to handle multiple requests at the same time so that's why threads are a logical
choice to implement this; threads all run at the same time. In later chapters I will discuss the use
of threads, for now it's enough to know you can use them.
10. Introduction to Overlapped I/O
Overlapped I/O is very efficient and when implemented well also very scalable (allowing many,
many connections to be handled). This is especially true for overlapped I/O in combination with
completion ports. I said before that for most uses overlapped I/O is a bit overkill but I will explain
them anyway.
The asynchronous models discussed so far all send some kind of notification on the occurrence of a
network event like 'data available' or 'ready to send again'. The overlapped I/O models also notify
you, but about completion instead of a network event. When requesting a winsock operation, it
might either complete immediately or fail with WSA_IO_PENDING as the winsock error code. In the
latter case, you will be notified when the operation is finished. This means you don't have to try
again like with the other models, you just wait until you're told it's done.
The price to pay for this efficient model is that overlapped I/O is a bit tricky to implement. Usually
one of the other models can stand up to the task as well, prefer those if you don't need really high
performance and scalability. Also, the windows 9x/ME series do not fully support all overlapped I/O
models. While NT4/2K/XP has full kernel support for overlapped I/O, win9x/ME has none. However
for some devices (including sockets), overlapped I/O is emulated by the windows API in win9x/ME.
This means you can use overlapped I/O with winsock for win9x/ME, but NT+ has a much greater
support for it and provides more functionality. For example, I/O completion ports are not available
at all on win9x systems. Besides, if you're writing high-performance applications that require
overlapped I/O I strongly recommend running it on an NT+ system.
As with the network event notification models, overlapped I/O can be implemented in different
ways too. They differ in the method of notification: blocking, polling, completion routines and
completion ports.
11. Overlapped I/O: blocking on event
The first overlapped I/O model I'm going to explain is using an event object to signal completion.
This is much like WSAEventSelect, except that the object is set into the signaled state on
completion of an operation, not on some network event. Here's the timeline:
As with WSAEventSelect, there are many ways to use the event object. You could just wait for it,
you could wait for multiple objects, etc. In the timeline above a blocking wait is used, matching
this simple conversation:
program: send this data
winsock: okay, but I couldn't send it right now
program waits for the event object to signal, indicating completion of the
operation
As you can see, the winsock operation is actually performed at the same time as the main thread is
running (or waiting in this case). When the event is signaled, the operation is complete and the
main thread can perform the next I/O operation. With network event notification models, you
probably had to retry the operation. This is not necessary here.
12. Overlapped I/O: polling
Just like the polling model mentioned earlier, the status of an overlapped I/O operation can be
polled too. The WSAGetOverlappedResult function can be used to determine the status of a
pending operation. The timeline and conversation are pretty much the same as for the other polling
model, except that the operation happens at the same time as the polling, and that the status
is the completion of the operation, not whether the operation succeeded immediately or would
have blocked.
program: are you done yet?
winsock: no
winsock: no
winsock: no
winsock: no
winsock: yes!
Again, polling isn't very good as it puts too much stress on the CPU. Continuously asking whether an
operation has completed is less efficient than just waiting for it in an efficient wait state that
consumes little CPU. So I don't consider this a very good I/O model either. This doesn't render
WSAGetOverlappedResult useless though; it has more uses, which I will show when the tutorial
comes to the chapters about overlapped I/O.
13. Overlapped I/O: completion routines
Completion routines are callback routines that get called when an operation (which you associated
with the routine) completes. This looks quite simple but there is a tricky part: the callback routine
is called in the context of the thread that initiated the operation. What does that mean? Imagine a
thread just asked for an overlapped write operation. Winsock will perform this operation while your
main thread continues to run. So winsock has its own thread for the operation. When the operation
finishes, winsock will have to call the callback routine. If it just called it directly, the routine would
run in the context of the winsock thread. This means the calling thread (the thread that asked for
the operation) would be running at the same time as the callback routine. The problem with that is
that you don't have synchronization with the calling thread: it doesn't know the operation
completed unless the callback tells it somehow.
To prevent this from happening, winsock makes sure the callback is run in the context of the
calling thread by using the APC (Asynchronous Procedure Call) mechanism included in windows. You
can look at this as 'injecting' a routine into a thread's program flow so it will run the routine and
then continue with what it was doing. Of course the system can't just say to a thread: "Stop doing
whatever you were doing, and run this routine first!". A thread can't just be interrupted at any
point.
In order to deal with this, the APC mechanism requires the thread to be in a so-called alertable
wait state. Each thread has its own APC queue where APCs are waiting to be called. When the
thread enters an alertable wait state it indicates that it's willing to run an APC. The function that
puts the thread in this wait state (for example SleepEx, WaitForMultipleObjectsEx and more) either
returns on the normal events for that function (timeout, triggered event etc.) or when an APC was
executed.
Overlapped I/O with completion routines uses the APC mechanism (though slightly wrapped) to
notify you about completion of an operation. The timeline and conversation are:
program enters an alertable wait state
the operation completes
winsock: system, queue this completion routine for that thread
the wait state the program is in is alerted
the wait function executes the queued completion routine and returns to the
program
APCs can be a bit hard to understand but don't worry, this is just an introduction. Usually a thread
is in the alertable wait state until the callback is called, which handles the event and returns to the
thread. The thread then does some operations if necessary and finally loops back to the wait state
again.
14. Overlapped I/O: completion ports
We've finally come to the last and probably most efficient winsock I/O model: overlapped I/O with
completion ports. A completion port is a mechanism available in NT kernels (win9x/ME has no
support for it) to allow efficient management of threads. Unlike the other models discussed so far,
completion ports have their own thread management. I didn't draw a timeline or write a
conversation for this model, as it probably wouldn't make things clearer. I did draw an image of the
mechanism itself; have a good look at it first:
The idea behind completion ports is the following. After creating the completion port, multiple
sockets (or files) can be associated with it. From that point on, when an overlapped I/O operation
completes, a completion packet is sent to the completion port. The completion port has a pool of
similar worker threads, each of which blocks on the completion port. On arrival of a
completion packet, the port takes one of the inactive queued threads and activates it. The
activated thread handles the completion event and then blocks again on the port.
The management of threads is done by the completion port. There are a certain number of threads
running (waiting on the completion port actually), but usually not all of them are active at the
same time. When creating the completion port you can specify how many threads are active at the
same time. This value defaults to the number of CPUs in the system.
Completion ports are a bit counterintuitive. There is no relation between a thread and a
connection or operation. Each thread has to be able to act on any completion event that
happened on the completion port. I/O completion ports (IOCP) are not easy to implement but
provide very good scalability. You will be able to handle thousands of connections with IOCP.
15. Conclusion
I hope you now have a global view of all the I/O models available. Don't worry if you don't fully
understand them; the next chapters will explain them in more detail, one at a time.
Blocking sockets: client
The first I/O model I'm going to explain to you is the simplest one: blocking sockets. Winsock
functions operating on blocking sockets will not return until the requested operation has completed
or an error has occurred. This behavior allows a pretty linear program flow, so they are easy to use.
In chapter 4, you've seen the basic winsock functions. These are pretty much all the functions you need
to program blocking sockets, although I will show you some additional functions that may be useful
in this chapter.
You might not be very interested in blocking sockets if you plan to use an I/O model that uses
non-blocking sockets. Nonetheless, I strongly recommend that you read the chapters about blocking
sockets too, since they cover the socket programming basics and other useful winsock features I will
assume you remember in the next chapters.
1. A simple client
The first example is a simple client program that connects to a website and makes a request. It will
be a console application as they work well with blocking sockets. I won't assume you have deep
knowledge of HTTP (the protocol used for the web), so this is what happens in short:
The client connects to the server (on port 80 by default)
The server accepts the connection and just waits
The client sends its HTTP request as an HTTP request message
The server responds to the HTTP request with an HTTP response message
The server closes the connection*
*) This depends on the value of the Connection HTTP header, but to keep things simple, we assume
the connection will always be closed.
HTTP follows the typical client-server model: the client and server talk to each other in turns. The
client initiates the requests; the server reacts with a response.
An HTTP request includes a request method, of which the three most used
are GET, POST and HEAD. GET is used to get a resource from the web (webpage, image, etc.).
POST sends data to the server first (like form data filled in by the user), then receives the server's
response. Finally, HEAD is the same as GET, except that the actual data is not sent by the
server, only the HTTP response message. HEAD is used as a fast way to see if a page has been
modified without having to download the full page data. In the example program I will use HEAD since
GET can return quite a lot of data while HEAD will only return a response code and a set of headers,
so the program's output is easier to read.
A typical HTTP request with the HEAD request method looks like this:
HEAD / HTTP/1.1 <crlf>
Host: www.google.com <crlf>
User-agent: HeadReqSample <crlf>
Connection: close <crlf>
<crlf>
The first / in the first line is the requested page, in this case the server's root (default page).
HTTP/1.1 indicates that version 1.1 of the HTTP protocol is used. After this first special line that contains
the command follows a set of headers in the form "header-name: value", terminated by a blank line.
As line terminators, a combination of carriage return (CR, 0x0D) and line feed (LF, 0x0A) is used.
That last blank line indicates the end of the client's request. As soon as the server detects this, it
will send back a response in this form:
HTTP/1.1 Response-code Response-message <crlf>
header-name: value <crlf>
<crlf>
As you can see, the response format is much like that of a request. Response-code is a 3-digit code
that indicates the success or failure of the request. Typical response codes are 200 (everything
OK), 404 (page not found, you probably knew this one :) and 302 (found but located elsewhere,
redirect). Response-message is a human-readable version of the response code and can be anything
the server likes. The set of headers includes information about the requested resource. A HEAD
request will result in the above response. If the request method had been GET, the actual
page data would be sent back by the server after this response message.
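To see the message formats in code form, here is a small sketch that builds the HEAD request from above as one string and picks apart the first line of a response. BuildHeadRequest and ParseStatusLine are helper names I made up for this illustration; they are not part of winsock or of the example program, and the parser only checks the basic shape of the line.

```cpp
#include <cstdlib>
#include <string>

// Builds the HEAD request shown above for the given host name.
// Every line ends in CR LF and an empty line ends the request.
std::string BuildHeadRequest(const std::string &host)
{
    return "HEAD / HTTP/1.1\r\n"
           "Host: " + host + "\r\n"
           "User-agent: HeadReqSample\r\n"
           "Connection: close\r\n"
           "\r\n";
}

// Splits a status line such as "HTTP/1.1 200 OK" (without the CR LF) into
// its response code and response message. Returns false if the line does
// not have the expected shape.
bool ParseStatusLine(const std::string &line, int &code, std::string &message)
{
    if (line.compare(0, 5, "HTTP/") != 0)
        return false;
    std::string::size_type firstSpace = line.find(' ');
    if (firstSpace == std::string::npos || firstSpace + 4 > line.size())
        return false;
    // The response code is exactly three digits.
    for (int i = 1; i <= 3; ++i)
        if (line[firstSpace + i] < '0' || line[firstSpace + i] > '9')
            return false;
    code = std::atoi(line.substr(firstSpace + 1, 3).c_str());
    // Everything after the code and one space is the free-form message.
    if (line.size() > firstSpace + 5)
        message = line.substr(firstSpace + 5);
    else
        message = std::string();
    return true;
}
```

For example, ParseStatusLine on "HTTP/1.1 404 Not Found" yields the code 404 and the message "Not Found".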
So much for the HTTP crash course. It's not really necessary to understand it all to read the examples
about blocking sockets, but now you have some background information too. If you want to read
more about HTTP, find the RFC for it (www.rfc-editor.org) or google for HTTP. Another great
introduction to HTTP is HTTP made really easy.
2. Program example
A possible output of the example program called HeadReq is shown here:
X:\>headreq www.microsoft.com
Initializing winsock... initialized.
Looking up hostname www.microsoft.com... found.
Creating socket... created.
Attempting to connect to 207.46.134.190:80... connected.
Sending request... request sent.
Dumping received data...
HTTP/1.1 200 OK
Connection: close
Date: Mon, 17 Mar 2003 20:14:03 GMT
Server: Microsoft-IIS/6.0
P3P: CP='ALL IND DSP COR ADM CONo CUR CUSo IVAo IVDo PSA PSD TAI TELo OUR SAMo CNT COM INT NAV ONL PHY PRE PUR UNI'
Content-Length: 31102
Content-Type: text/html
Expires: Mon, 17 Mar 2003 20:14:03 GMT
Cache-control: private
Cleaning up winsock... done.
If the program's parameter (www.microsoft.com) is omitted, www.google.com is used.
3. Hostnames
So what do we need for the client? I'm assuming you have the address of the webpage
(www.google.com for example) and you want to get the default webpage for it, like the page you
get when entering www.google.com in your web browser (in order to keep things simple we will only
receive the server's response headers, not the actual page).
As you know from chapter 4, you can connect a socket to a server with the connect function, but
this function requires a sockaddr structure (or sockaddr_in in the case of TCP/IP). How do we build
up this structure? Sockaddr_in needs an address family, an IP number and a port number. The
address family is simply AF_INET. The port number is also easy; the default for the HTTP protocol is
port 80. What about the IP? We only have a hostname. If you remember chapter 2, there's a DNS
server that knows which IPs correspond to which hostnames. To query it, winsock has a
function called gethostbyname:
hostent * gethostbyname(const char *name);
You simply provide this function a hostname as a string (e.g. "www.google.com") and it will return a
pointer to a hostent structure. This hostent structure contains a list of addresses (IPs) that are
valid for the given hostname. One of these IPs can then be put into the sockaddr_in structure and
we're done.
4. Framework
The program we're going to write will connect to a web server, send a HEAD HTTP request and dump
all output. An optional parameter specifies the server name to connect to, if no name is given it
defaults to www.google.com.
First of all, we define the framework for the application:
#include <iostream>
#define WIN32_LEAN_AND_MEAN
#include <winsock2.h>
#include <windows.h>
using namespace std;

class HRException
{
public:
    HRException() :
        m_pMessage("") {}
    virtual ~HRException() {}
    HRException(const char *pMessage) :
        m_pMessage(pMessage) {}
    const char * what() { return m_pMessage; }
private:
    const char *m_pMessage;
};

int main(int argc, char* argv[])
{
    // main program
}
Windows.h normally includes the older winsock header by itself (unless WIN32_LEAN_AND_MEAN is
defined), but because we use some winsock 2 specific things we need to include winsock2.h. Include
this file before windows.h to prevent it from including the older winsock version first. We will also
need the standard library's iostream classes, so we included those too. Don't forget to link to
ws2_32.lib, or you'll get a bunch of unresolved symbol errors.
The HRException class is a simple exception class used to throw errors that occur. One of its
constructors takes a const char * with an error message that can be retrieved with the what()
method.
5. Constants and global data
The program will need some constants and global data, which we define in the following code
snippet:
const int REQ_WINSOCK_VER = 2; // Minimum winsock version required
const char DEF_SERVER_NAME[] = "www.google.com";
const int SERVER_PORT = 80;
const int TEMP_BUFFER_SIZE = 128;
const char HEAD_REQUEST_PART1[] =
{
"HEAD / HTTP/1.1\r\n" // Get root index from server
"Host: " // Specify host name used
};
const char HEAD_REQUEST_PART2[] =
{
"\r\n" // End hostname header from part1
"User-agent: HeadReqSample\r\n" // Specify user agent
"Connection: close\r\n" // Close connection after response
"\r\n" // Empty line indicating end of request
};
// IP number typedef for IPv4
typedef unsigned long IPNumber;
These constants and data define the default hostname (www.google.com), server port (80 for
HTTP), receive buffer size, and the minimum (major) winsock version required (2 or higher in our
case). Furthermore, the full HTTP request is put in two variables. The request is split up because the
hostname of the server needs to be inserted as the host header (see the HTTP message examples
above). Although every string literal in C automatically gets a 0 byte at the end to terminate it, we don't
actually treat the request as a null-terminated string. Only the text itself will be sent, without the null
terminator. Finally, unsigned long is typedef'ed to IPNumber to make the code a bit clearer.
6. The main function
The first thing to do is initializing winsock. We will do this in the main function and write the actual
code for the HTTP request in a different function named RequestHeaders. The main function is:
int main(int argc, char* argv[])
{
    int iRet = 1;
    WSADATA wsaData;
    cout << "Initializing winsock... ";
    if (WSAStartup(MAKEWORD(REQ_WINSOCK_VER,0), &wsaData)==0)
    {
        // Check if major version is at least REQ_WINSOCK_VER
        if (LOBYTE(wsaData.wVersion) >= REQ_WINSOCK_VER)
        {
            cout << "initialized.\n";
            // Set default hostname:
            const char *pHostname = DEF_SERVER_NAME;
            // Set custom hostname if given on the commandline:
            if (argc > 1)
                pHostname = argv[1];
            iRet = !RequestHeaders(pHostname);
        }
        else
        {
            cerr << "required version not supported!";
        }
        cout << "Cleaning up winsock... ";
        // Cleanup winsock
        if (WSACleanup()!=0)
        {
            cerr << "cleanup failed!\n";
            iRet = 1;
        }
        cout << "done.\n";
    }
    else
    {
        cerr << "startup failed!\n";
    }
    return iRet;
}
The value the main function returns will be given back as exit code to the OS. Since the convention
for command line programs is that an exit code of 0 indicates success while other values indicate
some kind of error, we will follow this and return the correct value depending on the success of the
winsock initialization and the RequestHeaders function.
First of all, WSAStartup is called. It wants the highest winsock version your program supports
(REQ_WINSOCK_VER) and fills in a WSADATA structure. After we check if this function succeeded,
we still need to check which winsock version has been loaded, since this might be less than
REQ_WINSOCK_VER (see chapter 4). If the major version number is at least REQ_WINSOCK_VER, we
got the right version.
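If you wonder why LOBYTE gives the major version: the version word stores the major version in its low byte and the minor version in its high byte. The sketch below shows this with portable stand-ins for the MAKEWORD, LOBYTE and HIBYTE macros. MakeWord, LoByte, HiByte and MajorVersionOk are names I made up so the code also compiles without windows.h; in a real winsock program you would simply use the macros.

```cpp
#include <cstdint>

// Portable stand-ins for the MAKEWORD, LOBYTE and HIBYTE macros
// (hypothetical names, for illustration only).
constexpr std::uint16_t MakeWord(std::uint8_t low, std::uint8_t high)
{
    return static_cast<std::uint16_t>(low | (high << 8));
}

constexpr std::uint8_t LoByte(std::uint16_t word)
{
    return static_cast<std::uint8_t>(word & 0xFF);
}

constexpr std::uint8_t HiByte(std::uint16_t word)
{
    return static_cast<std::uint8_t>(word >> 8);
}

// True if a version word (as found in WSADATA::wVersion) has a major
// version of at least requiredMajor. The major version is the low byte.
constexpr bool MajorVersionOk(std::uint16_t loadedVersion, int requiredMajor)
{
    return LoByte(loadedVersion) >= requiredMajor;
}
```

So a request made with MAKEWORD(2,0) asks for version 2.0, and a loaded version of 1.1 would fail the check in the main function.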
Then, argc is checked to see if a parameter was given to the program. If there was, it should be a
hostname, and it is taken from argv and passed on to RequestHeaders instead of the default
hostname.
If WSAStartup succeeded, a matching call to WSACleanup is needed. This is done at the end of
the code.
7. RequestHeaders
RequestHeaders is the function where all the magic happens. The basic structure of it is:
bool RequestHeaders(const char *pServername)
{
    SOCKET hSocket = INVALID_SOCKET;
    char tempBuffer[TEMP_BUFFER_SIZE];
    sockaddr_in sockAddr = {0};
    bool bSuccess = true;
    try
    {
        // code goes here
    }
    catch(HRException &e) // catch by reference to avoid copying the exception
    {
        cerr << "\nError: " << e.what() << endl;
        bSuccess = false;
    }
    return bSuccess;
}
As a parameter, RequestHeaders gets the name of the server to connect to. There are some
variables we will use: the socket handle, a temporary buffer used to store received data and a
sockaddr_in structure for the server's address. The socket handle is initialized to INVALID_SOCKET,
the only value that can't be used as a socket handle. bSuccess is a bool that is set to false if the
function fails. The main code is surrounded by a try-catch block; any error that occurs is thrown as
an HRException and caught by this function. The cleanup code will be after the try-catch
block, so cleaning up happens both when everything succeeds and on failure.
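Placing the cleanup code after the try-catch block works fine for this example. A different way to get the same "always clean up" guarantee, which is not what the example program does, is to let a small object do the cleanup in its destructor. The sketch below is only an illustration of that alternative: ScopeGuard and DoWork are made-up names, and the thrown int stands in for an HRException.

```cpp
#include <functional>
#include <utility>

// Runs the given cleanup action when the guard goes out of scope, whether
// the scope is left normally or through an exception (hypothetical helper).
class ScopeGuard
{
public:
    explicit ScopeGuard(std::function<void()> cleanup) :
        m_cleanup(std::move(cleanup)) {}
    ~ScopeGuard() { if (m_cleanup) m_cleanup(); }
    ScopeGuard(const ScopeGuard &) = delete;
    ScopeGuard &operator=(const ScopeGuard &) = delete;
private:
    std::function<void()> m_cleanup;
};

// Demonstration: returns true on success and false on failure, and
// increments cleanupCount exactly once in both cases.
bool DoWork(bool shouldFail, int &cleanupCount)
{
    try
    {
        ScopeGuard guard([&cleanupCount] { ++cleanupCount; });
        if (shouldFail)
            throw 1; // stands in for throw HRException("...")
        return true;
    }
    catch (int)
    {
        return false;
    }
}
```

In RequestHeaders the cleanup action would be something like closing hSocket if it is not INVALID_SOCKET.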
The RequestHeaders function has the following tasks:
Resolve the hostname to its IP.
Create a socket.
Connect the socket to the remote host.
Send the HTTP request data.
Receive data and print it until the other side closes the connection.
Clean up.
I will show you how to implement each step in the next sections.
8. Resolving the hostname to its IP
To connect to the server, we need to fill a sockaddr_in structure with its address. As I said earlier,
this structure consists of an address family (always AF_INET), an IP number and a port number.
Although the port number is not always 80 for web servers, we will assume it is. I also explained that
gethostbyname can be used to look up a hostname at the DNS server and retrieve its IP number. The
next function of our program, FindHostIP, uses this winsock function.
Note that looking up a host involves a request to a DNS server so it might take some time (typically
only 10 milliseconds or so but that's slow compared to normal code). If the hostname isn't found, it
might even take seconds. Because we are using blocking sockets, the program will simply hang on
gethostbyname until it either succeeds or fails. While gethostbyname is running, we have no control
over our program. But as the program is a console program, this doesn't matter.
IPNumber FindHostIP(const char *pServerName)
{
    HOSTENT *pHostent;
    // Get hostent structure for hostname:
    if (!(pHostent = gethostbyname(pServerName)))
        throw HRException("could not resolve hostname.");
    // Extract primary IP address from hostent structure:
    if (pHostent->h_addr_list && pHostent->h_addr_list[0])
        return *reinterpret_cast<IPNumber*>(pHostent->h_addr_list[0]);
    return 0;
}
Gethostbyname takes a hostname as its single parameter and returns a pointer to
a hostent structure. Note that it cannot handle hostnames that are IPs in string form (like
"101.102.103.104"). Therefore our program does not accept an IP number as server name in the first
parameter. If you want to allow this, the string can be converted into a number with
the inet_addr function.
If the function fails it returns NULL, which is the first thing we check. It means the server name
could not be resolved. If it did succeed, we now have a hostent structure pointer. This
memory doesn't need to be freed; winsock keeps a piece of memory for each thread specifically for
storing this data. However, this does imply that after the next call to gethostbyname, you cannot
use the hostent structure returned by a previous call to it, since it will have been overwritten.
The hostent structure can contain a list of addresses, which do not necessarily have to be IP
numbers. Since we use TCP/IP, they will be IP numbers but the structure still has to support other
forms of addresses. The h_addr_list member of hostent points to a null-terminated array of other
pointers. Each pointer in that array points to an address. Since the hostent structure does
not know the type of addresses used, you need to cast the pointers to the right type, in this case
IPNumber*. The FindHostIP code extracts the first available IP address from this structure and
returns it. Some additional pointer checks ensure that the program doesn't crash if the pointers are
not set or arrays are empty.
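FindHostIP only looks at the first entry, but walking the whole null-terminated pointer array is not much harder. The following sketch takes a char** laid out like h_addr_list and formats every entry as a dotted IPv4 string. ListIPv4Addresses is a name I made up for the illustration, and it assumes every entry points to 4 address bytes in network byte order, which holds for AF_INET results.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Formats every IPv4 address in a null-terminated array of pointers laid
// out like hostent::h_addr_list. Each entry is assumed to point to 4 bytes
// in network byte order (most significant byte first).
std::vector<std::string> ListIPv4Addresses(char **addrList)
{
    std::vector<std::string> result;
    if (!addrList)
        return result;
    // The array ends at the first null pointer.
    for (char **entry = addrList; *entry != nullptr; ++entry)
    {
        const unsigned char *b = reinterpret_cast<const unsigned char *>(*entry);
        char text[16]; // "255.255.255.255" plus terminator
        std::snprintf(text, sizeof(text), "%u.%u.%u.%u",
                      static_cast<unsigned>(b[0]), static_cast<unsigned>(b[1]),
                      static_cast<unsigned>(b[2]), static_cast<unsigned>(b[3]));
        result.push_back(text);
    }
    return result;
}
```

With a hostent pointer from gethostbyname you would pass its h_addr_list member. Remember that the data is only valid until the next gethostbyname call on the same thread.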
The return value of this function, the IP number in network byte order, is used by FillSockAddr:
void FillSockAddr(sockaddr_in *pSockAddr, const char *pServerName, int portNumber)
{
// Set family, port and find IP
pSockAddr->sin_family = AF_INET;
pSockAddr->sin_port = htons(portNumber);
pSockAddr->sin_addr.S_un.S_addr = FindHostIP(pServerName);
}
All it does is call FindHostIP and store the IP in the sockaddr_in structure pointed to by the
pSockAddr parameter. It also converts the port number from the portNumber parameter to network
byte order and stores it as well.
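Network byte order simply means that the most significant byte comes first. The sketch below shows what that means for a port number without using htons itself. PortBytes, PortToNetworkBytes and SwapBytes16 are made-up names for this illustration; in real code you just call htons.

```cpp
#include <cstdint>

// The two bytes of a port number as they appear in network (big-endian)
// order: most significant byte first.
struct PortBytes
{
    std::uint8_t first;
    std::uint8_t second;
};

constexpr PortBytes PortToNetworkBytes(std::uint16_t port)
{
    return PortBytes{ static_cast<std::uint8_t>(port >> 8),
                      static_cast<std::uint8_t>(port & 0xFF) };
}

// Swaps the two bytes of a 16-bit value. This is the work htons has to do
// on a little-endian machine such as x86.
constexpr std::uint16_t SwapBytes16(std::uint16_t value)
{
    return static_cast<std::uint16_t>((value << 8) | (value >> 8));
}
```

Port 80 is 0x0050, so on the wire it is the byte 0x00 followed by 0x50. A little-endian machine stores the value the other way around in memory, which is why the swap is needed there.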
Back in the RequestHeaders function, we call FillSockAddr to fill in our local sockaddr_in structure
with the right information:
// Lookup hostname and fill sockaddr_in structure:
cout << "Looking up hostname " << pServername << "... ";
FillSockAddr(&sockAddr, pServername, SERVER_PORT);
cout << "found.\n";
9. Creating a socket
The next step is to create a socket to connect with. This is quite simple, just call socket with the
right parameters:
// Create socket
cout << "Creating socket... ";
if ((hSocket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) == INVALID_SOCKET)
throw HRException("could not create socket.");
cout << "created.\n";
If socket fails, it returns INVALID_SOCKET. In that case, no further operations are performed and
the following cleanup code (after the catch() handler) is executed directly:
if (hSocket!=INVALID_SOCKET)
    closesocket(hSocket);
The cleanup code is always executed, whether an error occurred or not. It first checks whether the
socket handle is INVALID_SOCKET (no socket was created). If it isn't, the socket handle is valid and
needs to be closed.
10. Connecting the socket
Now that we have the socket, we can connect it to a remote host with connect. Connect uses the
sockaddr_in structure we've set up earlier with FillSockAddr and attempts to connect the given
socket to the addressed host. Here too, connect will block until a connection has been established
or something went wrong. The return value of connect is zero if the socket is connected, otherwise
it's SOCKET_ERROR. Before actually connecting, a message is printed with the IP and port number of
the remote host. The inet_ntoa function is used to convert the numeric IP into a string with the IP
in dotted format.
// Connect to server
cout << "Attempting to connect to " << inet_ntoa(sockAddr.sin_addr)
<< ":" << SERVER_PORT << "... ";
if (connect(hSocket, reinterpret_cast<sockaddr*>(&sockAddr), sizeof(sockAddr))!=0)
throw HRException("could not connect.");
cout << "connected.\n";
11. Sending the request
When the socket is connected, the HTTP request can be sent. It is sent in three parts, to easily
insert the hostname inside the request:
HEAD / HTTP/1.1 <crlf>
Host: www.google.com <crlf>
User-agent: HeadReqSample <crlf>
Connection: close <crlf>
<crlf>
The send calls are pretty straightforward, each call takes a buffer and sends the specified amount
of bytes from it to the remote host. Send will block until all the data has been sent, or fail and
return SOCKET_ERROR.
cout << "Sending request... ";
// send request part 1
if (send(hSocket, HEAD_REQUEST_PART1, sizeof(HEAD_REQUEST_PART1)-1, 0)==SOCKET_ERROR)
    throw HRException("failed to send data.");
// send hostname
if (send(hSocket, pServername, lstrlen(pServername), 0)==SOCKET_ERROR)
    throw HRException("failed to send data.");
// send request part 2
if (send(hSocket, HEAD_REQUEST_PART2, sizeof(HEAD_REQUEST_PART2)-1, 0)==SOCKET_ERROR)
    throw HRException("failed to send data.");
cout << "request sent.\n";
Note that the buffer sizes specified are one less than sizeof(buffer), because we don't want to send
the null-terminator at the end of the string.
12. Receiving the response
The final step of the program before cleaning up is to receive data and print it until the other side
closes the connection. The HTTP header "Connection: close" in our request tells the server that it
should close the connection after it has sent its response. Receiving data is done with
the recv function that receives the currently available data and puts it in a buffer. I kept the
example simple by choosing to just dump this output instead of actually doing something with it, so
all we have to do is keep calling recv until the connection is closed. Recv too will block if no data is
available immediately and return if some has arrived. The return value of recv is either 0,
SOCKET_ERROR or the number of bytes read. SOCKET_ERROR of course indicates a socket error, 0
indicates closure of the connection. So basically we will loop until recv returns 0 (connection closed,
done) or SOCKET_ERROR (something went wrong). This leads to the following code:
cout << "Dumping received data...\n\n";
// Loop to print all data
while(true)
{
    int retval;
    retval = recv(hSocket, tempBuffer, sizeof(tempBuffer)-1, 0);
    if (retval==0)
    {
        break; // Connection has been closed
    }
    else if (retval==SOCKET_ERROR)
    {
        throw HRException("socket error while receiving.");
    }
    else
    {
        // retval is number of bytes read
        // Terminate buffer with zero and print as string
        tempBuffer[retval] = 0;
        cout << tempBuffer;
    }
}
Take a look at the call to recv. tempBuffer is the buffer that will receive the data. As the size of the
buffer, we specify its actual size minus one. This is because we will put a 0 byte after the last byte
received to transform the raw data into a null-terminated string we can easily print. Note that in
general, it is perfectly possible to have a 0 byte in the received data since TCP/IP data is not
restricted to text. You'll have to treat it as binary data. However, the HTTP protocol does not allow
0 bytes in an HTTP response message (only text) so this won't happen. Even if it did happen, the
string would just be printed wrong (the 0 byte would be wrongly seen as the terminator), and it isn't
likely to happen unless the HTTP server is bad (or the server is not an HTTP server). What this comes
down to is that this is just a quick and dirty way to print all the received data that works fine for
correct HTTP HEAD responses. If you actually did something with the data, more care would need to
be taken (for example, a 0 byte in the received data must not be seen as a terminator but indicates
a bad HTTP server).
13. Cleaning up
Finally, the socket is closed (if it was created) as shown earlier and the RequestHeaders function will
return true or false depending on the success of the function. Back in the main function, winsock will
be cleaned up (WSACleanup) and the program quits after printing a last message.
14. Finished!
That's all, the program is finished.
Download the source zip file here: http://www.madwizard.org/download/winsock/headreq_cpp.zip
The zip file contains the source files and the binary executable.
Blocking sockets: server
Now that you've seen how a blocking client works, it's time for the blocking server example. This
chapter will explain how to build a simple server that ROT13 encodes the received data and then
sends it back. ROT13 (rot stands for rotate) is a very simple encryption method, a variant of the
cipher used by Caesar. Each letter of the alphabet is replaced by the letter 13 positions farther (the
letters rotate 13 places, wrapping around at the end). The encryption is symmetric, that is, encrypting
works exactly the same as decrypting. You can use rot13.com if you want to play with it.
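The rotation itself takes only a few lines. This is a minimal sketch of ROT13 on ASCII letters; Rot13Char and Rot13 are names I chose for the illustration and the server in this chapter may implement it differently.

```cpp
#include <string>

// Rotates a single character 13 places within its own alphabet (upper or
// lower case). Anything that is not an ASCII letter is returned unchanged.
char Rot13Char(char c)
{
    if (c >= 'a' && c <= 'z')
        return static_cast<char>('a' + (c - 'a' + 13) % 26);
    if (c >= 'A' && c <= 'Z')
        return static_cast<char>('A' + (c - 'A' + 13) % 26);
    return c;
}

// Applies ROT13 to every character of the text and returns the result.
std::string Rot13(std::string text)
{
    for (char &c : text)
        c = Rot13Char(c);
    return text;
}
```

Because the alphabet has 26 letters, rotating by 13 twice brings every letter back to where it started, which is why the same function both encrypts and decrypts.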
1. Program flow
The program flow is as follows:
The server creates a server socket
The server socket is bound to an address
The server socket is put into the listening state
On connection attempt, the connection is accepted and a client socket is available
The client socket is read, every byte is ROT13'd and sent back.
When the client closes the connection, the program ends
2. Framework
The framework is almost the same as that of the blocking client example from chapter 6, only the
HRException has been renamed to ROTException and some include files were added:
#include <iostream>
#include <string>
#include <sstream>
#define WIN32_LEAN_AND_MEAN
#include <winsock2.h>
#include <windows.h>
using namespace std;

class ROTException
{
public:
    ROTException() :
        m_pMessage("") {}
    virtual ~ROTException() {}
    ROTException(const char *pMessage) :
        m_pMessage(pMessage) {}
    const char * what() { return m_pMessage; }
private:
    const char *m_pMessage;
};

int main(int argc, char* argv[])
{
    // main program
}
3. Constants and global data
The program uses a few constants for the default server port number (4444), the required winsock
version and receive buffer size.
const int REQ_WINSOCK_VER = 2; // Minimum winsock version required
const int DEFAULT_PORT = 4444;
const int TEMP_BUFFER_SIZE = 128;
4. The main function
The main function too has a lot in common with the blocking client from the previous chapter:
int main(int argc, char* argv[])
{
    int iRet = 1;
    WSADATA wsaData;
    cout << "Initializing winsock... ";
    if (WSAStartup(MAKEWORD(REQ_WINSOCK_VER,0), &wsaData)==0)
    {
        // Check if major version is at least REQ_WINSOCK_VER
        if (LOBYTE(wsaData.wVersion) >= REQ_WINSOCK_VER)
        {
            cout << "initialized.\n";
            int port = DEFAULT_PORT;
            if (argc > 1)
                port = atoi(argv[1]);
            iRet = !RunServer(port);
        }
        else
        {
            cerr << "required version not supported!";
        }
        cout << "Cleaning up winsock... ";
        // Cleanup winsock
        if (WSACleanup()!=0)
        {
            cerr << "cleanup failed!\n";
            iRet = 1;
        }
        cout << "done.\n";
    }
    else
    {
        cerr << "startup failed!\n";
    }
    return iRet;
}
Winsock is initialized, and cleaned up again when the program is finished. In between is the server
startup code. The program allows an optional parameter that specifies the port the server should
run on. If it is not set, the default port number is used (4444). Finally, RunServer is called with the
final port number as its parameter. The RunServer function contains the actual server code.
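One thing to be aware of is that atoi gives no error indication: for a parameter like "abc" it simply returns 0, and it does not check the range either. If you want to reject such input, a stricter parser is easy to write. The sketch below uses strtol from the standard library; ParsePort is a made-up helper and is not part of the example program.

```cpp
#include <cerrno>
#include <cstdlib>

// Parses a decimal TCP port number in the range 1..65535.
// Returns false for empty input, trailing garbage or out-of-range values;
// port is only written on success.
bool ParsePort(const char *text, int &port)
{
    if (!text || *text == '\0')
        return false;
    char *end = nullptr;
    errno = 0;
    long value = std::strtol(text, &end, 10);
    // *end must be the terminator, otherwise not all input was a number.
    if (errno != 0 || *end != '\0' || value < 1 || value > 65535)
        return false;
    port = static_cast<int>(value);
    return true;
}
```

In main, you could call ParsePort(argv[1], port) and print an error instead of starting the server when it returns false.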
5. RunServer
RunServer is the function where the server is setup and connections are accepted. The basic
framework of this function is:
bool RunServer(int portNumber)
{
    SOCKET hSocket = INVALID_SOCKET,
           hClientSocket = INVALID_SOCKET;
    bool bSuccess = true;
    sockaddr_in sockAddr = {0};
    try
    {
        // Create socket
        cout << "Creating socket... ";
        if ((hSocket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) == INVALID_SOCKET)
            throw ROTException("could not create socket.");
        cout << "created.\n";
        // code goes here
    }
    catch(ROTException &e) // catch by reference to avoid copying the exception
    {
        cerr << "\nError: " << e.what() << endl;
        bSuccess = false;
    }
    // Cleanup: close both sockets if they were created
    if (hSocket!=INVALID_SOCKET)
        closesocket(hSocket);
    if (hClientSocket!=INVALID_SOCKET)
        closesocket(hClientSocket);
    return bSuccess;
}
A server socket is created in the usual way, and a variable to hold the client socket is reserved
too. The client socket doesn't have to be created, since winsock will do that for us later. You do
have to close both socket handles when you no longer use them though; this is done in the
cleanup part at the end of the code.
6. Binding the socket
After the socket is created, we will bind it to an address. As I've explained in the first chapters, a
server listens on a specific port number and possibly on a specific IP number as well. Before you
can let the server socket listen, it must be bound. The winsock function bind will do that for you. In
this example, the socket will be bound to the port specified by the portNumber parameter of
RunServer; the IP number is set to INADDR_ANY, indicating that the server will listen on all
available IP numbers. To bind a socket with bind, you need to fill in a sockaddr_in structure with
the address you want the socket to be bound to. Setting up this structure is done in a separate
function called SetServerSockAddr:
void SetServerSockAddr(sockaddr_in *pSockAddr, int portNumber)
{
// Set family, port and IP (INADDR_ANY: all local addresses)
pSockAddr->sin_family = AF_INET;
pSockAddr->sin_port = htons(portNumber);
pSockAddr->sin_addr.S_un.S_addr = INADDR_ANY;
}
This function is called in RunServer in the following way:
// Bind socket
cout << "Binding socket... ";
SetServerSockAddr(&sockAddr, portNumber);
if (bind(hSocket, reinterpret_cast<sockaddr*>(&sockAddr), sizeof(sockAddr))!=0)
throw ROTException("could not bind socket.");
cout << "bound.\n";
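The htons call in SetServerSockAddr is needed because sin_port must be stored in network byte order (most significant byte first), while x86 machines store integers with the least significant byte first. The sketch below illustrates the byte layout that htons guarantees. PortToNetworkBytes is a hypothetical function written only for this illustration; real code should keep using htons.

```cpp
#include <cstdint>

// Hypothetical illustration (not in the tutorial's source): splits a 16-bit
// port into two bytes in network (big-endian) order, independent of the
// byte order of the machine it runs on.
void PortToNetworkBytes(std::uint16_t port, unsigned char out[2])
{
    out[0] = static_cast<unsigned char>(port >> 8);   // most significant byte first
    out[1] = static_cast<unsigned char>(port & 0xFF); // least significant byte second
}
```

For the default port 4444, which is 0x115C in hexadecimal, the two bytes are 0x11 followed by 0x5C.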
7. Letting the socket listen
If the binding succeeds, the socket is put into listening mode. As soon as it's in this state, any
client can make a connection attempt to the server on the port the socket is bound to. Setting
the listening mode is simply done by calling the listen winsock function:
// Put socket in listening mode
cout << "Putting socket in listening mode... ";
if (listen(hSocket, SOMAXCONN)!=0)
throw ROTException("could not put socket in listening mode.");
cout << "done.\n";
Listen has two parameters. The first is the socket you want to put in listening mode, the second is
the maximum length of the queue of pending connections, i.e. the number of connections that
winsock will hold pending until your program accepts them. Passing SOMAXCONN lets winsock pick
a reasonable maximum, which is usually okay. You probably don't need to worry about this value
most of the time.
8. Accepting connections
When the server socket is in the listening state, you need to accept the incoming connections
using the accept function. The accept function blocks until a connection request comes in,
establishes the connection and then returns a client socket handle. It is important to know that
the server socket's only purpose is now to listen for connections and accept them. As soon as you
accept a connection, a new socket is created by winsock. This socket is usually called the client
socket and that's the socket you will be receiving and sending data on. This often confuses
winsock beginners: some try to receive or send data on the listening socket, while they should use
the client socket.
Besides accepting a connection and returning a client socket handle, accept also fills in a
sockaddr_in structure with information about the client. Our example will use this information to
print a short description of the client that connected (in the form IP:port).
sockaddr_in clientSockAddr;
int clientSockSize = sizeof(clientSockAddr);
// Accept connection:
hClientSocket = accept(hSocket,
reinterpret_cast<sockaddr*>(&clientSockAddr),
&clientSockSize);
// Check if accept succeeded
if (hClientSocket==INVALID_SOCKET)
throw ROTException("accept function failed.");
cout << "accepted.\n";
// Handle the accepted connection:
HandleConnection(hClientSocket, clientSockAddr);
The above code calls accept, and then hands both the client socket handle and the
sockaddr_in structure to a new function, HandleConnection, which will deal with the connection.
After this code has executed, RunServer returns and closes the sockets, as shown earlier.
9. HandleConnection
The HandleConnection function handles the connection. The first thing it does is show a short
description of the client. A separate function (GetHostDescription) is used to create this
description.
void HandleConnection(SOCKET hClientSocket, const sockaddr_in &sockAddr)
{
// Print description (IP:port) of connected client
cout << "Connected with " << GetHostDescription(sockAddr) << ".\n";
char tempBuffer[TEMP_BUFFER_SIZE];
// todo
cout << "Connection closed.\n";
}
The GetHostDescription function looks like this:
string GetHostDescription(const sockaddr_in &sockAddr)
{
ostringstream stream;
stream << inet_ntoa(sockAddr.sin_addr) << ":" << ntohs(sockAddr.sin_port);
return stream.str();
}
We will now write the part marked as 'todo' in the above HandleConnection framework. The
function should loop on recv until the connection is closed (recv returns 0). Every time data is
received, it is ROT13 encoded and sent back to the client. First of all, a simple function is written
to deal with the ROT13 encryption:
void rot13(char *pBuffer, int size)
{
for(int i=0;i<size;i++)
{
char c = pBuffer[i];
if ((c >= 'a' && c < 'n') || (c >= 'A' && c < 'N') )
c += 13;
else if ((c>='n' && c <= 'z') || (c>='N' && c <= 'Z'))
c -= 13;
else
continue;
pBuffer[i] = c;
}
}
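A handy property of ROT13 is that applying it twice gives back the original text, so the same function both encodes and decodes. The sketch below repeats the rot13 function so that it compiles on its own, and adds Rot13String, a hypothetical wrapper that exists only to make the function easy to try out on a string.

```cpp
#include <string>

// Same algorithm as the rot13 function above, repeated so this sketch is self-contained
void rot13(char *pBuffer, int size)
{
    for (int i = 0; i < size; i++)
    {
        char c = pBuffer[i];
        if ((c >= 'a' && c < 'n') || (c >= 'A' && c < 'N'))
            c += 13;
        else if ((c >= 'n' && c <= 'z') || (c >= 'N' && c <= 'Z'))
            c -= 13;
        else
            continue; // leave non-letters untouched
        pBuffer[i] = c;
    }
}

// Hypothetical convenience wrapper (not in the tutorial's source)
std::string Rot13String(std::string text)
{
    if (!text.empty())
        rot13(&text[0], static_cast<int>(text.size()));
    return text;
}
```

Rot13String("Hello, World!") yields "Uryyb, Jbeyq!", and passing that result through the function again returns "Hello, World!".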
With rot13 in place, the main loop is simple. First, recv is called to receive data from the client.
Next, the rot13 function is called to encrypt the received data, and finally send is used to send the
encrypted data back to the client:
// Read data
while(true)
{
int retval;
retval = recv(hClientSocket, tempBuffer, sizeof(tempBuffer), 0);
if (retval==0)
{
break; // Connection has been closed
}
else if (retval==SOCKET_ERROR)
{
throw ROTException("socket error while receiving.");
}
else
{
/* retval is the number of bytes received.
rot13 the data and send it back to the client */
rot13(tempBuffer, retval);
if (send(hClientSocket, tempBuffer, retval, 0)==SOCKET_ERROR)
throw ROTException("socket error while sending.");
}
}
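The loop above assumes that one send call transmits the whole buffer. The return value of send is the number of bytes actually sent, and the winsock documentation allows this to be less than the number requested, so a more defensive version keeps sending until everything is out. The sketch below shows that loop. SendAll is a hypothetical helper, and it takes the sending function as a parameter (an assumption made here so the loop can be tried without a real socket).

```cpp
// Hypothetical helper (not in the tutorial's source): calls sendFn repeatedly
// until all 'size' bytes of pData have been sent. sendFn(data, length) must
// return the number of bytes it sent, or a negative value on error.
template <typename SendFn>
bool SendAll(SendFn sendFn, const char *pData, int size)
{
    int sent = 0;
    while (sent < size)
    {
        int retval = sendFn(pData + sent, size - sent);
        if (retval <= 0)
            return false; // error, or no progress was made
        sent += retval;   // skip the part that is already sent
    }
    return true;
}
```

In HandleConnection, sendFn could be a lambda that calls send(hClientSocket, p, n, 0), with a false result handled the same way as SOCKET_ERROR in the loop above.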
10. Testing
That's all, it should work now. To test the program, you could use the telnet client supplied by
Windows, but you have to get the settings right. If you switch off the local echo you don't see
what you type but you do see the data you receive. This means you see the encrypted text
directly. However, I prefer a better client called PuTTY, which you can find here. I recommend
downloading it as well. Compile the program, run it and you will (hopefully) see a message that the
program is waiting for a connection:
X:\asm\rot13server>rot13server
Initializing winsock... initialized.
Creating socket... created.
Binding socket... bound.
Putting socket in listening mode... done.
Fire up PuTTY, and in the configuration screen, type in localhost as the hostname and 4444 as the
port number (or a different one if you chose to run the program on some other port). Set the
protocol to Raw. Finally press Open to connect.
You will now see PuTTY's console window. Here you can type text that will be sent to the server.
Any data received will be printed in the same window. Note: PuTTY by default has local line editing
enabled. This means that you can type and even edit the text you type as long as you stay on the
same line, since it's not sent until you press enter. If you use a client that immediately sends
every character, you also get a response immediately. If you have such a client you should disable
local echo (i.e. showing the text you type), otherwise you get your text and the received text
interleaved, which is pretty hard to read. This is not the case with PuTTY. Here's a screenshot of
the connection in action:
11. Source code
Finally, the source code:
Download the source zip file here:
12. Conclusion
Now you've seen both a blocking client and a blocking server. Blocking sockets are relatively easy
to use because they fit nicely into the program flow. Still, you've only seen pretty simple examples,
since both the client and the server we showed did practically nothing with the data other than
print it or, in this case, encrypt it and send it back. It gets harder when we have to extract
meaningful information from the received data, as when dealing with a protocol like POP3.
2010 by Thomas Bleeker (MadWizard)
http://www.madwizard.org/download/winsock/rot13server_cpp.zip
