
A White Paper Overview of FTP and Firewalls

Executive Summary
This paper will discuss the common Internet application “FTP” and its interaction with equally-common Internet
Firewalls. In general, non-technical terms, FTP is used to transfer files from one machine to another. Its download
capability has been incorporated into most common web-browsers as one method of point-and-click downloading.
And, in non-technical terms, Firewalls are typically used to restrict the Internet traffic that can reach a specific set
of machines with the intent of protecting those machines from troublesome and potentially malicious Internet traffic. Some
Internet services pose special difficulties for firewalls, especially firewalls that also provide Network Address Translation
(NAT) capability, which causes them to be identified as “problem services”. This paper focuses on the oldest of the “problem
services” – the File Transfer Protocol or FTP.
Because FTP was designed for ease of use in an era when network security was a distant consideration, its design
is often at cross-purposes with the restrictions imposed by modern firewalls. Careful examination of these interactions, and
consideration of ways in which careful firewall design can avoid or minimize most of the problems, can provide a basis for
developing methods of accommodating similar limitations that appear as new “problem services” come into being.
Background
FTP was one of the original TCP/IP services to be commissioned by DARPA in the late 1960’s, and it remains
one of the most heavily used Internet applications.1 It predates the Web’s popular HTTP protocol, for example, by over 20
years -- the first actual HTTP transfer across the Internet was on Christmas day 1990.2 Yet FTP is not unique in that it
substantially predates modern firewalls, so why is it more of a problem than other “ancient” services, such as telnet,
SMTP, and DNS? To understand that, a high-level understanding of how firewalls restrict Internet traffic is necessary.
For our purposes here, a “firewall” is a network element that receives IP datagrams and examines them, then
decides whether to pass them on to their destinations; in other words, it performs IP packet filtering. A firewall may be part
of a router or gateway, serving to protect a collection of computers on a Local Area Network, or it could be a software
process running on the same computer it protects (sometimes called a “personal firewall”).
All Internet data is transacted in the form of IP datagrams which I’ll frequently refer to as “IP packets”.
Essentially, these packets are comprised of two pieces:
♦ a header portion that contains (among other things) information about where the packet came from (its
source address) and where it is going to (its destination address)
♦ a contents portion, or “payload” that consists of the actual data being delivered from source to destination

The distinction is akin to the difference in ordinary paper letter mail between the outside of the envelope, which
contains the return address and the destination address (and a few other things), and the inside of the envelope, which holds
the contents of a message. Since the contents portion is commonly much larger than the header, and since the contents portion
varies considerably from service to service while the header does not, most firewalls simply examine the header associated
with every data packet, and from that make a yes/no decision about delivering the contents of the envelope.

Copyright © 2000, 2001 Echogent Systems, Inc.
Following this analogy, the firewall asks the TCP/IP equivalents of questions like:
♦ Is the letter expected?
♦ Is it being sent specifically to us?
♦ Is the post-mark valid?
♦ Is it being sent from a known source of junk-mail?
For every 10 million letters you receive, one of them may be a mail-bomb -- preventing just that one is well worth
the trouble of examining the other 10 million.
This simplest form of firewalling – packet filtering based on TCP/IP header examination – can work very
effectively, even on hardware with very limited CPU power and memory. Work on the Linux Router Project
(www.LinuxRouter.org) has shown that a 486 processor can easily apply a firewall ruleset with 50 packet examination
rules to a typical WAN connection (cable modem, DSL, T-1) without breaking a sweat.
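
To make "header examination" concrete, here is a minimal Python sketch of pulling out the handful of fields such a filter looks at. It assumes a raw IPv4 packet carrying TCP; the function names and the demo addresses are my own illustrations, not part of any firewall product.

```python
import socket
import struct


def parse_headers(packet: bytes) -> dict:
    """Extract the fields a header-only packet filter would examine."""
    # Low nibble of the first byte is the IP header length in 32-bit words.
    ip_header_len = (packet[0] & 0x0F) * 4
    protocol = packet[9]  # 6 means the IP payload is a TCP segment
    info = {
        "protocol": protocol,
        "src_ip": socket.inet_ntoa(packet[12:16]),
        "dst_ip": socket.inet_ntoa(packet[16:20]),
    }
    if protocol == 6:
        # The TCP header follows the IP header; its first 4 bytes are the ports.
        src_port, dst_port = struct.unpack("!HH", packet[ip_header_len:ip_header_len + 4])
        info["src_port"] = src_port
        info["dst_port"] = dst_port
    return info


def build_demo_packet() -> bytes:
    """Build a 40-byte demo packet (checksums left at zero; illustration only)."""
    ip_header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 40, 0, 0, 64, 6, 0,
        socket.inet_aton("192.168.100.1"),
        socket.inet_aton("10.0.0.2"),
    )
    tcp_header = struct.pack("!HHLLBBHHH", 5150, 21, 0, 0, 5 << 4, 0x02, 8192, 0, 0)
    return ip_header + tcp_header


print(parse_headers(build_demo_packet()))
```

Note that nothing past the two headers is ever read, which is why this style of filtering is so cheap.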
But every silver lining has its cloud: configuring such a header-examining, packet-filtering firewall can be
especially challenging when the service involved is FTP.

A Word on Port Numbers


In the header portion of an IP packet, in addition to the source and destination IP addresses (commonly seen
things such as 192.168.100.1), there’s a field indicating what sort of datagram is being carried. For FTP, the datagram is an
encapsulated TCP packet (there are many other types of IP packets). This TCP packet has its own TCP header which
contains the source and destination “port number” for the data. Servers (e.g., a web-server or ftp-server) and their
associated software processes “listen” on specific ports which are known to the server and the caller beforehand. The list of
standardized services and their associated port numbers is maintained by the Internet Assigned Numbers Authority
(www.IANA.org). They’ve broken up the range of all possible port numbers into three distinct sub-ranges:
♦ All standard services have an associated official port number. Called “Well Known” or “System” ports,
they span the range of 1 to 1023. Typically the services which listen to this low-numbered port range run at
root-level privilege on the server itself. Troublesome or malicious code is often most interested in attacking
systems and their services in this range, for a victory means root-level control over the conquered system.
♦ Above this low range, there is the very large “Registered” range of port numbers, also known as “User”
ports, which span from 1024 to 49151. Many common applications that run without root-privilege, such as
Napster, are active in this range, as are the client applications, such as a web-browser, which initiate
connections to the above System ports.
♦ At the top there is the range of “Dynamic” port numbers, or Private port range, from 49152 to 65535. This
port range is usually utilized by transparent networking processes which dynamically rewrite the TCP/IP header
information, such as what is done during Network Address Translation.
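
The three sub-ranges can be captured in a few lines of Python. This is only a sketch: the function name and the short labels are my own, and I treat port 0 as part of the System range.

```python
def port_range(port: int) -> str:
    """Name the IANA sub-range that a TCP port number falls into."""
    if not 0 <= port <= 65535:
        raise ValueError("port numbers are 16-bit: 0 to 65535")
    if port <= 1023:
        return "system"    # "Well Known" ports
    if port <= 49151:
        return "user"      # the large middle range
    return "dynamic"       # "Private" ports


for example in (21, 80, 5150, 65150):
    print(example, port_range(example))
```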

Knowing what services use what ports is essential to the creation of firewalls, the design of the rules that guide
their decision making, and the operation of the actual services they protect. For example, suppose I am creating a firewall
designed to protect a web server. I know from the IANA that my web server needs to listen for incoming TCP/IP
connections on ports 80 and 443. So I would need to allow all TCP traffic destined for ports 80 and 443 on the host
running the Web server to pass through the firewall. However, I also know that there would then be 1021 other System
ports on that machine which I would not want any traffic to reach. My firewall ruleset – or, more exactly, the part that
applies to System ports -- would consist of two rules:

♦ first, allow anything destined for port-80 or 443 through;


♦ second, block everything else to ports in the range 1-1023.
In addition, the Web server itself never initiates a connection to a Web browser, so this firewall does not need to
distinguish between new incoming TCP connections and replies to existing TCP connections with regard to port-80 or 443.
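
As a rough sketch of how such a two-part ruleset is evaluated, the Python below walks an ordered rule list and stops at the first match. The Rule type, the rule list, and the "no-match" result (meaning "some other part of the ruleset decides") are assumptions made for illustration; System ports are taken to end at 1023.

```python
from typing import List, NamedTuple


class Rule(NamedTuple):
    action: str  # "allow" or "deny"
    low: int     # first destination port covered by the rule
    high: int    # last destination port covered by the rule


# Order matters: the allow rules must come before the blanket deny.
WEB_SERVER_RULES: List[Rule] = [
    Rule("allow", 80, 80),
    Rule("allow", 443, 443),
    Rule("deny", 1, 1023),
]


def decide(dst_port: int, rules: List[Rule] = WEB_SERVER_RULES) -> str:
    """Return the action of the first rule matching the destination port."""
    for rule in rules:
        if rule.low <= dst_port <= rule.high:
            return rule.action
    return "no-match"


for port in (80, 443, 22, 5150):
    print(port, decide(port))
```

Swapping the order of the rules would block the Web server entirely, which is the usual first mistake when writing rulesets by hand.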

If only all services were so easy.

Why FTP is Difficult

Standard Web servers are always listening for incoming data on port-80 (additionally, most standard Web servers
listen for encrypted data on port-443). They never have to listen anywhere else: there's just one port associated with each
variant of this service, and each variant operates independently of the other.

In FTP, however, there are two ports involved in every connection: a standard FTP server listens for incoming
connection requests on port-21, but it transacts the data exchange on another port. This introduces problems because the
data connection on this other port, as well as the actual selection of the port number to use, can be initiated by either the
client or the server. Without knowing the type (technically, the "mode") of the original FTP request, it is difficult for a
firewall to determine the legitimacy of this second connection attempt.

At this point, an illustration is worth a thousand words.

Example #1: FTP as it first was.

In this illustration I am initiating an FTP connection from my PC to a remote FTP server somewhere on the
Internet. There are no firewalls in this example, making it much more informative than realistic. Note the order in which
each side of the connection takes turn speaking, and the port numbers associated with the start and end:

    My PC                                          FTP Server

    5150  ------ start FTP, port 5151 ------>      21
    5150  <--------------- OK ---------------      21
    5151  <------- start DATA transfer ------      20
    5151  ---------------- OK -------------->      20

Here, the client PC initiates an FTP connection by sending out a request from port number 5150, destined for port
21 on the FTP server. As part of the data payload of that packet, the FTP client tells the server, "when you speak to me, use
port-5151". After sending that request, the FTP client gets ready for the data exchange by opening port-5151 and listening
to it.
The FTP server does two things in response. First, it acknowledges the FTP initiation request, sending an "OK"
signal from its port-21 back to the port on the PC that the FTP client application used to begin the connection. Second, it
initiates the actual data transaction, sending data from its port-20 to the client PC's port-5151. After each packet of data
arrives at the PC, its FTP-client application sends an "OK, got it" signal back to the FTP server's port-20.
This works just fine. In fact, in the first few years of FTP, this is how it actually worked (though the "PC" back
then would have been a time-shared mainframe or mini-computer). It is known as "Active-Mode FTP", because the FTP
server actively initiates the creation of the data channel.
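
For the curious, the "when you speak to me, use port-5151" message travels as an FTP PORT command, whose argument is the client's four IP address octets followed by the port number split into a high and a low byte. A minimal Python sketch of that encoding follows; the function name and the example address are mine.

```python
def port_command(ip: str, port: int) -> str:
    """Build the PORT command an Active-Mode client sends on the control channel."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    if not 0 <= port <= 65535:
        raise ValueError("port numbers are 16-bit: 0 to 65535")
    high, low = divmod(port, 256)  # the port travels as two decimal bytes
    return "PORT " + ",".join(octets + [str(high), str(low)])


print(port_command("192.168.100.1", 5151))  # PORT 192,168,100,1,20,31
```

The point to remember is that both the address and the port are written into the payload as text, where a header-only filter never looks.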

Note carefully how the response works: in response to a packet that "starts" a connection (technically a packet
with the SYN flag, part of the TCP header, set), both the server and the client send a reply back to the exact same port from
which the connection originated. This "synchronize" (SYN) and "acknowledgement" (ACK) handshaking is fundamentally
what makes TCP connection oriented. Usefully, the distinction between a SYN and an ACK packet can be detected by a
firewall and can therefore be used as a criterion for making filtering decisions.
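
In terms of the TCP header's flag bits, the test is simple: a packet that opens a new connection has SYN set and ACK clear, while every later packet in the conversation carries ACK. A minimal sketch, with a function name of my own choosing:

```python
TCP_SYN = 0x02  # "synchronize" bit in the TCP flags byte
TCP_ACK = 0x10  # "acknowledgement" bit in the TCP flags byte


def is_new_connection(flags: int) -> bool:
    """True only for the packet that opens a connection (SYN set, ACK clear)."""
    return bool(flags & TCP_SYN) and not (flags & TCP_ACK)


print(is_new_connection(TCP_SYN))            # opening packet
print(is_new_connection(TCP_SYN | TCP_ACK))  # reply to an opening packet
print(is_new_connection(TCP_ACK))            # ordinary traffic on an open connection
```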

Let's now insert a firewall into the diagram, one which protects the FTP server:
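Example #2: Typical Co-Location Setup

    My PC                                      Firewall    FTP Server

    5150  ------ start FTP, port 5151 ------>  [ pass ]    21
    5150  <--------------- OK ---------------  [ pass ]    21
    5151  <------- start DATA transfer ------  [ pass ]    20
    5151  ---------------- OK -------------->  [ pass ]    20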

In this second example, the FTP server is protected from the wilds of the Internet by a packet-filtering firewall. It
has been configured to allow Active-Mode FTP to work: new connections to port-21 on the FTP server are allowed in and
responses to connections (i.e., packets without the SYN flag set), but no new connections, are allowed through to port-20.
Note that the firewall itself is not directly accessed: packets which are destined for the FTP server are explicitly
routed through the firewall.
(This setup is typical of a co-location arrangement such as with Echogent's FTP server (ftp.Echogent.com) co-
located at Exodus (www.Exodus.net). The routing tables at Exodus simply indicate that the default route to our FTP
server's IP address is via the IP address of the external interface of our firewall.)
As with Example #1, everything works fine: the PC initiates an FTP connection, tells the server about port-5151,
and the FTP server then initiates a data connection to that specific port on my PC.
But now suppose that the client PC is itself behind a firewall:

    My PC     Firewall                                      FTP Server

    5150 -->  65150  ------ start FTP, port 5151 ------>    21
    5150 <--  65150  <--------------- OK ---------------    21
    5151      DENY   <------- start DATA transfer ------    20

Example #3: First signs of Trouble

In this third example, the situation is complicated by the insertion of a firewall on the end user's LAN. Here, we
have assumed the use of a type of firewall that is the most common type sold for use in homes and small-businesses.
Specifically, this firewall allows multiple computers to share a single Internet connection via a form of Network Address
Translation (NAT).

This form of NAT on the firewall takes a packet that comes from the LAN-side interface and:
• rewrites the IP header so the source address appears to be the NAT'ing firewall's own external IP address
• rewrites the TCP header to use a source-port number from the firewall’s own Dynamic port range
• maintains internally a table that lets it match up this Dynamic port to the true source address and port, so it
can properly direct responses.
So even though the PC initiated the connection, the packet that's released to the Internet appears to come from the
firewall itself. Done just this way, it will break Active FTP'ing. Here's how:
The client on the PC initiates an FTP connection from PC port-5150. That packet heads through the client-side
firewall and is "NAT'd" so that it is released out onto the Internet with a different return port: e.g., 65150 instead of just
5150. Now there is nothing illegal going on here at all: the NAT'ing router is just conforming to the specification and the
intent of NAT and the IANA's use of that port number range.
The packet arrives at the FTP server's firewall which allows it through, and the FTP server responds with an
acknowledgement. That ACK arrives at port-65150 of the client-side firewall, is NAT'd back into 5150, and sent along to
the PC. So far so good.
The trouble is, the PC has told the FTP server, "when you speak to me, use port-5151" ... and that information was
in the payload of the FTP initiation packet. The FTP server does what it should do and initiates a data connection to
port-5151. But the client-side firewall is bewildered by this connection attempt: from its point of view, it knows nothing about
connections coming in on port-5151 - since it only examines TCP/IP headers, not data payloads. In fact, it cannot
distinguish this incoming connection from someone initiating an illicit connection attempt. The client-side firewall does
what it should do and denies the packet, prematurely and ungently ending the FTP session.
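
The failure can be reproduced with a toy model of the NAT table. The Python sketch below is deliberately simplified: the class and method names are my own, the external address is a documentation placeholder, and a real NAT keys its table on more than the port number. It is enough, though, to show why the server's data connection finds no matching entry.

```python
from typing import Dict, Optional, Tuple


class NatFirewall:
    """Toy model of a header-only NAT'ing firewall."""

    def __init__(self, external_ip: str, first_port: int = 65150) -> None:
        self.external_ip = external_ip
        self.next_port = first_port
        # external port -> (LAN address, LAN port)
        self.table: Dict[int, Tuple[str, int]] = {}

    def outbound(self, lan_ip: str, lan_port: int) -> Tuple[str, int]:
        """Rewrite the source of an outgoing connection and remember the mapping."""
        external_port = self.next_port
        self.next_port += 1
        self.table[external_port] = (lan_ip, lan_port)
        return self.external_ip, external_port

    def inbound(self, dst_port: int) -> Optional[Tuple[str, int]]:
        """Return where to deliver an incoming packet, or None to deny it."""
        return self.table.get(dst_port)


firewall = NatFirewall("203.0.113.5")
print(firewall.outbound("192.168.100.1", 5150))  # control connection leaves as port 65150
print(firewall.inbound(65150))                   # the server's OK finds its way back
print(firewall.inbound(5151))                    # the data connection: no entry, denied
```

The firewall only ever learned about port-5150, because port-5151 was mentioned in the payload and nowhere else.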

Passive-Mode FTP

In the above example, everyone was adhering to common Internet standards, and everyone was doing what they
could for the betterment of Internet security in general, and still things broke. In fact, from the server’s point of view,
nothing changed at all: one day active-mode FTP behind the firewall was fine, the next day it wasn’t working at all.

To mitigate this exact scenario, the Internet Engineering Task Force (www.IETF.org) adopted RFC-765,
specifying Passive-mode (PASV) FTP. In this mode, port-21 is still used for initiation, but it is then the server, not the
client, which specifies the TCP port#, as well as an IP#, for the ensuing data transfer. Additionally, it is the client, not the
server, which initiates the second connection to this provided address to start the data transfer. This requires modification
to the firewall that was protecting the FTP server, as is illustrated below in Example #4:

    My PC                                                   FTP Server
                                                            and Firewall

    5150 -->  65150  -------- start FTP, PASV -------->     21
    5150 <--  65150  <------- OK, IP#, 45678 ----------     21
              65150  ------ start Data Transfer ------>     45678
              65150  <-------------- OK ---------------     45678

Example #4: PASV mode FTP works

Note that in this fourth example, the initial connection from the client contains a "PASV" instruction in the
payload, rather than a port number to be used for the data connection. The server replies with its own IP address and port
number 45678, meaning that the client should now connect to port-45678 to transact the data. The client initiates the data
connection, data flows, and the failure of example #3 has been averted.
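
On the wire, the server's answer to PASV is conventionally a "227" reply carrying six comma-separated numbers: four address octets and the port as a high and a low byte (this is the format given in RFC 959). Here is a minimal Python sketch of how a client might decode it; the function name and the example address are mine.

```python
import re
from typing import Tuple


def parse_pasv_reply(reply: str) -> Tuple[str, int]:
    """Extract the (IP address, port) a server offers in its reply to PASV."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError("no (h1,h2,h3,h4,p1,p2) group found in reply")
    numbers = [int(n) for n in match.groups()]
    ip = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]  # high byte, then low byte
    return ip, port


print(parse_pasv_reply("227 Entering Passive Mode (10,0,0,2,178,110)"))
```

With the numbers above, the client would open its data connection to 10.0.0.2, port 45678 (178 x 256 + 110).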
On the client side, the switch to Passive Mode meant that very little else had to change. And in practice, Web
browsers typically use Passive-Mode FTP for file transfer for this very reason.
On the server side, two things needed to change:
• the FTP server was re-configured to offer a specific range of PASV ports, and
• the firewall was re-configured to accept incoming connections to this specific range of high-numbered ports.
FTP server packages such as ProFTPd (www.ProFTPd.org) or wu-ftpd (www.wu-ftpd.org) allow for exactly this
sort of configurability, and of course firewalls can be modified to allow traffic to reach the needed ports.
So is the problem solved for everyone forever? Is a server-side firewall and FTP package configured to use
Passive-Mode FTP a panacea for firewalled FTP service? Almost, but not quite.
Passive-Mode FTP is a panacea from the client’s point of view. But, all of the examples thus far have presumed
that the FTP server has an IP address that is known to the world. That is, it has a "real", routable IP address, and the
routing tables of the ISP that connects it to the Internet know that its firewall/router is the route to the FTP server. But what
if we wanted to put the FTP server behind a NAT'ing firewall? That is, in addition to using NAT to allow many clients to
share a single IP address, a system owner may also want to designate one of his or her LAN machines as an FTP server,
and have that service available externally. This is where Passive-mode introduces special challenges.
The following examples illustrate a Passive-Mode FTP server behind a NAT'ing firewall. First we'll show it
not working, then we'll fix it.

Example #5: PASV mode FTP behind a NAT’ing firewall (Broken)


    My PC                                  NAT’ing Firewall    FTP Server

          ------ start FTP, PASV ------->  21  ------------->  21
          <----- OK, my IP#, 45678 ------  21  <-------------  21
          ---- start Data Transfer ----->  45678   DENY

In this arrangement, the server-side firewall is NAT'ing (as described earlier) as well as port-forwarding. That is,
the firewall is instructed that all connections arriving at the firewall's port-21 are to be "forwarded" along to port-21 of the
internal FTP server.
Now, to support Active-Mode FTP completely, all that needs to be added is a similar instruction for port-20. This
configuration is surprisingly similar to the earlier co-location model where the firewall was, in a sense, transparent to the
whole FTP process -- the only difference here is that it is the IP address of the firewall itself that the FTP client initiates a
connection to. Through the NAT'ing process, the actual IP address of the FTP server itself is hidden from the outside
world. No, this obfuscation does not mean that NAT'ing is inherently more secure than the earlier model -- regardless of
NAT, normal operation of FTP requires the use of incoming calls.3

But Passive-Mode is another matter entirely. In this case, the port-forwarding process sends the "start PASV"
packet from the FTP client to the FTP server. The FTP server replies, as it should, with an IP address and port number to
connect to. The port-forwarding process is smart enough to "re-forward" the reply packet from the FTP server, so that it
appears to emerge from port-21 of the firewall, and that packet finds its way to the FTP client.
The problems, as you've probably realized already, are in the data portion of that reply packet: First, the FTP
Server sent, by default, its own "un-NAT'd" IP address out in response to the PASV request. Second, the PASV port
number specified by the server (i.e., 45678) is not being port forwarded and so will not find its way through the firewall to
the FTP server.
As with example #4, these two problems require configuration changes be made to the firewall and to the FTP
server. Success is illustrated below:
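    My PC                                  NAT’ing Firewall    FTP Server

          ------ start FTP, PASV ------->  21  ------------->  21
          <----- OK, FW IP#, 45678 ------  21  <-------------  21
          ---- start Data Transfer ----->  45678  ---------->  45678
          <------------- OK -------------  45678  <----------  45678

Example #6: PASV mode FTP behind a NAT’ing firewall (working)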

Now, the FTP server is replying to the PASV command with the IP address of the NAT'ing firewall, not of itself
which is the default. Furthermore, it has had its PASV port range specified, and that entire range of ports has been
configured into the firewall's port-forwarding rules. That is, that whole range is being port-forwarded from the external
interface of the firewall to the same port range on the FTP server.
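
The two configuration changes amount to the following, sketched here in Python rather than in the syntax of any real FTP server or firewall. The function names, the placeholder addresses, and the ten-port range are all assumptions for illustration.

```python
from typing import Dict, Tuple


def pasv_reply(advertised_ip: str, port: int) -> str:
    """Build the server's reply to PASV, advertising the given address and port."""
    high, low = divmod(port, 256)
    return "227 Entering Passive Mode ({},{},{})".format(
        advertised_ip.replace(".", ","), high, low
    )


def forwarding_table(server_ip: str, first_port: int, last_port: int) -> Dict[int, Tuple[str, int]]:
    """Map the firewall's port-21 and its PASV range onto the same ports of the server."""
    table = {21: (server_ip, 21)}
    for port in range(first_port, last_port + 1):
        table[port] = (server_ip, port)
    return table


FIREWALL_EXTERNAL_IP = "203.0.113.5"  # placeholder external address
SERVER_LAN_IP = "192.168.100.2"       # placeholder internal address

# The server advertises the firewall's address, not its own.
print(pasv_reply(FIREWALL_EXTERNAL_IP, 45678))

# The firewall forwards the whole PASV range to the server.
table = forwarding_table(SERVER_LAN_IP, 45670, 45679)
print(table[45678])
```

Both halves must agree: a port the server offers but the firewall does not forward fails exactly as in Example #5.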

Context aware Firewalling


A common theme in each of those PASV solutions is that a large number of ports need to be "opened" and, in
the case of a NAT'ing firewall, subsequently forwarded, at the server end of the FTP connection. That is, the elegance of
Active-mode FTP, in which only 2 TCP ports are opened for any number of clients, has been traded in for a solution in which
the number of open ports equals the maximum number of simultaneous clients we want to support -- we've gone
from opening just two ports to potentially several thousand.
Of course, a malicious user only needs access to one unsecured port to attack the integrity of a system's security,
and so opening several thousand ports is perhaps not the best solution.
As you'll recall, the arrangement that motivated Passive-Mode FTP was similar to this topology:

    My PC     Firewall                                      FTP Server

    5150 -->  65150  ------ start FTP, port 65151 ----->    21
    5150 <--  65150  <--------------- OK ---------------    21
    5151 <--  65151  <------- start DATA transfer ------    20

Example #7: An alternative to PASV?

Recall the problem: in Active-mode FTP, the client indicates the port-number to which it will be expecting the
FTP server to initiate a data connection. If that FTP client is behind a NAT'ing firewall, then that firewall will be the
recipient of that data connection, not the client PC.
But what if that firewall on the client side was smart enough to detect that an FTP session was just initiated? That
is, suppose it "notices" that a LAN member has just initiated a connection to some remote server's port-21. With some
special software, the firewall could break from the mold of looking only at the IP packet header, and actually examine the
IP payload of that outgoing packet. There, it would learn two things:
• Active-mode FTP is being used, and
• the client is expecting a connection on its port-5151.
With this information in hand, the firewall could then dynamically re-write that TCP/IP payload to indicate its
own port-65151. It could then add an associated entry into the NAT'ing tables to route properly the previously troublesome
data connection coming from the server. Rather pleasingly, this approach actually works - not just for FTP, but for other
"problem services" as well.
This process of dynamically updating the firewall based on payload-data initiated from within the LAN is
supported in many firewall products, such as Cisco's (www.Cisco.com) "Context-Based Access Control" (CBAC) or
Checkpoint's (www.Checkpoint.com) context-sensitive I-Gear.
A simpler example is the ip_masq_ftp.o module which is used in many Linux-based firewalls to enable such
dynamic updates specifically for FTP -- other Linux-kernel modules exist for other troublesome services (such as IRC,
ICQ, and H.323). This modular approach is different from a full-blown context awareness in that it allows the firewall
administrator to add only as much functionality as needed and no more.
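
To give a feel for what such a module does, here is a minimal Python sketch of the payload rewrite for an outgoing Active-Mode request. It is not how ip_masq_ftp.o or any particular product is implemented; the function name, the dictionary used as a NAT table, and the placeholder addresses are all my own assumptions.

```python
from typing import Dict, Tuple


def rewrite_port_command(
    line: str,
    external_ip: str,
    nat_table: Dict[int, Tuple[str, int]],
    external_port: int,
) -> str:
    """Rewrite an outgoing PORT command and record where replies should go."""
    parts = line.strip().split(" ", 1)
    if len(parts) != 2 or parts[0].upper() != "PORT":
        return line  # not a PORT command: pass it through untouched
    fields = parts[1].split(",")
    if len(fields) != 6:
        return line
    lan_ip = ".".join(fields[:4])
    lan_port = int(fields[4]) * 256 + int(fields[5])
    # Remember the mapping so the server's data connection can be delivered.
    nat_table[external_port] = (lan_ip, lan_port)
    high, low = divmod(external_port, 256)
    return "PORT {},{},{}".format(external_ip.replace(".", ","), high, low)


nat_table: Dict[int, Tuple[str, int]] = {}
print(rewrite_port_command("PORT 192,168,100,1,20,31", "203.0.113.5", nat_table, 65151))
print(nat_table)
```

One complication a real implementation must also handle: the rewritten command is usually not the same length as the original, so the TCP sequence numbers of that connection have to be adjusted from then on.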

Conclusion
Nothing illustrates the intricacies of FTP and firewalls better than an actual cheat sheet on setting them up to work
together. So let me conclude with just that: a cookbook detailing the configuration steps required to enable either running
or using (i.e., connecting to) an FTP server behind a firewall. In most cases, only the firewall itself needs to be especially
configured. In some cases, as shown, the FTP server itself needs special attention.

Case-1: Running an Active FTP server behind a non-NAT’ing firewall
• Firewall: allow TCP packets with the SYN flag set to reach port-21 of the FTP server
• Firewall: allow TCP packets with the SYN flag cleared to reach port-20 of the FTP server

Case-2: Using an Active FTP server from behind a non-NAT’ing firewall
• Not recommended. Much safer to insist on use of PASV FTP’ing.
• Firewall: allow TCP packets with the SYN flag cleared from any port-21
• Firewall: allow all TCP packets from any port-20 to reach a limited set of high-numbered ports on the client. Again, not recommended.
• Client: configure the client-software to use only the ports specified in the above step

Case-3: Running an Active FTP server behind a NAT’ing firewall
• Firewall: allow TCP packets with the SYN flag set to reach port-21 of the firewall
• Firewall: allow TCP packets with the SYN flag cleared to reach port-20 of the firewall
• Firewall: port-forward all TCP packets from port 20 and 21 to the NAT’d FTP server

Case-4: Using an Active FTP server from behind a NAT’ing firewall
• Firewall: install ip_masq_ftp.o or equivalent

Case-5: Running a Passive FTP server behind a non-NAT’ing firewall
• Firewall: allow TCP packets with the SYN flag set to reach port-21 of the FTP server
• Firewall: allow TCP packets with the SYN flag set to reach a limited set of high-numbered ports on the FTP server
• Server: configure the passive port range offered to use only the ports specified in the above step

Case-6: Using a Passive FTP server from behind a non-NAT’ing firewall
• Nothing special required from the user’s point of view.

Case-7: Running a Passive FTP server behind a NAT’ing firewall
• Firewall: allow TCP packets with the SYN flag set to reach port-21 of the firewall
• Firewall: allow TCP packets with the SYN flag set to reach a limited set of high-numbered ports on the firewall
• Firewall: port-forward all TCP packets from port 21 and this limited set of high-numbered ports to the NAT’d FTP server
• Server: configure the passive port range offered to use only the ports specified in the above step
• Server: configure the passive IP# told to clients to be the external IP# of the firewall

Case-8: Using a Passive FTP server from behind a NAT’ing firewall
• Nothing special required from the user’s point of view.

Credits
The author wishes to thank Ray Olszewski for his contributions and editorial support. Further thanks to Matthew
Schalit, and Jeff Newmiller for their feedback and inspiration, as well as the tireless support of the Linux Router Project
(www.linuxrouter.org) mailing list devotees.
About the Author
This white paper was authored by Scott C. Best, sbest@echogent.com, founder and director of engineering at
Echogent Systems, Inc., a networking software company. Permission is granted to copy, distribute and/or modify this
documentation under the terms of the GNU Free Documentation License (an editable, “transparent” version of this text is
available upon request). Comments, corrections, feedback appreciated.
References
1. Getting Connected, Kevin Dowd, O’Reilly and Associates, 1996, pg 337.
2. Weaving the Web, Tim Berners-Lee, HarperCollins, 1999, pg 30.
3. Firewalls and Internet Security, W.Cheswick and S.Bellovin, Addison Wesley, 1994, pg 57
