You are on page 1of 7

Detection of Encrypted Tunnels across Network

Boundaries
Maurizio Dusi, Manuel Crotti, Francesco Gringoli, Luca Salgarelli
DEA, Università degli Studi di Brescia,
via Branze, 38, 25123 Brescia, Italy
E-mail: <firstname.lastname>@ing.unibs.it

Abstract— The use of covert application-layer tunnels to bypass door to malicious traffic by encapsulating it into clear-text
security gateways has become quite popular in recent years. payloads. This is the case with HTTP tunneling techniques,
By encapsulating blocked or controlled protocols such as peer- which we recently covered in [1]. On the other hand, when the
to-peer, chat and e-mail into others allowed by the security
policies, such as HTTP, SSH or even DNS, both legitimate tunneling mechanism is encrypted, any ALG-based technique
and malicious users can effectively neutralize many security for filtering tunnels becomes completely ineffective. This is
restrictions enforced at the network edge. Traditional firewalling the case with SSH tunnels, which can be activated with any
techniques, based on Application Layer Gateways and even Secure Shell (SSH) [2] implementation.
pattern-matching mechanisms are becoming practically useless These tunnels can be configured to protect cryptographically
as tunneling tools grow more sophisticated.
In this paper we propose an effective solution to this problem any TCP traffic stream between an SSH client (typically the
based on a statistical traffic classification technique. Our mecha- tunnel entry point) and an SSH server (the exit point), thus
nism relies on the creation of a statistical fingerprint of legitimate making the tunneled connection not detectable by existing
usage of a given protocol, such as regular remote interactive payload-based and pattern-matching classifiers employed by
logins or secure copying activities. Such fingerprint can then be ALGs. In this scenario, if network administrators allow SSH
used to detect with high accuracy non-legitimate sessions, i.e.,
sessions that tunnel other protocols. Results from experiments to go through the boundary of their network, they lose any
conducted on a live network suggest that the technique can be control over what the users may port-forward over SSH. On the
very effective, even when the application layer protocol used as other hand, completely blocking SSH connections might not be
a tunnel is encrypted, such as in the case of SSH. possible, for example when the use of such protocol to protect
I. I NTRODUCTION remote-terminal sessions is essential to the job performed by
the users.
The majority of local area networks today enforce security In this paper we present a technique based on the statistical
policies at their boundaries. The policies are usually designed analysis of traffic for the enforcement of network-boundary
to achieve at least two objectives. On one hand, they try security policies. The key idea is that the information carried
to limit the damages to the network that legitimate clients by packets at the network layer, such as packet-size and
may cause, for instance by inadvertently downloading virus- inter-arrival time between consecutive packets, are enough
infected e-mails or by leaking secret business information to infer the nature of the application protocol that generated
to the outside world. On the other hand, network-boundary those packets. By characterizing, or fingerprinting, the allowed
security policies can limit or sometimes eliminate troubles protocols when used in their native form, we show that it is
caused by worms and other types of malware. These network possible to detect with great accuracy when they are being
pests, whenever installed inside the LAN, could connect to used to tunnel other protocols. The technique is robust enough
the Internet and either waste substantial networking resources to work even when the tunneling mechanism uses encryption
or, even worse, provide uncontrolled communication channels to protect the privacy of the traffic itself, such as in the
between the LAN and the attacker’s computers. case of SSH. The mechanism we present, called “tunnel
Network-boundary security policies are usually imple- hunter”, is empirically derived from a näive Bayes approach. It
mented by combining two types of devices. A firewall controls follows from our previous work [3] and extends the technique
that incoming and outgoing traffic passes through an Applica- described in [1], to deal with encrypted SSH tunnels, and to
tion Level Gateway (ALG), and enables only traffic using ports prove that it is effective at detecting more types of tunneled
allowed by the security policy. ALGs are then responsible for traffic, such as peer-to-peer (P2P).
checking that the nature of the traffic crossing the boundaries The main research contributions of this paper are:
is conforming to policies, and that it is not malicious. • the definition of a statistical technique, based on a pattern
Firewalls and ALGs can become completely ineffective in recognition approach, to detect when a certain encrypted
at least two cases. On one hand, the tunneling mechanism application-layer protocol is being used to tunnel another
can be smart enough to fool the ALG into opening the protocol on top of it;
This work was supported in part by a grant from the Italian Ministry for • the report on a series of experiments carried out on a real
University and Research (MIUR), under the PRIN project RECIPE. network that proves the effectiveness of the technique we
propose here.
The rest of the paper is organized as follows. Section II
describes related work. Section III illustrates the SSH tunnel-
ing technique we have based our work on. In Sections IV
and V we describe our methodology and its application to
real traffic on a live network reporting experimental results.
Finally, Section VII concludes the paper.
II. R ELATED WORK
Fig. 1. How tunnels work: high-level scheme.
The analysis of Internet traffic to detect application-layer
protocols has a long history. The use of Deep Payload Inspec-
tion (DPI) techniques is today very popular for the classifica- of operation is quite similar, and is depicted in Figure 1.
tion of network traffic and is implemented in many Network These tools follow a client-server model: a client host
Intrusion Detection Systems such as BRO and Snort [4], inside the protected network connects to an outside server
[5]. However, the ability to build pattern-matching regular using an application protocol that is allowed by the network-
expressions that can effectively classify tunneled traffic has boundary security policies. Each endpoint then provides the
yet to be demonstrated, and it becomes completely ineffective (de)encapsulation of the tunneled protocol and forwards the
anyway when employed on encrypted tunnels. original data to the actual server or client involved in the
A novel kind of approach that could overcome these im- communication. Full control of the tunnel endpoints is ob-
pairments is based on the analysis of traffic from a statistical viously required, but this is usually not a problem: common
point of view as opposed to the deterministic techniques used setups involve a laptop inside the office and a home computer
in DPI-based approaches. Starting with the seminal studies connected to the Internet through a DSL line, or a computer
of Paxson such as [6], researchers have proposed several on any other network connected to the Internet.
algorithms based on traditional classification techniques such At least three protocols can be used to tunnel Internet traffic
as hierarchical clusterization [7], [8], Nearest Neighbor and at the application-layer: DNS, HTTP and SSH.
Linear Discriminant Analysis [9], and Bayesian Learning Tunnels over DNS are very powerful, since the DNS proto-
Machine [10]. The list of statistical approaches to traffic col is rarely blocked on the Internet: a good implementation
classification is growing quite long, but to the best of our is available at [14]. However, due to its complexity, this
knowledge, none of these techniques has been yet tested technique only allows very low throughput.
on tunneled traffic with the purpose of strengthening the Tunnels over HTTP are quite popular since network admin-
application of network-boundary policies. istrators normally let HTTP traffic pass their network bound-
Recently Levine et al. [11] proposed a statistical technique aries, even though usually filtering it through an application-
that can violate the privacy of encrypted HTTP streams. Given layer proxy and a firewall. However, both legitimate users
a web-site, the authors collect the lengths of the packets and malicious network software (viruses and malware) can
composing the web request: the mean value of each packet violate these policies by using the HTTP protocol to tunnel any
length traces the size profile of the site. In the same way, other application through the network-boundary, as depicted in
they derived the time profile from the inter-arrival times of Figure 1. In our previous work [1] we tested the open source
the same packets. They compare the profiles of several web- package htc/hts [15] and we introduced a mechanism that can
sites to a generic web-request trace: by means of a cross- block all tunneled sessions and detect actual HTTP sessions
correlation measure they infer the similarity of such a trace to with an accuracy of over 99%.
each given profile and show how it is possible to assess the
destination address of the corresponding flow. Liberatore et A. SSH tunnels
al. in [12] follow a similar approach, evaluating the technique The SSH protocol is designed to exchange data between two
with a larger data set of web-profiles. hosts through a secure encrypted connection over an insecure
Another work that deals with privacy in encrypted data network: run on top of TCP, it provides data confidentiality
streams is the one by Wright et al. [13], where it is shown that and integrity. This protocol is typically used for command
encrypted IPSec tunnels which carry only a single application execution through a secure shell. It also supports secure file
protocol leak enough information about the flows in the tunnel copy between peers, and, in addition, tunneling of arbitrary
to allow to precisely assess their number. The main implication TCP connections. Tunneling over SSH, also known as port
of these works is that encryption is not enough to protect forwarding, cryptographically protects otherwise insecure pro-
privacy when statistical mechanisms are used to analyze traffic tocols, increasing data and system security.
flows. While in the case of HTTP tunnels advanced ALGs might
deeply inspect the payload carried by each flow to deduce
III. T UNNEL TECHNIQUES : BASICS some kind of information, in the case of SSH applying
Several mechanisms are available today to build tunnels at well-known signatures to encrypted traffic is completely use-
the application-layer. From a general point of view, their mode less, making tunnels built on SSH very powerful. Network-
password to the server, that replies with an ACK or a NACK.
In the last case, the client has other chances to re-send the
correct password. The public-key method can either require
one or two round-trips: the specifications define an optional
initial exchange where the client could send information on
its public key to the server before sending a signature with its
own private key.
Packets exchanged during the entire SSH authentication
process are not useful to a classifier to detect the type of
session that the user is opening, i.e., whether the session
is tunneling other protocols or is being used for remote
command execution or secure file copy. The classifier can
easily discard all the packets exchanged up to the client’s
SSH MSG NEWKEYS message and the server response, since
Fig. 2. Authentication stage in SSH. they are sent in the clear. It is more difficult to detect when user
authentication ends, and actual data starts being exchanged,
because the second authentication stage is encrypted. In this
boundary security policies that allow the use of SSH for paper we solve this issue assuming that network administrators
remote command execution or secure file copy have no other are required to choose and allow only one kind of SSH user
choice than to accept, with today’s ALGs, that any protocol authentication method to implement tunnel hunter on their
can be tunneled in and out of the intranet by means of SSH networks.
tunnels. We will discuss this and other aspects related to the way
SSH is designed following a client-server model: the server SSH configuration parameters affect our mechanism in Sec-
is usually implemented with a daemon running in background tion IV-E.
and accepting connections to port 22. Data encryption is
ensured by the SSH protocol that, before any traffic can IV. T UNNEL HUNTER
be exchanged, requires the peers to negotiate cryptographic The goal of this work is to outline a statistical technique that
credentials [16], [17]. The authentication procedures of SSH provides a behavioral characterization of an application layer
impact the way tunnel hunter is implemented. Therefore, we protocol to detect tunneling activities. It is important to point
need to go into more details on how they work and how they out that the precision in detecting non-tunneled traffic must
affect our technique. be very high. In other words, the system must minimize the
number of false-positives, intended as the number of legitimate
B. The SSH authentication process flows that are incorrectly blocked. Ideally, no legitimate SSH
An SSH session involves two different authentication phases sessions should be stopped by tunnel hunter.
before client and server can start exchanging data: (i) host The algorithm we present here has been empirically derived
authentication and (ii) user authentication. Figure 2 outlines from a näive Bayes approach, presented in our previous
the SSH authentication process. work [3], where it was used to classify different protocols
Host authentication, as reported in [17], provides strong such as POP3, SMTP and HTTP. Here we exploit the same
encryption, cryptographic host authentication, and integrity basic classification algorithm to detect SSH tunnels.
protection. The key exchange method, public key algorithm, We built our model considering only the packets that carry
symmetric encryption algorithm, message authentication algo- application-layer data: since the technique aims at classifying
rithm, and hash algorithm are all negotiated. It is expected the application-layer, we do not consider packets without TCP
that in most environments only two round-trips are needed payload, and we simply discard them.
for full key exchange, server authentication, service request,
and acceptance notification of service request: in the worst A. Statistical pattern recognition: basic concepts
case, a round-trip more could be required. This authenti- Statistical pattern recognition is built on three main defi-
cation phase is transmitted un-encrypted and ends with an nitions: pattern, feature and class [18]. The pattern is an r-
SSH MSG NEWKEYS message. All messages sent after this one dimensional data vector !x = (x1 , . . . , xr ) of measurements,
must use the negotiated keys and algorithms, and are privacy whose components xi measure the features of an object. The
and integrity protected. feature represents the variable specified by the investigator
User authentication, as reported in [16], is intended to be and holds a key role in classification. The concept of class is
run over the SSH transport layer protocol, i.e., the encrypted used in discrimination: assuming a set of C classes, named
channel derived by the host authentication phase. Public-key ω1 , . . . , ωC , each pattern !x will be associated with a variable
is the only mandatory authentication method, although pass- that denotes its class membership. If the classes are gathered
words are also accepted. Successful password authentication from an a priori training set, that is, the information about how
in SSH requires a single round-trip: the client transmits the to group the data into classes is known and it is not inferred
directly from the data itself, the approach is called supervised, a non-parametric density estimation of the class. Since we
otherwise it is called unsupervised. dispose of the samples, we adopt the histogram method. This
Given a set of data and a set of features, the goal of a pattern method partitions the r-dimensional space of the class into a
recognition technique is to represent each element as a pattern, number of equally-sized cells and estimates the density at a
and to assign the pattern to the class that best describes it, point !x as follows:
according to chosen criteria. nj
p̂(!x) = # ,
B. Pattern and features definition j∈N nj dV
Our technique considers TCP traffic as classification target where nj is the number of samples in the cell of volume
and it aims at detecting tunnel activities over the SSH applica- dV that straddles the point !x, N represents cells and dV
tion protocols. The technique models a legitimate behavior of is the volume of the cell. If the pattern !x is r-dimensional,
an application protocol by extracting basic statistical features the approach requires N r cells. According to our features
from the TCP session that carries it. The features are gathered definition, N is a matrix of 1461x1001 elements, replicated
directly at the IP-level and the pattern is derived straight from for all the considered r pairs. Although we can act on the
the flows composing the TCP session. We define the flow as free parameter r for reducing the complexity, we opt for
the unidirectional stream of packets exchanged throughout a not estimating the class-conditional density, finally getting a
TCP session: Fclient is the flow that goes from the initiator complexity of r · N cells. In making this choice, we assume
to the responder, while Fserver is the one going backward. that consecutive pairs of (s, ∆t) variables are independent.
Packets that do not carry TCP payload are discarded: they, in We will see that this assumption not only greatly reduces the
fact, do not introduce any additional information useful for mechanism’s complexity, but also leads to excellent results. We
classification. also fixed the maximum size of the pattern to L elements: each
The features we chose to represent each flow are the packet flow of the training set then gives a contribution to estimate
size s and the inter-arrival time ∆t between two consecutive the first min(r, L) densities, where r is the dimension of the
packets. Therefore, a session can be represented by two pattern associated to the flow, i.e., the length of the sequence
patterns, according to direction: composed of all the (s, ∆t) pairs.
! "
s1 . . . sr There is a problem that affects the histogram method:
!x = ,
∆t1 . . . ∆tr since it provides a discontinuous description of the sample
where r corresponds to the number of packets composing distribution, the value of each cell in the matrix can abruptly
the flow excluding the first, since we can start computing the fall to zero at the boundaries of any populated region. This
∆t after observing the second packet. may mislead the description of the protocol and, even worse,
The variable s is discrete and, on Ethernet links, it assumes does not takes into account “noise” factors such as network
values in the range [40, 1500] bytes. The variable ∆t instead congestion: a variation of the round trip time could introduce
is quantized with law qlog (·) and, depending on the clock noise in the measurements of ∆t and one of the packets of the
resolution of the capture device, spans on a logarithmic scale flow could fall in a zero-valued cell of the matrix even if the
from 10−7 to 103 seconds, with step 10−2 . surrounding cells are populated with non zero values. To solve
The choice of these variables is mainly supported by this issue, we introduce a Gaussian filtering: in literature, this
preliminary works of Paxson [6], [19], and by our recent approach is known as kernel method or, equivalently, as Parzen
work [3], where the effectiveness in inferring the nature of method. Since we prefer to deal with density estimation even
the application protocol starting from these variables is shown, after the filtering operation, matrices are rescaled so they still
albeit with the support of preliminary results. sum up to 1.
We apply this procedure separately to a training set of
C. Class definition: the concept of protocol fingerprint Fclient and Fserver flows, finally getting two different descrip-
With regards to SSH tunnels, the classifier has to decide tions of the traffic class under observation. Each description
whether a certain flow belongs to actual SSH traffic1 , rather composes what we call the fingerprint of the “legitimate” use2
than using SSH to hide other kinds of application protocols. of the SSH protocol, which we indicate with M !.
As a consequence, the design of the classifier has to provide
the characterization of only one class ω1 , the SSH one. In the D. Tunnel detection algorithm
rest of the paper we will generically refer to ω1 as the protocol In its general form, given a set of classes, our classifier
class to characterize. assigns a pattern to the class that minimizes what we called
We gather the class description by means of a supervised the anomaly score:
approach: we collect a set of flows, i.e., a training set, that
we previously certified as belonging to real ω1 traffic. We min(r,L)
$ ε
then convert each flow composing the training set into its S(!x|ωj ) = , (1)
p(xi |ωj ) · min(r, L)
equivalent pattern representation and we use it to provide i=1

1 i.e., used for remote command execution or for copying files. 2 i.e., when used not to tunnel other protocols.
where p(xi |ωj ) represents the probability that the i-th errors during the public-key exchange, etc., n0 should point
element of the flow pattern belongs to class ωj , that is the to different packet indexes.
value Mj (si , ∆ti ). Since the denominator must be non-zero, In this work we solve this issue with a simplification:
we define p(xi |ωj ) as max (ε, Mj (si , ∆ti )), where ε is an we assume that all legitimate, non-tunneling SSH sessions
arbitrarily small quantity that we fixed at 10−12 . As a conse- use a uniform, two-round-trip user authentication phase. This
quence, a single pair falling out from the fingerprint (class) makes the classifier select the third Fclient packet after it
does not make the whole sum go to infinity. By construction, sees the SSH MSG NEWKEYS message as the n0 pair. Any
the anomaly score can assume values in the range [0, 1]. client based on OpenSSH will generate a two-round-trip user
The classification algorithm has to decide whether an un- authentication phase when using public-keys as credentials, as
known flow F is conforming to the legitimate class ω1 : this many other SSH implementations would, so this choice was
task can be accomplished after the so called class ω0 , or reasonable at least in our environment. Clearly, other network
rejected region, has been defined. This region is the com- administrators might want to fix n0 to other values, depending
plementary part of the acceptance region and we expect that on the type of SSH user authentication they need to support.
all the flows not belonging to ω1 will fall into this class. We Having to fix an a-priori value for n0 has an implication
limit the acceptance region by introducing a threshold T in the on the behavior of tunnel hunter that needs to be pointed
classification process and a simple decision rule as follows: out. In case of authentication errors, a further attempt will
% result in a connection that, deviating from the pre-set statistical
ω1 if S(!x|ω1 ) < T,
!x ∈ behavior, is going to be blocked by the classifier. However, this
UNKNOWN otherwise.
should not be a major problem: simply, if the user is entitled
to connect, they will try again.
In general, we do not need to wait the end of the flow to
emit a classification verdict, but limit our observation to L + 1 V. E XPERIMENTAL RESULTS
packets. As we explain in Section V, we tune the threshold
considering that each (s, ∆t) pair gives a contribution in the In this Section we report experimental results that show
range [0, 1/ min(r, L)]: the number of pairs that we let fall the effectiveness of tunnel hunter at detecting tunnel activities
in a zero-valued cell of the fingerprint dictates the threshold over SSH. The aim of tunnel hunter is to detect (block) SSH
bound. sessions that are being used to tunnel other application-layer
protocols: SSH sessions used for remote command execution
E. Dealing with SSH authentication and secure copy, instead, must pass through. We refer to
The algorithm we described above is designed for classify- non-tunneling SSH sessions as “legitimate”. We think this
ing a flow by considering its first packets. In the case of SSH is a common scenario with enterprise network manager: on
tunnels, this choice could be counterproductive: as depicted one hand, they need their users to be able to access remote
in Section III-A, the SSH protocol provides an initial, two- computers to run interactive tasks, or to copy files. On the
phase authentication stage that is independent on the use of other hand, they do not want users to violate security policies
the connection – remote command execution, secure copy or by tunneling other protocols on top of SSH.
port forwarding. We ran all the experiments on the 100Base-TX link that
Therefore, to eliminate the contribution of the whole au- connects the edge router of our campus network to the Internet.
thentication stage and apply tunnel hunter to SSH traffic, we This network serves around one thousand users and it is
introduce the concept of “shifted window”: in this way we composed of several 1000Base-TX layer-2 segments.
can explicitly specify the pairs (s, ∆t) that will be considered In order to recognize tunneled traffic, an initial training
by the classification algorithm to compute the anomaly score. phase is needed to build the protocol fingerprints, as described
The intent is to have the anomaly score calculated only on in Section IV-C. We started by collecting a large number of
the packets that represent actual data exchange, excluding the non-tunneled sessions for SSH, by running Tcpdump [20] on
ones that carry authentication information. the edge gateway of our campus network. This is a crucial part
To this end, we modify Equation 1 as follows: of the process since it gathers the training set: only legitimate
traffic has to be collected and all tunneling sessions must be
n
$ ε filtered out.
S[n0 →n] (!x|ω1 ) = , (2)
p(xi |ω1 ) · (n − n0 ) After an appropriate fingerprint has been installed, as a
i=n0
new flow comes, its anomaly score versus the fingerprint is
where n0 is the index of the first considered pair, and should calculated packet by packet. At any time, starting from the n0 -
point to the first packet that carries actual user traffic in each th pair, the score can be compared to the threshold. We will see
SSH session. in the following what packet number is best in our scenario
Fixing n0 to the correct value requires making a few for detecting tunneled traffic, while guaranteeing a low rate
assumptions. In fact, depending on the type of user authentica- of flows incorrectly classified as non-conforming. Moreover,
tion, such as password or public-key, or on whether there are we will discuss about the use of the shifted window in the
errors during the authentication process, e.g., wrong password, classification process of SSH tunnels.
We collected traces of legitimate SSH sessions by running After several weeks, we recorded about another six hundred
Tcpdump on the campus edge gateway while several col- interactive sessions and about one thousand and seven hun-
leagues helped us realizing interactive sessions and securely dred bulk-transfer sessions. We also recorded encrypted flows
copying files to same remote servers located inside the network tunneling Chat, POP3, SMTP and P2P (BitTorrent) protocols.
of the Engineering faculty of the University of Rome Tor This was simple to do since we knew a priori the addresses
Vergata, in the network of the University of Trento, on a data of the tunnel exit points. Once again, we excluded the host
center in Utah (USA), and finally on a home PC connected to authentication phase from every session: this let us filter out
Internet through a fast, 8Mb/s DSL line. The SSH clients used from the evaluation set all the spurious sessions that did not
in this phase were run on a mix of different operating systems complete host authentication. Table I reports the number of
such as Mac OS X, Linux and various versions of Microsoft sessions that finally composed the SSH evaluation set.
Windows. As discussed above, the application of tunnel hunter to
We configured all the clients to respect the assumptions de- SSH requires the use of the shifted window optimization,
scribed in Section IV-E: public-key as the user-authentication as described in Section IV-E. We studied the performance
method, public-key verification enabled. We also disabled of the technique varying the size and the starting packet of
compression on both SSH clients and servers. Users not the window. The best results are shown in Table I and were
conforming to these policies would see their legitimate SSH achieved configuring the algorithm to work on Fclient flows,
traffic blocked by tunnel hunter, which once again seems like setting the threshold T = 3/8, and computing the anomaly
a reasonable restriction that a network administrator could set. score in the window [6 → 13]. The optimization of these
During a three-week period, we collected the training set, parameters is discussed later in Section VI.
i.e., around four thousand legitimate SSH sessions, from which As the results suggest, tunnel hunter can detect legitimate
we derived the SSH/SCP fingerprint. It is worth noting that SSH/SCP sessions with more than 98% accuracy3 . In other
we gathered the fingerprint without considering the host- words, the system can indeed be configured to minimize false-
authentication stage, since we discard it during the feature positives: all legitimate SSH sessions and the vast majority
extraction process by looking for the SSH MSG NEWKEYS of SCP sessions are not blocked by the classifier. At the
message. same time, the technique is quite effective at blocking tunnel
activities. Even the worst blocking rate is very close to 90%.
Protocols Hit ratio # sessions Naturally, one could even think of improving the blocking
SSH 98.95% 600 rate of tunneled protocols, at the expense of increasing the
SCP 99.69% 1700
POP3 over SSH 87.89% 2360
false-positives. However, even wanting to minimize the false-
SMTP over SSH 99.93% 4300 positive rate as we did in this case, the precision with which
CHAT over SSH 88.31% 2100 other tunneled protocols are detected is already very promis-
P2P over SSH 88.77% 1600
ing.
TABLE I
SSH SESSIONS CORRECTLY CLASSIFIED ( OPTIMAL PARAMETERS ) AND VI. O PTIMIZING CONFIGURATION PARAMETERS
CARDINALITY OF THE EVALUATION SET.
Finding the optimal values for the threshold T and the initial
“shifted window” value n0 (see Section IV-E) is necessary in
order for tunnel hunter to perform properly.
We then collected traffic for the evaluation set. We started
Both parameters can be optimized by running the algorithm
running a SSH tunnel, using the package available at [21]. We
on a subset of SSH and SCP Fclient flows used neither
run one end of the tunnel on a Mac OS X workstation inside
for fingerprinting nor for evaluating, and fixing the accuracy
the campus network. The other end was in turn located on the
of detection to the desired value. In our case, we chose to
same server used to collect actual SSH traffic.
optimize the parameters so that tunnel hunter would have a
In order to be able to tunnel P2P traffic as well, we setup maximum of 1% false positives, i.e., would correctly classify
an additional tool at the tunnel entry point. In fact, since peer- at least 99% of legitimate SSH and SCP traffic.
to-peer protocols such as BitTorrent are designed to simul- By varying the threshold’s T in the range [0 : 1], and n0
taneously open a large number of connections towards other in the range [1 : 10] and requiring the algorithm to perform
peers, the actual destination “server” continuously changes in with 99% accuracy as explained above, we found the optimal
this case. To make sure to convey all peer-to-peer connections value for T = 3/8, which corresponds to accepting that three
through the SSH tunnel, we implemented a gateway on a Linux out of eight packets fall out of the fingerprint’s description.
workstation inside the campus network. We then configured Parameter n0 ’s optimal value came out to be the sixth pair.
a few machines, dedicated to P2P activities, to use it as This type of optimization procedure would need to be
default gateway: their outgoing traffic was then encapsulated applied by any network manager that wanted to implement our
by this gateway inside the tunnel. The tunnel exit point technique. Depending on the type of tunneled traffic, on the
provided traffic de-encapsulation and masquerading through
Iptables [22] so that backward traffic was also crossing the 3 Weighing the interactive SSH and SCP cases according to the size of
tunnel. evaluation sets.
network architecture, and on the target rate of false positives, R EFERENCES
the optimal value in their case might differ from ours. [1] M. Crotti, M. Dusi, F. Gringoli, and L. Salgarelli, “Detecting HTTP
Tunnels with Statistical Mechanisms,” in Proceedings of the 42th IEEE
VII. C ONCLUSIONS AND F UTURE W ORK International Conference on Communications (ICC 2007), (Glasgow,
Scotland), pp. 6162–6168, June 2007.
In this paper we have presented a statistical mechanism [2] T. Ylonen and C. Lonvick, “The Secure Shell (SSH) Protocol Architec-
called tunnel hunter that can successfully recognize whenever ture,” RFC 4251, IETF, Jan. 2006.
a generic application protocol is tunneled on top of SSH. [3] M. Crotti, M. Dusi, F. Gringoli, and L. Salgarelli, “Traffic Classification
through Simple Statistical Fingerprinting,” ACM SIGCOMM Computer
Tunnel hunter is an application of the concept of protocol Communication Review, vol. 37, pp. 5–16, Jan. 2007.
fingerprint [3], which provides a behavioral description of an [4] V. Paxson, “Bro: a system for detecting network intruders in real-time,”
application protocol by considering three simple properties of Computer Networks, vol. 31, no. 23–24, pp. 2435–2463, 1999.
[5] M. Roesch, “SNORT: Lightweight Intrusion Detection for Networks,”
IP packets: their size, inter-arrival time and arrival order. The in LISA ’99: Proceedings of the 13th USENIX Conference on Systems
main idea exploited by this paper is that the fingerprint can Administration, (Seattle, WA, USA), pp. 229–238, Nov. 1999.
be derived by training the system on legitimate, non-tunneling [6] V. Paxson, “Empirically derived analytic models of wide-area TCP
connections,” IEEE/ACM Transactions on Networking, vol. 2, no. 4,
SSH usage, and later used to detect application-layer tunnels pp. 316–336, 1994.
that are run on top of a Secure Shell. The system provides [7] F. Hernández-Campos, F. D. Smith, K. Jeffay, and A. B. Nobel, “Sta-
parameters that can be configured to obtain any pre-set goal tistical Clustering of Internet Communications Patterns,” in Computing
Science and Statistics, vol. 35, July 2003.
with respect to the desired false-positive ratio. [8] A. McGregor, M. Hall, P. Lorier, and J. Brunskill, “Flow Clustering
The experimental results we obtained are very promising. Using Machine Learning Techniques,” in Proceedings of the 5th Passive
First and foremost, virtually no legitimate traffic is blocked and Active Measurement Workshop (PAM 2004), (Antibes Juan-les-Pins,
France), pp. 205–214, Mar. 2004.
by our mechanism, since over 99% of valid SSH and SCP [9] M. Roughan, S. Sen, O. Spatscheck, and N. Duffield, “Class-of-service
sessions are correctly recognized as legitimate by tunnel mapping for QoS: a statistical signature-based approach to IP traffic
hunter. Second, and equally important, the vast majority of classification,” in IMC ’04: Proceedings of the 4th ACM SIGCOMM
conference on Internet measurement, (Taormina, Sicily, Italy), pp. 135–
tunneled traffic is blocked by the mechanism. 148, Oct. 2004.
These results, together with the ones presented in [1] for [10] A. W. Moore and D. Zuev, “Internet traffic classification using bayesian
HTTP tunnels, suggest that this technique can be used to analysis techniques,” in SIGMETRICS ’05: Proceedings of the 2005
ACM SIGMETRICS International Conference on Measurement and
complement existing ALGs in two different ways. On one Modeling of Computer Systems, (Banff, Alberta, Canada), pp. 50–60,
hand, it can augment their ability to recognize outgoing June 2005.
tunneled traffic, even if encrypted, therefore improving the [11] G. Bissias, M. Liberatore, D. Jensen, and B. N. Levine, “Privacy Vul-
nerabilities in Encrypted HTTP Streams,” in Proc. Privacy Enhancing
effectiveness of security policies against non-complying users. Technologies Workshop (PET 2005), (Dubrovnik, Croatia), May 2005.
On the other hand, it can help anti-virus and anti-spyware [12] M. Liberatore and B. N. Levine, “Inferring the source of encrypted http
tools in detecting outgoing tunneled traffic generated by robots connections,” in CCS ’06: Proceedings of the 13th ACM conference on
Computer and Communications Security, (Alexandria, Virginia, USA),
and compromised machines. Although this paper reports only pp. 255–263, 2006.
results related to the first class of applications, we plan on [13] C. V. Wright, F. Monrose, and G. M. Masson, “On Inferring Application
continuing our tests to verify the applicability of this technique Protocol Behaviors in Encrypted Network Traffic,” Journal of Machine
Learning Research, vol. 7, pp. 2745–2769, Dec. 2006.
to the second class. [14] F. Heinz and J. Oster, “DNS tunnel.” http://nstx.dereference.de/nstx.
The next natural step is to improve the model, first by adapt- [15] L. Brinkhoff, “GNU httptunnel.” http://www.nocrew.org/software/httptunnel.html.
ing the “shifted window” mechanism described in Section IV- [16] T. Ylonen and C. Lonvick, “The Secure Shell (SSH) Authentication
Protocol,” RFC 4252, IETF, Jan. 2006.
E to make it dynamic. This should allow the system to counter [17] T. Ylonen and C. Lonvick, “The Secure Shell (SSH) Transport Layer
attacks where tunnels would be opened after a few packets of Protocol,” RFC 4253, IETF, Jan. 2006.
a legitimate session were exchanged. [18] A. Webb, Statistical Pattern Recognition. Wiley, 2nd ed., 2002. ISBN
0-470-84514-7.
Furthermore, at this time tunnel hunter can only signal that [19] V. Paxson and S. Floyd, “Wide area traffic: the failure of Poisson
an SSH session is being used not to perform secure copy modeling,” IEEE/ACM Transactions on Networking, vol. 3, no. 3,
or interactive remote logins, but cannot discriminate which pp. 226–244, 1995.
[20] “Tcpdump/Libpcap.” http://www.tcpdump.org.
protocol (POP3, HTTP, SMTP, etc.) is actually being tunneled. [21] “OpenSSH.” http://www.openssh.org.
Adding this capability to our technique will require further [22] N. C. Team, “netfilter/iptables.” http://www.netfilter.org.
research.