You are on page 1of 5

Practical Internet Steganography: Data Hiding in IP

Deepa Kundur Kamran Ahsan


Texas A&M University, College Station Government of Ontario
Department of Electrical Engineering 123 Edward Street
3128 TAMU, College Station Texas, USA 77843-3128 Toronto, Ontario Canada
deepa@ee.tamu.edu kamran.ahsan@cbs.gov.on.ca

Abstract images. In this position paper, we assert that a more


important problem that has received little attention
This paper investigates practical techniques and to date is that of Internet steganography. The sheer
uses of Internet steganography. Internet steganog- volume of Internet traffic provides a higher band-
raphy is the exploitation of Internet elements and width vehicle for covert communications which leads
protocols for the purpose of covertly communicating to a plethora of applications impractical for standard
supplementary data. Each scenario facilitates the image steganography. Furthermore, it can be shown
interaction of fundamental steganographic principles that through “ideal compression” steganography in
with the existing network security environment to images can be eliminated [2]. However, given the
more generally bridge the areas of data hiding, net- design mantra of the Internet involving communica-
work protocols and security. tion, connectedness and collaboration which has led
to an “open” intentionally redundant specification,
it is highly unlikely that Internet steganography can
1. Introduction be subverted.
The objectives of this paper are to:
Steganography is the process of discreetly hiding
1. demonstrate two existing covert channels at the
data in a given host carrier for the purpose of en-
hancing value or subliminally communicating infor- network layer of the Internet,
mation. Steganography is possible through the ex-
2. relate the available bandwidth for covert com-
istence of covert channels. The concept of a covert
munication, traditionally associated with a vi-
channel was first introduced by Lampson [1] as a
olation of system security policy, to potentially
channel that is used for information transmission,
performance enhancing network applications.
but that is not designed nor intended for commu-
nication. The host carrier may be any message or
In the next section we provide a practical frame-
information source that contains redundancy or ir-
work for Internet steganography. Sections 2 and 3
relevancy. For instance, digital images are often used
provide simple examples of covert channels in the
to embed covert information; the limited capability
Internet Protocol (IP) followed by application sce-
of human vision allows data hiding or watermarking
narios that can make effective use of the unutilized
of this type of media.
bandwidth.
The host carrier can also be an element or mech-
anism of an overt communication channel such as a
network packet or protocol. An overt channel can 1.1 Framework
be exploited to create the presence of a covert chan-
nel by judiciously selecting components of the overt In this work we consider the situation in which
channel that contain or exhibit redundancy or irrele- the covert channel piggybacks on an already estab-
vancy. These components by virtue of their ambigu- lished overt channel such as the Internet. The em-
ous nature can be modified for the purpose of em- bedding and detection processes for covert commu-
bedding and later detecting supplementary covert in- nications do not interfere with normal overt informa-
formation without affecting overt information flow. tion flow. We assume that our communicating par-
Many recent research efforts have been focused on ties denoted as Alice and Bob, transfer information
steganography of media host carriers such as digital overtly over a computer network, and employ data
hiding involving the TCP/IP protocol suite to com- in packet headers for data hiding; we adopt more
municate supplementary information covertly. Data specific approaches. The proposed data hiding sce-
hiding is employed through a stego-algorithm that narios are more practical and robust since they are
takes as input the covert message Ck , a sequence of based on redundancies and multiple interpretations
network packets {Pk } known as the cover-network of process strategies of the Internet protocol. This
packet sequence, and possibly a secret key to gener- makes the scenarios non-detectable to various auto-
ate a stego-network packet sequence {Sk } that con- mated security mechanisms and intermediate nodes.
tains the overt payload of {Pk } while piggybacking This research also expands on the work pre-
Ck . The stego-network packet sequence {Sk } is sent sented in [5], since concrete steganographic encod-
to Bob over a computer network. The covert mes- ing and decoding techniques are suggested for inter-
sage Ck traverses a generally non-ideal channel char- operability with network security mechanisms such
acterized by the network behavior on the packet Sk . as firewalls and routers. The header fields we select
Given a model of the channel, Bob estimates the for data hiding are flexible for sending covert data
covert message to produce Ck∗ . without interfering with standard network processes
The basic framework is shown in Figure 1. We and for being accessible to routers and firewalls. In
summarize the relevant points below: contrast, [5] suggests the use of the TCP sequence
• For reasons of security, Alice and Bob may em- number and acknowledgment number fields, which
ploy the use of a symmetric secret key, so that reflect the number of bytes sent and acknowledged.
only Bob can detect the information accurately. These fields, we believe, are therefore inflexible for
In addition, for the packet-sorting approach, Al- covert data transfer.
ice and Bob are both assumed to implement IP Overall, we attempt to provide improved data-
Sec. hiding approaches on the basis of several interest-
ing papers in the area. Moreover, we take into ac-
• The stego-network packet, Sk , may pass count, in part, the effect of the network on the stego-
through one or more intermediate network network packet Sk and provide novel application sce-
nodes in order to reach Bob. The covert chan- narios utilizing concepts proposed by [6]. Together,
nel, by definition, must be non-detectable by we hope to present a more comprehensive picture of
these nodes, which can be ensured if and only the data hiding in computer networks.
if the intermediate node finds no critical differ- The reader should note that our techniques for
ence between {Pk } and {Sk } while processing Internet steganography are more specific and ad hoc
the sequence. The reader should note that we than those commonly found for images. The pri-
define “non-detectability” with respect to stan- mary reason for this is because our host carrier
dard network node capability and not with re- is fundamentally more structured than bitmap me-
spect to a human observer. dia [7]. The notions of redundancy and irrelevancy
• At an intermediate node, a stego-network exploited for covert communications are tied to the
packet Sk may be dropped due to the non- available model of the host carrier. This leads to
availability of buffer capacity. However, this more general formulations of the problem for digi-
possibility is assumed to be non-existent in our tal image steganography in which there exists well-
analysis of the proposed algorithms. We are fo- known mathematical representations of the host and
cusing on that network traffic that is most un- human perception. In contrast, for carriers such as
likely to be dropped due to buffer unavailabil- network packets and protocols in which ambiguity
ity. Such a condition is possible, if we consider in representation is more specific and must be iden-
QoS mechanisms through which network traffic tified through brute-force study, the techniques are
can be prioritized as a preferred class. In addi- ad hoc.
tion, we assume that there is remote possibility
that the same stego-network packet may be cor- 2 Packet Header Manipulation
rupted during transmission and consequently
dropped by the data link layer mechanisms.
In designing our data-hiding approaches, we con-
1.2 Previous Work and Perspective sider functional interfaces required in all IP imple-
mentations; a summary of these interfaces is found in
Our research can be considered in part, a logical [8]. In this way, we avoid identifying covert channels
extension to [3] and [4] in which Handel and Sand- specific to particular IP software implementations,
ford and Wolf propose the reserved or unused fields which are not available for general use.
Alice Bob

Covert Estimated Covert


Stego-Network Information
Information
C Packet
k
S Covert Information C*
Stego- k k
P Effective Non-Ideal Detection
k Algorithm Channel Algorithm
Network
Packet

Secret
Key

Figure 1. The general covert channel framework in TCP/IP

bit is used to send a covert bit to Bob. Bob accord-


ingly reads the DF bit and gets the covert message
from Alice, who is sitting on the same network.
On the basis of the above explanation, it can be
shown the datagrams in Tables 1 and 2 bear the
same meaning to the overt network provided that Al-
ice and Bob have the MTU information beforehand
and know, for the purposes of this example, that
the MTU is greater or equal to 472. In such a case,
Figure 2. The IP header neither datagram will be fragmented although they
differ in the DF bit. Thus, this redundancy leads
to the possibility of covert information through ju-
A close study of [8] reveals that redundancy exists dicious selection of each representation. Datagrams
in the Internet Protocol’s fragmentation strategy. 1 and 2 sent by Alice can therefore communicate “1”
Figure 2 displays the IPv4 header. The Flags field and “0” respectively to Bob. The constraint, how-
contains fragmentation information. The first bit ever, is that both parties require prior knowledge of
is reserved ; the second is denoted DF (to represent the MTU.
Do not Fragment), and the third is denoted MF (to
represent More Fragment). An un-fragmented data- ID field Flags Frag.Offset Total Length
gram has all zero fragmentation information (i.e., XXX..XX 010 000...0 472
MF = 0 and 13-bit Fragment Offset = 0), which
gives rise to a redundancy condition; i.e., DF (Do
Table 1. Datagram 1 to communicate “1”.
not Fragment) can carry either “0” or “1”, subject
to the knowledge of the maximum size of the data-
gram. This aspect is exploited in our scenario.
ID field Flags Frag.Offset Total Length
Consider two workstations on the same network XXX..XX 000 000...0 472
with users Alice and Bob, who have decided to
have a covert communication employing the proto-
col suite of the network. They are aware that the Table 2. Datagram 2 to communicate “0”.
network administrator is very security conscious and
that the TCP/IP software is configured properly as To demonstrate how both datagrams are simi-
per the security policy of the organization. Alice lar from the perspective of a network, we note that
and Bob have knowledge of the MTU (maximum Datagram 1 is of moderate length, but fragmenta-
transmission unit) of their network and are aware of tion is not allowed, since the DF bit is set. Data-
the fragmentation strategy, which follows the stan- gram 2 of the same length has the fragmentation bit
dard design considerations of IP [8]; any datagram of unset, yet fragmentation is not possible since it is
length below or equal to the MTU is not fragmented. below the value of the MTU. Since Alice and Bob
The embedding algorithm avails the fragmenta- know the MTU of their network and have agreed to
tion strategy of the Internet Protocol, and the DF send a datagram of a size smaller than the MTU,
there will be no fragmentation. be used for the accounting and billing processes
of the same networks.
2.1 Potential Applications
3 Packet Sorting
Associating supplementary information sent via
covert mechanisms employing packet header manip- In this section, we describe in general terms the
ulation algorithms find the following application sce- use of packet ordering for covert communication.
narios: There are n! ways in which to arrange a set of n
objects. If the order of these objects is not of con-
1. Enhanced filtering criteria in packet-filtering cern, then there is an opportunity through judicious
routers (firewalls). If the additional information selection of an arrangement of the n objects, to com-
pertains to a user or an application, a more re- municate covertly a maximum of log2 n! bits. For
inforced filtering policy can be defined. There- a set of n = 25 objects, for instance, 83.7 bits can
fore, in addition to performing filtering on the be communicated. The capacity for data hiding in-
basis of source and destination addresses, source creases dramatically for large n.
and destination ports and protocol, the router Applying this principle to our framework, we con-
can have enhanced criteria for filtering packets sider data hiding through network packet sequence
based on user and application information. ordering. Modifying the sequence number of pack-
ets requires no change of the packet content, so it is
2. A client server architecture wherein several
expected that the network protocols or processing of
clients can make a request to the FTP server
packets will not be significantly different. Based on
of, say, a library. A log file can be maintained,
these advantages, the feasibility of data hiding using
for audit purposes, based on the request sent
packet sorting within the IP layer is explored.
by various users (based on their user informa-
The packet sorting and resorting processes require
tion tied to their “request packets”). Moreover,
a reference in order to relate sorted packet numbers
serving the request, like transferring a digital
to their actual natural order to decode the covert
image to the user, can have the same user infor-
information. Such a reference is available within
mation or library information tied to the con-
the IP Sec environment, which includes two proto-
tent packets. This scenario of security tied to
col headers, ESP and AH that provide a 32-bit se-
the content avoids the violation of digital copy-
quence number field. The primary aim of this field is
rights by the specific user.
to detect replay attacks; hence it is directly related
3. A logging process for the above application sce- to packet sequencing. The anti-replay mechanism
nario based on user or application specific infor- is intended to determine whether a received packet
mation will complete the picture (i.e., logging of is a duplicate or not. This replay protection there-
valid user), maintain the record of user requests fore also provides a means to associate an order to
based on user information and, ultimately serve packets which can be exploited for covert communi-
the user requests by having either the user in- cation.
formation or the server/source (library) infor- Unlike the packet header manipulation approach,
mation tied to the content packets to avoid the in this scenario the cover object is a sequence of net-
copyright violation. work packets {Pk } rather than a single packet Pk to
embed one covert message symbol. The stego algo-
4. Value is added to content delivery networks. A rithm “sorts” the original cover-network packet se-
content delivery network is an overlay network quence, resulting in the generation of “sorted” pack-
to the public Internet or private networks, built ets with the sorted sequence numbers in their head-
specifically for the high performance delivery of ers. The reader should note that by sorting we are
content. This concept adds intelligence to net- only referring to modification of the sequence num-
working wherein the network makes path deci- ber fields of a packet stream. We are not physically
sions based on more than simple labels such as swapping location values of the packets at the trans-
the IP address. Having specific information tied mitter. The covert data detection process identifies
to the content itself would ease up the process the covert information sent by estimating it from the
of content delivery networking. Decisions based sorted packet stream.
on specific information within the content can The reader should note that the stego-network
incorporate smart routing mechanisms within packet sequence, {Sk }, may pass through one or
the networks. The same information can also more intermediate network nodes in order to reach
Bob. Under ideal network conditions in which Our demonstration of the existence of the covert
proper physical sequencing is guaranteed by the net- channels through these algorithms points to the pos-
work, the estimation of the covert information is sibility of associating additional information in the
a trivial task of using a look up table to map the network packets. These steganographic scenarios
sorted sequence to a corresponding covert bit se- provide flexibility to explore this possibility both in
quence. However, since in general the network can- the IP Sec and non-IP Sec environments.
not guarantee proper sequencing for packet delivery,
the transmission process must be modeled as a non- References
ideal channel characterized by position error(s) im-
posed by the practical network behavior. The covert
[1] B. W Lampson, “A note on the confinment prob-
data estimation process at the receiver must attempt
lem,” in Proc. of the Communications of the
to recover the information in order to minimize bit
ACM, October 1973, number 16:10, pp. 613–615.
error. By virtue of our practical formulation, the
process is somewhat ad hoc. We do not detail such [2] R. J. Anderson and A. P. Petitcolas, “On the
a method in this paper, but the reader is referred limits of steganography,” IEEE Journal on Se-
to [9] for one possible non-ideal channel model for lected Areas in Communications, vol. 16, no. 4,
the network and a corresponding estimation process. pp. 474–481, May 1998.

3.1 Usage Scenarios - Packet Sorting [3] T. Handel and M.Sandford, “Hiding data in the
OSI network model,” Cambridge, U.K., May-
June 1996, First International Workshop on In-
We believe that the packet sorting technique in
formation Hiding.
the IP Sec environment finds following applications:
[4] M. Wolf, “Covert channels in LAN protocols,” in
1. Preliminary (added) authentication in IP Sec Proceedings of the Workshop on Local Area Net-
environment. worK Security (LANSEC’89), T.A.Berson and
T.Beth, Eds., 1989, pp. 91 – 102.
2. A mechanism to facilitate enhanced anti-traffic
analysis by having packets with sorted sequence [5] C. H. Rowland, “Covert channels in the TCP/IP
numbers. protocol suite,” Tech. Rep. 5, First Monday, Peer
Reviewed Journal on the Internet, July 1997.
3. Enhanced security mechanisms for IP Sec pro-
tocols, especially for ESP operating in tunnel [6] R. Ackermann, U. Roedig, M. Zink, C. Griwodz,
mode, which enable enriched security for vir- and R. Steinmetz, “Associating network flows
tual private networks (VPNs). with user and application information,” in Eight
ACM International Multimedia Conference, Los
4 Final Remarks Angeles, California, 2000.
[7] G. Fisk, M. Fisk, C. Papadopoulos, and J. Neil,
In this paper, we have presented two simple covert “Eliminating steganography in internet traffic
channels within the Internet Protocol. They both with active wardens,” in Proc. Fifth Int. Work-
exploit redundancies in the representation of infor- shop on Information Hiding, F. A. P. Petitco-
mation. Our first approach based on packet header las, Ed., October 2002, number 2578 in Lecture
manipulation makes use of the redundancy in the Notes in Computer Science.
representation of the fragmentation strategy. Our
second approach based on packet sorting employs [8] University Southern California Information Sci-
the sequence number field in IP Sec and its some- ences Institute, “Internet protocol, DARPA in-
what redundant relation to the physical ordering of ternet program , protocol specification,” Septem-
the packets (i.e., in ideal network conditions, there ber 1981, Specification prepared for Defense Ad-
is a one-to-one correspondence between the two; in vanced Research Projects Agency.
a general network the sequence number and physical
[9] K. Ahsan and D. Kundur, “Practical data hid-
ordering are correlated) in order to transmit covert
ing in TCP/IP,” in Proc. ACM Workshop on
information. Simple covert channels based on irrel-
Multimedia Security, 2002.
evancy within the IP specification are also possible.
For instance, the unused or reserved bits within spe-
cific fields can be exploited to carry covert data [3].

You might also like