
Anomaly Detection in SMTP Traffic

Hao Luo, Binxing Fang, Xiaochun Yun

School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001
{luohao, bxfang, yxc}

Abstract

We investigate an effective and robust mechanism for detecting anomalies in SMTP traffic. Our detection method accumulates the deviation of the current delivery status from historical behavior using a weighted-sum method called the leaky integrate-and-fire model. The method is simple: it need not store a history profile and has low computation overhead, which makes the detection method itself resistant to attacks. The performance is investigated in terms of the detection probability, the false alarm ratio, and the detection delay. Our results show that the leaky integrate-and-fire method is quite effective at detecting anomalies in SMTP traffic. Compared with the non-parametric Cumulative Sum method, our detection method has a lower false alarm ratio and a higher detection probability.

1. Introduction

SMTP [1] is the basis for most electronic mail. Email is currently the most popular Internet service [2], and it allows people to communicate globally by exchanging electronic messages. These messages can be delivered within a few seconds to a couple of hours. An added attraction is the relatively low cost of sending large messages. Combined, these benefits give users a convincing argument for access to email, and thus for connecting their systems to the Internet.

SMTP is a simple protocol that contains only a few basic commands. Several security threats are associated with these commands, and the denial-of-service attack is one of the most common. Denial-of-service attacks based on SMTP aim to flood a network or computer with massive numbers of email messages to prevent legitimate use. In most cases a computer is affected because it cannot handle the load created by receiving large numbers of messages at the same time, runs out of storage space, or cannot handle large messages [3]. An example of a denial-of-service attack on SMTP is the bounced error mail attack [4]; a report shows that in October 2003 at least two domains in the United States received hundreds of thousands of error mails from all over the Internet [5].

Another important threat to SMTP is the email-based virus, which has become one of the major Internet security threats today. An email virus is a malicious program that hides in an email attachment and becomes active when the attachment is opened. A principal goal of email virus attacks such as Melissa is to generate a large volume of email traffic over time, so that email servers and clients are eventually overwhelmed by this traffic, effectively disrupting the use of the email service. Modern email viruses are more damaging, taking actions such as creating hidden backdoors on the infected machines that can be used to commandeer these machines in a subsequent coordinated attack.

In this paper, we propose an effective and robust method for detecting SMTP traffic anomalies, which is complementary to alerting on the threats mentioned above. The advantage of our detection method is that it need not store a history profile and has low computation overhead. Instead of monitoring the ongoing traffic at the front end or at the victim server, our method checks the SMTP server's delivery log. The benefit of checking the SMTP log is that we need not monitor the raw traffic of the server exchanging email, which keeps the computation overhead very low, and that the SMTP log provides detailed information about receiving and sending status. The key feature of our method is that it uses the leaky integrate-and-fire model to accumulate the deviation of the current delivery status from the historical status. The leaky integrate-and-fire model is a weighted-sum model in which newer input data plays a more important role in the result, while old data is gradually dropped from the result by a weighting factor. In this way, our method achieves a high detection probability and a lower false alarm ratio. The efficacy of our detection method is validated by simulation experiments with real background test data.

Proceedings of the Third International Conference on Information Technology: New Generations (ITNG'06)
0-7695-2497-4/06 $20.00 © 2006 IEEE
The remainder of this paper is organized as follows. Section 2 reviews related work on network anomaly detection. Section 3 discusses the SMTP traffic anomaly detection method based on the leaky integrate-and-fire model. Section 4 evaluates our anomaly detection method and compares it with a non-parametric Cumulative Sum (CUSUM) method. Finally, Section 5 presents our conclusions.

2. Related Works

It is possible to continuously track the behavior of the network by online learning and statistical approaches. Statistical analysis has been used to detect both anomalies corresponding to network failures and network intrusions [6].

A predictive detection method [7] was used for web server anomaly detection by analyzing time series measurements of the number of HTTP operations per second. The statistical model considered both seasonal and trend components, which were modeled using a Holt-Winters algorithm. Time correlations were modeled using a second-order auto-regressive model. After removing the non-stationarities from the time series measurements, anomalies were detected using a generalized likelihood ratio algorithm. This method needs to store a history profile for future use.

A wavelet approach was proposed and implemented by Barford and others [8]. They used wavelet filters to process four classes of network traffic anomalies: outages, flash crowds, attacks, and measurement failures. Their results showed that wavelet filters were quite effective at exposing the details of both ambient and anomalous traffic. However, the authors also mentioned that their signal analysis method could not detect anomalies in real time.

The authors of [6] proposed approaches for detecting SYN flooding attacks using a CUSUM-type algorithm, which applies the standard sequential change point detection approach. The approach of [6] uses time series measurements of the difference between the number of SYN packets and the number of corresponding FIN packets in a time interval. The simulation results have shown that SYN flooding attacks can be detected with high accuracy by CUSUM-type algorithms.

3. SMTP Anomaly Detection

In this section, we discuss the real-time statistical analysis method that we developed using the theory of the leaky integrate-and-fire model. Like most statistical anomaly detection systems, we compare the observed sequence with the profile representing the server's normal behavior. However, unlike a traditional network intrusion detection system, which detects an anomaly directly from the deviation of current behavior from the profiled normal history behavior, our method accumulates the deviation over a period and detects the anomaly according to the integrate-and-fire model. Compared with CUSUM-type algorithms, the detection algorithm based on the integrate-and-fire model is more sensitive to the current network status.

Our method uses the SMTP server's log to detect anomalies. The SMTP server log records a mail server's receiving and sending information, including failure messages such as mail addressed to an invalid account. The log also includes the delivery time of each mail. Since our work is detecting SMTP traffic anomalies, this data source is sufficient.

3.1. The SMTP Behavior Deviation Evaluation

Let $\{x_n, n=0,1,\ldots\}$ be the series of the numbers of mails that a mail server receives within one sampling period, and let $\{y_n, n=0,1,\ldots\}$ be the corresponding numbers of sent mails in the same sampling period. We define $\{\Delta_n, n=0,1,\ldots\}$ as the number of received mails minus the number of corresponding sent mails collected within one sampling period, that is, $\Delta_n = x_n - y_n$.

In general, the mean of $\{\Delta_n\}$ depends on the number of accounts on the SMTP server, and it may also depend on access patterns, for example varying with the time of day and the day of the week. To make our detection algorithm more general, we should eliminate these dependencies. Thus, $\{\Delta_n\}$ is normalized by the average $Y_n$ of $\{y_n\}$. $Y_n$ can be computed as an exponentially weighted moving average (EWMA) of previous measurements:

$$ Y_n = \beta Y_{n-1} + (1 - \beta)\, y_n \qquad (1) $$

where $\beta$ is the EWMA factor that represents the memory in the estimation. Define $X_n = \Delta_n / Y_n$; then $\{X_n\}$ no longer depends on the network size or the time of day.

So we can define the deviation of SMTP behavior for a given interval $n$ as:

$$ D_n = X_n - \bar{X}_{n-1} \qquad (2) $$

where $\bar{X}_{n-1}$ is the mean of $X_n$ estimated from the measurements preceding interval $n$. The mean $\bar{X}_n$ is also computed by the EWMA method. The deviation of SMTP behavior $D_n$ is used as the input data of our anomaly detection method.

3.2. The Leaky Integrate-and-fire Model
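
Before turning to the model itself, the deviation computation of Section 3.1 (Eqs. (1) and (2)) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `smtp_deviations`, the seeding of both moving averages with the first sample, and the handling of a zero sent-mail average are all choices made here for the example, since the paper does not specify initial conditions.

```python
def smtp_deviations(received, sent, beta=0.98):
    """Return D_n per sampling interval, following Eqs. (1) and (2).

    received, sent: per-interval mail counts (x_n and y_n in the paper).
    beta: EWMA factor (0.98 is the value used in the paper).
    Assumptions made for this sketch: both averages are seeded with the
    first sample, D_0 is 0, and X_n is taken as 0 when the average is 0.
    """
    if len(received) != len(sent):
        raise ValueError("received and sent must have the same length")

    deviations = []
    y_avg = None  # Y_n, EWMA of sent mails
    x_avg = None  # EWMA of the normalized difference X_n
    for recv_n, sent_n in zip(received, sent):
        # Eq. (1): Y_n = beta * Y_{n-1} + (1 - beta) * y_n
        if y_avg is None:
            y_avg = float(sent_n)
        else:
            y_avg = beta * y_avg + (1 - beta) * sent_n

        delta_n = recv_n - sent_n
        x_n = delta_n / y_avg if y_avg > 0 else 0.0

        # Eq. (2): D_n = X_n minus the mean estimated before interval n
        d_n = 0.0 if x_avg is None else x_n - x_avg
        deviations.append(d_n)

        # Update the running mean of X_n only after it has been used
        if x_avg is None:
            x_avg = x_n
        else:
            x_avg = beta * x_avg + (1 - beta) * x_n
    return deviations
```

With steady traffic the deviations stay near zero, and a sudden rise in received mail shows up as a positive $D_n$ in the interval where it occurs.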
Leaky integrate-and-fire models have long been proposed as models of neurons. They can be used for processing time-varying signals [9] and also in powerful computing systems [10]. The simplest form of the integrate-and-fire model consists of a resistor $R$ in parallel with a capacitor $C$ driven by an external current $I(t)$. The voltage $V(t)$ across the capacitor $C$ is compared to a threshold $\delta$. If $V(t)=\delta$ at time $t$, an output spike $\varphi(t)$ is generated and $V(t)$ is reset to an initial voltage $U_r$. Between spikes, the voltage of a leaky integrate-and-fire model is governed by:

$$ \frac{dV(t)}{dt} = -\frac{V(t)}{RC} + \frac{I(t)}{C} \qquad (3) $$

Suppose that a spike has occurred at $t_i$. For $t>t_i$ the stimulating current is $I(t)$, and $V(t)$ can be expressed as:

$$ V(t) = U_r \exp\!\left(-\frac{t-t_i}{RC}\right) + \frac{1}{C}\int_0^{t-t_i} \exp\!\left(-\frac{s}{RC}\right) I(t-s)\, ds \qquad (4) $$

When the leaky integrate-and-fire model is used to detect SMTP anomalies, the deviations of SMTP behavior in each interval after $t_i$ are the inputs, and $V(t)$ is tested as the alarm condition. The details of the detection algorithm are described in Section 3.3.

3.3. Anomaly Detection Approach

In our SMTP traffic anomaly detection approach, the SMTP health status is obtained from the output of the leaky integrate-and-fire model. In the process of capacitor recharging, when the input current is constant, the earlier input current raises the voltage faster. Therefore, in our detection method, the deviations of SMTP behavior $D_n$ are fed into the leaky integrate-and-fire model starting from the current interval and going back to the interval of the last spike. That is, we input the current $D_n$ first, then the one just before it, and so on. In this way, the current SMTP delivery status plays a more important role in the detection result. Because $D_n$ is a discrete value, suppose that a spike has occurred at interval $n_k$; the output of the leaky integrate-and-fire model at interval $n$ can then be obtained from (3) as:

$$ V'(n) = U_r \exp\!\left(-\frac{n-n_k}{RC}\right) + \frac{1}{C}\sum_{i=1}^{n-n_k} \exp\!\left(-\frac{n-n_k-i+1}{RC}\right) D_{n_k+i} \qquad (5) $$

Let $U_r=0$, $L'(n) = C\,V'(n)$ and $K=RC$; from (5) we have:

$$ L'(n) = \sum_{i=1}^{n-n_k} \exp\!\left(-\frac{n-n_k-i+1}{K}\right) D_{n_k+i} \qquad (6) $$

So we have:

$$ L'(n-1) = \exp\!\left(-\frac{n-n_k-1}{K}\right) D_{n_k+1} + \exp\!\left(-\frac{n-n_k-2}{K}\right) D_{n_k+2} + \ldots + \exp\!\left(-\frac{1}{K}\right) D_{n-1} \qquad (7) $$

Therefore:

$$ L'(n) = \exp\!\left(-\frac{1}{K}\right)\left(L'(n-1) + D_n\right) \qquad (8) $$

Since a negative SMTP behavior deviation means no anomaly in our detection, according to (8) we let

$$ \begin{cases} L(n) = \left[\exp\!\left(-\dfrac{1}{K}\right)\left(L(n-1) + D_n\right)\right]^{+} \\ L(0) = 0 \end{cases} \qquad (9) $$

be our network status function, where $n>0$ and $x^+$ is equal to $x$ if $x > 0$ and to 0 otherwise. We will use $L(n)$ in making detection decisions. We call $K$ the cumulating factor.

Let $H$ represent the anomaly threshold. At interval $n$, if $L(n)>H$, an alarm is signaled at time $n$; otherwise the network status is normal. If an alarm is signaled at time $n$, $L(n)$ is reset to 0.

3.4. Parameter Specification

The tuning parameters of the above algorithm are the cumulating factor $K$ for computing the network health status, the alarm threshold $H$, and the EWMA factor $\beta$. In general, the EWMA factor $\beta$ is chosen as 0.98 [6], and we also choose $\beta=0.98$ as our EWMA factor in the experiments. To implement our leaky integrate-and-fire anomaly detection algorithm, we still need to specify two tunable parameters: $K$ and $H$. The cumulating factor $K$ decides how we accumulate the SMTP status deviation to detect the anomaly, and the alarm threshold $H$ depends on $K$. $D_n$ contributes differently to $L(n)$ for different $K$. Fig.1 shows the percentage of $\exp(-n/K)\,D_n$ in $L(50)$, where we set $D_n=1$, $n=1,2,\ldots,50$.

Fig. 1. Results of exp(-n/K) with different K (vertical axis: percentage of exp(-n/K) in L(n); horizontal axis: n)

We can see clearly from Fig.1 that the smaller $K$ is, the more $\exp(-1/K)$ contributes and the shorter the history that is referred to. When $K=1$, $\exp(-1/K)=\exp(-1)$ contributes 63.21% to $L(50)$, and about 8 intervals are evidently referred to in $L(50)$; when $K=15$, $\exp(-1/15)$ contributes 8.79% to $L(50)$, and all 50 intervals are referred to in $L(50)$. Here we can see that when $K=5$, $\sum_{n=1}^{3} \exp(-n/5)$ contributes about 45% of the integrated result,
while the longer partial sum $\sum_{n} \exp(-n/5)$ contributes about 91% of the result. This means that when we choose $K=5$, the result not only emphasizes the first three inputs but also draws on enough history information. So in our detection algorithm, we choose $K=5$ as our cumulating factor.

Suppose we should raise an alarm when $x_n$ increases to 1.6 times its normal value. With the cumulating factor $K=5$, we can calculate $H$ by the following algorithm:

    Function GetThreshold(K)
        e = 0
        FOR i = 1 TO 10 DO
            e = e + exp(-i/K)
        RETURN e*0.6
    EndFunction

When we set $K=5$, we get $H=2.4$ following the above algorithm.

4. Performance Evaluation

In this section, we first choose the parameters of our method. In order to compare our method with the CUSUM-type algorithm described in [6], we also choose parameters for the algorithm in [6], which is given by

$$ g_n = \left[g_{n-1} + (X_n - a)\right]^{+} \qquad (10) $$

In addition to the choice of parameters, we evaluate how the parameters of our detection algorithm affect the detection performance.

Secondly, we investigate the performance of the leaky integrate-and-fire method presented in the previous section for detecting SMTP traffic anomalies. The performance metrics considered are the detection probability, the false alarm ratio, and the detection delay. The detection probability (DP) is the percentage of attacks for which an alarm is raised, the false alarm ratio (FAR) is the percentage of alarms that do not correspond to an actual attack, and the detection delay (DD) is the time from the start of an attack until the alarm is raised.

Our experiments use actual SMTP server delivery logs taken from our campus mail server as background data. We use 2.5 days of the mail server's log and count the SMTP deliveries in each one-minute interval. Our test set includes records of 120412 received mails and 80358 sent mails, with an average receiving rate of 33.45 mails per minute and an average sending rate of 22.32 mails per minute.

The attacks were generated synthetically, which allowed us to control the characteristics of the attacks and hence to investigate the performance of the detection algorithms for different attack intensities. The typical attack duration observed in the Internet is 10 minutes [11]; therefore the attacks are generated with a mean duration of 10 time intervals. The inter-arrival time between consecutive attacks is randomly distributed between 60 and 180 time intervals, with a mean of 120 intervals. Our detection method is not sensitive to the attack pattern: it can detect attacks with both constant and burst intensity.

4.1. Parameter Selection

According to the directions described in Section 3.4 and [6], we choose $K=5$ and $H=2.4$ for the leaky integrate-and-fire anomaly detection method, and $a=1.1$ and threshold $TH=2.2$ for CUSUM, in our test set. In order to evaluate the parameters we select, we enumerate each possible combination of parameters of the two anomaly detection methods.

For the leaky integrate-and-fire method, we test the threshold $H$ from 2 to 4 in steps of 0.1 and the cumulating factor $K$ from 1 to 15 in steps of 1. For the CUSUM method, we test $a$ from 0.6 to 1.6 in steps of 0.05 and the threshold $TH$ from 0.6 to 6 in steps of 0.1. We retain the parameter pairs that achieve an average detection probability of 100% over 10 rounds of tests. The test set is generated by overlaying constant intensity attacks with a duration of 10 intervals (10 minutes). The intensity of the attacks is 60% of the mean actual mail receiving rate.

For the CUSUM algorithm described in [6], we select the parameter pair $a=1.1$ and $TH=2.2$; in our test set, the CUSUM method gets FAR=0.0077 and DD=2.00. For our method, we select $K=5$ and $H=2.4$; in the test set, our method gets FAR=0.0043 and DD=0.7. Among all retained parameter pairs, the selected parameters have a smaller false alarm ratio and a lower detection delay. Considering the tradeoff between false alarm ratio and detection delay, the parameter pairs we select for the performance evaluation are suitable.

4.2. Evaluation of Cumulating Factor

Fig.2 shows how the cumulating factor $K$ affects the false alarm ratio and the detection delay, where the threshold $H$ is adjusted by the algorithm described in Section 3.4. Fig.2 is obtained by taking the average of 10 runs.

The cumulating factor $K$ decides the length of history that the detection method uses. The bigger $K$ is, the longer the history referred to in making a decision and, at the same time, the lower the weight given to the current delivery status in the detection result. This means the current delivery status influences the result less. A longer history may lead to a long
detection time because the final detection result is not sensitive to the current delivery status. The smaller $K$ is, the shorter the history considered and the bigger the weight of the current network status, so the faster we can detect the anomaly; but at the same time, the final detection result is more sensitive to the current delivery status, which produces more false alarms.

Fig.2. Effect of Accumulating Factor K (panels: detection delay and false alarm ratio; horizontal axis: K)

In our test set, when $K=5$, the detection results have a good tradeoff between detection delay and false alarm ratio.

4.3. Evaluation of Anomaly Detection

Our experiments consider attacks with constant intensity, i.e. the attacks reach their full amplitude within one time interval. First, we generate two levels of intensity to compare our method with the CUSUM algorithm described in [6]: low and high constant intensity attacks. In the low constant intensity attack, the amplitude of the added attack is 17 mails, which is about 50% of the mean normal SMTP receiving rate. The amplitude of the high intensity attack is about 77% of the mean normal SMTP receiving rate. Fig.3 shows the detection results.

Fig.3. Detection Results (a. Detection results of low intensity attacks; b. Detection results of high intensity attacks; horizontal axis: time intervals)

Fig.3a and Fig.3b show the results under low and high constant intensity attacks, respectively. The horizontal axes in these figures are the number of time intervals. In each graph, from top to bottom, we have the SMTP deliveries trace with attacks, the attacks, and the detection results of the CUSUM algorithm and of the leaky integrate-and-fire method.

The above graphs show that our method performs well under both low intensity and high intensity attacks. Under low intensity attacks, our method yields a detection probability of 100% and a false alarm ratio of 0.24%. Under high intensity attacks, our method gets similar results, with 100% detection probability and a 0.33% false alarm ratio. The CUSUM algorithm performs well under high intensity attacks, but under low intensity attacks it cannot achieve 100% detection probability, and its false alarm ratio is much worse than that of our method.

Secondly, we generate a series of attacks with different intensities to evaluate our detection performance. The detailed results, averaged over 10 runs, are shown in Fig.4. The horizontal axis in Fig.4 is the number of attack mails injected per interval.

Fig.4. Average Detecting Results of different Intensity Attack (horizontal axis: attack amplitude)

From Fig.4, we can see clearly that the detection probability of our method is higher than that of the CUSUM algorithm, but when the attack intensity is larger than 68% of the mean normal SMTP receiving rate (23 attack mails per interval), both can achieve 100%
detection probability. Our method, however, already reaches 100% detection probability when the attacks are larger than 50% of the mean normal SMTP receiving rate (17 attack mails per interval).

The detection delay of our method is similar to that of the CUSUM algorithm. Under low intensity attacks with 15 external attack mails injected per interval, the detection delay of both methods is about 2.5 intervals, and under high intensity attacks with 29 attack mails injected per interval, the two methods need only about 1 interval to raise alarms.

Our method has a better false alarm ratio than the CUSUM algorithm in all scenarios. The average false alarm ratio is about 0.4% for our method and 0.7% for the CUSUM algorithm.

The difference in performance between our detection method and the CUSUM method arises because our method uses a weighted sum to accumulate the behavior deviation, whereas the CUSUM method treats all deviations equally. Our method is therefore more sensitive to the current network status than the CUSUM method, and so it has a better detection probability and a lower false alarm ratio than the CUSUM algorithm, especially under low intensity attacks. Detection of low intensity attacks is important because early detection of an anomaly whose intensity is increasing enables defensive action to be taken earlier.

5. Conclusions

In this paper, we propose an effective and robust mechanism for detecting SMTP traffic anomalies. Our detection method accumulates the deviation of the current delivery status based on the leaky integrate-and-fire model, which is a weighted-sum method. The advantage of our detection method is that it need not store a history profile and has low computation overhead. Our results show that the leaky integrate-and-fire method is quite effective at detecting attacks, especially low intensity attacks. Compared with the non-parametric Cumulative Sum method, the evaluation results show that our detection method has a lower false alarm ratio and a higher detection probability. Especially under low intensity attacks, the detection accuracy of our method is much higher and its false alarm ratio is lower than those of the CUSUM method.

6. References

[1] J. Postel, Simple Mail Transfer Protocol, RFC 821, 1982.

[2] R. Caceres, P. Danzig, S. Jamin, and D. Mitzel, "Characteristics of wide-area TCP/IP conversations", Computer Communication Review, SIGCOMM, ACM Press, New York, NY, USA, 1991, pp. 101-112.

[3] B. Harris, R. Hunt, "TCP/IP security threats and attack methods", Computer Communications, Elsevier, 1999, pp. 885-897.

[4] N. Yamai, K. Okayama, T. Miyashita, S. Maruyama, and M. Nakamura, "A Protection Method against Massive Error Mails Caused by Sender Spoofed Spam Mails", Proceeding of the 2005 Symposium on Application and the Internet, 2005, IEEE Computer Society, Italy, pp. 384-390.

[5] Brian McWilliams, "Wired News: Time-Travel Spammer Strikes Back", Lycos, Inc., http://www.wired.com/news/technology/0,1282,61026,00.html, 2003.10.

[6] H. Wang, D. Zhang, and K. G. Shin, "Detecting syn flooding attacks", IEEE INFOCOM, New York City, NY, 2002, pp. 1530-1539.

[7] J. Hellerstein, F. Zhang, and P. Shahabuddin, "A statistical approach to predictive detection", Computer Networks, Elsevier, 2001, pp. 77-95.

[8] P. Barford, J. Kline, D. Plonka, and A. Ron, "A signal analysis of network traffic anomalies", SIGCOMM, ACM Press, New York, NY, USA, 2002, pp. 71-82.

[9] L.S. Smith, "Onset-based sound segmentation", Advances in Neural Information Processing Systems, MIT Press, 1996, pp. 729-735.

[10] R.D. Patterson, M.H. Allerhand, C. Giguere, "Time-domain Modelling of Peripheral Auditory Processing: A Modular Architecture and a Software Platform", Journal of the Acoustical Society of America, 1995, pp. 1890-1894.

[11] D. Moore, G. Voelker and S. Savage, "Inferring Internet Denial of Service Activity", Proceedings of USENIX Security Symposium 2001, 2001.
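
For readers who want to experiment, the detection rule of Section 3.3 (Eq. (9)) and the threshold procedure of Section 3.4 can be sketched in Python as follows. This is a minimal sketch rather than the authors' code: the names `get_threshold` and `lif_detect` and the parameters `horizon` and `excess` are introduced here for illustration, and the defaults K=5 and H=2.4 are simply the values reported in the paper.

```python
import math


def get_threshold(k, horizon=10, excess=0.6):
    """Threshold H as in GetThreshold: excess * sum_{i=1..horizon} exp(-i/k).

    excess=0.6 corresponds to raising an alarm when traffic reaches
    1.6 times its normal value; horizon=10 is the loop bound in the paper.
    """
    return excess * sum(math.exp(-i / k) for i in range(1, horizon + 1))


def lif_detect(deviations, k=5.0, h=2.4):
    """Return the interval indices at which an alarm is raised.

    Implements L(n) = [exp(-1/K) * (L(n-1) + D_n)]^+ with L(0) = 0,
    signalling an alarm when L(n) > H and then resetting L(n) to 0.
    """
    decay = math.exp(-1.0 / k)
    level = 0.0  # L(n), the accumulated network status
    alarms = []
    for n, d_n in enumerate(deviations):
        # Eq. (9): leaky accumulation, clamped at zero
        level = max(0.0, decay * (level + d_n))
        if level > h:
            alarms.append(n)
            level = 0.0  # reset after the alarm, as in Section 3.3
    return alarms
```

Evaluating `get_threshold(5)` gives roughly 2.34, close to the H=2.4 used in the experiments. Only one state variable is kept between intervals, which reflects the paper's point that no history profile has to be stored.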

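
The CUSUM baseline of Eq. (10), used for comparison in Section 4, can be sketched in the same style. Again this is only an illustration: the function name `cusum_detect` is hypothetical, the defaults a=1.1 and TH=2.2 are the values selected in Section 4.1, and resetting the statistic to zero after an alarm is an assumption made here, since the paper does not state how the baseline is reset.

```python
def cusum_detect(x_values, a=1.1, th=2.2):
    """Return the interval indices at which the CUSUM statistic exceeds th.

    Implements g_n = [g_{n-1} + (X_n - a)]^+ from Eq. (10), starting at 0.
    x_values: the normalized series X_n of Section 3.1.
    """
    g = 0.0
    alarms = []
    for n, x_n in enumerate(x_values):
        # Eq. (10): accumulate the excess over a, clamped at zero
        g = max(0.0, g + (x_n - a))
        if g > th:
            alarms.append(n)
            g = 0.0  # assumption: restart accumulation after an alarm
    return alarms
```

Unlike the leaky integrate-and-fire recursion, every past excess carries the same weight here, which is the difference the paper points to when explaining the two methods' behavior under low intensity attacks.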