
MILCOM 2021 Track 3 - Cyber Security and Trusted Computing

A Study on Security Analysis of the Unlinkability in Tor
Jing Tian(1,2), Gaopeng Gou(1,2), Yangyang Guan(1,2), Wei Xia(1,2), Gang Xiong(1,2), Chang Liu(1,2)*
(1) Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
(2) School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
tianjing1993@iie.ac.cn
MILCOM 2021 - 2021 IEEE Military Communications Conference (MILCOM) | 978-1-6654-3956-5/21/$31.00 ©2021 IEEE | DOI: 10.1109/MILCOM52596.2021.9653038

Abstract—Tor is an important tool for anonymous communication. However, due to its popularity, Tor also draws the attention of censors and other malicious attackers. A large body of existing work examines Tor's susceptibility to flow correlation attacks, which are a form of deanonymization attack. Currently, a variety of traffic obfuscation plugs are deployed on Tor to defeat flow correlation attacks. However, whether plug-based defense can effectively resist those attacks has not been tested and verified. This paper focuses on how to effectively defeat the threat of flow correlation attacks on Tor. We first conduct experiments to illustrate the actual defense effect of obfuscation plugs against flow correlation attacks, and the results show their effectiveness. However, the obfuscation-based defense also faces many other problems such as censorship. So we explore the techniques used in differential privacy ($FPA_k$ and $d^*$-private) to apply a tiny perturbation on Tor traffic. Our findings on differential privacy suggest that the perturbations generated by the two mechanisms can successfully defeat the existing flow correlation attacks on Tor.

Index Terms—Differential Privacy, Tor, Traffic obfuscation, Flow correlation attacks.

I. INTRODUCTION

Anonymous systems offer an important and effective method to protect the privacy of users. Unlinkability is an important property that all anonymous systems strive to achieve. A system is deemed to have unlinkability if an adversary who is able to scan any number of network flows cannot determine whether the egress and ingress segments are from the same connection [1]. As one of the most popular anonymous systems, Tor provides unlinkability through many mechanisms, such as onion circuits and anonymous domain generation [2]. In spite of the efforts to build an unlinkable system, Tor is still threatened by flow correlation attacks [1], [3], [4]. An adversary who observes the two ends of a target user's Tor connections can carry out flow correlation attacks to learn the relationship between senders and receivers.

Most flow correlation techniques are based on statistical correlation metrics, such as cosine similarity [5] and Spearman's rank correlation coefficient [6]. By calculating the similarity or statistical dependence of traffic characteristics (such as packet timings and packet sizes), the attackers can correlate traffic with a reasonable true positive (TP) rate and false positive (FP) rate [7]. With the recent development of deep learning (DL), an adversary can carry out more powerful attacks [1]. The DL model automatically captures the dynamic, complex nature of noise in Tor, which makes flow correlation attacks more efficient.

In order to remove the threat of these attacks, many protocol obfuscation tools have been deployed on Tor, such as FTE [8], Meek [9], and Obfs4 [10]. However, most deployed obfuscation plugs only obfuscate packet contents, but not traffic features [9], [11], [12]. So whether such plugs can resist the existing flow correlation attacks is still an open question. Besides, the defense effect of obfuscation plugs in the face of statistical correlation metrics-based attacks is a relatively understudied aspect.

In this work, we comprehensively analyze the defense effect of obfuscation tools. We demonstrate that the performance of plug-based defense is relatively poor when the obfuscation plug does not obfuscate packet features (using Obfs4 with IAT mode "off" as the example). Even if the plug can obfuscate packet features (using Obfs4 with IAT mode "on" as the example), with the continuous enhancement of learning capacity, the attackers can still invalidate the defense by expanding the training dataset of obfuscated traffic [1].

To fill this gap, our goal is to find a generic and effective approach that protects Tor (and similar anonymity systems) from flow correlation attacks. Defending against flow correlation attacks essentially means perturbing the flow correlation classifier. Both statistical correlation metrics-based and DL-based attacks learn the natural statistical features of Tor traffic. So we consider how to perturb the statistical characteristics of Tor traffic to make the classifiers misclassify.

We closely read the research on how to construct perturbations that can mislead classifiers [13]–[16]. We find that the perturbations generated by differential privacy (DP) algorithms can guarantee that certain classes cannot be distinguished by any classifier [15], [16]. The DP algorithm returns a "privacy preserving" sequence which protects the correct class of the flow pairs. Zhang et al. [15] demonstrate the effectiveness, security and performance of the DP algorithm when it is used to counter traffic analysis on encrypted video streaming packets. Inspired by that, we adjust the original DP algorithms to fit the constraints of Tor traffic and explore their adaptation to defend against flow correlation attacks.

We explore two DP mechanisms, $FPA_k$ [17] and $d^*$-private [16], to apply differential perturbations on Tor traffic. The Fourier Perturbation Algorithm ($FPA_k$) provides privacy
protection for correlated time series data through the Discrete Fourier Transform (DFT) in a differentially private manner. The $d^*$-private mechanism makes the data meet the requirements of differential privacy through a series of operations [16]. Our experimental results show that both $FPA_k$ and $d^*$-private can push the TP and FP rates to the baseline result (that is, random guessing). With properly selected parameters, even when the classifiers are retrained and reinforced with noised data, $FPA_k$ and $d^*$-private can still successfully defeat the existing flow correlation attacks.

Contribution. We highlight the following three main contributions:
• We reveal the real threat of flow correlation attacks on Tor by conducting state-of-the-art attacks.
• We reveal that most obfuscation plugs are flawed in the face of flow correlation attacks and that they also face many other threats.
• We are the first to apply differential privacy to defend against flow correlation attacks. The proposed approach can greatly mitigate the attacks even when the DL model is retrained with the noised data.

Organization. We recall the background knowledge in Sec. II. Sec. III illustrates our motivation by conducting flow correlation attacks on Tor. Sec. IV describes the plug-based defense and demonstrates its limitations. Sec. V shows the defense effect of our approach. Lastly, we conclude our work in Sec. VI.

II. BACKGROUND

A. Flow correlation attacks on Tor

Flow correlation attacks on Tor link the egress and ingress flows that come from the same Tor connection by comparing the flow characteristics. The attacks can be divided into two categories. The first is based on statistical correlation metrics. It is performed by measuring the similarity or the statistical dependence of two random variables. A similarity measure is a function that quantifies the similarity between two objects. The most commonly used similarity measure for real-valued vectors is cosine similarity, which is used in multiple correlation systems [5], [7]. Statistical dependence can be used to indicate the relevance of two flows. For example, the Pearson correlation coefficient [5] reflects the degree of linear correlation between two variables.

The other is based on deep learning. Deep learning can learn the inherent laws and representation levels of samples, eliminating the need for manual feature construction. One of the most attractive deep learning structures is the CNN. Recently, researchers have begun to apply CNNs to analyze Tor traffic. A recent study by Nasr et al. [1] proposes DeepCorr, a CNN-based system which performs excellently in correlating Tor flow pairs. Compared with statistical correlation metrics-based attacks, DeepCorr is more efficient, achieving an extremely high TP rate at a low FP rate (e.g., TP is 0.8 when FP is $10^{-3}$).

B. Differential privacy

Differential privacy (DP) is concerned with whether a small change in a database can cause privacy leakage. To make it difficult for observers to detect changes by observing the output of a computation over the database, random noise is added to the computation results. A randomized algorithm $A$ gives $(\epsilon, \delta)$-differential privacy if, for any set of outputs $\Omega$ and any neighbouring datasets $D$ and $D'$, $A$ satisfies

$$\Pr[A(D) \in \Omega] \le \exp(\epsilon) \cdot \Pr[A(D') \in \Omega] + \delta \tag{1}$$

The parameter $\epsilon$ refers to the privacy budget, which is negatively correlated with the intensity of noise.

Another formal definition of DP is $(d, \epsilon)$-privacy, proposed by Chatzikokolakis et al. [18]. Specifically, a mechanism $A$ satisfies $(d, \epsilon)$-privacy if

$$\Pr(A(D) \in \Omega) \le \exp(\epsilon \times d(x, x')) \times \Pr(A(D') \in \Omega) \tag{2}$$

Here $d(x, x')$ is a function that satisfies $d(x, x) = 0$, $d(x, x') = d(x', x)$ and $d(x, z) \le d(x, y) + d(y, z)$ for all $x, y, z \in \Omega$. The $d^*$-private mechanism is introduced in Sec. V.

In our context, we explore whether the perturbations generated by algorithm $A$ can defeat flow correlation attacks. Our approach relies on two observations. The first is that the generation of perturbations can be viewed as a privacy protection problem in a specific domain [15]. The second is that DP can successfully resist most privacy attacks and provide a provable privacy guarantee [19]. So we use DP to generate a more privacy-protected representation of the input.

III. FLOW CORRELATION ATTACKS ON TOR

In this section, we illustrate our motivation with an example of how to infer user association relationships by conducting flow correlation attacks on Tor.

A. Datasets

We use the publicly available datasets of DeepCorr [1] (we pick 25000 random flows) and CAIDA 2019 anonymized Internet traces¹ (we pick 250,00 random flows, each at least 1000 packets long). The former dataset consists of timing features (represented by inter-packet delays) and packet size sequences of the ingress and egress flows. The latter dataset is raw pcap, from which we extract the Inter-Packet Delay (IPD) sequence and packet size sequence, and use the network jitter generation method used in the literature [5], [20], [21] to simulate network jitter. So the network flow, $F_i$, is represented as follows:

$$F_i = [T_i; S_i] \tag{3}$$

¹ The CAIDA dataset, https://www.caida.org/data/passive/passive_dataset.xml
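
As a minimal sketch of the representation in Eq. (3), the Python snippet below builds $F_i$ from per-packet timestamps and sizes, with $T_i$ the IPD sequence and $S_i$ the packet size sequence. The helper name `flow_representation`, the use of NumPy, and the convention that the first packet gets a delay of 0 are assumptions made here for illustration; the text does not state how the first packet is handled or how flows are truncated.

```python
import numpy as np

def flow_representation(timestamps, sizes):
    """Return a 2 x n array [T_i; S_i] for one flow (hypothetical helper)."""
    ts = np.asarray(timestamps, dtype=float)
    sz = np.asarray(sizes, dtype=float)
    if ts.ndim != 1 or ts.size == 0 or ts.shape != sz.shape:
        raise ValueError("timestamps and sizes must be non-empty 1-D sequences of equal length")
    # T_i: inter-packet delays; the first packet is assigned delay 0 (assumption).
    ipd = np.diff(ts, prepend=ts[0])
    # S_i: one packet size per packet, stacked under the delays.
    return np.vstack([ipd, sz])

# Toy flow with three packets (made-up values, not taken from either dataset).
flow = flow_representation([0.0, 0.1, 0.4], [512, 1024, 512])
```

Row 0 of `flow` holds the delays and row 1 the sizes, so an ingress/egress pair can be compared row by row.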

B. Threat model

We use the threat model of flow correlation attacks used in previous work [1], [6]. The attackers intercept egress and ingress network flows by controlling malicious Tor relays or cooperating with a malicious ISP, and try to link associated flow pairs by computing statistical correlation metrics of traffic characteristics or by using a DL model. The attackers need to determine which of the following two hypotheses is true:
• Correlated ($H_1$): $F_i$ and $F_j$ are correlated, i.e., $F_i$ and $F_j$ are from the same Tor connection and $F_j$ is a noisy version of $F_i$ naturally perturbed by the Tor network.
• Non-correlated ($H_0$): $F_i$ and $F_j$ are not correlated, i.e., $F_j$ is not a noisy version of $F_i$.

We have that:

$$\begin{cases} H_1: T_j = T_i + \delta_t;\; S_j = S_i + \delta_s \\ H_0: T_j = T^* + \delta_t;\; S_j = S^* + \delta_s \end{cases} \tag{4}$$

where $T_i$ is the IPD sequence of $F_i$ and $S_i$ is the packet size sequence of $F_i$. $T^*$ and $S^*$ are the traffic characteristics of an arbitrary flow not related to $F_i$. $\delta_t$ and $\delta_s$ are the perturbations that are naturally generated by the Tor network.

C. Attacks

In this paper, for the statistical correlation metrics-based attacks, we choose three correlation algorithms: Pearson correlation [5], cosine similarity correlation [5] and Spearman rank correlation [6]. For deep learning-based attacks, we use the structure proposed by Nasr et al. [1], which is called DeepCorr.

D. Metrics

As in previous work [1], [5], [6], we use two metrics for evaluating the performance of flow correlation attacks, namely the TP rate and the FP rate.

$$\mathrm{TPR} = \frac{TP}{TP + FN}; \quad \mathrm{FPR} = \frac{FP}{TN + FP} \tag{5}$$

The TPR represents the proportion of associated flow pairs which are correctly predicted to be associated. The FPR represents the proportion of non-correlated flow pairs that are mistakenly identified as correlated. Note that the value of the detection threshold, $\eta$, trades off FPR and TPR. So we use the ROC curve to illustrate the effect of flow correlation attacks.

E. Correlation results

We use the four attacks to correlate Tor flow pairs. As shown in Fig. 1, the TP rates of the four attacks are about 0.8 when the FP rates are about 0.01 (the x axis is $\log_{10}(FP)$). The experimental results illustrate the actual threat of flow correlation attacks on Tor.

Fig. 1. Correlation results with the four attacks.

In this section, we explained our motivation, namely the real threat of flow correlation attacks on Tor.

IV. PLUG-BASED DEFENSE

A. Verification for plug-based defense

Tor has developed a variety of obfuscation plugs, which fall into three categories: randomization, protocol mimicry and tunneling. Randomization refers to the use of encryption, random padding, and other methods to randomize the characteristics of Tor traffic, as in Obfs3 [11], Obfs4 [10] and ScrambleSuit [22]. Protocol mimicry means imitating or masquerading as popular whitelisted protocols which are rarely suspected by adversaries. For example, SkypeMorph [23] integrates traffic between Tor clients and Tor bridges into Skype traffic, and FTE (format-transforming encryption) transforms the format of arbitrary packet contents into specified formats. Tunneling is one extreme of the mimicry logic, and a typical tunneling plug on Tor is Meek [9].

At present, the most effective and commonly used obfuscation plug is Obfs4 [24]. So we use the publicly available Obfs4 dataset of DeepCorr. It consists of 500 flows over Obfs4 (with two modes of Obfs4: IAT mode "on" and "off"). IAT mode "on" means the plug obfuscates traffic features, and the value of IAT is set to 1. IAT mode "off" does not obfuscate traffic features, and the value of IAT is set to 0. Since the experiment of Obfs4 against DeepCorr has been conducted in [1], in this paper we only show the effect of Obfs4 against statistical correlation metrics-based attacks. The result is shown in Fig. 2. We can see that the obfuscation technique (Obfs4 with IAT mode "on") can indeed mitigate the statistical correlation metrics-based attacks to a certain extent. However, the defense performance is relatively poor when the obfuscation plug does not obfuscate traffic features. For example, when FPR is 0.001, the TPR of Spearman correlation reaches over 0.6.

B. Drawback

In addition to poor performance when the IAT mode is "off", the plug-based defense also has the following drawbacks.

Expanding the training dataset. Although obfuscation plugs can weaken the attack effect of DeepCorr, the training dataset used for DeepCorr is only 400 flows, and Nasr et al. [1] point out that the accuracy of DeepCorr will be much higher for a real-world adversary who collects more training flows and achieves adequate training.

Censorship. Tor has attracted the attention of censors, and Tor nodes all over the network have begun to be
blocked [3]. At present, almost nobody uses the FTE protocol. Obfs2 and Obfs3 were announced to be out of service as early as 2014 [11], [25]. ScrambleSuit can also be detected by various means, such as active detection technology [24], so it is easily blocked by censors. In addition, Zhang et al. [3] find that Meek faces the threat that coalescing egress traffic at cloud service providers increases vulnerability to flow correlation attacks.

High latency. In order to avoid slowing down connections, Tor relays refrain from obfuscating traffic features [1], and the majority of Tor bridges run Obfs4 with IAT mode "off" [26], which means they solely obfuscate packet contents, but not traffic features. It can be seen from Fig. 2 that Obfs4 without traffic obfuscation (IAT = 0) is powerless against flow correlation attacks.

Fig. 2. Plug-based defense.

Therefore, considering the above aspects, Tor needs a new defense mechanism to defeat flow correlation attacks and ensure the anonymity of users.

V. DIFFERENTIAL PRIVACY-BASED DEFENSE

We adapt two DP mechanisms, $FPA_k$ and $d^*$-private, which satisfy the two definitions of Sec. II-B, respectively. Fig. 3 shows an overview of the differential privacy-based defense.

Fig. 3. Differential privacy-based defense.

A. Incorporating Tor traffic constraints

DP algorithms are usually used to protect users' personal data, which can usually be modified arbitrarily without raising suspicion. But due to the particularity of Tor network traffic, there are certain restrictions on changing it. We list the restrictions and the corresponding adjustment strategies in Table I.

TABLE I
RESTRICTIONS AND ADJUSTMENT

| Feature | Constraint | Adjustment |
|---|---|---|
| IPD | Non-negative | Set negative values to minimum. |
| IPD | Delay sensitive | Control the noise intensities by ⌥ in $FPA_k$ and $d^*$-private. |
| Packet size | Non-negative | Set negative values to minimum. |
| Packet size | Fixed-length cell | Set the values of special Tor cell length values to 512 bytes. |

B. Fourier Perturbation Algorithm ($FPA_k$)

The Fourier Perturbation Algorithm ($FPA_k$) is a DP algorithm for time-series data that relies on the Discrete Fourier Transform (DFT). The DFT transforms the representation of an $n$-dimensional time series from the original domain $R$ to the frequency domain $F$. Specifically, the $j$th element of the frequency domain sequence $F = (F[1], \ldots, F[n])$ is given as

$$F(j) = \mathrm{DFT}(R)_j = \sum_{i=1}^{n} e^{\frac{2\pi\sqrt{-1}}{n} j i} R_i \tag{6}$$

Similarly, the Inverse Discrete Fourier Transform (IDFT) transforms the frequency domain back to the original domain (usually the time domain). The $j$th element of the original sequence $R = (R[1], \ldots, R[n])$ is given as

$$R(j) = \mathrm{IDFT}(F)_j = \frac{1}{n} \sum_{i=1}^{n} e^{-\frac{2\pi\sqrt{-1}}{n} j i} F_i \tag{7}$$

Algorithm 1 $FPA_k$ algorithm
Input: The original sequence $R$; the scale of the Laplace distribution $\lambda$; the parameter $k$ in DFT.
Output: Noised sequence $\tilde{R}$.
1: Compute $F^k = \mathrm{DFT}^k(R)$.
2: Compute $\tilde{F}[i] = F[i] + \mathrm{Lap}(\lambda)$ for $i = 1, \ldots, k$.
3: Return $\tilde{R} = \mathrm{IDFT}(\tilde{F}^k)$.

The idea of $FPA_k$ is to transform the time-series data into the frequency domain, then add differential noise to each of the first $k$ coefficients, and finally transform the result back to the original domain through the IDFT. Algorithm 1 describes the generation of the noised sequence using $FPA_k$.
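
The three steps of Algorithm 1 can be sketched in Python with NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names `laplace_scale` and `fpa_k` are hypothetical, Laplace noise is added to both the real and imaginary parts of each retained coefficient, the remaining $n - k$ coefficients are set to zero before the inverse transform, and only the real part of the result is kept. NumPy's `fft` uses the opposite sign in the exponent from Eq. (6) and indexes coefficients from 0, which does not matter here because the forward and inverse transforms are matched. The Table I adjustments are not applied.

```python
import numpy as np

def laplace_scale(k, l2_sensitivity, epsilon):
    """lambda = sqrt(k) * Delta_2(Q) / epsilon (scale of the Laplace noise)."""
    return np.sqrt(k) * l2_sensitivity / epsilon

def fpa_k(sequence, k, lam, rng=None):
    """Sketch of Algorithm 1: perturb the first k DFT coefficients of a sequence."""
    rng = np.random.default_rng() if rng is None else rng
    r = np.asarray(sequence, dtype=float)
    n = r.size
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(sequence)")
    # Step 1: keep the first k Fourier coefficients.
    coeffs = np.fft.fft(r)[:k]
    # Step 2: add Laplace noise (real and imaginary parts: an assumption).
    noise = rng.laplace(0.0, lam, size=k) + 1j * rng.laplace(0.0, lam, size=k)
    # Remaining n - k coefficients are zero-filled (assumption).
    padded = np.zeros(n, dtype=complex)
    padded[:k] = coeffs + noise
    # Step 3: inverse DFT; keep the real part as the noised sequence.
    return np.fft.ifft(padded).real

# Toy usage: synthetic delays and a placeholder sensitivity of 1.0 (both made up).
rng = np.random.default_rng(0)
ipd = np.abs(rng.normal(0.05, 0.01, size=64))
lam = laplace_scale(k=10, l2_sensitivity=1.0, epsilon=1.0)
noised = fpa_k(ipd, k=10, lam=lam, rng=rng)
```

Because only $k$ coefficients are kept, the output is a smoothed, noised version of the input; a smaller $\epsilon$ gives a larger $\lambda$ and therefore stronger perturbation.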

C. $d^*$-private mechanism

$d^*$-private chooses the following metric $d^*$ for enforcing privacy, where $x$ and $x'$ denote two sequences.

$$d^*(x, x') = \sum_{i \ge 1} \left| (x[i] - x[i-1]) - (x'[i] - x'[i-1]) \right| \tag{8}$$

$d^*$-private [27] is an algorithm which implements $d^*$-privacy. It uses noise drawn from the Laplace distribution to ensure privacy. The perturbation $r_i$ added to $x[i]$ depends on the index $i$, i.e.,

$$r_i \sim \begin{cases} \mathrm{Lap}\left(\frac{1}{\epsilon}\right) & \text{if } i = D(i) \\ \mathrm{Lap}\left(\frac{\lfloor \log_2 i \rfloor}{\epsilon}\right) & \text{otherwise} \end{cases} \tag{9}$$

where $\mathrm{Lap}(b)$ is a Laplace distribution with scale $b$ and location $\mu = 0$, and $D(i)$ represents the largest power of two that divides $i$.

D. Perturbation results

We evaluate the performance of perturbations using the ROC curve. We assume two kinds of attackers: one is the ordinary attacker, which is considered in this section; the other is a more powerful attacker who can retrain and reinforce the DL model, and is considered in the next section. The ordinary attacker does not consider whether the data is disturbed. So in this section, we train DeepCorr with clean data and test with noised data.

For the two defense mechanisms, we use the parameter $\epsilon$ as input, which represents the strength of privacy protection. The smaller $\epsilon$ is, the higher the data confidentiality, and the larger $\epsilon$ is, the higher the data availability. In order to protect data privacy, this parameter is usually set to a small value.

For $FPA_k$, we follow the setting in [15] and set $k$ to 10, so the first 10 Fourier coefficients are kept. Besides, we calculate the L2 sensitivity $\Delta_2(Q)$ of IPD and packet size respectively and generate the Laplace distribution scale $\lambda = \sqrt{k}\,\Delta_2(Q)/\epsilon$. The ROC curves for $\epsilon \in \{0.01, 0.1, 1, 10\}$ are shown in Fig. 4. For $d^*$-private, the noise is directly added to the original sequence [15], so $\epsilon$ is required to be smaller than for $FPA_k$, and $\epsilon$ is set from 1e-7 to 1e-4. Fig. 5 shows the performance of $d^*$-private. It can be seen from the figures that, with appropriately selected $\epsilon$, the two mechanisms can reduce the attacker's correlation ability to close to random guessing, indicating that the noised sequences generated by $FPA_k$ and $d^*$-private can effectively resist flow correlation attacks on Tor.

Fig. 4. The effect of $FPA_k$.

Fig. 5. The effect of $d^*$-private.

E. Retrain and reinforcement

In order to comprehensively analyze the defense effect of the DP algorithms, we assume a more powerful adversary who can obtain the noised data and retrain the DL model with them. We retrain the DeepCorr classifier with the noised sequences generated by $FPA_k$ and $d^*$-private.
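
For reference, the $d^*$-private noise of Eq. (9), one of the two sources of the noised sequences used for retraining, can be sketched as below. This sketch follows only Eq. (9) and the statement in Sec. V-D that the noise is added directly to the original sequence; any further steps of the mechanism in [16] are not reproduced, and the Table I adjustments are not applied. The function names `dstar_noise_scales` and `dstar_perturb` are hypothetical.

```python
import numpy as np

def dstar_noise_scales(n, epsilon):
    """Laplace scales for r_1..r_n following Eq. (9)."""
    scales = np.empty(n)
    for i in range(1, n + 1):
        d_i = i & -i  # D(i): largest power of two that divides i
        if i == d_i:
            # i is itself a power of two: scale 1 / epsilon
            scales[i - 1] = 1.0 / epsilon
        else:
            # otherwise: scale floor(log2 i) / epsilon
            scales[i - 1] = (i.bit_length() - 1) / epsilon
    return scales

def dstar_perturb(sequence, epsilon, rng=None):
    """Add zero-mean Laplace noise with the Eq. (9) scales to each element."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(sequence, dtype=float)
    noise = rng.laplace(0.0, dstar_noise_scales(x.size, epsilon))
    return x + noise

# Toy usage on a short synthetic sequence (values are made up).
scales = dstar_noise_scales(8, epsilon=0.5)
noised = dstar_perturb(np.zeros(8), epsilon=0.5, rng=np.random.default_rng(0))
```

The scale grows only logarithmically with the position $i$, so later elements of a long flow receive somewhat stronger noise than earlier ones.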

Fig. 6 shows the performance of $FPA_k$ and $d^*$-private under the powerful adversary. Compared with Fig. 4 and Fig. 5, DeepCorr's performance increases when $\epsilon > 0.1$ (for $FPA_k$) or $\epsilon > $ 1e-6 (for $d^*$-private). However, with appropriately selected parameters, even in the face of the powerful adversary, the two mechanisms can still successfully defeat the deep learning-based attack.

Fig. 6. The more powerful adversary.

VI. CONCLUSIONS

In this paper, we explored new methods to defend against flow correlation attacks on Tor. Existing plug-based defense is already in danger and may be blocked by censors at any time. Therefore, we utilize the techniques of differential privacy to enhance the unlinkability of Tor. Differential privacy can bring strong privacy protection to data. We apply two mechanisms which can protect time-series data in a differentially private manner. Our results show that the proposed approaches can effectively defeat the statistical traffic analysis and the deep learning classifier on Tor, even in the face of a more powerful adversary.

ACKNOWLEDGMENT

This work is supported by The National Key Research and Development Program of China (No. 2020YFB1006100, No. 2020YFE0200500 and No. 2018YFB1800200) and the Key Research and Development Program for Guangdong Province under grant No. 2019B010137003. The corresponding author is Chang Liu.

REFERENCES

[1] M. Nasr, A. Bahramali, and A. Houmansadr, "Deepcorr: Strong flow correlation attacks on tor using deep learning," in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, CCS 2018, Canada, October 15-19, 2018, pp. 1962–1976.
[2] "Tor project stem," https://stem.torproject.org/.
[3] Z. Zhang, T. Vaidya, K. Subramanian, W. Zhou, and M. Sherr, "Ephemeral exit bridges for tor," in 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2020, Valencia, Spain, June 29 - July 2, 2020, pp. 253–265. [Online]. Available: https://doi.org/10.1109/DSN48063.2020.00042
[4] I. Karunanayake, N. Ahmed, R. A. Malaney, R. Islam, and S. K. Jha, "Anonymity with tor: A survey on tor attacks," CoRR, vol. abs/2009.13018, 2020.
[5] M. Nasr, A. Houmansadr, and A. Mazumdar, "Compressive traffic analysis: A new paradigm for scalable traffic analysis," in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017, pp. 2053–2069.
[6] Y. Sun, A. Edmundson, L. Vanbever, O. Li, J. Rexford, M. Chiang, and P. Mittal, "RAPTOR: routing attacks on privacy in tor," in 24th USENIX Security Symposium, USENIX Security 15, Washington, D.C., USA, August 12-14, 2015, pp. 271–286.
[7] A. Houmansadr, N. Kiyavash, and N. Borisov, "Non-blind watermarking of network flows," IEEE/ACM Trans. Netw., vol. 22, no. 4, pp. 1232–1244, 2014.
[8] D. Luchaup, K. P. Dyer, S. Jha, T. Ristenpart, and T. Shrimpton, "Libfte: A toolkit for constructing practical, format-abiding encryption schemes," in Proceedings of the 23rd USENIX Security Symposium, San Diego, CA, USA, August 20-22, 2014, pp. 877–891.
[9] "Tor project meek," https://trac.torproject.org/projects/tor/wiki/doc/meek.
[10] "Tor project obfsproxy4," https://github.com/Yawning/obfs4/blob/master/doc/obfs4-spec.txt.
[11] "Obfs3 is blocked," https://metrics.torproject.org/userstats-bridge-combined.html.
[12] B. Wiley, "Dust: A blocking-resistant internet transport protocol," Technical report. http://blanu.net/Dust.pdf, 2011.
[13] D. Zügner, O. Borchert, A. Akbarnejad, and S. Günnemann, "Adversarial attacks on graph neural networks: Perturbations and their patterns," ACM Trans. Knowl. Discov. Data, vol. 14, no. 5, pp. 57:1–57:31, 2020.
[14] J. Mohapatra, T. Weng, P. Chen, S. Liu, and L. Daniel, "Towards verifying robustness of neural networks against A family of semantic perturbations," in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, USA, June 13-19, 2020, pp. 241–249.
[15] X. Zhang, J. Hamm, M. K. Reiter, and Y. Zhang, "Statistical privacy for streaming traffic," in 26th Annual Network and Distributed System Security Symposium, NDSS 2019, San Diego, USA, February 24-27, 2019.
[16] Q. Xiao, M. K. Reiter, and Y. Zhang, "Mitigating storage side channels using statistical privacy mechanisms," in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, Denver, CO, USA, October 12-16, 2015, pp. 1582–1594.
[17] V. Rastogi and S. Nath, "Differentially private aggregation of distributed time-series with transformation and encryption," in Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2010, Indiana, USA, June 6-10, 2010, pp. 735–746.
[18] K. Chatzikokolakis, M. E. Andrés, N. E. Bordenabe, and C. Palamidessi, "Broadening the scope of differential privacy using metrics," in Privacy Enhancing Technologies - 13th International Symposium, PETS 2013, IN, USA, July 10-12, 2013, pp. 82–102.
[19] T. Zhu, G. Li, W. Zhou, and P. S. Yu, "Differentially private data publishing and analysis: A survey," IEEE Trans. Knowl. Data Eng., vol. 29, no. 8, pp. 1619–1638, 2017.
[20] A. C. Bavier, M. Bowman, B. N. Chun, D. E. Culler, S. Karlin, S. Muir, L. L. Peterson, T. Roscoe, T. Spalink, and M. Wawrzoniak, "Operating systems support for planetary-scale network services," in 1st Symposium on Networked Systems Design and Implementation (NSDI 2004), March 29-31, 2004, California, USA, Proceedings, 2004, pp. 253–266.
[21] A. Houmansadr, N. Kiyavash, and N. Borisov, "RAINBOW: A robust and invisible non-blind watermark for network flows," in Proceedings of the Network and Distributed System Security Symposium, NDSS 2009, San Diego, California, USA, 8th February - 11th February 2009.
[22] P. Winter, T. Pulls, and J. Fuß, "Scramblesuit: a polymorphic network protocol to circumvent censorship," in Proceedings of the 12th annual ACM Workshop on Privacy in the Electronic Society, WPES 2013, Berlin, Germany, November 4, 2013, pp. 213–224.
[23] H. Mohajeri Moghaddam, B. Li, M. Derakhshani, and I. Goldberg, "Skypemorph: Protocol obfuscation for tor bridges," in Proceedings of the 2012 ACM conference on Computer and communications security. ACM, 2012, pp. 97–108.
[24] L. Wang, K. P. Dyer, A. Akella, T. Ristenpart, and T. Shrimpton, "Seeing through network-protocol obfuscation," in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, Denver, CO, USA, October 12-16, 2015, pp. 57–69.
[25] "Obfs2 out of services," https://blog.torproject.org/blog/recent-and-upcoming-developments-pluggable-transports.
[26] "Tor obfuscation mode," https://lists.torproject.org/pipermail/tor-project/2016-November/000776.html, 2016.
[27] Q. Xiao, M. K. Reiter, and Y. Zhang, "Mitigating storage side channels using statistical privacy mechanisms," in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, Denver, CO, USA, October 12-16, 2015, pp. 1582–1594.
