Summer 2021
Approved:
Jamie D. Phillips, Ph.D.
Chair of the Department of Electrical and Computer Engineering
Approved:
Levi T. Thompson, Ph.D.
Dean of the College of Engineering
Approved:
Louis F. Rossi, Ph.D.
Vice Provost for Graduate and Professional Education and
Dean of the Graduate College
I certify that I have read this dissertation and that in my opinion it meets the
academic and professional standard required by the University as a dissertation
for the degree of Doctor of Philosophy.
Signed:
Chase Cotton, Ph.D.
Professor in charge of dissertation
I certify that I have read this dissertation and that in my opinion it meets the
academic and professional standard required by the University as a dissertation
for the degree of Doctor of Philosophy.
Signed:
Haining Wang, Ph.D.
Member of dissertation committee
I certify that I have read this dissertation and that in my opinion it meets the
academic and professional standard required by the University as a dissertation
for the degree of Doctor of Philosophy.
Signed:
Xing Gao, Ph.D.
Member of dissertation committee
I certify that I have read this dissertation and that in my opinion it meets the
academic and professional standard required by the University as a dissertation
for the degree of Doctor of Philosophy.
Signed:
Fouad Kiamilev, Ph.D.
Member of dissertation committee
ACKNOWLEDGEMENTS
I would like to thank my advisor Dr. Chase Cotton for his invaluable insight
into the graduate school process, his continual encouragement, and his unwavering
belief in me. Without his help, this dissertation, and the opportunities that I will
enjoy because of it, would never have been possible.
Next I would like to thank Dr. Haining Wang and Dr. Xing Gao for their
immense help in teaching me to become a better researcher. Their long hours of
guidance and mentorship have taught me invaluable lessons about problem formulation
and presentation.
I would also like to thank Dr. Fouad Kiamilev, my very first computer engineering
professor and the first UD ECE professor I met, on a discovery day nine years ago,
for helping to put me on the path to becoming a computer engineer.
Next, I would like to thank my parents, my brother, and my wife for their
continuous support throughout my academic career. Without their encouragement I
never would have made it.
Finally, I would like to thank God for granting the countless blessings and
opportunities that have enabled me to achieve so much.
TABLE OF CONTENTS
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Chapter
1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2.3.2 Button Sequence Detection . . . . . . . . . . . . . . . . . . . 17
2.3.3 Individual Button Isolation . . . . . . . . . . . . . . . . . . . 18
2.3.4 Phone Detection . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3.5 Signal Preprocessing . . . . . . . . . . . . . . . . . . . . . . . 20
2.3.6 Animation Inference . . . . . . . . . . . . . . . . . . . . . . . 20
2.6 Countermeasures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.7 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.2.3 Preliminary Classification . . . . . . . . . . . . . . . . . . . . 55
3.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4 GPU SIDE CHANNELS . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.5.1 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.5.2 Optimizing Cache Occupancy Attack . . . . . . . . . . . . . . 100
4.5.3 Novel GPU Channel . . . . . . . . . . . . . . . . . . . . . . . 102
BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Appendix
LIST OF TABLES
3.2 Percentage of samples accepted when trained for each device model. 66
3.3 Average True Accept Rate (TAR) and True Reject Rate (TRR) for
same model device identification. . . . . . . . . . . . . . . . . . . . 68
4.1 Devices and High Power (HP) and Low Power (LP) core configurations
utilized in this work. . . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.4 Accuracy for GPU based website fingerprinting on ARM devices . . 105
LIST OF FIGURES
2.2 Power leakage on the USB power line when charging a Motorola G4.
Sampling rate is 125 kHz. The signal is filtered with a moving mean
filter to increase clarity. . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3 Averaged voltage readings for (a) Motorola G4 with LCD screen and
(b) Samsung Galaxy Nexus with AMOLED screen, when displaying
flickering white bars on the top, middle, and bottom rows, as well as
left, middle, and right columns, of a black screen. . . . . . . . . . . 12
2.6 The top displays the raw signal of multiple overlapping button presses.
The bottom demonstrates how peak detection can be utilized to
determine non-overlapping portions of individual button presses. The
signal is collected from Motorola G4 and filtered for clarity. . . . . 19
2.11 Accuracy of 6-digit passcode inference. . . . . . . . . . . . . . . . . 28
2.13 Android and iOS keyboards. Each keyboard has a similar layout, with
4 rows of buttons. Each keyboard contains a maximum of 10 buttons
per row (top row). . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.2 Histograms of read timings for 16 different USB mass storage drives.
Each plot contains 20 different samples. . . . . . . . . . . . . . . . 52
4.5 M1 MacBook Air Cache Average Memory Access Time in Safari . . 97
ABSTRACT
drive is authorized for use on the target device. We examine the robustness of this
defensive method in multiple scenarios.
Finally, we notice a trend in computing towards more mobile devices and more
accessible architectures, specifically a recent move by some laptop designers from
x86 to ARM. We identify that the driving force behind this shift is the rapid
improvement in ARM performance, power usage, and heat dissipation, partly brought
about by major modifications to the core and cache architectures. We examine whether
these major changes now enable attacks that were previously feasible mainly on x86
devices. Specifically, we examine an attack that fingerprints the websites a user
visits and use its success to construct a novel GPU-based channel for website
fingerprinting within the ARM architecture.
Chapter 1
INTRODUCTION
The landscape of computing has shifted dramatically over the last 20 years.
Users are no longer restricted to desk-bound terminals and bulky file transfer media;
computing now moves with them. Smartphones and the ubiquity of wireless internet
access have greatly increased the flow of, and access to, information, allowing new
internet-based businesses to flourish and enabling numerous quality-of-life
enhancements such as online bill payment, portfolio management, and turn-by-turn
navigation. However, these benefits do not come without a cost. The centralization
of personal and financial data onto a single mobile device has made smartphones an
attractive target for cybercriminals.
Seeking to steal credentials, spy on a user for potential blackmail, or simply
force the user to view advertisements, cybercriminals have attacked mobile phones in
many of the same ways that they attacked desktop and laptop platforms. The malware
targets some form of operating system vulnerability, exploits it, and then gains
privileged access to the device. Unlike their desktop counterparts, however, mobile
operating systems usually exert much more control over which APIs programmers can
access, vastly limiting the attack surface. In response, attackers have begun to
utilize side channel attacks, that is, attacks that observe components or shared
hardware resources whose behavior differs based on user activity.
This dissertation investigates exploiting side channels in both offensive and
defensive settings with regards to mobile devices and peripherals. Our first study
examines how mobile smartphones leak information about their screen content over
the charging cable, and I demonstrate how this power side channel can be utilized
to infer user passcodes. I then examine a defensive setting, where I propose a novel
timing side channel to defend computer systems from rogue USB flash drives. I then
examine the newer cache architectures that ARM has developed for mobile phones and
laptops/desktops, investigating how older cache-based channels from x86 can be
utilized on these newer architectures. Finally, I utilize these findings to create a
new side channel via shared access to the GPU. In Sections 1.1 to 1.3 I briefly
introduce each topic, and the organization of this dissertation is provided in
Section 1.4.
1.2 Time-Print
The increased reliance on computing technology has led to an increased need
for the security of large databases and confidential information. Where old
confidential records or critical equipment like power generators may have been kept in
physically guarded facilities where the only security concerns were fire and unauthorized
entry, new secure facilities must contend with many more security threats. To keep
the attack surface as small as possible, many of these high security areas are physically
isolated from the internet, preventing the ingress of malware from the internet that
could wreak havoc on the facility. To facilitate the transfer of information from system
to system in these areas, many facilities utilize USB flash drives.
While USB flash drives were initially an excellent lightweight and easily portable
storage medium, they have developed into a direct security threat. Attackers can store
malicious code that runs as soon as the device is plugged in, or simply utilize
unauthorized drives to copy files and take them from the facility. This chapter
investigates USB flash drives, uncovering new methods to reliably identify authorized
devices while maintaining usability.
1.4 Organization
The remainder of this dissertation is organized as follows. In Chapter 2, I
investigate power leakage of mobile phone screens, demonstrating an economical attack
that can accurately steal user passcodes. Chapter 3 examines the security of USB flash
drives, creating a timing side channel to accurately identify individual USB flash drives
and protect systems from unauthorized devices. Chapter 4 analyzes new design patterns
that are enabling ARM processors to perform competitively against x86 processors in
the laptop/desktop market and observes how previously proposed cache timing channels
from x86 can be modified to perform well on ARM. I also design and investigate a GPU
channel within the SoC. Finally, I summarize the dissertation and discuss future work
in Chapter 5.
Chapter 2
Touchscreen devices such as smartphones and tablets have become a daily tool
for a variety of business and entertainment activities, including email, banking,
browsing, gaming, and photography. While these devices have ushered in an era of
great convenience, their rich functionality has led to ever-increasing usage, draining
batteries faster and necessitating that users seek out places to charge their smartphones.
One study suggests that city dwellers charge their phones multiple times per day [6].
To allow users to conveniently charge their devices, facilities such as USB power lines
and charging stations have been widely deployed in public areas, including airports [10],
parks [2, 11], hotels [3], and hospitals [1]. The market for shareable power banks is also
thriving [7], allowing users to simply scan a QR code to rent a public power bank and
charge their devices.
Despite their convenience, USB charging interfaces and stations also introduce
a number of security threats, as the USB interface in a public area is not under the
user’s control [8]. A typical USB interface is composed of one or more (depending on
the protocol) differential data lines for data transmission and a 5V and ground line for
delivering power. Previously it has been demonstrated that the data transmitted over
the data line can be sniffed [121] or monitored through the crosstalk leakage on the
power line [157]. Adversaries can also extract power consumption data from the
power line to infer coarse-grained information, such as internet browsing history [42]
or password length [182]. These disclosed security threats, however, do not stop users
from heavily utilizing USB charging facilities in public areas, since charging usually
involves no data transfer over the USB data line.
In this work, I reveal that USB charging in public areas can pose far more serious
threats than previously believed. I show, for the first time, that the signals on the power
line form a side channel and leak far more fine-grained information than previously
believed. Specifically, the power consumption information is highly correlated with
the activities on the touchscreen. Leveraging this side channel, built on the dynamic
power signals, adversaries can precisely identify the location of virtual button presses
on the touchscreen, with which they can steal extremely sensitive data such as a user’s
passcode. I call this security threat Charger-Surfing. I conduct a series of experiments
to demonstrate the existence of fine-grained information leakage tied to smartphone
touchscreen activity. For the construction of the Charger-Surfing channel, I develop
a wireless, low cost, and portable power trace capture system using commercial-off-
the-shelf (COTS) hardware. To further demonstrate that Charger-Surfing is a real
threat, I perform a case study on a numeric passcode unlock screen and show that
Charger-Surfing is able to extract a passcode on both Android and iOS devices by
leveraging signal processing and neural network techniques. I thoroughly assess this
security threat on different types of smartphones, multiple phones of the same model,
and across different users. Our results show that Charger-Surfing can achieve an average
accuracy of 98.7% for single button inference on all the tested smartphones. For an
unknown user,1 Charger-Surfing has, on average, a 95.1% or 92.8% chance to accurately
crack a 4-digit or 6-digit passcode on its first attempt, respectively, and a 99.3% (4-digit)
or 96.9% (6-digit) success rate within five trials.
In a nutshell, this is the first work that demonstrates fine-grained information
leakage over the power line of the USB charging cable regarding the content of the
touchscreen. More importantly, our studies show that the effectiveness of Charger-
Surfing is victim-independent, meaning that adversaries can train the neural network
using touchscreen data on their own smartphones with different configurations without
1 To show the effectiveness of Charger-Surfing, the model of a target device is trained
with the data created by an adversary and tested with victim users whose data were
not used to train the model.
any prior knowledge of a victim. The major contributions of this work include:
The rest of this chapter is organized as follows. Section 2.1 presents our threat
model and a brief primer on USB charging, touchscreen technology, and touchscreen
animations. The existence of fine-grained information leakage over the USB power line is
demonstrated in Section 2.2. The security threat posed by Charger-Surfing is detailed in
Section 2.3, followed by an in-depth case study in Section 2.4. Section 2.5 discusses the
attack practicality of Charger-Surfing. Section 2.6 describes countermeasures against
Charger-Surfing. Section 2.7 surveys related work, and finally, Section 2.8 concludes
the chapter.
charging, (2) touchscreen technology, focusing on the dynamic power consumed when
displaying different colors, and (3) the dynamic content of the touchscreen that could
be potentially leaked.
private data entered through the touchscreen, such as passcodes, credit card numbers,
and banking information. To expose such threats, I demonstrate Charger-Surfing’s
capability in inferring a numeric passcode.
While there are a myriad of potential biometric lock mechanisms available
(fingerprints, faceID, etc.), many of these can be deceived [9, 5] and require a backup
PIN (personal identification number) code if they are unavailable (gloves, sweat, etc.).
Other authentication mechanisms such as Android’s pattern-based lock are not available
on all phones and have been shown to be less secure than a PIN code [16]. Thus, I
focus on the passcode-based lock as it is the most widely used primary or secondary
authentication mechanism to unlock touchscreen devices, and it acts as one of the only
barriers to gaining complete control of a smartphone.
A passcode is extremely valuable to a dedicated adversary. When a victim
can be easily identified (e.g., using a USB port at a hotel room), knowledge of the
passcode would be sufficient for an adversary with physical access (e.g., evil maid
attack [4]) to the victim’s smartphone to steal private information or even reset other
online passwords (e.g., Apple ID and iCloud passwords). Even for an adversary without
physical access (e.g., a shareable power bank), a compromised passcode could still lead
to severe consequences, as users tend to reuse their passcodes (recent studies show
that each passcode is reused around 5 times [60]) and a smartphone’s passcode may be
reused as the PIN code of a credit/debit card or online payment system (e.g., Apple
Pay or Alipay). Overall, there are many possible real scenarios, where this type of
information would be very useful to law enforcement or an adversary for espionage,
fraud, identity theft, etc.
data transmission and carry negligible current when charging the battery. Newer USB
protocols include more differential data pairs, but leave the +5VDC and ground pins the
same. When charging a device, its battery enters the charging state, and the device’s
power is supplied not from the battery but from the power source connected by the
USB power line.
[Figure 2.1: (a) USB charging station; (b) USB charging interface.]
applied to the individual pixels, the two screen technologies should have inverse values
when they are utilized to display an identical image.
Figure 2.2: Power leakage on the USB power line when charging a Motorola G4.
Sampling rate is 125 kHz. The signal is filtered with a moving mean filter to
increase clarity.

Figure 2.3: Averaged voltage readings for (a) Motorola G4 with LCD screen and
(b) Samsung Galaxy Nexus with AMOLED screen, when displaying flickering white bars
on the top, middle, and bottom rows, as well as left, middle, and right columns, of
a black screen.
USB charging cable, may leak the location on the touchscreen where a virtual button is
pressed.
experimental findings, highlights the leakage patterns, and further shows that the state
of the smartphone’s battery will not cause any attenuation effects on the side channel.
[Figure 2.4: Frequency spectra (voltage vs. frequency, 60–600 Hz) of the high-pass
filtered power signal when the phone is fully charged and when it is charging.]

2 The screen constantly refreshes all pixels at a fixed rate (typically 60 Hz), from
left to right and from top to bottom. This phenomenon can be observed with a
slow-motion camera, such as the one on an iPhone, which films at 240 frames per second.
same image on different portions of the screen. These experiments demonstrate the
great potential for inferring the location of the animation played on the screen when a
user presses a virtual button, by exploiting the power leakage on the USB power line.
Figure 2.5: Overview of Charger-Surfing’s working flow.
The voltage monitor should be able to collect the raw signal of the charging
device at a sampling frequency that is carefully determined. Utilizing a very high
frequency will result in unnecessarily large and cumbersome data, while sampling
too slowly will miss key information. There are two factors that affect the sampling
frequency: the refresh cycle of the screen and the resolution of the screen. As mentioned
in Section 2.2.2, screens typically refresh pixel by pixel, from left to right and from top
to bottom. To observe both the row and column portion of an animation, it is preferable
to sample at a rate that is slightly greater (or less) than the per row update speed, so
that (1) the power utilization can be monitored on a per row basis, and (2) samples can
be taken in different columns as the refresh moves down the screen. Most of today’s
flagship smartphones use a screen resolution between 1920×1080 and 2960×1440 and
have a refresh rate of 60 Hz. A single sample per row would require a sample rate in
the range of 115–178 kHz. Our design uses a sample rate of 125 kHz, which takes one
sample for every 0.9–1.4 rows on many flagship smartphones. This rate ensures that
consecutive samples are not taken on the same vertical line, thus providing more useful
location information.
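The arithmetic behind these rates can be checked with a short script. This is a sketch using only the resolutions, refresh rate, and sample rate quoted above; it reproduces the 115–178 kHz range and the 0.9–1.4 rows-per-sample figure.

```python
# Sanity-check the sampling-rate arithmetic from the text: a screen with R
# rows refreshing at 60 Hz updates R * 60 rows per second, so one-sample-
# per-row capture needs a matching sample rate. We also show how many rows
# elapse between consecutive samples at the chosen 125 kHz rate, and how
# many samples fall within each 60 Hz refresh cycle.
REFRESH_HZ = 60
SAMPLE_RATE_HZ = 125_000

for rows in (1920, 2960):              # vertical resolutions from the text
    row_rate_hz = rows * REFRESH_HZ    # per-row update rate
    rows_per_sample = row_rate_hz / SAMPLE_RATE_HZ
    print(f"{rows} rows -> row rate {row_rate_hz / 1000:.1f} kHz, "
          f"{rows_per_sample:.2f} rows per sample")

print(f"samples per refresh cycle: {SAMPLE_RATE_HZ // REFRESH_HZ}")  # 2083
```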
the signal belonging to a button press sequence once the level is above an empirically
determined threshold.
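A minimal sketch of such threshold-based segmentation is shown below; the function name and the toy trace are illustrative, not taken from the dissertation's implementation.

```python
def find_press_sequences(signal, threshold):
    """Return (start, end) index pairs for contiguous runs of `signal`
    at or above `threshold` -- candidate button-press sequences."""
    windows, start = [], None
    for i, value in enumerate(signal):
        if value >= threshold:
            if start is None:
                start = i               # a sequence begins
        elif start is not None:
            windows.append((start, i))  # the sequence just ended
            start = None
    if start is not None:               # sequence runs to the end of the trace
        windows.append((start, len(signal)))
    return windows

# Example: two separate above-threshold bursts in a toy trace.
trace = [0.1, 0.1, 0.9, 1.1, 0.8, 0.1, 0.1, 1.2, 1.0, 0.1]
print(find_press_sequences(trace, threshold=0.5))  # [(2, 5), (7, 9)]
```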
Figure 2.6: The top displays the raw signal of multiple overlapping button presses. The
bottom demonstrates how peak detection can be utilized to determine non-overlapping
portions of individual button presses. The signal is collected from a Motorola G4 and
filtered for clarity.
While steps 1–3 performed up to this point are generally applicable to all
smartphones, step 4 of Charger-Surfing focuses on detecting the phone type. This
task is much easier than classifying individual button presses as the screen technology,
the screen resolution, and different components within the phone (CPU, GPU, screen
driver, etc.) lead to vastly different power trace patterns, as demonstrated in Figure 2.3.
To accomplish this identification task, I utilize a neural network that is trained with
the isolated button press signals. The raw signal is passed through a high-pass filter
to preserve the high-frequency components, which are highly correlated with the phone
model, while removing the less informative DC offsets that can result from brightness
changes, charging state (charging vs. fully charged), or different charging rates.
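As a rough illustration of this filtering step, a single-pole high-pass filter suppresses the DC offset while passing fast transients. The dissertation does not specify the exact filter design; the recurrence and the `alpha` constant below are illustrative assumptions.

```python
def high_pass(signal, alpha=0.95):
    """Single-pole high-pass filter: y[n] = alpha * (y[n-1] + x[n] - x[n-1]).
    A constant (DC) input decays toward zero; sharp changes pass through."""
    out = [0.0]                         # filter output starts at rest
    for i in range(1, len(signal)):
        out.append(alpha * (out[-1] + signal[i] - signal[i - 1]))
    return out

# A pure DC offset is fully suppressed...
print(max(abs(v) for v in high_pass([0.7] * 20)))   # 0.0
# ...while a step change produces a decaying transient.
print(high_pass([0.0, 0.0, 1.0, 1.0, 1.0]))
```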
As the victim’s phone model may not belong to the set that the attacker utilized
to train Charger-Surfing, the system further examines the confidence values of each
output class when inferring the phone model. If the confidence values are all low, it
will not pass the samples to the phone-specific neural networks for classification.
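This open-set rejection can be sketched as a confidence gate over the classifier's softmax outputs. The function names and the 0.9 threshold below are hypothetical illustrations, not values from the dissertation.

```python
import math

def softmax(logits):
    """Convert raw network outputs into class confidences summing to 1."""
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def accept_phone_model(logits, min_confidence=0.9):
    """Pass the sample on to a phone-specific classifier only when the
    top-class confidence clears the rejection threshold; otherwise None."""
    confidences = softmax(logits)
    best = max(range(len(confidences)), key=confidences.__getitem__)
    if confidences[best] >= min_confidence:
        return best, confidences[best]
    return None

print(accept_phone_model([8.0, 0.5, 0.3, 0.1]))  # accepted: class 0, high confidence
print(accept_phone_model([1.0, 0.9, 0.8, 0.7]))  # None: unknown phone model
```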
phones with different screen resolutions, features of button presses appear at different
locations of the power signal. CNNs are well suited to recognize features that can be
found in any area of a signal.
Model Classifier Configuration. An important consideration of any CNN
is the size of the convolutional kernels. Small kernels may not be able to recognize
features that manifest themselves over a large portion of the input signal, while large
kernels may be too coarse, missing the fine details and features of an input signal.
The ideal size of the convolutional kernels depends on the size of the features in
the power trace, which in turn depends on the sampling rate, screen layout, and size
of the animation to be detected. If one desires to classify individual keys on the device
text entry keyboard, for example, it would be necessary to calculate the size of the key
press animation with respect to the screen size and modify the kernel size accordingly.
This allows the first layer of the network to capture features that are large enough to
identify a button press, while not being so large as to oversimplify or miss a feature,
and not being so small as to only capture noise. Furthermore, our CNN design adopts a
typical architecture consisting of sets of a convolutional layer followed by a max-pooling
layer, which potentially increases the receptive field3 of the network. This allows the
subsequent layers of the network to leverage the highlighted features and correlate their
location across multiple frames of the signal when inferring the key press.
3 The receptive field is the portion of the input signal affecting the current
convolutional layer.
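The growth of the receptive field through stacked convolution and pooling layers can be computed with a standard recurrence. The layer sizes in the sketch below are illustrative only; the actual network configurations appear in Appendix A.

```python
def receptive_field(layers):
    """Receptive field of the last layer in a 1-D convolutional stack.
    `layers` is a list of (kernel_size, stride) pairs; each layer grows the
    field by (kernel - 1) times the product of all earlier strides."""
    field, jump = 1, 1
    for kernel, stride in layers:
        field += (kernel - 1) * jump   # widen by this layer's kernel reach
        jump *= stride                 # later layers step over more input
    return field

# Illustrative stack: a size-50 conv, then alternating pool and conv layers.
stack = [(50, 1), (2, 2), (25, 1), (2, 2)]
print(receptive_field(stack))  # 101
```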
low-cost hardware implementation of the Charger-Surfing attack, its insensitivity to
different smartphone configuration variables (wallpapers, brightness, vibration, charging
status), and the transferability of the attack between different smartphones of the same
model. In total, I gather data from 33 volunteers4 and on 6 different devices. Our
participants are about 30% female, including members of varied races, heights, and
weights. The age of our participants ranges from 20 to 60 years old. This section utilizes
the data of 15 volunteers and four devices, while Section 2.5 uses an additional set of
18 volunteers and two devices.
4 The human-user-involved experiments have been filed and approved by the
Institutional Review Board (IRB) to ensure participants are treated ethically.
[Figure 2.7: (a) iPhone, (b) Android.]
user was tasked to input a pre-determined sequence of 200+ buttons on the numerical
lock screen. The sequence was designed to gather a uniform distribution of button
presses such that no button had a disproportionate number of samples.
Our data collection utilizes a modified charging cable and a Tektronix MDO4024C
oscilloscope. The charging cable is modified by cutting the ground wire and inserting a
0.3Ω resistor. The oscilloscope is used to measure the voltage drop across the resistor,
providing a fine-grained and repeatable method of observation. It is configured to
sample at a rate of 125,000 samples per second.
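The shunt measurement follows Ohm's law: the voltage drop across the 0.3 Ω resistor is proportional to the instantaneous charging current. The calculation below is a quick illustration, not a measurement from the study.

```python
R_SHUNT_OHMS = 0.3  # series resistor inserted into the cable's ground wire

def charging_current(voltage_drop):
    """Infer the charging current from the shunt voltage drop (I = V / R)."""
    return voltage_drop / R_SHUNT_OHMS

# e.g. a 0.15 V drop across the shunt implies a 0.5 A charging current
print(charging_current(0.15))  # 0.5
```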
rate of 2,083 samples per frame5 , the most pertinent features for button identification
are within 208 (iPhone) - 416 (Android) samples. Thus, when considering the receptive
field of the network, I choose an initial kernel size of 50 for the iPhone network and 100
for the Android network. This sizing configuration ensures that I capture the smaller
features of the signal in the initial layers of the network while still considering both
the larger features of the signal in intermediate layers and the location on the screen
across multiple frames of animation in the final layer. Detailed network configurations
are listed in Tables A.2 and A.3 in Appendix A.
Our threat model assumes that adversaries are unable to obtain the victim’s
data before training the system, and thus can only train the classifier using their own
collected data. To emulate this scenario, I divide the users into two separate sets: one
set for training (i.e., adversary) and the other set for testing (i.e., victim). To examine
the robustness of the network to the composition of the training data, I randomly
select five users to create the training set. The remaining 10 users form the testing set,
ensuring that there is no overlap between the training and testing users. I train five
neural networks for each device such that the ith (1 ≤ i ≤ 5) network is trained with
the data from i different users.
In testing, each network’s performance is evaluated on the 10 testing users, and
the average accuracy is reported.
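The disjoint adversary/victim split can be sketched as follows; the helper name, seed, and user IDs are hypothetical illustrations.

```python
import random

def split_users(user_ids, n_train=5, seed=0):
    """Randomly select n_train users as the adversary's training set; the
    remainder become the (disjoint) victim test set."""
    rng = random.Random(seed)          # fixed seed for a repeatable split
    users = list(user_ids)
    rng.shuffle(users)
    return users[:n_train], users[n_train:]

train, test = split_users(range(15))   # 15 users, as in this section
print(len(train), len(test))           # 5 10
```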
5 The power trace signal is sampled at 125 kHz, and the lock screen refreshes at a
rate of 60 Hz. Under this configuration, 2,083 samples are gathered within each
refresh cycle. Each sample contains information about the content of the screen
progressing vertically, as the screen refreshes from top to bottom.
Table 2.1: Single Button Accuracy

  # of Training Users   Motorola G4   Galaxy Nexus   iPhone 6+   iPhone 8+
  1                     82.0%         50.0%          23.8%       44.6%
  2                     90.0%         95.0%          93.3%       67.1%
  3                     99.6%         99.1%          96.9%       88.7%
  4                     99.7%         99.4%          98.5%       94.5%
  5                     99.9%         99.6%          99.5%       95.8%
(a) Press button on the left side. (b) Press button on the right side.
I train a primary neural network using high-pass filtered data from a subset
of the collected users and test on the data from the remaining users. Our results
show that the network can determine the correct phone model 100% of the time. This
identification step is also applicable to phones that might run multiple OS versions.
Different OS versions would be detected and classified at this step before being passed
to the more specific secondary neural networks.
results for different phones, ranging from 23.8% for iPhone 6+ to 82.0% for Motorola
G4. Once I increase the training data size to two users, however, there is a significant
accuracy improvement for single button inference: 67% for iPhone 8+ and more than
90% for all the other phones. The increasing accuracy trend is mainly attributed to the
differences in user behavior when interacting with touchscreens, which can have direct
effects on the power usage of the screen. More specifically, Android devices demonstrate
spatial and temporal variations while iOS devices demonstrate temporal and processing
variations. On the Android lock screen, the screen plays an animation that depends
on where users place their finger. An example of this scenario is shown in Figure 2.8,
where a user placing the finger on the left or right side of the button can create different
animations. Furthermore, the longer the user holds their finger in this position, the
larger the darker white circle grows. On iOS devices, when users press a button on the
lock screen, no matter where exactly they press it, the entire button lights up completely
and immediately. This animation does not end until the user removes their finger,
imparting temporal variations to the recorded power trace. Furthermore, devices newer
than the iPhone 6S (such as the tested iPhone 8+) make use of so-called “3D-Touch”
to measure the force of the screen press. This extra processing and information further
introduces subtle noise or processing variations into the measured signals.
The aforementioned user-oriented uncertainties and randomness can be dramati-
cally mitigated by integrating more users into the training process. Once the neural
network is presented with a robust dataset demonstrating diverse user behaviors, these
abnormalities can be recognized and classified correctly. Table 2.1 confirms that by
training on four users’ data, Charger-Surfing can achieve more than 94% accuracy when
classifying the single button presses of new users (i.e., the victims) for all devices. The
average accuracy across all four test phones for single button inference further reaches
98.7% when there are five training users. By this point, the improvements demonstrate
diminishing returns as more users are included. This indicates that our system only
requires a few users’ training data to achieve near optimal accuracy.
Figure 2.9: Breakdown of actual and predicted button classifications for the Galaxy
Nexus when trained with one user’s data. An entry on row i and column j corresponds
to button i being classified as j.
[Figure 2.10: Accuracy of 4-digit passcode inference after the (a) 1st, (b) 5th,
and (c) 10th trials.]

[Figure 2.11: Accuracy of 6-digit passcode inference after the (a) 1st, (b) 5th,
and (c) 10th trials.]
Table 2.2: Cumulative Accuracy of 3 Classification Attempts for Single User Trained
Model

  Attempts   Motorola G4   Galaxy Nexus   iPhone 6+   iPhone 8+
  1          82.0%         50.0%          23.9%       44.6%
  2          86.6%         63.0%          40.6%       57.3%
  3          89.0%         72.0%          51.9%       65.5%
to produce the top candidates for each press and then construct combinations of the
top candidates to produce guesses for the passcode.
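One way to realize this strategy is to enumerate combinations of each position's top candidates, ranked by the product of their per-position confidences. The sketch below is a hypothetical illustration; the dissertation's actual ranking procedure may differ.

```python
from itertools import product

def passcode_guesses(per_digit_probs, top_k=3):
    """per_digit_probs: one dict per passcode position mapping digit -> confidence.
    Returns candidate passcodes, most likely first."""
    # Keep only each position's top-k candidates to bound the search.
    tops = [sorted(p.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
            for p in per_digit_probs]
    candidates = []
    for combo in product(*tops):
        code = "".join(digit for digit, _ in combo)
        score = 1.0
        for _, conf in combo:
            score *= conf               # joint confidence of this guess
        candidates.append((code, score))
    candidates.sort(key=lambda cs: cs[1], reverse=True)
    return [code for code, _ in candidates]

# Toy 2-digit example: '7' then '3' are each position's strongest candidates.
probs = [{"7": 0.8, "1": 0.15, "9": 0.05},
         {"3": 0.6, "8": 0.3, "0": 0.1}]
print(passcode_guesses(probs)[:3])  # ['73', '78', '13']
```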
Figure 2.10 illustrates the accuracy for 4-digit passcode inference. I utilize the
networks trained in Section 2.4.4, where each phone is trained on its own network with
i (1 ≤ i ≤ 5) users. Figures 2.10 (a), (b), and (c) show the accuracy results after the
first, fifth, and tenth trials, respectively.
In a brute force attack scenario, the success rate on the first trial is only 0.01%.
By contrast, with only one user in the training set, Charger-Surfing achieves an average
success rate of 13.9% on the first trial and a 20.8% success rate after the 10th trial.
Clearly, there is a strong trend towards improved accuracy as the number of training users
increases, showing that with more users, Charger-Surfing can develop a more general
and accurate model that is robust against irregularities caused by user interactions
with the smartphone. When two users are involved in training, the average success rate
increases substantially, scoring 59.5% on the first trial and 75.8% by the tenth trial.
This improvement trend continues but slows down as more users are included. Finally, it
achieves an average success rate of 95.1% on the first trial and 99.5% on the tenth trial
when trained with five users. The diminishing return indicates a strong convergence of
Charger-Surfing’s inference accuracy with only a few users in the training set.
6-digit passcode: I further evaluate the effectiveness of Charger-Surfing when
cracking a longer, 6-digit passcode. As in the 4-digit case, I select 1,000 random
6-digit combinations and test them against our inference system. Figures 2.11 (a),
(b), and (c) illustrate the accuracy after the first, fifth, and tenth trials, respectively.
Although the search space for a 6-digit passcode is much larger (a 6-digit passcode
has 1,000,000 combinations), Charger-Surfing demonstrates high success rates similar
to those achieved when cracking a 4-digit passcode. When trained on five users, the
success rate of the first trial is greater than 90% for all phones except the iPhone 8+,
which has an accuracy of 77.0%. Even for iPhone 8+, the success rate then increases to
90.3% after the fifth trial, and the accuracy for all phones is more than 96% by the
tenth trial. In comparison to a brute force approach that has a success rate of 0.001%
Figure 2.12: Impact of different sampling rates on single button accuracy, based on
3-user data of Motorola G4.
within ten trials, Charger-Surfing is more than 96,000 times more effective.
Table 2.3: Impact of sampling frequency on row, column, and overall classification accuracy, based on 3-user data of Motorola G4.

Frequency (KHz)   Row     Column   Overall
62.5              99.4%   99.4%    99.3%
31.3              99.8%   99.6%    99.5%
15.6              98.5%   92.4%    92.3%
10.4              94.1%   62.3%    61.3%
7.8               85.3%   46.9%    43.0%
6.3               59.5%   38.5%    26.0%
3.9               30.8%   33.4%    9.9%
Figure 2.13: Android and iOS keyboards. Each keyboard has a similar layout, with 4
rows of buttons. Each keyboard contains a maximum of 10 buttons per row (top row).
row and column accuracy degradation6 as the sampling rate decreases. The results are
listed in Table 2.3. It turns out that the column accuracy is the limiting factor. While
the row accuracy remains above 94% even at 10.4KHz, the column accuracy degrades
from 99.5% at 31.3KHz to 62.3% at 10.4KHz. Such a result is consistent with the screen
refresh behavior: as the screen refreshes row by row and from left to right on each
row, the row signal changes much slower than the column signal. Thus, a decreased
sampling rate can still capture the row signal, but becomes incapable of fully capturing
the column signal.
6. Row (column) accuracy is defined as the percentage of classifications that fall within the correct row (column) (e.g., a ‘1’ that is misclassified as a ‘2’ is still in the correct row).
2.4.8 Detection Granularity Analysis
So far I have demonstrated that by monitoring the power usage of a charging
smartphone, an adversary can extract the location of animations on the touch screen,
compromising a user’s passcode. Another particularly enticing target is the onscreen
virtual keyboard. Each press of the keyboard provides feedback to the user by either
displaying an enlarged version of the pressed character or by darkening the pressed key.
Thus, an adversary with a voltage monitoring setup might attempt to infer a user’s
input by locating and classifying the animations of the onscreen keyboard. However,
one important question remains: is Charger-Surfing able to achieve sufficient precision
for classifying smaller animations on the screen?
To gain a better understanding of the achievable precision of Charger-Surfing, I
examine the relationship among animation positioning, animation size, and inference
accuracy at different sampling rates. Specifically, the results in Table 2.3 show that the
column accuracy is the limiting factor in classification accuracy. Using the examples of
the onscreen keyboard in Figure 2.13, I can see that both iOS and Android keyboards
have a maximum of 10 columns (top row) that must be classified accurately. Table 2.3
shows that a sampling rate of 31.3 KHz is required to accurately classify 3 columns.
Thus, to classify 10 columns, the sampling rate should be increased by at least 10/3
times to around 105 KHz.
While this sampling rate ensures that the signal contains enough information,
it is equally important to tune the filter size in the neural network for identifying the
patterns present in the data. As previously discussed in Section 2.4.2, the convolutional
kernels must be sized such that they are smaller than the number of samples that
encompass the animation. For example, in the iOS keyboard presented in Figure 2.13a,
each key takes up about 1/17th of the vertical space on the screen. Using the sampling
rate determined above, of 105 KHz, 1,750 samples are taken during each screen refresh.
Thus, each keypress animation can be recorded in about 103 samples. Leveraging
our experience in training the CNN for passcode inference (a kernel size of 50 for 208
samples, as described in Section 2.4.2), a kernel size close to 25 should provide an
adequate starting point for tuning the network to detect keyboard press animations.
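The sizing arithmetic in this section can be double-checked with a short calculation. The 60 Hz refresh rate is the assumption implied by the 1,750 samples-per-refresh figure; all other constants come from the text above.

```python
# Required sampling rate: 31.3 kHz resolves 3 columns (Table 2.3), so ~10
# columns need at least 10/3 as much.
required_rate_hz = 31.3e3 * (10 / 3)              # ~104.3 kHz
sampling_rate_hz = 105e3                          # rounded up, as in the text

refresh_rate_hz = 60                              # assumed screen refresh rate
samples_per_refresh = sampling_rate_hz / refresh_rate_hz        # 1,750 samples

keys_per_screen_height = 17                       # each key ~1/17 of the screen
samples_per_key = samples_per_refresh / keys_per_screen_height  # ~103 samples

# Scale the passcode CNN's kernel (50 taps over 208 samples, Section 2.4.2)
# to the shorter keypress animation span.
kernel_size = round(samples_per_key * 50 / 208)   # ~25 taps
```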
Figure 2.14: The portable, low-cost data collection setup. A WiFi enabled microcon-
troller can send acquired data to a custom webserver in real-time.
Table 2.4: Single Button and Passcode Inference Accuracy (5 training users / 15 testing users).

Single Button        Passcode
Attempt   Press      Trial   4-Digit   6-Digit
1         98.6%      1       94.9%     92.4%
2         99.4%      5       97.4%     94.9%
3         99.6%      10      97.5%     96.3%
faster sampling and bulky setup (e.g., an oscilloscope). For cracking a 4-digit passcode,
the system achieves an average accuracy of 94.9% in the first attempt and 97.4% by the
fifth attempt. The results of cracking a 6-digit passcode are also promising: an average
accuracy of 92.4% in the first attempt and 96.3% by the tenth attempt.
configurations. I gather data from a Motorola G4, varying one setting at a time: changing the wallpaper (two different wallpapers), modifying the brightness (0%, 50%, 100%), using an uncharged phone, and enabling haptic feedback. I then test the data against the
network trained with 5 users in Section 2.5.1. The results listed in Table 2.5 indicate
that the configuration difference has very little impact upon the inference accuracy,
which remains above 97% for single button inference in all cases. This demonstrates
that Charger-Surfing is quite robust against device configuration changes.
Table 2.5: Single Button Inference Accuracy (5 training users / 1 testing user) with Varied Configurations.

Static Configuration    Accuracy (1st Attempt)
Wallpaper 1             99.3%
Wallpaper 2             98.0%
Brightness 0%           98.0%
Brightness 50%          97.3%
Brightness 100%         100%
Charge (uncharged)      99.2%
Haptics (enabled)       100%
Table 2.6: Cross-device training and testing configurations.

Training (Phone A)              Testing (Phone B)
Users: 1, 2                     Users: 3-12
Wallpaper: 1, 2, 3         ⇒    Wallpaper: 4
100 presses of each button      Balanced 200-button sequence
Total: 6,000 presses            Total: 2,000 presses
Table 2.7: iPhone 6+ cross-device testing classification results. 2 training users on an iPhone 6+ and 10 testing users on a different iPhone 6+.

Single Button        Passcode
Attempt   Press      Trial   4-Digit   6-Digit
1         99.1%      1       96.5%     94.6%
2         99.4%      5       97.4%     95.6%
3         99.4%      10      97.4%     96.2%
Table 2.8: iPhone 8+ cross-device testing classification results. 2 training users on an iPhone 8+ and 10 testing users on a different iPhone 8+. High initial accuracy meant that subsequent attempts realized minimal improvement.

Single Button        Passcode
Attempt   Press      Trial   4-Digit   6-Digit
1         99.7%      1       99.0%     98.6%
2         99.8%      5       99.1%     98.6%
3         99.8%      10      99.1%     98.7%
8+, are presented in Tables 2.7 and 2.8, respectively, demonstrating that both cross-
device tests achieve greater than 99% accuracy on the first attempt when classifying
single buttons and greater than 94% accuracy when classifying 6-digit passcodes. Note
that the accuracy results here are slightly higher than those in the oscilloscope-based
experiments shown in Section 2.4. This slight difference could be caused by the different
iOS versions (the oscilloscope experiments are performed on older iOS versions), or
oscilloscope vs ADC quantization at low voltages.
Overall, this set of experiments clearly indicates that Charger-Surfing works well
not only across different users but across different devices of the same model, posing a
real security threat.
2.6 Countermeasures
Our experiments show that on different smartphones, Charger-Surfing is highly
effective at locating the button presses on a touchscreen and inferring sensitive informa-
tion such as a user’s passcode. While it would be difficult to completely fix the leakage
channel, which is related to USB charging and hardware, there exist some possible
countermeasures.
The side channel exploited by Charger-Surfing leaks information about dynamic
motion on the touchscreen. This attack is effective largely because the layout of the lock screen is fixed: the buttons for a passcode are in the same positions every time the screen is activated. By contrast, randomizing a number’s position on the keypad for code entry would likely hamper Charger-Surfing’s ability to detect a user’s sensitive
information. However, this position randomization may inconvenience users as it will
take more time for them to locate each button. Furthermore, this approach scales poorly;
randomizing a keyboard layout, for example, would be highly undesirable to users.
Likewise, smartphone vendors could remove button input animations, a change that would significantly reduce the information leakage on the power line but would leave users with minimal feedback as to whether they have correctly pressed the
intended button. While both features are available in some customized versions of
Android, they are not widely deployed in currently available devices.
At first glance, one likely solution is not to eliminate the leakage, but to drown
it out via noise. One such option would be to utilize a moving background such as
the readily available live/dynamic wallpapers on Android/iOS, which act similarly to
videos and constantly animate the screen. While this idea seems initially attractive, it
has a few major drawbacks: 1) the live wallpaper only works on the lock or home screen
and would not prevent similar attacks against onscreen keyboards in applications, and
2) the noise generated by this system is random and can be filtered out with sufficient
samples. In a preliminary study of this defense technique, I built a neural network
trained with 100 samples per button taken with two live wallpapers and tested on
another live wallpaper. The network was able to realize greater than 98% single button
accuracy, demonstrating that with sufficient samples of live wallpapers, Charger-Surfing
can discern the true user input signal from the noise signal of the moving background.
To fully address the leakage channel exploited by Charger-Surfing, one solution
is to eliminate the leakage channel by inserting a low pass filter in the charging circuitry
of the device. This modification will remove the informative high frequency component
from the signal. In a preliminary testing, I applied a low-pass filter with a cutoff of
60Hz to the collected iPhone 6+ cross-device data and the accuracy dropped to 10%
(expected accuracy of random guessing). This result demonstrates that this approach
can effectively mitigate the information leakage that Charger-Surfing relies upon.
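A minimal sketch of this countermeasure's signal-processing core: a first-order IIR low-pass filter. The dissertation does not specify the filter design, so the single-pole structure, the 60 Hz cutoff, and the sampling rate below are illustrative assumptions.

```python
import math

def low_pass(samples, sample_rate_hz, cutoff_hz):
    """Single-pole IIR low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = dt / (rc + dt)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)   # exponential smoothing toward the input
        out.append(y)
    return out

# A 2 kHz component (well above the 60 Hz cutoff) is strongly attenuated,
# while the slow/DC charging component passes through largely unchanged.
rate = 31300
tone = [math.sin(2 * math.pi * 2000 * i / rate) for i in range(2000)]
filtered = low_pass(tone, rate, 60.0)
```

An analog filter in the charging circuitry would achieve the same effect in hardware; this digital version merely illustrates why a 60 Hz cutoff destroys the high-frequency screen-refresh component that the classifier depends on.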
Until an effective countermeasure is widely adopted, it is important for users
to be increasingly aware of the security threats associated with USB charging. Users
should avoid inputting a passcode or other sensitive information while charging their
smartphones in public or shared environments.
research work in the following four areas:
Power analysis. Extensive efforts have been devoted to analyzing the power consump-
tion of smartphones [132, 131, 31, 112]. Carroll et al. [36] presented a detailed analysis
showing that the touchscreen is one of the major consumers of power in a smartphone.
Furthermore, many works [41, 55, 188] attempt to understand the energy consumed by
the touchscreen.
The power consumption of a smartphone could be exploited as a side channel
to extract information such as mobile application usage [42] or password length [180].
Yang et al. demonstrated that public USB charging stations allow attackers to identify
the webpages being loaded when a smartphone is being charged [182]. Michalevsky
et al. [116] demonstrated that power consumption could be used to infer the location
of mobile devices. Spolaor et al. [154] showed that the USB charging cable can be
used to build a covert channel on smartphones by controlling a CPU-intensive app
over 20 minutes. To the best of my knowledge, I am the first to show that the power
consumption of a smartphone can be used to infer animations on a touchscreen and
steal sensitive data, such as a user’s passcode.
Other side channel attacks. Chen [40] demonstrated that the shared procfs in the
Linux system could be exploited to infer an Android device’s activities and launch UI
inference attacks. Without procfs (e.g., iOS devices), attackers can still infer sensitive
information and private data by exploiting exposed APIs [189]. Genkin et al. [65]
acquired secret-key information from electromagnetic signals by attaching a magnetic
probe to a smartphone. Radiated RF signals can also be used to eavesdrop screen
contents remotely [114]. Recent research [67] has also shown the possibility of inferring
broad information on large computer monitors via acoustic emanations from the voltage
regulator. Similar to traditional computers, smartphones are also vulnerable to classical
cache-based side-channel attacks [191]. Our work differs from these prior works by
showing much finer grained information leakage of screen animation locations through
the power line.
to extract fine-grained information such as user passcodes.
While Ethernet-over-power-line techniques have been utilized in both homes and data centers [38], Guri et al. demonstrated the possibility of building covert channels
over a power line [75]. Prior research has also shown that power consumption information
can lead to various privacy issues, including key extraction on cryptographic systems [94]
and laptops [68], state inference of home appliances [59], webpage identification of
computers [45] and laptop user recognition [50]. Unlike these attacks, our work classifies
ten on-screen animations in real time, directly exposing precise user input over the
charging port.
2.8 Summary
This paper reveals a serious security threat, called Charger-Surfing, which
exploits the power leakage of smartphones to infer the location of animations played on
the touchscreen and steal sensitive information such as a user’s passcode. The basic
mechanism of Charger-Surfing monitors the power trace of a charging smartphone and
extracts button presses by leveraging signal processing and neural network techniques
on the acquired signals. To assess the security risk of Charger-Surfing, I conduct a
comprehensive evaluation of different types of smartphones and different users. My
evaluation results indicate that Charger-Surfing is victim-independent and achieves
high accuracy when inferring a smartphone passcode (an average of 99.3% and 96.9%
success rates when cracking a 4-digit and 6-digit passcode in five attempts, respectively).
Furthermore, I build and test a portable, low-cost power trace collection system to
launch a Charger-Surfing attack in practice. I then utilize this system to demonstrate
that Charger-Surfing works well in real settings across different user configurations and
devices. Finally, I present different countermeasures to thwart Charger-Surfing and
discuss their feasibility.
Having demonstrated the offensive capabilities of side channels, I next turn our
attention to developing a side channel based defense. Specifically, I realize that many
computer systems are vulnerable to a curious user problem, where well-intentioned users
may plug unknown USB devices into a computer, potentially infecting it with malware.
I design a system to thwart this type of attack in the following chapter.
Chapter 3
The Universal Serial Bus (USB) has been a ubiquitous and advanced peripheral
connection standard for the past two decades. USB has standardized the expansion of
computer functions by providing a means for connecting phones, cameras, projectors,
and many more devices. Recent advancements in USB have increased data transfer
speeds above 10 Gbps, making the USB mass storage device (flash drive) a popular
method for moving data between systems. In particular, USB is commonly used in air-gapped systems where security policies prohibit data transfer via the Internet, such as
military, government, and financial computing systems [33, 125, 134].
While USB has made the usage and development of various peripheral devices
far simpler, it has recently been scrutinized for security issues [21, 58, 76, 86]. USB
is an inherently trusting protocol, immediately beginning to set up and communicate
with a peripheral device as soon as it is connected. This has many advantages, as users
are not required to undertake a difficult setup process, but has recently been exploited
by attackers to compromise host systems. The discovery of Stuxnet [58], Flame and
Gauss [97] has demonstrated that malware can be designed to spread via USB stick.
Unwitting and curious employees might pick up dropped (infected) flash drives and plug them into their computers, allowing the malicious code on the drives to infect the
hosts and then propagate across the network, wreaking havoc on the targeted industrial
control systems. More recently, attackers have investigated the ability to modify the
firmware of a USB device [76, 86] such that an outwardly appearing generic USB flash
drive can act as an attacker-controlled, automated, mouse and keyboard. The behavior
of the USB driver can also be utilized as a side-channel to fingerprint a host device and
launch tailored drive-by attacks [21, 52]. While many defense mechanisms have been
proposed, these techniques generally require user input [165], new advanced hardware
capabilities [24, 160], or utilize features (device product ID, vendor ID, or serial number)
that could be forged by an advanced attacker with modified firmware [12, 86].
In this paper, I propose a new device authentication method for accurately
identifying USB mass storage devices. I reveal that read operations on a USB mass
storage device contain enough timing variability to produce a unique fingerprint. To
generate a USB mass storage device’s fingerprint, I issue a series of read operations to
the device, precisely record the device’s response latency, and then convert this raw
timing information to a statistical fingerprint. Based on this design rationale, I develop
Time-Print, a software-based device authentication system. In Time-Print, I devise a
process for transforming the raw timing data to a statistical fingerprint for each device.
Given device fingerprints, Time-Print then leverages one-class classification via K-Means
clustering and multi-class classification via neural networks for device identification. To
the best of our knowledge, this is the first work to expose a timing variation within USB
mass storage devices, which can be observed completely in software and be utilized to
generate a unique fingerprint1.
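The one-class classification path mentioned above (K-Means clustering over legitimate fingerprints, with acceptance by distance to the nearest centroid) can be approximated as follows. This is a sketch under stated assumptions: the number of clusters k, the 1.1 margin factor, and all function names are illustrative rather than Time-Print's actual parameters.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two fingerprint vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: dist(p, centroids[i]))].append(p)
        centroids = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids

def enroll(fingerprints, k=2):
    """Fit centroids on legitimate fingerprints; derive a distance threshold."""
    centroids = kmeans(fingerprints, k)
    dists = [min(dist(f, c) for c in centroids) for f in fingerprints]
    return centroids, max(dists) * 1.1   # small margin; a tunable parameter

def accept(fingerprint, centroids, threshold):
    """Accept a device iff its fingerprint lies near a legitimate cluster."""
    return min(dist(fingerprint, c) for c in centroids) <= threshold
```

Enrollment runs once per approved device population; at connection time, only the cheap `accept` check is needed.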
To validate the efficacy of Time-Print, I first provide evidence that statistical
timing variations exist on a broad range of USB flash drives. Specifically, I gather
fingerprints from more than 40 USB flash drives. Then I examine three common security
scenarios, assuming attackers with increasing levels of knowledge about the targeted victim: (1) identifying known/unknown devices with different models,
(2) identifying seen/unseen devices within the same model, and (3) classifying individual
devices within the same model. I demonstrate compelling accuracy for each case, greater
than 99.5% identification accuracy between known/unknown devices with different
brands and models, 95% identification accuracy between seen and unseen drives of the
same model, and 98.7% accuracy in classifying individual devices of the same model.

1. USB Type-C has provisions to identify device models [172] via a specialized key system; Time-Print does not make use of any specialized hardware and works on both legacy and new devices.
I finally examine the robustness of Time-Print in multiple hardware configurations.
I observe that Time-Print experiences a small accuracy degradation when measured on
different USB ports, hubs, and host systems. I also examine the stability of Time-Print
and present a strategy to make the fingerprints robust to write operations. Additionally,
I investigate the authentication latency of Time-Print, demonstrating that while precise
authentication can be achieved in 6-11 seconds, an accuracy greater than 94% can be
achieved in about one second.
The major contributions of this work include:
• The first work to demonstrate the existence of a timing channel within USB mass
storage devices, which can be utilized for device fingerprinting.
• A thorough evaluation of more than 40 USB mass storage devices, showing that
the ability to fingerprint with high accuracy is not dependent upon the device
brand, protocol, or flash controller.
The remainder of this chapter is organized as follows. Section 3.1 describes the
threat model, including an attacker’s capabilities, and provides a primer on the USB
protocol, USB mass storage devices, and USB security threats/defenses. Section 3.2
demonstrates the existence of a fingerprintable timing channel within USB mass storage
devices. Section 3.3 details the method for generating and gathering a USB mass storage
fingerprint. Section 3.4 presents the experimental setup for evaluation. Section 3.5
evaluates the Time-Print system. Section 3.6 examines the practicality of Time-Print
under different use configurations. Section 3.7 surveys related work in USB security,
device fingerprints, and device authentication. Finally, Section 3.8 summarizes the
chapter.
3.1 Threat Model and Background
This section presents the threat model and introduces various components of the
new timing-based side-channel, including the USB protocol stack, USB mass storage
devices, and current USB security.
not without their drawbacks. Attacks such as Stuxnet [58] were injected into target
systems via USB, and recent research has demonstrated the creation of malicious USB
devices which can negatively affect system security [21, 76, 86].
I then assume that attackers attempt to compromise the target air-gapped
computer via USB drives. Attackers have the ability to design malicious USB devices
so that once the USB handshake is completed, malicious scripts or activities can
be executed on the host. According to the organization’s security policies, system
administrators only issue access to a few approved USB devices (i.e., insider devices)
belonging to particular brands and models (e.g., SanDisk Cruzer Blade). Thus, a USB
fingerprinting mechanism must be integrated into the host to accept/classify approved
USB devices and reject other devices. For a specific air-gapped computer system, system
administrators can train fingerprints for all approved devices. Also, they can pre-collect
multiple devices from popular brands or models to augment the device authentication
system with examples of unapproved drives.
With these settings in mind, I envision three typical scenarios as shown in Fig-
ure 3.1, in which Time-Print offers enhanced security benefits for device authentication.
Note that Time-Print is designed to augment current USB security, and it can greatly
assist existing USB security mechanisms such as GoodUSB [165] and USBFilter [168].
Scenario 1: Attackers have no knowledge of the approved USB devices, and
thus a random USB device could be connected to the target host. Such a random USB
device likely does not belong to one of the approved device models. Time-Print should
thus reject any device whose model is not approved. In this minimal knowledge scenario,
administrators can also prevent system infection by irresponsible employees who plug non-approved devices (e.g., dropped devices) into computers in an open environment (e.g., reception computers).
Scenario 2: Attackers (e.g., former employees who are aware of the security
measure) know the brand and model of the approved USB devices and purchase one with
the same brand and model. Time-Print should be able to reject unseen devices of the
same brand/model.
Scenario 3: Auditing user authentication. A system administrator should have
the ability to identify specific devices that were issued to employees. For approved
devices, different authorization levels might be assigned. In this case, the system
administrator needs to audit which specific devices are connected to the target system
to trace employee activities and detect data exfiltration attacks. Therefore, Time-Print
should be able to classify all approved devices with high confidence.
Attacker Capabilities. I examine Time-Print against attackers at multiple
levels. A weak attacker may simply attempt to plug a device into the victim system
with little knowledge (e.g., Scenario 1). A stronger attacker may know the device model
allowed at the victim side and attempt to connect a device of the same model (e.g.,
Scenario 2/3). The strongest attacker may be able to steal a legitimate device and
attempt to replicate the physical fingerprint with an FPGA based system. While the
FPGA based system may present different firmware, the firmware for current USB flash
drives is a closely guarded and proprietary secret. I do not consider a case in which
an attacker is able to significantly modify the firmware of a (stolen) legitimate device.
In addition, I exclude authorized users who attempt to maliciously harm their own computing systems. This is a reasonable assumption, as authorized users with privileges to access any system resource likely have little need for mounting such a
complicated USB attack.
Defender Preparations. To use Time-Print, defenders (e.g., system admin-
istrators) should first have a security policy for limiting the employee usage of USB
devices to specific models. Then, they need to gather fingerprint samples for their
legitimate devices to enroll them into Time-Print beforehand.
Figure 3.1: Three security scenarios of USB fingerprinting for device authentication.
cameras, network adapters, Bluetooth, etc. The later introduced USB 3.0 [79] standard
offers an increased 5 Gbit/s data rate and additional support for new types of devices.
Also, USB 3.0 devices are backwards compatible with USB 2.0 ports, but at 2.0’s speed.
USB 3.1 [80] further increases the data transfer rate to 10 Gbit/s with a modified power
specification that increases the maximum power delivery to 100W [13]. In this paper, I
focus on USB devices with standards 2.0, 3.0, and 3.1.
the flash storage of the device. Flash storage is generally made up of many blocks. As
flash has a limited write endurance and is usually designed in such a way that individual
bits cannot be selectively cleared, the flash controller typically conducts a series of
operations to modify the stored data in the flash medium. It first locates a new unused
block and copies the data from the old block to the new block while incorporating any
data changes. The flash controller then marks the old block as dirty, and eventually
reclaims these dirty blocks as part of the garbage collection process. The controller
(as the ‘flash translation layer’) maintains the mapping information between logical
addresses (addresses used by the host system to access files) and the physical addresses
of the actual pages, and the frequent remapping of blocks is an invisible process to the
host system. Thus, the time required for the USB mass storage device to access large chunks of data is potentially unique and suitable for fingerprinting the device.
Figure 3.2: Histograms of read timings for 16 different USB mass storage drives. Each
plot contains 20 different samples.
device and its host. If the flash controller of one device can respond faster or slower
than that of a different device, it is possible that this variation can be used to identify
a device. Furthermore, if a large chunk of data is requested from the device, the flash
translation layer may access multiple locations to return all of the data at once. The
time taken for this action (e.g., consult translation table, access one or multiple flash
blocks within the device, coalesce data, respond to host) may also create observable
timing differences.
devices do not vary significantly enough to create a unique profile. In addition, the file
contents of the same device greatly influence the behaviors of the device enumeration
process, such as addresses, sizes, and the number of packets. Therefore, the device
enumeration process cannot be leveraged to generate a reliable fingerprint.
Figure 3.3: Time-Print workflow. The host issues extra read SCSI commands to the USB device, then performs timing acquisition, preprocessing, fingerprint generation, and identification.
3.3 Time-Print Design
In this section, I detail the design and implementation of Time-Print and describe
how Time-Print generates device fingerprints. In general, Time-Print extends the USB
driver to generate a number of extra reads on randomly chosen blocks on USB devices via
the SCSI commands (as shown in Figure 3.3) and then measures the timing information
of these read operations. The process of Time-Print consists of four steps, namely, (1)
performing precise timing measurements, (2) exercising the USB flash drive to generate
a timing profile, (3) preprocessing the timing profile, and (4) conducting classification
based on the timing profile for device acceptance/rejection.
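A userspace approximation of steps (1)-(3) is sketched below. The real system issues extra SCSI reads from inside the USB driver and timestamps them with the TSC; this sketch substitutes ordinary file reads and time.perf_counter_ns, and in practice the OS page cache would need to be bypassed (e.g., with direct I/O) so that latencies reflect the flash controller rather than memory.

```python
import os
import random
import time

def read_timings(path, n_reads=64, block_size=4096):
    """Steps (1)-(2): issue reads at random block offsets, record latency in ns."""
    timings = []
    fd = os.open(path, os.O_RDONLY)
    try:
        n_blocks = max(os.fstat(fd).st_size // block_size, 1)
        for _ in range(n_reads):
            offset = random.randrange(n_blocks) * block_size
            start = time.perf_counter_ns()
            os.pread(fd, block_size, offset)
            timings.append(time.perf_counter_ns() - start)
    finally:
        os.close(fd)
    return timings

def fingerprint(timings, n_bins=20):
    """Step (3): normalize latencies into a histogram -> statistical fingerprint."""
    lo, hi = min(timings), max(timings)
    width = (hi - lo) / n_bins or 1          # guard against identical timings
    hist = [0] * n_bins
    for t in timings:
        hist[min(int((t - lo) / width), n_bins - 1)] += 1
    return [c / len(timings) for c in hist]
```

Step (4), acceptance or rejection, then compares such a histogram vector against enrolled fingerprints using the classifiers described later in this section.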
Figure 3.4: Timing of a USB mass storage transaction: command, data, and status phases are exchanged between host and peripheral, with timing information recorded across the transfer.
Time-Print leverages a low-overhead, high-granularity timing source, the CPU timestamp counter (TSC), which is a monotonic 64-bit register present in all recent
x86 processors. While initially designed to count at the clock speed of the CPU, most
recent systems implement a ‘constant TSC’, which ticks at a set frequency regardless
of the actual CPU speed. This feature enables Time-Print to precisely time the data
transmission phase, regardless of the underlying CPU frequency. I utilize the built-in
kernel function rdtsc() both before and after each transaction to record the precise
amount of time it takes for the execution of each interaction.
With the collected timing information, Time-Print further integrates a low-
overhead storage and reporting component for this timing information. This component
modifies the USB driver to maintain a continuous stream of timing information for the
drive. Specifically, I augment the us data structure present in the USB storage header
to contain arrays to keep track of command opcode, size, address, and TSC value for
each transaction.
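The bookkeeping described above can be sketched in userspace Python. Here `time.perf_counter_ns()` stands in for the kernel's `rdtsc()` reads, and `TimingRecord` mirrors the arrays added to the `us` structure; all names are illustrative rather than the actual driver code.

```python
import time
from dataclasses import dataclass, field

@dataclass
class TimingRecord:
    opcode: int   # SCSI command opcode (e.g., 0x28 for READ(10))
    size: int     # transfer size in bytes
    address: int  # logical block address targeted by the command
    start: int    # timestamp at command submission
    end: int      # timestamp at completion

@dataclass
class TimingLog:
    records: list = field(default_factory=list)

    def time_transaction(self, opcode, size, address, do_io):
        """Record timestamps around one host<->device transaction."""
        start = time.perf_counter_ns()  # stand-in for the kernel's rdtsc()
        result = do_io()
        end = time.perf_counter_ns()
        self.records.append(TimingRecord(opcode, size, address, start, end))
        return result

log = TimingLog()
log.time_transaction(0x28, 16 * 1024, 0, lambda: b"\x00" * (16 * 1024))
rec = log.records[0]
print(rec.end >= rec.start)  # -> True
```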
Device Manufacturer | Device Name      | Size | Flash Controller         | Number of Devices | USB Protocol
SanDisk             | Cruzer Blade     | 8GB  | SanDisk                  | 10                | USB 2.0
Generic             | General UDisk    | 4GB  | ChipsBank CBM2199S       | 10                | USB 3.0
SanDisk             | Ultra            | 16GB | SanDisk                  | 10                | USB 3.0
Samsung             | BAR Plus         | 32GB | Unknown                  | 4                 | USB 3.1
PNY                 | USB 3.0 FD       | 32GB | Innostor IS902E A1       | 1                 | USB 3.0
Kingston            | DataTraveler G4  | 32GB | SSS 6131                 | 1                 | USB 3.0
Kingston            | DataTraveler SE9 | 64GB | Phison PS2309            | 1                 | USB 3.0
PNY                 | Elite-X Fit      | 64GB | Phison PS2309            | 1                 | USB 3.1
SMI                 | USB Disk         | 64GB | Silicon Motion SM3269 AB | 1                 | USB 3.0
SMI                 | USB Disk         | 64GB | Silicon Motion SM3267 AE | 1                 | USB 3.0
SanDisk             | Cruzer Switch    | 8GB  | SanDisk                  | 1                 | USB 2.0
SanDisk             | Cruzer Glide     | 16GB | SanDisk                  | 2                 | USB 2.0
Table 3.1: USB mass storage devices utilized in the evaluation of Time-Print.
To transfer the timing values and record them (for prototype purposes), I
implement a character device within the USB storage driver to transfer the timing
information to the userspace for further processing. Since accessing the TSC is designed
to be a low overhead function, the induced overhead is negligible (more discussion on
the overhead is presented in Section 3.6). To ensure minimal performance impact, once
a device has been approved, the timing and storage functionality can be disabled.
status. I need to capture and record the timing values for each packet from the host’s
perspective. Specifically, a timestamp is recorded upon the entry and exit of each of
the two functions listed above. Each timestamp also includes the following meta-data:
command opcode, the size of the packet, and the offset the data is coming from. The
preprocessing step of Time-Print filters any commands that are not read commands
from the recording, and searches for the beginning of the commands from the read
script to discount any packets that are issued as part of the drive enumeration. As the
goal of the fingerprint system is to focus specifically on the time it takes for the drive
to access blocks of the USB device, not the timing between packets, I calculate the
time latency between when the host finishes sending the command packet and when
the host finishes receiving the data response packet from the drive.
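As a sketch (with an assumed packet-trace format, not the actual driver log), the filtering and latency computation might look like:

```python
READ_OPCODES = {0x08, 0x28, 0x88}  # SCSI READ(6), READ(10), READ(16)

def read_latencies(packets):
    """packets: (opcode, phase, timestamp) tuples, where phase is 'cmd_done'
    (host finished sending the command packet) or 'data_done' (host finished
    receiving the data response). Non-read commands are filtered out, and the
    latency is the gap between the two phases of each read."""
    latencies = []
    pending = {}
    for opcode, phase, ts in packets:
        if opcode not in READ_OPCODES:
            continue  # drop writes, enumeration traffic, etc.
        if phase == 'cmd_done':
            pending[opcode] = ts
        elif phase == 'data_done' and opcode in pending:
            latencies.append(ts - pending.pop(opcode))
    return latencies

trace = [(0x2A, 'cmd_done', 5),    # WRITE(10): filtered out
         (0x28, 'cmd_done', 10),   # READ(10) command sent
         (0x28, 'data_done', 42)]  # data fully received
print(read_latencies(trace))  # -> [32]
```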
The next step is to organize this raw timing information, which contains timing
data from a multitude of locations and sizes. I group them into separate bins where
each contains one size and address offset. Grouping the timing results by read size and
offset ensures that each timing sample within a group corresponds to a single action or
group of actions within the drive, allowing for meaningful statistical analysis.
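The grouping step can be sketched as follows; the sample tuples and helper names are illustrative:

```python
from collections import defaultdict

def group_by_size_and_offset(samples):
    """samples: iterable of (size, offset, latency) tuples. Bin the latencies
    so that each group holds measurements of one (size, offset) action."""
    groups = defaultdict(list)
    for size, offset, latency in samples:
        groups[(size, offset)].append(latency)
    return groups

def feature_vector(groups):
    """Collapse each group to its mean latency, in a fixed key order,
    yielding a 1D feature list per sample."""
    return [sum(v) / len(v) for _, v in sorted(groups.items())]

samples = [(16384, 0, 100), (16384, 0, 110), (32768, 0, 200)]
print(feature_vector(group_by_size_and_offset(samples)))  # -> [105.0, 200.0]
```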
3.3.4 Classification
With the timing information grouped by size and offset, I can leverage features
and machine learning techniques to create a fingerprint for each device. Based on the
trained fingerprints, Time-Print can reject or accept devices. For the different security
scenarios mentioned in the threat model, Time-Print uses different algorithms for better
performance. Section 4.6 further presents the details for different scenarios.
3.4.1 Experimental Devices
I utilize the following devices and system configurations to gather fingerprints.
Host System, OS, and Driver Modifications. Our host system is a DELL
T3500 Precision tower. The system contains an Intel Xeon E5507 4-core processor with
a clock speed of 2.27GHz and 4GB of RAM. The USB 2.0 controllers are Intel 82801JI
devices. I utilize a Renesas uPD720201 USB 3.0 controller (connected via PCI) for
USB 3.0 experiments.
The host runs Ubuntu 18.04 LTS and I modify the USB storage drivers as
detailed in Section 3.3.1.² Namely, I modify the USB driver to record the timing
information for the start and completion of each USB packet transmission that is a
part of the USB storage stack. Each time a device is connected, a data structure is
created to store the timestamp and packet metadata information. This data structure
is deleted upon device disconnect. A character device is inserted into the USB driver
code to facilitate the transfer of this timing information to log files after the completion
of drive fingerprinting operations.
USB Devices. I test the performance and applicability of Time-Print with 12
unique USB models and 43 different USB devices. Table 3.1 lists the device manufacturer,
name, size, controller, number of devices, and protocol for every device used in our
experiments. I select these brands to create a broad dataset that contains a number of
the most popular devices on the market (purchased by users on Amazon as of September
2020). Each device is analyzed with no modifications to the firmware of the device.
To ensure fairness, all devices are zeroed and formatted as FAT32 with an
allocation size of 4KB, and are identically named as ‘USB 0’. I extract the device
controller name by using Flash Drive Information Extractor [153]. Of note, SanDisk
does not publicly identify the versions of their flash controllers and simply reports the
name ‘SanDisk’.
² Since Time-Print is entirely software-based, it could reasonably be extended to macOS
and Windows with cooperation from developers.
Figure 3.5: Flow of generating 1D features from the raw fingerprint samples of a drive
as used for different model identification (top) and 2D features as used for individual
device classification (bottom). Raw samples are grouped by read size and location;
group means feed a K-means classifier for Scenario ❶, while per-group histograms feed
neural networks for Scenarios ❷ and ❸.
USB Hub and Ports. To facilitate testing of the USB drives, I utilize an
Amazon Basics USB-A 3.1 10-Port Hub that I connect to the inbuilt USB 2.0 Intel
82801JI hub on the host for USB 2.0 experiments and to the Renesas uPD720201 USB
3.0 hub for USB 3.0 testing.
To better explain the overall testing methodology, I further present the sample
acquisition process with an example of 10 different USB drives. Before testing, each
port on the USB hub is disabled such that no power is provided to a plugged-in device.
I then plug each drive into a port on the USB hub and record the mapping of the hub
port to drive ID (to match each sample to a specific drive). The fingerprint gathering
script enables the first port on the USB hub and waits for the USB driver process to be
launched. Upon launch, the driver process is isolated to a single core of the CPU to
ensure maximum timing precision. Next, I launch the fingerprinting script that initiates
a series of reads of different sizes and in different locations on the drive. The returned
data is not recorded because only the timing information of these reads is important.
Once the collection script completes, I mount the character device and write all of the
recorded timing information to a log file. The system then unmounts the character
device and USB device and disables the USB port to simulate unplugging the device. I
also simulate non-idle system states: the Linux stress utility is run to fully utilize one
CPU core on every other sample. The above process is repeated for the next port on
the USB hub. All drives are tested in a round-robin fashion.
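A userspace sketch of the read script is shown below. It reads from a plain file path for illustration; on a real run the path would be the drive's block device node, and the sizes, span, and alignment are assumed values rather than the exact parameters used in the experiments.

```python
import os, random

READ_SIZES = [16 * 1024, 32 * 1024, 64 * 1024]  # assumed read sizes

def fingerprint_reads(path, reads_per_size=5, span=1024 * 1024, seed=0):
    """Issue reads of several sizes at randomly chosen, 4KB-aligned offsets.
    The returned data is discarded; only the (size, offset) schedule matters,
    because the modified driver records the timing of every read."""
    rng = random.Random(seed)  # fixed seed: the same blocks every session
    schedule = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for size in READ_SIZES:
            for _ in range(reads_per_size):
                offset = rng.randrange(0, span - size, 4096)
                os.pread(fd, size, offset)  # data itself is thrown away
                schedule.append((size, offset))
    finally:
        os.close(fd)
    return schedule
```

On a real run, `path` would point at the disk exposed by the USB mass storage driver and `span` would cover the drive's capacity.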
Once 20 fingerprints have been gathered from each drive, I physically unplug
each drive and plug it into a different port on the USB hub; this ensures that any
difference observed in the readings is caused by the individual USB drives, not the USB
port.
3.4.4 Training and Testing Datasets
As mentioned above, in our experiments, fingerprints are gathered from the devices in
a round-robin fashion, in sets of 20. After collecting 20 fingerprints for all drives, all
devices are physically unplugged and then plugged into different ports. I thus refer to a
group of 20 fingerprints as a ‘session’ of data. For all devices listed in Table 3.1, I gather
4 sessions of fingerprints (i.e., 80 fingerprints). I then conduct 4-fold cross-validation by
selecting 3 sessions for training, and 1 session for testing.
sample distance by examining the features of each sample, and groups the samples
into clusters. Once the algorithm converges, I calculate the distance of each training
sample to its closest cluster. The maximum distance value is then used to set a decision
boundary. In this case, for a fingerprint to be accepted by the clustering algorithm,
it must be within the decision boundary of one of the pre-trained clusters. I first
preprocess each sample into different chunks by separating each reading based on the
size and location offset of the measurement. With the size and locations grouped, I
calculate the mean of each group, generating a 1D feature list for each sample, as
illustrated in the upper part of Figure 3.5.
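A minimal, pure-Python sketch of this one-class scheme (the feature values are illustrative; the actual implementation and its hyperparameters may differ) is:

```python
import math, random

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means over feature vectors (lists of floats)."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: dist(p, centers[i]))].append(p)
        centers = [[sum(c) / len(cl) for c in zip(*cl)] if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers

def train_boundary(train_samples, k=2):
    """Decision boundary: the maximum distance of any training sample to its
    nearest cluster center."""
    centers = kmeans(train_samples, k)
    radius = max(min(dist(p, c) for c in centers) for p in train_samples)
    return centers, radius

def accept(sample, centers, radius):
    """A fingerprint is accepted only if it falls inside the boundary of one
    of the pre-trained clusters."""
    return min(dist(sample, c) for c in centers) <= radius

train = [[100.0, 200.0], [101.0, 201.0], [99.0, 199.0], [100.5, 200.5]]
centers, radius = train_boundary(train, k=1)
print(accept([100.2, 200.1], centers, radius))  # -> True (near the cluster)
print(accept([150.0, 250.0], centers, radius))  # -> False (outside boundary)
```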
Training and testing. I train the one-class classifier on four types of devices:
the Generic drives (10 devices), the Samsung Bar Plus (4 devices), the SanDisk Ultra
(10 devices), and the SanDisk Cruzer Blade (10 devices). I then test the classifier
against all other devices listed in Table 3.1. For clarity of presenting the results, I group
all extra devices with the USB 3.X protocol into a set called ‘other USB3’, and all extra
devices with the USB 2.0 protocol into a set called ‘other USB2’.
For example, to test the accuracy for the Generic Drives, I have four sessions (80
fingerprints in total) of data for all ten devices in this model. For Generic Drive #1, I
train the classifier using three sessions of data and test the classifier using the remaining
one session of data, and the data from all other devices from different brands/models. I
repeat the experiment for each Generic device and report the average accuracy.
Accuracy. The results are presented in Table 3.2, showing very high accuracy:
an average true accept rate of 99.5% while rejecting all drives of different models and
brands (i.e., zero false accept rate). As mentioned in the threat model, Time-Print is
mainly designed for use in a high-security system. Such a system should always reject
unknown models to minimize security risks. While the true accept rate of 99.5% may
still reject a legitimate device with a very small probability on the first trial, the user
can simply re-plug the USB drive and re-authenticate with the system. The probability
of being rejected twice in a row is only 0.0025%. In other words, the probability of a
legitimate device being accepted after two trials is 99.9975%, which is very close to one.
                     |                     Training Devices
Testing Devices      | Generic | SanDisk Cruzer Blade | Samsung Bar Plus | SanDisk Ultra
Generic              | 99.9%   | 0.0                  | 0.0              | 0.0
SanDisk Cruzer Blade | 0.0     | 98.8%                | 0.0              | 0.0
Samsung Bar Plus     | 0.0     | 0.0                  | 99.7%            | 0.0
SanDisk Ultra        | 0.0     | 0.0                  | 0.0              | 99.9%
Other USB2           | 0.0     | 0.0                  | 0.0              | 0.0
Other USB3           | 0.0     | 0.0                  | 0.0              | 0.0
Table 3.2: Percentage of samples accepted when trained for each device model.
Overall, these results show that Time-Print can accurately distinguish unknown
devices with different brands and models from legitimate devices.
for neural networks (e.g., 0 to 1). Specifically, I convert the data from each group to
a histogram, with all data scaled by the group's global minimum and maximum values
taken over the entire training set. Such a method creates a fine-grained representation
of the signal. This choice also makes sense because large reads take much longer to
complete than short reads, so a full-range histogram would contain a large number of
unimportant zero values. To ensure experimental integrity, the per-group minimum and
maximum ranges are recorded and reused to process the testing set.
Each histogram can be represented as a 1D vector of measurement frequency,
and the histograms for all groups can be concatenated together to create a 2D input
vector to the classification network. This process is illustrated in the lower part of
Figure 3.5. Another advantage of the histogram and neural network combination is
that the network can rapidly be tuned to work for different drives, since the number of
histogram bins, readings per size and location, or input trace can easily be adjusted
while maintaining a consistent preprocessing pipeline.
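A sketch of this preprocessing, with assumed group keys and bin counts, is:

```python
def train_ranges(train_groups):
    """Per-(size, offset) min/max over the entire training set; the same
    ranges are later reused to scale the testing set."""
    return {key: (min(vals), max(vals)) for key, vals in train_groups.items()}

def to_histogram(values, lo, hi, bins=10):
    """Scale values by the group's training range, then bin them into a 1D
    frequency vector; out-of-range test values land in the edge bins."""
    hist = [0] * bins
    span = (hi - lo) or 1
    for v in values:
        idx = int((v - lo) / span * bins)
        hist[min(max(idx, 0), bins - 1)] += 1
    return hist

def fingerprint(groups, ranges, bins=10):
    """Concatenate the per-group histograms into a 2D network input."""
    return [to_histogram(groups[k], *ranges[k], bins) for k in sorted(ranges)]

train_groups = {(16384, 0): [100, 110, 120], (32768, 0): [200, 260]}
ranges = train_ranges(train_groups)
print(fingerprint(train_groups, ranges, bins=4))  # -> [[1, 0, 1, 1], [1, 0, 0, 1]]
```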
Training and testing. To achieve accurate identification, system administra-
tors can purchase multiple devices from the same brand and model to serve as ‘malicious’
devices to train the classifier. I emulate this scenario by examining the SanDisk Cruzer
Blade, SanDisk Ultra, and the Generic drives. I have 10 devices for each model. Among
the 10 devices, for training, one device is selected as the ‘legitimate’ drive, and 8 of
the remaining 9 devices are chosen as ‘malicious’ drives; then the last is used as the
‘unseen’3 device for testing purposes. During training, I use 60 samples of each drive
involved. During testing, I utilize the remaining 20 samples of each ‘legitimate’ drive
and 20 samples of each ‘unseen’ drive. To ensure fairness and remove any influence
of randomness, I test all 90 possible combinations (10 possible ‘legitimate’ drives × 9
possible ‘unseen’ drives) and cross-validate each by rotating the samples utilized for
training and testing.
³ The 'unseen' device is equivalent to an attacker's 'malicious' device; I use a
different term to differentiate the malicious device in testing from those used in training.
        | Generic         | SanDisk Cruzer Blade | SanDisk Ultra
        | TAR     TRR     | TAR      TRR         | TAR      TRR
Raw     | 92.2%   93.8%   | 96.5%    89.2%       | 97.6%    90.6%
Augment | 97.3%   91.7%   | 98.0%    93.5%       | 98.7%    91.4%
Table 3.3: Average True Accept Rate (TAR) and True Reject Rate (TRR) for same
model device identification.
Accuracy. Table 3.3 presents the results, showing a compelling average true
accept rate (TAR) of 95.4% and an average true reject rate (TRR) of 91.2%.
After investigating the false acceptances, I find that most false acceptances occur
in pairs. I realize that the problem of classifying an unknown drive is likely to benefit
from synthetic data. Augmenting the training set with random variations (in an attempt
to simulate more unknown devices), or with samples from more ‘malicious’ devices
may better solidify the decision boundary of the network, leading to higher overall
accuracy. I also augment the samples of the ‘legitimate’ drives, albeit with much smaller
perturbations, to increase the true accept rate. I randomly select samples from the
training set and perturb them with noise. This augmentation procedure improves the
results, increasing the overall average accuracy to 95%. More specifically, the average
true accept rate increases to 98.0%, and the average true reject rate increases to 92.2%.
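A sketch of the augmentation step (the noise model and scales are illustrative assumptions, not the exact parameters used):

```python
import random

def augment(samples, n_new, scale, seed=0):
    """Create n_new synthetic samples by perturbing randomly chosen training
    samples with Gaussian noise of the given scale."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_new):
        base = rng.choice(samples)
        out.append([v + rng.gauss(0.0, scale) for v in base])
    return out

train = [[105.0, 210.0], [103.0, 207.0]]
legit_aug = augment(train, n_new=4, scale=0.5)      # small perturbations
malicious_aug = augment(train, n_new=4, scale=5.0)  # simulate unseen devices
print(len(legit_aug), len(malicious_aug))  # -> 4 4
```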
Overall, these results indicate that this approach has enough information to
uniquely fingerprint USB drives and that Time-Print can even detect unseen devices of
the exact same brand and model.
Device Name (# of Devices)  | Classification Accuracy
SanDisk Cruzer Blade (10)   | 98.6%
Generic Drive (10)          | 99.1%
SanDisk Ultra (10)          | 98.7%
Samsung Bar Plus (4)        | 98.4%
Table 3.4: Classification accuracy for individual devices within each model.
The network architecture is the same as in Scenario ❷, as shown in Table B.1 of the
Appendix. Since the goal is to identify each individual
drive, I modify the final output layer of the network to contain the same number of
neurons as devices that I attempt to classify. I utilize the same histogram transformation
from Scenario ❷, where each sample is separated by size and location and then converted
to a histogram for utilization in the neural network.
Similarly to Scenario ❷, I train and test (with cross-validation) a classifier for
each model (i.e., only drives in one model are trained and tested), as I expect that
an organization that adopts a device authentication system like Time-Print will limit
the usage of USB drives to a particular model. The classification results are listed in
Table 3.4 for the SanDisk Cruzer Blade, Generic, SanDisk Ultra, and Samsung Bar Plus
devices. Time-Print achieves accuracy above 98.4% across these varied devices,
including those from some of the best selling manufacturers (SanDisk and Samsung).
Furthermore, the data for SanDisk and the Generic devices demonstrates that the
variability between drives is rich enough to create distinct classification boundaries
among different drives. Finally, this data shows that USB fingerprinting is not limited
to a single manufacturer or USB protocol. In short, Time-Print is able to fingerprint a
USB drive within the same brand and model for accurate classification.
utilized for fingerprinting, and how Time-Print might be deployed in the real world.
Figure 3.6: Classification accuracy degradation as the number of samples is reduced
(10 SanDisk Ultra USB 3.0 drives).
3.6.2 Fingerprints with Hardware Variation
When the fingerprint data is acquired, it must pass through a myriad of system
components. For example, the data transmission, beginning with the USB drive, must
go through any ports and hubs along its path, through the USB controller on the
motherboard, and finally through the bridge between the motherboard USB controller
and the processor. Each of these system components may contain varying levels of
routing logic and create timing variations in the fingerprint. As such, I conduct several
experiments to understand the impact of hardware variations on fingerprint accuracy.
Different Ports and Hubs. To understand the impact of using different ports
and hubs, I utilize the training
data from Section 4.6, but gather new testing sets with both Generic and SanDisk Blade
devices. I conduct two tests: (1) the USB hub is plugged into a different host port
and (2) another Amazon Basics USB-A 3.1 10-Port Hub is used to test the accuracy of
these configurations with the classifier and training data of Scenario ❸. I observe that
utilizing a different host port or a different hub slightly reduces the accuracy from 99%
to about 95% for the Generic devices but has no effect on the SanDisk Blade devices.
Different Host. I further investigate the impact of different host machines:
can the same fingerprint be transferred between different host machines? I expect to
see a degradation in accuracy as many factors (e.g., variations in the clock speed of the
processor, motherboard, etc.) are likely to alter the fingerprint. To assess the impact,
I gather a dataset on a second host system with a different configuration (system
comparison is listed in Table 3.5) using both the Generic and SanDisk Blade devices.
Again, I utilize Scenario ❸ as an example to measure the accuracy degradation.
The main difference between the two host systems lies in the different CPUs.
The TSC tick rate (i.e., the rate at which the TSC increments) is directly dependent on
the base clock speed of the CPU. Thus, I prescale the data gathered on the testing machine
by multiplying the timing values by a factor of 0.7386, which is the ratio of 2.26 GHz
on our training machine to 3.06 GHz on the testing machine.
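The prescaling step amounts to a single multiplicative correction:

```python
train_clock_ghz = 2.26  # base clock of the training machine (Xeon E5507)
test_clock_ghz = 3.06   # base clock of the testing machine (Xeon W3550)

factor = train_clock_ghz / test_clock_ghz
print(round(factor, 4))  # -> 0.7386

def prescale(timings, factor):
    """Map TSC tick counts from the testing host into the training host's
    tick domain before classification."""
    return [t * factor for t in timings]

scaled = prescale([1000.0, 2000.0], factor)
```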
               | Training System  | Testing System
Processor      | Intel Xeon E5507 | Intel Xeon W3550
               | 4C/4T @ 2.27 GHz | 4C/8T @ 3.06 GHz
Motherboard    | Dell 09KPNV      | Dell 0XPDFK
RAM            | 2x2GB            | 1x8GB
USB Controller | Intel 82801JI    | Intel 82801JI
Table 3.5: Configurations of the training and testing host systems.
With this preprocessing step, the SanDisk Blade devices experience no accuracy
degradation, and the Generic drives experience an 11% accuracy decrease to 88%, which
is still a promising finding. To understand the reason for these different behaviors, I
uncover that the Generic devices appear to produce noisier distributions with more
similar peak locations than the SanDisk Blade devices, as shown in Figure 3.2. I infer
that such increased noise coupled with different electrical paths (e.g., different hubs,
ports, machines) makes the Generic devices harder to classify in a cross host scenario.
However, it should be noted that in an enterprise environment, people usually purchase
a number of identical host machines with the same model of processor, motherboard,
USB controllers, etc. As a result, I might experience even better fingerprint transfer
between hosts. Meanwhile, this host transfer is not required in our threat model, as
system administrators can train an authentication system for each protected computer.
might degrade the accuracy of the fingerprint as the device is written to.
To investigate the impact of this remapping, I conduct an experiment by writing
hundreds of random files to five SanDisk Cruzer Blade devices and track the accuracy
of the classification system by gathering a sample between each write. In total, I write
6,520MB of data to each 8GB drive.
The results demonstrate that Time-Print is somewhat resilient to drive writes,
experiencing no accuracy degradation until about 2.3GB at which point the accuracy
rapidly decreases. To better understand the cause of this sudden accuracy degradation,
I examine the behavior of the actual flash drive. I utilize the tool hdparm to observe
the actual logical block address (LBA) of each file, and notice that the drive attempts
to write files to the lowest available LBA. The classification neural network essentially
performs a matching task, attempting to classify the trace as the class that is the closest
to the training samples. After more than half of the LBAs utilized for the fingerprint
are written, the neural network is no longer able to perform this task reliably, since the
majority of the LBAs are no longer the same. To address this problem, there are two
solutions: LBA reservation and manufacturer support.
LBA reservation. If Time-Print can prevent the drive from updating the
virtual to physical mapping of the blocks utilized for fingerprinting, it can prevent drive
writes from affecting the fingerprint, as the drive will not reassign pages that are in
use. This can be accomplished by placing small placeholder files at their locations for
LBA reservation. I implement this mechanism by copying large files (to occupy large
swaths of LBAs) and small files into the chosen fingerprint locations, and then deleting
the large files. I use the hdparm tool to check the LBAs used by the small fingerprint
files. All of the small files combined together are only 768KB in total, thus inducing
low overhead. I then write 7.3GB of data (the capacity of the drive) to the drive in 16MB
chunks, and observe no changes in the histograms and no accuracy degradation. This
solution can adequately accommodate the normal drive usage as long as the small
fingerprint files are not deleted (by users).
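The placeholder-file trick can be illustrated at the filesystem level (file names and sizes are hypothetical; a real deployment would verify the anchors' LBAs with hdparm):

```python
import os, tempfile

def reserve_fingerprint_locations(root, n_anchors=4, anchor_kb=192, filler_mb=8):
    """Placeholder-file trick: a large filler file first occupies a swath of
    blocks, small anchor files are then written at the fingerprint locations,
    and the filler is deleted so that only the anchors keep their blocks
    allocated."""
    filler = os.path.join(root, "filler.bin")
    with open(filler, "wb") as f:
        f.write(b"\x00" * (filler_mb * 1024 * 1024))
    anchors = []
    for i in range(n_anchors):
        path = os.path.join(root, f"fp_anchor_{i}.bin")
        with open(path, "wb") as f:
            f.write(os.urandom(anchor_kb * 1024))
        anchors.append(path)
    os.remove(filler)  # free the bulk; the anchors still pin their blocks
    return anchors

with tempfile.TemporaryDirectory() as d:
    anchors = reserve_fingerprint_locations(d)
    print(sum(os.path.getsize(p) for p in anchors) // 1024)  # -> 768
```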
Manufacturer Support. This is the most resilient solution but requires
collaboration with drive manufacturers. Manufacturers already provide extra flash
blocks that are hidden from users to facilitate better wear leveling and drive performance.
They can similarly reserve extra blocks for fingerprinting on new devices. This solution
can ensure that Time-Print fingerprints are unaffected by write operations and further
ensure that accidental deletion of the contents of the drive will not interfere with the
fingerprint.
which should be accepted, and the samples from the legitimate device taken in the
wrong locations (to emulate a spoofing attack) that should be rejected.
I test this setup with the SanDisk Blade, Ultra, and Generic devices and observe
an average of 96.4% true accept rate and 99.6% true reject rate. This result indicates
that Time-Print is very robust against such ‘spoofing’ attacks.
utilizing only this information reduces accuracy from greater than 98% to 65% and 45%
for the two types of devices, respectively. This shows that while the timing information
of the flash controller can be utilized to identify some devices, it alone is insufficient to
create a robust fingerprint.
One of the most common methods for fingerprinting is the utilization of (un)intentional
electromagnetic frequency radiation. Cobb et al. [46, 47] showed that the process
variations in the manufacturing process cause subtle variations in the unintentional
electromagnetic emissions, which can be utilized to generate a valid fingerprint for
similar embedded devices. Cheng et al. [44] further found that unique fingerprints can
be created for more sophisticated systems like smartphones and laptops. Other prior
works [27, 57, 129, 138] study the fingerprint generation in radiating electromagnetic
signals for communication (e.g. Zigbee, WiFi, etc.). The most similar work to Time-
Print is Magneto [82], which uses the unintentional electromagnetic emissions during
device enumeration on a host to fingerprint USB mass storage devices. While their
work demonstrates the ability to classify different brands and models accurately, the
system requires expensive measurement equipment. By contrast, this work requires
no special equipment and uncovers a novel timing channel that can be used to further
identify devices within the same brand and model.
Device serial numbers, descriptors, and passwords are also used to thwart the
connection of unauthorized USB devices [12, 83]. These defenses inherently trust that
the USB device is accurately reporting software values. TMSUI [181], DeviceVeil [160],
and WooKey [24] use specialized hardware to uniquely identify individual USB mass
storage devices, and as a result, most of these systems are not compatible with legacy
devices. Instead, Time-Print is completely software-based and does not require any
extra or specialized hardware. The USB 3.0 Promoter Group has proposed a USB 3.0
Type-C PKI-based authentication scheme [172] to identify genuine products, but these
mechanisms are not designed to uniquely identify individual devices. Other prior works
utilize a USB protocol analyzer [99] or smart devices [21] to identify a host system and
its specific operating system by inspecting the order of enumeration requests and timing
between packets [52]. Unlike those works, the objective of Time-Print is to identify the
peripheral device, instead of the host.
3.7.2 Flash Based Fingerprints
Several prior works have investigated whether the properties of flash devices can
be utilized for fingerprinting. For example, device fingerprints are constructed using
programming time and threshold voltage variations [135, 174]. Others [74, 88, 92, 123,
143, 159, 176] further investigate the design of physically unclonable functions in flash
chips and explore the impact of write disturbances, write voltage threshold variation,
erase variations, and read voltage threshold variation. Sakib et al. [142] designed a
watermark into flash devices by program-erase stressing certain parts of a device.
The above techniques work at a physical level, which requires control and
functionalities that may not be available in a cost-constrained, mass-market device
like a USB flash drive. Time-Print only utilizes read operations (a common function
available on all USB flash drives) and thus is non-intrusive. In addition, while these
technologies could be incorporated into new devices, Time-Print is fully compatible
with existing devices and only requires a slight modification to the host driver.
firmware for malicious behaviors. USBFILTER [168] presents a firewall in the USB
driver stack to drop/allow USB packets based on a set of rules.
Similarly, Cinch [14] creates a virtual machine layer between USB devices and
the host machine to act as a firewall. Johnson et al. [89] designed a packet parser to
protect the system from malformed USB packets. Tian et al. [163] proposed a unified
framework to protect against malicious peripherals.
Other prior works like USBeSafe [91] and USBlock [122] utilize machine learning
algorithms to analyze the characteristics of USB packet traffic to prevent keyboard
mimicry attacks [86]. Like those works, Time-Print is a software-based approach to
enhancing USB security.
3.8 Summary
This chapter presents Time-Print, a novel timing-based fingerprinting mechanism
for identifying USB mass storage devices. Time-Print creates device fingerprints by
leveraging the distinctive timing differences of read operations on different devices. I
develop the prototype of Time-Print as a completely software-based solution, which
requires no extra hardware and thus is compatible with all current USB mass storage
devices. To assess the potential security benefits of Time-Print, I present a comprehen-
sive evaluation of over 40 USB drives in three different security scenarios, demonstrating
Time-Print’s ability to (1) identify known/unknown device models with greater than
99.5% accuracy, (2) identify seen/unseen devices within the same model with 95% accu-
racy, and (3) individually classify devices within the same model with 98.7% accuracy.
I further examine the practicality of Time-Print, showing that Time-Print can retain
high accuracy under different circumstances while incurring low system latency.
Now that the offensive and defensive capabilities of side channels have been
demonstrated, I next turn our attention to how side channels can be utilized to invade
user privacy. Unlike the first attack in Chapter 2, which required the attacker to utilize
extra hardware to monitor the user device, the next attack will take place completely
remotely, requiring no extra hardware, and attacking user privacy from across the
internet.
Chapter 4
4.1 Introduction
While Advanced RISC Machines (ARM) processors have dominated the mobile
device market over the past decade, recently they have also gained market share in both
cloud computing and desktop applications. Enterprises like Apple and Samsung have
announced plans to develop ARM-based laptop devices that run the complete
MacOS and Windows operating systems. Apple has already released its M1 ARM chip
to power its newest laptop and desktop devices. Spurring this rapid expansion of ARM
devices into new markets is the adoption of a more peripheral-based design that attaches
a number of coprocessors and accelerators to the System-on-a-Chip (SoC). ARM has
also adopted a System-Level Cache to serve as a shared cache between the CPU-cores
and peripherals. This design works to alleviate the memory bottleneck issues that exist
between data sources and the accelerators, allowing higher speed communication and
increased performance.
If the market share of ARM processors in desktop and laptop systems continues
to increase, it is expected that attackers will pay more attention to ARM and explore
more of its vulnerabilities. While extensive research has been conducted on exploring
and securing microarchitectural side channels on Intel’s x86 systems, far less research
has been focused on the ARM architecture. Furthermore, as mobile OSes tend to deny
low level control over the hardware, most vulnerabilities are usually within non-essential
APIs [190, 87, 54, 39, 103, 194] and are rapidly patched. ARM designers must be careful
to ensure that their designs are not vulnerable to malicious attacks when exposed to a
full-fledged operating system, where OS developers impose far fewer restrictions
on potential attacker activities.
In this chapter, we present an in-depth security study of recent personal computing
devices (e.g., mobile phones and laptops) equipped with ARM processors with the
recent DynamIQ [110] design. Unlike previous designs that only share cache within
core clusters, these devices contain multiple levels of cache and share the last level
cache with other core clusters and accelerators (e.g., graphics processing unit). Unlike
x86 processors, these ARM devices utilize heterogeneous core architectures, different
caching policies, and advanced energy aware scheduling to increase performance and
battery life. We endeavor to examine whether those advancements (e.g., new cache
architectures, the tight integration of accelerators, etc.) make the ARM platform more
difficult to attack compared with x86 platforms.
Specifically, we focus on investigating cache occupancy channels [151], which
continually monitor shared cache activities, to fingerprint websites. We design a series
of microbenchmarks to better understand how ARM system behaviors (e.g., energy
aware scheduling, core selection, and different browsers) affect the cache occupancy
channels. Based on our preliminary study, we further optimize the attack for these
new ARM cache designs and consider multiple different browsers, including Chrome,
Safari, and Firefox. The redesigned attack significantly reduces the attack duration
while increasing accuracy over previous cache occupancy attacks. Furthermore, we
introduce a novel GPU contention channel in mobile devices, which can achieve similar
accuracy as the cache occupancy channel. To evaluate the proposed attacks, we conduct
a thorough evaluation across multiple devices running iOS, Android, and MacOS,
including the new ARM-based M1 MacBook Air. The experimental results
show that the System-Level Cache based website fingerprinting technique can achieve
promising accuracy in both open (up to 90%) and closed (up to 95%) world scenarios.
Overall, the main contributions of this work are summarized as follows:
• An examination of the system level cache within new ARM SoCs that utilize
the DynamIQ design principle, especially how different components and software
scheduling affect cache behaviors.
• The discovery of a new GPU side channel attack that can be utilized to fingerprint
user behaviors on MacOS and Android.
The rest of this chapter is organized as follows: Section 4.2 provides necessary
background information. Section 4.3 presents the threat model and discusses the unique
challenges that the ARM architecture creates for attackers in a shared cache occupancy
attack. Section 4.4 details our system design and Section 4.5 describes our experimental
setup. Section 4.6 analyzes our findings and Section 4.7 surveys related works. Finally,
Section 4.8 concludes the chapter.
4.2 Background
4.2.1 Caching and Side-Channel Attacks
Modern computer systems utilize a tiered memory system to enhance their
performance, from the smallest and fastest (i.e., L1) to larger and slower (e.g., L2
and L3). Two important distinctions in caching are exclusive and inclusive caching.
Inclusive caching guarantees that any memory address that is included in a cache tier
is also present in the cache tiers below it. For example, a value in the L1 cache is
also present in the L2 and L3 caches. By contrast, an exclusive caching policy ensures
that items are only present in one level of the cache (e.g., an item in the L1 cache is
not present in the L2 or L3 cache). While there are various pros and cons to both
caching policies, Intel x86 processors mostly employ inclusive caching, but recent ARM
processors tend to utilize exclusive caching policies.
As portions of the cache are shared between all processes, the cache has been widely
exploited for side channel attacks. By determining whether specific memory is present in
the cache (e.g., by timing its access), attackers can infer information about the victim.
The ‘prime+probe’ attack [106, 127] attempts to identify vulnerable data locations that
indicate specific program flows. With a high resolution timer and a predictable program,
cache-based side channel attacks allow attackers to extract private information such as
encryption keys.
Cache Occupancy Channel. Shusterman et al. [151] suggested two versions of
the cache occupancy channel, cache occupancy and cache sweeping. In cache occupancy,
they designated a sample rate (every 2ms) and accessed the entire buffer. If the buffer
is accessed faster than 2ms, the total time to access the buffer is recorded. If the
access takes longer than 2ms, a miss is recorded. In cache sweeping, the cache buffer is
continually accessed and the number of full ‘sweeps’ in each sampling period is recorded.
At the beginning of each sample period, the system starts accessing the cache from the
first location. They demonstrated that such techniques can be used for robust website
fingerprinting in x86 systems.
Figure 4.1: ARM DynamIQ Architecture. Each core in the big and LITTLE clusters
has private L1I/L1D caches, with L2 caches at the core or cluster level; the GPU,
ISP, and DSP accelerators share the system level cache with the CPU cores.
In ARM's original big.LITTLE design, the high performance cores have access to larger
L1/L2 caches than their lower performance counterparts.
As the L2 caches of the different core clusters are not shared between clusters, a large
amount of cache coherency traffic is necessary to facilitate switching tasks between the
high and low performance cores, resulting in suboptimal performance.
To overcome this performance limitation, a newer system ‘DynamIQ’ [110] was
developed for ARM. The DynamIQ system allows greater modularity and design freedom
than the original big.LITTLE system. DynamIQ allows the processor designers to
create multiple clusters of heterogeneous processors (instead of just two), and employs
a shared L3 cache to improve computational performance between processor clusters,
as shown in Figure 4.1. Our work explores the potential security vulnerabilities in this
shared cache architecture.
Accelerators. Due to the explosive popularity of machine learning applications
in image and signal processing domains, mobile devices have begun to require a low
power method for executing neural network inference functions. To resolve this issue,
current mobile devices make use of a number of accelerators or co-processors to enable
advanced functionalities within their energy budget. Recent versions of Apple’s custom
A series chips, Qualcomm’s Snapdragon, and Samsung’s Exynos chips have begun to
increase their reliance on accelerator peripherals. Those chips include dedicated digital
signal processors, image signal processors, motion co-processors, neural processing units,
and graphics processing units.
The inclusion of numerous accelerators creates a major system design issue.
To utilize a co-processor, it must be supplied with data and a set of instructions to
operate on. The co-processor must then complete its calculations and return the data
to the main processor. In a non-integrated SoC, communication with co-processors
must take place over a bus, and this can severely limit performance speedup. Nvidia
has attempted to resolve part of this problem on x86 with GPUDirect [70], allowing
for direct transfer of data to the GPU without the CPU. To speed up co-processor
performance in ARM, the DynamIQ system utilizes a system level cache that is shared
with these accelerators. ARM calls this technology cache stashing [108], which allows
tightly coupled accelerators (such as GPUs) to directly access the shared L3 cache and
in some cases directly access L2 caches.
4.3 Threat Model and Challenges in ARM
4.3.1 Threat Model
This work studies the ability of an attacker to fingerprint a user’s website
browsing activity via a low frequency contention channel in either the shared cache or
the GPU of an ARM SoC. The attacker is motivated to track the user’s web activity for
some malicious purposes, such as to better identify the victim’s interests for targeted
advertising or to covertly determine sensitive information (e.g., medical condition,
sexual/political preferences, etc.) for the purpose of discrimination or blackmail. We
consider two typical scenarios in website fingerprinting: (1) closed world, where the
victim only visits websites from the list of sensitive websites; and (2) open world, where
users might also visit some non-sensitive websites. To accomplish the fingerprinting task,
the attacker can pre-profile a list of sensitive websites and build a model based on specific
browsers (e.g., Chrome/Firefox/Safari) and devices (e.g., MacBook/Smartphone).
To evaluate the potential threat from this attack, we mainly examine a web-based
attacker who is only capable of delivering JavaScript from a website. We also conduct
an investigation of an app-based attacker who is able to trick a user into installing
malware, but impose additional limits, analyzing how well the attack would function if
the OS clock functions were similarly limited to those of a web browser.¹
Web-Based Attacker. The web-based attacker attempts to exploit the cache
occupancy channel in the context of the web browser, delivering a JavaScript file to
the user via a malicious advertisement on a legitimate page or by tricking the user
into visiting a malicious web page. We assume that the attacker is unable to exploit
any vulnerabilities in the browser. Instead, (s)he attempts to create a cross tab attack
scenario, wherein the user leaves the tab with the malicious JavaScript open and
continues to browse other websites in a different tab. The malicious JavaScript in the
background tab continues to run and attempts to monitor the user’s activity. This is
¹ Researchers have demonstrated that the high precision timers available to native
programs can produce very accurate attacks. OS developers may move to reduce the
attack surface by reducing the granularity of available timers in the future.
reasonable as all current web browsers enable users to visit multiple websites at the same
time in different browser tabs. While tabs are isolated from each other in software, they
are not necessarily segregated in hardware. Furthermore, the weak attacker is restricted
by the privileges granted to JavaScript, and is subject to the reduced precision timers,
memory management, and scheduling constraints that the browser enforces.
App-Based Attacker. We assume that the app-based attacker is capable
of tricking the user into installing an application or program onto their device that
contains the malicious observation code. The code can be integrated into a benign
application such as a music player, fitness tracker, or social media application, and
is therefore capable of running a disguised process to monitor user activities. Unlike
the web-based attacker, the app-based attacker is not restricted to only JavaScript
and has access to the APIs provided by the operating system, allowing better control
over memory management and scheduling. However, the attacker is not granted any
super-user privileges and does not utilize any exploit to access privileged commands.
Note that, in both scenarios, the application/JavaScript does not necessarily
need to be sourced from a purely malicious entity. Such a tracking service could
be deployed in social media applications to better identify and profile user activities.
Large ad-supported companies like Google or Facebook could also greatly benefit from
deploying a similar script on their webpages, continually monitoring users' browsing
activities to better target advertisements.
Table 4.1: Devices and High Power (HP) and Low Power (LP) core configurations
utilized in this work.

Device        Core Configuration                              High Power L1/L2                               Low Power L1/L2                                System Level Cache
iPhone SE 2   2x Lightning (HP), 4x Thunder (LP)              128KB L1i / 128KB L1D / Core; 8MB L2 Shared    Unknown L1i / 48KB L1D / Core; 4MB L2 Shared   16MB
Android       4x Kryo 385 Gold (HP), 4x Kryo 385 Silver (LP)  64KB L1i / 64KB L1D / Core; 256KB L2 / Core    64KB L1i / 64KB L1D / Core; 128KB L2 / Core    2MB
MacBook Air   4x FireStorm (HP), 4x IceStorm (LP)             192KB L1i / 128KB L1D / Core; 12MB L2 Shared   128KB L1i / 64KB L1D / Core; 4MB L2 Shared     16MB
ARM Cache Contention. ARM systems differ from common x86 architectures
in multiple aspects. ARM offers exclusive and inclusive caching at different levels, and
utilizes heterogeneous architectures in which multiple different core architectures and
cache layouts may be present on the same chip. Also, each type of core may run
at different frequencies. Those factors increase the difficulty of exploiting the cache
occupancy channel in the ARM architecture. Since the system level cache is the only
cache level shared by all processor cores in ARM, if the scheduler moves the spy and
victim processes between different core types, it can greatly affect the observed cache
profile.
Due to the exclusive nature of the last level cache in ARM, when a process
migrates, the data in its L1/L2 caches will not be present in the last level
cache, but only in the L1/L2 caches of its previous core. Upon migrating a process from
one core type to another, some ARM processors invalidate the entirety of the previous
cores' caches, while others may allow that data to remain until it is evicted. In either case,
in an exclusive cache setup, any reads to locations that were in the L1/L2 cache of the
previous location will be serviced from the L1/L2 and have no impact on the L3 cache.
This greatly hinders the cache occupancy channel: while in an inclusive cache, one
could reliably observe L3 occupancy (if the value were removed from L3, it would be
removed from all higher levels), the exclusive cache can serve the value from either the
previous L1/L2 or main memory, giving no indication as to the status of the L3 cache.²
² The L3 cache on ARM also maintains the ability to be selectively inclusive if an
item is utilized by more than one core [111]; however, the cache occupancy JavaScript
channel does not utilize shared memory and should not experience this behavior.
Exclusive caching also has drawbacks with respect to buffer size. In an x86
system with inclusive caching, the spy process evicting the entire L3 cache would also
remove any data in the L1/L2 caches. Thus, when the victim process accesses data,
it always causes activities in the L3 cache.³ However, in an ARM system, if a victim
process accesses a buffer small enough to fit in the L1/L2 cache, a spy process that is
monitoring the entirety of the L3 cache will never see this activity. While this behavior
might be unnoticed, and even preferable, to a program under normal circumstances, it
is not ideal for the cache occupancy channel. The cache occupancy channel assumes
that continually accessing a large buffer in cache will completely evict any data of the
victim process from the L3. Also, it assumes that any access to memory will bring
data back into the L3, making it observable. Thus, to better suit ARM processors, the
access patterns and buffer sizes for the cache occupancy channel should be carefully
considered.
Browser Differences. Further complicating the applicability of the cache
occupancy channel is the memory management of a web browser. The web-based
attacker must work within the constraints of the JavaScript engine within each web
browser. Today’s popular web browsers, including Google Chrome, Apple Safari, and
Mozilla Firefox, utilize different JavaScript engines. Furthermore, these JavaScript
engines must interact with the system scheduler. Different OSes (e.g., Google’s Android,
Apple’s iOS, and MacOS) likely utilize carefully tuned schedulers to maximize the
performance. Finally, the JavaScript engines of the major browsers will manage memory
in different ways, and the garbage collector of each JavaScript engine will handle memory
management in a way that is not accessible to the attacker. Thus, a one-size-fits-all
approach to cache occupancy fingerprinting is certainly not ideal, as each browser may
act very differently, even on the same hardware.
³ In some x86 server CPUs (specifically Skylake-X CPUs from Intel), the L3 is ‘non-
inclusive’, meaning that it is neither fully inclusive nor exclusive. Consumer CPUs from
Intel have not yet adopted this layout.
4.4 Optimizing ARM Cache Occupancy
We first design a series of microbenchmarks to better understand ARM system
behavior. In particular, we investigate how energy aware scheduling, core selection, and
different browsers impact the cache occupancy channel.
⁴ Web workers were designed to facilitate background processing off of the main UI
thread, allowing for complex computation to take place in the background while keeping
a website responsive.
4.4.2 Cache Access Pattern
Modern ARM processors utilize cache prefetchers to learn data access patterns
and bring data into the cache beforehand. To accurately measure the cache performance
of a device, it is necessary to develop a cache access pattern that defeats these prefetchers.
While exact prefetching algorithms are closely guarded secrets, current systems broadly
utilize two types of prefetcher, the next line prefetcher and the stride prefetcher.
The next line prefetcher exploits spatial locality, assuming that the processor
will want to access the next data line and therefore fetches it from memory. The stride
prefetcher actively learns patterns in data access and fetches data based on the pattern.
For example, the stride prefetcher observing that a program accesses every 10th element
of an array will begin to bring future elements into the cache before they are requested.
It has been demonstrated that the stride prefetcher is limited in recognizing
patterns within memory pages and can only keep track of a certain number of patterns
before the hardware pattern matching is exhausted [51]. To evade the two prefetchers,
we follow a similar access pattern to that of [51]. We create a large array of buffers
which spans multiple memory pages. We then access the first line of every page, then
the third line, then the fifth line, etc. By accessing every other line, we avoid any
impact from the next line prefetcher. By accessing one item from each buffer before
looping back to the first buffer, we exhaust the ability of the stride prefetcher to learn
a pattern.
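This page-spanning, every-other-line walk can be sketched as follows; the constants and function name are illustrative assumptions, and the exact pattern of [51] may differ in detail:

```javascript
// Sketch of a prefetcher-evading walk: access line 0 of every page, then
// line 2 of every page, and so on. Consecutive accesses are a page apart
// (exhausting the stride prefetcher's per-page pattern tracking), and lines
// within a page are visited non-adjacently (defeating the next line prefetcher).
const PAGE = 4096, LINE = 64;              // assumed page and cache line sizes
const INTS_PER_PAGE = PAGE / 4;            // Int32 elements per page
const INTS_PER_LINE = LINE / 4;
const LINES_PER_PAGE = PAGE / LINE;

function probe(buf) {
  const pages = Math.floor(buf.length / INTS_PER_PAGE);
  let sum = 0;
  for (let line = 0; line < LINES_PER_PAGE; line += 2) {   // every other line
    for (let page = 0; page < pages; page++) {             // one line per page
      sum += buf[page * INTS_PER_PAGE + line * INTS_PER_LINE]++;
    }
  }
  return sum;                               // touches half the lines of each page
}
```

The write (the `++`) reflects the observation below that writing to the buffer improves measurement consistency.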
Figure 4.2: Google Pixel 3 Average Memory Access Time
To this end, we create a large buffer and access increasingly large portions in the
prefetcher thwarting manner described previously, and record the required time in each
iteration. We then normalize the access times with respect to the number of memory
accesses to better understand the cache sizes and memory management. To assess
background activity, we run the script in a background tab while the foreground tab is
set to www.google.com. We also find that writing to the accessed buffer (e.g., incrementing
a counter stored at each array location) increases the consistency of experiments.
This can be attributed to a more complex instruction stream reducing the amount of
optimization and/or reordering that can occur, and thereby better exposing the cache
sizes.
Figure 4.2 shows the results for the Google Pixel 3. We observe a large difference
between the behavior of the foreground and background tabs. Cache accesses in the
background take about 5-10 times longer than those in the foreground. Also, the shape
of the distribution is different, clearly demonstrating that the higher and lower power
processor cores behave differently. The iPhone SE2 also demonstrates very different
foreground and background cache behavior, as shown in Figure 4.3. Background accesses
are nearly 10x slower than foreground accesses, and the background memory access time
Figure 4.3: iPhone SE 2 Average Memory Access Time
curve is significantly different from the foreground access curve. The foreground curve
experiences multiple sharp increases in cache access time, indicating that multiple levels
of cache are present (e.g., L1, L2, L3), while the behavior of the background process
shows far less distinguishable increases in timing.
Figure 4.4: M1 MacBook Air Cache Average Memory Access Time in Chrome
Figure 4.5: M1 MacBook Air Cache Average Memory Access Time in Safari
Figure 4.6: M1 MacBook Air Cache Average Memory Access Time in Firefox
All three browsers on the M1 MacBook Air show similar access speeds for their respective
foreground and background processes, indicating that background tabs are not relegated
to the low power cores.
We also observe that the overall shape of the timing curves for cache accesses is
unique to each browser, indicating that even though the access pattern was the same,
the memory allocation algorithms for each JavaScript engine are vastly different. Thus,
understanding how these allocation strategies affect cache timing can greatly increase
the accuracy of a potential cache occupancy attack. Specifically, when searching for an
optimal buffer size used in the cache occupancy attack, we expect that sharp increases
in memory access time indicate a buffer overflowing a cache level, and therefore present
a suitable target size to begin testing.
Figures 4.4, 4.5, and 4.6 also show that each browser has different locations
for the increases in memory access time. This can be attributed to the differences in
JavaScript engines, which utilize different interpreters and compilers. Also, JavaScript is
a prototype based language, and there is a variable amount of overhead for creating any
type of buffer or array (e.g., an Int32Array will contain extra bytes describing its usage).
Furthermore, the garbage collection system in a JavaScript engine prevents users from
directly controlling their memory allocations. Current garbage collection algorithms are
designed to reduce memory fragmentation and reclaim / reduce the memory footprint
of the programs running. This step is frequently referred to as a ‘compact’ step and
many algorithms will physically copy the memory to a new (different) location without
any warning.
In addition, each JavaScript engine utilizes a unique allocation strategy and thus
allocates different amounts of memory for the same size object. We use the developer
tools within Mozilla Firefox and Google Chrome to examine this in a more fine grained
manner. We find that a 1,024 element Int32 array in Google Chrome utilizes 4,220
Bytes as opposed to an expected 4,096, an excess of 124 Bytes, while a single element
array utilizes 136 Bytes, an excess of 132 Bytes. In Firefox, these same arrays utilize
4,224 Bytes and 96 Bytes, respectively. These differing overheads would require that an
attacker design specific code for each browser.
In summary, the memory management of the JavaScript engine has a large effect
on the actual allocations and stability of memory addresses utilized in a browser based
JavaScript attack. The combination of these effects with the aforementioned issues of
prefetchers, differing cache sizes with heterogeneous core designs, and exclusive cache
policies greatly deteriorate the ability of an attacker to exploit a cache occupancy
channel.
4.5.1 Setup
Data Sets. To monitor the accuracy of the cache occupancy channel, we utilize
an abbreviated open world dataset, which consists of multiple accesses to sensitive and
non-sensitive websites. It marks all non-sensitive websites as a single class, regardless
of domain. Particularly, we utilize a dataset of 1,500 website accesses, containing 100
accesses to the top 10 Alexa websites (i.e., sensitive websites) and 1 access to 500 other
websites not within the Alexa top 100 (i.e., non-sensitive websites). To prevent any
biasing of the dataset, we generate a random order for these 1,500 accesses and then
utilize the same order for every experiment. We believe this randomization is important;
previous works do not discuss the access order. If all websites are visited in the
same order repeatedly, it might lead to invalid accuracy data when dealing with a cache
channel. Unlike network based fingerprinting attacks, the CPU cache may retain some
of its state between website accesses causing the machine learning system to identify
incorrect features and boost the accuracy of the test. Note that this abbreviated dataset
is used in this section to optimize the side channel attacks on ARM. In the next section,
we conduct a thorough evaluation using a much larger dataset.
Machine Learning Approaches. We evaluate the performance utilizing
multiple supervised learning algorithms. Specifically, we utilize the Rocket [53] transform
paired with Ridge regression and a convolutional neural network. We rely on similar
hyperparameters to the original cache occupancy paper [151] as our starting point. All
classifiers are trained and tested with a cross validation strategy, wherein we utilize
90% of the data for training and 10% of the data for testing. We report the average of
5 rounds of training and testing.
Using the information from Figure 4.2, we see that accessing the cache takes about 60ns at a 2MB
buffer size. Since the Snapdragon 845 employs a 64 Byte cache line size, to avoid
prefetching, we should access every 32nd integer in our 2MB buffer. As the buffer can
hold ≈500,000 integers, this results in ≈ 16, 000 accesses. At 60ns per access, this
equates to just under 1ms. While the Snapdragon 845 has configured the system level
cache to be 2MB, the Cortex A55 supports up to 4MB of shared cache [109], and the
accesses may take almost 2ms with no background activity, and will almost certainly
take more than 2ms if the processor is performing another task. Thus, if the system
described in [151] is used without modification, every trace would be nearly identical,
with only overlong accesses, and no identification would be possible. To this end, we
propose a series of modifications that work for devices regardless of their access speed
to cache. This enables the attacker to adjust the buffer size for the device and not have
to worry about adjusting the sample rate if the device happens to be very slow.
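The arithmetic above can be checked directly; this snippet simply restates the numbers from the text (2MB buffer, 4-byte integers, a stride of 32 integers, 60ns per access):

```javascript
// Recomputing the Snapdragon 845 access-count estimate from the text.
const BUF_BYTES = 2 * 1024 * 1024;    // 2MB system level cache sized buffer
const INT_BYTES = 4;                  // Int32 elements
const ACCESS_STRIDE = 32;             // every 32nd integer = every other 64B line
const NS_PER_ACCESS = 60;             // from Figure 4.2 at the 2MB point

const totalInts = BUF_BYTES / INT_BYTES;          // 524,288 integers (~500,000)
const accesses = totalInts / ACCESS_STRIDE;       // 16,384 accesses (~16,000)
const totalMs = (accesses * NS_PER_ACCESS) / 1e6; // ~0.98ms: just under 1ms
```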
Modifications. The first modification entails recording the number of cache
accesses within the time frame, instead of the time to complete accesses. This is
advantageous for a few reasons. First, this system is far less affected by changes in the
accuracy of the clock. The system will always record the number of actual cache accesses,
a number that is far more fine grained than the time to access the cache. To enhance
system performance on slower devices, we also increase the access window time to 4ms
to increase the number of possible accesses. With these initial modifications, we achieve
75% open world accuracy in the abbreviated 10 site test (Section 4.5.1) on the Google
Pixel 3.
With the first enhancement, the system checks the number of total cache accesses
in the time period. It then needs to frequently check the clock to see if the time period
has been completed. We find that the Android system only completes about 2,500
accesses per 4ms window, which is far lower than the original predicted value of ≈16,000
accesses per 1ms window. We find that when profiling the page, the vast majority of the
code runtime is taken by the script performing the performance.now() call to check
whether the time window has elapsed. Since the ARM last level caches are exclusive, the
attack might have several issues if the cache occupancy system continually accesses
the same beginning elements of the buffer without ever accessing the entirety of the
buffer. In the worst case, if the number of accesses can fit in the L1 and L2 cache,
the script may never actually impact the L3 cache. Therefore, it can only observe
minimal information about the L3 occupancy, and thus performs poorly in the website
fingerprinting. If the accesses overflow into the L3 but do not fill it, the system will
perform sub-optimally as it is unable to fully observe the L3 cache. Furthermore, it will
continue to observe the same portion of the L3 cache, which may not provide useful
information.
We thus further employ two enhancements. The first enhancement accesses the
buffer in a circular fashion: if the script only completes 2,500 accesses in the time
window, it will access the 2,501st element at the beginning of the next window. It
only returns to the first element once all elements have been visited. This ensures
that the buffer eventually fills the L3 cache and that sequential observations observe
different parts of the cache. We find that this technique increases the accuracy of the
10 site open world dataset to about 83%. The next enhancement is to decrease the
amount of time that the script spends checking the time. Instead of checking after every
access, we check after every 20 cache accesses. This enhancement (without circular
accesses) increases the accuracy to 84%. We then combine both enhancements and
further increase the accuracy to 86%. We present a thorough evaluation in Section 4.6.
WebGL or WebGL2 animations, and videos are also usually hardware accelerated. Thus,
we endeavor to explore whether the GPU and shared cache architecture of current ARM
DynamIQ can be utilized to create a website fingerprinting side channel.
It is challenging to exploit a GPU cache occupancy channel. WebGL2 and basic
HTML5 canvas elements only update at a low frequency of 60Hz. While these sampling
rates can be increased, working with the canvas element in a background tab further
increases the complexity and overhead. Also, it is not straightforward to determine the
amount of memory that a GPU process consumes. GPU programming within JavaScript
is mainly designed around graphical interfaces and smooth animations. An ideal attack
should instead spend minimal effort on image display and focus primarily on exploiting
the side channel. Therefore, we utilize a JavaScript library called GPU.js [71], which is
designed to enable the creation and deployment of GPU computational kernels from
JavaScript to WebGL compatible code. It can reduce the amount of boilerplate code
and other timing elements for an attacker.
We thus create a two dimensional buffer of data and repeatedly utilize the GPU
to process this buffer with different mathematical kernels.
Unlike our improved cache occupancy channel, accelerator based channels cannot
provide us with high granularity measurements. The accelerator based workload requires
that the CPU first declare the work, pass it to the accelerator (GPU), and wait until
the GPU completes its task. This means that the sizing and complexity of the kernel
task must be tuned for the optimal performance for fingerprinting.
To understand the performance of different settings, we create a spy script similar
in nature to the cache occupancy spy script. The GPU script reports the number of
kernel executions that it can complete in the monitoring time period. We conduct
experiments using multiple kernels, including matrix multiplication and the dot
product. We find that the kernel that sums each row of the input array
delivers far superior performance. This might be due to massively decreased complexity
and time in this GPU kernel: the reduced complexity enables more possible kernel
executions, which in turn leads to better observability of GPU usage. We also check the
optimal size for the computation. A small size might result in mainly observing GPU
startup overhead, while a large size results in too much time spent in GPU computation,
decreasing observation granularity. We find that an overall compute array of between
20KB (Android) and 40KB (MacOS) organized into 5x4KB or 10x4KB arrays works
best. Finally, we examine the observation window, but limit our experiments to a
maximum 10 second duration to maintain a realistic approach. Again, we find disparate
sizes depending on platforms. The Google Pixel 3 provides the best performance
with 500 20ms observations and the M1 MacBook Air achieves its best results with
1,000 10ms observations. We believe this is caused by the speed of the processors:
the Snapdragon 845 functions much slower and thus requires more time to manifest
observable differences in computation performance as opposed to simply observing GPU
overhead.
4.6 Evaluation
In this section, we provide detailed performance results for the cache occupancy
and GPU contention channels. Unlike the previous section, which used open-world
testing of 10 sensitive sites and 500 open-world sites, this section uses a much larger
dataset: 100 accesses to each of 100 sensitive sites (the Alexa Top 100) and 1 access
to each of 5,000 other websites. We report both closed-world (only the sensitive
websites) and open-world (all websites) accuracy. As before, to remove any bias from
the experimentation, the collection process is conducted via Appium or Selenium
automation of the target platform. The list of 15,000 total website accesses was
randomized to eliminate unintentional ordering effects, and the same random access
order was used for every experiment to enable direct comparison.
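The randomized collection schedule described above amounts to a few lines of setup; the site names and seed below are placeholders rather than the actual dataset:

```python
import random

# 100 visits to each of 100 sensitive sites plus 1 visit to each of
# 5,000 open-world sites = 15,000 total accesses, shuffled once so the
# same order can be replayed for every experiment.
sensitive = [f"sensitive-{i}" for i in range(100)]   # placeholder names
open_world = [f"open-{i}" for i in range(5000)]      # placeholder names

schedule = sensitive * 100 + open_world
random.Random(42).shuffle(schedule)                  # illustrative fixed seed
```

Fixing the seed is what lets every browser/device experiment see the identical access order.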
To compute the accuracy of the fingerprinting, we utilize 10-fold cross validation
with a 90/10 train/test split. We report accuracy for two machine learning algorithms:
ridge regression on a minirocket [53] transform, and a minirocket transform followed
by a 1D CNN (configuration presented in Appendix Table C.1). The ridge regression with
Table 4.2: Accuracy for web-based cache occupancy website fingerprint on multiple
ARM devices

                                                Closed World         Open World
Device          CPU             Browser      Ridge Reg.   CNN    Ridge Reg.   CNN
MacBook Air     Apple M1        Chrome 89       95.6      92.2      88.1      89.8
MacBook Air     Apple M1        Safari 14       94.3      89.4      78.4      85.1
MacBook Air     Apple M1        Firefox 88      88.1      83.9      68.2      77.8
iPhone SE2      Apple A13       Safari 14       80.2      75.7      65.8      72.7
iPhone SE2      Apple A13       Chrome 90       80.2      75.9      65.0      73.3
Google Pixel 3  Snapdragon 845  Chrome 90       88.0      81.8      66.0      75.9
Table 4.3: Accuracy for native application cache occupancy website fingerprint on
multiple ARM devices

                                                Closed World         Open World
Device          CPU             Browser      Ridge Reg.   CNN    Ridge Reg.   CNN
MacBook Air     Apple M1        Chrome 89       92.5      85.7      84.3      85.7
MacBook Air     Apple M1        Safari 14       91.1      87.0      72.4      81.7
MacBook Air     Apple M1        Firefox 88      89.3      85.9      70.5      81.1
iPhone SE2      Apple A13       WebKit View     71.5      68.7      64.0      69.1
Google Pixel 3  Snapdragon 845  WebView         81.9      76.3      67.7      74.1
Table 4.4: Accuracy for GPU based website fingerprinting on ARM devices

                                                Closed World         Open World
Device          GPU             Browser      Ridge Reg.   CNN    Ridge Reg.   CNN
MacBook Air     Apple 7 Core    Chrome 89       90.5      85.3      76.6      81.4
Google Pixel 3  Adreno 630      Chrome 89       88.2      82.6      67.6      77.3
behavior is expected as the 1D CNN utilizes multiple convolutional and pooling layers
to extract features from the dataset and learn both spatial and temporal patterns.
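The evaluation protocol can be approximated with scikit-learn. In the sketch below, random features stand in for the minirocket-transformed traces (the real transform comes from a time-series library, and the Appendix C.1 CNN is omitted), so this only illustrates the 10-fold ridge-regression procedure:

```python
import numpy as np
from sklearn.linear_model import RidgeClassifierCV
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: 200 traces x 100 features, 10 balanced site labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 100))
y = rng.permutation(np.repeat(np.arange(10), 20))

# 10-fold cross validation: each fold trains on 90% and tests on 10%.
clf = RidgeClassifierCV(alphas=np.logspace(-3, 3, 10))
scores = cross_val_score(clf, X, y, cv=10)
```

With real minirocket features, `scores.mean()` corresponds to the ridge-regression accuracies reported in Tables 4.2-4.4; on this random stand-in data it naturally hovers near chance.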
We note that the cache occupancy channel performs best on the MacBook Air and
worst on the iPhone SE2. This is likely related to the design of both the cache
systems and the schedulers. The CPU cores in the MacBook Air are one generation
newer, and the M1 chip was designed specifically for desktop/laptop workloads and
likely tuned for multiprocess use. In addition, the M1 contains features that prevent
a single core from dominating the cache [64], while the A13 has been found to use
part of the shared high-performance L2 cache as an extra L2 cache for the low-power
cores [63]. Apple also varies the amount of cache that the high- and low-power cores
may access depending on the cores' DVFS states [63].
To analyze these effects, we conduct experiments with different buffer sizes. The
iPhone SE2 and Google Pixel 3 yield relatively straightforward values. The Google
Pixel 3 reports 2MB of shared cache, and we find that a 2MB buffer performs best in
the fingerprinting task. While the actual amount of shared cache provided to the
iPhone SE2's low-power cores is undocumented, we find that a 4MB buffer performs
best in both tested configurations. Interestingly, this 4MB buffer suggests that the
cache occupancy channel is solely utilizing the L2 cache of the low-power cores,
potentially indicating either that Apple schedules foreground browser rendering
processes to these low-power cores or that the 'extra' L2 cache shared with the high-
performance cores is not exclusively owned by either core type. The MacBook Air,
however, demonstrates vastly different behavior: a 4MB buffer performs best for
Google Chrome, a 10MB buffer for Mozilla Firefox, and a 24MB buffer for Apple's
Safari. As previously mentioned, these differences may stem from a number of factors,
including renderers and JavaScript engines. In general, attackers need to adjust their
attack strategies to such factors to achieve good overall performance.
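The buffer-sizing experiment reduces to timing a cache-line-stride pass over candidate eviction buffers. The sketch below shows the probe's shape in Python for exposition only: Python timing is far too coarse to resolve real cache effects, the actual spy walks JavaScript typed arrays, and the 64-byte stride and candidate sizes are illustrative.

```python
import time

def probe_ms(buf, stride=64):
    """Time one pass over buf, touching one byte per (assumed) cache line.

    In the real attack this pass slows down when another process occupies
    the shared cache; repeating the probe yields the occupancy trace.
    """
    start = time.perf_counter()
    total = 0
    for i in range(0, len(buf), stride):
        total += buf[i]
    return (time.perf_counter() - start) * 1000.0

# Sweep candidate eviction-buffer sizes, as in the experiments above.
timings = {mb: probe_ms(bytearray(mb * 1024 * 1024)) for mb in (2, 4, 10)}
```

The attacker picks the buffer size whose traces classify best on the target browser, which is what makes the per-browser tuning above necessary.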
Another possible factor in the reduced performance on mobile devices versus the
laptop form factor is the trend of websites delivering different pages to different
devices. A laptop visiting a website receives the full site, which usually contains
much more detailed content than the corresponding mobile version served to phones.
The vastly simplified mobile websites may appear more similar to one another through
the cache occupancy channel, resulting in the observed decrease in accuracy.
5
Both iOS and Android provide a mechanism called a webview to display web content
to users within an application. The webview functions as a web browser without the
navigation controls. The iOS and Android webview components behave nearly
identically to the system web browser.
different process priorities that have recently been shown to greatly affect which cores
a specific task executes on [124]; thus, mixing native and web browser processes may
result in unexpected scheduling. While the process in a background tab is very likely
to end up on the low-power cores, a native process may be scheduled on either core
type depending on how the operating system interprets its priority and whether it is
a user-facing process.
The impact of the OS scheduler is particularly evident when the performance of the
native and web-based attacks is compared across devices. The MacBook Air (with a
friendlier background process scheduler) experiences an average 2.5% drop in closed-
world accuracy, while the mobile devices exhibit an average 7.4% drop. Thus, it is
possible that in most cases the web browser actually provides a more stable attack
surface than the native application.
While it is difficult to directly compare to work on homogeneous x86 systems such
as [151], our open-world Safari performance is 4% better than their best neural
network configuration, and our closed-world attack is 22% better. One item that
complicates comparison to [151] is their open-world data: their work claims 99%
accuracy in delineating between sensitive and non-sensitive websites, which could
indicate significant differences between their open- and closed-world datasets. In
contrast, our work combines and randomizes the collection order of the open- and
closed-world datasets to ensure that no cross-sample ordering artifacts artificially
increase accuracy.
4.6.5 Countermeasures
There are several approaches that could potentially protect an ARM system from
these contention-based side channels. For example, the system can introduce noise
into the measurement channel via extra operations, or manipulate timers and array
accesses via obfuscation, as in Chrome Zero [145]. However, introducing extra noise
has been shown to be ineffective [151] and increases energy usage, which is
unacceptable for mobile devices. Moreover, Shusterman et al. [150] demonstrated that
the protections of Chrome Zero are largely ineffective and impose significant
performance penalties. Furthermore, browser-based defenses cannot thwart app-based
attacks.
Another defensive approach for energy-restricted devices is to remove process
contention via hardware segmentation, which can guarantee that processes are unable
to interact with one another. However, it requires a complete redesign of the
operating system scheduler and hardware. In future work, we plan to develop effective
defenses that detect significant contention and large swings in cache occupancy
(similar to [23]) on ARM devices.
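A detector of the kind we envision could be as simple as flagging observation windows whose peak-to-peak swing in the occupancy trace is too large. The sketch below is only a toy illustration of that idea (in the spirit of [23]); the window length, swing threshold, and sample traces are arbitrary, and a real defense would calibrate them against benign workloads.

```python
def swing_alerts(trace, window=8, max_swing=20):
    """Return start indices of windows whose peak-to-peak swing is suspicious."""
    alerts = []
    for i in range(len(trace) - window + 1):
        chunk = trace[i:i + window]
        if max(chunk) - min(chunk) > max_swing:
            alerts.append(i)
    return alerts

quiet = [10, 11, 10, 11, 10, 11, 10, 11, 10, 11]   # benign occupancy
noisy = [10, 11, 10, 40, 10, 42, 11, 39, 10, 11]   # spy-like swings
```

Here `swing_alerts(quiet)` raises no alerts, while the large oscillations in `noisy` trip the detector in every window that contains them.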
Our work provides a much deeper investigation of the cache occupancy channel
on ARM devices. In addition to Android and MacOS, we also study the iOS platform.
Furthermore, our approach differs from Shusterman's in that we develop a vastly
different method for cache accesses (Section 4.5.2), which increases accuracy on budget
devices with slower processors. We also study the effect of different browsers and their
memory management, demonstrating that simply sizing the eviction buffer based on
the shared cache yields suboptimal results across browser engines on the same
hardware (Section 4.6.1). We additionally increase the attack's efficiency, requiring
only 8 seconds of observation to identify a website instead of the 30 seconds required
in both [151, 150]. Even with nearly 75% less sampling time, our experiments
outperform Shusterman's work by more than 5% on the M1 MacBook Air with Google
Chrome. Finally, we propose (4.5.3) and test (4.6.4) a novel GPU-based contention
channel and demonstrate that it is nearly as effective as the cache occupancy channel
on ARM SoC devices, raising the alarm on continued access to SoC accelerator
components from JavaScript.
Website Fingerprinting Website fingerprinting has long been an attractive target
for attackers. As desktop browsers were the original way to browse the web, many
website fingerprinting attacks focused on breaking privacy-enhancing technologies
like HTTPS and Tor through attacks targeting features extracted from the packet
stream [35, 69, 81, 130, 139]. With the rise of mobile devices, more effort has been
spent examining them. MagneticSpy [115] examines both a JavaScript- and an app-
based CPU activity channel employing the magnetometer. They perform similar open-
and closed-world examinations (albeit with fewer websites) and demonstrate
impressive fingerprinting accuracy. However, the JavaScript APIs that allowed access
to these sensors have since been removed from Firefox and Safari [148]. Furthermore,
iOS requires that users explicitly grant a website permission before it may access
accelerometer data [96]. Several works [102, 183] examine power-based website
fingerprinting on smartphones; however, they require much higher-frequency sampling
and cannot mount the attack from a JavaScript platform. Jana et al. [87]
examined the memory allocations of website traffic, but required privileged access to
process memory data (now removed from standard user access). Spreitzer et al. [155]
utilized the data usage API within Android to fingerprint websites, but this must be
done from a native application.
ARM Attacks Gulmezoglu et al. [73] built a similar contention-based channel on
ARM devices, but mainly focused on finding contention among specific sets within the
device's cache ways. Their attack examines only the Google Pixel 5 and uses only
native system APIs. While the work presents impressive results, their system relies
upon identifying eviction sets within the cache. With a high-resolution timer this can
take a few seconds; however, the low-resolution timer available from JavaScript [133]
would make the task infeasibly slow. Lipp et al. [104] and Gruss et al. [72] similarly
construct memory-based JavaScript attacks, but require either privileged system calls
or higher-resolution timers than are currently available [133].
Timing Attacks from JavaScript Genkin et al. [66] executed encryption side channel
attacks from the browser, but utilized WebAssembly and shared array buffers to
construct a high-frequency timer. Oren et al. [126] similarly demonstrated that
eviction sets could be created via JavaScript timers and used this for a cursory
examination of website fingerprinting (not on ARM). Bosman et al. [29] demonstrated
page deduplication attacks from JavaScript. Each of these attacks requires high-
resolution timers that have since been removed from JavaScript [133]. Schwarz et
al. [146] demonstrated a number of interesting methods to achieve high-resolution
timing; however, many of these techniques have been disabled or hindered in major
browsers.
GPU Attacks Lee et al. [98] examined website fingerprinting via shared memory
within the GPU. Frigo et al. [62] executed a number of side channel attacks from a
mobile GPU; however, these attacks require timing primitives that have been removed.
He et al. [77] uncovered a register leakage within Intel GPUs and exploited it to
identify websites. Naghibijouybari et al. [120] utilized GPU memory allocation APIs
within CUDA or OpenGL to track memory allocations and fingerprint websites; they
did not explore ARM integrated GPUs or execution from a JavaScript environment,
instead employing a spy program that ran as a native process with full access to
CUDA/OpenGL. Karimi et al. [90] examined a side channel attack against an ARM
SoC GPU and extracted AES keys by exploiting cache behavior; however, the attack
requires a long execution time and a stable system not running other tasks, and was
not examined from a JavaScript perspective.
4.8 Conclusion
This chapter investigates whether the new ARM DynamIQ system design, specifically
the inclusion of a shared last level cache between all CPU cores and accelerators,
poses a security threat to individuals. We examine the information leakage in the
context of a website fingerprinting attack, demonstrating that a cache occupancy side
channel can be constructed to reliably fingerprint user website activities. We reveal this
security threat on Android, iOS, and MacOS, delving into how the channel responds to
different browser environments and proposing enhancements over previous works. In
addition, we unveil an accelerator based website fingerprinting channel, showing that
the SoC GPU can be exploited in a contention based side channel from JavaScript. Our
evaluation results indicate that both channels can achieve high website fingerprinting
accuracy on different browsers in Android, iOS, and MacOS systems in both open and
closed world scenarios.
Chapter 5
This dissertation has examined the security aspects of mobile devices and peripherals
from the perspective of side channels. Specifically, we have identified numerous side
channels that can be utilized both offensively and defensively in the realm of mobile
devices, uncovering novel techniques for stealing user input, defending secure
computers, and invading user privacy.
Finally, in Chapter 4 we analyze the current laptop/desktop marketplace, noting
that ARM chips are now targeted at laptop and desktop systems. We identify the
major architectural changes that ARM SoCs have undergone, noting an increased
number of cache levels and shared caches as well as the tight integration of accelerators
into the memory subsystem. We observe that these new shared caches share design
elements with the x86 processors that proved vulnerable to cache contention side
channels, and demonstrate that with some optimization the cache contention channel
can be ported to ARM platforms with high accuracy. We further develop a novel GPU
contention channel in the SoC to fingerprint the websites a user visits.
• Investigating screen leakage from applications and text input fields. Charger
Surfing demonstrated that animations on the touch screen of a mobile phone create
a leakage channel that allows attackers to identify the location of onscreen
animations, and its case study showed significant success in identifying user
passcodes. One barrier in the work is the amount of time and training data required
to create an accurate classifier. Designing an automated process to extract
animation content from applications that use the system keyboard would be an
ideal extension. Further work investigating the precision of the channel, and
whether it could extract screen content unrelated to animations (e.g., identifying
text on the screen), would also be valuable.
timing channel may be identifiable in the flash memory utilized by modern laptops,
desktops, and mobile phones. If so, it could serve as another layer of authentication
to ensure that a user is logging in from their own device. If an attacker can find a
way to reliably read the same location from disk (an infrequently used file or system
component), this type of channel may also be extended to work from JavaScript and
act as another method of device fingerprinting.
• The ARM cache and GPU contention channels can be extended by examining a
native implementation and by broadening the types of user activity analysis,
potentially identifying the applications a user is running. Manipulating the GPU
from JavaScript is far less precise than manipulating the CPU cache; however,
further study into how the GPU is shared between rendering processes would be
beneficial. Interestingly, the push toward machine learning has led to draft
standards like WebGPU [175], which may actually make the GPU contention
channel far easier to realize. Finally, investigating the other peripherals that share
the last level cache could be beneficial. Image signal processors, digital signal
processors, and neural processing units all currently have access to the cache;
determining whether these peripherals can be reached from a JavaScript context
is an important next step toward understanding the risks of tight accelerator
integration into SoCs.
BIBLIOGRAPHY
[14] Sebastian Angel, Riad S. Wahby, Max Howald, Joshua B. Leners, Michael Spilo,
Zhen Sun, Andrew J. Blumberg, and Michael Walfish. Defending against malicious
peripherals with cinch. In USENIX Security Symposium, pages 397–414, 2016.
[16] Adam Aviv, John Davin, Flynn Wolf, and Ravi Kuber. Towards Baselines for
Shoulder Surfing on Mobile Authentication. In Proceedings of the 33rd Annual
Computer Security Applications Conference, 2017.
[17] Adam Aviv, Katherine Gibson, Evan Mossop, Matt Blaze, and Jonathan M Smith.
Smudge Attacks on Smartphone Touch Screens. Proceedings of 4th USENIX
Workshop on Offensive Technologies, 2010.
[18] Michael Backes, Tongbo Chen, Markus Duermuth, Hendrik Lensch, and Martin
Welk. Tempest in a Teapot: Compromising Reflections Revisited. In Proceedings
of the 30th IEEE Symposium on Security and Privacy, 2009.
[20] Darrin Barrall and David Dewey. Plug and Root, the USB Key to the Kingdom.
Presentation at Black Hat Briefings, 2005.
[21] Adam Bates, R. Leonard, Hannah Pruse, Daniel Lowd, and K. Butler. Leveraging
USB to Establish Host Identity Using Commodity Devices. In ISOC Network
and Distributed System Symposium (NDSS), 2014.
[22] Adam Bates, Dave (Jing) Tian, Kevin R.B. Butler, and Thomas Moyer. Trustwor-
thy whole-system provenance for the linux kernel. In USENIX Security Symposium,
pages 319–334, 2015.
[24] Ryad Benadjila, Arnauld Michelizza, Mathieu Renard, Philippe Thierry, and
Philippe Trebuchet. WooKey: Designing a Trusted and Efficient USB Device. In
ACM Computer Security Applications Conference (ACSAC), page 673–686, 2019.
[25] Yigael Berger, Avishai Wool, and Arie Yeredor. Dictionary Attacks Using Key-
board Acoustic Emanations. In Proceedings of the 13th ACM conference on
Computer and Communications Security, 2006.
[26] H. Bhargava and S. Sharma. Secured use of USB over the Intranet with anonymous
device Identification. In IEEE Conference on Communication Systems and
Network Technologies (CSNT), pages 49–53, 2018.
[28] Hristo Bojinov, Yan Michalevsky, Gabi Nakibly, and Dan Boneh. Mobile device
identification via sensor fingerprinting. https://arxiv.org/pdf/2002.05905.pdf,
2014.
[29] Erik Bosman, Kaveh Razavi, Herbert Bos, and Cristiano Giuffrida. Dedup est
machina: Memory deduplication as an advanced exploitation vector. In IEEE
Symposium on Security and Privacy (SP), 2016.
[30] Vladimir Brik, Suman Banerjee, Marco Gruteser, and Sangho Oh. Wireless device
identification with radiometric signatures. In Conference on Mobile Computing
and Networking (MobiCom), page 116–127, 2008.
[31] Niels Brouwers, Marco Zuniga, and Koen Langendoen. NEAT: a Novel Energy
Analysis Toolkit for Free-Roaming Smartphones. In Proceedings of the 12th ACM
Conference on Embedded Network Sensor Systems, 2014.
[33] Eric Byres. The Air Gap: SCADA’s Enduring Security Myth. Commun. ACM,
page 29–31, August 2013.
[34] Liang Cai and Hao Chen. TouchLogger: Inferring Keystrokes on Touch Screen
from Smartphone Motion. Proceedings of the USENIX HotSec, 2011.
[35] Xiang Cai, Xin Cheng Zhang, Brijesh Joshi, and Rob Johnson. Touching from a
distance: Website fingerprinting attacks and defenses. In Computer and Commu-
nications Security (CCS), 2012.
[37] Hai-Wei Chen, Jiun-Haw Lee, Bo-Yen Lin, Stanley Chen, and Shin-Tson Wu.
Liquid Crystal Display and Organic Light-Emitting Diode Display: Present Status
and Future Perspectives. Light: Science & Applications, 2018.
[38] Li Chen, Jiacheng Xia, Bairen Yi, and Kai Chen. PowerMan: An Out-of-Band
Management Network for Datacenters Using Power Line Communication. In
Proceedings of the 15th USENIX Symposium on Networked Systems Design and
Implementation, 2018.
[39] Qi Alfred Chen, Zhiyun Qian, and Z. Morley Mao. Peeking into your app without
actually seeing it: UI state inference and novel android attacks. In 23rd USENIX
Security Symposium (USENIX Security 14), 2014.
[40] Qi Alfred Chen, Zhiyun Qian, and Zhuoqing Morley Mao. Peeking into Your App
without Actually Seeing It: UI State Inference and Novel Android Attacks. In
Proceedings of the 23rd USENIX Security Symposium, 2014.
[41] Xiang Chen, Yiran Chen, Zhan Ma, and Felix Fernandes. How is Energy Consumed
in Smartphone Display Applications? In Proceedings of the 14th ACM Workshop
on Mobile Computing Systems and Applications, 2013.
[42] Yimin Chen, Xiaocong Jin, Jingchao Sun, Rui Zhang, and Yanchao Zhang.
POWERFUL: Mobile App Fingerprinting via Power Analysis. In Proceedings of
the IEEE Conference on Computer Communications, 2017.
[43] Yimin Chen, Tao Li, Rui Zhang, Yanchao Zhang, and Terri Hedgpeth. Eye-
Tell: Video-Assisted Touchscreen Keystroke Inference from Eye Movements. In
Proceedings of the 2018 IEEE Symposium on Security and Privacy, 2018.
[44] Yushi Cheng, Xiaoyu Ji, Juchuan Zhang, Wenyuan Xu, and Yi-Chao Chen.
DeMiCPU: Device Fingerprinting with Magnetic Signals Radiated by CPU. In
ACM Conference on Computer and Communications Security (CCS), 2019.
[45] Shane Clark, Hossen Mustafa, Benjamin Ransford, Jacob Sorber, Kevin Fu, and
Wenyuan Xu. Current Events: Identifying Webpages by Tapping the Electrical
Outlet. In European Symposium on Research in Computer Security. Springer,
2013.
[48] David Cock, Qian Ge, Toby Murray, and Gernot Heiser. The last mile: An
empirical study of timing channels on sel4. In Computer and Communications
Security (CCS), 2014.
[49] Compaq, Hewlett-Packard, Intel, Lucent, Microsoft, NEC, and Philips. Universal
Serial Bus Specification, Revision 2.0, 2000.
[50] Mauro Conti, Michele Nati, Enrico Rotundo, and Riccardo Spolaor. Mind the
Plug! Laptop-User Recognition Through Power Consumption. In Proceedings of
the 2nd ACM International Workshop on IoT Privacy, Trust, and Security, 2016.
[51] Patrick Cronin and Chengmo Yang. A fetching tale: Covert communication with
the hardware prefetcher. In IEEE International Symposium on Hardware Oriented
Security and Trust (HOST), 2019.
[52] Andy Davis. Revealing Embedded Fingerprints: Deriving Intelligence from USB
Stack Interactions. Technical report, nccgroup, 2013.
[54] Wenrui Diao, Xiangyu Liu, Zhou Li, and Kehuan Zhang. No pardon for the
interruption: New inference attacks on android through interrupt timing analysis.
In IEEE Symposium on Security and Privacy (SP), pages 414–432, 2016.
[55] Mian Dong and Lin Zhong. Chameleon: a Color-Adaptive Web Browser for
Mobile OLED Displays. In Proceedings of the 9th International Conference on
Mobile Systems, Applications, and Services, 2011.
[59] Jingyao Fan, Qinghua Li, and Guohong Cao. Privacy Disclosure Through Smart
Meters: Reactive Power Based Attack and Defense. In Proceedings of the 47th An-
nual IEEE/IFIP International Conference on Dependable Systems and Networks,
2017.
[60] Dinei Florencio and Cormac Herley. A Large-Scale Study of Web Password Habits.
In Proceedings of the 16th International Conference on World Wide Web. ACM,
2007.
[61] USB Implementers Forum. Defined class codes. https://www.usb.org/defined-class-codes.
[62] Pietro Frigo, Cristiano Giuffrida, Herbert Bos, and Kaveh Razavi. Grand pwning
unit: Accelerating microarchitectural attacks with the gpu. In IEEE Symposium
on Security and Privacy (SP), 2018.
[64] Andrei Frumusanu. The 2020 Mac Mini Unleashed: Putting Apple Silicon M1 To
The Test. https://www.anandtech.com/show/16252/mac-mini-apple-m1-tested,
Nov 2020.
[65] Daniel Genkin, Lev Pachmanov, Itamar Pipman, Eran Tromer, and Yuval Yarom.
ECDSA Key Extraction from Mobile Devices via Nonintrusive Physical Side
Channels. In Proceedings of the 2016 ACM SIGSAC Conference on Computer
and Communications Security, 2016.
[66] Daniel Genkin, Lev Pachmanov, Eran Tromer, and Yuval Yarom. Drive-by
key-extraction cache attacks from portable code. In Bart Preneel and Frederik
Vercauteren, editors, Applied Cryptography and Network Security. Springer
International Publishing, 2018.
[67] Daniel Genkin, Mihir Pattani, Roei Schuster, and Eran Tromer. Synesthesia:
Detecting screen content via remote acoustic side channels. In IEEE Symposium
on Security and Privacy (SP), 2019.
[68] Daniel Genkin, Itamar Pipman, and Eran Tromer. Get Your Hands off My Laptop:
Physical Side-Channel Key-Extraction Attacks on PCs. Journal of Cryptographic
Engineering, 2015.
[69] Xun Gong, Nikita Borisov, Negar Kiyavash, and Nabil Schear. Website detection
using remote traffic analysis. In Privacy Enhancing Technologies Symposium
(PETS), 2012.
[72] Daniel Gruss, David Bidner, and Stefan Mangard. Practical memory deduplication
attacks in sandboxed javascript. In Günther Pernul, Peter Y A Ryan, and Edgar
Weippl, editors, Computer Security – ESORICS 2015. Springer International
Publishing, 2015.
[73] Berk Gulmezoglu, Andreas Zankl, M. Caner Tol, Saad Islam, Thomas Eisenbarth,
and Berk Sunar. Undermining user privacy on mobile devices using ai. In
Proceedings of the 2019 ACM Asia Conference on Computer and Communications
Security, Asia CCS ’19. Association for Computing Machinery, 2019.
[74] Zimu Guo, Xiaolin Xu, Mark M. Tehranipoor, and Domenic Forte. Ffd: A
framework for fake flash detection. In ACM Design Automation Conference
(DAC), 2017.
[75] Mordechai Guri, Boris Zadov, Dima Bykhovsky, and Yuval Elovici. PowerHammer:
Exfiltrating Data from Air-Gapped Computers through Power Lines. IEEE
Transactions on Information Forensics and Security, 2020.
[77] Wenjian He, Wei Zhang, Sharad Sinha, and Sanjeev Das. iGPU leak: An
information leakage vulnerability on Intel integrated GPU. In 2020 25th Asia and
South Pacific Design Automation Conference (ASP-DAC), 2020.
[78] Grant Hernandez, Farhaan Fowze, Dave (Jing) Tian, Tuba Yavuz, and Kevin R.B.
Butler. Firmusb: Vetting usb device firmware using domain informed symbolic
execution. In ACM Conference on Computer and Communications Security
(CCS), page 2245–2262, 2017.
[79] Hewlett-Packard, Intel, Microsoft, NEC, ST-NXP Wireless, and Texas Instruments.
Universal Serial Bus 3.0 Specification, Revision 1.0, 2008.
[82] Omar Adel Ibrahim, Savio Sciancalepore, Gabriele Oligeri, and Roberto Di Pietro.
Magneto: Fingerprinting usb flash drives via unintentional magnetic emissions.
ACM Trans. Embed. Comput. Syst., 2020.
[84] Hassan Ismail Fawaz, Germain Forestier, Jonathan Weber, Lhassane Idoumghar,
and Pierre-Alain Muller. Deep Learning for Time Series Classification: A Review.
Data Mining and Knowledge Discovery, 2019.
[85] Jeffrey Robert Jacobs. Measuring the Effectiveness of the USB Flash Drive as a
Vector for Social Engineering Attacks on Commercial and Residential Computer
Systems. Master’s Thesis Embry-Riddle Aeronautical University, 2011.
[86] Karsten Nohl and Jakob Lell. BadUSB - On Accessories that Turn Evil. Blackhat
USA, 2014.
[87] Suman Jana and Vitaly Shmatikov. Memento: Learning secrets from process
footprints. In IEEE Symposium on Security and Privacy (SP), 2012.
[88] Shijie Jia, Luning Xia, Zhan Wang, Jingqiang Lin, Guozhu Zhang, and Yafei
Ji. Extracting robust keys from nand flash physical unclonable functions. In
Conference on Information Security (ISC), page 437–454. Springer-Verlag, 2015.
[89] Peter C. Johnson, Sergey Bratus, and Sean W. Smith. Protecting against malicious
bits on the wire: Automatically generating a usb protocol parser for a production
kernel. In ACM Annual Computer Security Applications Conference (ACSAC),
page 528–541, 2017.
[90] Elmira Karimi, Zhen Hang Jiang, Yunsi Fei, and David Kaeli. A timing side-
channel attack on a mobile gpu. In IEEE 36th International Conference on
Computer Design (ICCD), 2018.
[91] Amin Kharraz, Brandon L. Daley, Graham Z. Baker, William Robertson, and
Engin Kirda. USBESAFE: An end-point solution to protect against usb-based
attacks. In USENIX Research in Attacks, Intrusions and Defenses (RAID), 2019.
[93] Paul Kocher, Jann Horn, Anders Fogh, Daniel Genkin, Daniel Gruss, Werner
Haas, Mike Hamburg, Moritz Lipp, Stefan Mangard, Thomas Prescher, Michael
Schwarz, and Yuval Yarom. Spectre attacks: Exploiting speculative execution. In
40th IEEE Symposium on Security and Privacy (S&P), 2019.
[94] Paul Kocher, Joshua Jaffe, and Benjamin Jun. Differential Power Analysis. In
Proceedings of the Annual International Cryptology Conference. Springer, 1999.
[96] Andy Kong. Accessing the iPhone accelerometer with JavaScript in iOS 14 and 13.
https://kongmunist.medium.com/accessing-the-iphone-accelerometer-with-javascript-in-ios-14-and-13-e146d18bb175, Nov 2020.
[98] Sangho Lee, Youngsok Kim, Jangwoo Kim, and Jong Kim. Stealing webpages
rendered on your browser by exploiting gpu vulnerabilities. In IEEE Symposium
on Security and Privacy, 2014.
[99] Lara Letaw, Joe Pletcher, and Kevin Butler. Host Identification via USB Finger-
printing. In International Workshop on Systematic Approaches to Digital Forensic
Engineering (SADFE), page 1–9, 2011.
[100] Lingjun Li, Xinxin Zhao, and Guoliang Xue. Unobservable Re-Authentication for
Smartphones. In Proceedings of the 20th Network and Distributed System Security
Symposium, 2013.
[101] Yanlin Li, Jonathan M. McCune, and Adrian Perrig. Viper: Verifying the integrity
of peripherals’ firmware. In ACM Conference on Computer and Communications
Security (CCS), page 3–16, 2011.
[102] Pavel Lifshits, Roni Forte, Yedid Hoshen, Matt Halpern, Manuel Philipose, Mohit
Tiwari, and Mark Silberstein. Power to peep-all: Inference attacks by malicious
batteries on mobile devices. Proceedings on Privacy Enhancing Technologies,
2018.
[103] Chia-Chi Lin, Hongyang Li, Xiao-yong Zhou, and XiaoFeng Wang. Screenmilker:
How to milk your android screen for secrets. In 21st Annual Network and
Distributed System Security Symposium, NDSS, 2014.
[104] Moritz Lipp, Daniel Gruss, Raphael Spreitzer, Clémentine Maurice, and Stefan
Mangard. Armageddon: Cache attacks on mobile devices. In 25th USENIX
Security Symposium (USENIX Security 16). USENIX Association, 2016.
[105] Moritz Lipp, Michael Schwarz, Daniel Gruss, Thomas Prescher, Werner Haas,
Anders Fogh, Jann Horn, Stefan Mangard, Paul Kocher, Daniel Genkin, Yuval
Yarom, and Mike Hamburg. Meltdown: Reading kernel memory from user space.
In 27th USENIX Security Symposium (USENIX Security 18), pages 973–990,
Baltimore, MD, August 2018. USENIX Association.
[106] Fangfei Liu, Yuval Yarom, Qian Ge, Gernot Heiser, and Ruby B. Lee. Last-level
cache side-channel attacks are practical. In 2015 IEEE Symposium on Security
and Privacy, 2015.
[112] Xiao Ma, Peng Huang, Xinxin Jin, Pei Wang, Soyeon Park, Dongcai Shen,
Yuanyuan Zhou, Lawrence Saul, and Geoffrey Voelker. eDoctor: Automatically
Diagnosing Abnormal Battery Drain Issues on Smartphones. In Proceedings of
the 10th USENIX Symposium on Networked Systems Design and Implementation,
2013.
[113] Jani Mantyjarvi, Mikko Lindholm, Elena Vildjiounaite, S-M Makela, and
HA Ailisto. Identifying Users of Portable Devices from Gait Pattern with Ac-
celerometers. In Proceedings of IEEE International Conference on Acoustics,
Speech, and Signal Processing, 2005.
[115] Nikolay Matyunin, Yujue Wang, Tolga Arul, Kristian Kullmann, Jakub Szefer,
and Stefan Katzenbeisser. Magneticspy: Exploiting magnetometer in mobile
devices for website and application fingerprinting. In Proceedings of the 18th
ACM Workshop on Privacy in the Electronic Society. Association for Computing
Machinery, 2019.
[116] Yan Michalevsky, Aaron Schulman, Gunaa Arumugam Veerapandian, Dan Boneh,
and Gabi Nakibly. PowerSpy: Location Tracking Using Mobile Device Power
Analysis. In Proceedings of the 24th USENIX Security Symposium, 2015.
[117] Micron. NAND Flash 101: An Introduction to NAND Flash and How to Design
It In to Your Next Product, TN-29-19. Technical report, 2010.
[118] Emiliano Miluzzo, Alexander Varshavsky, Suhrid Balakrishnan, and Romit Roy
Choudhury. Tapprints: Your Finger Taps Have Fingerprints. In Proceedings of
the 10th ACM International Conference on Mobile Systems, Applications, and
Services, 2012.
[119] John Monaco. SoK: Keylogging Side Channels. In Proceedings of the 2018 IEEE
Symposium on Security and Privacy. IEEE, 2018.
[120] Hoda Naghibijouybari, Ajaya Neupane, Zhiyun Qian, and Nael Abu-Ghazaleh.
Rendered insecure: Gpu side channel attacks are practical. In Proceedings of
the 2018 ACM SIGSAC Conference on Computer and Communications Security.
Association for Computing Machinery, 2018.
[122] Sebastian Neuner, Artemios G. Voyiatzis, Spiros Fotopoulos, Collin Mulliner, and
Edgar R. Weippl. USBlock: Blocking USB-Based Keypress Injection Attacks. In
Data and Applications Security and Privacy, pages 278–295. Springer International
Publishing, 2018.
[123] T. Nguyen, S. Park, and D. Shin. Extraction of device fingerprints using built-in
erase-suspend operation of flash memory devices. IEEE Access, pages 98637–98646,
2020.
[124] Howard Oakley. How M1 Macs feel faster than Intel models: it’s about QoS. https://eclecticlight.co/2021/05/17/how-m1-macs-feel-faster-than-intel-models-its-about-qos/, May 2021.
[125] National Institute of Standards and Technology. Security and privacy controls for
federal information systems and organizations, 2020.
[127] Dag Arne Osvik, Adi Shamir, and Eran Tromer. Cache attacks and countermeasures: the case of AES. In David Pointcheval, editor, Topics in Cryptology – CT-RSA 2006. Springer Berlin Heidelberg, 2006.
[128] Emmanuel Owusu, Jun Han, Sauvik Das, Adrian Perrig, and Joy Zhang. ACCes-
sory: Password Inference Using Accelerometers on Smartphones. In Proceedings
of the 12th ACM Workshop on Mobile Computing Systems and Applications, 2012.
[129] J.L. Padilla, P. Padilla, J.F. Valenzuela-Valdés, J. Ramírez, and J.M. Górriz. RF fingerprint measurements for the identification of devices in wireless communication networks based on feature reduction and subspace transformation. Measurement, pages 468–475, 2014.
[130] Andriy Panchenko, Fabian Lanze, Andreas Zinnen, Martin Henze, Jan Pennekamp,
Klaus Wehrle, and Thomas Engel. Website fingerprinting at internet scale. In
Network and Distributed Systems Symposium (NDSS), 2016.
[131] Abhinav Pathak, Charlie Hu, and Ming Zhang. Where is the Energy Spent Inside
My App?: Fine Grained Energy Accounting on Smartphones with eprof. In
Proceedings of the 7th ACM European Conference on Computer Systems, 2012.
[132] Abhinav Pathak, Charlie Hu, Ming Zhang, Paramvir Bahl, and Yi-Min Wang.
Fine-Grained Power Modeling for Smartphones Using System Call Tracing. In
Proceedings of the 6th ACM European Conference on Computer Systems, 2011.
[133] Filip Pizlo. What spectre and meltdown mean for webkit.
https://webkit.org/blog/8048/what-spectre-and-meltdown-mean-for-webkit/, Jan
2018.
[134] Raymond Pompon. Attacking Air-Gap-Segregated Computers. https://www.f5.com/labs/articles/cisotociso/attacking-air-gap-segregated-computers, 2018.
[135] Pravin Prabhu, Ameen Akel, Laura M. Grupp, Wing-Kei S. Yu, G. Edward
Suh, Edwin Kan, and Steven Swanson. Extracting device fingerprints from flash
memory by exploiting physical variations. In Trust and Trustworthy Computing.
Springer Berlin Heidelberg, 2011.
[139] Vera Rimmer, Davy Preuveneers, Marc Juarez, Tom Van Goethem, and Wouter
Joosen. Automated website fingerprinting through deep learning. In Network and
Distributed Systems Symposium (NDSS), 2018.
[140] Thomas Ristenpart, Eran Tromer, Hovav Shacham, and Stefan Savage. Hey, you,
get off of my cloud: exploring information leakage in third-party compute clouds.
In Computer and Communications Security (CCS), 2009.
[141] J. Rogers. Please Enter Your Four-Digit PIN. Financial Services Technology, US
Edition, 2007.
[144] Paul Sawers. US Govt. plant USB sticks in security study, 60% of subjects take the bait. https://thenextweb.com/insider/2011/06/28/us-govt-plant-usb-sticks-in-security-study-60-of-subjects-take-the-bait/, 2011.
[145] Michael Schwarz, Moritz Lipp, and Daniel Gruss. JavaScript Zero: Real JavaScript and zero side-channel attacks. In Network and Distributed System Security
Symposium, 2018.
[146] Michael Schwarz, Clémentine Maurice, Daniel Gruss, and Stefan Mangard. Fantas-
tic timers and where to find them: High-resolution microarchitectural attacks in
javascript. In Aggelos Kiayias, editor, Financial Cryptography and Data Security.
Springer International Publishing, 2017.
[149] Len Sherman. The Basics of USB Battery Charging: A Survival Guide. Maxim
Integrated Products, Inc., 2010.
[150] Anatoly Shusterman, Ayush Agarwal, Sioli O’Connell, Daniel Genkin, Yossi Oren,
and Yuval Yarom. Prime+Probe 1, JavaScript 0: Overcoming browser-based
side-channel defenses. In 30th USENIX Security Symposium (USENIX Security),
2021.
[151] Anatoly Shusterman, Lachlan Kang, Yarden Haskal, Yosef Meltser, Prateek Mittal,
Yossi Oren, and Yuval Yarom. Robust website fingerprinting through the cache
occupancy channel. In 28th USENIX Security Symposium (USENIX Security 19),
2019.
[152] Zdeňka Sitová, Jaroslav Šeděnka, Qing Yang, Ge Peng, Gang Zhou, Paolo Gasti,
and Kiran Balagani. HMOG: New Behavioral Biometric Features for Continuous
Authentication of Smartphone Users. IEEE Transactions on Information Forensics
and Security, 2016.
[154] Riccardo Spolaor, Laila Abudahi, Veelasha Moonsamy, Mauro Conti, and Radha
Poovendran. No Free Charge Theorem: A Covert Channel via USB Charging
Cable on Mobile Devices. In International Conference on Applied Cryptography
and Network Security. Springer, 2017.
[155] Raphael Spreitzer, Simone Griesmayr, Thomas Korak, and Stefan Mangard.
Exploiting data-usage statistics for website fingerprinting attacks on android. In
Proceedings of the 9th ACM Conference on Security & Privacy in Wireless and
Mobile Networks, WiSec ’16, 2016.
[157] Yang Su, Daniel Genkin, Damith Ranasinghe, and Yuval Yarom. USB Snooping
Made Easy: Crosstalk Leakage Attacks on USB Hubs. In Proceedings of the 26th
USENIX Security Symposium, 2017.
[158] Jingchao Sun, Xiaocong Jin, Yimin Chen, Jinxue Zhang, Yanchao Zhang, and
Rui Zhang. VISIBLE: Video-Assisted Keystroke Inference from Tablet Backside
Motion. In Proceedings of the 23rd Network and Distributed System Security
Symposium, 2016.
[161] Di Tang, Zhe Zhou, Yinqian Zhang, and Kehuan Zhang. Face Flashing: a Secure
Liveness Detection Protocol based on Light Reflections. Proceedings of the 25th
Network and Distributed System Security Symposium, 2018.
[164] Dave Jing Tian, Adam Bates, and Kevin Butler. Defending Against Malicious
USB Firmware with GoodUSB. In Proceedings of the 31st Annual Computer
Security Applications Conference, 2015.
[166] Dave (Jing) Tian, Grant Hernandez, Joseph I. Choi, Vanessa Frost, Christie Raules,
Patrick Traynor, Hayawardh Vijayakumar, Lee Harrison, Amir Rahmati, Michael
Grace, and Kevin Butler. ATtention Spanned: Comprehensive Vulnerability
Analysis of AT Commands Within the Android Ecosystem. In Proceedings of the
27th USENIX Security Symposium, 2018.
[167] Dave (Jing) Tian, Nolen Scaife, Adam Bates, Kevin Butler, and Patrick Traynor.
Making USB Great Again with USBFILTER. In Proceedings of the 25th USENIX
Security Symposium, 2016.
[169] J. Tian, N. Scaife, D. Kumar, M. Bailey, A. Bates, and K. Butler. SoK: “Plug &
Pray” Today – Understanding USB Insecurity in Versions 1 Through C. In IEEE
Symposium on Security and Privacy (S&P), pages 1032–1047, 2018.
[174] Y. Wang, W. Yu, S. Wu, G. Malysa, G. E. Suh, and E. C. Kan. Flash memory
for ubiquitous hardware security functions: True random number generation and
device fingerprints. In IEEE Symposium on Security and Privacy (S&P), pages
33–47, 2012.
[177] Weitao Xu, Guohao Lan, Qi Lin, Sara Khalifa, Neil Bergmann, Mahbub Hassan,
and Wen Hu. Keh-Gait: Towards a Mobile Healthcare User Authentication
System by Kinetic Energy Harvesting. In Proceedings of the 24th Network and
Distributed System Security Symposium, 2017.
[178] Yi Xu, Jared Heinly, Andrew White, Fabian Monrose, and Jan-Michael Frahm.
Seeing Double: Reconstructing Obscured Typed Input from Repeated Compro-
mising Reflections. In Proceedings of the 2013 ACM SIGSAC conference on
Computer and communications security, 2013.
[179] Zhi Xu, Kun Bai, and Sencun Zhu. Taplogger: Inferring User Inputs on Smart-
phone Touchscreens Using On-Board Motion Sensors. In Proceedings of the 5th
ACM Conference on Security and Privacy in Wireless and Mobile Networks, 2012.
[180] Lin Yan, Yao Guo, Xiangqun Chen, and Hong Mei. A Study on Power Side
Channels on Mobile Devices. In Proceedings of the 7th Asia-Pacific Symposium
on Internetware, 2015.
[181] Bo Yang, Yu Qin, Zhang Yingjun, Weijin Wang, and Dengguo Feng. TMSUI:
A Trust Management Scheme of USB Storage Devices for Industrial Control
Systems. In Information and Communications Security, pages 152–168, 2016.
[182] Qing Yang, Paolo Gasti, Gang Zhou, Aydin Farajidavar, and Kiran Balagani.
On Inferring Browsing Activity on Smartphones via USB Power Analysis Side-
Channel. IEEE Transactions on Information Forensics and Security, 2017.
[184] Guixin Ye, Zhanyong Tang, Dingyi Fang, Xiaojiang Chen, Kwang In Kim, Ben
Taylor, and Zheng Wang. Cracking Android Pattern Lock in Five Attempts. In
Proceedings of the 24th Network and Distributed System Security Symposium,
2017.
[185] Qinggang Yue, Zhen Ling, Xinwen Fu, Benyuan Liu, Kui Ren, and Wei Zhao.
Blind Recognition of Touched Keys on Mobile Devices. In Proceedings of the 2014
ACM SIGSAC Conference on Computer and Communications Security, 2014.
[186] Pete Zaitcev. The usbmon: USB monitoring framework, 2005.
[187] J. Zhang, A. R. Beresford, and I. Sheret. SensorID: Sensor Calibration Finger-
printing for Smartphones. In IEEE Symposium on Security and Privacy (S&P),
pages 638–655, 2019.
[188] Lide Zhang, Birjodh Tiwana, Zhiyun Qian, Zhaoguang Wang, Robert Dick, Morley
Mao, and Lei Yang. Accurate Online Power Estimation and Automatic Battery
Behavior Based Power Model Generation for Smartphones. In Proceedings of the
8th IEEE/ACM/IFIP International Conference on Hardware/Software Codesign
and System Synthesis, 2010.
[189] Xiaokuan Zhang, Xueqiang Wang, Xiaolong Bai, Yinqian Zhang, and XiaoFeng
Wang. OS-level Side Channels without Procfs: Exploring Cross-App Information
Leakage on iOS. In Proceedings of the 25th Network and Distributed System
Security Symposium, 2018.
[191] Xiaokuan Zhang, Yuan Xiao, and Yinqian Zhang. Return-Oriented Flush-Reload
Side Channels on ARM and their Implications for Android Devices. In Proceedings
of the 2016 ACM SIGSAC Conference on Computer and Communications Security,
2016.
[192] Nan Zheng, Kun Bai, Hai Huang, and Haining Wang. You Are How You Touch:
User Verification on Smartphones via Tapping Behaviors. In Proceedings of the
IEEE 22nd International Conference on Network Protocols, 2014.
[193] Man Zhou, Qian Wang, Jingxiao Yang, Qi Li, Feng Xiao, Zhibo Wang, and
Xiaofen Chen. PatternListener: Cracking Android Pattern Lock Using Acoustic
Signals. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and
Communications Security, 2018.
[194] Xiaoyong Zhou, Soteris Demetriou, Dongjing He, Muhammad Naveed, Xiaorui
Pan, XiaoFeng Wang, Carl A. Gunter, and Klara Nahrstedt. Identity, location,
disease and more: Inferring your secrets from android public resources. In
Proceedings of the ACM SIGSAC Conference on Computer & Communications
Security, 2013.
[195] Li Zhuang, Feng Zhou, and Doug Tygar. Keyboard Acoustic Emanations Revisited.
ACM Transactions on Information and System Security, 2009.
Appendix A
Table A.1: Smartphones Used For Evaluation

Phone (Release Year)        | OS            | Processor                                  | GPU            | Screen Resolution | Screen Technology
Motorola G4 (2016)          | Android 6.0.1 | 4 x 1.5 GHz A-53, 4 x 1.2 GHz A-53         | Adreno 405     | 1920x1080         | LCD
Samsung Galaxy Nexus (2012) | Android 6.0.1 | 2 x 1.2 GHz A-9                            | PowerVR SGX540 | 1280x720          | Super AMOLED
Apple iPhone 6+ (2014)      | iOS 12.1      | 2 x 1.4 GHz Typhoon                        | PowerVR GX6450 | 1920x1080         | LCD
Apple iPhone 8+ (2017)      | iOS 12.1.2    | 2 x 2.3 GHz Monsoon, 4 x 1.4 GHz Mistral   | Apple GPU      | 1920x1080         | LCD
Table A.2: Classification Network Used for iPhone

Layer | Operation         | Kernel Size
1     | Input             | 100000x1
2     | Convolution       | 50x50
3     | MaxPool           | 5
4     | Convolution       | 50x50
5     | MaxPool           | 5
6     | Convolution       | 50x50
7     | MaxPool           | 5
8     | Convolution       | 50x50
9     | GlobalAveragePool | -
10    | Dropout           | 0.5
11    | Dense             | 10
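The network in Table A.2 is a one-dimensional convolutional classifier. As a sanity check on the listed sizes, the sketch below walks the tensor length through each stage. It assumes "valid" (no-padding) convolutions with stride 1, "50x50" meaning 50 filters of kernel length 50, and non-overlapping pooling windows; those conventions are assumptions on my part, not stated in the table.

```python
def conv1d_len(n: int, kernel: int, stride: int = 1) -> int:
    """Output length of a 'valid' (no-padding) 1-D convolution."""
    return (n - kernel) // stride + 1

def maxpool1d_len(n: int, pool: int) -> int:
    """Output length of non-overlapping 1-D max pooling."""
    return n // pool

n = 100_000          # layer 1: input of shape 100000x1
trace = [n]
for _ in range(3):   # layers 2-7: three Convolution(50) + MaxPool(5) pairs
    n = conv1d_len(n, 50)
    trace.append(n)
    n = maxpool1d_len(n, 5)
    trace.append(n)
n = conv1d_len(n, 50)  # layer 8: final Convolution(50)
trace.append(n)
# Layer 9 (GlobalAveragePool) collapses the remaining time axis to one
# value per filter (50 features); layer 11 (Dense) maps those to 10 classes.
print(trace)
```

Under these assumptions the sequence lengths shrink from 100,000 samples to 738 before the global average pool, so the dense layer sees a fixed 50-feature vector regardless of the exact input length.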
Appendix B
Appendix C