Enforcing Runtime Software Integrity by Finer-grained Self-checking
Jiaxuan Wu
jasonwuee@gmail.com
Technical University of Denmark
Kongens Lyngby, Copenhagen, Denmark

Zheng Zhu
s222461@dtu.dk
Technical University of Denmark
Kongens Lyngby, Copenhagen, Denmark
Table 1: Full List of Collected and Examined Papers.

Author/Reference   Year   Citations   Venue
Aucsmith [5]       1996   527         Springer
Collberg [10]      1997   1534        CSTR
Wang [20]          2000   271         ACM
Chang [8]          2001   374         ACM
Collberg [11]      2002   1201        IEEE
Petroni [19]       2004   738         USENIX
Castro [7]         2006   476         USENIX
Nagra [17]         2009   520         Pearson Education
Baumann [6]        2015   919         ACM

1.1 Literature Review
To assess how far the field has come in runtime software integrity protection, we performed a systematic literature review based on the guidelines provided by [21] to understand the status quo.

Keywords Identification. We use a set of keywords to search for relevant publications in popular repositories.
• S1: software integrity protection
• S2: data flow integrity, data-flow analysis
• S3: software hardening

Paper Exclusion. As we aimed at collecting as many relevant papers as possible, we initially considered all the returned results. However, not every paper is related to runtime software integrity protection. We therefore went one step further and read the abstract (and the full content if needed) of each obtained paper, retaining only the closely related ones by applying the following exclusion criteria: (1) Short papers (i.e., fewer than six pages in double-column format or 11 pages in single-column format) are excluded. (2) Papers not targeting x86 architectures are excluded. (3) Papers not targeting the C/C++ languages are excluded. (4) Papers targeting software integrity issues but not concerning runtime protection are excluded. (5) Papers with few citations are excluded. After applying these exclusion criteria, 10 papers closely related to runtime software integrity protection were retained; they are listed in Table 1.

Status Quo Analysis. The approaches against MATE attacks fall into four basic categories, i.e., code obfuscation [10, 17, 20], customization [5], self-checking or run-time process memory invariant checking [7, 8, 11, 19], and trusted hardware [6, 18], with their subcategories. Obfuscation attempts to thwart reverse engineering by making it hard to understand the behavior of a program through static or dynamic analysis. Customization takes one copy of a program and creates many very different versions. Distributing many different versions of a program stops widespread damage from a security break, since published patches to break one version of an executable might not apply to other customized versions. Self-checking, also referred to as tamper-proofing, integrity checking, and anti-tampering technology, is an essential element in an effective tamper-resistance strategy. Self-checking detects changes in the program and invokes an appropriate response if a change is detected. Our approach falls into the self-checking category.

2 BACKGROUND AND DEFINITION
In this section, we first introduce the necessary background on program analysis and on communication between user mode and kernel mode, two techniques used in our approach. Then, we carefully define important concepts used in our study.

Reaching definition. Reaching definition is one of the most common and useful data flow schemas. Using the terminology from [1], an instruction that writes to a memory position defines the value in the memory position, and an instruction that reads the value is said to use the value. The analysis computes the set of reaching definitions for each use and assigns an identifier to each definition. It returns a map from instructions to definition identifiers and a set of reaching definition identifiers for each use, which we call the static data-flow graph.

In this paper’s context, we rely on a flow-sensitive intraprocedural analysis, called reaching definitions analysis [1], to compute the static data-flow graph and determine the location at which to insert the corresponding instrument.

2.2 Interprocedural Analysis
An interprocedural analysis operates across an entire program, flowing information from the caller to its callees and vice versa. For languages that pass parameters by reference, interprocedural analysis is needed to determine if the same variable is passed as two or more different arguments. Such aliases can create dependences between seemingly distinct parameters.

Context-sensitive interprocedural analysis means that the data-flow analysis is required to take cognizance of what the sequence of procedure calls has been. That is, context-sensitive analysis includes the current sequence of activation records on the stack, along with the current point in the program, when distinguishing among different "places" in the program.

On the other hand, context-insensitive refers to an analysis or computation that does not take into account the context in which a piece of code is executed. This means that the analysis treats all instances of the code as if they were executed in the same context. In context-insensitive analysis, parameters and returned values are modeled by copy statements.

In this paper’s context, we choose to use context-sensitive interprocedural analysis with pointer analysis [05 Improving software
Enforcing Runtime Software Integrity by Finer-grained Self-checking Conference’17, July 2017, Washington, DC, USA
security with a C pointer analysis.] because the method is more precise and scales to large programs.

2.3 Kernel-mode Driver
The Windows operating system includes both user-mode and kernel-mode components (https://learn.microsoft.com/en-us/windows-hardware/drivers/kernel/overview-of-windows-components). A kernel-mode driver runs its code in kernel mode. A kernel-mode driver exports a specific call routine so that it can respond to specific calls from the operating system and to other system calls.

In this paper’s context, our kernel-level monitor is a kernel-mode driver whose driver entry identifier is 0x666.

2.4 Asynchronous Communication
Data transfers can be synchronous or asynchronous. The determining factor is whether the entry point that schedules the transfer returns immediately or waits until the I/O has been completed. With an asynchronous I/O request to the kernel, the process is not required to wait while the I/O is in progress. A process can perform multiple I/O requests and allow the kernel to handle the data transfer details.

2.5 Definition 1: Data-flow Integrity
Castro et al. introduced data-flow integrity (DFI) [7], a defensive technique that aims to protect programs against non-control-data attacks. First, DFI generates the data-flow graph (DFG) of the program by static analysis; second, it instruments the program, introducing data-flow integrity checks; finally, it enforces at runtime that the data flow of the program is allowed by the DFG, and otherwise aborts the execution.

Data-flow integrity. Whenever a value is read, the definition identifier of the instruction that wrote the value should be in the set of reaching definitions for the read (if there is one), and the definition instruction should be the same as in the source code and remain unchanged since the last definition.

2.6 Definition 2: Control-flow Integrity
The CFI security policy dictates that software execution must follow a path of a Control-Flow Graph (CFG) determined ahead of time [05 Control-Flow Integrity Principles, Implementations, and Applications].

Control-flow integrity. Whenever a value is read, the comparison instruction should be the same as in the source code and the value should remain unchanged since the last definition.

In this paper’s context, the control-flow integrity check is included in the flow-sensitive data-flow analysis and the context-sensitive interprocedural analysis. Thus, if data-flow integrity is not violated at runtime, then control-flow integrity is not violated either.

3 APPROACH

3.1 Scenario, Threat and Goal
Scenario: While there are various attack scenarios and goals that MATE attackers may pursue, here we particularly focus on attacks violating online software integrity at runtime; other attacker goals are out of our scope. Runtime software integrity refers to the integrity of executable code and memory space at runtime. Online software (or web-based software) is software that runs on a server or a client and requires an internet connection for full functionality.

Threats: In this scenario, reputation, financial gains, sabotage, fantasy of success and terrorism are non-exhaustive motivations for attackers to target the integrity of software systems. Here we mainly consider two types of MATE attacks: local attacks and remote attacks [2]. Local attackers can tamper with software at any stage and anywhere (on-disk, in-memory and in-execution). Remote attackers do not require physical access to the system of interest; they can exploit vulnerabilities of the software that enable them to tamper with the integrity of the system.

Goal: Our overall design goal is to protect the integrity of online software systems from local and remote MATE attacks. In detail, we need to prevent:
• Runtime code overwrite (local)
• Runtime memory overwrite (local)
• Runtime control flow data attack (remote)
• Runtime non-control flow data attack [9] (remote)

3.2 Overview
Data-flow data integrity reinforcement has both compile-time and runtime components. Figure 1 illustrates how these components work from a top-down overview.

Figure 1: Top-down overview

The first phase uses static program analysis to compute a data-flow graph for the target program. The second phase instruments the program to ensure that the data-flow data is not overwritten and matches this graph. The third phase launches the reinforced program. When one of the instruments is executed, it launches an asynchronous system call to our kernel-level monitor. The kernel-level monitor then scans specific memory addresses of the program to check whether an anomalous memory overwrite has happened. If so, it raises an exception. Since our solution does not generate false positives, a raised exception means that an attack is in progress.

We will use the simple example in Fig 2 to illustrate how these phases work. The example shows a code fragment that is inspired by a vulnerability in SSH [23] that can be exploited to launch both control-data and non-control-data attacks [9].

In this example, we assume that Windows exploit protection methods, such as ACG and StackPivot [16], are turned off. This vulnerability could be exploited by both local and remote MATE attacks to overwrite the return address of the function
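
As an illustration of the phases described in the overview, the following minimal Python sketch computes reaching definitions for a five-instruction toy program and then replays execution traces against the resulting static data-flow graph. It is not the authors' implementation: the toy IR (`PROGRAM`), the trace format, and the names `reaching_definitions`, `run`, and `IntegrityViolation` are assumptions made for this example, and the check runs synchronously in-process rather than in a kernel-level monitor reached through an asynchronous system call.

```python
from collections import defaultdict

# Hypothetical toy IR (an assumption for this sketch):
# instruction id -> (variable defined or None, variables used, successor ids)
PROGRAM = {
    0: ("authenticated", [], [1]),     # authenticated = 0
    1: ("packet", [], [2]),            # packet = read_input()
    2: (None, ["packet"], [3, 4]),     # if check(packet) goto 3 else goto 4
    3: ("authenticated", [], [4]),     # authenticated = 1
    4: (None, ["authenticated"], []),  # if authenticated: grant access
}


def reaching_definitions(program):
    """Forward iterative analysis.

    Returns the static data-flow graph as
    {use_instruction_id: {variable: set of definition ids that may reach it}}.
    """
    preds = defaultdict(list)
    for instr_id, (_, _, succs) in program.items():
        for succ in succs:
            preds[succ].append(instr_id)

    out_sets = {i: set() for i in program}  # sets of (variable, definition id)
    in_sets = {i: set() for i in program}
    changed = True
    while changed:
        changed = False
        for i in sorted(program):
            defined, _, _ = program[i]
            in_set = set().union(*(out_sets[p] for p in preds[i]))
            in_sets[i] = in_set
            # A definition of `defined` kills earlier definitions of it.
            new_out = {(v, d) for (v, d) in in_set if v != defined}
            if defined is not None:
                new_out.add((defined, i))
            if new_out != out_sets[i]:
                out_sets[i] = new_out
                changed = True

    graph = {}
    for i, (_, uses, _) in program.items():
        if uses:
            graph[i] = {u: {d for (v, d) in in_sets[i] if v == u} for u in uses}
    return graph


class IntegrityViolation(Exception):
    """Raised when a read observes a writer outside its reaching-definition set."""


def run(trace, graph):
    """Replay (instruction id, variables actually written) events, checking reads."""
    last_writer = {}  # variable -> id of the instruction that last wrote it
    for instr_id, written in trace:
        for var, allowed in graph.get(instr_id, {}).items():
            if last_writer.get(var) not in allowed:
                raise IntegrityViolation(
                    f"instruction {instr_id} read {var!r} "
                    f"last written by {last_writer.get(var)}"
                )
        for var in written:
            last_writer[var] = instr_id


# Benign run that takes the branch skipping instruction 3.
BENIGN = [(0, ["authenticated"]), (1, ["packet"]), (2, []), (4, [])]
# Same path, but instruction 1 also overwrites `authenticated`.
ATTACK = [(0, ["authenticated"]), (1, ["packet", "authenticated"]), (2, []), (4, [])]

if __name__ == "__main__":
    dfg = reaching_definitions(PROGRAM)
    run(BENIGN, dfg)
    try:
        run(ATTACK, dfg)
    except IntegrityViolation as exc:
        print("violation detected:", exc)
```

In the second trace, instruction 1 also writes `authenticated` (modelling an out-of-bounds write), so the read at instruction 4 sees a writer outside its reaching-definition set {0, 3} and the replay raises `IntegrityViolation`.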
[7] Miguel Castro, Manuel Costa, and Tim Harris. 2006. Securing software by enforcing data-flow integrity. In Proceedings of the 7th symposium on Operating systems design and implementation. 147–160.
[8] Hoi Chang and Mikhail J Atallah. 2001. Protecting software code by guards. In ACM Workshop on Digital Rights Management. Springer, 160–175.
[9] Shuo Chen, Jun Xu, Emre Can Sezer, Prachi Gauriar, and Ravishankar K Iyer. 2005. Non-Control-Data Attacks Are Realistic Threats. In USENIX security symposium, Vol. 5. 146.
[10] Christian Collberg, Clark Thomborson, and Douglas Low. 1997. A taxonomy of obfuscating transformations. Technical Report. Department of Computer Science, The University of Auckland, New Zealand.
[11] Christian S. Collberg and Clark Thomborson. 2002. Watermarking, tamper-proofing, and obfuscation-tools for software protection. IEEE Transactions on software engineering 28, 8 (2002), 735–746.
[12] gcc.gnu.org. 2022. GIMPLE. https://gcc.gnu.org/onlinedocs/gccint/GIMPLE.html
[13] gcc.gnu.org. 2022. Static Single Assignment. https://gcc.gnu.org/onlinedocs/gccint/SSA.html
[14] Github.com. 2022. GNU Compiler Collection. https://github.com/gcc-mirror/gcc
[15] Simon Hansman and Ray Hunt. 2005. A taxonomy of network and computer attacks. Computers & Security 24, 1 (2005), 31–43.
[16] Microsoft. 2022. Customize exploit protection. https://learn.microsoft.com/en-us/microsoft-365/security/defender-endpoint/customize-exploit-protection?view=o365-worldwide
[17] Jasvir Nagra and Christian Collberg. 2009. Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection. Pearson Education.
[18] Ricardo Neisse, Dominik Holling, and Alexander Pretschner. 2011. Implementing trust in cloud infrastructures. In 2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. IEEE, 524–533.
[19] Nick L Petroni Jr, Timothy Fraser, Jesus Molina, and William A Arbaugh. 2004. Copilot-a coprocessor-based kernel runtime integrity monitor. In USENIX security symposium. San Diego, USA, 179–194.
[20] Chenxi Wang, Jonathan Hill, John Knight, and Jack Davidson. 2000. Software tamper resistance: Obstructing static analysis of programs. Technical Report. Technical Report CS-2000-12, University of Virginia, 12 2000.
[21] Claes Wohlin. 2014. Guidelines for snowballing in systematic literature studies and a replication in software engineering. In Proceedings of the 18th international conference on evaluation and assessment in software engineering. 1–10.
[22] Jeff Yan and Brian Randell. 2005. A systematic classification of cheating in online games. In Proceedings of 4th ACM SIGCOMM workshop on Network and system support for games. 1–9.
[23] Michael Zalewski. 2001. Ssh1 crc-32 compensation attack detector vulnerability.