1. Introduction
    1.1 Handbook structure
6. Intermediate usage
    6.1 Defining globals in Frida’s REPL
    6.2 Following child processes
    6.3 Creating NativeFunctions
        6.3.1 Using NativeFunction to call system APIs
    6.4 Modifying return values
    6.5 Access values after usage
    6.6 CryptDecrypt: A practical case
    6.7 Modifying values before execution
    6.8 Undoing instrumentation
    6.9 std::string
        6.9.1 std::vector in MSVC
    6.10 Operating with ArrayBuffers
7. Advanced usage
    7.1 NOP functions
        7.1.1 Using the replace API
        7.1.2 Patching memory
    7.2 Memory scanning
        7.2.1 Reacting on memory patterns
    7.3 Using custom libraries (DLL/.so)
        7.3.1 Creating a custom DLL
        7.3.2 Using our custom library
    7.4 Reading and writing registers
    7.5 Reading structs
8. MacOS
    8.1 ObjC
    8.2 Intercepting NSURL InitWithString
    8.3 Obj-C: Intercepting fileExistsAtPath
    8.4 ObjC: Methods with multiple arguments
    8.5 ObjC: Reading a CFDataRef
    8.6 Getting CryptoKit’s AES.GCM.seal data before encryption
    8.7 Swift.String
9. r2frida
        9.0.1 Testing r2frida
    9.1 Tracing functions
        9.1.1 Tracing functions from imports/exports
        9.1.2 Tracing functions by using offsets
    9.2 Disassembling functions in memory
    9.3 Replace return values
    9.4 Replacing return values (hijacking)
    9.5 Allocating strings
These are the main ‘features’ that make this framework interesting
to us. However, there are some more interesting features such as the
possibility of working in other architectures like ARM⁴ or MIPS⁵, and
the fact that it is possible to make instrumentation software using the
Frida libraries and/or toolkit and use it for commercial purposes.
This table helps illustrate the main advantages of Frida over other
frameworks:
Feature                                     Frida   DynamoRIO       PIN
Open source                                 Yes     Yes             No
Cross-platform                              Yes     Yes (limited)   Yes (limited)
Bindings in different languages             Yes     No              No
Write quick instrumentation tools           Yes     No              No
Support writing instrumentation without C   Yes     No              No
Mobile support                              Yes     No              No
Free                                        Yes     Yes             No
⁴https://en.wikipedia.org/wiki/ARM_architecture
⁵https://en.wikipedia.org/wiki/MIPS_architecture
Binary instrumentation and Frida 9
Ole describes Frida as “the Greasemonkey for native apps, a dynamic code
instrumentation toolkit that lets you inject snippets of JavaScript or
your own library into native apps on multiple systems”.
For us, this means that we can do all the flashy things that are possible
with other instrumentation frameworks, but faster, because
instrumentation scripts are written in JavaScript, and with high
portability.
Regarding portability, Frida supports the following operating systems
and architectures:
Now that Frida has been briefly introduced, in the next section we’ll
see how instrumentation tools are structured when using Frida and
its role in them.
⁶https://termux.com/
¹⁰http://www.giovanni-rocca.com/dwarf/features/
¹¹https://github.com/dpnishant/appmon
¹²https://github.com/nowsecure/r2frida
¹³https://github.com/andreafioraldi/frida-fuzzer
¹⁴https://github.com/sensepost/objection
4. Frida usage basics
This chapter introduces the basic usage of Frida: how tools based on
Frida work, how to use the frida-tools package and Frida’s CLI
(command-line interface), and how to write our first basic
instrumentation scripts.
Before going on, be sure to install the frida and frida-tools packages
using Python’s pip:
$ pip install frida frida-tools
The frida package includes the libraries that can be used from Python,
and the frida-tools package includes Frida’s prebuilt command-line
tools. For more information on the frida-tools package, refer to
Section 5.2. frida-tools.
Important: From now on, whenever frida is mentioned it refers to
Frida’s CLI. Whenever Frida (in capital letters) is mentioned the text
refers to the toolkit as a whole.
Frida development can be done using JavaScript or TypeScript, although
the latter is transpiled into compatible JavaScript. The next section
shows the differences between both.
Feature                   TypeScript   JavaScript
Editor autocompletion     Yes          No
Extension support         Yes          Yes, but limited
Error checking on build   Yes          No
Runtime error checking    Yes          Yes
techniques, but the most common ones are trampoline-based hooks.
These work by inserting, at the beginning of the function we want to
instrument, a jump to a function that is under our control, so that our
function is executed instead of the original one. Let’s see a more
graphical example of how this works:
Say there is a program that has a function_A function, and the intent
is to execute the function_B function instead. The prologue of
function_A is modified and replaced with a JMP instruction to our
function_B. Once the function_B code has executed, the trampoline
ensures that execution returns to the intended function_A flow.
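The redirection described above can be modelled in a few lines of Python. This is only an analogy, not how Frida patches machine code: the `functions` table stands in for the code of the process, and `install_hook` is a hypothetical helper that swaps the entry for `function_A` with a wrapper that runs `function_B` first and then continues into the original.

```python
# Pure-Python model of a trampoline-style hook (an analogy, not real code patching).
calls = []  # records the order in which the functions run


def function_a(x):
    calls.append("function_A")
    return x + 1


def function_b(x):
    calls.append("function_B")


# Stand-in for the code of the process: name -> implementation.
functions = {"function_A": function_a}


def install_hook(table, name, replacement):
    """Redirect table[name] to `replacement`, then continue into the original."""
    original = table[name]

    def trampoline(*args):
        replacement(*args)      # the "JMP" to the code under our control
        return original(*args)  # back to the intended execution flow

    table[name] = trampoline
    return original  # kept so that the hook can be undone later


original = install_hook(functions, "function_A", function_b)
result = functions["function_A"](41)
```

Callers keep using the same entry in the table and never notice the detour, which is the property that makes this kind of hook convenient.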
4.4 frida-tools
frida-tools is a Python package that offers some CLI tools for quick
instrumentation. They can be installed by simply running pip:
$ pip install frida-tools
However, in case there are two processes with the same name Frida
will fail because it doesn’t know which process to attach to. To solve
this issue, try to use the PID as much as possible.
The -f switch spawns a process from a given path. When doing this,
the instrumented binary is spawned by Frida and suspended. Frida
then gives us access to a command line for early instrumentation,
which makes it possible to defeat anti-debugging techniques or to
inject our instrumentation code before the process runs.
To resume the execution from within the command line, simply type
%resume and the process will continue its execution.
Runtimes in Frida
Frida supports running instrumentation scripts using Duktape
(an embedded JavaScript engine), the V8 JavaScript engine and, in
recent versions, QuickJS (which replaces Duktape). For basic
scripts QuickJS is enough, whereas V8 provides better language
features as well as more detailed error logs. V8 is also more
performant than QuickJS, but JS VM exits (that is, every time our
code has to communicate with the agent) are more expensive than in
QuickJS.
Since both are included when you install Frida, you are free
to choose whichever engine suits your use case. In case you are
having trouble figuring out where exactly your instrumentation
code is failing, be sure to try the V8 engine because its errors are
more detailed (you can learn how to switch engines in Section 5.3.
frida-trace).
https://duktape.org
https://v8.dev
https://bellard.org/quickjs/
what is needed for the task. It can be fetched from this repository
automagically.
There are other command-line switches in the Frida CLI, but they are
either focused on mobile devices or self-explanatory. Also note that
some of these switches are shared with the frida-trace tool, which is
explained in the next section.
4.4.2 frida-trace
frida-trace is a tool that allows us to instrument processes or apps
without writing an instrumentation tool. It is essentially based on
Frida’s Interceptor API. Its main features are:
• -q: remove Frida’s API call formatting for each instrumenting call.
• --runtime: choose the desired runtime. QuickJS is recommended for
performance and V8 for modern JS features and more in-depth error
reporting.
• --debug: opens Frida’s debug console.
terminal:
$ frida-trace -i "CreateFileW" notepad.exe
Instrumenting functions...
CreateFileW: Auto-generated handler at "/Users/fernandou/__handlers__/KERNEL32.DLL/CreateFileW.js"
CreateFileW: Auto-generated handler at "/Users/fernandou/__handlers__/KERNELBASE.dll/CreateFileW.js"
Started tracing 2 functions. Press Ctrl+C to stop.
Something catches our attention here: why are two stubs generated?
CreateFileW is exported by both KERNEL32.DLL and KERNELBASE.DLL (the
KERNEL32 export forwards to the KERNELBASE implementation) and, since
we did not specify which module to instrument, frida-trace instruments
both by default. The next problem is to extract meaningful information
from this API call. For this purpose we can examine the official
Microsoft MSDN documentation¹¹:
HANDLE CreateFileW(
  LPCWSTR               lpFileName,
  DWORD                 dwDesiredAccess,
  DWORD                 dwShareMode,
  LPSECURITY_ATTRIBUTES lpSecurityAttributes,
  DWORD                 dwCreationDisposition,
  DWORD                 dwFlagsAndAttributes,
  HANDLE                hTemplateFile
);
           /* TID 0x325c */
  5877 ms  CreateFileW()
  5877 ms  CreateFileW(C:\Users\fdiaz\Documents)
  5877 ms  CreateFileW()
  5878 ms  CreateFileW(C:\Users\fdiaz\Documents\test.dat)
  5879 ms  CreateFileW()
API                       Description
Memory.allocAnsiString    Allocating ANSI strings (Windows-only)
Memory.allocUtf8String    Allocating UTF8 strings
Memory.allocUtf16String   Allocating UTF16 strings (Windows-only)
When you allocate strings, always keep a reference to them (for example
in a constant) to avoid problems with the string being wiped from
memory at some point. This might happen for several reasons, mainly
memory regions being freed:
const myTestString = Memory.allocAnsiString("HELLO WORLD");
API                Description
.readCString       Read C-style strings
.readAnsiString    Read ANSI strings
.readUtf8String    Read UTF8 strings
.readUtf16String   Read UTF16 strings
Dealing with data types with Frida 31
If it is a C string and we know that it is 1024 bytes long, it is
possible to pass the size of the string as an argument:
myTestString.readCString(1024);
In most cases Frida figures out where the string ends for each string
type. However, when you are sure of the size of the string, by all
means share it with Frida! :)
DWORD SearchPathW(
  LPCWSTR lpPath,
  LPCWSTR lpFileName,
  LPCWSTR lpExtension,
  DWORD   nBufferLength,
  LPWSTR  lpBuffer,
  LPWSTR  *lpFilePart
);
First, let’s take a look at the SearchPathW parameters. In this case the
second argument is lpFileName and its type is LPCWSTR, which means a
pointer to a constant wide string (UTF-16 on Windows).
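To make the "wide string" idea concrete, here is a minimal Python sketch of what reading such an argument involves: two bytes per character, terminated by a 16-bit zero. The `memory` buffer is a hand-made stand-in for process memory and `read_utf16_string` is a hypothetical helper, not a Frida API.

```python
def read_utf16_string(memory, offset=0):
    """Decode a NUL-terminated UTF-16LE string that starts at `offset`."""
    end = offset
    # A wide string ends with a 16-bit zero, so walk two bytes at a time.
    while end + 1 < len(memory) and memory[end:end + 2] != b"\x00\x00":
        end += 2
    return memory[offset:end].decode("utf-16-le")


# Simulated memory holding L"c:\\windows\\" followed by unrelated bytes.
memory = "c:\\windows\\".encode("utf-16-le") + b"\x00\x00" + b"\xff\xff"
file_name = read_utf16_string(memory)
```

Reading the same bytes as a C string would stop right after the first character, because the second byte of `c` in UTF-16LE is already zero.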
I made an example program to test it out, you can compile it under
Windows using Visual Studio:
#include <iostream>
#include <Windows.h>

int main()
{
    TCHAR lpBuffer[MAX_PATH];
    LPWSTR *lpFilePart{};
    DWORD result;

    result = SearchPath(NULL, L"c:\\windows\\", NULL, MAX_PATH, lpBuffer, lpFilePart);
    std::cout << "SearchPath retval: " << result << std::endl;
}
This program can be further modified to test more things if you are
interested, but for this basic example we will just check whether the
c:\windows\ folder path exists.
It is possible to instrument this application from Frida’s REPL
(command-line interface), but first let’s write an instrumentation
script.
As mentioned in Section 4.3 JavaScript vs TypeScript, it is possible
to write instrumentation scripts in JavaScript and TypeScript. For
the time being, instrumentation is written in JavaScript, but the
equivalent TypeScript code is also shown (refer to Section 5.10.
Writing our first agent for building the agent with TypeScript).
First, let’s create a file named instrumentation.js. From there, we
need:
Then, we can launch the C++ app we created before with the
instrumentation code:
frida -l instrumentation.js -f searchPathCpp.exe --no-pause
The --no-pause flag means that the app will run right
after the instrumentation code is applied by Frida. The -l
flag sets the instrumentation script.
As you can see, the code is a bit longer (due to types mostly) but
also looks cleaner and clearer. The main difference is that instead of
directly writing Interceptor.attach it is wrapped in a class which
overloads the onEnter callback.
5.2 Numbers
It is possible to operate with numbers in a similar fashion as with
strings, but there are some caveats to take into account.
The first and most important one is that we need to know whether the
argument is the number itself or an address pointing to it. If it is
not an address in memory, we cannot use Frida’s API for reading
numerical types, and if we do we are going to screw up the target
process.
Now we are going to see how to read and write these values, whether
they are passed by value or by reference.
int
add(int a, int b)
{
    return a + b;
}
If we write:
Interceptor.attach(addPtr, {
  onEnter(args) {
    console.log("a: " + args[0].toInt32());
    console.log("b: " + args[1].toInt32());
  }
});
Then args[0] and args[1] hold the raw values of the arguments, which
are printed in hexadecimal by default, and we can simply call
toInt32() to get the real input.
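What a conversion such as toInt32() does to a raw value can be sketched in Python: keep the low 32 bits and interpret them as a signed two's complement number. `to_int32` is a hypothetical helper written for this illustration, and the sample values are made up.

```python
def to_int32(value):
    """Interpret the low 32 bits of `value` as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


# Raw argument values as they would look when printed in hex.
a = to_int32(0x14)                # small positive number
b = to_int32(0xFFFFFFFF)          # all 32 bits set
c = to_int32(0x7FFF0000FFFFFFF6)  # the upper 32 bits are ignored
```

This is also why negative arguments look like huge numbers when the raw value is printed directly.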
API                       Description
{}.readInt()              Read an integer from the given address
{}.readUInt()             Read an unsigned integer from the given address
{}.readS8/S16/S32/S64()   Read a signed 8/16/32/64-bit integer from the given address
{}.readShort()            Read a short integer from the given address
{}.readFloat()            Read a float number from the given address
{}.readDouble()           Read a double number from the given address
{}.readLong()             Read a long number from the given address
{}.readULong()            Read an unsigned long number from the given address
{}.readUShort()           Read an unsigned short number from the given address
{}.readU8/U16/U32/U64()   Read an unsigned integer from the given address
If we try to read args[0] and args[1] in this case, we will only get an
address that is meaningless to us, but we can use the hexdump API to
see its contents:
7ffecdce5c08  a3 1c 00 00 39 05 00 00 c0 91 d0 b3 49 56 00 00  ....9.......IV..
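The bytes in this dump can be decoded by hand to see what the read APIs will return. The following Python sketch assumes 32-bit little-endian integers (as on x86-64) and treats the remaining eight bytes as a 64-bit value; both are assumptions about the target, not something the dump itself states.

```python
import struct

# The 16 bytes shown in the hexdump above.
dump = bytes.fromhex("a31c000039050000c091d0b349560000")

# "<ii" means two little-endian signed 32-bit integers.
first, second = struct.unpack_from("<ii", dump, 0)

# The remaining 8 bytes, read as one little-endian 64-bit value.
(rest,) = struct.unpack_from("<Q", dump, 8)
```

Here `first` is 7331 and `second` is 1337, the two integers stored next to each other in memory.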
Interceptor.attach(addPtr, {
  onEnter(args) {
    console.log("a: " + args[0].readInt());
    console.log("b: " + args[1].readInt());
  }
});
Interceptor.attach(addPtr, {
  onEnter: function(args) {
    args[0].writeInt(10);
    args[1].writeInt(20);
  }
});
5.3 Pointers
It is possible to read the address that a pointer is pointing to by using
the readPointer() API. This is going to be useful when there is a
pointer to a struct to be read, a more in-depth use case that is
covered later in Section 7.4.
The following example shows a use case where the readPointer API
returns useful information. The recvfrom function takes a socklen_t
argument, as documented in the man pages¹:
// ...
onEnter: function (args) {
  console.log(
    args[5].readPointer()
  );
}
// ...
About NativePointers
Frida is able to interact with pointers thanks to its NativePointer
objects. The NativePointer data type exists because the JS number
type is backed by a double, which cannot represent all 64-bit
pointers. Therefore, whenever pointers are used in Frida, they are
always backed by this data type.
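The precision limit is easy to reproduce outside Frida. This Python sketch stores integers in an IEEE 754 double (the same representation as a JS number) and reads them back; the two addresses are arbitrary example values and `through_double` is a hypothetical helper.

```python
import struct

pointer = 0x7FFECDCE5C08           # fits in 53 bits
high_pointer = 0xFFFF800000000001  # needs all 64 bits


def through_double(value):
    """Store an integer in an IEEE 754 double and read it back."""
    (as_double,) = struct.unpack("<d", struct.pack("<d", float(value)))
    return int(as_double)


small_ok = through_double(pointer) == pointer
high_ok = through_double(high_pointer) == high_pointer
```

The first value survives the round trip, but the second one comes back with its low bits rounded away, which is exactly the kind of silent corruption a dedicated pointer type avoids.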
myBaseAddr = Module.findBaseAddress('myLib.so');
This API returns a pointer in case a valid export is found and null
in case nothing matches.
5.5.1 findExportByName vs getExportByName
It is important to notice (especially if we are using autocomplete)
that there are two methods which seem to be similar:
Module.getExportByName and Module.findExportByName. The main
difference resides in what happens if an export is not found.
.getExportByName throws an exception in case the export is not
found, whereas .findExportByName simply returns null. I recommend
using .getExportByName to be able to spot errors, but if you want to
use .findExportByName be sure to check the return values.
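The trade-off between the two styles can be modelled in Python. The `exports` dictionary is a made-up export table and both functions are hypothetical stand-ins that only mimic the behaviour described above.

```python
exports = {"open": 0x7F81143F6BE0}  # made-up export table: name -> address


def find_export_by_name(name):
    """Like the find* variant: return None when the export is missing."""
    return exports.get(name)


def get_export_by_name(name):
    """Like the get* variant: fail loudly when the export is missing."""
    address = find_export_by_name(name)
    if address is None:
        raise LookupError("unable to find export %r" % name)
    return address


found = get_export_by_name("open")
missing = find_export_by_name("opne")  # the typo goes unnoticed unless we check
try:
    get_export_by_name("opne")
    raised = False
except LookupError:
    raised = True
```

With the find style the mistake only shows up later, when the missing value is used; with the get style it shows up at the line that made it.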
[Local::]-> Memory.allocUtf8String('foo')
"0x7f81143f6be0"
In this case the datatype has access to the .unwrap() method which
returns a pointer that points to the first element of the ArrayBuffer:
[Local::]-> test.unwrap()
"0x7fc17c210930"
Size of pointers
The size of pointers is something that must be taken
into account when performing more complex operations
and to ensure that an instrumentation script is portable
enough.
The API Process.pointerSize returns the size of a pointer
in bytes of the process that is being instrumented. This will
be needed in later sections like Section 6.3 and Section 6.4
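As a point of comparison, Python exposes the same piece of information about its own process. The sketch below is an analogue only: `struct.calcsize("P")` describes the Python interpreter that runs it, whereas Process.pointerSize describes the instrumented process. `field_offset` is a hypothetical helper.

```python
import struct

# Size in bytes of a native pointer in the current Python process.
pointer_size = struct.calcsize("P")


def field_offset(index, size=pointer_size):
    """Offset of the index-th pointer-sized field inside a structure."""
    return index * size


offsets = [field_offset(i) for i in range(3)]
```

Computing offsets from the pointer size instead of hard-coding 4 or 8 is what keeps a script working on both 32-bit and 64-bit targets.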
{
  offset: number,
  length: number,
  header: true|false,
  ansi: true|false,
}
For this quick example, write a simple “hello world” program, compile
it and fire it up in Frida. Once we are in it, we will call
Process.enumerateModules() and get the one matching our binary:
$ clang hello.c
$ frida -f a.out

[Local::a.out]-> Process.enumerateModulesSync()
[
    {
        "base": "0x1072f1000",
        "name": "a.out",
        "path": "/Users/fernandou/Desktop/a.out",
        "size": 16384
    },
    ...
]
Then we get the base address of our binary, which we can now print
using Frida:
You are free to use custom options in case you want to start at a
different offset or need longer lengths.
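To see what the offset, length and header options control, here is a small Python re-implementation of a hex dump. It is a sketch written for this handbook's examples, not Frida's implementation: the output layout is an approximation and the ansi option is left out.

```python
def hexdump(data, offset=0, length=None, header=True):
    """Render data[offset:offset + length] as rows of 16 bytes."""
    end = len(data) if length is None else min(len(data), offset + length)
    lines = []
    if header:
        lines.append("          " + " ".join("%2X" % i for i in range(16)))
    for row in range(offset, end, 16):
        chunk = data[row:min(row + 16, end)]
        hex_part = " ".join("%02x" % b for b in chunk).ljust(16 * 3 - 1)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append("%08x  %s  %s" % (row, hex_part, text))
    return "\n".join(lines)


data = b"Hello, Frida!\x00\x01\x02 more bytes"
dump = hexdump(data, offset=7, length=6, header=False)
```

With these options only the six bytes of `Frida!` are rendered, starting at offset 7, and the column header is suppressed.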
Beware that frida-create will create the agent in the current working
directory. The output we should be getting is this:
The first time we create the agent we need to run npm install to
bootstrap. Then, when we want to build our agent we will run:
npm run build
And it will create a file _agent.js with the instrumentation script for
us to use. It is also possible to run npm run watch to get live-reload
when a file is saved.
When we are finished with our control script, we have two options:
Although we can use REPL for quick tests, we will go the long way
now by writing a control script.
    pid = device.spawn(process_name)
    print('pid: %d' % pid)

    session = device.attach(pid)

    script = session.create_script(code)
    script.on('message', on_message)
    script.load()

    device.resume(pid)

    print('Press CTRL-Z to stop execution.')
    sys.stdin.read()
    session.detach()

if __name__ == '__main__':
    main(sys.argv[1])
ically.
Let’s explain the most important parts of this script:
The on_message callback will receive the messages from the agent. For
now we will simply print them and not handle them any further.
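A callback of this kind could look like the following sketch. It returns the line instead of printing it so that it can be tried without Frida; the two sample dictionaries are hand-written and only assumed to have the shape of the messages an agent produces with send() or when it throws.

```python
def on_message(message, data):
    """Return a printable line for a message coming from the agent."""
    if message["type"] == "send":
        # send(payload) in the agent arrives under the "payload" key.
        return "[agent] %s" % (message["payload"],)
    if message["type"] == "error":
        # Uncaught exceptions in the agent are reported as "error" messages.
        return "[error] %s" % message.get("description", message)
    return "[unknown] %s" % (message,)


line = on_message({"type": "send", "payload": "hello"}, None)
failure = on_message(
    {"type": "error", "description": "ReferenceError: x is not defined"}, None
)
```

In a real control script the same function would be registered with script.on('message', on_message) and would print the line rather than return it.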
device = frida.get_local_device()
pid = device.spawn(process_name)
print('pid: %d' % pid)
session = device.attach(pid)

script = session.create_script(code)
script.on('message', on_message)
script.load()

device.resume(pid)
To finish with this section, in the next one let’s take a look at remote
instrumentation.
First, let’s set up the remote server. Only the frida-server binary is
needed, so it can be downloaded from the aforementioned GitHub
releases page:
$ wget https://github.com/frida/frida/releases/download/14.2.14/frida-server-14.2.14-linux-x86_64.xz
²https://github.com/frida/frida/releases
file frida-server-14.2.14-linux-x86_64
chmod +x frida-server-14.2.14-linux-x86_64
Now that the server-side part is covered, we can finally get to the
client part (our local computer).
The -H flag in frida and frida-trace allows us to connect to a
specific address/port combination:
frida -H IP:PORT
or
frida-trace -H IP:PORT
In this example, we open the /bin/ls binary in the remote server and
try to obtain the Process.pointerSize of it. In this case, the syntax
would be:
frida -H 192.168.1.101 -f /bin/ls
or
(global as NativePointer).CreateFileWPtr = CreateFileWPtr
And once we run our small script we can access our variable:
._reactor.schedule(lambda: self._on_child_added(child)))
        self._device.on("child-removed", lambda child: self._reactor.schedule(lambda: self._on_child_removed(child)))
        self._device.on("output", lambda pid, fd, data: self._reactor.schedule(lambda: self._on_output(pid, fd, data)))

    def run(self):
        self._reactor.schedule(lambda: self._start())
        self._reactor.run()

    def _start(self):
        argv = ["/bin/sh", "-c", "cat /etc/hosts"]
        env = {
            "BADGER": "badger-badger-badger",
            "SNAKE": "mushroom-mushroom",
        }
        print("[*] spawn(argv={})".format(argv))
        pid = self._device.spawn(argv, env=env, stdio='pipe')
        self._instrument(pid)

    def _stop_if_idle(self):
        if len(self._sessions) == 0:
            self._stop_requested.set()

    def _instrument(self, pid):
        print("[*] attach(pid={})".format(pid))
        session = self._device.attach(pid)
        session.on("detached", lambda reason: self._reactor.schedule(lambda: self._on_detached(pid, session, reason)))
        print("[*] enable_child_gating()")
        session.enable_child_gating()
        print("[*] create_script()")
        script = session.create_script("""\
Interceptor.attach(Module.getExportByName(null, 'open'), {
  onEnter: function (args) {
    send({
      type: 'open',
      path: args[0].readUtf8String()
    });
  }
});
""")
        script.on("message", lambda message, data: self._reactor.schedule(lambda: self._on_message(pid, message)))
        print("[*] load()")
        script.load()
        print("[*] resume(pid={})".format(pid))
        self._device.resume(pid)
        self._sessions.add(session)

    def _on_child_added(self, child):
        print("[+] child_added: {}".format(child))
        self._instrument(child.pid)

    def _on_child_removed(self, child):
        print("[-] child_removed: {}".format(child))

    def _on_output(self, pid, fd, data):
        print("[*] output: pid={}, fd={}, data={}".format(pid, fd, repr(data)))

    def _on_detached(self, pid, session, reason):
        print("[-] detached: pid={}, reason='{}'".format(pid, reason))
        self._sessions.remove(session)
        self._reactor.schedule(self._stop_if_idle, delay=0.5)

    def _on_message(self, pid, message):
        print("[*] message: pid={}, payload={}".format(pid, message["payload"]))


app = Application()
app.run()
When we are finished writing the script, we can run it and get the
following output:
Intermediate usage 57
Let’s do a quick example. Say we have the following add function:
int
add(int a, int b)
{
    return a + b;
}
Say we want to call it on our own terms, at will. How do we create the
native function? For now we are not taking the ABI into account, which
leaves us with a=int, b=int, return=int. So, we can build our own
NativeFunction now:
new NativeFunction(ptr(my_address), 'int', ['int', 'int'])
If you are sure that the ABI is fastcall, for example, you can add the
calling convention parameter:
new NativeFunction(ptr(my_address), 'int', ['int', 'int'], 'fastcall')
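The idea of describing a native function by its return type and argument types is not specific to Frida. As a rough analogue, the following Python sketch does the same with ctypes for the C library's abs function. It assumes a Unix-like system where `ctypes.CDLL(None)` exposes the symbols of the current process; it is a comparison, not a replacement for NativeFunction.

```python
import ctypes

# Load the symbols already present in the current process (Linux/macOS).
libc = ctypes.CDLL(None)

# Roughly equivalent to: new NativeFunction(ptr, 'int', ['int'])
native_abs = libc.abs
native_abs.restype = ctypes.c_int
native_abs.argtypes = [ctypes.c_int]

result = native_abs(-1337)
```

In both cases a wrong signature is not detected for us: the call is made with whatever types we declared, so the prototype has to come from documentation or reverse engineering.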
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdio.h>

int
main(void)
{
    mkdir("/home/fernandou/frida/test_folder", 0700);

    struct stat st = { 0 };
    if (stat("/home/fernandou/frida/test_folder", &st) == -1)
    {
        puts("Folder does not exist.\n");
    }
    else
    {
        puts("Folder exists.\n");
    }
    return 0;
}
Once we are inside the program, we first need a pointer to the mkdir
API:
class myInstrumentedFunction {
  firstParam?: NativePointer;

  onEnter(args: NativePointer[]) {
    this.firstParam = args[0];
  }

  onLeave(retval: InvocationReturnValue) {
    if (this.firstParam) {
      console.log(this.firstParam.readCString());
    }
  }
}
BOOL CryptDecrypt(
  HCRYPTKEY  hKey,
  HCRYPTHASH hHash,
  BOOL       Final,
  DWORD      dwFlags,
  BYTE       *pbData,
  DWORD      *pdwDataLen
);
MSDN notes:
pbData

A pointer to a buffer that contains the data to be decrypted. After the
decryption has been performed, the plaintext is placed back into this
same buffer.

The number of encrypted bytes in this buffer is specified by pdwDataLen.
So we have an hKey, but that is not the most important argument for us
here: the important ones are the pbData and pdwDataLen pointers. The
way this API works is that once the function body has executed and we
are in the onLeave (return) stage, the buffer that pbData points to,
which is initially encrypted, has been decrypted and we can read it.
To achieve this, we need to store the pbData and pdwDataLen pointers
to be able to access them later on.
class CryptDecrypt {
  buffer_size?: NativePointer;
  buffer?: NativePointer;

  onEnter (args: NativePointer[]) {
    this.buffer = args[4];      // pbData
    this.buffer_size = args[5]; // pdwDataLen
  }

  onLeave (retval: InvocationReturnValue) {
    if (this.buffer && this.buffer_size) {
      // pdwDataLen points to a DWORD, so read the length it holds.
      console.log(this.buffer.readCString(this.buffer_size.readU32()));
    }
  }
}
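The store-on-enter, read-on-leave idea can be tried out without a Windows target. In this Python model, `fake_decrypt` is a made-up stand-in for CryptDecrypt that decrypts its buffer in place (a simple XOR), and `Hook` and `call_with_hook` are hypothetical helpers that play the role of the onEnter and onLeave callbacks.

```python
def fake_decrypt(buffer, length):
    """Stand-in for CryptDecrypt: decrypts `buffer` in place (XOR with 0x41)."""
    for i in range(length[0]):
        buffer[i] ^= 0x41
    return True


class Hook:
    """Keeps the arguments seen on entry so they can be read on exit."""

    def on_enter(self, args):
        # These are the same objects that the callee is about to modify.
        self.buffer, self.length = args

    def on_leave(self, retval):
        return bytes(self.buffer[:self.length[0]])


def call_with_hook(func, hook, *args):
    hook.on_enter(args)
    retval = func(*args)
    return hook.on_leave(retval)


ciphertext = bytearray(b ^ 0x41 for b in b"secret")
# The one-element list mimics a pointer to the length (pdwDataLen).
plaintext = call_with_hook(fake_decrypt, Hook(), ciphertext, [len(ciphertext)])
```

Reading the buffer in on_enter would have returned the encrypted bytes; only after the call does the same memory hold the plaintext.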
we can work with the hexdump API, especially if we know the length:
class CryptDecrypt {
  buffer_size?: NativePointer;
  buffer?: NativePointer;

  onEnter (args: NativePointer[]) {
    this.buffer = args[4];
    this.buffer_size = args[5];
  }

  onLeave (retval: InvocationReturnValue) {
    if (this.buffer && this.buffer_size) {
      // pdwDataLen points to a DWORD holding the number of bytes.
      console.log(hexdump(this.buffer, {
        length: this.buffer_size.readU32()
      }));
    }
  }
}
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdio.h>

int
main(void)
{
    struct stat st = { 0 };
    if (stat("/bin/ls", &st) == -1)
    {
        puts("File does not exist.\nInstalling our own busybox binaries");
        // execute real code
    }
    else
    {
        puts("Folder exists.\n");
        // exit without doing anything
    }
    return 0;
}
This program is similar to the one we have seen before: it calls stat
to check if the file exists and, if it doesn’t, it continues its
execution. In this case, our first idea would be to get the pointer to
the stat function, but that will lead us to an error. Frida will give
us a valid pointer to stat, but that address is not the one that is
going to be called in the end. To see why, we will check with a
disassembler (radare2 in my case) and Frida:
[Local::a.out]-> Module.enumerateImports("a.out")
[
    {
        "address": "0x7fff2051725c",
        "module": "/usr/lib/libSystem.B.dylib",
        "name": "dyld_stub_binder",
        "slot": "0x103229000",
        "type": "function"
    },
    {
        "address": "0x7fff205456f8",
        "module": "/usr/lib/libSystem.B.dylib",
        "name": "memset",
        "slot": "0x10322d000",
        "type": "function"
    },
    {
        "address": "0x7fff20410274",
        "module": "/usr/lib/libSystem.B.dylib",
        "name": "puts",
        "slot": "0x10322d008",
        "type": "function"
    },
    {
        "address": "0x7fff204ca39c",
        "module": "/usr/lib/libSystem.B.dylib",
        "name": "stat$INODE64",
        "slot": "0x10322d010",
        "type": "function"
    }
]
{
    "address": "0x7fff204ca39c",
    "module": "/usr/lib/libSystem.B.dylib",
    "name": "stat$INODE64",
    "slot": "0x10322d010",
    "type": "function"
}
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdio.h>

void
check_file(char* path)
{
    struct stat st = { 0 };
    if (stat(path, &st) == -1)
    {
        printf("File [%s] does not exist.\n", path);
    }
    else
    {
        printf("File [%s] exists.\n", path);
    }
}

int
main(void)
{
    check_file("/bin/ls");
    check_file("/bin/cd");
    return 0;
}
6.9 std::string
Something that is very interesting to us is the ability to read
strings. However, this is not always possible by simply calling Frida’s
readUtf8String/readCString built-ins, due to the different ways a
string can be represented. For example, Windows’ UNICODE_STRING
is defined in a struct as follows:
#include <iostream>
#include <string>

void print_std_string(std::string arg_1) {
    std::cout << arg_1 << std::endl;
}

int
main(void) {
    std::string my_string =
        "Frida is great,"
        " you should check it out at frida.re";
    print_std_string(my_string);
    return 0;
}
Interceptor.attach(Module.getExportByName(null, '_Z16print_std_stringNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEE'), {
  onEnter(args) {
    const LSB = args[0].readU8() & 1;
    console.log('LSB: ' + LSB);
    const stdString = args[0].
      add(Process.pointerSize * 2).
      readPointer().
      readUtf8String();
    console.log("std::string: " + stdString);
  }
});
Then, we can run this small script and get the following output:
LSB: 1
std::string: Frida is great, you should check it out at frida.re
[Local::a.out]-> Process terminated
Every 4 bytes there is a member of the vector, from 0x01 to 0x0a. The
pointer to the tail of this vector can be obtained by offsetting the
original pointer by Process.pointerSize (in this case, a 64-bit
application, pointerSize=8). Because every member of the vector is an
int, it is possible to obtain each value by reading the vector 4 bytes
at a time.
In other words:
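The distance between the tail and the head, divided by the size of an int, gives the number of elements. The following Python sketch models that layout with simulated memory: the address 0x1000, the `read_memory` callback and the `read_vector` helper are all made up for the illustration, and the object is assumed to start with the head pointer followed by the tail pointer, as described above.

```python
import struct

POINTER_SIZE = 8  # 64-bit process, as in the text
INT_SIZE = 4

# Simulated memory: the elements of the vector live at address 0x1000.
head = 0x1000
elements = struct.pack("<10i", *range(1, 11))
tail = head + len(elements)

# The vector object itself: head pointer first, tail pointer right after it.
vector_object = struct.pack("<QQ", head, tail)


def read_memory(address, length):
    """Stand-in for reading `length` bytes of process memory at `address`."""
    start = address - head
    return elements[start:start + length]


def read_vector(vector_object, read_memory):
    (begin,) = struct.unpack_from("<Q", vector_object, 0)
    (end,) = struct.unpack_from("<Q", vector_object, POINTER_SIZE)
    size = (end - begin) // INT_SIZE
    raw = read_memory(begin, end - begin)
    return list(struct.unpack("<%di" % size, raw))


values = read_vector(vector_object, read_memory)
```

The same arithmetic applies whatever the element type is, as long as `INT_SIZE` is replaced by the size of one element.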
To test this out, the following code will iterate every member of the
std::vector and store them in a list:
[Local::a.out ]->
1,2,3,4,5,6,7,8,9,10
vector_size:10
• The pointer that points to the address of the vector head is placed
at Process.pointerSize
• The pointer that points to the address of the vector tail is placed
at double the Process.pointerSize
  }
})
[Local::vectortest.exe ]->
1,2,3,4,5,6,7,8,9,10,11
vector_size:11
this.buffer_size = args[1].readCString().length + 1;
console.log("buffer_size:" + this.buffer_size);
this.arrayBuf = args[1].readByteArray(this.buffer_size);
Once the string is modified, the str2ab function transforms the string
back to a Uint8Array, but we cannot just reassign this Uint8Array
to args[1] because a pointer is expected there. To solve this, Frida
has an auxiliary method called .unwrap() that returns a pointer to the
first element of the ArrayBuffer.
Then, it is possible to verify the output:
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>

int
main(int argc, char *argv[])
{
    int fd;
    fd = open("code.dat", O_RDONLY);
    if (fd == -1)
    {
        fprintf(stderr, "file not found\n");
    }

    return 0;
}
This pattern will match any “Frida _____!” pattern found in memory
and return it in a list.
Now, we will use our previous example that checks whether a file ex-
ists or not and we will ask it to search for a file named “Frida rocks!”.
We will use the Memory.scanSync to find any pattern containing frida
_____! in memory.
We first fire up our Frida REPL and get the information of the first
module:
We now have the bin variable that stores the base address of the
module, path, and size. Once we have this information, we can scan
using the previous pattern string:
Advanced usage 83
It is also possible to use partial wildcards: instead of ??, you can
use a single ? to match one half (nibble) of a byte, for example
46 2? 21.
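How such a pattern is matched can be sketched in plain Python. This is a naive scanner written for illustration, not Frida's implementation: each token becomes a value and a mask, and a byte matches when its masked bits equal the value. `parse_pattern`, `scan` and the sample buffers are made up.

```python
def parse_pattern(pattern):
    """Turn '46 2? 21' into a list of (value, mask) pairs, one per byte."""
    parsed = []
    for token in pattern.split():
        value = int(token.replace("?", "0"), 16)
        mask = int("".join("0" if ch == "?" else "f" for ch in token), 16)
        parsed.append((value, mask))
    return parsed


def scan(memory, pattern):
    """Return every offset in `memory` where `pattern` matches."""
    parsed = parse_pattern(pattern)
    hits = []
    for offset in range(len(memory) - len(parsed) + 1):
        window = memory[offset:offset + len(parsed)]
        if all((byte & mask) == value for byte, (value, mask) in zip(window, parsed)):
            hits.append(offset)
    return hits


memory = b"\x00\x00Frida rocks!\x00Frida works!\x00"
# "Frida " + five unknown bytes + "!"
full_wildcards = scan(memory, "46 72 69 64 61 20 ?? ?? ?? ?? ?? 21")
# 2? matches any byte whose high nibble is 2, such as the space (0x20).
nibble_wildcard = scan(b"F !F0!", "46 2? 21")
```

The first scan finds both strings, while the second one only matches where the middle byte starts with the nibble 2.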
#include <stdio.h>
#include <time.h>
#include <unistd.h>

struct keyPress {
    int key_type;
    int timestamp;
    int scan_code;
    int virtual_scan_code;
};

void guess_pressed_key(struct keyPress* p)
{
    printf("key_type: %d scan_code: %d\n", p->key_type, p->scan_code);
    sleep(5);
    if(p->scan_code == 52)
    {
        printf("arrow up\n");
    }

    if(p->scan_code == 51)
    {
        printf("arrow right\n");
    }
}

int main()
{
    struct keyPress kp;
    kp.key_type = 301;
    kp.timestamp = (int)time(NULL);
    kp.scan_code = 52;
    kp.virtual_scan_code = 52;
    printf("%p\n", guess_pressed_key);
    guess_pressed_key(&kp);
    return 0;
}
The aforementioned code takes a simple struct and prints “arrow up”
or “arrow right” depending on the value of scan_code which is an
int member of a struct. The main idea behind this example is to get
#include <stdio.h>
#include <stdlib.h>

int main() {
  FILE *fp;
  fp = fopen("file.txt", "w+");
  fprintf(fp, "%s %s", "May the force", "be with you");
  fclose(fp);
  return 0;
}

#include <stdlib.h>
#include <stdio.h>

FILE *my_fopen(const char *filename, const char *mode) {
  printf("lib: %s\n", filename);
  return fopen(filename, mode);
}

#include <stdlib.h>
#include <stdio.h>

FILE *my_fopen(const char *filename, const char *mode);
Once these files are created, then the only remaining task is using
clang to create a shared library:
file libtest.o
libtest.o: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped
When Frida loads the library, it returns a Module object that can be
used like any other module. For example, it is possible to enumerate
the exports of the loaded library:
[Local::a.out]-> myModule.enumerateExports()
[
    {
        "address": "0x7f0cf409d120",
        "name": "my_fopen",
        "type": "function"
    }
]

[Local::a.out]-> %resume
[Local::a.out]-> lib: file.txt
lib: /dev/urandom
The custom library function correctly prints the values of the first
argument.
#include <stdio.h>

int add(int a, int b) {
  return a + b;
}

int
main()
{
  printf("result: %d", add(10, 20));
  return 0;
}
This program simply calls the function add(int a, int b) and prints
the sum of its arguments.
Say we are on ARM64: once this function is called, a is stored in the
x0 register and b is stored in the x1 register. We can quickly check
this is true by writing the following script:
Because the LDR instruction loads the number 1337 into the x0
register and a RET instruction follows, the caller will use whatever
is stored in x0 as the return value.
Now, we can run the script:
And you can see we have easily modified the function in memory!
For a complete list of methods available, refer to:
https://frida.re/docs/javascript-api/#arm64writer
struct myStruct
{
  short member_1;
  int member_2;
  int member_3;
  char *member_4;
} sample_struct;
In order to gather the offsets we need to figure out the sizes of each
type, a short list:
{
  "short": 2,
  "int": 4,
  "pointer": Process.pointerSize,
  "char": 1,
  "long": Process.pointerSize,
  "longlong": 8,
  "ansi": Process.pointerSize,
  "utf8": Process.pointerSize,
  "utf16": Process.pointerSize,
  "string": Process.pointerSize,
  "float": 4,
};
struct myStruct
{
  short member_1; // 0x0 (2 bytes, followed by 2 bytes of padding)
  int member_2;   // 0x4 (4 bytes)
  int member_3;   // 0x8 (4 bytes)
  char *member_4; // 0x10 (8 bytes, aligned to 8 on a 64-bit target)
} sample_struct;
How can we check this is true for each type? We can compile a test
program and get these values from sizeof().
So, now we have the offsets of the structure and we want to read each
value. In this case we will use the .add() method.
As the name says, .add() adds an offset to a given NativePointer.
Therefore, we can place our pointer at the desired offset to read each
value:
// Given s = args[0]:NativePointer

s.readShort()                          // 1st member.
s.add(4).readInt()                     // 2nd member.
s.add(8).readInt()                     // 3rd member.
s.add(16).readPointer().readCString(); // 4th member (offset 12 on 32-bit targets).
This way we will have obtained the values for each structure offset.
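Alignment padding is easy to get wrong when computing offsets by hand. The following sketch computes them for myStruct under the usual rule that each member is aligned to its own size. The SIZES table and the layout helper are assumptions made for illustration, with a 64-bit target as the default:

```javascript
// Sizes for a typical 64-bit target; on a 32-bit target "pointer" is 4.
// In a Frida script, Process.pointerSize could replace the hard-coded 8.
const SIZES = { char: 1, short: 2, int: 4, float: 4, longlong: 8, pointer: 8 };

// Compute member offsets the way common C ABIs do: each member is aligned
// to its own size, and the struct size is rounded up to the largest alignment.
function layout(members, sizes = SIZES) {
  let offset = 0;
  let maxAlign = 1;
  const offsets = {};
  for (const [name, type] of members) {
    const size = sizes[type];
    offset = Math.ceil(offset / size) * size; // insert padding if needed
    offsets[name] = offset;
    offset += size;
    maxAlign = Math.max(maxAlign, size);
  }
  return { offsets, size: Math.ceil(offset / maxAlign) * maxAlign };
}

const myStruct = layout([
  ["member_1", "short"],
  ["member_2", "int"],
  ["member_3", "int"],
  ["member_4", "pointer"],
]);
console.log(myStruct);
```

With 8-byte pointers the char * member lands at offset 16 because of padding, while with 4-byte pointers (pass { ...SIZES, pointer: 4 } as the second argument) it stays at 12.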
Next, we will try to parse a struct used by a Linux system call.
From this we can quickly figure out that timeval and timezone are
two structs. And we cannot check what these values are by simply
using Frida’s API.
The timeval struct is:
struct timeval {
  time_t tv_sec;       /* seconds */
  suseconds_t tv_usec; /* microseconds */
};
The size of time_t even depends on the API level you are
targeting on Android systems. Do not forget to get its size with
Process.pointerSize.
struct timezone {
  int tz_minuteswest; /* minutes west of Greenwich */
  int tz_dsttime;     /* type of DST correction */
};
For this example we will write a simple program and compile it with
clang:
#include <sys/time.h>
#include <stdio.h>

int
main()
{
  struct timeval current_time;
  gettimeofday(&current_time, NULL);
  printf("seconds : %ld\nmicro seconds : %ld\n",
      current_time.tv_sec, current_time.tv_usec);

  printf("%p", &current_time);
  getchar();
  return 0;
}
And run: clang -Wall program.c. The expected output should be:
pala@jkded:~/code$ ./a.out
seconds : 1601394944
micro seconds : 402896
0x7fff4a1f8d48
So, given this, we will try to access the timeval structure, with
0x7fff4a1f8d48 as the structure pointer:
[Local::a.out]-> Process.pointerSize
8
Now that we know that the pointerSize is 8, we can infer that long’s
size will be 8 bytes and place ourselves in the right offset.
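To make the offsets concrete, here is a small sketch that decodes a 16-byte timeval from an ArrayBuffer, such as the one readByteArray(16) would return for the pointer above. It assumes a little-endian 64-bit target where both members are 8 bytes wide; the helper name parseTimeval is made up for illustration:

```javascript
// Decode a 16-byte struct timeval made of two 8-byte little-endian
// signed integers (assumption: 64-bit little-endian target).
function parseTimeval(buffer) {
  const view = new DataView(buffer);
  return {
    tv_sec: view.getBigInt64(0, true),  // offset 0: time_t
    tv_usec: view.getBigInt64(8, true), // offset 8: suseconds_t
  };
}

// Stand-alone demonstration with the values printed by the sample program.
const demo = new ArrayBuffer(16);
const demoView = new DataView(demo);
demoView.setBigInt64(0, 1601394944n, true);
demoView.setBigInt64(8, 402896n, true);
const tv = parseTimeval(demo);
console.log(tv.tv_sec.toString(), tv.tv_usec.toString());

// In a Frida session the same fields can be read straight from the pointer:
//   const p = ptr("0x7fff4a1f8d48");
//   p.readS64();                          // tv_sec
//   p.add(Process.pointerSize).readS64(); // tv_usec
```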
Wow! Quite a complicated struct we have here, right? Let’s first
find the size of each member, especially the ones that can be
troublesome such as LPVOID.
On a Windows 10 64-bit system compiled for 32-bit under Visual
C++ we get the following values:
Type Size
WORD 2
DWORD 4
DWORD_PTR 4
LPVOID 4
[Local::ConsoleApplication2.exe]-> Process.pointerSize
4
Type Size
WORD 2
DWORD 4
DWORD_PTR 8
LPVOID 8
dwPageSize
lpMinimumApplicationAddress
dwNumberOfProcessors
And this is the example program that we will be using to test our
guesses:
#include <iostream>
#include <cstdio>    // printf, getchar
#include <Windows.h>
int main()
{
  SYSTEM_INFO sysInfo;
  GetSystemInfo(&sysInfo);
  printf("%p", &sysInfo);
  getchar();
}
Now that we have the complete offset list, we can now get
the values of dwPageSize, lpMinimumApplicationAddress, and
dwNumberOfProcessors respectively:
[Local::ConsoleApplication2.exe]-> sysInfoPtr.add(4).readInt()
4096
[Local::ConsoleApplication2.exe]-> sysInfoPtr.add(8).readInt()
65536
[Local::ConsoleApplication2.exe]-> sysInfoPtr.add(20).readInt()
8
int _stat(
  const char *path,
  struct _stat *buffer
);
With clang, we can get the record layout with two steps:
Which will generate a file that can be later used with the -cc1
parameter:
clang -cc1 -fdump-record-layouts ptest.c
7.9 CModule
The CModule API allows us to pass a string of C code and compile it
to machine code in memory. It is important to note however that this
feature compiles under tinycc¹ and thus is somewhat limited.
CModule is useful to implement functions that need to run with the
best possible performance. It is also useful to implement hot
callbacks for Interceptor and Stalker, with the objective of increasing
performance or making interaction with C objects and pointers easier.
CModule syntax:
new CModule(source[, symbols])
void init(void)
void finalize(void)
¹https://bellard.org/tcc/
In this use case our aim is to be able to read the timeval structure
with ease, however as we mentioned before we do not have access to
#include <gum/guminterceptor.h>
#include <stdio.h>
#include <sys/time.h>

typedef struct _IcState IcState;
struct _IcState
{
  void * arg;
};

// Note: "\\n" is double-escaped, presumably because this C source is
// embedded in a JavaScript string.
void onEnter(GumInvocationContext * ic)
{
  IcState * is = GUM_IC_GET_INVOCATION_DATA(ic, IcState);
  // Save the first argument (the struct timeval pointer) for onLeave.
  is->arg = gum_invocation_context_get_nth_argument(ic, 0);
  printf("%p\\n", is->arg);
}

void onLeave(GumInvocationContext * ic)
{
  IcState * is = GUM_IC_GET_INVOCATION_DATA(ic, IcState);
  printf("%p\\n", is->arg);
  struct timeval * t = (struct timeval *) is->arg;
  // One conversion specifier per printed value.
  printf("timeval: %ld\\nusec: %ld\\n", (long) t->tv_sec, (long) t->tv_usec);
}
With this context we are able to use auxiliary functions of the gum
API such as GUM_IC_GET_INVOCATION_DATA which we will use for ini-
and then we are able to use it in our onLeave callback, but first we
need to cast the argument so that we are able to use the struct:
struct timeval * t = (struct timeval*)is->arg;
And then we are able to access the members of the timeval struct with
t->tv_sec and t->tv_usec.
[Local::a.out]-> %resume
[Local::a.out]-> cmodule struct pointer: 0x7ffd8e826e00
Myprogram struct pointer 0x7ffd8e826e00
cmodule timeval: 1612343654
cmodule usec: 263111
myprogram seconds : 1612343654
myprogram micro seconds : 263111
void
onLeave(GumInvocationContext * ic)
{
  int retval;
  retval = (int) gum_invocation_context_get_return_value(ic);

  printf("=> return value=%d\\n", retval);
}
This example assumes that the return value is an integer; there is,
however, a cleaner way to do this:
void
onLeave(GumInvocationContext * ic)
{
  const int retval = GPOINTER_TO_INT(gum_invocation_context_get_return_value(ic));

  printf("=> return value=%d\\n", retval);
}
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <time.h>

double local_sqrt(double a) {
  return sqrt(a);
}

int main() {
  clock_t t;
  t = clock();
  for(int i = 0; i < 100000; i++) {
    local_sqrt((double)i);
  }
  t = clock() - t;
  double total_time = (double)t / CLOCKS_PER_SEC;
  printf("Time elapsed: %f", total_time);

  return 0;
}
This program takes each number from the for loop and calculates its
square root. When executed without instrumentation it takes 0.002
seconds to complete.
Now, to test how instrumentation affects performance, the following
instrumentation script is used:
Interceptor.attach(localSqrtPtr, cm);

setTimeout(() => {
  console.log("sqrt value after 2 seconds: " + sqrtReturnPtr.readDouble());
}, 2000)

setTimeout(() => {
  printCurrentValue();
}, 1000);
This function can now be called from the CModule side as
notify_from_c(&value);. The next step is adding, on the JS side, a
callback to the CModule symbols that receives the value from the
notify_from_c function and acts on it. This is done by expanding
the symbols argument in the CModule constructor:
With this set, the onLeave callback in our CModule will call the
notify_from_c function whenever the square root value modulo
10000 is zero:
      notify_from_c(&sqrtReturnPtr);
    }
  }
`, {
  sqrtReturnPtr,
  notify_from_c: new NativeCallback(notifyPtr => {
    const notifyValue = notifyPtr.readDouble();
    console.log('notification from C code: ' +
        notifyValue);
  }, 'void', ['pointer'])
});
When executing this script against the target program we get the
following output:
$ ls
include/ meson.build test.c
$ ls include/
capstone.h glib.h gum/ json-glib/ platform.h x86.h
$ ls include/gum/
arch-x86/ guminterceptor.h gummetalarray.h gummodulemap.h gumspinlock.h
gumdefs.h gummemory.h gummetalhash.h gumprocess.h gumstalker.h
#include <gum/guminterceptor.h>

static void frida_log (const char * format, ...);
extern void _frida_log (const gchar * message);

void
init (void)
{
  frida_log ("init()");
}

void
finalize (void)
{
  frida_log ("finalize()");
}

void
on_enter (GumInvocationContext * ic)
{
  gpointer arg0;

  arg0 = gum_invocation_context_get_nth_argument (ic, 0);

  frida_log ("on_enter() arg0=%p", arg0);
}

void
on_leave (GumInvocationContext * ic)
{
  gpointer retval;

  retval = gum_invocation_context_get_return_value (ic);

  frida_log ("on_leave() retval=%p", retval);
}

static void
frida_log (const char * format,
           ...)
{
  gchar * message;
  va_list args;

  va_start (args, format);
  message = g_strdup_vprintf (format, args);
  va_end (args);

  _frida_log (message);

  g_free (message);
}
We now have the basic methods included and can modify them to suit
our needs; we also have access to GumInvocationContext members
and type-checking.
To build the CModule, the following commands are required:
$ meson build && ninja -C build
7.12 Stalker
Stalker is a code tracing engine that allows following threads and
capturing every function, block, and instruction being called.
Explaining how a code tracer works is out of the scope of this book;
however, if you are interested, you can read about the anatomy of a
code tracer².
It is possible to run Stalker directly from C (via frida-gum), but we
will focus on using it from JS. This is the basic syntax of Stalker (to
follow what is happening on a thread):
Stalker.follow([threadId, options])
events: {
  call: true,    // CALL instructions: yes please
  ret: false,    // RET instructions
  exec: false,   // all instructions
  block: false,  // block executed: coarse execution trace
  compile: false // block compiled: useful for coverage
}
Only use the exec option when you are sure you need it, because it
has a huge impact on performance and produces a lot of data for
Frida to digest.
²https://medium.com/@oleavr/anatomy-of-a-code-tracer-b081aadb0df8
[
    {
        "context": {
            "pc": "0x113341568",
            "r10": "0x10f363000",
            "r11": "0x246",
            "r12": "0x10f363578",
            "r13": "0x0",
            "r14": "0x1133e3298",
            "r15": "0x1133eb070",
            "r8": "0x31",
            "r9": "0x0",
            "rax": "0x1133c7132",
            "rbp": "0x7ffee089c8d0",
            "rbx": "0x3722d28603514",
            "rcx": "0x10f363000",
            "rdi": "0x1133e46e0",
            "rdx": "0x0",
            "rip": "0x113341568",
            "rsi": "0x4",
            "rsp": "0x7ffee089bab8",
            "sp": "0x7ffee089bab8"
        },
        "id": 1031,
        "state": "waiting"
    }
]
Interceptor.attach(myInstrumentedFunction, {
  onEnter (args) {
    Stalker.follow(this.threadId, {
      // ...
    });
    // ...
  },
  onLeave (retval) {
    Stalker.unfollow(this.threadId);
  }
});
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
  pause();
  int fd;
  fd = open("code.dat", O_RDONLY);
  if (fd == -1)
  {
    fprintf(stderr, "file not found\n");
  }

  return 0;
}
Once we compile it, we will open it in Frida’s REPL and check its
exports:
[Local::a.out]-> Module.enumerateExportsSync("a.out")
[
    {
        "address": "0x10bfa8000",
        "name": "_mh_execute_header",
        "type": "variable"
    },
    {
        "address": "0x10bfabf00",
        "name": "main",
        "type": "function"
    }
]

onEnter (args) {
  Stalker.follow(this.threadId, {
    events: {
      call: true,
      ret: false,
      exec: false,
      block: false,
      compile: false,
    },
And we will get the following output displaying each called address
and the total times it was called (for illustration purposes, only the
summary will be properly displayed):
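A summary of this kind can be post-processed with plain JavaScript. The sketch below assumes the summary arrives as an object that maps each call target address to its call count, which is the shape Stalker hands to an onCallSummary callback. The helper name formatCallSummary and the sample data are made up for illustration:

```javascript
// Turn a call summary ({ "<target address>": <call count>, ... }) into
// lines sorted by call count, most frequently called target first.
function formatCallSummary(summary, limit = 10) {
  return Object.entries(summary)
    .sort(([, a], [, b]) => b - a)
    .slice(0, limit)
    .map(([address, count]) => `${address}: ${count}`);
}

// Made-up sample data, only for illustration.
const sample = { "0x1000": 3, "0x2000": 12, "0x3000": 1 };
console.log(formatCallSummary(sample).join("\n"));

// Frida-only usage, as part of the options passed to Stalker.follow:
//   onCallSummary (summary) {
//     console.log(formatCallSummary(summary).join("\n"));
//   }
```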
push rbp
mov rbp, rsp
push r15
push r14
push r13
push r12
push rbx
sub rsp, 0xb8
mov qword ptr [rbp - 0xd8], r9
mov r12, r8
And we can get a trace of all the instructions being executed in real
time. It is also possible to filter by a given mnemonic:
jne 0x7fff20409785
jne 0x7fff20409753
jne 0x7fff20410fe3
jne 0x7fff2040c0e0
jne 0x7fff20416095
jne 0x7fff204399fc
jne 0x7fff204f9259
jne 0x7fff204f93f7
jne 0x7fff204f9375
jne 0x7fff204f93e1
jne 0x7fff204f9407
jne 0x7fff204f937e
jne 0x7fff2041007b
And returns:
RET @ 0x10ac8ee8a
RET @ 0x10ac8ee8a
RET @ 0x10ac8ee8a
RET @ 0x10ac8ee8a
RET @ 0x10ac8ee8a
RET @ 0x10ac8ee8a
RET @ 0x10ac8901e
8. MacOS
Although I am not yet very familiar with MacOS and Swift, I thought
it was interesting to write about using Frida with MacOS (and this
translates into knowledge for iOS) with some example applications.
It is important to notice that working with MacOS and iOS apps is
a bit different from what we have seen until now, in the sense that
ObjC classes and methods syntax are different.
8.1 ObjC
The ObjC object allows us to access a variety of useful information:
ObjC.classes.NSString.stringWithString
Which we can use: ObjC.classes.NSString.stringWithString_("foobar");
ObjC.classes.NSString.$ownMethods.slice(0, 10)
[
    "+ NSStringFromLSInstallPhase:",
    "+ NSStringFromLSInstallState:",
    "+ NSStringFromLSInstallType:",
    "+ stringWithUTF8String:",
    "+ stringWithFormat:",
    "+ string",
    "+ allocWithZone:",
    "+ initialize",
    "+ supportsSecureCoding",
    "+ stringWithCharacters:length:"
]

[Local::objCLI]-> ObjC.classes.NSString.$moduleName
"/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation"
        return
    }
    print("The Response is : ",response)
    print(data);
}

task.resume()
Our first option here is using frida-trace to see which NSURL* classes
and methods are being called, the syntax for this however is different
from what we have seen until now.
frida-trace -f swiftApp -m "-[NSURL **]"
Which will in turn create a .js file for each handler it has detected and
print us something like this:
/* TID 0x407 */
1003 ms -[NSURL initWithString:0x7fa35550bb40]
1008 ms | -[NSURL initWithString:0x7fa35550bb40 relativeToURL:0x0]
1008 ms -[NSURL isFileReferenceURL]
1008 ms | -[NSURL _cfurl]
1008 ms -[NSURL retain]
And frida-trace has generated a stub file for us to fill now, but there
is something catching our attention here and that is that frida’s stub
is printing us ${args[2]}, why is that?
When we intercept ObjC methods, we need to take into account that
the args[] array does not contain elements the same way it would on
Windows or Linux binaries. Instead, args[0] holds self, args[1] holds
the selector, and args[2 + (n - 1)] holds the n-th argument.
This means we have to read args[2] to get the first argument and use
this address to create an ObjC.Object.
Note: In case the above formula is not clear: if a method has two
arguments, the address of the second argument is placed at args[3]
in the args[] array (instead of the usual args[1]). We will see an
example of this later on.
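The index formula can be written down as two tiny helpers. This is only an illustrative sketch; the names selectorArgCount and objcArgIndex are not part of Frida's API:

```javascript
// Count the arguments of an Objective-C selector: one per colon.
function selectorArgCount(selector) {
  return (selector.match(/:/g) || []).length;
}

// Index into Interceptor's args[] for the n-th method argument (1-based):
// args[0] is self and args[1] is the selector, so argument n sits at n + 1.
function objcArgIndex(n) {
  return 2 + (n - 1);
}

const selector = "fileExistsAtPath:isDirectory:";
for (let n = 1; n <= selectorArgCount(selector); n++) {
  console.log(`argument ${n} -> args[${objcArgIndex(n)}]`);
}
```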
We can now fill the stub using ObjC.Object and the address provided
by args[2] as follows:
/* TID 0x407 */
1003 ms -[NSURL initWithString:0x7fa35550bb40]
1003 ms http://www.stackoverflow.com
1008 ms | -[NSURL initWithString:0x7fa35550bb40 relativeToURL:0x0]
1008 ms -[NSURL isFileReferenceURL]
1008 ms | -[NSURL _cfurl]

/* TID 0x407 */
60 ms -[NSFileManager fileExistsAtPath:0x10f24f018]
60 ms | -[NSFileManager getFileSystemRepresentation:0x7ffee09b43e0 maxLength:0x400 withPath:0x10f24f018]
Once we are inside the REPL, we need an API resolver to get the
handler and so we will create it:
myResolver = new ApiResolver('ObjC');
myResolver.enumerateMatchesSync('-[NSFileManager fileExists*]')
[
    {
        "address": "0x7fff211855af",
        "name": "-[NSFileManager fileExistsAtPath:]"
    },
    {
        "address": "0x7fff2117d115",
        "name": "-[NSFileManager fileExistsAtPath:isDirectory:]"
    }
]

[Local::objCLI]-> myResolver.enumerateMatchesSync("-[NSFileManager fileExistsAtPath:]")
[
    {
        "address": "0x7fff211855af",
        "name": "-[NSFileManager fileExistsAtPath:]"
    }
]
And that’s it! We have created our first instrumentation script for
ObjC apps without REPL interaction.
[Local::objCLI]-> ObjC.classes.NSFileManager.$ownMethods
[
    "+ defaultManager",
    "- dealloc",
    "- delegate",
    "- setDelegate:",
    "- fileExistsAtPath:",
    "- createDirectoryAtPath:withIntermediateDirectories:attributes:error:",
    "- createDirectoryAtURL:withIntermediateDirectories:attributes:error:",
    "- homeDirectoryForCurrentUser",
    "- URLsForDirectory:inDomains:",
    "- getRelationship:ofDirectoryAtURL:toItemAtURL:error:",
    "- enumeratorAtURL:includingPropertiesForKeys:options:errorHandler:",
    "- temporaryDirectory",
    "- stringWithFileSystemRepresentation:length:",
    "- removeItemAtPath:error:",
    "- enumeratorAtPath:",
    "- contentsOfDirectoryAtPath:error:",
    "- isExecutableFileAtPath:",
    "- destinationOfSymbolicLinkAtPath:error:",
    ...
Interceptor.attach(ptr(t), {
  onEnter(args) {
    this.isDir = args[3];
  },
  onLeave: function(retval) {
    let objCIsDir = new ObjC.Object(this.isDir);
    console.log(objCIsDir);
  }
})
[Local::objCLI]-> %resume
2021-02-16 22:48:51.725 objCLI[96805:23540150] File exists.
[Local::objCLI]-> nil
It says our .c file is not a directory so the parameter is set to nil after
method execution.
#import <Foundation/Foundation.h>

void print_ptr(CFDataRef dRef) {
  NSLog(@"%@", dRef);
}

int main(int argc, const char * argv[]) {

  @autoreleasepool {
    const UInt8 *myString = "foobar";
    CFDataRef data = CFDataCreateWithBytesNoCopy(NULL, myString, strlen(myString), kCFAllocatorNull);

    NSLog(@"%p", print_ptr);
    getchar();
    print_ptr(data);
  }
  return 0;
}
So, we get the bytes and their length, but not the string
representation. For this purpose, we can call the .bytes() method on
the ObjC.Object and call .readUtf8String() on the result to read it
as a UTF-8 string.
If it is not a UTF-8 string, be sure to use the appropriate
method to read it.
import Foundation
import CryptoKit

let pass = "foobar"
let data = "frida is fun!".data(using: .utf8)!
let key = SymmetricKey(data: SHA256.hash(data: pass.data(using: .utf8)!))
let iv = AES.GCM.Nonce()
let mySealedBox = try AES.GCM.seal(data, using: key, nonce: iv)
let dataToShare = mySealedBox.combined?.base64EncodedData()
This simple example takes the “frida is fun!” string and encrypts it
using “foobar” as key. After building this sample, we can disassemble
the binary before opening it up in Frida. What we will find in the list
of functions is:
0x100003c30 1 6 sym.imp.static_CryptoKit.AES.GCM.seal_A_where_A:_Foundation.DataProtocol___:_A__using:_CryptoKit.SymmetricKey__nonce:_Swift.Optional_CryptoKit.AES.GCM.Nonce___throws____CryptoKit.AES.GCM.SealedBox
0x100003c42 1 6 sym.imp.CryptoKit.AES.GCM.SealedBox.combined.getter_:_Swift.Optional_Foundation.Data
0x100003c48 1 6 sym.imp.type_metadata_accessor_for_CryptoKit.AES.GCM.SealedBox
And if we check the disassembly itself, we can see that the CryptoKit
function we used is being called:
Now, we can fire up Frida and check our binary imports (I named this
example binary swiftCLI):
Module.enumerateImportsSync("swiftCLI")
And we can quickly notice our target among all the imports:
...
    {
        "address": "0x7fff56762210",
        "module": "/System/Library/Frameworks/CryptoKit.framework/Versions/A/CryptoKit",
        "name": "$s9CryptoKit3AESO3GCMO4seal_5using5nonceAE9SealedBoxVx_AA12SymmetricKeyVAE5NonceVSgtK10Foundation12DataProtocolRzlFZ",
        "slot": "0x10079d030",
        "type": "function"
    },
...
There are other functions that might be interesting to inspect but they
are not needed in this case:
$s8swiftCLI3key9CryptoKit12SymmetricKeyVvp
$s8swiftCLI2iv9CryptoKit3AESO3GCMO5NonceVvp
$s8swiftCLI11mySealedBox9CryptoKit3AESO3GCMO0dE0Vvp

    });
  }
});

        String();

      console.log("Raw data: " + rawData);
      console.log("Key: " + key);

    },
  });
  }
});
8.7 Swift.String
Another thing you might have noticed is that strings are inlined when
they are small but once they grow bigger than 15 bytes it is not
possible to parse them easily as before. The reason is Swift’s memory
layout for string types.
To test this, we are going to take a very basic Swift example: A
program that builds a hello string and receives the person’s name as
an argument.
For strings of 15 bytes or fewer, the flags are stored in the last
byte and the string itself occupies the first 15 bytes. We can read
the string as usual.
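As a sketch of how such a small string could be decoded from its 16 raw bytes: the example below assumes the 64-bit little-endian layout used by Swift 5, where the top nibble of the last byte is 0xE for a small ASCII string and the low nibble holds the byte count. That layout detail and the helper name are assumptions and should be checked against the Swift version in use:

```javascript
// Decode a 16-byte Swift small ASCII string.
// Assumption: byte 15 is 0xE0 | count for small ASCII strings.
function decodeSmallAsciiSwiftString(bytes) {
  const discriminator = bytes[15];
  if ((discriminator >> 4) !== 0xe) {
    return null; // not a small ASCII string (e.g. a large, heap-backed one)
  }
  const count = discriminator & 0x0f;
  return String.fromCharCode(...bytes.slice(0, count));
}

// Stand-alone demonstration with a hand-built buffer.
const raw = new Uint8Array(16);
raw.set([0x66, 0x72, 0x69, 0x64, 0x61]); // "frida"
raw[15] = 0xe0 | 5;
console.log(decodeSmallAsciiSwiftString(raw));

// In Frida, the 16 bytes could come from a pointer to the string value:
//   new Uint8Array(stringPtr.readByteArray(16))
```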
Strings longer than 15 bytes are considered large by Swift’s
memory layout and are thus split into 8 bytes for countAndFlagsBits
[Local::swiftCLI]-> Module.enumerateExportsSync("swiftCLI")
[
    {
        "address": "0x108a4f000",
        "name": "_mh_execute_header",
        "type": "variable"
    },
    {
        "address": "0x108a52a80",
        "name": "main",
        "type": "function"
    },
    {
        "address": "0x108a52dc0",
        "name": "$s8swiftCLI5greet6personS2S_tF",
        "type": "function"
    },
    {
        "address": "0x108a57048",
        "name": "$s8swiftCLI8greetingSSvp",
        "type": "variable"
    }
]
¹https://github.com/apple/swift/blob/main/docs/ABI/RegisterUsage.md
r2frida commands
When searching for r2frida documentation or blog posts,
you will likely find commands starting with a
backslash (“\”). This is the old way of running commands
within r2frida; commands are now prefixed with a colon (:).
$ r2 frida:///bin/ls
 -- This binary no good. Try another.
[0x00000000]>
The next step is verifying that Frida is working and is read properly
by r2frida:
[0x00000000]> :?V
{"version":"15.1.17"}
z will read the first argument as a UTF-8 string and ^ traces the
onEnter instead of the onLeave block. To understand how to use the
tracing command it is better to do it through practical examples.
Note that, unlike the previous example where r2frida was
spawned, this time the frida:/// block is surrounded by double
quotes, which allows passing arguments to the target binary. The
target function to instrument in this case is fopen, which receives
the filename as its first argument and the mode as its second, both
as const char*. Once we are in the r2 shell we should have the
following console:
r2frida 149
r2 "frida:///usr/bin/wget man7.org"
r_config_set: variable 'asm.cmtright' not found
 -- If you're having fun using radare2, odds are that you're doing something wrong.
[0x00000000]>
The output has been shortened, but the important bit is the first entry
of libc-2.27.so. By entering the address in r2’s console, we position
ourselves in the module address:
[0x00000000]> 0x00007ff0a9db9000
[0x7ff0a9db9000]>

[0x7ff0a9db9000]> :iE~fopen
0x7ff0a9e45450 f _IO_file_fopen
0x7ff0a9e37de0 f fopen
0x7ff0a9e380d0 f fopencookie
0x7ff0a9e37de0 f _IO_fopen
0x7ff0a9e37de0 f fopen64
[0x7ff0a9db9000]>
The :dtf command returns true, signaling that the command was
issued successfully. The first z displays the first argument as a
UTF-8 string and the second z does the same for the second argument.
‘^’ also shows the backtrace of the function. To resume execution, use
the :dc command.
The output has been filtered to include only the interesting bits, but
it can be seen that each argument is interpreted correctly as a UTF8
string and displayed along their backtrace.
// gcc check_password.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Damn_YoU_Got_The_Flag
char password[] = "\x18\x3d\x31\x32\x03\x05\x33\x09\x03\x1b\x33\x28\x03\x08\x34\x39\x03\x1a\x30\x3d\x3b";

inline int check(char* input);

int check(char* input) {
  for (int i = 0; i < sizeof(password) - 1; ++i) {
    password[i] ^= 0x5c;
  }
  return memcmp(password, input, sizeof(password) - 1);
}

int main(int argc, char **argv) {
  if (argc != 2) {
    printf("Usage: %s <password>\n", argv[0]);
    return EXIT_FAILURE;
  }
  int size_of_password = (sizeof(password) - 1);
  printf("size: %d", size_of_password);
  if (strlen(argv[1]) == (sizeof(password) - 1) && check(argv[1]) == 0) {
    puts("You got it !!");
    return EXIT_SUCCESS;
  }

  puts("Wrong");
  return EXIT_FAILURE;

}
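As a quick sanity check of what check() compares the input against, the XOR step can be reproduced outside the target. This sketch copies the bytes of password[] and the 0x5c key from the listing; the helper name xorDecode is made up for illustration:

```javascript
// Bytes of the obfuscated password[] array from the C listing.
const obfuscated = [
  0x18, 0x3d, 0x31, 0x32, 0x03, 0x05, 0x33, 0x09, 0x03, 0x1b, 0x33,
  0x28, 0x03, 0x08, 0x34, 0x39, 0x03, 0x1a, 0x30, 0x3d, 0x3b,
];

// XOR every byte with the same single-byte key used by check().
function xorDecode(bytes, key) {
  return String.fromCharCode(...bytes.map((b) => b ^ key));
}

console.log(xorDecode(obfuscated, 0x5c)); // Damn_YoU_Got_The_Flag
```

This is the buffer that memcmp receives as its first argument once check() has run, which is why hexdumping the arguments of memcmp reveals the flag.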
This time the code is compiled using gcc, and when opening it in
r2frida and inspecting the imports/exports of the binary, it is a
blank slate:
[0x00000000]> :dm
0x00005591d1947000 - 0x00005591d1948000 r-x /tmp/a.out
0x00005591d1b47000 - 0x00005591d1b48000 r-- /tmp/a.out
0x00005591d1b48000 - 0x00005591d1b49000 rw- /tmp/a.out
# ...
[0x00000000]> s 0x00005591d1947000
[0x5591d1947000]> :iE
[0x5591d1947000]> :ii
[0x5591d1947000]>
When opening this binary with r2 -A to analyze it, this is the output
obtained when listing functions:
$ r2 -A a.out
[0x00000610]> afl
0x00000610 1 42 entry0
0x00000640 4 50 -> 40 sym.deregister_tm_clones
0x00000680 4 66 -> 57 sym.register_tm_clones
0x000006d0 5 58 -> 51 sym.__do_global_dtors_aux
0x00000600 1 6 sym.imp.__cxa_finalize
0x00000710 1 10 entry.init0
0x000008a0 1 2 sym.__libc_csu_fini
0x000008a4 1 9 sym._fini
0x00000830 4 101 sym.__libc_csu_init
0x0000077b 7 170 main
0x0000071a 4 97 sym.check
0x00000598 3 23 sym._init
0x000005c0 1 6 sym.imp.puts
0x000005d0 1 6 sym.imp.strlen
0x000005e0 1 6 sym.imp.printf
0x00000000 2 25 loc.imp._ITM_deregisterTMCloneTable
0x000005f0 1 6 sym.imp.memcmp
0x000001a5 1 38 fcn.000001a5
[0x00000610]>
What is seen in the first column are the offsets for the functions
and the ones which are of interest to us are sym.imp.memcmp and
sym.check:
# offset function
0x0000071a 4 97 sym.check
0x000005f0 1 6 sym.imp.memcmp
After retrieving the offsets of both functions, the next step is
spawning the binary using r2frida to calculate the memory addresses of
these functions. To ensure that both memcmp and the check function
are called, the binary has been spawned with the following argument:
r2 "frida:///tmp/a.out testtesttesttesttestt"
The next step is retrieving the base address of a.out, which can be
done by using the :dm command to list the mapped memory regions,
combined with ~ to filter the results:
[0x00000000]> :dm~out
0x000055c05fc86000 - 0x000055c05fc87000 r-x /tmp/a.out
0x000055c05fe86000 - 0x000055c05fe87000 r-- /tmp/a.out
0x000055c05fe87000 - 0x000055c05fe88000 rw- /tmp/a.out
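The runtime address of each function is the base address of the first mapping plus the static offset listed earlier. The arithmetic is sketched below; the helper name rebase is made up for illustration, and BigInt is used so that 64-bit addresses keep full precision:

```javascript
// Add a static file offset to the runtime base address of the module.
function rebase(base, offset) {
  return "0x" + (BigInt(base) + BigInt(offset)).toString(16);
}

const base = "0x000055c05fc86000"; // first a.out mapping from :dm~out
console.log("check :", rebase(base, "0x0000071a")); // 0x55c05fc8671a
console.log("memcmp:", rebase(base, "0x000005f0")); // 0x55c05fc865f0
```

Because the base address changes between runs, the addition has to be repeated every time the binary is spawned.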
Since memcmp receives two const void* parameters, the tracing format
that we are using here is hh, to hexdump the memory at the address of
both arguments. Now that both functions have been traced, the
execution of the process can be resumed by calling :dc:
r2 "frida:///tmp/a.out testtesttesttesttestt"
And then set emu.str=true to view the strings obtained from
emulation and place ourselves at the sym.check address:
[0x00000000]> e emu.str=true
[0x00000000]> :dm~out
0x000055bf4aefa000 - 0x000055bf4aefb000 r-x /tmp/a.out
0x000055bf4b0fa000 - 0x000055bf4b0fb000 r-- /tmp/a.out
0x000055bf4b0fb000 - 0x000055bf4b0fc000 rw- /tmp/a.out
[0x00000000]> 0x000055bf4aefa000 + 0x0000071a
[0x55bf4aefa71a]>

[0x00000000]> :di?
di intercept help
di-1 intercept ret_1
di0 intercept ret0
di1 intercept ret1
dif intercept fun help
dif-1 intercept fun ret_1
dif0 intercept fun ret0
dif1 intercept fun ret1
difi intercept fun ret int
difs intercept fun ret string
dii intercept ret int
dis intercept ret string
div intercept ret void
What this means is that :di-1 will replace the return value of the
function at the given address with -1, :di0 will make the return value
0, and :di1 sets the return value to 1. We are using the same code as
in the previous section to test this command.
The idea is to patch the check function’s return value so that it
returns 0, allowing the code to print the string “You got it !!”. The
first thing to do is get the address of the check function:
[0x00000000]> 0x0000563b168e5000 + 0x0000071a
[0x563b168e571a]> :di0 0x563b168e571a
When checking the main function, the last function called is
0x563b168e55c0, on which we are going to place a breakpoint using
the :db command to see what happens:
We can see that although the process was spawned with the
“testtesttesttesttestt” string instead of the correct flag it returned 0
and the code returns “You got it !!” in turn.
    [0x00000000]> :dmal
    0x7f58d1436b60 "r2fridarul3s"

    [0x00000000]> :dx?
    dxc     dx call
    dxo     dx objc
    dxs     dx syscall
    #include <stdio.h>

    int main(void)
    {
        /* Create (or truncate) the file; fopen returns NULL on failure. */
        FILE *fp = fopen("sample_file.dat", "w");
        if (fp == NULL) {
            return 1;
        }
        fclose(fp);
        return 0;
    }
Now that both strings have been allocated, the next step is to figure
out the address of the fopen function. As learned previously, this can
be done by getting the base address of the process:
The address of the fopen function for this process turns out to be
0x55c153ffa560, which can now be used to call :dxc:
    $ ls | grep r2
    r2fridarul3s
10. Optimizing our Frida setup
When instrumenting applications, it is important to optimize not only
our instrumentation code for edge cases, but also the library that we
inject into the target application.
Frida’s config.mk file exposes certain features that might not be
needed in our agent; disabling them helps reduce the memory footprint
of the injected agent. Among the features that can be disabled are:
• V8 Runtime
• Frida connectivity (TLS and ICE, OpenSSL)
• Frida Objective-C bridge
• Swift Bridge
• Java Bridge
9.5M vs 24M: this is roughly a 60% decrease in the size of the agent.
An asterisk (*) next to “disabled” means that you should enable the
feature if your use case requires it.
For a more detailed overview of Frida’s memory footprint I recommend
reading through frida.re/docs/footprint/¹.
Once you clone this repository, you will find a file named config.mk
inside with the following settings (among others):
¹https://frida.re/docs/footprint/
    FRIDA_V8 ?= enabled
    FRIDA_CONNECTIVITY ?= enabled
    FRIDA_DATABASE ?= enabled
    FRIDA_JAVA_BRIDGE ?= auto
    FRIDA_OBJC_BRIDGE ?= auto
    FRIDA_SWIFT_BRIDGE ?= auto

To disable the V8 runtime, for example, change FRIDA_V8 to disabled:

    FRIDA_V8 ?= disabled
    FRIDA_CONNECTIVITY ?= enabled
    FRIDA_DATABASE ?= enabled
    FRIDA_JAVA_BRIDGE ?= auto
    FRIDA_OBJC_BRIDGE ?= auto
    FRIDA_SWIFT_BRIDGE ?= auto
And once this change has been made, the agent can be compiled:
make python-linux-x86_64
11. A real-world use case: Building an anti-cheat with Frida
This project is a proof of concept of an anti-cheat that emerged from
a challenge: writing an anti-cheat without modding either the client or
the server.
I believe this small project is worth documenting in this book so that
you can see how powerful Frida is and how many possibilities open up
when using such a toolkit.
11.1 Background
This proof of concept was born while playing a Quake 3 engine based
game named Jedi Knight: Jedi Academy. This game features lightsabers
(swords, for those not familiar with Star Wars) and, along with Jedi
Knight: Jedi Outcast, it is the only game (that I know of) that features
swordfighting on the Quake 3 engine.
These games are still played, and competitive players always require
playing on the original game, that is, using the original November
2003 binaries. Why? The main problem with this game is that it was
built using an ancient compiler, a version of Intel’s ICC that can no
longer be obtained, and the code was later adapted to modern compilers
such as the latest MSVC++ and GCC.
This produced some side effects, such as differences in FPU
calculations and some extra instructions here and there. As a result the
swordfighting changed: clashes and damage were altered, and
dual-wielding was rendered useless (due to an increased block rate).
This issue did not only affect the base game built with newer
compilers. Mods that were built and run alongside the original server
binaries still altered the swordfighting.
Why hilts? In this game, lightsaber hilt models have slightly different
ignition tags, which means they are slightly longer or shorter (which
gives an edge in a fight).
11.2.1 Timenudge
What is timenudge? It is a client command that adds local lag to try
to make sure you interpolate instead of extrapolate. It may give you an
advantage when there is a ping difference of ±10ms between two
players (it helps predict where a player will land, or makes your
movement look odd in the eyes of other players).
The interpolation window is 50ms in servers with a tickrate of 20
(sv_fps 20), the default. With the command cl_timenudge it is possible
to remove the interpolation window, forcing the client to show the
latest position instead of the smooth trajectory.
The cl_timenudge command only accepts values in the range from -30
(which reduces the interpolation time by 30ms) to +30, but it is
possible to modify the client to bypass this restriction. This is what
happens to the lagometer (network graph) when setting a timenudge
value of -60, 0 and +60:
The upper blue line where timenudge=0 means that the client is in
sync with the server in real time, whereas with timenudge=-60 the
graph shows yellow spikes due to the game desyncing.
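The numbers above can be worked through in a short TypeScript sketch. The function names are made up for illustration, and treating the timenudge as a plain offset added to the interpolation window is a simplifying assumption, not the engine’s actual code:

```typescript
// Interpolation window in milliseconds for a given server tickrate (sv_fps).
function interpolationWindowMs(svFps: number): number {
  return 1000 / svFps;
}

// An unmodified client only accepts cl_timenudge values between -30 and +30.
function clampTimenudge(requested: number): number {
  return Math.max(-30, Math.min(30, requested));
}

// Assumed simplification: a negative timenudge shrinks the window,
// a positive one enlarges it.
function effectiveWindowMs(svFps: number, timenudge: number): number {
  return interpolationWindowMs(svFps) + timenudge;
}

console.log(interpolationWindowMs(20));   // window at the default sv_fps 20
console.log(effectiveWindowMs(20, -30));  // lowest value a stock client allows
console.log(effectiveWindowMs(20, -60));  // modified client: window goes negative
```

Under this simplification a value of -60 leaves no interpolation window at all, which is consistent with the client having to extrapolate.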
• CentOS 8:
  - vim
  - dnf install patch pkg-config gcc gcc-c++ make glibc-devel glibc-devel.i686
• frida:
  - pip install frida frida-tools
• Original server binaries:
  - wget https://files.jkhub.org/jka/official/jalinuxded_1.011.zip
The Quake 3 virtual machine is a complex topic and out of the scope
of these pages, so I will only cover what we need to follow this part.
What we need to know is that an export named vmMain acts as a
syscall dispatcher: it receives an integer as its first argument and
checks it against the gameExport_t table to decide which event has to
be handled. A quick overview:
Member                          Info
GAME_INIT                       Called every time a level changes
GAME_SHUTDOWN                   Same as GAME_INIT, plus server shutdown
GAME_CLIENT_CONNECT             Player or bot is connected to the server
GAME_CLIENT_USERINFO_CHANGED    Player modifies their info: network, playermodel, names
GAME_CLIENT_DISCONNECT          Player or bot disconnects
GAME_CLIENT_THINK               Frames when the server is idle
GAME_CONSOLE_COMMAND            Fallback to engine when a command is not recognized but might be available
BOTAI_*                         AI management (bots)
ICARUS_*                        ICARUS scripting engine stuff
From this function, we can see that each time an event is triggered
another function is called to handle it (vmMain is a dispatcher, after
all). If we only instrument the functions triggered by the events that
interest us, we remove the overhead issue and can build our agent
around them.
For example, if we wanted to instrument the event of a player/bot
connecting to our server, we could look at the GAME_CLIENT_CONNECT
entry of the gameExport_t table: we would only have to instrument the
ClientConnect function. Once the game is loaded and the GAME_INIT
event is triggered, the game code library jampgamei386.so is loaded in
memory and gives us access to the mangled mpgame exports (which
include functions such as ClientConnect, and some other engine
exports).
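To make the dispatcher idea concrete, here is a minimal TypeScript sketch of a vmMain-style dispatcher. It is not the engine’s code: the numeric values of the enum, the handler table and the name vmMainSketch are assumptions for illustration only.

```typescript
// Event identifiers; the numeric values are illustrative assumptions,
// not the engine's real gameExport_t values.
enum GameEvent {
  GAME_INIT,
  GAME_SHUTDOWN,
  GAME_CLIENT_CONNECT,
  GAME_CLIENT_USERINFO_CHANGED,
  GAME_CLIENT_DISCONNECT,
}

type Handler = (...args: number[]) => string;

// Only the events we care about get a handler, mirroring the idea of
// instrumenting just the functions behind interesting events.
const handlers: Partial<Record<GameEvent, Handler>> = {
  [GameEvent.GAME_CLIENT_CONNECT]: (clientNum) => `ClientConnect(${clientNum})`,
  [GameEvent.GAME_CLIENT_DISCONNECT]: (clientNum) => `ClientDisconnect(${clientNum})`,
};

// The first argument selects the handler; the rest are forwarded to it.
function vmMainSketch(command: GameEvent, ...args: number[]): string | null {
  const handler = handlers[command];
  return handler ? handler(...args) : null;
}

console.log(vmMainSketch(GameEvent.GAME_CLIENT_CONNECT, 3));
```

Hooking the dispatcher itself would run our code on every event, whereas hooking only the handlers behind selected events keeps the instrumentation off the hot path.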
After obtaining this information, the anti-cheat design can be decided:
    class ClientConnect {
      onEnter (args:NativePointer[]) {
        let userinfo = Memory.alloc(MAX_INFO_STRING); // size: 1024 or game goes brrrr
        let isBot = 0;
        const clientId:number = args[0].toInt32();
        if (args[2].toInt32() == 1) {
          isBot = 1
          log('(bot) clientConnect: ' + clientId);
          clientList[clientId] = false;
        }
        else {
          clientList[clientId] = true;
          log('clientConnect: ' + clientId);
        }

        if (!isBot) {
          getUserInfo(clientId, userinfo, MAX_INFO_STRING);
          let tmpIp:any = InfoValueforKey(userinfo, ipKey);
          const clientIP:string|null = tmpIp.readUtf8String();

          log("clientIP: " + tmpIp.readUtf8String());
          if (bannedIPsArray.includes(tmpIp.readUtf8String())) {
            // if banned, we ban :)
            log('filtered: ' + clientIP);
            Interceptor.replace(G_FilterPacketPtr, new NativeCallback((packet) => {
              return qTrue;
            }, 'bool', ['pointer']));
          }
        }

        const tmpSnaps:any = InfoValueforKey(userinfo, snapsKey);
        log("Snaps: " + tmpSnaps.readUtf8String());
        const tmpRate:any = InfoValueforKey(userinfo, rateKey);
        log("Rate: " + tmpRate.readUtf8String());
        let nameKey = Memory.allocUtf8String("name");
        const tmpName:any = InfoValueforKey(userinfo, nameKey);
        log("playername: " + tmpName.readUtf8String());
      }
      onLeave (retval:NativeReturnValue) {
        Interceptor.revert(G_FilterPacketPtr);
      }
    }
args[0] stores the client identifier, args[1] tells us whether the
client is new or not, and args[2] tells us whether it is a bot.
    let isBot = 0;
    const clientId:number = args[0].toInt32();
    if (args[2].toInt32() == 1) {
      isBot = 1
      log('(bot) clientConnect: ' + clientId);
      clientList[clientId] = false;
    }
    else {
      clientList[clientId] = true;
      log('clientConnect: ' + clientId);
    }

    if (!isBot) {
      getUserInfo(clientId, userinfo, MAX_INFO_STRING);
      let tmpIp:any = InfoValueforKey(userinfo, ipKey);
      const clientIP:string|null = tmpIp.readUtf8String();

      log("clientIP: " + tmpIp.readUtf8String());
      if (bannedIPsArray.includes(tmpIp.readUtf8String())) {
        // if banned, we ban :)
        log('filtered: ' + clientIP);
        Interceptor.replace(G_FilterPacketPtr, new NativeCallback((packet) => {
          return qTrue;
        }, 'bool', ['pointer']));
      }
    }

    onLeave (retval:NativeReturnValue) {
      Interceptor.revert(G_FilterPacketPtr);
    }
Initially it might seem a good idea to get the data via this function,
but the problem is that we are not sure when to query for it. There is,
however, an alternative path: checking whenever the player changes
their userinfo string.
Once the auxiliary functions are clear we are now ready to instrument
the ClientUserinfoChanged function.
The first part of our onEnter block retrieves the userinfo string of
the player that triggered the action, in the same way as when a player
connects to the server. It is possible to extract more information from
the infostring, but for now snaps, rate, and the player name are
enough.
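For readers unfamiliar with Quake 3 infostrings: they are backslash-delimited key/value pairs. The agent calls the game’s own lookup function on native memory, but the lookup itself can be sketched in plain TypeScript. The function below and the sample userinfo string are illustrative assumptions, not the game’s implementation:

```typescript
// Look up a key in a Quake 3 style infostring such as "\name\Padawan\rate\25000".
// Returns an empty string when the key is missing.
function infoValueForKey(info: string, key: string): string {
  const parts = info.split("\\");
  // A leading backslash produces an empty first element; skip it.
  const start = parts[0] === "" ? 1 : 0;
  for (let i = start; i + 1 < parts.length; i += 2) {
    if (parts[i].toLowerCase() === key.toLowerCase()) {
      return parts[i + 1];
    }
  }
  return "";
}

// Made-up userinfo string for illustration.
const userinfo = "\\name\\Padawan\\rate\\25000\\snaps\\20";
console.log(infoValueForKey(userinfo, "snaps"));
console.log(infoValueForKey(userinfo, "rate"));
console.log(infoValueForKey(userinfo, "name"));
```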
If the user sets an invalid snaps value (in this case, a snaps number
bigger than the server’s tickrate), it is possible to reject the player
by straight up issuing a kick command to the server, leveraging the
trap_SendConsoleCommand function with the clientkick command. The
next step is to identify whenever a player tries to set a rate
different from the standard (25000).
    if (parseInt(rate) != 25000) {
      const clientKickString = Memory.allocUtf8String("clientkick " + clientId.toString() + "\n");
      trap_SendConsoleCommand(0, clientKickString);
    }
This way, it would not be possible for the players in the server to
change their userinfo settings without being removed from the server
immediately.
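The two checks described in this section (snaps and rate) can be combined into one pure function that decides whether a kick command should be issued. This is a sketch under assumptions: the function name, the constants, and the choice to treat unparsable values as invalid are not taken from the original agent.

```typescript
const SERVER_TICKRATE = 20;   // sv_fps default mentioned earlier
const REQUIRED_RATE = 25000;  // standard rate value

// Returns the console command that kicks the client, or null if the
// settings are acceptable. Unparsable values are treated as invalid
// (an assumption made for this sketch).
function kickCommandFor(clientId: number, snaps: string, rate: string): string | null {
  const snapsValue = parseInt(snaps, 10);
  const rateValue = parseInt(rate, 10);
  const invalid =
    Number.isNaN(snapsValue) || snapsValue > SERVER_TICKRATE ||
    Number.isNaN(rateValue) || rateValue !== REQUIRED_RATE;
  return invalid ? "clientkick " + clientId.toString() + "\n" : null;
}

console.log(kickCommandFor(2, "20", "25000"));
console.log(kickCommandFor(2, "40", "25000"));
```

In the agent, the returned string would be copied into native memory with Memory.allocUtf8String and passed to trap_SendConsoleCommand, as in the listing above.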
    let currentFrame:number = 0;

    class G_RunFrame {
      onEnter (args:NativePointer[]) {
        currentFrame = args[0].toInt32();
      }
    }
The biggest issue when obtaining values from the playerState struct is
that it is nested inside another struct named gentity_s, and the only
way to reach it is by manually calculating the offsets within the
gentity struct or by reverse engineering the code. In my case I
calculated the required offsets manually, but I understand this is a
troublesome task for the reader, so the best way to go about it is to
reverse engineer them.
In this call it is possible to track the attacker and the attacked
players, as well as the damage and the means of death (fall, weapon,
environment…). Let’s instrument this function and target the
attacker’s timenudge value:
    class PlayerDie {

      killer_clientNum:number = -1;
      killer_cmdTime:number = 0;
      killer_clientPing:number = 999;
      onEnter (args:NativePointer[]) {
        this.killer_clientNum = args[2].readInt();
        const killer_playerState_s:NativePointer = args[2].add(532);

        this.killer_cmdTime = killer_playerState_s.readPointer().readInt();
        this.killer_clientPing = killer_playerState_s.readPointer().add(524).readInt();
      }
The attacker is passed in the third argument (args[2]), a pointer to a
gentity_t structure whose first member is the client identifier. A
pointer to the playerState structure is stored at offset 532 (decimal,
as used in the code); the first member of playerState is the client’s
commandTime, and the client ping is stored at offset 524 of
playerState.
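The pointer arithmetic in onEnter can be tried out without a game server by simulating memory with a DataView. This is only a model: the buffer, the two addresses and the stored values are made up, while the offsets 532 and 524 are the ones used in the PlayerDie listing. Pointers are 4 bytes wide because the game library is a 32-bit (i386) binary.

```typescript
// Simulated 32-bit process memory; addresses are offsets into this buffer.
const memory = new DataView(new ArrayBuffer(4096));
const LITTLE_ENDIAN = true;

const GENTITY_ADDR = 0x100;      // arbitrary address of the attacker's gentity
const PLAYERSTATE_ADDR = 0x800;  // arbitrary address of its playerState

// Lay out the fields at the offsets used by the PlayerDie class.
memory.setInt32(GENTITY_ADDR, 3, LITTLE_ENDIAN);                        // client number
memory.setUint32(GENTITY_ADDR + 532, PLAYERSTATE_ADDR, LITTLE_ENDIAN);  // playerState pointer
memory.setInt32(PLAYERSTATE_ADDR, 123456, LITTLE_ENDIAN);               // commandTime
memory.setInt32(PLAYERSTATE_ADDR + 524, 48, LITTLE_ENDIAN);             // ping

// Mirrors args[2].readInt(), args[2].add(532).readPointer().readInt(), and so on.
function readKiller(gentity: number) {
  const playerState = memory.getUint32(gentity + 532, LITTLE_ENDIAN);
  return {
    clientNum: memory.getInt32(gentity, LITTLE_ENDIAN),
    cmdTime: memory.getInt32(playerState, LITTLE_ENDIAN),
    ping: memory.getInt32(playerState + 524, LITTLE_ENDIAN),
  };
}

const killer = readKiller(GENTITY_ADDR);
console.log(killer);
```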
      onLeave () {

        if (clientList[this.killer_clientNum]) {
          let {timenudge, bogusTimenudge} = isBogusTimenudge(this.killer_cmdTime, currentFrame, this.killer_clientPing);
          if (bogusTimenudge === true) {
            SITHagent_sendServerChatMessage("timenudge: " + timenudge.toString());

            const killer_clientKickString = Memory.allocUtf8String("clientkick " + this.killer_clientNum.toString() + "\n");
            trap_SendConsoleCommand(0, killer_clientKickString);
          }
        }
      }
    }
The onLeave callback looks the client up in clientList to verify
whether it is a bot and, if it is not, proceeds to calculate its
timenudge value. If the timenudge value is invalid, the player is
automatically kicked from the server. This code block calls an
auxiliary function, isBogusTimenudge, that calculates the timenudge
value and validates it. This function is described as follows:
This function takes the ping, the currentFrame and the commandTime
of the client and calculates the timenudge using the aforementioned
formula. Negative values of -7 or below will drop the client, and
positive values above the player’s ping are invalid.
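The validation half of that description can be sketched on its own. The sketch below takes an already computed timenudge as input and deliberately does not reproduce the formula. Reading the negative threshold as “-7 or lower”, and the names used, are assumptions:

```typescript
interface TimenudgeCheck {
  timenudge: number;
  bogusTimenudge: boolean;
}

// Validates an already computed timenudge against the two rules in the text.
function checkTimenudge(timenudge: number, ping: number): TimenudgeCheck {
  const tooNegative = timenudge <= -7;   // assumed reading of the negative threshold
  const tooPositive = timenudge > ping;  // positive values above the ping are invalid
  return { timenudge, bogusTimenudge: tooNegative || tooPositive };
}

console.log(checkTimenudge(-10, 50));
console.log(checkTimenudge(0, 50));
```

The returned object has the same shape that the onLeave callback destructures, so the real isBogusTimenudge could delegate to a check like this after computing the value.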
Of course, there are alternatives like instrumenting SV_UserMove or
ClientThink_Real that would allow us to get a real-time calculation
of timenudge values, but these alternatives rely heavily on server
frames and hence need to be optimized. Later on we will revisit this
idea, but first let’s optimize our current code.
Interceptor.attach(G_RunFramePtr, runFrameCModule);
To persist across map changes and map restarts, the best approach is
to hook this function and re-enable our instrumentation once the
server has restarted (in this case, reloaded the libraries).
    class GShutDownGame {
      onLeave() {
        setTimeout(hookJampgameExports, 3000);
      }
    }
11.6.2 Conclusions
This is only a ‘simple’ proof of concept anti-cheat, but the main idea
behind this development is to demonstrate how many things are
possible with the Frida toolkit. In most scenarios it will be used
to bypass or extract information from an application, but it is also
possible to use it to build around an existing closed binary. This proof
of concept has also helped demonstrate how it is possible to
optimize our code after identifying bottlenecks.
12. Resources
Resources and references that have helped write this handbook or
are just useful.
Technical concepts
Tutorials
¹⁰https://github.com/rocco8620/useful-android-frida-snippets
¹¹https://github.com/enovella/r2frida-wiki
¹²https://bananamafia.dev/post/r2frida-1/
¹³https://r2wiki.readthedocs.io/en/latest/radare-plugins/frida/
¹⁴https://www.entdark.net/search/label/frida