[This was originally posted at https://research.nccgroup.com/2019/03/25/ebpf-adventures-fiddling-with-the-linux-kernel-and-unix-domain-sockets/.]
tl;dr
eBPF (extended Berkeley Packet Filter) is slowly taking over as a programmatic way for (generally privileged) users to invoke Linux kernel APIs and performantly execute semi-arbitrary code without having to load it from a custom kernel module. eBPF is a general means to load memory safe restricted code that reduces the risk of crashes, deadlocks, and infinite loops inherent to the kernel module alternative.
In this post, we describe how to effectively use eBPF to trace Linux kernel
functions. We also discuss how we implemented our eBPF-based tracing tool that
can sniff Unix domain sockets across an entire Linux host (this “impossible”
task was what got us started with eBPF). You can install it with
sudo -H pip install unixdump (or from our repo),
but it requires BCC to be installed
separately. If you just want to see some of what it can do,
skip ahead to the case study below.
If you’re looking to get into building custom eBPF-based Linux kernel tracing tools, we recommend starting with BCC, busting out its reference guide, and pinning a tab to the Linux kernel codebase.
Background
Unix domain sockets1 are a core OS-provided IPC (inter-process communication) mechanism that enable
processes on the same host to communicate through send(2)/sendto(2)- and recv(2)/recvfrom(2)-able
file descriptors similarly to network sockets. Unix domain sockets use a file
path (or in the case of Linux’s abstract namespace, a key string) as the
bind(2)/connect(2) “address;” additionally, “unnamed” Unix domain sockets
may be created in connected pairs through the socketpair(2) syscall.
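To make the addressing modes above concrete, here is a minimal sketch of an "unnamed" pair created with socketpair(2) and of a path-bound listener. The function names and the demo.sock file name are ours, chosen only for illustration.

```python
import os
import socket
import tempfile


def socketpair_roundtrip(payload: bytes) -> bytes:
    """Send payload across an unnamed, already-connected pair of sockets."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with left, right:
        left.sendall(payload)
        return right.recv(len(payload))


def path_roundtrip(payload: bytes) -> bytes:
    """Send payload to a listener whose "address" is a filesystem path."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")  # hypothetical socket path
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            server.listen(1)
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)  # completes once queued in the backlog
                conn, _ = server.accept()
                with conn:
                    client.sendall(payload)
                    return conn.recv(len(payload))
```

Both helpers return the bytes seen by the receiving end, so calling either with a short message should hand the same message back.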
Traditionally, Unix domain
sockets are often used when applications or services require bidirectional
communication between related processes or communication between unrelated ones
that may enforce OS-backed permission checks (though the checks differ between Unixes2). Depending on the internal
implementation, Unix domain sockets are often significantly more performant
than loopback-networking as they do not go through the networking stack. One
downside of this is that Unix domain sockets are extremely difficult to inspect
as there is no simple interface or API for intercepting Unix domain socket
traffic, such as the pcap(3) APIs for network traffic. Instead, when needing
to observe Unix domain socket traffic, one often resorts to interposing a
forwarding daemon with a middle Unix domain socket that the
connect(2)/sendto(2)-ing peer will actually communicate with. In the case
where file descriptors are being passed or peer credentials are being validated,
and it may be overly complicated to run such a daemon with the right process
information, function hooking (typically LD_PRELOAD-alikes) may be used to
interdict libc stubs used to invoke the communication syscalls. Both methods
have drawbacks and additionally require more than a cursory understanding of
how an application is already using Unix domain sockets in order to intercept
them. Additionally, neither method scales across an entire host and may only be
used to individually intercept Unix domain socket traffic for single
applications and services. Intercepting all Unix domain socket traffic across
a host will require dumping the data directly from the kernel; we can do this
by tracing the kernel.
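As a rough illustration of the forwarding daemon approach, the sketch below relays a single message through a "middle" socket and records what passed through. It is deliberately sequential and single-shot, and the socket paths and the relay_once name are hypothetical.

```python
import os
import socket
import tempfile


def relay_once(payload: bytes):
    """Relay one client message through a middle socket and record it."""
    seen = []
    with tempfile.TemporaryDirectory() as tmp:
        real_path = os.path.join(tmp, "real.sock")      # the actual service
        middle_path = os.path.join(tmp, "middle.sock")  # what the client is pointed at
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as service, \
                socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as middle, \
                socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client, \
                socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as upstream:
            service.bind(real_path)
            service.listen(1)
            middle.bind(middle_path)
            middle.listen(1)

            client.connect(middle_path)        # client talks to the forwarder
            from_client, _ = middle.accept()
            upstream.connect(real_path)        # forwarder talks to the service
            at_service, _ = service.accept()

            with from_client, at_service:
                client.sendall(payload)
                data = from_client.recv(4096)
                seen.append(data)              # this is where traffic is observed
                upstream.sendall(data)
                delivered = at_service.recv(4096)
    return seen, delivered
```

Note that the service ends up connected to the forwarder rather than to the original client, which is why peer credential checks and file descriptor passing complicate this approach.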
Kernel Tracing and Instrumentation
Generally, kernel tracing utilities enable one to obtain very basic information
about the execution within the OS kernel. Most implementations provide the
ability to dump metadata about executing functions, including their arguments
and return values. Often, this data is very limiting as the relevant
information is embedded within structs and arrays for which only the pointer
address is returned. Fewer OS kernels directly provide instrumentation APIs
that enable deeper information to be gleaned from executing functions and
kernel memory. The gold standard for such functionality is DTrace.
DTrace was originally developed for Solaris, and has since been ported to
FreeBSD and macosx (where, unfortunately, it has been purposefully weakened to
align with other DRM-related changes and will deny attempts to trace binaries
from core system directories). DTrace has also been ported
to Linux, twice; once as a /proc/kallsyms-based kernel module that
implements its own function hooking, and, more recently, by Oracle,
who relicensed it under the GPL (for kernel code). Unfortunately, the former is
essentially unmaintained and fails on current Linux distributions, and the
latter is effectively locked to Oracle Linux at the moment and it is unclear if
it will be accepted into upstream.
Due to the long period without DTrace on Linux and the eventual realization of the need to have useful debugging features within the kernel, Linux has played fast and loose with a number of tracing frameworks that mostly constitute dead ends. Chief among the failures is SystemTap, which has a painful installation process requiring installation of kernel debug symbols and uses a frustrating NIH clone of the DTrace scripting syntax to dynamically generate kernel modules. Needless to say, one is probably better off writing their own function hooking kernel modules. Sysdig, on the other hand, is a useful tool (but doing anything fancy might involve Lua scripting…). Unfortunately, it is not capable of drilling down into arbitrary kernel structures in memory as it is based entirely on Linux’s built-in tracepoint definitions (primarily those for syscalls and process scheduling events), which provide only specific pre-defined values that the kernel’s developers felt would be useful for debugging and tracing kernel functionality.
Linux’s Lego Problem
Linux’s in-kernel tracing features are very similar to other facets of modern Linux, specifically containers. Linux slowly gained a number of “namespacing” features that were eventually composed to form the “concept” of “containers,” an ape of purpose-built sandboxing technology such as FreeBSD’s jails and Solaris’ zones. Containers have only recently started to see legitimate security hardening materialize for normal users within the past few years, most notably with the introduction of user namespaces, a feature feared by Linux distros, but which wipes out a number of container escapes by isolating privileges to the container’s namespaces. Jessie Frazelle sums up the “concept” distinction nicely in her infamous blog post comparing them. We will only add that if you decide to make an ocean out of the Death Star set, it will cost $500 and your ocean will be a murky gray.
The Linux kernel has several different inter-related features that support dynamic instrumentation and tracing (for a more thorough introduction, see Julia Evans’ overview on Linux’s tracing systems):
- kprobes: Kernel API via register_kprobe(struct kprobe*) that can register callback functions to handle a breakpoint injected at an arbitrary memory address (typically the start of a function). The handlers are provided a struct pt_regs* containing the architecture-specific register values.
- ftrace: A function tracing API provided by the Linux kernel, built on lower-level APIs such as kprobes and tracepoints, that provides a filesystem-based userland API (debugfs) to configure and enable various tracing and profiling operations.
- perf (aka perf_events): Kernel API for hardware performance monitoring counters (e.g. number of instructions executed), timed sampling (e.g. find where in the callstack time is spent), and userspace mapped ring buffers via perf_event_open(2)/mmap(2) syscalls.
- tracepoints: Kernel API with tracepoints declared through the TRACE_EVENT macro, inserted via calls to the tracepoint “function,” and callbacks registered through tracepoint_probe_register(struct tracepoint*, void*, void*). The TRACE_EVENT macro creates metadata useful for perf and ftrace to instrument by tracepoint name.
Some of these are implemented in terms of each other, and several of their
subsystems interact with or support each other. But, up until recently, if you
wanted to directly interact with any of these things in a meaningful way, you
needed to write a bunch of GNU-flavored C in a kernel module and implement
your own scheme for communicating data back to userspace (assuming you want
something more performant than printk). This is essentially how every single
tracing “framework” is implemented, and one of the common things they seem to
share is that they all tend to set up their own ring buffer from scratch
that is mmap(2)‘d using a file descriptor obtained by open(2)-ing a special
path registered by the module (implementations vary heavily, from use of
cdev_add to mounted in-memory
filesystems).
eBPF: Lego Grout
eBPF (extended Berkeley Packet Filter) is an in-kernel JIT-ing virtual machine
that adds extra computational resources (more registers, direct ISA mapping to
major CPU architectures, and fast C interop with internal Linux kernel functionality)3 to the classic BPF virtual machine and
bytecode instruction set; it is referred to as bpf in the Linux source, APIs,
syscalls, etc. eBPF is slowly taking over as a “programmatic” way for users,
often privileged ones, to invoke Linux kernel APIs and execute “performant”
code in kernel space (which limits context switches) without necessitating the
development and loading of a custom kernel module in an unsafe or dangerous
language. eBPF is not specifically a tracing or instrumentation feature, but a
general means to load memory safe restricted code that reduces the risk of
crashes, deadlocks, and infinite loops inherent to the kernel module
alternative. Given that eBPF itself has already
introduced
vulnerabilities
(though, being honest, what revolutionary new feature hasn’t had bugs?), it
is exacerbating the failings of the Linux capability “model” and raising
concerns about the balance between functionality and security. But if the
intention behind eBPF is to keep mortals from writing buggy kernel modules, it
may well succeed and improve the security status quo in doing so.
eBPF is slowly but surely becoming a framework within the kernel with an
ever-increasing menagerie of features, programming capabilities, kernel-backed
(and userland-mapped) data structures, helper functions (for which an in-eBPF
implementation would otherwise violate the safety restrictions), and pluggable
APIs. These include multiple forms of packet and socket filtering and
processing, an
interesting
API for adding encapsulation to packets based on route tables, shenanigans to
hook bind(2)
to “fix” broken apps that run in containers, and multiple APIs to attach them
to all manner of kernel-based tracing and instrumentation mechanisms. The last
of these is most relevant to our purposes, but we will nonetheless remind folks
to stay safe and make sure to
properly account for variable-length headers
when processing packets with eBPF.
BCC: C to eBPF By Way of the Long Way Round
BCC
(BPF Compiler Collection) is a toolkit for having userland code (generally
Python) interact with kernel space eBPF code, and includes an LLVM-based
cross-compilation toolchain that compiles C code into eBPF bytecode. At a very
high level, this toolchain is based on using Clang’s
RecursiveASTVisitor
AST traversal library to modify the C code into a “suitable” format and then
use LLVM’s eBPF backend to emit the bytecode. These modifications exist
primarily to replace external memory accesses with equivalent eBPF memory
accessing helper functions and expand other simplified C coding constructs
such as BCC library functions and
“magic” semantics
introduced by BCC that are used to denote the eBPF attach target
(e.g. kprobe__funcname to attach the eBPF-compiled function as a kprobe hook
on funcname). For a slightly deeper dive into how BCC C code works and how
eBPF kprobes are registered, see our talk
given recently at the 35C3 conference.
BCC and eBPF Bytecode Validator Hell
To make eBPF “safe,” the Linux kernel
validates
all eBPF code before loading it. For example, eBPF code is not allowed to
“loop” (to prevent infinite loops), so any attempt to run code at an
address/offset before the currently running eBPF instruction will be deemed
illegal (this restriction has been weakened slightly in newer versions of Linux
to reduce code bloat, but is still enforced by default). Additionally, the
eBPF validator may detect loops when there are none in the source code; this
can happen due to compiler optimizations or because of faulty identification of
normal function calling operations as being loop-like. As such, BCC C often
uses unrolled loops and inlined functions. The eBPF validator additionally
has a number of call-site validations and register taint tracking logic that
attempt to ensure that helper functions, such as those used to manipulate
memory-mapped tables and access kernel memory, are only passed “safe” argument
values. This “logic” is problematic as it is often not thorough enough to
properly determine value bounds. This problem is further complicated by the
fact that BCC compiles code with
-O2;
most naïve attempts to make such bounds “more obvious” are likely to be
optimized out by Clang. Additionally, updating BCC (and possibly the Linux
kernel) may result in slightly different bytecode output that
trips the validator. However, this is generally not the case for very simple
code, such as that of the tools and examples code within the BCC repo itself.
We have also observed errors when using certain Linux/BCC versions where the
use of a bool function parameter was not tolerated in certain variants of our
code (e.g. different filtering comparisons being applied) and integer types
were not tolerated in the others; we originally had to solve this by using
#ifdef magic to control the parameter’s type depending on the variant of the
code until a BCC update unbroke it. These issues are so pervasive that the BCC
developers themselves appear to believe
that certain tolerable code constructs, such as variable-length byte copies,
are not possible in eBPF because the idiomatic C code is not accepted by the
eBPF validator.
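The "no backward jumps" rule can be pictured with a toy model. The sketch below is not the kernel's validator and uses a made-up instruction representation (a mnemonic plus an optional jump offset relative to the next instruction); it only shows why a loop produces a jump to an earlier offset while its unrolled equivalent does not.

```python
from typing import List, Optional, Tuple

# Toy instruction: (mnemonic, jump_offset). jump_offset is None for non-jump
# instructions; otherwise it is relative to the *next* instruction.
Insn = Tuple[str, Optional[int]]


def find_back_edges(program: List[Insn]) -> List[int]:
    """Return the indices of jumps that target their own or an earlier offset."""
    back_edges = []
    for pc, (_, offset) in enumerate(program):
        if offset is None:
            continue
        target = pc + 1 + offset
        if target <= pc:
            back_edges.append(pc)
    return back_edges


def unroll(body: List[Insn], times: int) -> List[Insn]:
    """Repeat a loop body a fixed number of times instead of jumping back."""
    return [insn for _ in range(times) for insn in body]


# A two-instruction body followed by a conditional jump back to the start.
looped = [("load", None), ("add", None), ("jne", -3)]
# The same body repeated four times, with no jump at all.
unrolled = unroll([("load", None), ("add", None)], 4)
```

In this model, find_back_edges(looped) flags the final jump, while the unrolled version has nothing to flag at the cost of being four times longer.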
While these issues present challenges when attempting to develop portable
BCC/eBPF-based tooling, it is useful to be able to disable these inane
validations when simply attempting to quickly trace a kernel function and
extract some interesting data. Unfortunately, the current validator
implementation suffers from high coupling-low cohesion as the validation
routine itself pre-processes the bytecode and configures the internal kernel
data structures responsible for running it. As a result, the
validation routine
cannot be bypassed directly with a NOP or by stomping over its implementation
with a return 0. Instead, one has to individually clip the strings of the
eBPF validator’s golden fiddle by performing a number of nigh-surgical function
hooks that will both bypass lower-level state validation checks and override
registers with safe bound values. We have implemented a set of such hooks that
have bypassed the more pernicious and maddening errors that we have experienced
while writing our Unix domain socket sniffer tool. While we do not recommend
using it in production as it can definitely lead to unstable and, more
importantly, unsafe eBPF code, our yolo-ebpf
kernel module can help in a pinch when attempting to reverse engineer
applications on the fly. And if you still happen to hit an incorrect eBPF
validator error while using it, please
send us an issue.
It is disappointing that such tomfoolery is needed in the first place, but
eBPF and BCC are both relatively new and these things take time.
unixdump: An eBPF-based Unix Domain Socket Sniffer
unixdump
is a full-featured utility for passively capturing Unix domain socket traffic
from Linux hosts built on top of eBPF and BCC. It can capture all traffic
across a host, including file descriptor transfers and Unix credential passes.
unixdump supports fine-grained filtering based on Unix domain socket paths
(including abstract namespace keys) and PIDs, and can perform both inclusive
and exclusive filtering of PIDs. unixdump additionally supports outputting to
readable log files amenable to extracting binary content (We are currently
looking into outputting to the pcapng format, which can support ancillary data,
but performantly and accurately timestamping events may pose a challenge).
Design and Implementation
As with other BCC-based tools, our userland event handling code is written in Python and our kernel space kprobe hook that generates events is written in C. Essentially, the C code is what performs the important operations; in our case, this is the retrieval and filtering of metadata and content from sockets and other kernel structures. This code then marshals the event data into a struct that is unpacked on the Python side. The Python code then processes the event stream into a more user-friendly data output.
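The marshal-then-unpack step can be sketched with plain ctypes. The Event layout below is hypothetical and is not unixdump's actual event struct; pack_event merely stands in for the kernel side so that the example runs without BCC.

```python
import ctypes
import struct


class Event(ctypes.Structure):
    # Hypothetical layout for illustration only.
    _fields_ = [
        ("pid", ctypes.c_uint32),
        ("peer_pid", ctypes.c_uint32),
        ("slot", ctypes.c_uint32),        # index into a content buffer
        ("length", ctypes.c_uint32),      # number of content bytes
        ("path", ctypes.c_uint8 * 16),    # raw socket path bytes
    ]


def pack_event(pid: int, peer_pid: int, slot: int, length: int, path: bytes) -> bytes:
    """Stand-in for the C side: lay the fields out as one flat byte string."""
    return struct.pack("=IIII16s", pid, peer_pid, slot, length, path)


def unpack_event(raw: bytes) -> dict:
    """Python side: reinterpret the raw bytes as the shared struct."""
    ev = Event.from_buffer_copy(raw)
    return {
        "pid": ev.pid,
        "peer_pid": ev.peer_pid,
        "slot": ev.slot,
        "length": ev.length,
        "path": bytes(ev.path).rstrip(b"\x00"),
    }
```

The important property is that both sides agree on field order and sizes; the dictionary is just a friendlier shape for later processing.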
This flow is implemented through the use of two ring buffers, one the
perf_event ring buffer, and the other a custom ring buffer built on top of an
eBPF map. Events are pushed to userspace through the perf_event ring buffer
using perf_submit calls in the C code. The Python userspace code constantly
poll(2)s file descriptors associated with these ring buffers to detect event
submissions. The Python code then attempts to read the rest of the data from
the asynchronously updated custom ring buffer mapped into userspace. Following
this, the Python code processes the data and clears the custom ring buffer entry.
In unixdump, we are extracting the data sent over Unix domain sockets at one
of the lowest possible levels, from the internal msghdr structs holding them.
When the send syscall is invoked on a Unix domain socket, a msghdr
parameter, msg, gets passed along. The data in the msghdr struct is
contained within another structure, iov_iter, that is embedded into the
msghdr as its msg_iter field. iov_iters can wrap several kernel buffer
structures, but in our case, it uses the const struct iovec* iov union
variant, which is a simple structure that contains a buffer base pointer,
iov_base, and a buffer length, iov_len, that together refer to our Unix
domain socket message content.
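The base-pointer-plus-length shape of struct iovec is easy to mimic in userland. The sketch below uses ctypes to build an iovec over a local buffer and copy out at most max_len bytes; the read_iovec helper and its cap are our own illustration, not code from unixdump.

```python
import ctypes


class iovec(ctypes.Structure):
    # Mirrors struct iovec: a base pointer and a byte length.
    _fields_ = [
        ("iov_base", ctypes.c_void_p),
        ("iov_len", ctypes.c_size_t),
    ]


def read_iovec(iov: iovec, max_len: int) -> bytes:
    """Copy at most max_len bytes from the buffer the iovec refers to."""
    length = min(iov.iov_len, max_len)
    return ctypes.string_at(iov.iov_base, length)


# A userland buffer standing in for the message content.
message = ctypes.create_string_buffer(b"AUTH\r\n", 6)
iov = iovec(ctypes.addressof(message), len(message))
```

Capping the copy length matters in the eBPF setting because the destination storage has a fixed size, so a sniffer has to decide how much of each message it is willing to keep.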
We extract this data using the bpf_probe_read() helper function, which acts as
a “safe” memcpy enabling eBPF programs to read arbitrary kernel memory into
their own memory space. An interesting quirk of how BCC works is that function
calls to bpf_* functions, which are part of the kernel’s eBPF API, and other
BCC-specific helper functions/methods (yes, “methods”) are rewritten using an
LLVM-based code generation pass. This enables the helper functions to be
translated into the appropriate bpf_call instructions and is additionally
used to translate all kernel memory dereferences into bpf_probe_read() calls.
Unix domain sockets also allow processes to pass file descriptors to one
another (SCM_RIGHTS), and authenticate their identity (or act on behalf of
another) by passing user credentials (SCM_CREDENTIALS) through the kernel. This “ancillary data”
takes the form of several cmsghdr structures and CMSG_DATA payloads lined
up within a single byte buffer blob. This blob is pointed to by the
void* msg_control field of the msghdr struct and the
size_t msg_controllen field specifies the total size. To differentiate and
identify the raw contents of the CMSG_DATA payloads, the cmsghdr struct
stores metadata about the type and size of the data. For example, if the
int cmsg_type field is SCM_RIGHTS, the particular CMSG is being used to
pass file descriptors. An interesting quirk of the CMSG system in the kernel
is that separate CMSG objects of the same type will be combined into one
CMSG observed by the receiver. Like most ad-hoc data structures in the Linux
kernel codebase, CMSG blobs are not simple to parse given eBPF’s constraints.
In particular, these blobs are typically iterated through by using multiple
layers of pointer shifting macros that embed a for-loop construct to iterate
the initially unknown number of CMSG objects; it is worth keeping in mind
that the msghdr.msg_controllen field refers to the byte length of the whole
CMSG blob, and is used to ensure that CMSG objects are not iterated or
processed past the end of the buffer allocated to them. To get around the eBPF
limitations, we use CLI flag-based tunables to guide (hacky string concat) code
generation of C code that statically iterates these blobs, if present, and
copies the metadata and typing information into our ring buffer; we feel this
was still less painful to implement than it would have been to copy the entire
blob to userspace and process it in Python.
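For reference, walking a CMSG blob looks roughly like the following when written without the kernel's macros. This is a userland sketch in Python, not unixdump's generated C: the max_cmsgs parameter plays the role of a fixed iteration bound, the header layout assumes Linux's cmsghdr (size_t length, int level, int type), and the SCM_CREDENTIALS fallback value of 2 is the Linux constant.

```python
import socket
import struct

HDR = struct.Struct("@Nii")      # size_t cmsg_len; int cmsg_level; int cmsg_type
ALIGN = struct.calcsize("@N")    # CMSG_ALIGN rounds up to sizeof(size_t)
SCM_RIGHTS = socket.SCM_RIGHTS
SCM_CREDENTIALS = getattr(socket, "SCM_CREDENTIALS", 2)  # 2 on Linux


def cmsg_align(n: int) -> int:
    return (n + ALIGN - 1) & ~(ALIGN - 1)


def pack_cmsg(level: int, ctype: int, data: bytes) -> bytes:
    """Build one header + payload, padded to the next aligned offset."""
    cmsg_len = HDR.size + len(data)
    return (HDR.pack(cmsg_len, level, ctype) + data).ljust(cmsg_align(cmsg_len), b"\x00")


def parse_cmsgs(blob: bytes, controllen: int, max_cmsgs: int = 4):
    """Walk at most max_cmsgs entries, never reading past controllen."""
    found = []
    offset = 0
    for _ in range(max_cmsgs):   # fixed bound instead of an open-ended loop
        if offset + HDR.size > controllen:
            break
        cmsg_len, level, ctype = HDR.unpack_from(blob, offset)
        if cmsg_len < HDR.size or offset + cmsg_len > controllen:
            break
        found.append((level, ctype, blob[offset + HDR.size:offset + cmsg_len]))
        offset += cmsg_align(cmsg_len)
    return found
```

Every step is checked against controllen before any bytes are read, which is the same role msghdr.msg_controllen plays for the kernel's own iteration macros.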
BCC, providing a userland interface on top of a kernel-only one, enables
BCC C code to specify I/O data structures that map to the ones in
<linux/bpf.h> through the use of BCC-provided macros. The primary benefit
of these structures is that they may allocate much more storage space than
is otherwise provided on the eBPF stack and they are considered a valid
copy target for reading arbitrary kernel memory (there are some inconsistent
“protections” around copying pointer addresses directly onto the eBPF stack).
unixdump uses a BPF_PERCPU_ARRAY() to store (potentially large) message
content as it enables easier iterating of ring buffer slots. For simpler
one-off event notifications, BCC C supports setting and registering a
perf_event output ring buffer-based output struct through
BPF_PERF_OUTPUT(); the perf_submit() helper function may then be called on
the output object declared by the macro. This function call is actually
translated into a bpf_perf_event_output() helper function call through BCC’s
code generation.
Dividing our message content and event metadata across these two storage
mechanisms enables us to better tune memory usage; and detect, mitigate, and
report when unixdump is bottlenecking against the system. One major difference between
these two I/O mechanisms is that BPF_PERF_OUTPUT()-registered data structures
will be automatically parsed/deserialized by BCC, whereas BCC-registered
tables/arrays/maps will be provided to registered event handlers as raw byte
buffers, necessitating the use of custom Python ctypes parsing logic. However,
one major pain point to be aware of with this behavior of BCC is that in the
former case, char[] fields will be parsed as NUL-terminated C strings, and
all data after a NUL byte will be lost; it may not be recovered by using
ctypes.string_at. The solution is to use uint8_t[] for non-C string data,
as it will result in BCC reading all of the data.
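The same truncation can be reproduced with plain ctypes, independent of BCC. In the sketch below, the two struct names are ours, and the sample bytes are the first eight bytes of a DBus message, which contain a NUL at offset 2.

```python
import ctypes


class CharEvent(ctypes.Structure):
    _fields_ = [("data", ctypes.c_char * 8)]    # read back as a C string


class ByteEvent(ctypes.Structure):
    _fields_ = [("data", ctypes.c_uint8 * 8)]   # read back as raw bytes


raw = b"l\x01\x00\x01\x24\x00\x00\x00"

# Reading a char array field stops at the first NUL byte.
as_chars = CharEvent.from_buffer_copy(raw).data

# Reading a uint8_t array field preserves every byte.
as_bytes = bytes(ByteEvent.from_buffer_copy(raw).data)
```

Here as_chars only holds the two bytes before the first NUL, while as_bytes holds all eight, which is why binary payloads belong in uint8_t[] fields.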
When writing unixdump, we quickly learned that display server traffic
(e.g. X11) for terminal applications goes over Unix domain sockets. A naive
implementation would quickly result in a feedback loop that would suck up
memory and CPU resources. Since we wanted to avoid locking up the system, and
since we also wanted to capture tunable amounts of data larger than the eBPF
stack size (since eBPF cannot perform dynamic allocations), we went with a
CLI-configurable ring buffer for content storage. The current implementation
will simply drop events (but notify userland of the drop with additional
metadata) if the ring buffer slot is still in use by the time it is needed
again. We also do not directly perf_submit the large slots of the ring buffer
as this resulted in a large number of kernel-dropped perf events. Instead we
perf_submit smaller event metadata, which includes information like PIDs,
socket paths, and the index of the ring buffer slot synced to userspace using
the bpf_map_* APIs. This results in a slight “race condition” in that the ring
buffer slot may not yet be accessible to userspace by the time we attempt to
access it. However, this delay is not subject to the ABA problem or any similar
use-after-free-like issues as the userland pages will have always been cleared
by the userspace code prior to being updated by the kernel. We use a few
fallback mechanisms to poll at it depending on whether or not event order
preservation is necessary, but will give up after a few tries as we have
observed complete losses of the data in some circumstances where the slot
never updates. It is currently unclear if this is due to a flaw in BCC or the
Linux kernel itself.
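The drop-if-still-in-use policy can be modeled in a few lines. The SlotRing class below is a simplified, single-threaded sketch under our own assumptions (for example, that the write position advances even when an event is dropped); it is not unixdump's implementation and ignores the kernel-to-userspace synchronization delay described above.

```python
class SlotRing:
    """Fixed number of fixed-size slots; a busy slot causes a drop."""

    def __init__(self, slots: int, slot_size: int):
        self.slot_size = slot_size
        self.data = [b""] * slots
        self.in_use = [False] * slots
        self.next = 0
        self.dropped = 0

    def produce(self, payload: bytes):
        """Producer role: claim the next slot, or drop if it is still in use."""
        index = self.next
        self.next = (self.next + 1) % len(self.data)  # assumption: advance on drop
        if self.in_use[index]:
            self.dropped += 1
            return None
        self.data[index] = payload[:self.slot_size]   # content is capped per slot
        self.in_use[index] = True
        return index

    def consume(self, index: int) -> bytes:
        """Consumer role: read the slot, then clear it so it can be reused."""
        payload = self.data[index]
        self.data[index] = b""
        self.in_use[index] = False
        return payload
```

With two slots, a third produce before any consume is counted as a drop rather than overwriting data the consumer has not read yet.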
For better throughput, we perform various checks to determine whether it is
worth it to continue processing. For example, we will bail out early if various
validation checks do not pass (e.g. if certain metadata is missing or
unexpected). On top of this, we provide a number of in-kernel filters to reduce
and refine the amount of processing done in the kernel. Users can filter on
specific Unix domain socket paths (or match path prefixes) and PIDs. It is also
possible to exclude certain noisy PIDs altogether (e.g. the GUI terminal
process rendering the output of its own Unix socket communication with the
display server, or the display server itself). Using the filters will reduce
the amount of data copied to the fixed-size perf_events ring buffer and
therefore help to prevent it from overfilling and dropping (“missing” in the
Linux parlance) events that cannot fit. We also support configuring the size
of this buffer should stable throughput still be too much for the default size.
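The filtering rules described above can be summarized as a small predicate. This is a userland sketch with hypothetical parameter names, not the in-kernel filter code; it assumes that exclusions take priority over inclusions.

```python
def should_capture(path: bytes, pid: int, peer_pid: int,
                   prefixes=(), only_pids=None, exclude_pids=frozenset()) -> bool:
    """Decide whether an event is worth copying out."""
    # Excluded PIDs (e.g. a noisy terminal or display server) always lose.
    if pid in exclude_pids or peer_pid in exclude_pids:
        return False
    # If an inclusive PID filter is set, one side of the socket must match.
    if only_pids is not None and pid not in only_pids and peer_pid not in only_pids:
        return False
    # If path prefixes are set, the socket path must start with one of them.
    if prefixes and not any(path.startswith(prefix) for prefix in prefixes):
        return False
    return True
```

Rejecting an event this early is what keeps uninteresting traffic from ever reaching the fixed-size output buffer.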
Case Study: Sniffing Frida C2 Traffic
Frida is a popular “cross-platform dynamic
instrumentation toolkit” that injects a JavaScript interpreter into a target
process and uses it to run a semi-DSL
of JavaScript function hooking code. The impetus for unixdump was part of a
greater desire to answer a seemingly simple question: “How does Frida work?”
More specifically, we were looking to find out how Frida’s agent
communication protocol works. At a high-level, Frida works by attaching to a
target process using platform-specific debugging APIs (i.e. ptrace(2) on
Linux, task_for_pid()/mach_vm_*() on macosx, and
OpenProcess()/VirtualAllocEx()/WriteProcessMemory()/CreateRemoteThread()
on Windows), and then uses them to inject an “agent” that runs within the
target. This agent is what runs the instrumenting JavaScript code and performs
the lower-level operations invoked by it (e.g. function hooking, memory
reads/writes, etc.). While Frida does support non-interactively loading a
single JavaScript file, its primary mode of operation involves the use of a
“client” process that interacts with the agent running within the target.
In addition to the protocol used for direct attachment, Frida also supports
having the client connect to a frida-server instance that makes direct
attachments to targets. While we are interested in the goings-on of the direct
attachment/connection case, it is worth noting that a frida-server can be
loaded into a given process through frida-gadget,
and that the frida-server and client libraries support several protocols to
enable remote attachment to hosts over TCP and mobile devices using a
TCP-forwarding mechanism to connect to a frida-server on a USB-attached
device (e.g. ADB’s TCP forwarding for Android, and usbmuxd TCP forwarding
for iOS).
Through strace(1)-ing the client, it became clear early on that the
direct attachment communication protocol had to be transported over
Unix domain sockets with dynamically-generated paths. The problem was that
multiple such Unix domain sockets are created, and it wasn’t clear which ones
were being used. Additionally, because the Frida client is still
ptrace(2)-ing the target, we cannot simply strace(1) it, as strace(1)
uses ptrace(2) and a process can only be ptrace(2)-d by one tracer at
a time. While we could have tried to hack up a sniffer by instrumenting the
Frida client itself to hook its Unix domain socket I/O, this was not an ideal
solution for a number of reasons, and we instead tried to solve the Unix domain
socket traffic sniffing problem once and for all (on Linux at least). After we
got the MVP version of the eBPF hooks running, it quickly became obvious that
Frida uses DBus to serialize custom API
calls over Unix domain sockets. In fact, outside of specific protocols to
initialize direct connections between Frida clients, servers, and targets,
pretty much all of Frida’s communications use the DBus protocol.
Frida Agent Script Communication Protocol
Using the code in the first example defined in the
Frida documentation, we demonstrate
unixdump’s ability to intercept Unix domain socket traffic. Knowing (from
strace(1)) that Frida’s Unix domain socket paths begin with /tmp/frida, we
can instruct unixdump to filter for messages starting with that path name:
sudo unixdump -s '/tmp/frida' -b
We then proceed to start our target binary, hello, and our Frida hook script,
hook.py, passing the latter the function address from the hello process:
./hello &
./hook.py $FUNCTION_VALUE
- When Frida starts, it begins authenticating via DBus and the hooked process
begins to send Unix credentials to identify itself (in this case,
hello was run as root):
Output
==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 1
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
ancillary data sent (attempted): 1 CMSG observed
SCM_CREDENTIALS: pid=26525 uid=0(root) gid=0(root)
----
00000000: 00  .
==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 6
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 41 55 54 48 0D 0A  AUTH..
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 46
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 52 45 4A 45 43 54 45 44 20 45 58 54 45 52 4E 41  REJECTED EXTERNA
00000010: 4C 20 41 4E 4F 4E 59 4D 4F 55 53 20 44 42 55 53  L ANONYMOUS DBUS
00000020: 5F 43 4F 4F 4B 49 45 5F 53 48 41 31 0D 0A  _COOKIE_SHA1..
==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 18
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 41 55 54 48 20 45 58 54 45 52 4E 41 4C 20 33 30  AUTH EXTERNAL 30
00000010: 0D 0A  ..
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 37
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 4F 4B 20 36 37 36 39 37 34 36 38 37 35 36 32 32  OK 6769746875622
00000010: 65 36 33 36 66 36 64 32 66 36 36 37 32 36 39 36  e636f6d2f6672696
00000020: 34 36 31 0D 0A  461..
==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 19
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 4E 45 47 4F 54 49 41 54 45 5F 55 4E 49 58 5F 46  NEGOTIATE_UNIX_F
00000010: 44 0D 0A  D..
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 15
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 41 47 52 45 45 5F 55 4E 49 58 5F 46 44 0D 0A  AGREE_UNIX_FD..
==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 7
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 42 45 47 49 4E 0D 0A  BEGIN..
- Afterwards, Frida performs a GetAll request of the DBus properties:
Output
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 156
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 6C 01 00 01 24 00 00 00 01 00 00 00 68 00 00 00  l...$.......h...
00000010: 08 01 67 00 01 73 00 00 01 01 6F 00 1E 00 00 00  ..g..s....o.....
00000020: 2F 72 65 2F 66 72 69 64 61 2F 41 67 65 6E 74 53  /re/frida/AgentS
00000030: 65 73 73 69 6F 6E 50 72 6F 76 69 64 65 72 00 00  essionProvider..
00000040: 03 01 73 00 06 00 00 00 47 65 74 41 6C 6C 00 00  ..s.....GetAll..
00000050: 02 01 73 00 1F 00 00 00 6F 72 67 2E 66 72 65 65  ..s.....org.free
00000060: 64 65 73 6B 74 6F 70 2E 44 42 75 73 2E 50 72 6F  desktop.DBus.Pro
00000070: 70 65 72 74 69 65 73 00 1F 00 00 00 72 65 2E 66  perties.....re.f
00000080: 72 69 64 61 2E 41 67 65 6E 74 53 65 73 73 69 6F  rida.AgentSessio
00000090: 6E 50 72 6F 76 69 64 65 72 31 32 00  nProvider12.
==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 151
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 6C 01 00 01 1F 00 00 00 01 00 00 00 68 00 00 00  l...........h...
00000010: 08 01 67 00 01 73 00 00 01 01 6F 00 19 00 00 00  ..g..s....o.....
00000020: 2F 72 65 2F 66 72 69 64 61 2F 41 67 65 6E 74 43  /re/frida/AgentC
00000030: 6F 6E 74 72 6F 6C 6C 65 72 00 00 00 00 00 00 00  ontroller.......
00000040: 03 01 73 00 06 00 00 00 47 65 74 41 6C 6C 00 00  ..s.....GetAll..
00000050: 02 01 73 00 1F 00 00 00 6F 72 67 2E 66 72 65 65  ..s.....org.free
00000060: 64 65 73 6B 74 6F 70 2E 44 42 75 73 2E 50 72 6F  desktop.DBus.Pro
00000070: 70 65 72 74 69 65 73 00 1A 00 00 00 72 65 2E 66  perties.....re.f
00000080: 72 69 64 61 2E 41 67 65 6E 74 43 6F 6E 74 72 6F  rida.AgentContro
00000090: 6C 6C 65 72 31 32 00  ller12.
==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 48 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 02 01 01 08 00 00 00 02 00 00 00 18 00 00 00 l............... 00000010: 08 01 67 00 05 61 7B 73 76 7D 00 00 00 00 00 00 ..g..a{sv}...... 00000020: 05 01 75 00 01 00 00 00 00 00 00 00 00 00 00 00 ..u............. ==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 48 command[26527]: 'python ./hook.py 0x55bd9092f155' command[26525]: './hello' ---- 00000000: 6C 02 01 01 08 00 00 00 02 00 00 00 18 00 00 00 l............... 00000010: 08 01 67 00 05 61 7B 73 76 7D 00 00 00 00 00 00 ..g..a{sv}...... 00000020: 05 01 75 00 01 00 00 00 00 00 00 00 00 00 00 00 ..u.............
- Frida then asks the agent to open a session (the Open call) and waits for confirmation (the Opened signal) that the open succeeded:
Output
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 132 command[26527]: 'python ./hook.py 0x55bd9092f155' command[26525]: './hello' ---- 00000000: 6C 01 00 01 04 00 00 00 03 00 00 00 70 00 00 00 l...........p... 00000010: 08 01 67 00 03 28 75 29 00 00 00 00 00 00 00 00 ..g..(u)........ 00000020: 01 01 6F 00 1E 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 50 72 a/AgentSessionPr 00000040: 6F 76 69 64 65 72 00 00 03 01 73 00 04 00 00 00 ovider....s..... 00000050: 4F 70 65 6E 00 00 00 00 02 01 73 00 1F 00 00 00 Open......s..... 00000060: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000070: 73 73 69 6F 6E 50 72 6F 76 69 64 65 72 31 32 00 ssionProvider12. 00000080: 01 00 00 00 .... ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 132 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 04 01 01 04 00 00 00 03 00 00 00 70 00 00 00 l...........p... 00000010: 08 01 67 00 03 28 75 29 00 00 00 00 00 00 00 00 ..g..(u)........ 00000020: 01 01 6F 00 1E 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 50 72 a/AgentSessionPr 00000040: 6F 76 69 64 65 72 00 00 03 01 73 00 06 00 00 00 ovider....s..... 00000050: 4F 70 65 6E 65 64 00 00 02 01 73 00 1F 00 00 00 Opened....s..... 00000060: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000070: 73 73 69 6F 6E 50 72 6F 76 69 64 65 72 31 32 00 ssionProvider12. 00000080: 01 00 00 00 .... ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 32 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 02 01 01 00 00 00 00 04 00 00 00 10 00 00 00 l............... 00000010: 08 01 67 00 00 00 00 00 05 01 75 00 03 00 00 00 ..g.......u..... 
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 148 command[26527]: 'python ./hook.py 0x55bd9092f155' command[26525]: './hello' ---- 00000000: 6C 01 00 01 1C 00 00 00 04 00 00 00 68 00 00 00 l...........h... 00000010: 08 01 67 00 01 73 00 00 01 01 6F 00 18 00 00 00 ..g..s....o..... 00000020: 2F 72 65 2F 66 72 69 64 61 2F 41 67 65 6E 74 53 /re/frida/AgentS 00000030: 65 73 73 69 6F 6E 2F 31 00 00 00 00 00 00 00 00 ession/1........ 00000040: 03 01 73 00 06 00 00 00 47 65 74 41 6C 6C 00 00 ..s.....GetAll.. 00000050: 02 01 73 00 1F 00 00 00 6F 72 67 2E 66 72 65 65 ..s.....org.free 00000060: 64 65 73 6B 74 6F 70 2E 44 42 75 73 2E 50 72 6F desktop.DBus.Pro 00000070: 70 65 72 74 69 65 73 00 17 00 00 00 72 65 2E 66 perties.....re.f 00000080: 72 69 64 61 2E 41 67 65 6E 74 53 65 73 73 69 6F rida.AgentSessio 00000090: 6E 31 32 00 n12. ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 48 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 02 01 01 08 00 00 00 05 00 00 00 18 00 00 00 l............... 00000010: 08 01 67 00 05 61 7B 73 76 7D 00 00 00 00 00 00 ..g..a{sv}...... 00000020: 05 01 75 00 04 00 00 00 00 00 00 00 00 00 00 00 ..u............. - Following this, Frida creates the JavaScript to be injected:
Output
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 251 command[26527]: 'python ./hook.py 0x55bd9092f155' command[26525]: './hello' ---- 00000000: 6C 01 00 01 83 00 00 00 05 00 00 00 68 00 00 00 l...........h... 00000010: 08 01 67 00 02 73 73 00 01 01 6F 00 18 00 00 00 ..g..ss...o..... 00000020: 2F 72 65 2F 66 72 69 64 61 2F 41 67 65 6E 74 53 /re/frida/AgentS 00000030: 65 73 73 69 6F 6E 2F 31 00 00 00 00 00 00 00 00 ession/1........ 00000040: 03 01 73 00 0C 00 00 00 43 72 65 61 74 65 53 63 ..s.....CreateSc 00000050: 72 69 70 74 00 00 00 00 02 01 73 00 17 00 00 00 ript......s..... 00000060: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000070: 73 73 69 6F 6E 31 32 00 00 00 00 00 00 00 00 00 ssion12......... 00000080: 76 00 00 00 0A 49 6E 74 65 72 63 65 70 74 6F 72 v....Interceptor 00000090: 2E 61 74 74 61 63 68 28 70 74 72 28 22 39 34 32 .attach(ptr("942 000000A0: 37 32 36 36 32 37 32 39 30 34 35 22 29 2C 20 7B 72662729045"), { 000000B0: 0A 20 20 20 20 6F 6E 45 6E 74 65 72 3A 20 66 75 . onEnter: fu 000000C0: 6E 63 74 69 6F 6E 28 61 72 67 73 29 20 7B 0A 20 nction(args) {. 000000D0: 20 20 20 20 20 20 20 73 65 6E 64 28 61 72 67 73 send(args 000000E0: 5B 30 5D 2E 74 6F 49 6E 74 33 32 28 29 29 3B 0A [0].toInt32());. 000000F0: 20 20 20 20 7D 0A 7D 29 3B 0A 00 }.});.. ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 44 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 02 01 01 04 00 00 00 06 00 00 00 18 00 00 00 l............... 00000010: 08 01 67 00 03 28 75 29 00 00 00 00 00 00 00 00 ..g..(u)........ 00000020: 05 01 75 00 05 00 00 00 01 00 00 00 ..u......... - Frida then signals the agent to load the script:
Output
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 132 command[26527]: 'python ./hook.py 0x55bd9092f155' command[26525]: './hello' ---- 00000000: 6C 01 00 01 04 00 00 00 06 00 00 00 70 00 00 00 l...........p... 00000010: 08 01 67 00 03 28 75 29 00 00 00 00 00 00 00 00 ..g..(u)........ 00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 0A 00 00 00 ..........s..... 00000050: 4C 6F 61 64 53 63 72 69 70 74 00 00 00 00 00 00 LoadScript...... 00000060: 02 01 73 00 17 00 00 00 72 65 2E 66 72 69 64 61 ..s.....re.frida 00000070: 2E 41 67 65 6E 74 53 65 73 73 69 6F 6E 31 32 00 .AgentSession12. 00000080: 01 00 00 00 .... ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 32 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 02 01 01 00 00 00 00 07 00 00 00 10 00 00 00 l............... 00000010: 08 01 67 00 00 00 00 00 05 01 75 00 06 00 00 00 ..g.......u..... - The injected script, when run, returns the requested data through the socket:
Output
==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 180 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 04 01 01 2C 00 00 00 08 00 00 00 78 00 00 00 l...,.......x... 00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00 ..g..(u)sbay.... 00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 11 00 00 00 ..........s..... 00000050: 4D 65 73 73 61 67 65 46 72 6F 6D 53 63 72 69 70 MessageFromScrip 00000060: 74 00 00 00 00 00 00 00 02 01 73 00 17 00 00 00 t.........s..... 00000070: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000080: 73 73 69 6F 6E 31 32 00 01 00 00 00 1B 00 00 00 ssion12......... 00000090: 7B 22 74 79 70 65 22 3A 22 73 65 6E 64 22 2C 22 {"type":"send"," 000000A0: 70 61 79 6C 6F 61 64 22 3A 31 7D 00 00 00 00 00 payload":1}..... 000000B0: 00 00 00 00 .... ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 180 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 04 01 01 2C 00 00 00 09 00 00 00 78 00 00 00 l...,.......x... 00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00 ..g..(u)sbay.... 00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 11 00 00 00 ..........s..... 00000050: 4D 65 73 73 61 67 65 46 72 6F 6D 53 63 72 69 70 MessageFromScrip 00000060: 74 00 00 00 00 00 00 00 02 01 73 00 17 00 00 00 t.........s..... 00000070: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000080: 73 73 69 6F 6E 31 32 00 01 00 00 00 1B 00 00 00 ssion12......... 
00000090: 7B 22 74 79 70 65 22 3A 22 73 65 6E 64 22 2C 22 {"type":"send"," 000000A0: 70 61 79 6C 6F 61 64 22 3A 32 7D 00 00 00 00 00 payload":2}..... 000000B0: 00 00 00 00 .... ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 180 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 04 01 01 2C 00 00 00 0A 00 00 00 78 00 00 00 l...,.......x... 00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00 ..g..(u)sbay.... 00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 11 00 00 00 ..........s..... 00000050: 4D 65 73 73 61 67 65 46 72 6F 6D 53 63 72 69 70 MessageFromScrip 00000060: 74 00 00 00 00 00 00 00 02 01 73 00 17 00 00 00 t.........s..... 00000070: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000080: 73 73 69 6F 6E 31 32 00 01 00 00 00 1B 00 00 00 ssion12......... 00000090: 7B 22 74 79 70 65 22 3A 22 73 65 6E 64 22 2C 22 {"type":"send"," 000000A0: 70 61 79 6C 6F 61 64 22 3A 33 7D 00 00 00 00 00 payload":3}..... 000000B0: 00 00 00 00 .... ---snip--- - This message repeats with the payload incrementing by
1, as specified in the example code. When the user is done using Frida, Frida instructs the agent to unload the injected script:
Output
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 132 command[26527]: 'python ./hook.py 0x55bd9092f155' command[26525]: './hello' ---- 00000000: 6C 01 00 01 04 00 00 00 07 00 00 00 70 00 00 00 l...........p... 00000010: 08 01 67 00 03 28 75 29 00 00 00 00 00 00 00 00 ..g..(u)........ 00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 0D 00 00 00 ..........s..... 00000050: 44 65 73 74 72 6F 79 53 63 72 69 70 74 00 00 00 DestroyScript... 00000060: 02 01 73 00 17 00 00 00 72 65 2E 66 72 69 64 61 ..s.....re.frida 00000070: 2E 41 67 65 6E 74 53 65 73 73 69 6F 6E 31 32 00 .AgentSession12. 00000080: 01 00 00 00 .... ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 32 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 02 01 01 00 00 00 00 0B 00 00 00 10 00 00 00 l............... 00000010: 08 01 67 00 00 00 00 00 05 01 75 00 07 00 00 00 ..g.......u..... - The injected script is then closed by Frida:
Output
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 112 command[26527]: 'python ./hook.py 0x55bd9092f155' command[26525]: './hello' ---- 00000000: 6C 01 00 01 00 00 00 00 08 00 00 00 60 00 00 00 l...........`... 00000010: 08 01 67 00 00 00 00 00 01 01 6F 00 18 00 00 00 ..g.......o..... 00000020: 2F 72 65 2F 66 72 69 64 61 2F 41 67 65 6E 74 53 /re/frida/AgentS 00000030: 65 73 73 69 6F 6E 2F 31 00 00 00 00 00 00 00 00 ession/1........ 00000040: 03 01 73 00 05 00 00 00 43 6C 6F 73 65 00 00 00 ..s.....Close... 00000050: 02 01 73 00 17 00 00 00 72 65 2E 66 72 69 64 61 ..s.....re.frida 00000060: 2E 41 67 65 6E 74 53 65 73 73 69 6F 6E 31 32 00 .AgentSession12. ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 132 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 04 01 01 04 00 00 00 0C 00 00 00 70 00 00 00 l...........p... 00000010: 08 01 67 00 03 28 75 29 00 00 00 00 00 00 00 00 ..g..(u)........ 00000020: 01 01 6F 00 1E 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 50 72 a/AgentSessionPr 00000040: 6F 76 69 64 65 72 00 00 03 01 73 00 06 00 00 00 ovider....s..... 00000050: 43 6C 6F 73 65 64 00 00 02 01 73 00 1F 00 00 00 Closed....s..... 00000060: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000070: 73 73 69 6F 6E 50 72 6F 76 69 64 65 72 31 32 00 ssionProvider12. 00000080: 01 00 00 00 .... ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 32 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 02 01 01 00 00 00 00 0D 00 00 00 10 00 00 00 l............... 00000010: 08 01 67 00 00 00 00 00 05 01 75 00 08 00 00 00 ..g.......u..... - And, for the final step, Frida instructs the injected agent to unload:
Output
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 120 command[26527]: 'python ./hook.py 0x55bd9092f155' command[26525]: './hello' ---- 00000000: 6C 01 00 01 00 00 00 00 09 00 00 00 68 00 00 00 l...........h... 00000010: 08 01 67 00 00 00 00 00 01 01 6F 00 1E 00 00 00 ..g.......o..... 00000020: 2F 72 65 2F 66 72 69 64 61 2F 41 67 65 6E 74 53 /re/frida/AgentS 00000030: 65 73 73 69 6F 6E 50 72 6F 76 69 64 65 72 00 00 essionProvider.. 00000040: 03 01 73 00 06 00 00 00 55 6E 6C 6F 61 64 00 00 ..s.....Unload.. 00000050: 02 01 73 00 1F 00 00 00 72 65 2E 66 72 69 64 61 ..s.....re.frida 00000060: 2E 41 67 65 6E 74 53 65 73 73 69 6F 6E 50 72 6F .AgentSessionPro 00000070: 76 69 64 65 72 31 32 00 vider12. ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 32 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 02 01 01 00 00 00 00 0E 00 00 00 10 00 00 00 l............... 00000010: 08 01 67 00 00 00 00 00 05 01 75 00 09 00 00 00 ..g.......u.....
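Every message in the capture above begins with the same 16-byte fixed D-Bus header, which is why each dump starts with 6C ('l', little-endian). As a reading aid, here is a minimal Python sketch that decodes that fixed header and computes the full message size from it. It is not part of unixdump; the names parse_fixed_header and total_length are our own, and the sample bytes are copied from the first row of the final Unload call and its reply.

```python
import struct

# Message type codes as defined by the D-Bus wire protocol.
MESSAGE_TYPES = {1: "METHOD_CALL", 2: "METHOD_RETURN", 3: "ERROR", 4: "SIGNAL"}


def parse_fixed_header(data: bytes) -> dict:
    """Decode the 16-byte fixed portion of a D-Bus message header."""
    if len(data) < 16:
        raise ValueError("need at least 16 bytes")
    # Byte 0 selects the byte order: 'l' is little-endian, 'B' is big-endian.
    order = {ord("l"): "<", ord("B"): ">"}.get(data[0])
    if order is None:
        raise ValueError("not a D-Bus message")
    # Three single bytes, then three uint32 values, starting at offset 1.
    msg_type, flags, version, body_len, serial, fields_len = struct.unpack_from(
        order + "BBBIII", data, 1
    )
    return {
        "type": MESSAGE_TYPES.get(msg_type, "UNKNOWN"),
        "flags": flags,
        "version": version,
        "body_len": body_len,
        "serial": serial,
        "fields_len": fields_len,
    }


def total_length(header: dict) -> int:
    """Size of the whole message: fixed header, padded header fields, body."""
    # The header field array is padded to an 8-byte boundary before the body.
    padded_fields = (header["fields_len"] + 7) & ~7
    return 16 + padded_fields + header["body_len"]


# First 16 bytes of the Unload call and of its reply, copied from the capture.
UNLOAD_CALL = bytes.fromhex("6C 01 00 01 00 00 00 00 09 00 00 00 68 00 00 00")
UNLOAD_REPLY = bytes.fromhex("6C 02 01 01 00 00 00 00 0E 00 00 00 10 00 00 00")

for raw in (UNLOAD_CALL, UNLOAD_REPLY):
    header = parse_fixed_header(raw)
    print(header["type"], "serial", header["serial"], "total", total_length(header))
```

The computed totals (120 and 32 bytes) agree with the length values that the tool printed for these two records, which is a quick sanity check that each record holds exactly one D-Bus message.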
Frida CLI Tab Completion Protocol
The Frida command line tool has a tab completion-based prompt that allows quick access to all of its features. We will examine the communications that occur while Frida is performing a tab complete operation.
- Frida starts the interaction by sending a PostToScript command to the injected script. The script sent calls Object.getOwnPropertyNames() on the this object:
Output
==== STREAM PID 26851.0xffff8cd6a8a25900 (S) > 26847.0xffff8cd62e7e9100 (C), length 220 command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847' command[26847]: './hello' ---- 00000000: 6C 01 00 01 5C 00 00 00 0B 00 00 00 70 00 00 00 l...\.......p... 00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00 ..g..(u)sbay.... 00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 0C 00 00 00 ..........s..... 00000050: 50 6F 73 74 54 6F 53 63 72 69 70 74 00 00 00 00 PostToScript.... 00000060: 02 01 73 00 17 00 00 00 72 65 2E 66 72 69 64 61 ..s.....re.frida 00000070: 2E 41 67 65 6E 74 53 65 73 73 69 6F 6E 31 32 00 .AgentSession12. 00000080: 01 00 00 00 4A 00 00 00 5B 22 66 72 69 64 61 3A ....J...["frida: 00000090: 72 70 63 22 2C 20 35 2C 20 22 63 61 6C 6C 22 2C rpc", 5, "call", 000000A0: 20 22 65 76 61 6C 75 61 74 65 22 2C 20 5B 22 4F "evaluate", ["O 000000B0: 62 6A 65 63 74 2E 67 65 74 4F 77 6E 50 72 6F 70 bject.getOwnProp 000000C0: 65 72 74 79 4E 61 6D 65 73 28 74 68 69 73 29 22 ertyNames(this)" 000000D0: 5D 5D 00 00 00 00 00 00 00 00 00 00 ]].......... ==== STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 32 command[26847]: './hello' command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847' ---- 00000000: 6C 02 01 01 00 00 00 00 10 00 00 00 10 00 00 00 l............... 00000010: 08 01 67 00 00 00 00 00 05 01 75 00 0B 00 00 00 ..g.......u..... - This causes the Frida agent within the target process to evaluate the script
and return all property names of the this object, which are the possible actions and available commands we are attempting to tab complete:
Output
==== STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 1736 command[26847]: './hello' command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847' ---- 00000000: 6C 04 01 01 40 06 00 00 11 00 00 00 78 00 00 00 l...@.......x... 00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00 ..g..(u)sbay.... 00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 11 00 00 00 ..........s..... 00000050: 4D 65 73 73 61 67 65 46 72 6F 6D 53 63 72 69 70 MessageFromScrip 00000060: 74 00 00 00 00 00 00 00 02 01 73 00 17 00 00 00 t.........s..... 00000070: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000080: 73 73 69 6F 6E 31 32 00 01 00 00 00 2C 06 00 00 ssion12.....,... 00000090: 7B 22 74 79 70 65 22 3A 22 73 65 6E 64 22 2C 22 {"type":"send"," 000000A0: 70 61 79 6C 6F 61 64 22 3A 5B 22 66 72 69 64 61 payload":["frida 000000B0: 3A 72 70 63 22 2C 35 2C 22 6F 6B 22 2C 5B 22 6F :rpc",5,"ok",["o 000000C0: 62 6A 65 63 74 22 2C 5B 22 4E 61 4E 22 2C 22 49 bject",["NaN","I 000000D0: 6E 66 69 6E 69 74 79 22 2C 22 75 6E 64 65 66 69 nfinity","undefi 000000E0: 6E 65 64 22 2C 22 4F 62 6A 65 63 74 22 2C 22 46 ned","Object","F 000000F0: 75 6E 63 74 69 6F 6E 22 2C 22 41 72 72 61 79 22 unction","Array" 00000100: 2C 22 53 74 72 69 6E 67 22 2C 22 42 6F 6F 6C 65 ,"String","Boole 00000110: 61 6E 22 2C 22 4E 75 6D 62 65 72 22 2C 22 44 61 an","Number","Da 00000120: 74 65 22 2C 22 52 65 67 45 78 70 22 2C 22 45 72 te","RegExp","Er 00000130: 72 6F 72 22 2C 22 45 76 61 6C 45 72 72 6F 72 22 ror","EvalError" 00000140: 2C 22 52 61 6E 67 65 45 72 72 6F 72 22 2C 22 52 ,"RangeError","R 00000150: 65 66 65 72 65 6E 63 65 45 72 72 6F 72 22 2C 22 eferenceError"," 00000160: 53 79 6E 74 61 78 45 72 72 6F 72 22 2C 22 54 79 SyntaxError","Ty 00000170: 70 65 45 72 72 6F 72 22 2C 22 55 52 49 45 72 72 peError","URIErr 00000180: 6F 72 
22 2C 22 4D 61 74 68 22 2C 22 4A 53 4F 4E or","Math","JSON 00000190: 22 2C 22 44 75 6B 74 61 70 65 22 2C 22 50 72 6F ","Duktape","Pro 000001A0: 78 79 22 2C 22 52 65 66 6C 65 63 74 22 2C 22 42 xy","Reflect","B 000001B0: 75 66 66 65 72 22 2C 22 41 72 72 61 79 42 75 66 uffer","ArrayBuf 000001C0: 66 65 72 22 2C 22 44 61 74 61 56 69 65 77 22 2C fer","DataView", 000001D0: 22 49 6E 74 38 41 72 72 61 79 22 2C 22 55 69 6E "Int8Array","Uin 000001E0: 74 38 41 72 72 61 79 22 2C 22 55 69 6E 74 38 43 t8Array","Uint8C 000001F0: 6C 61 6D 70 65 64 41 72 72 61 79 22 2C 22 49 6E lampedArray","In 00000200: 74 31 36 41 72 72 61 79 22 2C 22 55 69 6E 74 31 t16Array","Uint1 00000210: 36 41 72 72 61 79 22 2C 22 49 6E 74 33 32 41 72 6Array","Int32Ar 00000220: 72 61 79 22 2C 22 55 69 6E 74 33 32 41 72 72 61 ray","Uint32Arra 00000230: 79 22 2C 22 46 6C 6F 61 74 33 32 41 72 72 61 79 y","Float32Array 00000240: 22 2C 22 46 6C 6F 61 74 36 34 41 72 72 61 79 22 ","Float64Array" 00000250: 2C 22 70 61 72 73 65 49 6E 74 22 2C 22 70 61 72 ,"parseInt","par 00000260: 73 65 46 6C 6F 61 74 22 2C 22 54 65 78 74 45 6E seFloat","TextEn 00000270: 63 6F 64 65 72 22 2C 22 54 65 78 74 44 65 63 6F coder","TextDeco 00000280: 64 65 72 22 2C 22 70 65 72 66 6F 72 6D 61 6E 63 der","performanc 00000290: 65 22 2C 22 65 76 61 6C 22 2C 22 69 73 4E 61 4E e","eval","isNaN 000002A0: 22 2C 22 69 73 46 69 6E 69 74 65 22 2C 22 64 65 ","isFinite","de 000002B0: 63 6F 64 65 55 52 49 22 2C 22 64 65 63 6F 64 65 codeURI","decode 000002C0: 55 52 49 43 6F 6D 70 6F 6E 65 6E 74 22 2C 22 65 URIComponent","e 000002D0: 6E 63 6F 64 65 55 52 49 22 2C 22 65 6E 63 6F 64 ncodeURI","encod 000002E0: 65 55 52 49 43 6F 6D 70 6F 6E 65 6E 74 22 2C 22 eURIComponent"," 000002F0: 65 73 63 61 70 65 22 2C 22 75 6E 65 73 63 61 70 escape","unescap 00000300: 65 22 2C 22 67 6C 6F 62 61 6C 22 2C 22 46 72 69 e","global","Fri 00000310: 64 61 22 2C 22 53 63 72 69 70 74 22 2C 22 57 65 da","Script","We 00000320: 61 6B 52 65 66 22 2C 22 5F 73 65 74 54 69 6D 65 
akRef","_setTime 00000330: 6F 75 74 22 2C 22 5F 73 65 74 49 6E 74 65 72 76 out","_setInterv 00000340: 61 6C 22 2C 22 63 6C 65 61 72 54 69 6D 65 6F 75 al","clearTimeou 00000350: 74 22 2C 22 63 6C 65 61 72 49 6E 74 65 72 76 61 t","clearInterva 00000360: 6C 22 2C 22 67 63 22 2C 22 5F 73 65 6E 64 22 2C l","gc","_send", 00000370: 22 5F 73 65 74 55 6E 68 61 6E 64 6C 65 64 45 78 "_setUnhandledEx 00000380: 63 65 70 74 69 6F 6E 43 61 6C 6C 62 61 63 6B 22 ceptionCallback" 00000390: 2C 22 5F 73 65 74 49 6E 63 6F 6D 69 6E 67 4D 65 ,"_setIncomingMe 000003A0: 73 73 61 67 65 43 61 6C 6C 62 61 63 6B 22 2C 22 ssageCallback"," 000003B0: 5F 77 61 69 74 46 6F 72 45 76 65 6E 74 22 2C 22 _waitForEvent"," 000003C0: 49 6E 74 36 34 22 2C 22 55 49 6E 74 36 34 22 2C Int64","UInt64", 000003D0: 22 4E 61 74 69 76 65 50 6F 69 6E 74 65 72 22 2C "NativePointer", 000003E0: 22 4E 61 74 69 76 65 52 65 73 6F 75 72 63 65 22 "NativeResource" 000003F0: 2C 22 4E 61 74 69 76 65 46 75 6E 63 74 69 6F 6E ,"NativeFunction 00000400: 22 2C 22 53 79 73 74 65 6D 46 75 6E 63 74 69 6F ","SystemFunctio 00000410: 6E 22 2C 22 4E 61 74 69 76 65 43 61 6C 6C 62 61 n","NativeCallba 00000420: 63 6B 22 2C 22 43 70 75 43 6F 6E 74 65 78 74 22 ck","CpuContext" 00000430: 2C 22 53 6F 75 72 63 65 4D 61 70 22 2C 22 4B 65 ,"SourceMap","Ke 00000440: 72 6E 65 6C 22 2C 22 4D 65 6D 6F 72 79 22 2C 22 rnel","Memory"," 00000450: 4D 65 6D 6F 72 79 41 63 63 65 73 73 4D 6F 6E 69 MemoryAccessMoni 00000460: 74 6F 72 22 2C 22 50 72 6F 63 65 73 73 22 2C 22 tor","Process"," 00000470: 54 68 72 65 61 64 22 2C 22 42 61 63 6B 74 72 61 Thread","Backtra 00000480: 63 65 72 22 2C 22 4D 6F 64 75 6C 65 22 2C 22 4D cer","Module","M 00000490: 6F 64 75 6C 65 4D 61 70 22 2C 22 46 69 6C 65 22 oduleMap","File" 000004A0: 2C 22 49 4F 53 74 72 65 61 6D 22 2C 22 49 6E 70 ,"IOStream","Inp 000004B0: 75 74 53 74 72 65 61 6D 22 2C 22 4F 75 74 70 75 utStream","Outpu 000004C0: 74 53 74 72 65 61 6D 22 2C 22 55 6E 69 78 49 6E tStream","UnixIn 000004D0: 70 75 74 53 74 72 65 
61 6D 22 2C 22 55 6E 69 78 putStream","Unix 000004E0: 4F 75 74 70 75 74 53 74 72 65 61 6D 22 2C 22 53 OutputStream","S 000004F0: 6F 63 6B 65 74 22 2C 22 53 6F 63 6B 65 74 4C 69 ocket","SocketLi 00000500: 73 74 65 6E 65 72 22 2C 22 53 6F 63 6B 65 74 43 stener","SocketC 00000510: 6F 6E 6E 65 63 74 69 6F 6E 22 2C 22 53 71 6C 69 onnection","Sqli 00000520: 74 65 44 61 74 61 62 61 73 65 22 2C 22 53 71 6C teDatabase","Sql 00000530: 69 74 65 53 74 61 74 65 6D 65 6E 74 22 2C 22 49 iteStatement","I 00000540: 6E 74 65 72 63 65 70 74 6F 72 22 2C 22 49 6E 76 nterceptor","Inv 00000550: 6F 63 61 74 69 6F 6E 4C 69 73 74 65 6E 65 72 22 ocationListener" 00000560: 2C 22 49 6E 76 6F 63 61 74 69 6F 6E 43 6F 6E 74 ,"InvocationCont 00000570: 65 78 74 22 2C 22 49 6E 76 6F 63 61 74 69 6F 6E ext","Invocation 00000580: 41 72 67 73 22 2C 22 49 6E 76 6F 63 61 74 69 6F Args","Invocatio 00000590: 6E 52 65 74 75 72 6E 56 61 6C 75 65 22 2C 22 41 nReturnValue","A 000005A0: 70 69 52 65 73 6F 6C 76 65 72 22 2C 22 44 65 62 piResolver","Deb 000005B0: 75 67 53 79 6D 62 6F 6C 22 2C 22 49 6E 73 74 72 ugSymbol","Instr 000005C0: 75 63 74 69 6F 6E 22 2C 22 58 38 36 57 72 69 74 uction","X86Writ 000005D0: 65 72 22 2C 22 58 38 36 52 65 6C 6F 63 61 74 6F er","X86Relocato 000005E0: 72 22 2C 22 53 74 61 6C 6B 65 72 22 2C 22 53 74 r","Stalker","St 000005F0: 61 6C 6B 65 72 49 74 65 72 61 74 6F 72 22 2C 22 alkerIterator"," 00000600: 50 72 6F 62 65 41 72 67 73 22 2C 22 5F 5F 63 6F ProbeArgs","__co 00000610: 72 65 2D 6A 73 5F 73 68 61 72 65 64 5F 5F 22 2C re-js_shared__", 00000620: 22 50 72 6F 6D 69 73 65 22 2C 22 72 70 63 22 2C "Promise","rpc", 00000630: 22 72 65 63 76 22 2C 22 73 65 6E 64 22 2C 22 73 "recv","send","s 00000640: 65 74 54 69 6D 65 6F 75 74 22 2C 22 73 65 74 49 etTimeout","setI 00000650: 6E 74 65 72 76 61 6C 22 2C 22 73 65 74 49 6D 6D nterval","setImm 00000660: 65 64 69 61 74 65 22 2C 22 63 6C 65 61 72 49 6D ediate","clearIm 00000670: 6D 65 64 69 61 74 65 22 2C 22 69 6E 74 36 34 22 mediate","int64" 
00000680: 2C 22 75 69 6E 74 36 34 22 2C 22 70 74 72 22 2C ,"uint64","ptr", 00000690: 22 4E 55 4C 4C 22 2C 22 63 6F 6E 73 6F 6C 65 22 "NULL","console" 000006A0: 2C 22 68 65 78 64 75 6D 70 22 2C 22 4F 62 6A 43 ,"hexdump","ObjC 000006B0: 22 2C 22 4A 61 76 61 22 5D 5D 5D 7D 00 00 00 00 ","Java"]]]}.... 000006C0: 00 00 00 00 00 00 00 00 ........ - Seeing this list, we begin to type out
File. and hit tab to see our options. Object.getOwnPropertyNames() is called again, but now it is called on File. This returns the following attributes: prototype, length, and name:
Output
==== STREAM PID 26851.0xffff8cd6a8a25900 (S) > 26847.0xffff8cd62e7e9100 (C), length 1220 command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847' command[26847]: './hello' ---- 00000000: 6C 01 00 01 44 04 00 00 0C 00 00 00 70 00 00 00 l...D.......p... 00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00 ..g..(u)sbay.... 00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 0C 00 00 00 ..........s..... 00000050: 50 6F 73 74 54 6F 53 63 72 69 70 74 00 00 00 00 PostToScript.... 00000060: 02 01 73 00 17 00 00 00 72 65 2E 66 72 69 64 61 ..s.....re.frida 00000070: 2E 41 67 65 6E 74 53 65 73 73 69 6F 6E 31 32 00 .AgentSession12. 00000080: 01 00 00 00 30 04 00 00 5B 22 66 72 69 64 61 3A ....0...["frida: 00000090: 72 70 63 22 2C 20 36 2C 20 22 63 61 6C 6C 22 2C rpc", 6, "call", 000000A0: 20 22 65 76 61 6C 75 61 74 65 22 2C 20 5B 22 74 "evaluate", ["t 000000B0: 72 79 20 7B 5C 6E 20 20 20 20 20 20 20 20 20 20 ry {\n 000000C0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000000D0: 20 20 20 20 20 20 20 20 20 20 28 66 75 6E 63 74 (funct 000000E0: 69 6F 6E 20 28 6F 29 20 7B 5C 6E 20 20 20 20 20 ion (o) {\n 000000F0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000100: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000110: 20 20 20 5C 22 75 73 65 20 73 74 72 69 63 74 5C \"use strict\ 00000120: 22 3B 5C 6E 20 20 20 20 20 20 20 20 20 20 20 20 ";\n 00000130: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000140: 20 20 20 20 20 20 20 20 20 20 20 20 76 61 72 20 var 00000150: 6B 20 3D 20 4F 62 6A 65 63 74 2E 67 65 74 4F 77 k = Object.getOw 00000160: 6E 50 72 6F 70 65 72 74 79 4E 61 6D 65 73 28 6F nPropertyNames(o 00000170: 29 3B 5C 6E 20 20 20 20 20 20 20 20 20 20 20 20 );\n 00000180: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000190: 20 20 20 20 20 20 20 20 20 20 20 20 69 66 20 28 if ( 000001A0: 6F 20 21 3D 3D 20 
6E 75 6C 6C 20 26 26 20 6F 20 o !== null && o 000001B0: 21 3D 3D 20 75 6E 64 65 66 69 6E 65 64 29 20 7B !== undefined) { 000001C0: 5C 6E 20 20 20 20 20 20 20 20 20 20 20 20 20 20 \n 000001D0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000001E0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 76 61 va 000001F0: 72 20 70 3B 5C 6E 20 20 20 20 20 20 20 20 20 20 r p;\n 00000200: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000210: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000220: 20 20 69 66 20 28 74 79 70 65 6F 66 20 6F 20 21 if (typeof o ! 00000230: 3D 3D 20 27 6F 62 6A 65 63 74 27 29 5C 6E 20 20 == 'object')\n 00000240: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000250: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000260: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 70 20 p 00000270: 3D 20 6F 2E 5F 5F 70 72 6F 74 6F 5F 5F 3B 5C 6E = o.__proto__;\n 00000280: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000290: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000002A0: 20 20 20 20 20 20 20 20 20 20 20 20 65 6C 73 65 else 000002B0: 5C 6E 20 20 20 20 20 20 20 20 20 20 20 20 20 20 \n 000002C0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000002D0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000002E0: 20 20 70 20 3D 20 4F 62 6A 65 63 74 2E 67 65 74 p = Object.get 000002F0: 50 72 6F 74 6F 74 79 70 65 4F 66 28 6F 29 3B 5C PrototypeOf(o);\ 00000300: 6E 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 n 00000310: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000320: 20 20 20 20 20 20 20 20 20 20 20 20 20 69 66 20 if 00000330: 28 70 20 21 3D 3D 20 6E 75 6C 6C 20 26 26 20 70 (p !== null && p 00000340: 20 21 3D 3D 20 75 6E 64 65 66 69 6E 65 64 29 5C !== undefined)\ 00000350: 6E 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 n 00000360: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000370: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000380: 20 6B 20 3D 20 6B 2E 63 6F 6E 63 61 74 28 4F 62 k = k.concat(Ob 00000390: 6A 65 63 74 2E 67 65 74 4F 
77 6E 50 72 6F 70 65 ject.getOwnPrope 000003A0: 72 74 79 4E 61 6D 65 73 28 70 29 29 3B 5C 6E 20 rtyNames(p));\n 000003B0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000003C0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000003D0: 20 20 20 20 20 20 20 7D 5C 6E 20 20 20 20 20 20 }\n 000003E0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000003F0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000400: 20 20 72 65 74 75 72 6E 20 6B 3B 5C 6E 20 20 20 return k;\n 00000410: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000420: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000430: 20 7D 29 28 46 69 6C 65 29 3B 5C 6E 20 20 20 20 })(File);\n 00000440: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000450: 20 20 20 20 20 20 20 20 20 20 20 20 7D 20 63 61 } ca 00000460: 74 63 68 20 28 65 29 20 7B 5C 6E 20 20 20 20 20 tch (e) {\n 00000470: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000480: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 5B [ 00000490: 5D 3B 5C 6E 20 20 20 20 20 20 20 20 20 20 20 20 ];\n 000004A0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000004B0: 20 20 20 20 7D 22 5D 5D 00 00 00 00 00 00 00 00 }"]]........ 000004C0: 00 00 00 00 .... ==== STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 32 command[26847]: './hello' command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847' ---- 00000000: 6C 02 01 01 00 00 00 00 12 00 00 00 10 00 00 00 l............... 00000010: 08 01 67 00 00 00 00 00 05 01 75 00 0C 00 00 00 ..g.......u..... ==== STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 240 command[26847]: './hello' command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847' ---- 00000000: 6C 04 01 01 68 00 00 00 13 00 00 00 78 00 00 00 l...h.......x... 00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00 ..g..(u)sbay.... 
00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 11 00 00 00 ..........s..... 00000050: 4D 65 73 73 61 67 65 46 72 6F 6D 53 63 72 69 70 MessageFromScrip 00000060: 74 00 00 00 00 00 00 00 02 01 73 00 17 00 00 00 t.........s..... 00000070: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000080: 73 73 69 6F 6E 31 32 00 01 00 00 00 57 00 00 00 ssion12.....W... 00000090: 7B 22 74 79 70 65 22 3A 22 73 65 6E 64 22 2C 22 {"type":"send"," 000000A0: 70 61 79 6C 6F 61 64 22 3A 5B 22 66 72 69 64 61 payload":["frida 000000B0: 3A 72 70 63 22 2C 36 2C 22 6F 6B 22 2C 5B 22 6F :rpc",6,"ok",["o 000000C0: 62 6A 65 63 74 22 2C 5B 22 70 72 6F 74 6F 74 79 bject",["prototy 000000D0: 70 65 22 2C 22 6C 65 6E 67 74 68 22 2C 22 6E 61 pe","length","na 000000E0: 6D 65 22 5D 5D 5D 7D 00 00 00 00 00 00 00 00 00 me"]]]}......... - Back in the UI, we tab cycle to the
length attribute and hit enter on File.length. This tells the injected script to call evaluate on File.length. The script responds with an array indicating that the type of the evaluated expression is "number" and the value is 2:
Output
==== STREAM PID 26851.0xffff8cd6a8a25900 (S) > 26847.0xffff8cd62e7e9100 (C), length 200
command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847'
command[26847]: './hello'
----
00000000: 6C 01 00 01 48 00 00 00 0D 00 00 00 70 00 00 00  l...H.......p...
00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00  ..g..(u)sbay....
00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64  ..o...../re/frid
00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31  a/AgentSession/1
00000040: 00 00 00 00 00 00 00 00 03 01 73 00 0C 00 00 00  ..........s.....
00000050: 50 6F 73 74 54 6F 53 63 72 69 70 74 00 00 00 00  PostToScript....
00000060: 02 01 73 00 17 00 00 00 72 65 2E 66 72 69 64 61  ..s.....re.frida
00000070: 2E 41 67 65 6E 74 53 65 73 73 69 6F 6E 31 32 00  .AgentSession12.
00000080: 01 00 00 00 35 00 00 00 5B 22 66 72 69 64 61 3A  ....5...["frida:
00000090: 72 70 63 22 2C 20 37 2C 20 22 63 61 6C 6C 22 2C  rpc", 7, "call",
000000A0: 20 22 65 76 61 6C 75 61 74 65 22 2C 20 5B 22 46   "evaluate", ["F
000000B0: 69 6C 65 2E 6C 65 6E 67 74 68 22 5D 5D 00 00 00  ile.length"]]...
000000C0: 00 00 00 00 00 00 00 00  ........
==== STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 32
command[26847]: './hello'
command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847'
----
00000000: 6C 02 01 01 00 00 00 00 14 00 00 00 10 00 00 00  l...............
00000010: 08 01 67 00 00 00 00 00 05 01 75 00 0D 00 00 00  ..g.......u.....
==== STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 212
command[26847]: './hello'
command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847'
----
00000000: 6C 04 01 01 4C 00 00 00 15 00 00 00 78 00 00 00  l...L.......x...
00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00  ..g..(u)sbay....
00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64  ..o...../re/frid
00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31  a/AgentSession/1
00000040: 00 00 00 00 00 00 00 00 03 01 73 00 11 00 00 00  ..........s.....
00000050: 4D 65 73 73 61 67 65 46 72 6F 6D 53 63 72 69 70  MessageFromScrip
00000060: 74 00 00 00 00 00 00 00 02 01 73 00 17 00 00 00  t.........s.....
00000070: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65  re.frida.AgentSe
00000080: 73 73 69 6F 6E 31 32 00 01 00 00 00 3B 00 00 00  ssion12.....;...
00000090: 7B 22 74 79 70 65 22 3A 22 73 65 6E 64 22 2C 22  {"type":"send","
000000A0: 70 61 79 6C 6F 61 64 22 3A 5B 22 66 72 69 64 61  payload":["frida
000000B0: 3A 72 70 63 22 2C 37 2C 22 6F 6B 22 2C 5B 22 6E  :rpc",7,"ok",["n
000000C0: 75 6D 62 65 72 22 2C 32 5D 5D 7D 00 00 00 00 00  umber",2]]}.....
000000D0: 00 00 00 00  ....
eBPF Coding Tricks
While writing unixdump, we spent an inordinate amount of time attempting to
please the eBPF bytecode validator with code constructs it would accept. Most
of the time, it would not like idiomatic code that was correct; this was
seemingly due to compiler optimizations used by the BCC toolchain. Regardless,
we often had to obscure our code in ways that would enable it to pass
inspection, and, as a result, the code likely performs worse than if the
validator worked correctly in the first place. Additionally, as some of the
data structures we needed to parse are dynamically sized and based on dynamic
offsets, we had to write (or generate) inline code to parse them directly
without loops or recursion. And then there are the generic eBPF hoops that need
to be jumped through on a regular basis.
eBPF Gotchas
No Loops, No Jumper Cables
eBPF doesn’t like loops, that much is clear; but we often still need to perform
such operations. Abusing bpf_probe_read, the eBPF memcpy-alike, will only
get one so far, especially if one needs to zero out a struct. In practice,
short statically-bounded loops will be unrolled by the compiler and work, but
longer loops will not be unrolled and so will not work. However, it is simple
to unroll loops with statically-known bounds using compiler pragmas:
#pragma unroll
for (size_t i = 0; i < 30; i++) {
    arr[i] = arr[i] + 1;
}
This is a fairly useful construct that can be ruthlessly applied to a number of different problems.
Uninitialized Memory
One of the things to be careful about with eBPF is that when attempting to
copy data from the eBPF stack elsewhere, if any uninitialized memory would
be copied, the validator will error with offending stack offsets that are
entirely unhelpful. Usually, this is the result of having padding between
fields in your structs. A simple way of handling this is to use an unrolled
loop akin to memset-ing zero; where possible, such code will be optimized to
use 8-byte writes. However, this is computationally wasteful. Instead, another
option is to carefully control field types and ordering to fill in all gaps.
Failing this, explicitly declaring settable padding values and padding unions
can enable a programmer to manually elide double writes to the struct. And
lastly, one can always use a packed struct
(e.g. struct __attribute__((__packed__)) foo {...}); this may require more
byte shuffling operations to write and read, but can be of help when the
limiting factor of the eBPF code is the effective rate/drop limit of
perf_submit, by reducing the overall amount of data sent.
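To make the padding problem concrete, here is a minimal sketch in plain userspace C (not eBPF) comparing the three layout strategies described above. The struct and field names are hypothetical and are not taken from unixdump; the gap sizes depend on the ABI, so treat any specific numbers as an assumption about x86-64 Linux with GCC or Clang.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event layouts, for illustration only. */

/* Implicit padding: on ABIs where uint64_t is 8-byte aligned, the compiler
 * leaves unnamed gaps after `pid` and after `flag`. */
struct event_implicit {
    uint32_t pid;
    uint64_t inode;
    uint8_t  flag;
};

/* Reordered with explicit padding: every byte belongs to a named field,
 * so initializing each field initializes the whole struct. */
struct event_explicit {
    uint64_t inode;
    uint32_t pid;
    uint8_t  flag;
    uint8_t  pad[3];
};

/* Packed: no gaps, but `inode` is no longer naturally aligned. */
struct __attribute__((__packed__)) event_packed {
    uint32_t pid;
    uint64_t inode;
    uint8_t  flag;
};

/* Bytes occupied by the three payload fields themselves. */
#define PAYLOAD_BYTES (sizeof(uint32_t) + sizeof(uint64_t) + sizeof(uint8_t))

/* Unnamed (and therefore potentially uninitialized) bytes in each layout. */
static size_t gap_bytes_implicit(void) {
    return sizeof(struct event_implicit) - PAYLOAD_BYTES;
}

static size_t gap_bytes_explicit(void) {
    return sizeof(struct event_explicit) - (PAYLOAD_BYTES + 3 /* pad */);
}

static size_t gap_bytes_packed(void) {
    return sizeof(struct event_packed) - PAYLOAD_BYTES;
}
```

On x86-64 Linux, the implicit layout should come out at 24 bytes with 11 unnamed bytes, the explicit layout at 16 bytes with none, and the packed layout at 13 bytes. Only the implicit layout leaves bytes on the stack that no field assignment ever touches.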
eBPF Chicanery
For unixdump we had a number of operational needs based on correctness or
performance goals that required writing a significant amount of non-idiomatic C
code and code generation tooling. While none of this is especially
groundbreaking, it is worth discussing how to perform common programmatic tasks
while under constraints like those imposed by eBPF.
Ratcheting
In addition to managing memory shared between kernel space and userspace, we
also needed to maintain state of the current position within the custom ring
buffer. This is achieved simply enough by using another per-CPU ring buffer,
one that only holds a single value. This provides a separate position value
associated with each per-CPU ring buffer. However, the problem with this
setup is not in the data itself, but the mechanism by which it is incremented,
or, more importantly, wrapped. The eBPF validator was displeased with any
ratchet that tried to perform the wrap via a specific switch case. Instead,
it only accepts implementations where wrapping is performed in the
default: label; attempts to wrap the value in the last “valid” case or
“guess” the wrapping position will fail, even if the default: code also
wraps. For example, the following is an eBPF-valid position counter ratchet
implementation:
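u32 pos = UINT32_MAX;
int key = 0;
// sync points at the single-slot per-CPU position value (declaration omitted)
sync = sync_buf.lookup(&key);
if (!sync) {
    return 0;
}
pos = 0;
switch (sync->next) {
    case 0: {
        pos = 0;
        sync->next = 1;
        break;
    }
    case 1: {
        pos = 1;
        // the last valid case steps out of range instead of wrapping
        sync->next = 2;
        break;
    }
    // wrapping happens here, and only here
    default: {
        pos = 0;
        sync->next = 1;
    }
}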
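The same pattern extends to more slots. The following is a userspace sketch in plain C (not eBPF) of a four-slot ratchet, useful for checking the sequence it produces. The names `sync_state` and `ratchet_next` and the four-slot size are assumptions made for illustration; in unixdump the state lives in a BPF map rather than a plain struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the single-value per-CPU map entry. */
struct sync_state {
    uint32_t next;
};

/* Returns the slot to use now and advances the ratchet.
 * The last valid case deliberately steps `next` out of range;
 * only the default label wraps it back around. */
static uint32_t ratchet_next(struct sync_state *sync) {
    uint32_t pos = 0;

    assert(sync != NULL);
    switch (sync->next) {
    case 0:
        pos = 0;
        sync->next = 1;
        break;
    case 1:
        pos = 1;
        sync->next = 2;
        break;
    case 2:
        pos = 2;
        sync->next = 3;
        break;
    case 3:
        pos = 3;
        sync->next = 4; /* out of range on purpose, no wrap here */
        break;
    default:
        /* Reached for 4 and for any unexpected value. */
        pos = 0;
        sync->next = 1;
        break;
    }
    return pos;
}
```

Starting from a zeroed state, repeated calls yield slots 0, 1, 2, 3, 0, 1, and so on. Any corrupted value of `next` also lands in the default label, so the returned position always stays in bounds.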
Dynamic Structure Parsing
While writing unixdump, we got the crazy idea to keep track of all ancillary
data (e.g. file descriptors) passing over Unix domain sockets. While this is of
great benefit for tracking how processes are passing file handles, sockets, and
other descriptors to each other, the “format” into which the data is marshalled
is very fluid and poorly specified. For example, similarly to SMS messages,
received messages may have a different structure from what was actually sent;
in particular, multiple messages of the same type may be coalesced into a
single message containing multiple values, regardless of the order in which
they were sent.
In unixdump, we use unrolled nested loops to iterate through the CMSG
structures containing ancillary data. Where possible, we use the CMSG_*
macros to index into the buffer and access fields; however, we reimplemented
several of these macros to be compatible with BCC’s pointer dereference
instrumentation, which was unable to handle all of the CMSG_* macros. To
store the data and report it back to userspace, we used a typed union struct
that can store both SCM_RIGHTS (file descriptors) and SCM_CREDENTIALS
(Unix credentials), which additionally keeps track of the count of the
former and whether or not the last element returned to userspace was actually
the last element of the in-kernel CMSG structure. Both the max count of
copyable CMSGs and slots within a CMSG (for storing SCM_RIGHTS file
descriptors) are configurable via the CLI; this also modifies the unrolled loop
counts.
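As a rough illustration of the bounded nested walk, here is a userspace sketch in plain C using the standard CMSG_* macros. It is not unixdump's kernel-side code: the caps `MAX_CMSGS` and `MAX_FDS`, the `rights_record` struct, and both function names are hypothetical, and it handles only SCM_RIGHTS. In eBPF, both loops would additionally need to be unrolled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

#define MAX_CMSGS 2  /* hypothetical cap on control messages walked */
#define MAX_FDS   4  /* hypothetical cap on descriptors kept per message */

/* Hypothetical record; unixdump's real struct also covers SCM_CREDENTIALS. */
struct rights_record {
    int fds[MAX_FDS];
    int count;      /* descriptors actually copied */
    int truncated;  /* 1 if the message carried more than MAX_FDS */
};

/* Walks at most MAX_CMSGS control messages and copies up to MAX_FDS
 * descriptors from each SCM_RIGHTS message. Returns records filled. */
static int collect_rights(struct msghdr *msg,
                          struct rights_record out[MAX_CMSGS]) {
    int filled = 0;
    struct cmsghdr *c = CMSG_FIRSTHDR(msg);

    for (int i = 0; i < MAX_CMSGS && c != NULL; i++) {
        if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_RIGHTS &&
            c->cmsg_len >= CMSG_LEN(0)) {
            size_t total = (c->cmsg_len - CMSG_LEN(0)) / sizeof(int);
            struct rights_record *r = &out[filled++];

            memset(r, 0, sizeof(*r));
            r->truncated = total > MAX_FDS;
            for (int j = 0; j < MAX_FDS; j++) {
                if ((size_t)j >= total)
                    break;
                memcpy(&r->fds[j], CMSG_DATA(c) + j * sizeof(int),
                       sizeof(int));
                r->count++;
            }
        }
        c = CMSG_NXTHDR(msg, c);
    }
    return filled;
}

/* Builds a control buffer holding one SCM_RIGHTS message with `nfds`
 * descriptors, as a sender would before calling sendmsg(2). */
static void build_rights_msg(struct msghdr *msg, void *buf, size_t buflen,
                             const int *fds, size_t nfds) {
    assert(buflen >= CMSG_SPACE(nfds * sizeof(int)));
    memset(msg, 0, sizeof(*msg));
    memset(buf, 0, buflen);
    msg->msg_control = buf;
    msg->msg_controllen = CMSG_SPACE(nfds * sizeof(int));

    struct cmsghdr *c = CMSG_FIRSTHDR(msg);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(nfds * sizeof(int));
    memcpy(CMSG_DATA(c), fds, nfds * sizeof(int));
}
```

The `truncated` flag plays the role described above: it tells the consumer whether the last descriptor it received was really the last one in the control message.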
Static Data Structures and Algorithms
To appropriately handle the glut of data caught by unixdump, we needed to
performantly filter PIDs (inclusively or exclusively) in eBPF C code, so as to
limit the amount of data and number of events sent to userland. Iteratively
comparing each one would be extremely costly, so we instead opted to use a
binary search tree. As a recursive binary search implementation will trip the
loop check, we instead generate the entire static C implementation
(dynamically in Python) for the values being filtered. For reference, the
implementation can be found here.
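To give a sense of what such generated code can look like, here is a hand-written sketch of a loop-free, recursion-free search over a fixed, sorted set of seven PIDs. The PID values and the function name `pid_in_filter` are made up for illustration; unixdump's generator may emit a different shape.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical generator output for the sorted PID set
 * {12, 345, 678, 910, 1112, 1314, 1516}.
 * Each level of nesting halves the remaining candidates, so any lookup
 * needs at most three levels of comparison and no loops. */
static inline bool pid_in_filter(uint32_t pid) {
    if (pid < 910) {
        if (pid < 345) {
            return pid == 12;
        } else if (pid > 345) {
            return pid == 678;
        }
        return true; /* pid == 345 */
    } else if (pid > 910) {
        if (pid < 1314) {
            return pid == 1112;
        } else if (pid > 1314) {
            return pid == 1516;
        }
        return true; /* pid == 1314 */
    }
    return true; /* pid == 910 */
}
```

An inclusive filter would keep events for which this returns true, and an exclusive filter would drop them. The depth grows logarithmically with the number of PIDs, in contrast to a linear chain of comparisons.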
Dark eBPF Thaumaturgy
Even with all of the above tricks to keep it happy, the eBPF validator’s muse is still a fickle miscreant with a very short attention span. Be it due to changes in the toolchain or the Linux kernel itself, the validator may look upon your overly clever code and decide to smite you where you stand. Sometimes, appeasing the validator requires ever greater sacrifices of idiomaticity.
Dynamic Length Byte Copies
Per the issue
mentioned earlier, it is not immediately clear that variable length byte
copies from kernel memory are possible with eBPF. Given that the recommended
solution is to use a helper function for copying NUL-terminated C strings, this
would be a problem when the variable length data is binary content that may
contain NUL bytes. However, this is not the case, and such copies can be
performed, albeit with some careful sleight of hand. While this is not an
issue for socket paths, for which the sun_path field of struct sockaddr_un
is guaranteed to be at least UNIX_PATH_MAX (108) bytes long, this is an issue
for copying arbitrary socket data. While it is important to ensure that
stack-based arrays are fully written to (e.g. write NUL bytes to the remainder
of arrays), this limitation does not exist for eBPF map structures as they are
zero-initialized by the kernel to prevent information leaks. Instead, the
trouble occurs when one attempts to truncate the copy length. Between BCC and
the eBPF validator, it is often the case that a byte copy of the length of an
array or less is considered unsafe, and therefore rejected. Instead, when
tapering off the array length, it was previously necessary to cap the copy
length to sizeof(buffer)-1. The odd behavior here is that if the source
length is the same as the destination length, it must still be truncated.
Additionally, to prevent optimizations that may elide certain comparisons
needed to provide the eBPF validator with register bounds, we found that it was
possible to simply wrap the desired code in a static inline function to
shadow the variables in play. For example, in unixdump we perform this copy
and track whether or not the data was truncated in the following code:
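inline static
void copy_into_entry_buffer(data_t* entry, size_t const len,
                            char* base, u8 volatile* trunc) {
    int l = (int)len;
    // give the validator a lower bound on the copy length
    if (l < 0) {
        l = 0;
    }
    // record truncation; note that an exact fit also counts as truncated
    if (l >= BUFFER_SIZE) {
        *trunc = 1;
    }
    // give the validator an upper bound strictly below the array length
    if (l >= BUFFER_SIZE) {
        l = BUFFER_SIZE - 1;
    }
    bpf_probe_read(entry->buffer, l, base);
}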
Note: This behavior has changed a few times between BCC and Linux kernel versions, and when using current versions of both, it is possible to implement the optimal case of copying right up to the end of the array; however, to support older versions we continue to use the less optimal “truncate on equal” version shown above.
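The “truncate on equal” behavior is easy to overlook, so here is a userspace sketch in plain C that mirrors the clamping logic and makes the edge case checkable. It is an approximation for illustration only: `memcpy` stands in for `bpf_probe_read`, and the 16-byte `BUFFER_SIZE`, the `data_t` stand-in, and the name `copy_clamped` are assumptions rather than unixdump's actual definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUFFER_SIZE 16 /* hypothetical; kept small for readability */

/* Hypothetical stand-in for the entry struct. */
typedef struct {
    char buffer[BUFFER_SIZE];
} data_t;

/* Copies up to BUFFER_SIZE - 1 bytes from `base` into the entry and
 * returns the number of bytes copied. Sets *trunc when the requested
 * length did not fit, which includes len == BUFFER_SIZE. */
static inline int copy_clamped(data_t *entry, size_t len,
                               const char *base, uint8_t *trunc) {
    int l = (int)len;

    if (l < 0) {
        l = 0;
    }
    if (l >= BUFFER_SIZE) {
        *trunc = 1;
        l = BUFFER_SIZE - 1;
    }
    memcpy(entry->buffer, base, (size_t)l);
    return l;
}
```

With these definitions, a request for 8 bytes copies 8 and leaves the flag clear, while requests for exactly 16 bytes or for 100 bytes both copy 15 and set the flag.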
Type Juggling
Another spooky behavior we observed with a previous version of BCC (which we
have not observed since) was an interesting case where the return type of a
function could cause the validator to raise an error. While it may be simple
enough to imagine such a situation involving mixing signed and unsigned
integers, this instance related to the use of bool as both the return and
variable type, which was eventually cast to size_t. In some versions of
our code, the validator would raise an error if the return value was bool,
but in others it would raise an error if the return value was size_t. For
context, unixdump will, based on CLI options for certain features, enable
or disable certain kprobe functionality with #if(def)s. As a result, we
simply used the same feature detections to set a BOOL_TYPE define used as
the return and variable type with either bool or size_t. At the time, we
did not bother to triage this issue (sorry!), but it does not affect the
current unixdump code when using a current BCC. As for whether or not
this is because the current BCC fixed the issue, or our current code is
unaffected, it is a mystery.
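A minimal sketch of that workaround, in plain C, might look like the following. The feature macro `UNIXDUMP_FEATURE_X` and the function `is_interesting` are hypothetical placeholders; only the `BOOL_TYPE` name comes from the description above.

```c
#include <stdbool.h>
#include <stddef.h>

/* UNIXDUMP_FEATURE_X is a hypothetical stand-in for one of the
 * CLI-driven feature macros. */
#ifdef UNIXDUMP_FEATURE_X
#define BOOL_TYPE size_t
#else
#define BOOL_TYPE bool
#endif

/* The same source compiles with either underlying type, so callers
 * never need to know which one the validator happened to prefer. */
static inline BOOL_TYPE is_interesting(int pid) {
    BOOL_TYPE hit = (pid > 0);
    return hit;
}
```

Because the result is only ever 0 or 1, converting it to `size_t` at the call site gives the same value under both definitions.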
Obfuscation, or: How I Learned to Stop Worrying and Outsmart the Compiler
When writing eBPF code, one’s greatest enemy is often the compiler’s
optimizers. eBPF’s most glaring flaw is that the compiler and the validator
have no means to communicate other than through the generated code. Try as
you might to write your code in a concise way that would otherwise ensure its
correctness, the compiler may simply optimize out all of your “unnecessary”
data validation checks, leaving the validator to complain that you are not
“properly” validating all of the edge cases. While one can sometimes get
around such occurrences with the volatile keyword, other times it will be
necessary to rework the code over and over in an attempt to fool both the
compiler and eBPF validator. As noted earlier, we have observed that placing
code verbatim within an inline static function would result in certain
offending code passing validation. This appeared to be due to the fact that
certain assumptions on the “parameters” could no longer be made, preventing the
compiler from eliding code required by the validator. However, it is worth
noting that because of such blunders within the validator, one’s code must
sometimes be implemented suboptimally, which will incur unnecessary
performance penalties. We still prefer to accept such specific penalties over
configuring BCC to compile eBPF C code with -O0.
Conclusion
While it can be a bit tricky to write anything more than the sorts of very simple eBPF kernel tracing tools currently promoted as BCC reference examples focused on basic system profiling, it is very much possible to use eBPF to develop full-featured tracing tools and tooling. Additionally, though the developer experience has a tendency to be extremely perplexing, it does appear to be actively improving over time, given the lessened need for hacky validator appeasement rituals.
We got our feet wet in the world of eBPF-based kernel tracing by attempting to
solve a somewhat niche problem, but the outcome seems promising. Our initial
test case for eBPF, unixdump, is open source and available on GitHub; check
it out here: https://github.com/nccgroup/ebpf/tree/master/unixdump. We plan
to continue to add features and filters to unixdump, and would greatly
appreciate any contributions. The next features on the roadmap are proper
timestamping, and outputting to pcapng so that one can load Unix domain socket
traffic dumps into Wireshark/tshark and apply their vast repertoire of
protocol dissectors.
- Depending on your OS, Unix domain sockets may be described in unix(7), unix(4), or sockaddr(3socket). ↩
- While Linux enforces file path permissions on file path-based Unix domain sockets, this behavior is not consistent across all Unix implementations. However, in general, Unix OSes have similar sets of APIs enabling Unix domain socket peer processes to verify each other’s identity. ↩
- https://www.kernel.org/doc/Documentation/networking/filter.txt ↩