As part of our initiative to contribute to and improve CNCF projects, I recently found a bypass vulnerability in Sysdig, tracked as CVE-2019-8339. It lets an attacker evade Sysdig’s syscall detection and, as a result, bypass Falco rules and run arbitrary system calls undetected.
In this blog post, I will briefly explain the parts of Falco that relate to the vulnerability, and then the vulnerability itself.
Sysdig is a monitoring tool that can capture every system call made on a system. Combined with Falco, the captured syscalls can be compared against a set of defined rules to detect malicious activity.
Sysdig uses a kernel module to set kernel tracepoints before and after each syscall. It creates a buffer for each CPU core, captures the syscalls and their arguments, and writes them to the matching core’s buffer for further handling.
Scap is a library made by Sysdig that collects the captured events from the buffers.
In scap.c:scap_next_live, we can see that the function looks at the first event in every buffer and picks the oldest one.
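To make the selection step concrete, here is a minimal Python sketch of the idea, not the actual scap code. It assumes each per-CPU buffer is a queue of (timestamp, name) pairs that is already ordered in time; the names next_event and cpu_buffers are hypothetical.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

Event = Tuple[int, str]  # (timestamp, syscall name), hypothetical layout


def next_event(buffers: List[Deque[Event]]) -> Optional[Event]:
    """Pop and return the oldest event across all per-CPU buffers."""
    oldest_idx = None
    for idx, buf in enumerate(buffers):
        if not buf:
            continue  # this core has nothing pending
        # Only the head of each buffer is inspected.
        if oldest_idx is None or buf[0][0] < buffers[oldest_idx][0][0]:
            oldest_idx = idx
    if oldest_idx is None:
        return None  # every buffer is empty
    return buffers[oldest_idx].popleft()


cpu_buffers = [
    deque([(10, "open"), (40, "read")]),
    deque([(5, "execve"), (50, "close")]),
    deque(),
]

ordered = []
while True:
    event = next_event(cpu_buffers)
    if event is None:
        break
    ordered.append(event)

print(ordered)
```

The consumer therefore sees one globally ordered stream, but only for events that actually made it into a buffer.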
Sysdig-probe is the kernel module that sets the tracepoints and sends their events to Scap through the matching buffer.
We can see that the function ppm_strncpy_from_user (ppm_events.c:103) fills the buffer until either all bytes are written or the buffer runs out of space.
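The "copy until done or out of space" behaviour can be pictured with a short Python sketch. This is a simplified model only: the helper bounded_copy is hypothetical and leaves out details of the real kernel function, such as reading from user memory.

```python
def bounded_copy(dest: bytearray, offset: int, src: bytes) -> int:
    """Copy src into dest starting at offset; return the bytes written.

    The copy stops when src is exhausted or when dest has no room left.
    """
    free = max(len(dest) - offset, 0)
    written = min(len(src), free)
    dest[offset:offset + written] = src[:written]
    return written


buffer = bytearray(8)
copied = bounded_copy(buffer, 5, b"abcdef")  # only 3 bytes of room remain
print(copied, bytes(buffer))
```

Here only the first three bytes of the argument fit, so the rest is cut off rather than overflowing the buffer.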
The vulnerability is in the probe – main.c:1672.
The function checks that the buffer has enough space for the minimum event size. If it doesn’t, the function discards the event and moves on to the next one, expecting that enough space will be available by then.
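The effect of that check can be shown with a toy Python model. It is a sketch under stated assumptions rather than the probe’s code: the class PerCpuBuffer, the sizes, and the min_event_size value are all hypothetical.

```python
class PerCpuBuffer:
    """Toy model of one per-CPU event buffer (sizes in bytes)."""

    def __init__(self, capacity: int, min_event_size: int) -> None:
        self.capacity = capacity
        self.min_event_size = min_event_size  # hypothetical value
        self.used = 0
        self.events = []   # names of events that were recorded
        self.dropped = 0   # events discarded because the buffer was full

    def record(self, name: str, size: int) -> bool:
        free = self.capacity - self.used
        if free < self.min_event_size:
            # Not enough room: the event is discarded and nothing is reported.
            self.dropped += 1
            return False
        # Otherwise store it, truncating the payload to the free space.
        self.used += min(size, free)
        self.events.append(name)
        return True


buf = PerCpuBuffer(capacity=1024, min_event_size=32)
for i in range(4):
    buf.record(f"noise-{i}", 256)          # four large events fill the buffer
seen = buf.record("open /etc/shadow", 64)  # arrives while the buffer is full
print(seen, buf.dropped, buf.events)
```

In this model the last event never reaches the consumer, so nothing downstream can match a rule against it.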
An attacker can flood this buffer and then make malicious syscalls that go undetected. The vulnerability can be exploited both in containers and on hosts.
On the surface, it may appear difficult to flood a single buffer and then run malicious code against that same buffer, but it actually isn’t. Each core has a 1 MB buffer, so 16 syscalls carrying 64 KB of arguments each are enough to fill it.
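The figure of 16 follows directly from the two sizes. A quick check in Python, assuming binary units and ignoring any per-event header overhead:

```python
BUFFER_SIZE = 1024 * 1024   # 1 MB per-core buffer, as stated above
PAYLOAD_SIZE = 64 * 1024    # 64 KB of argument data per syscall

calls_to_fill = BUFFER_SIZE // PAYLOAD_SIZE
print(calls_to_fill)  # 16
```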
If an attacker creates as many processes as there are cores, pins one process to each core with sched_setaffinity, forks a few times (to keep running in case a context switch happens), and issues syscalls with long parameters (>=64K), then Scap cannot keep up with all of the syscalls.
All of the buffers will be full, and the kernel module will start discarding events until free space in a buffer becomes available again. At this point, a malicious system call can run undetected. For example, in the video below, you can see that when I open /etc/shadow, Falco reports that a sensitive file has been opened, but when I run my exploit, I can print /etc/shadow without Falco raising any alert.
Sysdig 0.24.2 and Falco 0.14.0 have been explicitly tested and, according to the code, the vulnerability exists in all older versions as well.
An attacker can run malicious system calls that cannot be detected by Sysdig or Falco.
We disclosed the vulnerability to Sysdig and got a quick response saying that they would release a patched version. As promised, they released a new version whose patch adds an alert when syscalls are dropped.
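The idea of alerting on dropped syscalls can be illustrated with a small Python sketch. It is a hypothetical model and not Falco’s implementation: it assumes a monitor that periodically samples a cumulative drop counter, and the function drop_alerts and its message format are made up for illustration.

```python
from typing import Iterable, List


def drop_alerts(drop_counter_samples: Iterable[int]) -> List[str]:
    """Return one alert per sampling interval in which the counter grew."""
    alerts = []
    previous = None
    for sample in drop_counter_samples:
        if previous is not None and sample > previous:
            alerts.append(f"{sample - previous} syscall event(s) dropped")
        previous = sample
    return alerts


samples = [0, 0, 3, 3, 10]  # cumulative drop counter, sampled over time
alerts = drop_alerts(samples)
print(alerts)
```

With such a signal, a flood that forces drops no longer passes silently, even though the dropped events themselves are still lost.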
The patch fixing the issue – https://github.com/falcosecurity/falco/pull/561
Documentation – https://falco.org/docs/event-sources/dropped-events/