How to Use ebpf-go, Case Study: Hooking the vfs_read Function to Print File Names

Translation | wujiuye | 2024-05-31

This article is a translation of the original text, which can be found at the following link: https://www.wujiuye.com/article/5a3ec2e5754141c78b8e699023c8ad65

Author: wujiuye
Link: https://www.wujiuye.com/article/5a3ec2e5754141c78b8e699023c8ad65
Source: 吴就业的网络日记
This article is an original work by the blogger and is not allowed to be reproduced without the blogger's permission.

The github.com/cilium/ebpf package, commonly known as ebpf-go, requires a Linux environment; it does not work on macOS or Windows. On those systems you can run a Linux virtual machine instead, for example Ubuntu, as the development environment for eBPF projects.

For Mac users, I recommend Parallels Desktop as the virtual machine software. It is easy to set up, though it is a paid product.

This guide is based on an experiment run on Ubuntu 22.04 with an ARM64 CPU.

Development Environment Setup

First, prepare the development environment according to the build requirements listed in the official ebpf-go documentation.

Official Documentation: https://ebpf-go.dev/guides/getting-started/#whats-next

Here’s a summary of the requirements:

  1. Your system’s Linux kernel version must be >= 5.7.
  2. You need to have clang installed, with a version >= 11. If the installed clang does not include llvm, you will also need to install llvm.
  3. If you are using a Debian/Ubuntu system, you will need to install libbpf-dev.
  4. For Debian/Ubuntu systems, you need to create a symbolic link with sudo ln -sf /usr/include/asm-generic/ /usr/include/asm, or you may encounter errors about missing asm/*.h header files during compilation.
  5. Your Go language version must be >= the Go version declared in ebpf-go’s go.mod.

Once your Linux virtual machine is ready and you’ve installed Go (I recommend directly installing the latest version of Go), you can proceed to install the required tools.

First, confirm that the kernel version is at least 5.7:

uname -r
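
The same check can be scripted. The following is a minimal sketch, assuming the release string has the usual major.minor prefix (as in 5.15.0-107-generic); the function name kernelAtLeast is a hypothetical helper, not part of ebpf-go.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// kernelAtLeast reports whether a release string such as
// "5.15.0-107-generic" is at least wantMajor.wantMinor.
// Hypothetical helper for illustration only.
func kernelAtLeast(release string, wantMajor, wantMinor int) (bool, error) {
	var major, minor int
	// Only the leading "major.minor" part is parsed; the rest is ignored.
	if _, err := fmt.Sscanf(release, "%d.%d", &major, &minor); err != nil {
		return false, fmt.Errorf("parsing kernel release %q: %w", release, err)
	}
	if major != wantMajor {
		return major > wantMajor, nil
	}
	return minor >= wantMinor, nil
}

func main() {
	// Equivalent to running `uname -r` in a shell.
	out, err := exec.Command("uname", "-r").Output()
	if err != nil {
		fmt.Println("running uname:", err)
		return
	}
	release := strings.TrimSpace(string(out))
	ok, err := kernelAtLeast(release, 5, 7)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("kernel %s, at least 5.7: %v\n", release, ok)
}
```

Comparing the major number first keeps the check correct for 6.x kernels, whose minor number may be smaller than 7.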

Install clang and llvm:

sudo apt install clang
sudo apt install llvm

Verify the clang version:

clang --version

Install libbpf-dev:

sudo apt install libbpf-dev

Create a symbolic link as instructed to run the official examples:

sudo ln -sf /usr/include/asm-generic/ /usr/include/asm

Start Implementing a Demo

First, decide what the probe should do and write the eBPF C program. As an example, let's use ebpf-go to implement a probe that hooks the kernel's vfs_read function, captures the name of the file being read, and prints it.

C code (vfs-trace.c):

//go:build ignore

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <linux/ptrace.h>

// Note: struct file, struct path and struct dentry are kernel-internal types
// that the headers above do not define. See "Header File Not Found Issues"
// under Pitfalls for the vmlinux.h approach.

char __license[] SEC("license") = "Dual MIT/GPL"; // This line is mandatory

SEC("kprobe/vfs_read")
int kprobe_vfs_read(struct pt_regs *ctx) {
    // PT_REGS_PARM1 is a macro from bpf/bpf_tracing.h that yields the first
    // argument of the probed function. Use PT_REGS_PARM2 for the second, and so on.
    struct file *f = (struct file *)PT_REGS_PARM1(ctx);
    // Kernel memory cannot be dereferenced directly here, so each level of
    // f->f_path.dentry->d_name.name is copied with bpf_probe_read.
    struct path path;
    bpf_probe_read(&path, sizeof(struct path), &f->f_path);
    struct dentry dentry;
    bpf_probe_read(&dentry, sizeof(struct dentry), path.dentry);
    char filename[256];
    bpf_probe_read(filename, sizeof(filename), dentry.d_name.name);
    bpf_printk("file name: %s \n", filename);
    return 0;
}

The PT_REGS_PARM1 macro is defined in the bpf_tracing.h header file.

Initialize the project by creating a Go project directory, for example vfs-ebpf-trace, and place the C source file (vfs-trace.c) in it.

Navigate to the directory and execute the following commands in order to initialize the Go project with ebpf-go.

cd vfs-ebpf-trace
go mod init vfs-ebpf-trace
go mod tidy
# Make sure the installed Go version is at least the one declared in ebpf-go's go.mod
go get github.com/cilium/ebpf/cmd/bpf2go

Write the gen.go file:

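Set -target to match your CPU architecture (arm64 here; amd64 on x86_64), and pass the root directory of the header files with -I/usr/include, so that #include <bpf/bpf_tracing.h> resolves to /usr/include/bpf/bpf_tracing.h. The identifier vfstrace becomes the prefix of the generated Go types and functions, such as vfstraceObjects and loadVfstraceObjects.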

package main

//go:generate go run github.com/cilium/ebpf/cmd/bpf2go -cc clang -cflags $BPF_CFLAGS -target arm64 vfstrace vfs-trace.c -- -I/usr/include

Compile the C program to generate Go code:

go generate
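
It helps to know which Go names to expect after go generate, because main.go must refer to them exactly. The sketch below only approximates the naming convention (identifier prefix plus Objects, and snake_case program names turned into CamelCase fields); exportedName and loaderName are hypothetical helpers, and the real logic lives inside bpf2go, so check the generated file if in doubt.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// exportedName approximates how a C identifier such as "kprobe_vfs_read"
// maps to an exported Go name: split on underscores and capitalise each part.
// This is an illustration, not bpf2go's actual implementation.
func exportedName(cName string) string {
	var b strings.Builder
	for _, part := range strings.Split(cName, "_") {
		if part == "" {
			continue
		}
		r := []rune(part)
		r[0] = unicode.ToUpper(r[0])
		b.WriteString(string(r))
	}
	return b.String()
}

// loaderName approximates the name of the generated loader function for the
// identifier passed to bpf2go (vfstrace in gen.go).
func loaderName(ident string) string {
	return "load" + exportedName(ident) + "Objects"
}

func main() {
	ident := "vfstrace" // identifier from the go:generate line
	fmt.Println("objects type: ", ident+"Objects")
	fmt.Println("loader func:  ", loaderName(ident))
	fmt.Println("program field:", exportedName("kprobe_vfs_read"))
}
```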

You may encounter compilation errors:

In file included from ~/wujiuye/goprojects/src/vfs-ebpf-trace/vfs-trace.c:3:
/usr/include/linux/ip.h:21:10: fatal error: 'asm/byteorder.h' file not found
#include <asm/byteorder.h>
         ^~~~~~~~~~~~~~~~~
1 error generated.
Error: can't execute clang: exit status 1
exit status 1
gen.go:3: running "go": exit status 1

The /usr/include/asm-generic/ directory indeed contains no byteorder.h header file. I found it under another directory, /usr/include/aarch64-linux-gnu/.

Following the pattern of the documentation's sudo ln -sf /usr/include/asm-generic/ /usr/include/asm, I pointed the link at the architecture-specific directory instead: sudo ln -sf /usr/include/aarch64-linux-gnu/asm /usr/include/asm. The byteorder.h error disappeared, but many other header files still could not be found.

Remove the old link first with sudo rm /usr/include/asm. Otherwise ln follows the existing symlink and creates the new link inside the directory it points to.

/usr/include/string.h:26:10: fatal error: 'bits/libc-header-start.h' file not found
#include <bits/libc-header-start.h>
         ^~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
Error: can't execute clang: exit status 1
exit status 1
gen.go:3: running "go": exit status 1

/usr/include/features.h:461:12: fatal error: 'sys/cdefs.h' file not found
#  include <sys/cdefs.h>
           ^~~~~~~~~~~~~
1 error generated.
Error: can't execute clang: exit status 1
exit status 1
gen.go:3: running "go": exit status 1

/usr/include/features.h:485:10: fatal error: 'gnu/stubs.h' file not found
#include <gnu/stubs.h>
         ^~~~~~~~~~~~~
1 error generated.
Error: can't execute clang: exit status 1
exit status 1
gen.go:3: running "go": exit status 1

I created symbolic links for each of these directories, pointing into aarch64-linux-gnu, in the same manner.

sudo ln -sf /usr/include/aarch64-linux-gnu/asm /usr/include/asm
sudo ln -sf /usr/include/aarch64-linux-gnu/bits /usr/include/bits
sudo ln -sf /usr/include/aarch64-linux-gnu/sys /usr/include/sys
sudo ln -sf /usr/include/aarch64-linux-gnu/gnu /usr/include/gnu

Following this plan, the compilation should proceed without issues. If errors persist, check the error messages to see if they are caused by mistakes in your C code; if so, they should be easy to fix.

If you are using Ubuntu on the x86_64 (amd64) architecture, you will likely run into the same issues. Simply replace aarch64-linux-gnu with x86_64-linux-gnu:

sudo ln -sf /usr/include/x86_64-linux-gnu/asm /usr/include/asm
sudo ln -sf /usr/include/x86_64-linux-gnu/bits /usr/include/bits
sudo ln -sf /usr/include/x86_64-linux-gnu/sys /usr/include/sys
sudo ln -sf /usr/include/x86_64-linux-gnu/gnu /usr/include/gnu
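
The link targets follow directly from the missing header names in the error messages above. As a rough sketch, the commands could be derived like this; the directory names are the two mentioned in this article, other architectures are deliberately left out, and the helper names (multiarchIncludeDir, topDir, linkCommand) are hypothetical. The program only prints the commands and changes nothing on disk.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// multiarchIncludeDir maps a GOARCH value to the architecture-specific
// include directory used in this article. Only the two architectures
// discussed here are covered.
func multiarchIncludeDir(goarch string) (string, error) {
	switch goarch {
	case "arm64":
		return "/usr/include/aarch64-linux-gnu", nil
	case "amd64":
		return "/usr/include/x86_64-linux-gnu", nil
	default:
		return "", fmt.Errorf("no include directory known for GOARCH %q", goarch)
	}
}

// topDir returns the first path component of a header name,
// for example "sys" for "sys/cdefs.h".
func topDir(header string) string {
	dir, _, _ := strings.Cut(header, "/")
	return dir
}

// linkCommand builds the ln invocation for one subdirectory.
func linkCommand(archDir, sub string) string {
	return fmt.Sprintf("sudo ln -sf %s/%s /usr/include/%s", archDir, sub, sub)
}

func main() {
	archDir, err := multiarchIncludeDir(runtime.GOARCH)
	if err != nil {
		fmt.Println(err)
		return
	}
	// Header names taken from the clang error messages shown earlier.
	missing := []string{
		"asm/byteorder.h",
		"bits/libc-header-start.h",
		"sys/cdefs.h",
		"gnu/stubs.h",
	}
	for _, h := range missing {
		fmt.Println(linkCommand(archDir, topDir(h)))
	}
}
```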

With the eBPF C code compiled and the Go scaffolding generated, all that remains is to write the Go code that loads the program and attaches it to the kernel hook.

The implementation of main.go:

package main

import (
	"log"
	"os"
	"os/signal"

	"github.com/cilium/ebpf/link"
	"github.com/cilium/ebpf/rlimit"
)

func main() {
	// Remove the memlock resource limit; required on kernels < 5.11.
	if err := rlimit.RemoveMemlock(); err != nil {
		log.Fatal("Removing memlock:", err)
	}

	// Load the compiled eBPF ELF into the kernel. The vfstrace prefix
	// comes from the identifier passed to bpf2go in gen.go.
	var objs vfstraceObjects
	if err := loadVfstraceObjects(&objs, nil); err != nil {
		log.Fatal("Loading eBPF objects:", err)
	}
	defer objs.Close()

	// Attach the program to the vfs_read kprobe.
	kp, err := link.Kprobe("vfs_read", objs.KprobeVfsRead, nil)
	if err != nil {
		log.Fatalf("opening kprobe: %s", err)
	}
	defer kp.Close()

	// Block until an interrupt signal (Ctrl+C) arrives, then exit.
	stop := make(chan os.Signal, 1)
	signal.Notify(stop, os.Interrupt)
	<-stop
	log.Print("Received signal, exiting..")
}

Compile and run (loading eBPF programs requires root privileges):

go build
sudo ./vfs-ebpf-trace

Or chain all three steps:

go generate && go build && sudo ./vfs-ebpf-trace

View the log output:

The logs printed by bpf_printk are not output to the console but are written to the /sys/kernel/debug/tracing/trace_pipe file.

sudo cat /sys/kernel/debug/tracing/trace_pipe
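
Instead of cat, a small Go program can read the same file and keep only the file names. This is a minimal sketch, assuming each relevant line contains the "file name: " text produced by the bpf_printk call in this article; the helper names are hypothetical, and the program must run as root just like cat.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

// marker is the prefix written by bpf_printk in the eBPF program above.
const marker = "file name: "

// extractFileName returns the text following the marker on a trace line,
// and false if the line does not contain the marker.
func extractFileName(line string) (string, bool) {
	i := strings.Index(line, marker)
	if i < 0 {
		return "", false
	}
	return strings.TrimSpace(line[i+len(marker):]), true
}

// printFileNames reads trace lines from r and writes each file name to w.
func printFileNames(r io.Reader, w io.Writer) error {
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		if name, ok := extractFileName(sc.Text()); ok {
			fmt.Fprintln(w, name)
		}
	}
	return sc.Err()
}

func main() {
	f, err := os.Open("/sys/kernel/debug/tracing/trace_pipe")
	if err != nil {
		fmt.Fprintln(os.Stderr, "opening trace_pipe (root is required):", err)
		return
	}
	defer f.Close()

	// Reading blocks until the kernel emits new trace lines; stop with Ctrl+C.
	if err := printFileNames(f, os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, "reading trace_pipe:", err)
	}
}
```

Keeping the parsing in extractFileName, separate from the file handling, makes it easy to try against sample lines without root access.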

If viewing the logs fails with cat: /sys/kernel/debug/tracing/trace_pipe: Device or resource busy, another process still has the file open. A common cause is an earlier SSH session that timed out while cat was running: the old cat process survives and keeps holding the file, so a new session cannot read it.

Use the lsof command to find the process that holds the file, then kill it.

root@vultr:~# sudo lsof /sys/kernel/debug/tracing/trace_pipe
COMMAND    PID USER   FD   TYPE DEVICE SIZE/OFF  NODE NAME
cat     122966 root    3r   REG   0,12        0 12385 /sys/kernel/debug/tracing/trace_pipe
root@vultr:~# kill -9 122966

Pitfalls

Header File Not Found Issues

The symlink approach above only covers the headers used in this article. With other headers, such as <linux/fs.h>, various further errors may still appear.

A more robust option is to use the bpftool command to generate a single header file that contains every structure definition for the running kernel.

bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h

Then move the vmlinux.h file into the /usr/include directory (root privileges are required):

sudo mv ./vmlinux.h /usr/include
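
The bpftool command above reads /sys/kernel/btf/vmlinux, which is only present when the kernel was built with BTF support. A small pre-flight check can make a missing file obvious before bpftool fails. This is a sketch only; hasKernelBTF is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"os"
)

// btfPath is the file that the bpftool command in this article dumps.
const btfPath = "/sys/kernel/btf/vmlinux"

// hasKernelBTF reports whether path exists and is not a directory.
// It does not validate the BTF contents.
func hasKernelBTF(path string) bool {
	info, err := os.Stat(path)
	return err == nil && !info.IsDir()
}

func main() {
	if hasKernelBTF(btfPath) {
		fmt.Println("kernel BTF found, vmlinux.h can be generated from", btfPath)
		return
	}
	fmt.Println("kernel BTF not found at", btfPath)
}
```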

Finally, modify the import section of the C program code:

#include <vmlinux.h>
//#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
//#include <linux/ptrace.h>

Headers starting with bpf/ are installed by libbpf-dev and are not system headers, so vmlinux.h cannot replace them. Headers starting with linux/ are provided by the system; vmlinux.h already contains their type definitions, and including both typically causes redefinition errors, so they are commented out.

Reference: https://www.strickland.cloud/post/1

Data Reading Optimization

In the example, we need to read f->f_path.dentry->d_name.name. A kprobe program cannot dereference kernel pointers directly like this (the verifier rejects such loads), so we called bpf_probe_read once per level to reach the file name.

SEC("kprobe/vfs_read")
int kprobe_vfs_read(struct pt_regs *ctx) {
    struct file *f = (struct file *)PT_REGS_PARM1(ctx);
    struct path path;
    bpf_probe_read(&path, sizeof(struct path), &f->f_path);
    struct dentry dentry;
    bpf_probe_read(&dentry, sizeof(struct dentry), path.dentry);
    char filename[256];
    bpf_probe_read(filename, sizeof(filename), dentry.d_name.name);
    bpf_printk("file name: %s \n", filename);
    return 0;
}

We can use the BPF_CORE_READ macro defined in the bpf_core_read.h header file to support chained reading.

After installing libbpf-dev, this header file is located at /usr/include/bpf/bpf_core_read.h.

....
#include <bpf/bpf_core_read.h> // Include this header file

SEC("kprobe/vfs_read")
int kprobe_vfs_read(struct pt_regs *ctx) {
    struct file *f = (struct file *)PT_REGS_PARM1(ctx);
    const unsigned char *filename;
    // Reads f->f_path.dentry, then dentry->d_name.name, in one chained call.
    filename = BPF_CORE_READ(&f->f_path, dentry, d_name.name);
    bpf_printk("file name: %s \n", filename);
    return 0;
}

Here we use BPF_CORE_READ to read the file name in one go, solving the problem of needing to call bpf_probe_read multiple times.

In the example, BPF_CORE_READ(&f->f_path, dentry, d_name.name) reads the dentry (structure) from f_path (structure), then reads the d_name (structure) from dentry, and obtains the name field of d_name (const unsigned char *).

Other Issues

1. To verify that the probe points are correct, you can use the bpftrace tool.

For example:

sudo bpftrace -e 'kprobe:vfs_open {printf("hello")}'
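
A lighter first check, before reaching for bpftrace, is to look the function up in /proc/kallsyms, which lists kernel symbols one per line as "address type name" with an optional module name. The sketch below assumes that format; lineHasSymbol and hasSymbol are hypothetical helpers. Finding a symbol there does not guarantee that a kprobe can attach to it, so treat a hit only as a hint.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

// lineHasSymbol reports whether one kallsyms line ("address type name [module]")
// names the given symbol exactly.
func lineHasSymbol(line, name string) bool {
	fields := strings.Fields(line)
	return len(fields) >= 3 && fields[2] == name
}

// hasSymbol scans kallsyms-formatted text for an exact symbol name.
func hasSymbol(r io.Reader, name string) (bool, error) {
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		if lineHasSymbol(sc.Text(), name) {
			return true, nil
		}
	}
	return false, sc.Err()
}

func main() {
	name := "vfs_read" // the function hooked in this article
	if len(os.Args) > 1 {
		name = os.Args[1]
	}
	f, err := os.Open("/proc/kallsyms")
	if err != nil {
		fmt.Fprintln(os.Stderr, "opening /proc/kallsyms:", err)
		return
	}
	defer f.Close()

	found, err := hasSymbol(f, name)
	if err != nil {
		fmt.Fprintln(os.Stderr, "reading /proc/kallsyms:", err)
		return
	}
	fmt.Printf("symbol %s present: %v\n", name, found)
}
```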

2. If you encounter the error Removing memlock: failed to set memlock rlimit: operation not permitted when running, you need to start the program with root privileges.
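
A clearer message can be printed by checking the effective user ID at startup, before any eBPF call fails. This is a minimal sketch with a hypothetical helper, checkPrivileges; it assumes the simple rule used in this article (run as root) and does not consider finer-grained Linux capabilities.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// checkPrivileges returns an error unless the effective user ID is 0 (root).
// Taking the ID as a parameter keeps the function easy to test.
func checkPrivileges(euid int) error {
	if euid != 0 {
		return errors.New("this program must be started with root privileges, for example via sudo")
	}
	return nil
}

func main() {
	if err := checkPrivileges(os.Geteuid()); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("running as root, continuing")
}
```

In main.go from this article, such a check would go before the call to rlimit.RemoveMemlock.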

If you are using a development tool like GoLand in a virtual machine to run the project, you need to start GoLand with root privileges.

sudo ./goland.sh

3. If you encounter an error like the following when running (the field and program names will be those of your own program):

Loading eBPF objects: field KprobeFuseFileWrite: program kprobe_fuse_file_write: load program: invalid argument: cannot call GPL-restricted function from non-GPL compatible program (13 lines omitted)

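Helpers such as bpf_printk and bpf_probe_read are GPL-restricted, so the program must declare a GPL-compatible license. Refer to the example and add the following line to the C program code:

char __license[] SEC("license") = "Dual MIT/GPL";

Then re-run go generate so that the change is compiled in.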

4. To look up kernel functions, you can search the kernel source on this website and select the version that matches your own kernel. It shows which parameters a function takes and what their types are. https://elixir.bootlin.com/linux/v5.10.218/source/fs/open.c#L928