eBPF Basics (Part 2)
In this tutorial, you will continue learning the basics of eBPF and how to write eBPF programs. Please read tutorial 3 before starting this one.
The first project of this course consists of writing a series of eBPF programs, and this tutorial prepares you for it.
Communication between eBPF programs and user-space programs
With what you learned in the previous tutorial, you can now create simple eBPF programs that hook certain events and perform some basic actions, for example by calling helper functions. However, the code you can write so far is very limited: it executes a series of instructions each time an event is triggered, but it cannot store any data or state between events.
This is where maps come in. Maps are a way to store data in the kernel that can be accessed by (other) eBPF programs and even by user-space programs.
In this section, you will learn:
How to communicate from user space to eBPF programs, when launching your code, using global variables.
How to communicate between eBPF programs using maps.
How to communicate from eBPF programs to user space using perf buffers.
Note
Global variables are in fact implemented as maps: they are a special kind of map that is easier to use but also very limited.
Global variables
Thanks to global variables, you can set the value of a variable from the command line when launching your program. That value can then be read by all the eBPF programs in your code.
This is an example of eBPF programs that use a global variable:
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char LICENSE[] SEC("license") = "Dual BSD/GPL";

/* Read-only global variable: its value can be overridden when the program is launched. */
const volatile int activate_hooks = 1;

SEC("tracepoint/syscalls/sys_enter_read")
int handle_read(void *ctx)
{
    /* Do nothing when the hooks are deactivated. */
    if (!activate_hooks)
    {
        return 0;
    }

    pid_t pid;
    pid = bpf_get_current_pid_tgid() >> 32;
    bpf_printk("read syscall from pid: %d\n", pid);
    return 0;
}

SEC("tracepoint/syscalls/sys_enter_write")
int handle_write(void *ctx)
{
    /* The same global variable is shared by both programs. */
    if (!activate_hooks)
    {
        return 0;
    }

    pid_t pid;
    pid = bpf_get_current_pid_tgid() >> 32;
    bpf_printk("write syscall from pid: %d\n", pid);
    return 0;
}
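Both handlers above obtain the PID with `bpf_get_current_pid_tgid() >> 32`. The helper returns a single 64-bit value that holds the thread group ID (the PID as seen from user space) in its upper 32 bits and the thread ID in its lower 32 bits. The following is a minimal sketch in plain user-space C, not eBPF code, that only models this arithmetic; the function names `tgid_of` and `tid_of` are hypothetical and exist purely for illustration.

```c
#include <stdint.h>

/* Upper 32 bits: thread group ID, i.e. the PID as user space sees it. */
static uint32_t tgid_of(uint64_t pid_tgid)
{
    return (uint32_t)(pid_tgid >> 32);
}

/* Lower 32 bits: the kernel task ID, i.e. the thread ID. */
static uint32_t tid_of(uint64_t pid_tgid)
{
    return (uint32_t)pid_tgid;
}
```

This is why shifting right by 32 is enough to print one PID per process, regardless of which of its threads issued the syscall.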
Now, if you run the help command provided by ecli on your program (after compiling it), you will see that the global variable is listed as an option:
$ ecli src/package.json -h
Error: Failed to run native eBPF program
Caused by:
Bpf error: Failed to parse argument: A simple eBPF program
Usage: prog [OPTIONS]
Options:
--verbose Whether to show libbpf debug information
--activate_hooks <activate_hooks> Set value of `int` variable activate_hooks
-h, --help Print help
-V, --version Print version
Built with eunomia-bpf framework.
See https://github.com/eunomia-bpf/eunomia-bpf for more information.
Note
You can ignore the error message.
You can now change the value of the global variable from the command line when you run your program. For example, you can deactivate the hooks by running:
$ sudo ecli src/package.json --activate_hooks 0
Note
It is also possible to define global variables that are not constant, but you will not be able to initialize them, either in the code or from the command line. Therefore, their appeal is limited.
Global variables are simple to use, but they are very limited.
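Within those limits, a global variable is still a convenient way to pass a launch-time setting to your programs. As a minimal sketch, the program below uses a second kind of setting, a PID filter. The variable name `target_pid` and the convention that `0` means "no filter" are assumptions made for this illustration, not part of the course material.

```c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

char LICENSE[] SEC("license") = "Dual BSD/GPL";

/* Hypothetical option: 0 means "trace every process". */
const volatile int target_pid = 0;

SEC("tracepoint/syscalls/sys_enter_read")
int handle_read(void *ctx)
{
    pid_t pid = bpf_get_current_pid_tgid() >> 32;

    /* When a filter is set, skip events coming from other processes. */
    if (target_pid && pid != target_pid)
        return 0;

    bpf_printk("read syscall from pid: %d\n", pid);
    return 0;
}
```

Assuming ecli exposes this variable the same way as `activate_hooks`, it would appear in the help output as an option and could be set with something like `--target_pid 1234`.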
Maps
Maps are a more powerful way to store data in the kernel. They can be used from an eBPF program to store data that can be accessed by (other) eBPF programs.
Note
In fact, it is also possible to communicate between user space and eBPF programs using maps, but it is not covered in this tutorial. (This is what the “Syscall commands” sections of the documentation of the different map types are about.)
To define maps, the SEC macro will be handy once again. You will need to define a C structure that will represent the kind of map you want to create. For example:
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, __u32);
    __type(value, struct value);
} values SEC(".maps");
Note
__uint(), __type() and SEC() are macros provided by libbpf in <bpf/bpf_helpers.h>. BPF_MAP_TYPE_HASH is a value of the kernel's enum bpf_map_type, which is available through vmlinux.h.
This structure defines a map that can store a maximum of 1024 entries. Each entry is a key-value pair where the key is a 32-bit unsigned integer and the value is a struct value (that you must define).
Finally, the type of the map is BPF_MAP_TYPE_HASH. This is one of the simplest types of maps. It is a hash table where you can store key-value pairs. You can find other types of maps here.
To communicate with the map from an eBPF program, you need to use helper functions. For example, to insert a value in the map, you can use the bpf_map_update_elem() helper function. In the link provided above, you can find the helper functions you can use for each map type (for example, for BPF_MAP_TYPE_HASH: https://docs.ebpf.io/linux/map-type/BPF_MAP_TYPE_HASH/).
Be careful: the fields that you define in the map structure and the helper functions that you can use are not the same for every map type. Check the documentation whenever you use a new map type.
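To make the idea of two eBPF programs communicating through a map concrete, here is a minimal sketch under a few assumptions: the map name `read_start`, the choice of the thread ID as key, and a plain `__u64` timestamp as value (instead of a `struct value`) are all made up for this illustration. One program stores a timestamp when `read()` is entered, and a second program retrieves it when `read()` returns.

```c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

char LICENSE[] SEC("license") = "Dual BSD/GPL";

/* Hypothetical map: thread ID -> timestamp at which read() was entered. */
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, __u32);
    __type(value, __u64);
} read_start SEC(".maps");

SEC("tracepoint/syscalls/sys_enter_read")
int handle_read_enter(void *ctx)
{
    /* Lower 32 bits of the helper's return value: the thread ID. */
    __u32 tid = (__u32)bpf_get_current_pid_tgid();
    __u64 now = bpf_ktime_get_ns();

    /* BPF_ANY: create the entry, or overwrite it if the key already exists. */
    bpf_map_update_elem(&read_start, &tid, &now, BPF_ANY);
    return 0;
}

SEC("tracepoint/syscalls/sys_exit_read")
int handle_read_exit(void *ctx)
{
    __u32 tid = (__u32)bpf_get_current_pid_tgid();
    __u64 *start = bpf_map_lookup_elem(&read_start, &tid);

    /* The lookup returns NULL when the key is absent; the verifier
     * rejects programs that dereference the pointer without this check. */
    if (!start)
        return 0;

    __u64 delta = bpf_ktime_get_ns() - *start;
    bpf_printk("read took %llu ns (tid %u)\n", delta, tid);

    /* Remove the entry so that the map does not fill up. */
    bpf_map_delete_elem(&read_start, &tid);
    return 0;
}
```

The three helpers used here (`bpf_map_update_elem()`, `bpf_map_lookup_elem()` and `bpf_map_delete_elem()`) are the ones listed for `BPF_MAP_TYPE_HASH`; other map types may support a different set.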
perf buffers
With global variables and maps, you can communicate from user space to the eBPF programs and between eBPF programs. To communicate from eBPF programs to user space, in tutorial 3 you used the bpf_printk() function. A better way to communicate from eBPF programs to user space is to use perf buffers.
Note
The perf buffer is not specific to eBPF. It is a high-performance ring buffer mechanism provided by the Linux perf_event subsystem, designed for efficient event logging, performance monitoring and tracing. It enables the kernel to send data to user-space applications with minimal overhead.
Since eunomia-bpf does all the “user-space” work for you, using perf buffers is quite simple:
Define a map of type BPF_MAP_TYPE_PERF_EVENT_ARRAY in your .bpf.c file (it will always look the same, no need to modify it, except for the name of the map):
struct {
    __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
    __uint(key_size, sizeof(u32));
    __uint(value_size, sizeof(u32));
} events SEC(".maps");
In a header file, define a structure that will represent the data that you want to send to user space. For example:
struct struct_to_give_to_perf {
    int pid;
    char message[TASK_COMM_LEN];
};
In your eBPF program, use the bpf_perf_event_output() helper function to send data to the perf buffer.
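The three steps above can be sketched in a single file as follows. This is only an illustration under stated assumptions: the hook (`sys_enter_execve`), the handler name and the use of `bpf_get_current_comm()` to fill the `message` field are choices made here, and the structure is inlined for brevity, whereas in the tutorial's setup it belongs in a header file as described in the second step.

```c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

/* Defined after vmlinux.h on purpose: recent kernels already expose
 * TASK_COMM_LEN as an enum constant with the same value. */
#ifndef TASK_COMM_LEN
#define TASK_COMM_LEN 16
#endif

char LICENSE[] SEC("license") = "Dual BSD/GPL";

/* Data sent to user space (normally placed in a header file). */
struct struct_to_give_to_perf {
    int pid;
    char message[TASK_COMM_LEN];
};

/* The perf event array through which events are pushed. */
struct {
    __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
    __uint(key_size, sizeof(u32));
    __uint(value_size, sizeof(u32));
} events SEC(".maps");

/* Hypothetical hook chosen for the example. */
SEC("tracepoint/syscalls/sys_enter_execve")
int handle_execve(void *ctx)
{
    /* Zero-initialize so that no uninitialized stack bytes are sent. */
    struct struct_to_give_to_perf event = {};

    event.pid = bpf_get_current_pid_tgid() >> 32;
    /* Assumption: the message is the name of the current command. */
    bpf_get_current_comm(&event.message, sizeof(event.message));

    /* BPF_F_CURRENT_CPU selects the buffer of the CPU running this program. */
    bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU, &event, sizeof(event));
    return 0;
}
```

The last two arguments of `bpf_perf_event_output()` are a pointer to the data and its size, so the whole structure is copied into the buffer as one event.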
Then, when you run the ecli command, you will directly see the output of the buffer in the console:
$ sudo ecli src/package.json
INFO [faerie::elf] strtab: 0x5ee symtab 0x628 relocs 0x670 sh_offset 0x670
INFO [bpf_loader_lib::skeleton::preload::section_loader] User didn't specify custom value for variable __eunomia_dummy_struct_to_give_to_perf_ptr, use the default one in ELF
INFO [bpf_loader_lib::skeleton] Running ebpf program...
TIME PID MESSAGE
21:58:22 26364 New process created
21:58:22 26367 New process created
As you can see, all of the fields of your struct_to_give_to_perf structure are printed in different columns. The name of each field is used as the column name.
You can find the full code here. To download it on your VM:
$ wget --no-check-certificate https://people.montefiore.uliege.be/~gain/courses/info0940/asset/perf_example.tar.gz
$ tar -xzvf perf_example.tar.gz
Hooks (part 2)
Last week, you learned how to use syscall hooks and uprobes. This week, you will learn how to use kprobes.
kprobes
Similar to uprobes, which allow you to hook a function from a user-space application, kprobes allow you to hook a function from the kernel.
The syntax to use kprobes is almost the same as for uprobes: the only difference is that you do not provide a path to the executable where you want to hook the function. For example, to hook the blk_mq_start_request(struct request *rq) function, which is the function that is called when a block device is about to start a request, you can write the following code:
SEC("kprobe/blk_mq_start_request")
int BPF_KPROBE(handle_blk_mq_start_request, struct request *rq)
{
    /* Read rq->start_time_ns from kernel memory. */
    u64 start_time_ns = BPF_CORE_READ(rq, start_time_ns);
    /* Both values are unsigned 64-bit integers, hence %llu. */
    bpf_printk("Timestamp (in nanoseconds) that this request was allocated for this IO: %llu (current time: %llu)\n", start_time_ns, bpf_ktime_get_ns());
    return 0;
}
Note
BPF_CORE_READ is a macro, provided by <bpf/bpf_core_read.h>, that allows you to read a field of a kernel structure. In this case, it reads the start_time_ns field of the rq structure.
Similarly to uprobes, you provide the name of the function you want to hook in the SEC macro, and you use the BPF_KPROBE() macro, whose first argument is the name of the function that will be called when the event is triggered. The remaining arguments are the arguments of the function you want to hook.
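As a second illustration of the same pattern, here is a minimal sketch that hooks `do_unlinkat()`, the kernel function behind file deletion. It assumes the signature `do_unlinkat(int dfd, struct filename *name)` found in recent kernels; check it against the source of the kernel you are running, since internal kernel functions can change between versions. The handler name is hypothetical.

```c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>

char LICENSE[] SEC("license") = "Dual BSD/GPL";

/* The section name gives the kernel function to hook. */
SEC("kprobe/do_unlinkat")
/* First argument: name of the handler. Remaining arguments: the
 * arguments of the hooked function (assumed signature, see above). */
int BPF_KPROBE(handle_do_unlinkat, int dfd, struct filename *name)
{
    pid_t pid = bpf_get_current_pid_tgid() >> 32;
    /* struct filename has a field called name that points to the path. */
    const char *filename = BPF_CORE_READ(name, name);

    bpf_printk("unlink requested by pid %d: %s\n", pid, filename);
    return 0;
}
```

Note that no executable path appears anywhere: the function name in the SEC macro is enough for the kernel to locate the probe point.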
To know which function to hook and what its arguments are, you need to be able to read the kernel source code and/or documentation. To this end, you can check the next page of this tutorial: Kernel Code Overview.
A final note about kprobes is that you should avoid using inline functions as probe points. This is because kprobes may not be able to guarantee that probe points are registered for all instances of that function. To know whether a function is inlined or not, you can check the prototype of the function in the kernel source code.
Note
Instead of using kprobes, you can look into fentry, which offers some advantages over kprobes. It is not covered in this tutorial for brevity (and you won’t need to use both in the project).
Download the full code of this example using:
$ wget --no-check-certificate https://people.montefiore.uliege.be/~gain/courses/info0940/asset/kprobe_example.tar.gz
$ tar -xzvf kprobe_example.tar.gz