eBPF Basics (Part 1)¶

In this tutorial, you will learn the basics of eBPF (extended Berkeley Packet Filter). eBPF is a technology that allows you to run sandboxed programs in the Linux kernel.

The first project of this course will be to write a series of eBPF programs. This tutorial will help you to understand the basics of eBPF and how to write eBPF programs.

How to modify the Linux kernel¶

The Linux kernel is an open-source and free (free as in freedom, not as in free beer) operating system kernel. Therefore, you are free to modify the Linux kernel on your machine (or Virtual Machine) as you wish.

Modifying the kernel can be a fun way to learn more about the Linux kernel and how it works. The privileged nature of the kernel can also be leveraged to implement observability, security, and networking functionality. Companies like Google, Facebook, or Amazon modify the Linux kernel to fit their needs.

There are three conventional ways to run our code in the kernel or modify its behavior:

Modify the kernel source code and recompile the kernel
Write a kernel module and load it into the kernel
Write an eBPF program and load it into the kernel

The kernel is a very complex and large codebase. Therefore, whatever the way you choose to modify the kernel, it will not be an easy task. However, eBPF tries to make this task easier by providing a simple and safe way to run programs in the kernel.

Important

When you will write kernel code, you will not be able to use the standard C library (libc) or the standard C headers. You will need to use other headers and libraries provided by the kernel. For example, to compare two strings in an eBPF program, you will use bpf_strncmp instead of strncmp. This is because the kernel is not a user-space program and does not have access to the same resources as a user-space program.

eBPF Overview¶

eBPF is a technology that allows you to run sandboxed programs in the Linux kernel at runtime. It was introduced in 2014, where it was mainly used for networking purposes. Since then, eBPF has been extended to other parts of the kernel, such as tracing, security, and observability.

You can write a program, compile it to eBPF bytecode, and load it into the kernel. Your bytecode will be executed with the help of a Just-In-Time (JIT) compiler.

A verification engine will also check that your bytecode is safe to run in the kernel (ensures memory safety, prevents loops that might not terminate, etc. This is important because a bug in your program could crash the kernel or compromise the security of the system. This, combined with the sandboxing, means that your code will need to abide by a set of rules to be accepted by the kernel. In addition, you will not be able to directly use all the features of the kernel in your eBPF program, but will be using helper functions instead.

In your program, you will set up hooks to run your eBPF program in kernel mode when a specific event occurs.

The eunomia-bpf Framework¶

Developing eBPF programs can be a bit tricky. For example, the kernel-code development is only a part of the story. User-space code to load the eBPF program into the kernel and interact with it also needs to be written.

For this reason, frameworks such as eunomia-bpf can be used to simplify development, distribution, and execution of eBPF programs. There are multiple development frameworks to choose from, such as BCC (BPF Compiler Collection) libbpf, cilium/ebpf, eunomia-bpf, etc

For your first project, you will use the eunomia-bpf framework to write eBPF programs.

Installing eunomia-bpf¶

In order to compile high-level languages (like C) to eBPF bytecode, eunomia-bpf uses the clang compiler and the LLVM compiler infrastructure. To install these, run the following command:

$ sudo apt install clang llvm

eunomia-bpf provides the ecc command to compile eBPF code. ecc will create a json file that contains the eBPF bytecode. To run the eBPF program, you will need to run the ecli command on the json file. Download those tools by running the following command:

$ wget https://aka.pw/bpf-ecli -O ecli
$ chmod +x ./ecli

$ wget https://github.com/eunomia-bpf/eunomia-bpf/releases/latest/download/ecc
$ chmod +x ./ecc

For ease of use, you can make sure that the ecc and ecli commands are in your $PATH. For example, let’s move the ecc and ecli commands to /usr/local/bin, which is already in your $PATH:

$ sudo mv ecc ecli /usr/local/bin/

Note

An alternative was to update your $PATH to include the directory where the ecc and ecli commands are located.

You can check that the ecc and ecli commands are correctly installed by running the following commands:

$ ecc -h
# you should see the help message of ecc here, not an error message

$ ecli -h
# you should see the help message of ecli here, not an error message

Finally, eunomia-bpf provides a template to write eBPF programs. We simplified this template to avoid unnecessary complexity. Please download the template by running the following command:

$ wget --no-check-certificate https://people.montefiore.uliege.be/~gain/courses/info0940/asset/eBPF_template.tar.gz
$ tar -xzvf eBPF_template.tar.gz

In the template, you will find a Makefile to compile the eBPF program and a bpf.c file that will contain your code. Right now, it contains two include directives and it defines a license; you don’t need to modify these lines. We are going to complete this template in following sections.

Hooks¶

As mentioned above, your eBPF program will be executed when a specific event occurs. These events are called hooks. For example, you can set up a hook to run your eBPF program when a network packet is received.

There are different types of hooks: system calls, function entry/exit, kernel tracepoints, network events, etc.

Today, we will focus on two types of hooks: system calls and uprobes. Next week, other types of hooks will be introduced.

To define a hook in your eBPF program, you will need to use the SEC macro provided by eunomia-bpf. Understanding how this macro works is not crucial for the project, but here is a brief explanation:

(facultative explanation) On Linux, executables use the elf format. This format contains sections that are used to store different types of data or code. For example, the .text section generally contains most of the executable code, the .data section contains the initialized data, etc.

In eBPF, some custom sections are defined to indicate how to treat the code or data in these sections. For example, you can find a section called tracepoint/syscalls/sys_enter_write that indicates the code to run when the syscall write is called.

eunomia-bpf uses the SEC(section_name) macro (which comes from the libbpf library) to define a new section named section_name in the eBPF program.

Note

You can use readelf -S src/template.bpf.o to see the sections of the eBPF program. After adding a hook in your eBPF program and compiling it (in the next section), you should see a new section in the output of the command.

What you need to know is that when you want to define a hook in your eBPF program, you will need to use the SEC macro before your function definition. You can find a list of well-known program sections to use with the SEC macro here.

Let’s move to the next section to see how to use the SEC macro to define a hook in your eBPF program.

System calls hooks¶

System calls are the interface between user-space and kernel-space.

Indeed, without using system calls, user-space programs are tied to their (virtual) address space and can only use unprivileged instructions on the CPU. (e.g., they can declare variables, make calculations, jump to other parts of their code, etc.)

If they want to interact with the hardware, with other processes or access (other) kernel’s services, they need to use system calls. They will then be able to read/write files, create processes, communicate with other processes, etc.

Note

For example, the user-space program that will load your eBPF program into the kernel will use the bpf system call to do so. You can find a list of system calls used in the x86 architecture at /arch/x86/entry/syscalls/syscall_64.tbl in the kernel source code.

Let’s now use the SEC macro to define a hook on a system call. The syntax is:

// (inside the bpf.c file before the function definition)
SEC("tracepoint/syscalls/sys_enter_<syscall_name>")

If you want your code to execute after the system call, you can use:

// (inside the bpf.c file before the function definition)
SEC("tracepoint/syscalls/sys_exit_<syscall_name>")

Note

Tracepoints are predefined hooks in the kernel that allow you to execute code when a specific event occurs.

For example, add this code in your src/template.bpf.c file (after the “Place your code here” comment) to define a hook that will be executed when the execve system call starts:

SEC("tracepoint/syscalls/sys_enter_execve")
int handle_execve(struct trace_event_raw_sys_enter *ctx)
{
    return 0;
}

This code will execute the handle_execve function when the execve system call starts. The execve system call is called when a new program is executed.

Using the SEC macro, you indicated that your handle_execve function should be executed when the execve system call starts (You could have chosen another name for the function). The function you used to hook a syscall will always need to return 0, and will take a parameter of type struct trace_event_raw_sys_enter *. This parameter contains information about the system call that is being hooked.

You can already try to compile your code by running the following command inside the eBPF_template directory:

$ make
cd src && ecc template.bpf.c
INFO [ecc_rs::bpf_compiler] Compiling bpf object...
INFO [ecc_rs::bpf_compiler] Generating package json..
INFO [ecc_rs::bpf_compiler] Packing ebpf object and config into package.json...

And then run it using:

$ sudo ecli src/package.json
[sudo] password for student:
INFO [faerie::elf] strtab: 0x23e symtab 0x278 relocs 0x2c0 sh_offset 0x2c0
INFO [bpf_loader_lib::skeleton::poller] Running ebpf program...

Currently, the handle_execve function does nothing, but you can use the following command (in another terminal inside your VM) to see that your eBPF program is running:

$ sudo bpftool prog show
# [...] You should see other eBPF programs here, then you should see your
# eBPF program at the end of the list looking something like this:
41: tracepoint  name handle_execve  tag 7a016a493ae85c79  gpl
     loaded_at 2025-02-32T12:41:19+0000  uid 0
     xlated 48B  jited 40B  memlock 4096B  map_ids 9
     btf_id 94

To stop the program, you can press Ctrl+C (in the terminal where the ecli command is running). This will also unload the eBPF program from the kernel. (You can check that the program is no longer running by running the sudo bpftool prog show command again.)

Uprobe hooks¶

Hooks are not limited to kernel events such as system calls. You can also set up hooks on user-space events. This means that when a specific user-space function is called, your eBPF program will be executed (in the kernel).

You can place hooks at the beginning, end, or at a specific address of a function. To place a hook, you need to specify:

the name of the function you want to hook. (or its address)
the path of the binary which contains this function.

When your uprobe is registered, its effect is (almost) immediate on all the processes that have been and that will be instantiated from the binary you are hooking.

For example, to add an uprobe on the readline function of the bash binary, you can use the following code:

#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
// ...

SEC("uprobe//usr/bin/bash:readline")
int BPF_UPROBE(my_probe, const char *prompt)
{
    // your code here
    return 0;
};

BPF_UPROBE is a macro that will take as first argument the name of the function that will be executed when the uprobe is hit (my_probe in this case; you can name it however you want). The remaining arguments correspond to the parameters of the function being hooked. In this case, the readline function’s prototype is char *readline(const char *prompt).

Tip

Try and find in the documentation here how to set up an uprobe after the execution of a function.

Knowing how uprobes work is not crucial for the project, but here is a brief explanation:

Note

To fully understand this explanation, you will need to know about inodes and File Control Blocks (Chapter 11: Implementing File Systems) and virtual memory (chapters 6 and 7).

../_images/uprobe.png — Uprobe hook (facultative explanation)¶

(facultative explanation) When you set up an uprobe, the kernel will use the inode of the binary to be able to identify which processes are (or will be) running this binary.

Important

The inode is the Linux equivalent of the FCB (File Control Block), which you learned about in the theoretical part of the course. You can consider that they represent the same thing. Please remember both terms for the exam as M. Mathy uses both.
(facultative explanation) To be able to execute your eBPF code when the uprobe is hit, a breakpoint is placed at the beginning of the function you are hooking. This breakpoint is placed in the virtual address space of the process, not in the code within the program on the disk. The breakpoint is a software interrupt (int3, opcode is 0xCC). The interrupt handler will restore the instruction that was replaced by the breakpoint and make sure to execute your eBPF code when needed.

Helper Functions¶

To avoid compatibility issues with different kernel versions, eBPF programs cannot directly call kernel functions. Instead, they use helper functions that are provided by the kernel. You will therefore need to use helper functions to perform specific tasks in your eBPF program like printing a message, reading from a specific data structure called maps (that will be introduced next week), get information about a process, etc.

Here is the documentation of helper functions.

For example, let’s use the bpf_printk helper function to print a message in the kernel logs. Add the following line to the handle_execve function:

bpf_printk("Hello, World!\n");

Note

The bpf_printk is very simple to use, but it is not the most efficient way to debug your eBPF program or communicate with the user-space. For example, you are limited to 3 parameters. However, for now it is enough to see that your eBPF program is running.

You can then compile and run your program as explained above. You can access the kernel logs by running the following command:

$ sudo cat /sys/kernel/debug/tracing/trace_pipe

Now, you should see the message Hello, World! in the kernel logs.

If you try to open new terminals or launch a new process, you will see that new Hello, World! messages are printed in the kernel logs. This is because the execve system call is called when a new process is executed.

Another helper function that you will probably need in the project is the bpf_get_current_pid_tgid function. This function returns the PID and the TID of the current task. For example, to get the pid of the process that called the execve system call, you can use:

pid_t pid = bpf_get_current_pid_tgid() >> 32;

Important Resources¶

If you want an overview of eBPF, you can check the following website (reexplains what was said above, and more): https://ebpf.io/what-is-ebpf/
To find out the different hooks, helper functions, etc available, check the eBPF kernel-related documentation: https://docs.ebpf.io/
Checking the eunomia-bpf tutorial and examples will probably be very useful. This has been the main source of inspiration for tutorials 3 and 4, as well as the first project: https://eunomia.dev/tutorials/
If you want to hook specific kernel functions, you will maybe need to check the kernel source code and/or the kernel documentation of kernel 6.8 (which is the kernel you use in the VM).

Summary: The Life Cycle of an eBPF Program¶

Here is a summary of the life cycle of an eBPF program:

Write your eBPF program in a high-level language (like C).
Compile your eBPF program to eBPF bytecode using the ecc command.
Load your eBPF program into the kernel using the ecli user-space program.
Your eBPF program will be executed in kernel mode when a specific event occurs (hook). It will use helper functions to use kernel features and will communicate with user-space via the kernel logs (you will learn better ways to communicate with user-space next week).
Unload your eBPF program from the kernel using Ctrl+C on the terminal where the ecli command is running.

Now, you have all the tools you need to start the first challenge of project 1.

Notions that you might come across¶

BPF vs eBPF: eBPF stands for extended BPF. BPF (Berkeley Packet Filter) was introduced in 1992 and was used to filter packets in the kernel. eBPF was introduced in 2014 and extended BPF to allow you to run sandboxed programs in the kernel. Since eBPF is not limited to networking anymore, the acronym does not really make sense anymore.

Important

In the kernel documentation/source, “BPF” is sometimes used to refer to eBPF.

CO-RE: Compile Once, Run Everywhere. It simply means that you can compile your eBPF program once and run it on different kernels.