Challenge 4: Page Fault Stats
=============================

In this challenge, you'll dive into eBPF to monitor page faults: those moments when a process accesses memory that isn't currently mapped into its virtual address space, triggering a costly operation. Your goal is to track and display page faults for every process that generates them. To avoid flooding the user with messages, you will only display a message every *X* page faults per process, where *X* is the configurable *log_step*.

Description
-----------

Your goal is to develop an eBPF program that tracks the number of page faults generated by **each process** running on the system and prints the process ID, the name and the number of page faults of the process.

"Page fault" is used here as the generic term for the situation in which a program accesses memory in a way its current mappings do not allow. For instance, this can happen when a program tries to access a page that is not currently mapped in its address space, or when it tries to write to a read-only page. In both cases, the kernel must intervene to resolve the situation, which can be costly in terms of performance.

To help you test your eBPF program, you are provided with a user-space program that generates a number of page faults.

Setup
-----

Download the files for this challenge using:

.. code-block:: bash

    $ wget --no-check-certificate https://people.montefiore.uliege.be/~gain/courses/info0940/asset/page_faults.tar.gz
    $ tar -xzvf page_faults.tar.gz

The ``page_fault_gen`` program is located in ``page_faults/page_fault_gen``. You can compile it using the provided Makefile (simply run ``make`` within the ``page_fault_gen`` directory). Then you can run it:

.. code-block:: bash

    $ ./page_fault_gen <num_page_faults>

This program generates a number of page faults that is **approximately** equal to ``num_page_faults``. It is a helper that makes it easier to test your eBPF program.
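
To give a feel for how a page-fault generator can work, here is a minimal user-space sketch. It is not the provided ``page_fault_gen`` (which may be implemented differently); the helper names ``touch_pages`` and ``minor_faults`` are hypothetical. The idea, assuming a Linux system, is that writing to a freshly mapped anonymous page for the first time usually forces the kernel to handle a page fault.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Minor page faults taken so far by this process, or -1 on error. */
static long minor_faults(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt;
}

/*
 * Map num_pages of fresh anonymous memory and write one byte to each page.
 * Returns the number of pages touched, or 0 if nothing could be mapped.
 */
static size_t touch_pages(size_t num_pages)
{
    long page_size = sysconf(_SC_PAGESIZE);

    if (page_size <= 0 || num_pages == 0)
        return 0;

    size_t length = num_pages * (size_t)page_size;
    volatile unsigned char *mem = mmap(NULL, length, PROT_READ | PROT_WRITE,
                                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return 0;

    /* First write to each page: the kernel has to back it with memory. */
    for (size_t i = 0; i < num_pages; i++)
        mem[i * (size_t)page_size] = 1;

    munmap((void *)mem, length);
    return num_pages;
}
```

Comparing ``minor_faults()`` before and after a call to ``touch_pages(n)`` gives a rough idea of how many faults were triggered. The exact number depends on the kernel and its configuration, which is also why the count produced by a generator can only be approximate.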

Inside ``page_faults/src``, you will find the same template as in tutorial 3. Use it to implement the eBPF program that tracks the number of page faults generated by all the processes running on the system.

What you need to do
-------------------

Your task is to develop an eBPF program that monitors and tracks the number of page faults for each process and generates a message whenever the number of page faults of a particular process is a multiple of *log_step*. To achieve this, you will need to efficiently store and update fault statistics using eBPF maps. Your program should also send the relevant data to user space by using **perf buffers**.

All in all, the eBPF program should be able to:

1. Track the number of page faults generated by each process running on the system.
2. Print the process ID, name and number of page faults when the number of page faults of a process is a multiple of *log_step*. By default, *log_step* is 50. You must use a **perf buffer** to print this information.

As indicated, *log_step* needs to be configurable. Therefore, when loading the eBPF program (using the ``ecli`` command), the following argument can be provided:

- ``--log_step``: The *log_step* used to determine when to print a message. The default should be 50.

This is an example of the output that your program should generate:

.. code-block:: none

    $ sudo ecli src/package.json --log_step 70
    INFO [faerie::elf] strtab: 0xa58 symtab 0xa90 relocs 0xad8 sh_offset 0xad8
    INFO [bpf_loader_lib::skeleton::preload::section_loader] load runtime arg (user specified the value through cli, or predefined in the skeleton) for log_step: Number(70), real_type= 'int' bits:32 off:0 enc:signed, btf_type=BtfVar { name: "log_step", type_id: 29, kind: GlobalAlloc }
    [...] # Other information provided by ecli
    INFO [bpf_loader_lib::skeleton] Running ebpf program...
    TIME     PID    COMM     NB_PAGE_FAULT
    10:45:16 2134   gcc      70
    10:45:16 2134   gcc      140
    10:45:16 2134   gcc      210
    10:45:16 1531   bash     70       # <-
    10:45:16 2135   bash     70       # <- As you can see here, multiple different
    10:45:16 1531   bash     140      # <- bash processes are running. Their page
    10:45:16 2136   bash     70       # <- faults are tracked separately!
    10:45:16 1531   bash     210      # <-
    10:45:16 1531   bash     280      # <-
    10:45:18 2137   gcc      70
    10:45:18 2138   cc1      70
    10:45:18 2138   cc1      140
    10:45:18 2138   cc1      210
    10:45:18 2138   cc1      280
    10:45:18 2138   cc1      350
    10:45:18 2138   cc1      420
    10:45:18 2138   cc1      490
    10:45:18 2138   cc1      560
    10:45:18 2139   as       70
    10:45:18 2139   as       140
    10:45:18 2138   cc1      630
    10:45:18 2138   cc1      700
    10:45:18 2138   cc1      770
    10:45:18 2138   cc1      840
    10:45:18 2139   as       210
    10:45:18 2139   as       280
    10:45:18 2140   collect2 70

.. tip::

    Much of the kernel code related to page faults is architecture-dependent, and some functions cannot be hooked on all architectures. The actual implementation of page fault handling is complex, but there is a key function where the handling process begins and that can be hooked on all architectures. The following resources can help you identify it: `mmu-tlb-and-page-faults `_ and `chapter 4.6.1 Handling a Page Fault `_.

.. important::

    - The eBPF program is not specific to one process: it should track the page faults generated by all the processes running on the system.
    - When several processes with the same name are running, the eBPF program should **not** aggregate their page faults. The number of page faults must be tracked per process ID (PID).
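
The counting rule described under *What you need to do* (one counter per PID, one message each time a counter reaches a multiple of *log_step*) can be sketched in plain user-space C. This only illustrates the arithmetic and is not an eBPF program: the fixed-size table stands in for an eBPF map keyed by PID, and the names ``fault_table``, ``record_fault`` and ``MAX_TRACKED`` are hypothetical. In an actual solution, a nonzero return value would correspond to the moment a record is pushed to the perf buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED 1024 /* hypothetical capacity of the stand-in table */

struct fault_entry {
    int32_t  pid;   /* key: one entry per process ID, never per name */
    uint64_t count; /* page faults seen so far for this PID */
    bool     used;
};

struct fault_table {
    struct fault_entry slots[MAX_TRACKED];
};

/*
 * Record one page fault for pid.
 * Returns the new count when it is a multiple of log_step (a message
 * should be emitted), and 0 otherwise (including when the table is full
 * or log_step is 0).
 */
static uint64_t record_fault(struct fault_table *t, int32_t pid,
                             uint64_t log_step)
{
    if (log_step == 0)
        return 0;

    size_t start = (uint32_t)pid % MAX_TRACKED;

    /* Linear probing: find the slot for pid, or claim the first free one. */
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        struct fault_entry *e = &t->slots[(start + i) % MAX_TRACKED];

        if (!e->used) {
            e->used = true;
            e->pid = pid;
            e->count = 0;
        }
        if (e->pid == pid) {
            e->count++;
            return (e->count % log_step == 0) ? e->count : 0;
        }
    }
    return 0; /* table full: this fault is not tracked */
}
```

With a *log_step* of 70, feeding 140 faults for PID 2134 into ``record_fault`` yields exactly two reports, at 70 and 140, which matches the first lines of the sample output. Two processes that share a name but have different PIDs occupy different entries, so their counts never mix.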