Getting Kernel Information
==========================

Once again, this part of the tutorial is not required to complete the projects.

The pseudo filesystem (procfs)
------------------------------

Linux offers various interfaces to expose internal kernel information. One of them is procfs (the process filesystem), which exposes information through files under ``/proc``. Each file in this directory is an interface to a particular piece of information.

procfs is mounted at ``/proc``, directly under the root of the file system
(``/``). The data under procfs is not persistent: it is generated in memory by
the kernel and is never stored on disk.

Obtain General Information about the Kernel
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some of the files exposed via procfs:

* ``/proc/cpuinfo``: Provides CPU details such as the number of cores, cache size, model, etc.
* ``/proc/meminfo``: Provides information on physical memory
* ``/proc/interrupts``: Information about interrupts
* ``/proc/vmstat``: Virtual memory stats
* ``/proc/filesystems``: Listing of the filesystems supported by the kernel
* ... 

For example, the following command prints CPU information from procfs::

  $ cat /proc/cpuinfo

You can use grep and pipes to search for specific information::

  $ cat /proc/cpuinfo | grep "model name" | uniq

If you wish to get memory information, you can enter the following command::

  $ cat /proc/meminfo
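
The same file can also be parsed from a program. Below is a minimal sketch, assuming each line of ``/proc/meminfo`` has the form ``<name>: <value> [unit]``. The function name ``parse_meminfo_line`` and the sample value in the comment are hypothetical and only serve the illustration.

.. code-block:: C

   #include <stdio.h>

   /* Parse one line of /proc/meminfo, e.g. "MemTotal:       16314532 kB"
    * (illustrative value). On success, store the field name in `key` and
    * the number in `*value`, then return 1. Return 0 if the line does not
    * match the assumed "<name>: <value>" layout. */
   int parse_meminfo_line(const char *line, char key[64],
                          unsigned long long *value)
   {
       /* %63[^:] reads at most 63 characters up to the colon;
        * %llu skips leading spaces and reads the number.
        * The trailing unit, if any, is ignored. */
       return sscanf(line, " %63[^:]: %llu", key, value) == 2;
   }

Such a function could be called on every line returned by ``fgets`` to look up a single field, for instance ``MemTotal``.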

Obtain Information about a Specific Process
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To access information about a specific process, look in the directory ``/proc/$PID``, where ``$PID`` is the PID of the process. For instance, we can see the general status of the process whose PID is 1 (the init process)::

  $ sudo cat /proc/1/status
  Name:   systemd
  Umask:  0000
  State:  S (sleeping)
  Tgid:   1
  [...]
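
From a program, the path of a per-process file has to be built from the PID first. The following is a small sketch of that step; the helper name ``proc_path`` is hypothetical and not part of any library.

.. code-block:: C

   #include <stdio.h>
   #include <sys/types.h>

   /* Build "/proc/<pid>/<entry>" into `out`, which holds `size` bytes.
    * Return 0 on success, -1 if the buffer is too small or formatting fails. */
   int proc_path(char *out, size_t size, pid_t pid, const char *entry)
   {
       int n = snprintf(out, size, "/proc/%ld/%s", (long)pid, entry);
       return (n < 0 || (size_t)n >= size) ? -1 : 0;
   }

For example, ``proc_path(path, sizeof(path), 1, "status")`` fills ``path`` with ``/proc/1/status``, which can then be passed to ``fopen``.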

Another example is the virtual address mapping of a specific process (i.e., where the various parts of the process, such as the stack and the heap, are located in virtual memory), which is available in ``/proc/$PID/maps``. Here, we show the virtual address mapping of the init process::

  $ sudo cat /proc/1/maps
  625610a84000-625610a8a000 r--p 00000000 08:02 1846356                    /usr/lib/systemd/systemd
  625610a8a000-625610a95000 r-xp 00006000 08:02 1846356                    /usr/lib/systemd/systemd
  625610a95000-625610a9b000 r--p 00011000 08:02 1846356                    /usr/lib/systemd/systemd
  625610a9b000-625610a9d000 r--p 00016000 08:02 1846356                    /usr/lib/systemd/systemd
  625610a9d000-625610a9e000 rw-p 00018000 08:02 1846356                    /usr/lib/systemd/systemd
  62563071e000-625630986000 rw-p 00000000 00:00 0                          [heap]
  742b46eac000-742b46eae000 r--p 00000000 08:02 1847424                    /usr/lib/x86_64-linux-gnu/libpcre2-8.so.0.11.2
  [...]

.. note:: You will understand what "virtual memory" and "virtual address" mean
   in chapters 6 and 7 of the theoretical course. Right now, simply consider
   that they are "addresses" without thinking too much about it.

Each row in ``/proc/$PID/maps`` describes a region of contiguous virtual memory in a process or thread. Each row has the following fields:

.. code-block:: none

  address           perms offset  dev   inode   pathname
  08048000-08056000 r-xp 00000000 03:0c 64593   /usr/sbin/ls

In this example, the ``address`` field gives the start and end addresses of the
region in the process's address space. The ``perms`` field describes how the
region can be accessed. It consists of four characters: read (``r``), write
(``w``), execute (``x``), and either shared (``s``) or private (``p``). A ``-``
means that the corresponding permission is not granted. If the process attempts
to access memory in a way that is not permitted, a segmentation fault is
generated. The other fields can be ignored for now.
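
To make the meaning of the ``perms`` field concrete, here is a short sketch that checks whether a given access is permitted by a ``perms`` string. The function names ``perms_allow`` and ``perms_shared`` are hypothetical and chosen for this illustration only.

.. code-block:: C

   #include <string.h>

   /* Return 1 if the `perms` field of a maps line (e.g. "r-xp") grants the
    * requested access ('r', 'w' or 'x'), 0 otherwise. */
   int perms_allow(const char *perms, char access)
   {
       if (strlen(perms) < 4)
           return 0;               /* not a complete perms field */

       switch (access) {
       case 'r': return perms[0] == 'r';
       case 'w': return perms[1] == 'w';
       case 'x': return perms[2] == 'x';
       default:  return 0;         /* unknown access type */
       }
   }

   /* Return 1 if the region is shared ('s'),
    * 0 otherwise (private or malformed field). */
   int perms_shared(const char *perms)
   {
       return strlen(perms) >= 4 && perms[3] == 's';
   }

With the example line above, ``perms_allow("r-xp", 'w')`` returns 0: a write to that region is the kind of access that ends in a segmentation fault.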

For the following test, we will run a dummy ``sleep`` process in the background with the following command:

.. code-block:: shell

  $ sleep 3600 &    # The "&" character runs the process in the background

There are several ways to retrieve the PID of a specific process. To retrieve the PID of the *sleep* process, enter one of the following commands:

.. code-block:: shell

  $ pidof sleep           # Method 1 (may fail on some systems)
  $ ps -aux | grep sleep  # Method 2

Note that the ``ps`` command can list information about all running processes.

You can now see the virtual address mapping of our "sleep" process with the following command:

.. code-block:: shell

  $ cat /proc/$(pidof sleep)/maps

Note that only one sleep process should be running at a time. Otherwise,
``pidof`` returns several PIDs and the command above fails. To kill all sleep
instances, you can use the following command:

.. code-block:: shell

  $ kill -9 $(pidof sleep)
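
The PID lookup can also be sketched in C by scanning procfs directly. The example below assumes that every process has a ``/proc/<pid>/comm`` file containing its command name. The function name ``find_pid_by_name`` is hypothetical, and the code is a simplified illustration rather than a reimplementation of ``pidof``.

.. code-block:: C

   #include <ctype.h>
   #include <dirent.h>
   #include <stdio.h>
   #include <stdlib.h>
   #include <string.h>
   #include <sys/types.h>

   /* Return the PID of the first process whose /proc/<pid>/comm equals
    * `name`, or -1 if none is found. Note that the kernel truncates the
    * command name stored in comm to 15 characters. */
   pid_t find_pid_by_name(const char *name)
   {
       DIR *proc = opendir("/proc");
       if (!proc)
           return -1;

       pid_t found = -1;
       struct dirent *entry;
       while (found == -1 && (entry = readdir(proc)) != NULL) {
           /* Per-process directories are named after the PID, so we
            * only look at entries whose name starts with a digit. */
           if (!isdigit((unsigned char)entry->d_name[0]))
               continue;

           char path[280];
           snprintf(path, sizeof(path), "/proc/%s/comm", entry->d_name);

           FILE *fp = fopen(path, "r");
           if (!fp)
               continue;           /* the process may have exited meanwhile */

           char comm[64] = {0};
           if (fgets(comm, sizeof(comm), fp) != NULL) {
               comm[strcspn(comm, "\n")] = '\0';   /* strip the newline */
               if (strcmp(comm, name) == 0)
                   found = (pid_t)strtol(entry->d_name, NULL, 10);
           }
           fclose(fp);
       }

       closedir(proc);
       return found;
   }

Calling ``find_pid_by_name("sleep")`` while the dummy process is running should return its PID. If several instances exist, only the first one found is returned.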

Accessing procfs from a C program
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Finally, we can easily gather information from a program by reading procfs. We just need to open a procfs file like any regular file and read it with the standard read operations, as shown below (C example):

.. code-block:: C
  
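  #include <stdio.h>

  int main(void)
  {
    char buffer[1024];  /* holds one line of the file at a time */

    FILE *fp = fopen("/proc/cpuinfo", "r");
    if (!fp) {
      perror("fopen");
      return 1;
    }

    /* Read the file line by line until end of file. */
    while (fgets(buffer, sizeof(buffer), fp) != NULL) {
      fputs(buffer, stdout);  /* process the line here */
    }

    fclose(fp);
    return 0;
  }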


Exercises
---------

As a first exercise, write a shell script and a C program that print the list of network interfaces together with the number of packets received, in the following format::

  <if_name>:<nb_packet_rcvd> packets received

For this, you must parse the ``/proc/net/dev`` file. An example of expected output is shown below::

  lo:26365903 packets received
  eno1:101094276 packets received
  eno2:0 packets received
  virbr0:30529246 packets received
  virbr0-nic:0 packets received
  docker0:20041 packets received
  br-64bf554fcbdb:140635 packets received

.. raw:: html

   <details>
   <summary><a>See C answer</a></summary>

.. code-block:: C

   #include <stdio.h>

   int main(void)
   {
       FILE *fp = fopen("/proc/net/dev", "r");
       if (!fp)
       {
           perror("fopen");
           return 1;
       }

       char buffer[1024] = {0};
       unsigned line_count = 0;

       while (fgets(buffer, sizeof(buffer), fp) != NULL)
       {
           // Skip the two header lines
           if (line_count < 2)
           {
               line_count++;
               continue;
           }

           // Each line looks like "<name>: <rx_bytes> <rx_packets> ...".
           // The name may be preceded by spaces and ends with a colon.
           // The number of packets received is the second value after it.
           char name[64] = {0};
           unsigned long long nb_packets = 0;
           if (sscanf(buffer, " %63[^:]: %*s %llu", name, &nb_packets) != 2)
           {
               // sscanf does not set errno on a mismatch, so perror is not used
               fprintf(stderr, "unexpected line format: %s", buffer);
               fclose(fp);
               return 1;
           }

           printf("%s:%llu packets received\n", name, nb_packets);
       }

       fclose(fp);
       return 0;
   }

.. raw:: html

   </details>

.. raw:: html

   <details>
   <summary><a>See shell script answer</a></summary>

.. code-block:: shell

    #!/bin/sh

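    tail -n +3 /proc/net/dev | awk '{print $1 $3 " packets received"}'

``tail -n +3`` skips the two header lines of the file. In each remaining line, ``$1`` is the interface name (including the colon) and ``$3`` is the number of packets received.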

.. raw:: html

   </details>

.. note:: As you can see, the shell script is much more concise than the C
   program |:slight_smile:|

**Additional Exercise** (no solution is provided, but don't hesitate to ask for a review of your code): Implement a C program and/or a shell script that displays the virtual address mapping of a specific process in the following format::

 $ ./parse_maps sleep
 PID of sleep: 7654
  VMA 1: 
    - Start Addr:   0x555555400000
    - End   Addr:   0x555555550000
    - Size:         0x150000
    - Permissions:  READ/-/EXECUTE
  VMA 2: 
    - Start Addr:   0x55555574f000
    - End   Addr:   0x55555578a000
    - Size:         0x3b000
    - Permissions:  READ/WRITE/EXECUTE
 [...]
  VMA 99: 
    - Start Addr:   0x7ffff759c000
    - End   Addr:   0x7ffff759d000
    - Size:         0x1000
    - Permissions:  READ/WRITE/-

You must read the pseudo filesystem to perform this task. Here, each VMA represents a distinct virtual memory area (i.e. one line of the ``/proc/$PID/maps`` pseudo file). For this exercise, consider only the first three permissions.