Project 2024

Get the serial code and sample input data

Once you are connected to NIC5, you can retrieve the serial code by running the following commands:

module load Info0939Tools

get_info0939_project

Compile and run the serial code

To compile the serial code, load the GCC module and run the following compilation command:

module load GCC

cd project_info0939
gcc -O3 -o shallow shallow.c -lm

This will generate an executable named shallow. You can test the executable by running one of the simple (and small) examples:

cd example_inputs/simple
../../shallow param_simple.txt

In this example, the serial code is run on the login node of NIC5. This is fine for such a small example. However, it is not considered good practice to run your code on the login node(s) of HPC clusters (see Rule #3 of the fair use of CÉCI clusters). In general, applications should run on a compute node. See this section of the lecture notes for a detailed description of how to launch the serial code on a compute node.

Visualize the results

The code produces a series of VTK Image files (.vti) as well as a manifest in the ParaView Data file format (.pvd). To visualize these files, you can use ParaView. ParaView is available for all major operating systems.

Basic visualization

Before starting the visualization, you need to copy the output files to your computer. How to do that is described on this page. Note that it is easier to create a Zip file before copying the outputs. The creation of a Zip file is described here.

Open ParaView, navigate to File > Open and select the pvd file you want to open. For the simple example, the file name is eta_simple.pvd. Next, click the Apply button.

By default, ParaView uses the first time step to determine the range of the values. To improve the visualization, it is preferable to use a range that covers the entire duration of the simulation. To do that, click the Rescale to data range over all timesteps button.

Next, click the Play button to perform the visualization.

If you followed all the steps above, at the end of the animation, for the simple case, you should have something looking like this

Apply a filter for 3D visualization

To visualize the result in 3D, we apply a Warp By Scalar filter. To do that, right-click on the opened file in the Pipeline Browser and select Add Filter > Alphabetical > Warp By Scalar.

A new element will appear in the Pipeline Browser. Next, we set a scaling factor for the filter: select the WarpByScalar1 filter in the Pipeline Browser (it should be selected by default) and set the Scale Factor.

Next, we apply the filter by clicking on the Apply button.

As we are now visualizing in 3D, we will change the ParaView Interaction Mode from 2D to 3D by pressing the 2D button at the top of the rendering pane.

Once you have activated the 3D Interaction Mode, you can use your mouse to manipulate the 3D visualization. After pressing the Play button, you should have something that looks like this

Export the visualization as a movie

To export your visualization as a movie, select File > Save Animation...

Choose the name under which you wish to save the animation. For example, you can use shallow as the name.

Click on the OK button. This brings you to the next dialog box, where you can select the options for the movie. Select the Image Resolution you want to use. You can also change the Frame Rate to a higher value than the default value of 1. For example, you can use 10. After you click the OK button, a dialog box will be displayed to indicate the progress of the export.

Modifications to write the VTK files (MPI implementation)

The easiest strategy to handle the output files for your MPI implementation is to write a VTK file for each rank. For each rank, we will add an origin field in order to be able to visualize all the files at the same time in ParaView.

Step 1: Add offsetx and offsety fields to the data structure:

struct data {
  int nx, ny;
  int offsetx, offsety;
  double dx, dy;
  double *values;
};

These offset fields hold the starting indices of each rank's subdomain in the global grid. In order to initialize the data structures with these offsets, you can modify the init_data function:

int init_data(struct data *data, int nx, int ny, int offsetx, int offsety,
              double dx, double dy, double val)
{
  data->nx = nx;
  data->ny = ny;
  data->offsetx = offsetx;
  data->offsety = offsety;
  data->dx = dx;
  data->dy = dy;
  data->values = (double*)malloc(nx * ny * sizeof(double));
  if(!data->values){
    printf("Error: Could not allocate data\n");
    return 1;
  }
  for(int j = 0; j < ny; j++) {
    for(int i = 0; i < nx; i++) {
      SET(data, i, j, val);
    }
  }
  return 0;
}
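
The offsets passed to init_data depend on how you split the global grid between ranks. As an illustration only, the sketch below computes the number of points and the starting index of one rank along a single axis. The helper name split_1d and the choice of giving the leftover points to the first ranks are assumptions, not part of the provided code; your decomposition may differ.

```c
#include <assert.h>

/* Hypothetical helper (not part of shallow.c): split n grid points among
 * nparts ranks along one axis. The first (n % nparts) ranks receive one
 * extra point, so the subdomain sizes differ by at most one. */
static void split_1d(int n, int nparts, int part, int *count, int *offset)
{
  assert(nparts > 0 && part >= 0 && part < nparts);
  int base = n / nparts;  /* points every rank gets */
  int rem = n % nparts;   /* leftover points, given to the first ranks */
  *count = base + (part < rem ? 1 : 0);
  *offset = part * base + (part < rem ? part : rem);
}
```

For a 1D decomposition along x, count and offset would be passed as nx and offsetx to init_data, with offsety set to 0. For a 2D decomposition, the same helper can be called once per axis with the rank's coordinates in the process grid.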

Step 2: Modify the write_data_vtk and write_manifest_vtk functions. These functions take new arguments: a rank argument for write_data_vtk and a numranks argument for write_manifest_vtk.

int write_data_vtk(const struct data *data, const char *name,
                   const char *filename, int step, int rank)
{
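  char out[512];
  /* snprintf bounds the write so that a long filename cannot overflow out */
  if(step < 0)
    snprintf(out, sizeof(out), "%s_rank%d.vti", filename, rank);
  else
    snprintf(out, sizeof(out), "%s_rank%d_%d.vti", filename, rank, step);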

  FILE *fp = fopen(out, "wb");
  if(!fp) {
    printf("Error: Could not open output VTK file '%s'\n", out);
    return 1;
  }

  /* cast before multiplying so the product is computed in 64 bits */
  uint64_t num_points = (uint64_t)data->nx * (uint64_t)data->ny;
  uint64_t num_bytes = num_points * sizeof(double);

  fprintf(fp, "<?xml version=\"1.0\"?>\n"
              "<VTKFile"
              " type=\"ImageData\""
              " version=\"1.0\""
              " byte_order=\"LittleEndian\""
              " header_type=\"UInt64\""
              ">\n"
              "  <ImageData"
              " WholeExtent=\"0 %d 0 %d 0 0\""
              " Spacing=\"%lf %lf 0.0\""
              " Origin=\"%lf %lf 0\""
              ">\n"
              "    <Piece Extent=\"0 %d 0 %d 0 0\">\n"
              "      <PointData Scalars=\"scalar_data\">\n"
              "        <DataArray"
              " type=\"Float64\""
              " Name=\"%s\""
              " format=\"appended\""
              " offset=\"0\""
              ">\n"
              "        </DataArray>\n"
              "      </PointData>\n"
              "    </Piece>\n"
              "  </ImageData>\n"
              "  <AppendedData encoding=\"raw\">\n_",
              data->nx - 1, data->ny - 1,
              data->dx, data->dy,
              data->offsetx * data->dx, data->offsety * data->dy,
              data->nx - 1, data->ny - 1,
              name);

  fwrite(&num_bytes, sizeof(uint64_t), 1, fp);
  fwrite(data->values, sizeof(double), num_points, fp);

  fprintf(fp, "  </AppendedData>\n"
              "</VTKFile>\n");

  fclose(fp);

  return 0;
}

int write_manifest_vtk(const char *filename, double dt, int nt,
                       int sampling_rate, int numranks)
{
  char out[512];
  /* snprintf bounds the write so that a long filename cannot overflow out */
  snprintf(out, sizeof(out), "%s.pvd", filename);

  FILE *fp = fopen(out, "wb");

  if(!fp) {
    printf("Error: Could not open output VTK manifest file '%s'\n", out);
    return 1;
  }

  fprintf(fp, "<VTKFile"
              " type=\"Collection\""
              " version=\"0.1\""
              " byte_order=\"LittleEndian\">\n"
              "  <Collection>\n");

  for(int n = 0; n < nt; n++) {
    if(sampling_rate && !(n % sampling_rate)) {
      double t = n * dt;
      for (int rank = 0; rank < numranks; rank++) {
        fprintf(fp, "    <DataSet timestep=\"%g\" part=\"%d\" file='%s_rank%d_%d.vti'/>\n",
                t, rank, filename, rank, n);
      }
    }
  }

  fprintf(fp, "  </Collection>\n"
              "</VTKFile>\n");

  fclose(fp);

  return 0;
}
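
The file names built in write_data_vtk must match the names listed by write_manifest_vtk exactly, otherwise ParaView will not find the pieces. One way to keep them in sync is to build the name in a single place. The sketch below shows this idea; the helper name vti_name is hypothetical and is not part of the provided code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: build the per-rank .vti file name using the same
 * pattern as write_data_vtk. A negative step omits the step suffix.
 * Returns 0 on success and 1 if the name does not fit in the buffer. */
static int vti_name(char *out, size_t size, const char *filename,
                    int rank, int step)
{
  assert(out && size > 0);
  int n;
  if(step < 0)
    n = snprintf(out, size, "%s_rank%d.vti", filename, rank);
  else
    n = snprintf(out, size, "%s_rank%d_%d.vti", filename, rank, step);
  /* snprintf returns the length it wanted to write; compare to detect
   * truncation */
  return (n < 0 || (size_t)n >= size) ? 1 : 0;
}
```

Both functions could then call this helper instead of repeating the format string, so a later change of the naming pattern only has to be made once.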

Step 3: Modify the rest of your code to take into account the changes to the init_data, write_data_vtk and write_manifest_vtk functions.
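
When adapting the time loop, each rank has to write its .vti file at exactly the steps listed in the manifest. The sketch below mirrors the condition used in write_manifest_vtk and counts how many .vti files the manifest refers to, which can help to check that all outputs were produced and copied. The helper names is_output_step and count_vti_files are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: same test as in write_manifest_vtk. Returns 1 if
 * step n is an output step, 0 otherwise (always 0 if sampling_rate is 0). */
static int is_output_step(int n, int sampling_rate)
{
  return sampling_rate && !(n % sampling_rate);
}

/* Hypothetical helper: number of .vti files referenced by the manifest,
 * i.e. one file per rank for each output step in [0, nt). */
static long count_vti_files(int nt, int sampling_rate, int numranks)
{
  assert(nt >= 0 && numranks > 0);
  long steps = 0;
  for(int n = 0; n < nt; n++) {
    if(is_output_step(n, sampling_rate))
      steps++;
  }
  return steps * numranks;
}
```

For example, with nt = 100, a sampling rate of 10 and 4 ranks, the manifest lists 10 output steps and therefore 40 files. Since the manifest is a single file that lists the pieces of all ranks, it is enough for one rank (for instance rank 0) to write it.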

Intermediate deadline

For the intermediate deadline, student #1 of each group should copy the most recent working version of the group's code into a directory named intermediate in his/her home directory.

Final deadline

For the final deadline, student #1 of each group should copy the group's codes into a directory named final in his/her home directory on NIC5. A single CPU code (MPI+OpenMP) and a single GPU code (OpenMP, or MPI+OpenMP if topic 2 was chosen for the second phase) should be submitted. A README file can be added if you want to communicate additional information related to your implementation. No report (slides, results, ...) is necessary at this point.

Oral exam

For the oral exam, you should prepare a presentation with answers to the questions listed in the project handout, and with a summary of your main results (scaling, etc.). Your presentation should last about 10 minutes. The remaining time of the exam will consist of a Q&A session, where each group member should be ready to explain results and implementation details, modify the code, and compile and run it on the NIC5 and Lucia clusters.

(You are of course allowed to modify your code between the final deadline and the date of the oral exam, if you wish e.g. to implement additional features.)