INFO0902 - Data structures and algorithms

Random CS quote

A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.

Robert E. Kahn

Informations

Schedule

/ 09 Feb. 2018
Ex.
Project
Tutorial
16 Feb. 2018

Exercise session 1: Pseudo-code and recursion

YouTube More Sierpinsky's triangle-like shapes

Statement for project 0
Code

Tutorial: Let's C

Deadline
22 Feb. 2018

Optional project deadline

Feedback Ex. 23 Feb. 2018

Feedback on project 0

Exercise session 2: Analysis tools

YouTube Big O notation

Project Ex. 02 Mar. 2018

Statement for project 1
Code

Exercise session 2: Analysis tools (second part)

/ 09 Mar. 2018
Ex. 16 Mar. 2018 Exercise session 3: Stacks, Queues, Lists, Vectors and Sequences
Deadline
Project
Ex.
23 Mar. 2018

Don't forget to submit your result for the first assignment on the submission platform

Assignment 2:

  • Statement
  • Code
    • Heap.h updated on 28th Mar.
    • Heap.h updated on 16th Apr.

Exercise session 4: Heaps, Priority queues and Trees (first part)

Ex. 30 Mar. 2018

Exercise session 5: Dictionaries--binary search trees

/ 7 Apr. 2018

Easter holiday

/ 13 Apr. 2018

Easter holiday

Deadline
22 Apr. 2018

Don't forget to submit your result for the second assignment on the submission platform

Project
Ex.
20 Apr. 2018

Assignment 3:

Exercise session 6: Dictionaries--hash tables

Ex. 27 Apr. 2018

Exercise session 7: Brute force and dynamic programming

Ex. 04 May 2018

Exercise session 8: Divide and conquer and greedy algorithms

/ 11 May 2018

Break

Deadline
Ex.
18 May 2018

Assignment:

  • Don't forget to submit your result for the third assigment on the submission platform
  • Check out the FAQ for the third assignment

Exercise session 9: Graphs

Project
04 Jul. 2018

Statement for 2nd session assignment
Code

Deadline
14 Aug. 2018

Don't forget to submit your result for the 2nd session assignment on the submission platform

FAQ for the second assignment

This design choice is left to you. Note, however, that the concept of median cannot be present in the heap implementation; the implementation must be general.
It is a structure whose definition (i.e. the fields it comprises) is not exposed in the .h file. It is an encapsulation mechanism.
As its name suggests, the purpose of the HeapReference is to reference an element of the heap. Since the HeapReference is opaque, you are free to choose what information to store inside (in the Heap.c file).
The only requirement is to be able to access an element in constant time. The content of the HeapReference depends on the representation you chose for the heap (array of pointer-based structure).
Not necessarily. It you opt for the pointer-based representation, you could implement the heap that way. However, it should prove easier to create another structure to represent the heap nodes.
If you go for the array representation, it is discourage to represent the heap as an array of HeapReference.
Your heap cannot have a bounded size. If you decided to implement the heap as an array, it must be extendable. A common way of doing this is to start with an empty array of arbitrary size (though not too large), and to double its length each time it is about to be filled.
Given the opacity of HeapReference, you cannot swap elements of a HeapReference array. What you can swap, however, is pointers to HeapReference. We therefore suggest you to use an array of the following structure:
struct heap_queue_item {
    HeapReference* ref; // a single reference to an element of the heap,
                        // obtained by calling referenceArrayAlloc(1)
    size_t heap;        // 0 if the ref points to the max heap, 1 if it points to min heap
};
Such structure allows you to swap the HeapReference (by swapping the HeapReference pointers) when needed.

Again, you can choose to represent your heap as an array or a pointer-based structure. If you choose the array representation, here is how you could implement it.

First, let us consider what information we need to store both in the Heap and HeapReference structures.

In Heap, we obviously need to have an array to represent the heap structure (let us call it heapArray). For the HeapReference, a first idea would be to simply keep an index to the array cell of the element. This however is not a good idea because the elements in heapArray might change positions on heap update (heapAdd, heapReplace), therefore invalidating the references.

We need to go for a different solution. The problem with a single heap array is that element might change position. Therefore, a solution would be to have a second array (let us call it indexes) of which each cell would store the index of a given element in heapArray and we would have to ensure that, in the indexes array, the position of an element does not change. The HeapReference could then simply keep the index of the element in the indexes array (see image below).

heap implementation

Note that to be able to implement the heapAdd and heapReplace operations correctly, you also need to know where is located an element in indexes given its position in the heap.

Finally, in the Heap structure, we need to store the size of the heap (number of elements stored inside of it), the capacity of the heap (total number of elements that can be stored in the current allocated arrays) and whether it is a max- or a min-heap.

FAQ for the third assignment

If get something like Compression error: 87756716, it relates to the error introduced by the compression as calculated by Eq. (1) of the statement. In other words, the program is working fine.x
In this context, by analytical we expect you to give a formula which can be evaluated in linear time with respect to the length of the interval [p_i, p_{i+1}[, by taking advantage of the error formula.
In contrast, a non-analytical solution could be to
  • Search exhaustively the optimal value.
  • Devise an iterative procedure which would reduce the error on the optimal solution at each step.
Indeed, there is a missing header defining the size_t type. You can include stddef.
A new version of the code has been uploaded to the website and the submission platforms uses the correct version.
The statement descrbies a greedy approach. You can start from there and improve it. More to the point, the algorithm makes two arbitrary choices that can be improved upon:
  • Choosing an interval randomly
  • Placing the threshold in the middle
The choices are about where to place the thresholds. If you divide an interval, you are not changing the thresholds. Re-evaluating a choice would be to move a threshold after it had been placed.

Supplementary material

Visual Algo is a site which illustrates many algorithms and data structures of the course. In addition, in the training section, you are able to create online quizz on those topics to test your knowledge.

Misc.

Testing machines

The projects must compile on the ms8xx (xx=01..24) machines !

Firstly, you need to create an account through the registration page.

Then you can connect to the machines thanks to SSH with the following command:

  • ssh login@ms8xx.montefiore.ulg.ac.be
where you need to replace login by your actual login and xx by a machine number (xx=01..25). SSH will open a terminal on the remote machine. For windows user, the PuTTY utility will mimic SSH behaviour (an illustrated step-by-step tutorial can be found here).

Several solutions are available to ship source code to and from the ms8xx machines.

  • FileZilla: a graphic, cross-platform FTP client (an illustrated step-by-step tutorial can be found here)
  • scp: a command line utility to transfert file from/to remote hosts (it works much like the cp command)
  • rsync: a command line utility to synchronize remote files
  • sshfs: a command line utility to "mount" a remote directory
Those utilities might need some configuration/getting-used-to. A few hints to help you out:
  • Read the man page (so you can say you have)
  • Try the help flags -h, --help (you might even get useful information)
  • Google your questions or get a succinct tutorial (others have stumbled on the same difficulties, let them help you)
  • Script the data transferts, compilation steps, testing suite (human memory is the most expensive)

Oh, and be sure to chmod your home folder to prevent others from messing with your files.

Misc.

Last modified on July 04 2018 11:04