Non-blocking algorithms
You have to send the exercise 4 by email before October 14 at 11:59 pm to gael.thomas@inria.fr.
The lock free queue (not graded)
optional and difficult
In order to avoid the ABA problem, we have to ensure that a node A is not reallocated when a thread uses it. For that purpose, we will use hazard pointers (Maged Michael, "Hazard Pointers: Safe Memory Reclamation for Lock-Free Objects", TPDS 2004).
At high-level, when a thread reads a pointer X from a location L and wants to perform a CAS to change the value of L from X to another value, it has to ensure that X cannot be reallocated between the read and the CAS. For that purpose, the thread registers X in a thread-local variable called hazard while it uses X, and the algorithm has to ensure that X cannot be freed if one of the hazard variables points to X.
With the queue, when a thread reads the pointer X in head in dequeue, it writes thus X in hazard. Then, it re-reads head and if head is not equal to X, it restarts from the beginning. Finally, if the compare-and-swap succeed, instead of freeing the node, the thread writes 0 in its hazard pointer and adds X in a lock-free stack called free_nodes.
Similarly, in enqueue, the thread has also to ensure that the tail is not freed during one of the operations that modify tail. For this reason, similarly, when the thread reads tail, it saves its value in its hazard pointer, re-read tail in order to ensure that it was not yet potentially freed, and writes 0 in its hazard location after the CAS.
Finally, in enqueue, in order to allocate a node, a thread has now first to try to allocate the node from free_nodes before allocating a new node with malloc. In details:
- the thread pops (with a lock-free algorithm) a pointer Z from the free_nodes stack,
- If Z is null (i.e., the stack is empty), the thread allocates a node with malloc,
- otherwise
- if the thread finds the Z pointer in one of the hazard variables of the other threads, it means that Z is still used by another thread. In this case, it has to re-push Z and to use malloc to allocate a new node,
- otherwise, the node is free and not used anymore, the thread can thus directly reuse it to enqueue a new value without calling malloc.
As an additional question, why a node can not appear twice in the free_nodes stack?
The radix tree (not graded, easy)
The linked list (not graded, optional and difficult)
A simple memory allocator (graded)
In this exercice, we will implement a memory allocator. Internally, the allocator manages chunks. A chunk represents a memory region, which can be allocated or not. A chunk consists of this structure:
The interface consists then in two functions:
- struct chunk* alloc_chunk(size_t size): allocate a chunk of size size,
- void free_chunk(struct chunk* c): free the chunk c.
When a chunk c is free, it is stored in a global linked list named struct chunk* free_list. In this case, c->next points to the next free chunk in the list free_list. When a chunk c is allocated, it is not in free_list. In this case, the next field is not used, and the user can store the content of an object in c->content.
In order to allocate a chunk, alloc_chunk finds the first chunk c large enough in the list free_chunk, and remove c from the list. If c->chunk + sizeof(struct chunk) is greater than size, returning the whole chunk c would lead to a memory waste. In order to avoid the waste, in this case, alloc_chunk has to break the chunk in two by creating a new chunk inside the content c.
- Find the first chunk c in free_list such that size < c->size,
- Remove c from the list
- If size + sizeof(struct chunk) < c->size, then:
- Creates a new chunk struct chunk* d = (struct chunk*)((uintptr_t)c + size),
- Set d->size to c->size - size, and c->size to size,
- Free the chunk d,
- Return c.
- Otherwise (i.e., if size + sizeof(struct chunk) >= c->size), simply returns c.
In this first part of the exercice, you will implement a lock-based version of the allocator. For that, you need a global lock named pthread_mutex_t mutex.
free_chunk
create_large_chunk
alloc_chunk
In this second part of the exercice, you will implement a lock-free version of the algorithm step by step. For that, we will start by simply considering free_list as a lock-free stack.