Non-blocking algorithms
The lock free queue
optional and difficult
In order to avoid the ABA problem, we have to ensure that a node A is not reallocated when a thread uses it. For that purpose, we will use hazard pointers (Maged Michael, "Hazard Pointers: Safe Memory Reclamation for Lock-Free Objects", TPDS 2004).
At high-level, when a thread reads a pointer X from a location L and wants to perform a CAS to change the value of L from X to another value, it has to ensure that X cannot be reallocated between the read and the CAS. For that purpose, the thread registers X in a thread-local variable called hazard while it uses X, and the algorithm has to ensure that X cannot be freed if one of the hazard variables points to X.
With the queue, when a thread reads the pointer X in head in dequeue, it writes thus X in hazard. Then, it re-reads head and if head is not equal to X, it restarts from the beginning. Finally, if the compare-and-swap succeed, instead of freeing the node, the thread writes 0 in its hazard pointer and adds X in a lock-free stack called free_nodes.
Similarly, in enqueue, the thread has also to ensure that the tail is not freed during one of the operations that modify tail. For this reason, similarly, when the thread reads tail, it saves its value in its hazard pointer, re-read tail in order to ensure that it was not yet potentially freed, and writes 0 in its hazard location after the CAS.
Finally, in enqueue, in order to allocate a node, a thread has now first to try to allocate the node from free_nodes before allocating a new node with malloc. In details:
- the thread pops (with a lock-free algorithm) a pointer Z from the free_nodes stack,
- If Z is null (i.e., the stack is empty), the thread allocates a node with malloc,
- otherwise
- if the thread finds the Z pointer in one of the hazard variables of the other threads, it means that Z is still used by another thread. In this case, it has to re-push Z and to use malloc to allocate a new node,
- otherwise, the node is free and not used anymore, the thread can thus directly reuse it to enqueue a new value without calling malloc.
As an additional question, why a node can not appear twice in the free_nodes stack?