Threads

In a multi-threaded process, each thread has a context (registers + stack). The rest of the memory (code, data, etc.) and resources (open files, etc.) are shared between threads.

The stacks of the different threads are located in memory so that they can grow. However, if a thread’s stack grows too much, it might overflow onto the stack of another thread. To avoid this problem, the size of the stack is limited (the command ulimit -s gives the maximum stack size). This size limit can be changed using command line (by example ulimit -s 32768), or from a program (in using the setrlimit function).

We present here the Pthread API (POSIX thread) which is the most used in C. The C11 standard defines another interface for manipulating threads. However, there are only few implementations of this interface. The de facto standard therefore remains Pthread.

Unlike the creation of processes which generates a hierarchy (ie. each process has a parent process), there is no hierarchy between threads.

Technically, all the memory space is shared between the threads. It is therefore possible to share all the variables, including local variables.

Example: `strtok`

Another example of a non-reentrant function is the char *strtok(char * str, char * delim) function. This function extracts substrings from a string.

For example, the following code displays the different directories of the PATH variable:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

void extract_path() {
  char* string = getenv("PATH");
  printf("Parsing '%s'\n", string);

  
  for(char* token = strtok(string, ":") ;
      token ;
      token = strtok(NULL, ":") ){
    printf("\t %s\n", token);
  }
}

int main(int argc, char**argv) {
  extract_path();
  return 0;
}

Here is an example of result obtained with this program:

Parsing '/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games'
         /usr/local/bin
         /usr/bin
         /bin
         /usr/local/games
         /usr/games

The strtok function is not reentrant because it is based on a previous state (a pointer to the last character tested in the string). Thus, in this example, the processing applied to each token cannot use strtok. For example:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

void extract_path() {
  char* string = getenv("PATH");
  printf("Parsing '%s'\n", string);
  // string should contain a list of directories separated with :
  // eg. /usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
  
  // Extract the directories
  // eg. /usr/local/bin, /usr/bin, /bin, /usr/local/games, /usr/games
  for(char* token = strtok(string, ":") ;
      token ;
      token = strtok(NULL, ":") ){
    // token contains a directory (eg. /usr/local/bin)
    printf("\t %s contains: ", token);
    
    // Extract the subdirectories
    // eg. usr, local, bin
    for(char* word = strtok(token, "/ ") ;
	word ;
	word = strtok(NULL, "/") ){
      printf("%s ", word);
    }
    printf("\n");
  }
}

int main(int argc, char**argv) {
  extract_path();
  return 0;
}

Will result in:

Parsing '/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games'
         /usr/local/bin contains: usr local bin

Here the first token (/usr/local/bin) is split into words (usr, local, bin) by successive calls to strtok which modify the previous state of strtok, which prevents subsequent calls to token = strtok (NULL, ":") to iterate over the string string.

Making a function reentrant

It is possible to make a non-reentrant function reentrant by adding a parameter corresponding to the state of the function. For example, the reentrant version of char* strtok(char *str, const char *delim); is char* strtok_r(char *str, const char *delim, char **saveptr );

Thus, the previous program can be corrected:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

void extract_path() {
  char* string = getenv("PATH");
  char* saveptr = NULL;
  printf("Parsing '%s'\n", string);

  
  for(char* token = strtok_r(string, ":", &saveptr) ;
      token ;
      token = strtok_r(NULL, ":", &saveptr) ){
    printf("\t %s contains: ", token);

    char* saveptr_word = NULL;
    for(char* word = strtok_r(token, "/ ", &saveptr_word) ;
	word ;
	word = strtok_r(NULL, "/", &saveptr_word) ){
      printf("%s ", word);
    }
    printf("\n");
  }
}

int main(int argc, char**argv) {
  extract_path();
  return 0;
}

Which will result in:

Parsing '/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games'
         /usr/local/bin contains: usr local bin 
         /usr/bin contains: usr bin 
         /bin contains: bin 
         /usr/local/games contains: usr local games 
         /usr/games contains: usr games

TLS variables in C99

Before the C11 standard, using thread-local storage was supported by some compilers using compiler-dependant keywords:
- in C99 with gcc: __thread int variable = 0;
- in C99 with Visual studio: __declspec(thread) int variable = 0;

`pthread_key`

Another way (more portable, but much more painful to write) to declare a TLS variable is to use a pthread_key:
- creation:
  - int pthread_key_create(pthread_key_t *key, void (*destructor)(void*));
- terminaison:
  - int pthread_key_delete(pthread_key_t *key););
- usage:
  - void *pthread_getspecific(pthread_key_t key);
  - int pthread_setspecific(pthread_key_t key, const void *value);
- initialisation:
  - int pthread_once(pthread_once_t *once_control, void (*init_routine) (void));

The following program illustrates the problem of simultaneous access to shared variables. Here, two threads each increment 1 000 000 000 times the same variable:

/*
 * compteurBOOM.c
 *
 * Synchronization problem
 *
 *
 */

#include <error.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <pthread.h>

/* INT_MAX / 2 */
#define NBITER 1000000000

int counter = 0;

void *start_routine(void *arg) {
  int i;

  for (i = 0; i < NBITER; i++) {
      /* OOPS: WRONG ! Access to an unprotected shared variable */
      counter ++;
    }
  pthread_exit(NULL);
}

int main (int argc, char *argv[]) {
  int rc;
  pthread_t thread1, thread2;
  
  rc = pthread_create(&thread1, NULL, start_routine, NULL);
  if (rc)
    error(EXIT_FAILURE, rc, "pthread_create");

  rc = pthread_create(&thread2, NULL, start_routine, NULL);
  if (rc)
    error(EXIT_FAILURE, rc, "pthread_create");

  rc = pthread_join(thread1, NULL);
  if (rc)
    error(EXIT_FAILURE, rc, "pthread_join");
  rc = pthread_join(thread2, NULL);
  if (rc)
    error(EXIT_FAILURE, rc, "pthread_join");

  if (counter != 2 * NBITER)
    printf("BOOM! counter = %d\n", counter);
  else
    printf("OK counter = %d\n", counter);

  exit(EXIT_SUCCESS);
}

While the counter should be 2 * 1 000 000 000 = 2 000 000 000, running this program gives another result, for example:

$ ./compteurBOOM 
BOOM! compteur = 1076588402

Using a mutex, we can correct the BOOM counter program by ensuring that the counter increments are done in mutual exclusion:

/*
 * compteurBOOM.c
 *
 * Synchronization problem
 *
 *
 */

#include <error.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <pthread.h>

/* INT_MAX / 2 */
#define NBITER 1000000000

int counter = 0;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void *start_routine(void *arg) {
  int i;

  for (i = 0; i < NBITER; i++) {
    pthread_mutex_lock(&mutex);
    counter ++;
    pthread_mutex_unlock(&mutex);
  }
  pthread_exit(NULL);
}

int main (int argc, char *argv[]) {
  int rc;
  pthread_t thread1, thread2;
  
  rc = pthread_create(&thread1, NULL, start_routine, NULL);
  if (rc)
    error(EXIT_FAILURE, rc, "pthread_create");

  rc = pthread_create(&thread2, NULL, start_routine, NULL);
  if (rc)
    error(EXIT_FAILURE, rc, "pthread_create");

  rc = pthread_join(thread1, NULL);
  if (rc)
    error(EXIT_FAILURE, rc, "pthread_join");
  rc = pthread_join(thread2, NULL);
  if (rc)
    error(EXIT_FAILURE, rc, "pthread_join");

  if (counter != 2 * NBITER)
    printf("BOOM! counter = %d\n", counter);
  else
    printf("OK counter = %d\n", counter);

  exit(EXIT_SUCCESS);
}

While the result is correct, the use of a mutex significantly slows down the program (144s with mutex, against 4.1s without mutex).

We can fix the counterBOOM program by using atomic operations. To do this, all we have to do is declare the counter like _Atomic int. The counter increment then uses the atomic operation atomic_fetch_add.

/*
 * compteurBOOM.c
 *
 * Synchronization problem
 *
 *
 */

#include <error.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <pthread.h>

/* INT_MAX / 2 */
#define NBITER 1000000000

_Atomic int counter = 0;

void *start_routine(void *arg) {
  int i;

  for (i = 0; i < NBITER; i++) {
    counter ++;
  }
  pthread_exit(NULL);
}

int main (int argc, char *argv[]) {
  int rc;
  pthread_t thread1, thread2;
  
  rc = pthread_create(&thread1, NULL, start_routine, NULL);
  if (rc)
    error(EXIT_FAILURE, rc, "pthread_create");

  rc = pthread_create(&thread2, NULL, start_routine, NULL);
  if (rc)
    error(EXIT_FAILURE, rc, "pthread_create");

  rc = pthread_join(thread1, NULL);
  if (rc)
    error(EXIT_FAILURE, rc, "pthread_join");
  rc = pthread_join(thread2, NULL);
  if (rc)
    error(EXIT_FAILURE, rc, "pthread_join");

  if (counter != 2 * NBITER)
    printf("BOOM! counter = %d\n", counter);
  else
    printf("OK counter = %d\n", counter);

  exit(EXIT_SUCCESS);
}

Here, the result is correct and the program is much faster than when using a mutex:

without synchronization: 4.1s
with a mutex: 144s
with an atomic operation: 35s

Threads

Execution context of a process

Duplicating a process

Execution flows

Multithreaded process

Creating a Pthread

Other Pthread functions

Thread-safe source code

Reentrant source code

Example: `strtok`

Making a function reentrant

TLS – Thread-Local Storage

TLS variables in C99

`pthread_key`

Synchronization

Mutex

Atomic operations

Execution context of a process

Duplicating a process

Execution flows

Multithreaded process

Creating a Pthread

Other Pthread functions

Sharing data

Thread-safe source code

Reentrant source code

Example: strtok

Making a function reentrant

TLS – Thread-Local Storage

TLS variables in C99

pthread_key

Synchronization

Mutex

Atomic operations

Example: `strtok`

`pthread_key`