Operating systems

Synchronization

Before you start, make sure you have completed the previous lab. You can also use the code from the tp3_base branch of the git repository:

  • git clone -b tp3_base https://gitlab.inf.telecom-sudparis.eu/csc4508/csc4508_facebook.git csc4508_tp3
  • or git checkout tp3_base

Synchronization

The goal of this exercise is to create a library providing synchronization primitives and to use it in the facebook server developed during the previous labs. To do so, we use a mechanism that makes it possible to "overload" functions: LD_PRELOAD.

Shared libraries

When generating an executable, the linker stores the program in the ELF format: one section contains the code, another contains the global variables, etc. The linker builds a symbol table that indicates, for each symbol (function or variable), its section and its offset within that section. If the program uses a symbol without defining it, the symbol is marked "U" ("undefined") in the symbol table. A call to a function foo located in a shared library thus translates into call <foo@plt>, where <foo@plt> is an address in the plt (Procedure Linkage Table) section.

You can observe the contents of the table of symbols with the nm command:

$ nm server
000000000000298f T do_add_friend
0000000000002d36 T do_add_user
0000000000002480 t __do_global_dtors_aux
0000000000007d38 t __do_global_dtors_aux_fini_array_entry
00000000000028da T do_hello
00000000000028a0 T do_help
0000000000002bc1 T do_list_users
000000000000349c T do_post_message
0000000000002eae T do_view_user
                 U pthread_mutex_init@@GLIBC_2.2.5
                 U pthread_mutex_lock@@GLIBC_2.2.5
                 U pthread_mutex_unlock@@GLIBC_2.2.5
00000000000024c5 T system_error
0000000000003f5c T thread_function
0000000000008280 b tids
0000000000008230 D __TMC_END__
0000000000007dc8 d usage_template
00000000000082e0 B verbose
                 U wait@@GLIBC_2.2.5
                 U write@@GLIBC_2.2.5
[...]

When a program starts, the system loads the sections of the ELF file, then the required shared libraries. The addresses of loaded functions are inserted into the plt table. So, when the program calls the foo function, it jumps to the given address and therefore executes the code from the function located in the library.

LD_PRELOAD

The environment variable LD_PRELOAD specifies a list of libraries to be loaded before the libraries required by the program.

Using LD_PRELOAD, it is possible to modify the behavior of the foo function. We define a library (libfoo.so) which implements a foo function having the same signature as the foo function used by the program:

#include <stdio.h>

int foo(int a) {
    printf("Intercepting foo !\n");
    return 0; /* the original snippet did not return a value */
}

When the application is run with LD_PRELOAD (LD_PRELOAD=./libfoo.so ./application), the "new" foo function is inserted in the plt table first. When the application calls foo, it is therefore this function that is called:

$ LD_PRELOAD=./libfoo.so ./application
Intercepting foo !

LD_PRELOAD therefore makes it possible to modify the behavior of a library.
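
The same idea can be tried on a function of the C library. The sketch below is only an illustration: the choice of rand, the constant 42 and the file name librand.c are assumptions made for the example, not part of the lab.

```c
/* librand.c (hypothetical file name): replaces the libc rand().
 * Possible build and run commands, assuming gcc on Linux:
 *   gcc -shared -fPIC -o librand.so librand.c
 *   LD_PRELOAD=./librand.so ./application
 */
#include <stdlib.h>

/* Same signature as the libc rand(): once the library is preloaded,
 * callers get this definition instead of the libc one. */
int rand(void) {
    return 42; /* arbitrary constant chosen for the demonstration */
}
```

With this library preloaded, an application that prints rand() should always print 42, whatever seed it passes to srand.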

Reminder: with bash, the syntax VARIABLE=value command executes command with the environment variable VARIABLE set to value.
On Mac systems, instead of using LD_PRELOAD, you may have to use the DYLD_INSERT_LIBRARIES variable which works the same way. Moreover, you may have to add the compilation flag -force_flat_namespace when compiling the application.

Creating a wrapper function

When intercepting a call to foo, it may be useful to create a "wrapper" function that performs some processing, calls the original foo function, and then performs further processing. Wrappers are typically used to log the calls to a function:

#include <stdio.h>

/* Pointer to the original foo function */
int (*foo_original)(int a);

int foo(int a) {
    printf("Entering foo\n");
    int ret = foo_original(a);
    printf("Exiting foo\n");
    return ret;
}

For this, it is necessary to know the address of the original foo function. This address can be determined using the dlsym function:

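/* _GNU_SOURCE must be defined before the first #include of the file */
#include <dlfcn.h>  /* dlsym, dlerror, RTLD_NEXT */
#include <stdio.h>  /* fprintf */
#include <stdlib.h> /* abort */

static void __init(void) __attribute__((constructor));

static void __init(void) {
    /* Look up the next definition of foo, i.e. the original one */
    foo_original = dlsym(RTLD_NEXT, "foo");
    if (!foo_original) {
        fprintf(stderr, "Warning: cannot find 'foo': %s\n", dlerror());
        abort();
    }
}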

The dlsym function returns the address of a symbol. The constant RTLD_NEXT asks for the next definition of the symbol foo (i.e. the original foo function) instead of the first entry in the plt table.

With the GNU C library on Linux, the constant RTLD_NEXT is only defined when _GNU_SOURCE is defined. To use it, add #define _GNU_SOURCE at the very beginning of the file (before the first #include) or define it in the Makefile (add -D_GNU_SOURCE to the CFLAGS).
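
The pieces above can be assembled into a single file. The following sketch wraps the libc function getpid rather than foo, so that it can be tried without a separate library; the call counter and the helper getpid_call_count are hypothetical names introduced for the illustration. As a variation, the original address is resolved lazily on the first call instead of in a constructor. The lookup is not protected against concurrent first calls.

```c
#define _GNU_SOURCE /* must precede the first #include (needed for RTLD_NEXT) */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Address of the original getpid, resolved on the first call */
static pid_t (*getpid_original)(void) = NULL;

/* Hypothetical counter: number of intercepted calls */
static int getpid_calls = 0;

pid_t getpid(void) {
    if (!getpid_original) {
        /* RTLD_NEXT: skip this definition and find the next one (the libc's) */
        getpid_original = (pid_t (*)(void))dlsym(RTLD_NEXT, "getpid");
        if (!getpid_original) {
            fprintf(stderr, "cannot find 'getpid': %s\n", dlerror());
            abort();
        }
    }
    getpid_calls++;
    return getpid_original();
}

/* Hypothetical helper, only there to observe the wrapper */
int getpid_call_count(void) {
    return getpid_calls;
}
```

On older versions of the GNU C library, dlsym lives in a separate library and the link step may need -ldl.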

The __attribute__((constructor)) attribute indicates that the function must be called when the library is loaded. This makes it possible to initialize the library (here, to retrieve the address of the original foo function).

To invoke a function when the library is unloaded, you can use the __attribute__((destructor)) attribute in a similar way.
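
A minimal sketch of both attributes is given below. The names lib_init, lib_fini, lib_ready and lib_is_ready are hypothetical; the flag exists only to make the effect of the constructor observable.

```c
#include <stdio.h>

/* Hypothetical flag set by the constructor */
static int lib_ready = 0;

static void lib_init(void) __attribute__((constructor));
static void lib_fini(void) __attribute__((destructor));

/* Runs when the library is loaded, before main() starts */
static void lib_init(void) {
    lib_ready = 1;
    fprintf(stderr, "library loaded\n");
}

/* Runs when the library is unloaded, typically at program exit */
static void lib_fini(void) {
    fprintf(stderr, "library unloaded\n");
}

/* Hypothetical helper: reports whether the constructor has run */
int lib_is_ready(void) {
    return lib_ready;
}
```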

Back to work !

Create the liblock.so library. For now, this library contains only two functions, static void liblock_init() and static void liblock_finalize(), which are called when the program is loaded and when it terminates. These functions just display a message.

If you can't remember how to create a library, you can consult the corresponding CSC4103 course.

Modify the library so that it intercepts calls to pthread_mutex_lock. At each interception, print a message when entering and when leaving the function, and call the original pthread_mutex_lock function in between. For example:

entering pthread_mutex_lock(mutex=0x559fbbd8e308)
leaving pthread_mutex_lock(mutex=0x559fbbd8e308) -> 0
entering pthread_mutex_lock(mutex=0x559fbbd8e308)
leaving pthread_mutex_lock(mutex=0x559fbbd8e308) -> 0
[...]
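
One possible shape for such an interception is sketched below. It is not the reference solution: the name lock_original is hypothetical, and the original function is resolved lazily on the first call (rather than in a constructor) as a precaution in case pthread_mutex_lock is called before the library's constructor has run. Two threads racing on the first call would both store the same address.

```c
#define _GNU_SOURCE /* must precede the first #include (needed for RTLD_NEXT) */
#include <dlfcn.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical name: address of the original pthread_mutex_lock */
static int (*lock_original)(pthread_mutex_t *) = NULL;

int pthread_mutex_lock(pthread_mutex_t *mutex) {
    if (!lock_original) {
        lock_original =
            (int (*)(pthread_mutex_t *))dlsym(RTLD_NEXT, "pthread_mutex_lock");
        if (!lock_original) {
            fprintf(stderr, "cannot find 'pthread_mutex_lock': %s\n", dlerror());
            abort();
        }
    }

    fprintf(stderr, "entering pthread_mutex_lock(mutex=%p)\n", (void *)mutex);
    int ret = lock_original(mutex);
    fprintf(stderr, "leaving pthread_mutex_lock(mutex=%p) -> %d\n",
            (void *)mutex, ret);
    return ret;
}
```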

Complete the liblock.so library to intercept calls to other pthread_mutex_* and pthread_cond_* functions. For example:

entering pthread_mutex_init(mutex=0x559fbbd8e308, attr=(nil))
leaving pthread_mutex_init(mutex=0x559fbbd8e308, attr=(nil)) -> 0
entering pthread_cond_init (cond=0x559fbbd8e330, attr=(nil))
leaving pthread_cond_init (cond=0x559fbbd8e330, attr=(nil)) -> 0
entering pthread_mutex_lock(mutex=0x559fbbd8e308)
leaving pthread_mutex_lock(mutex=0x559fbbd8e308) -> 0
entering pthread_cond_wait (cond=0x559fbbd8e330, mutex=0x559fbbd8e308)
entering pthread_mutex_lock(mutex=0x559fbbd8e308)
leaving pthread_mutex_lock(mutex=0x559fbbd8e308) -> 0

Verify that the library is working by intercepting function calls from the facebook server.

Here is the list of functions we want to intercept:

int pthread_mutex_lock(pthread_mutex_t *mutex);
int pthread_mutex_trylock(pthread_mutex_t *mutex);
int pthread_mutex_unlock(pthread_mutex_t *mutex);
int pthread_mutex_init(pthread_mutex_t *mutex, const pthread_mutexattr_t *attr);
int pthread_mutex_destroy(pthread_mutex_t *mutex);
int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);
int pthread_cond_timedwait(pthread_cond_t *cond, pthread_mutex_t *mutex, const struct timespec *abstime);
int pthread_cond_signal(pthread_cond_t *cond);
int pthread_cond_broadcast(pthread_cond_t *cond);
int pthread_cond_init(pthread_cond_t *cond, const pthread_condattr_t *attr);
int pthread_cond_destroy(pthread_cond_t *cond);

We now want to use our own implementation of mutexes and conditions, based on futexes. This requires a counter for each mutex or condition. Since the application already allocates pthread_mutex_t and pthread_cond_t objects, which we won't use (we no longer call the functions of libpthread), let's use these memory areas to store our data! Just define a my_mutex_t structure, then cast a pointer to a pthread_mutex_t into a pointer to a my_mutex_t to access it.
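
As an illustration of this idea, here is a minimal sketch of a futex-based mutex stored inside the caller's pthread_mutex_t. It is one classical three-state design, not necessarily the one used in the reference solution. The functions are deliberately named my_mutex_* (hypothetical names) so that the sketch can be compiled on its own; in the library they would carry the pthread_mutex_* names. The layout of my_mutex_t and the futex helper are assumptions, and the code is Linux-specific.

```c
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical layout stored in the memory of a pthread_mutex_t.
 * state: 0 = free, 1 = locked, 2 = locked with possible waiters */
typedef struct {
    atomic_int state;
} my_mutex_t;

_Static_assert(sizeof(my_mutex_t) <= sizeof(pthread_mutex_t),
               "my_mutex_t must fit in a pthread_mutex_t");

/* Thin helper around the futex system call (no libc wrapper exists) */
static long futex(atomic_int *addr, int op, int val) {
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

int my_mutex_init(pthread_mutex_t *mutex) {
    my_mutex_t *m = (my_mutex_t *)mutex;
    atomic_init(&m->state, 0);
    return 0;
}

int my_mutex_lock(pthread_mutex_t *mutex) {
    my_mutex_t *m = (my_mutex_t *)mutex;
    int expected = 0;
    /* Fast path: the mutex is free, take it without any system call */
    if (atomic_compare_exchange_strong(&m->state, &expected, 1))
        return 0;
    /* Slow path: mark the mutex as contended; if the previous value
     * was 0 we now own it, otherwise sleep while the state is 2 */
    while (atomic_exchange(&m->state, 2) != 0)
        futex(&m->state, FUTEX_WAIT, 2);
    return 0;
}

int my_mutex_unlock(pthread_mutex_t *mutex) {
    my_mutex_t *m = (my_mutex_t *)mutex;
    /* Wake one waiter only if somebody may be sleeping */
    if (atomic_exchange(&m->state, 0) == 2)
        futex(&m->state, FUTEX_WAKE, 1);
    return 0;
}
```

In the uncontended case, lock and unlock are a single atomic operation each; the kernel is involved only when a thread actually has to wait.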

Create the library libmylock.so, which also intercepts the pthread_mutex_* and pthread_cond_* functions but implements them using futexes.
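
For conditions, one possible approach is a sequence counter stored in the pthread_cond_t, sketched below under the same caveats: my_cond_t and the my_cond_* names are hypothetical, the design is an assumption rather than the reference solution, and counter wrap-around is ignored. The sketch calls pthread_mutex_lock and pthread_mutex_unlock for the mutex; inside libmylock.so these calls would reach the library's own futex-based versions. Waiters may wake up spuriously, so callers have to re-check their predicate in a loop, as with any condition variable.

```c
#include <limits.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical layout stored in the memory of a pthread_cond_t.
 * seq is incremented by every signal and broadcast. */
typedef struct {
    atomic_int seq;
} my_cond_t;

_Static_assert(sizeof(my_cond_t) <= sizeof(pthread_cond_t),
               "my_cond_t must fit in a pthread_cond_t");

/* Thin helper around the futex system call */
static long futex(atomic_int *addr, int op, int val) {
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

int my_cond_init(pthread_cond_t *cond) {
    my_cond_t *c = (my_cond_t *)cond;
    atomic_init(&c->seq, 0);
    return 0;
}

int my_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex) {
    my_cond_t *c = (my_cond_t *)cond;
    int seq = atomic_load(&c->seq);  /* read while the mutex is still held */
    pthread_mutex_unlock(mutex);
    /* Sleeps only if seq is unchanged; returns at once if a signal
     * arrived between the unlock and this call */
    futex(&c->seq, FUTEX_WAIT, seq);
    pthread_mutex_lock(mutex);
    return 0;
}

int my_cond_signal(pthread_cond_t *cond) {
    my_cond_t *c = (my_cond_t *)cond;
    atomic_fetch_add(&c->seq, 1);
    futex(&c->seq, FUTEX_WAKE, 1);        /* wake at most one waiter */
    return 0;
}

int my_cond_broadcast(pthread_cond_t *cond) {
    my_cond_t *c = (my_cond_t *)cond;
    atomic_fetch_add(&c->seq, 1);
    futex(&c->seq, FUTEX_WAKE, INT_MAX);  /* wake all waiters */
    return 0;
}
```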

Make sure that the library is working by intercepting function calls from the facebook server.

The solution to this exercise is available in the tp3_corrige branch of the git repository.