CSC5101 – Advanced Programming of Multicore Architectures

Shared memory

This lab aims to help you understand how memory is managed in an operating system and how shared memory segments are implemented.

A small interface for `printf`

To warm up, we invite you to deepen your knowledge of macros. The code you will write in this exercise will be useful later to hide some differences between the POSIX (Linux) and xv6 interfaces. Completing this exercise does not require xv6.

In a shm-posix.c file, define the following macro:

#define foo(a0, ...) a0 + __VA_ARGS__

In the main function of shm-posix.c, call foo(1, 2, 3) and observe the code generated by gcc -E.
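
As a reference point for reading the gcc -E output, here is a minimal sketch that uses the macro inside an array initializer, where the leftover comma is easy to spot: foo(1, 2, 3) is replaced textually by 1 + 2, 3, so the array ends up with two elements. The helper names foo_count and foo_first are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

#define foo(a0, ...) a0 + __VA_ARGS__

/* Hypothetical helper: number of initializers produced by foo(1, 2, 3).
 * After preprocessing, the initializer reads { 1 + 2, 3 }. */
static size_t foo_count(void) {
  int v[] = { foo(1, 2, 3) };
  return sizeof v / sizeof v[0];
}

/* Hypothetical helper: value of the first initializer, i.e. 1 + 2. */
static int foo_first(void) {
  int v[] = { foo(1, 2, 3) };
  return v[0];
}
```

The first argument is bound to a0 and everything after it, commas included, is pasted in place of __VA_ARGS__.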

Based on the previous example, define a macro named dprintf (for debug printf) in shm-posix.c that prints a message to the standard output. This macro has the same interface as printf, i.e. the number of parameters is variable (see man 3 printf). Modify the main function of your program to call dprintf("the %s command started with %d parameters\n", argv[0], argc).

In a similar way to dprintf, add a macro named error that has the same interface as printf and prints its message on the standard error stream. Your macro should also terminate the process by calling exit.

Test the following code by compiling with the option -Wno-error=multistatement-macros:

if(0 == 1)
  error("I am a camel\n");
dprintf("I am a process\n");

You may encounter a compilation error indicating a problem with the parameters passed to printf. By examining the code generated by gcc -E, you can notice that although no variadic parameter is passed, a comma remains. You can fix this by using ##__VA_ARGS__ instead of __VA_ARGS__.
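
The comma issue can be sketched as follows. This is one possible shape for the macro, not the only one: ##__VA_ARGS__ is a GNU extension (accepted by gcc and clang), the #undef is a precaution because <stdio.h> already knows a POSIX function called dprintf, and the helpers say_plain and say_args are hypothetical.

```c
#include <stdio.h>

/* <stdio.h> may already provide dprintf (the POSIX function that writes to
 * a file descriptor); drop any macro form before defining our own. */
#undef dprintf

/* With ##, the comma before __VA_ARGS__ is removed when no variadic
 * argument is passed (GNU extension). */
#define dprintf(fmt, ...) printf(fmt, ##__VA_ARGS__)

/* Hypothetical helpers: both return printf's result, i.e. the number of
 * characters written. */
static int say_plain(void) {
  return dprintf("I am a process\n");   /* no variadic argument */
}

static int say_args(int argc, const char *name) {
  return dprintf("the %s command started with %d parameters\n", name, argc);
}
```

With plain __VA_ARGS__, the call in say_plain would expand to printf("I am a process\n", ) and fail to compile.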

Why can't you see any display?

Remember to compile with gcc -E to see the code actually compiled by gcc.

To fix the problem identified in the previous question, you must group several statements into a single block. You could wrap the dprintf and the exit in curly braces, but then the user could omit the semicolon at the end of the statement, which quickly makes the program unreadable. There are two ways to force the macro user to add a semicolon. You can use a do ... while like here:

do {
  statement1 ;
  statement2 ;
} while(0)

Or you can enclose the block in braces and parentheses (a GNU C statement expression) like here:

({
  statement1 ;
  statement2 ;
})

Fix the problem identified in the previous question.
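
A minimal sketch of the do ... while variant, assuming the error macro prints with fprintf and exits with EXIT_FAILURE (both are choices made for this example). The helper check is hypothetical; it shows that the whole macro body is now governed by the if.

```c
#include <stdio.h>
#include <stdlib.h>

/* The do { ... } while (0) wrapper turns both statements into a single
 * statement that still needs a trailing semicolon at the call site. */
#define error(fmt, ...)                      \
  do {                                       \
    fprintf(stderr, fmt, ##__VA_ARGS__);     \
    exit(EXIT_FAILURE);                      \
  } while (0)

/* Hypothetical helper: returns 1 when the guarded error() is skipped. */
static int check(int failed) {
  if (failed)
    error("I am a camel\n");
  return 1;
}
```

Because the macro expands to one statement, an unbraced if guards both the message and the exit.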

POSIX shared memory

In this exercise, you will write a program that uses shared memory to synchronize two processes. At first, this program runs in a POSIX environment (i.e. under Linux).

In the rest of the lab, use dprintf rather than printf to display information, and error to exit in case of errors.

In shm-posix.c, create a child process that prints a message before exiting. The parent must wait for the end of the child before printing a message.

Modify your program so that:

Before the creation of the child process:

The parent creates a shared memory segment of size 8192,

Then, after the creation of the child process:

The parent maps this segment into its address space at address 0x10000000 before destroying the mapping,
The child maps this same segment into its address space, but at the address 0x20000000 before destroying the mapping,
The parent destroys the shared memory segment after the child terminates.

Remember to handle errors.

You will find the following functions useful:

shm_open: opens or creates a shared memory segment. A typical use is
int fd = shm_open("/my-key", O_RDWR | O_CREAT, 0777);
ftruncate: sets the size of a previously opened shared memory segment. A typical use is
ftruncate(fd, 65536);
mmap: maps the shared memory segment into the address space of the process (more generally, mmap can map any file). A typical use is
void* addr = (void*)0x10000000; addr = mmap(addr, 65536, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
munmap: removes a mapping created by mmap. A typical use is
munmap(addr, 65536);
shm_unlink: unlinks (destroys) a shared memory segment.
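
The calls listed above can be chained as in the following sketch, which runs in a single process and leaves out fork. The helper name shm_roundtrip, the mode 0600 and the error handling strategy are assumptions made for this example, not requirements of the lab.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create, size, map, use, unmap and unlink a segment.
 * Returns 0 on success and -1 as soon as one call fails. */
static int shm_roundtrip(void) {
  const size_t size = 8192;

  int fd = shm_open("/my-key", O_RDWR | O_CREAT, 0600);
  if (fd == -1)
    return -1;
  if (ftruncate(fd, (off_t)size) == -1) {
    close(fd);
    shm_unlink("/my-key");
    return -1;
  }

  /* Without MAP_FIXED the address is only a hint: use the returned value. */
  void *addr = mmap((void *)0x10000000, size, PROT_READ | PROT_WRITE,
                    MAP_SHARED, fd, 0);
  close(fd);                        /* the mapping survives the close */
  if (addr == MAP_FAILED) {
    shm_unlink("/my-key");
    return -1;
  }

  volatile int *first = addr;       /* first integer of the segment */
  *first = 42;
  int ok = (*first == 42);

  if (munmap(addr, size) == -1)
    ok = 0;
  if (shm_unlink("/my-key") == -1)
    ok = 0;
  return ok ? 0 : -1;
}
```

On older versions of glibc, programs using shm_open must be linked with -lrt.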

Initially, a shared memory segment contains only 0s. Use this property:

to block the parent as long as the first integer of the shared memory contains 0,
to make the child unblock the parent.

You must use the keyword volatile to declare your pointers to the shared integers. It forces the compiler to generate instructions that fetch the current value from memory instead of relying on a copy cached in a register. Also consider printing messages to check that your synchronization is correct (keep these messages, which will be used in the third exercise).
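
The role of volatile can be sketched with two small helpers. The names wait_until_set and set_flag are hypothetical. Note that volatile only constrains the compiler; this sketch makes no claim about atomicity or memory ordering beyond what this simple flag needs.

```c
/* Hypothetical helper: spin until *flag becomes non-zero. Because the
 * pointer is volatile, *flag is reloaded from memory on every iteration. */
static void wait_until_set(volatile int *flag) {
  while (*flag == 0)
    ;  /* busy wait */
}

/* Hypothetical helper: release whoever is spinning on *flag. */
static void set_flag(volatile int *flag) {
  *flag = 1;
}
```

In the lab, flag would point into the mapped segment, for instance volatile int *flag = addr; the parent would then call wait_until_set(&flag[0]) while the child calls set_flag(&flag[0]).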

Instead of using the first integer of the shared memory to synchronize the parent and the child, use a global variable. Why does the synchronization no longer work? Then restore the program so that it uses the first integer of the shared memory segment again.

Similarly, block the child while the second integer is 0, and make the parent unblock the child once the parent itself has been unblocked by the child.

Congratulations, you have just implemented a great inter-process function call! The child acts as the client: it sends a request to the parent, which executes some code while the child waits for the parent's response.


Implementation of a shared memory interface in xv6

The goal of this exercise is to create a new interface to manipulate the shared memory segments in xv6. At this stage, we only deal with interfaces, their implementation remains empty.

Start by getting the code for xv6:

git clone https://gitlabev.imtbs-tsp.eu/csc4508/xv6-riscv.git if you start from a new copy
or git checkout master if you already have a local copy of the repository

Then create a local branch to work:

git checkout -b my-shm

Copy the program shm-posix.c to user/shm.c in the xv6 sources and add it to the xv6 compilation chain (see the UPROGS variable from Makefile). Then, comment out the functions which take care of creating or manipulating the shared memory, then modify your program so that it compiles and runs without error with the xv6 kernel; in particular, replace headers with their xv6 counterparts.

Add the following system calls to xv6:

int shm_create(int size) creates a shared memory segment of size size and returns an identifier to this segment. It is the equivalent of a call to the POSIX function shm_open followed by ftruncate.
int shm_attach(int id, void* addr) attaches the segment id at the address addr. This function is the equivalent of the POSIX mmap function.
int shm_detach(int id) detaches the id segment. This function is the equivalent of the POSIX munmap function.
int shm_destroy(int id) destroys the id segment. This function is the equivalent of the POSIX shm_unlink function.

Each of these functions should return -1 on error. Add a preliminary implementation of these functions in vm.c which prints function ??? not yet implemented and returns -1.

You can use the special macro __func__: it is replaced by the compiler with the name of the current function.
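
A minimal sketch of such stubs. It is written for hosted C so that it can be tried outside xv6: <stdio.h> stands in for the kernel's own printf, and the macro name NOT_IMPLEMENTED is hypothetical.

```c
#include <stdio.h>

/* Hosted-C stand-in: inside xv6 the kernel's printf would be used instead.
 * __func__ is resolved in the function where the macro is expanded. */
#define NOT_IMPLEMENTED() \
  (printf("function %s not yet implemented\n", __func__), -1)

static int shm_create(int size) {
  (void)size;                 /* unused for now */
  return NOT_IMPLEMENTED();
}

static int shm_destroy(int id) {
  (void)id;                   /* unused for now */
  return NOT_IMPLEMENTED();
}
```

The comma operator lets the macro print the message and evaluate to -1 in a single expression.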

If you forgot how to add new system calls, have a look at Lab #4.

Uncomment the code that manipulates the shared memory segments of shm.c and adapt it to use the xv6 system calls. Check that your program has the expected behavior (calling shm_create prints "not yet implemented" and returns -1, shm.c quits with a suitable message).

Creation and attachment of a shared memory segment

Implement shm_create and shm_attach.

To begin with, recall that argint(n, &v) stores in v the system call argument at index n, counting from 0 (so argint(0, &v) retrieves the first argument). Similarly, argaddr(n, &v) retrieves an argument as an address (of type uint64).

To implement these functions, we advise you to use the following data structure (to be added to vm.c) to represent the set of shared memory segments:

// It is best you define the constants in kernel/param.h.
#define SHM_N   16
#define SHM_PAGES_N 10

struct {
  struct spinlock lock;         // #include "spinlock.h"
  struct {
    uint64 pages[SHM_PAGES_N];
    int   npages;
    int   nused;
  } shms[SHM_N];
} shms;

In this structure:

shms.lock is a lock used to protect against concurrent accesses to the table of shared memory segments.
shms.shms[id] describes the shared memory segment with identifier id (up to SHM_N different segments):
- pages contains the physical addresses of the shared pages (up to SHM_PAGES_N, see below);
- npages gives the actual number of pages of the segment; it is equal to 0 if the entry is not used, i.e. if there is no shared memory segment with identifier id;
- nused gives the number of times the segment has been attached (this field will be used in the following exercise).

Initially, the structure is filled with zeros. Specifically, we can know that all the entries in the table are free since, for any segment i, shms.shms[i].npages == 0.

Finally, for managing memory, you should know that:

The kernel function void* kalloc() allocates a new physical page and returns its physical address (which can be treated as a uint64).
The function memset() can be used to fill a memory area with a value.
mappages(myproc()->pagetable, vaddr, PGSIZE, paddr, PTE_R|PTE_W|PTE_U) maps the physical address page paddr to the virtual address vaddr in the current process's page table (myproc()->pagetable, requires including proc.h), and gives the current user process read and write permission to this page.
To calculate the number of memory pages needed to store size bytes, you can use PGROUNDUP(size)/PGSIZE.
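
The page count computation can be checked in isolation with the sketch below. PGSIZE and PGROUNDUP are reproduced here so that the example compiles on its own (in xv6 they come from kernel/riscv.h), and the helper name shm_npages is hypothetical.

```c
/* Reproduced so the sketch is self-contained; xv6 defines these macros
 * in kernel/riscv.h. */
#define PGSIZE 4096
#define PGROUNDUP(sz) (((sz) + PGSIZE - 1) & ~(PGSIZE - 1))

/* Hypothetical helper: number of pages needed to hold size bytes. */
static int shm_npages(int size) {
  return PGROUNDUP(size) / PGSIZE;
}
```

For the 8192-byte segment used in this lab, this gives 2 pages; a single extra byte would require a third page.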

When testing your code, you will probably make your kernel panic with "freewalk: leaf". This is expected, and does not mean your code is wrong.

Indeed, xv6 expects a process to only have contiguous pages in its heap. The heap grows through the following chain of function calls:

the user process calls the syscall sbrk;
the implementation of sbrk calls growproc, in kernel/proc.c;
which in turn calls uvmalloc, in kernel/vm.c: it allocates physical pages and maps them contiguously in the process's virtual address space.

This is for growing the heap; then, when a process terminates and is waited for, xv6 frees its memory:

it calls freeproc in kernel/proc.c;
which eventually calls uvmfree in kernel/vm.c;
this function unmaps all the process's pages, i.e., the leaves in its page table, using uvmunmap;
and the last step to free an exited user process is to call freewalk to free the pages used by its pagetable (to free the PTEs).

However, uvmunmap is called on the assumption that all of the process's pages are contiguous (as explained above) and span the virtual addresses from 0 to p->sz, i.e., the size reached by the process's heap as it grew with uvmalloc.

In conclusion, uvmunmap misses the shared memory segments. The next function called, freewalk, expects all leaves (all pages allocated to the process) to have been unmapped beforehand. This is not the case for your shared memory segments, so it panics.

You've done a good job already, but if you want to go further, this problem will be solved in the next exercise.

The shm_create function must:

calculate the number of memory pages to allocate;
find a free shm (i.e. whose npages is 0);
allocate the memory pages with kalloc, and store their addresses in the pages array;
fill the allocated pages with 0s;
return the identifier of the segment.

The shm_attach function must map the pages of the shared memory segment into the process's page table, and increment the nused reference count.
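
The bookkeeping of shm_create can be prototyped in user space before moving it into the kernel. The sketch below is a model under stated assumptions, not the kernel implementation: kalloc and kfree are stand-ins built on aligned_alloc and free, the spinlock is left out (a comment marks where it belongs), and nothing is mapped into a page table.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint64_t uint64;             /* xv6 spelling of the type */
#define PGSIZE 4096
#define PGROUNDUP(sz) (((sz) + PGSIZE - 1) & ~(PGSIZE - 1))
#define SHM_N 16
#define SHM_PAGES_N 10

/* Stand-ins for the kernel allocator (assumption: hosted C, not xv6). */
static void *kalloc(void) { return aligned_alloc(PGSIZE, PGSIZE); }
static void kfree(void *pa) { free(pa); }

static struct {
  /* struct spinlock lock; omitted here: in the kernel, acquire it before
   * scanning the table and release it before returning. */
  struct {
    uint64 pages[SHM_PAGES_N];
    int npages;
    int nused;
  } shms[SHM_N];
} shms;

static int shm_create(int size) {
  if (size <= 0 || size > SHM_PAGES_N * PGSIZE)
    return -1;
  int npages = PGROUNDUP(size) / PGSIZE;

  for (int id = 0; id < SHM_N; id++) {
    if (shms.shms[id].npages != 0)
      continue;                      /* entry already in use */
    for (int i = 0; i < npages; i++) {
      void *pa = kalloc();
      if (pa == 0) {                 /* roll back the pages obtained so far */
        while (i-- > 0)
          kfree((void *)(uintptr_t)shms.shms[id].pages[i]);
        return -1;
      }
      memset(pa, 0, PGSIZE);         /* a new segment contains only 0s */
      shms.shms[id].pages[i] = (uint64)(uintptr_t)pa;
    }
    shms.shms[id].npages = npages;
    shms.shms[id].nused = 0;
    return id;
  }
  return -1;                         /* no free entry in the table */
}
```

Setting npages only after all pages have been obtained keeps the entry marked as free if an allocation fails halfway.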

To go further: detach and destroy a shared memory segment

The aim of this exercise is to implement the shm_detach and shm_destroy functions.

To get started, in order to detach the segment id from the process p, you need to know at which address this segment was mapped. To do this, modify the proc structure so that it keeps the (id, addr) associations, where id is a segment identifier and addr is the address at which the segment is mapped. Edit the shm_attach function so that it stores this association.
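
One possible layout for these associations is an array indexed by segment identifier, sketched below in hosted C. The type shm_map and its three helpers are hypothetical; the sketch also assumes that a process attaches a given segment at most once and that 0 is never used as an attach address, so that 0 can mean "not attached".

```c
#include <stdint.h>

typedef uint64_t uint64;             /* xv6 spelling of the type */
#define SHM_N 16

/* Hypothetical per-process table; in xv6 it would be a field of struct proc. */
struct shm_map {
  uint64 addr[SHM_N];                /* addr[id], or 0 if id is not attached */
};

/* Remember that segment id is mapped at addr. Returns 0, or -1 on error. */
static int shm_map_record(struct shm_map *m, int id, uint64 addr) {
  if (id < 0 || id >= SHM_N || addr == 0 || m->addr[id] != 0)
    return -1;
  m->addr[id] = addr;
  return 0;
}

/* Address at which segment id is mapped, or 0 if it is not attached. */
static uint64 shm_map_lookup(const struct shm_map *m, int id) {
  if (id < 0 || id >= SHM_N)
    return 0;
  return m->addr[id];
}

/* Forget the association and return the address it held (0 if none). */
static uint64 shm_map_forget(struct shm_map *m, int id) {
  uint64 old = shm_map_lookup(m, id);
  if (old != 0)
    m->addr[id] = 0;
  return old;
}
```

With this layout, detaching amounts to calling shm_map_forget and unmapping the returned address.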

The kernel call uvmunmap(myproc()->pagetable, vaddr, npages, 0) unmaps npages pages starting at virtual address vaddr without freeing them.

Finally, you should know that kfree frees a memory page.

To go even further, you can:

Prevent the destruction of a segment if it is still mapped in another process using the nused field.
Automatically detach the attached segments from a process when its resources are destroyed (function freeproc in proc.c). To do so, write a function uint64 shm_detach(struct proc *p, int shm) containing the actual implementation, and make the syscall implementation call it.
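
The reference counting behind the first suggestion can be modelled with a reduced table that keeps only npages and nused. The names segs, seg_attach, seg_detach and seg_destroy are hypothetical; page allocation, page tables and locking are deliberately left out of this sketch.

```c
#define SHM_N 16

/* Reduced model of the segment table: only the two counters matter here. */
static struct {
  int npages;                        /* 0 means the entry is free */
  int nused;                         /* number of current attachments */
} segs[SHM_N];

static int seg_valid(int id) {
  return id >= 0 && id < SHM_N && segs[id].npages != 0;
}

static int seg_attach(int id) {
  if (!seg_valid(id))
    return -1;
  segs[id].nused++;
  return 0;
}

static int seg_detach(int id) {
  if (!seg_valid(id) || segs[id].nused == 0)
    return -1;
  segs[id].nused--;
  return 0;
}

/* Refuse to destroy a segment that is still attached somewhere. */
static int seg_destroy(int id) {
  if (!seg_valid(id) || segs[id].nused > 0)
    return -1;
  segs[id].npages = 0;               /* the entry becomes free again */
  return 0;
}
```

In the kernel, the same checks would run while holding shms.lock, and seg_destroy would also release the pages with kfree.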