System calls
The goal of this lab is to make you familiar with xv6, a kernel developed at MIT to learn operating systems. The second goal of this lab is that you understand how a system call works. To do this, you will implement a new synchronization mechanism in the kernel and explore the basic structures of structures of xv6.
Installation dependencies (∼ 10mn)
Installation (∼ 20mn)
This command starts xv6 in the qemu emulator. When xv6 starts, it launches a simple shell allowing to execute some basic commands. The code of shell is located in the user/sh.c file.
To exit QEMU, type Ctrl-a x, i.e., hold Ctrl, type a, release Ctrl, type x. Ctrl-a is used to send a control sequence to control QEMU, for instance x to quit.
If you look at the source code, you can find the classic architecture of a system: the kernel image is called kernel (in the directory kernel/), and consists of the files indicated in the variable OBJS that you will find in the Makefile. Each of the basic command in the user space is implemented in a source file (cat.c, sh.c etc.) in the directory user. You can find the list of these commands, each prefixed with an underscore _, in the variable UPROGS (user commands) of the Makefile.
It is also possible to run xv6 in debug mode for directly seeing the low-level kernel operation. To do this, run:
In another terminal, navigate to the root of your sources of xv6, then run:
Once gdb starts correctly, type b main (break main) to instruct gdb to interrupt the execution of xv6 when it reaches the main function. Then, resume the execution of xv6 with the command c (continue), in other words, let xv6 finish booting. If all goes well, xv6 stops at the start of the main function.
Display the code of the main function with l (list). You can examine the code that initializes the system (the source code of this function can be found in main.c. Technically, this code is called from a trampoline in assembler which is located in entry.S and which takes care of installing the vital minimum allowing to switch to C. So the boot process for xv6 is the following:
- QEMU makes the boot CPU jump at the start of the kernel, this is the assembler trampoline at the _entry label;
- the assembler part correctly initializes the processor (we will explain how later) then jumps into the main function.
First system call (∼ 1h10)
The goal of this exercise is to add new system call to xv6. The system call we create is relatively artificial, but reproduces in a fairly realistic way the operation of the Linux futexes that you have studied in previous labs.
To start, we create a new command named ftest allowing us to test our new system call. This command, like any command, is executed in userland and therefore, ultimately, interacts with the kernel. For the moment, our command simply performs a simple display.
To do this, start by creating a ftest.c file in the directory user. In this file, write a function main which:
- uses printf() to display "starting".
- calls exit(RET_CODE) to exit the command with the given return code (e.g., 0).
- int fwake(int slot, int val) : this function considers that the kernel has internal variables, called slots, that are numbered from 0 to 5. The fwake function assigns the value val to the slot th slot, then wakes up the processes that are waiting for the variable slot to change value (see next function);
- int fwait(int slot, int val) : this function pause the process as long as the slot number slot is not val.
A typical usage of fwake/fwait is given here:
This program starts by creating a child process that waits until slot 1 is set to 42 (initially, a slot is set to 0). The father process sleeps for 50 ticks (around 5s, time is managed in a very approximate way in xv6) then sets slot 1 to 42, which wakes up the child process.
Start by replacing your code in ftest.c with this example code.
- give a syscall number to our system calls by adding SYS_fwait (number 22) and SYS_fwake (number 23) in kernel/syscall.h;
- then, add the user-side "implementation" of the syscalls: in user/usys.pl, add two entry's similar to the existing syscalls in this file;
- the file user/usys.pl is a Perl script that generates user/usys.S, an assembly file with the actual syscall stubs
- for the details, the generated assembly code does the following:
- tell the assembler that the symbol we are defining is a global,
- define a label with the name of the syscall (fwait, fwake),
- load the number of the syscall (the macro SYS_fwait or SYS_wake you defined just before in kernel/syscall.h) into register a7,
- execute the instruction ecall to trigger an environment call of the number defined by content of the register a7 (an environment call is a generalization of a syscall: one privilege level calls into the immediately more privileged level underneath, so in the case of a syscall, the user mode calls into the supervisor mode, i.e., the kernel),
- return with ret;
- then add the declarations of these user-side functions to the file user/user.h:
- they are the signatures of the assembler functions generated by user/usys.pl in user/usys.S,
- the signatures were given in the previous question: int fwait(int slot, int val) and int fwake(int slot, int val).
The test part in userland allowing to test the new system calls is now ready. Since you have not yet associated any code with these system calls in the kernel, calls must fail miserably with "unknown sys call 22" and "unknown sys call 23 " messages. Check that this is the case by compiling, by running xv6 (make qemu) and running your ftest command.
First, we take a look at what's going on in userland. To do this, open the file user/ftest.asm. This file contains the assembly code of your command ftest:
- In that file, find the code corresponding to the call fwait(1, 42). You should see the setting up of arguments (li a1,42 and li a0, 1), and a call to fwait(): the instruction pair auipc ra,0x0; jalr 894(ra) is used to perform a jump relative to the current PC of 894 bytes. If you add the address (in hexadecimal) of the auipc instruction to the offset given to jalr (which might by different in your assembly code), you should get the address of the label fwait: written in the comment of the jalr instruction.
- We can then follow the flow of execution by looking for this label (or its address in hexadecimal) in the file. You should see the assembly code for the fwait() function that you wrote in usys.S.
From the call to ecall, the RISC-V platform switches to supervisor mode, and we can follow the code flow:
- the CPU continues its execution in the function uservec (in kernel/trampoline.S), that was installed as the trap vector when the kernel switched to the user process (the CPU knows it must jump there when a trap from userspace occurs);
- this function saves the execution state of the user process to a memory area (the trapframe), then restores the kernel context before jumping to the C function usertrap (kernel/trap.c), two steps that are non trivial:
- A first major issue when handling traps in RISC-V, is that the supervisor code does not have any general-purpose register available (because they may all contain values for the user process). But a register is needed to hold the address of the trapframe where to store the execution state. The kernel uses the special register sscratch, that it sets before switching to the user process, to the address of the trapframe. Then, is swaps the contents of a0 and sscratch with csrw sscratch, a0, so the user process's value of a0 is saved into sscratch, and then loads the address of the trapframe to a0.
- A second major issue, is that the CPU jumps to the interrupt vector in supervisor mode, but keeps the page table of the process that trapped. But to handle the trap, the kernel code must restore the kernel's page table. It means extra care must be put into ensuring the code can execute both in the user process page table and in the kernel page table. This is the purpose of this "trampoline" code: it is mapped by the kernel at the same virtual addresses in all user processes, and in the kernel page table.
- the usertrap() function checks the content of the register scause to determine the cause of the trap, if this is a syscall, it delegates to syscall() (kernel/syscall.c);
- the syscall() generic function checks the syscall number in the register a7 (remember the li a7, SYS_fwait in userland), and calls the matching syscall implementation (or displays the error message you saw in the previous test if it does not exist);
- it also sets the return value in the register a0 to the value returned by the syscall implement (or -1 because the call was unsuccessful);
- finally, the execution flow unstacks the calls, goes through the function usertrapret to prepare the switch back to the user process, goes back to the return trampoline function userret (in kernel/trampoline.S) that restores the context of the user process, and eventually returns to the user process itself.
To navigate more easily in the sources of xv6, we invite you to use the ctags tool (sudo apt-get install exuberant-ctags) which allows you to index source code and perform research.
- To generate an index, run the command make tags which invokes etags *.S *.c (you can also modify the Makefile to also index *.h files).
- With emacs:
- To find the definition of a function (or of a type), position the cursor on the type, then do M-. ("META dot", so the Esc key then the . key)
- To go back, type M-,
- With VIM:
- To find the definition of a function or a type, position the cursor on the function, then type C-] (Ctrl and ])
- To go back, type C-t
In the kernel/syscall.c file, modify the syscalls table to match respectively the system call number SYS_fwait to its kernel implementation sys_fwait(), and the system call number SYS_fwake to its kernel implementation sys_fwake(). Also, add the signatures of these functions just before the definition of the syscalls structure, as it is done for sys_uptime() for instance.
In proc.c, which takes care of managing the processes and who therefore owns the list of processes, add the functions sys_fwait() and sys_fwake(). Each of these functions must display a message using the function kernel printf then return 0. The printf on the kernel side behaves much like the printf function that you know, but it is no the same implementation, as it can only print on the console (no notion of stdout).
Check that the messages are displayed when you run the command ftest.
Implementing system calls (∼ 1h)
As the xv6 kernel can run in parallel on multiple cores, you must acquire a lock before accessing slots. Indeed, it is quite possible that another core executes fwake while fwait is being executed, so the fwake would be missed and the process that called fwait would be stuck waiting.
To put the lock in place:
- declare a variable struct spinlock slots_lock; in proc.c;
- initialize this lock in the function procinit() like other existing locks;
In sys_wait, you must be able to find the structure proc corresponding to the current process. For this you can call myproc(). Next, as long as slots[slot] is not equal to val (where slot and val are the arguments of sys_wait), it should :
- set the wait_slot field of the current process to 1;
- set the state of the current process to SLEEPING, which makes the process ineligible;
- call sched() so that xv6 runs another process (xv6 chooses a process in RUNNABLE state and therefore ignore our process),
- reset wait_slot to 0.
Be careful with the locks:
- The operations on the proc structure must be done while holding the process's lock (including calling sched()).
- The lock on slots should be released once the lock on the process is taken, and taken again before releasing the latter.
- extract the slot and val arguments;
- acquire the lock slots_lock;
- assign the value val to slots[slot];
- browse the process table (from proc[0] to proc[NPROC]), and for each of them, return it in the RUNNABLE state if it is in the SLEEPING state and its wait_slot variable is true (do not forget to take the process's lock);
- release the lock slots_lock.
Very important question for the future!
Compare the code of your functions sys_fwait()/sys_fwake() with the code of the functions sleep() and wakeup() in the same file proc.c.
You should notice that these functions generalize the principle of waiting. sleep pauses a process while waiting for an event on chan(), and wakeup() wakes up processes that are waiting for an event on chan. We could therefore have implemented our functions as follows:
In order to see the code of the solution, execute the following sequence of instructions in a terminal:
And to see only the added code: