OpenMP - Labs
Hello world
Write a sequential C program that prints "Hello world".
- Compile and execute it.
Modify your program so that each thread taking part in the
computation prints the message "Hello world".
- How many threads are launched by default?
- Set the number of threads to 4.
- Add to the message the id of the thread that actually prints it.
Check that your program remains backward compatible with its
initial sequential version by compiling it without the OpenMP
option. If necessary, modify your program to guard the OpenMP
library calls.
Variable scopes
Play with the private and firstprivate clauses:
- Declare and initialize two integers.
- In a parallel region with one variable declared private and the other firstprivate, print the sum of the current thread id and each of these variables.
Play with the shared clause:
- In the sequential region, declare and initialize a third variable.
- In the parallel region with this third variable declared as shared, print it and verify its value.
The for directive
Loop parallelization
- Implement a sequential SAXPY-style program in C in which a scalar is added to each cell of an array.
- Parallelize it with OpenMP.
Scheduling
- Using a call to the omp_get_schedule function, determine which scheduling policy your system uses by default.
- Play with the other available scheduling policies by printing the thread identifiers and the loop iteration numbers. Observe how the iterations are distributed.
Synchronization
- Duplicate the loop, using static scheduling with a chunk size of array size / 2.
- Print a message "After loop" after this second parallel loop.
- Set your OMP_NUM_THREADS environment variable to 4 and observe the scheduling with and without the nowait clause.
Reduction in a parallel loop
- In a new loop, compute the sum of the array elements by accumulating it in a variable sum.
- Parallelize it with OpenMP.
- Compare the sequential and parallel execution times using the function double omp_get_wtime(void).
The critical directive
Even though it is inefficient, implement the same reduction of
the elements of the same array with a critical directive.
Stencil - Homogeneous computation load
A stencil consists of passing a filter over
the elements of a data structure in order to update their value
according to their neighboring elements.
In this exercise, the objective is to parallelize a 5-point stencil
that updates each element of a matrix with the sum of 5 elements:
the current element, the element to its left, the element to its
right, the element above and the element below.
Starting from this sequential code, implement a
parallel OpenMP version.
Evaluate the performance obtained by varying the number of
threads and compare it with the performance of the
sequential version.
Experiment with the collapse clause to refine the grain of
your parallelization.
What do you notice in terms of performance? Play with
different scheduling strategies to achieve efficient
parallelism.
In order to optimize the computational load and the access to
cached data, modify the algorithm by creating calculation tiles.