The second topic is an N-body simulation: simulating the evolution of a dynamic system of N bodies given their masses, initial positions, and initial velocities.


N-body simulations are extensively used in astrophysics to study how galaxies composed of millions of bodies behave. For instance, the Max Planck Society's Supercomputing Center simulated how 10 billion particles interact. This kind of simulation is extremely compute-intensive (that simulation ran on the whole supercomputer for a month).
Multiple algorithms have been designed for N-body simulations. In this project, we focus on two of them:

Brute-force implementation

The naive algorithm (in nbody_brute_force.c) computes, at each time step, the force applied to each particle and then its new position.
The force applied to particle p1 is computed by summing the forces that all the other particles apply to p1:
for(i=0; i < nparticles; i++) {
    compute_force(p1, p[i]->x_pos, p[i]->y_pos, p[i]->mass);
}
The complexity of the force computation is thus O(n^2) for each time step.
Once the forces are computed, the new position of each particle can be computed:
for(i=0; i < nparticles; i++) {
    move_particle(&particles[i], step);
}

Barnes-Hut implementation

In order to reduce the complexity of the simulation, the Barnes-Hut algorithm (see nbody_barnes_hut.c) uses approximations during the phase that computes the forces applied to a particle. The main idea of this algorithm is to group particles that are close to each other into one big "virtual" particle. When computing the forces applied to a particle p1, if a group of particles is far enough from p1, the cumulative force of the particles is approximated by the force of the virtual particle whose position is the barycenter of the particles and whose mass is the sum of the particles' masses.
To do that, the space (in this case, a two-dimensional space) is recursively split into 4 quadrants, forming a quad-tree. Each node of the quad-tree corresponds to a region of the 2D space. Each node is recursively split into 4 sub-nodes until each node contains at most one particle. The result is a tree whose leaves contain the particles. Here is an example of a tree containing 8 particles:
When creating the tree, the barycenter of each node is computed. The computation of the forces applied to a particle p1 is then done recursively, starting from the root of the tree. For each node n, the following algorithm is applied:
The value of θ is chosen arbitrarily and controls the accuracy of the approximation. For this project, we assume that θ = 2.
Once the computation of the force applied to the particles is complete, the new position of the particles is computed, and a new tree corresponding to the new position is created.

Source Code

The source code for this project is available here: nbody.tgz. This archive contains:

The two applications perform an N-body simulation of N particles for a fixed duration T. Both N and T can be specified at runtime as the first and second command-line arguments. Here are two examples, for the brute-force and Barnes-Hut versions:
 $ ./nbody_brute_force 1000 2
nparticles: 1000
T_FINAL: 2.000000
Simulation took 6.344124 s to complete

$ ./nbody_barnes_hut 1000 2
nparticles: 1000
T_FINAL: 2.000000
Simulation took 0.583025 s to complete
By default, the position of the particles is displayed in an X window during the computation. To measure raw performance, you should disable the display by uncommenting the # DISPLAY = line in the Makefile.
The position of the particles at the end of the simulation can be dumped into a file named particles.log. To enable this feature, uncomment the #DUMP = -DDUMP_RESULT line in the Makefile. This feature is mandatory to check whether your parallel version is correct (by comparing the sequential dump with the parallel one).

Parallelism Hint

The main source of parallelism is the processing of each particle. However, the way to parallelize the code differs between the two versions (brute force and Barnes-Hut).