CSC 5001 – High Performance Systems

Portail informatique

SLURM - Documentation

SSH configuration

Creating SSH keys

First, you need an SSH key. If you don't have one, generate it with:

    $ ssh-keygen
    Generating public/private rsa key pair.
    Enter file in which to save the key (/home/student/.ssh/id_rsa):
    Created directory '/home/student/.ssh'.
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /home/student/.ssh/id_rsa
    Your public key has been saved in /home/student/.ssh/id_rsa.pub
    The key fingerprint is:
    SHA256:9a0YRBGllulLZFnbb+nRycVZRIxJbItVoq9/fQYIGcM student@9b380233fc88
    The key's randomart image is:
    +---[RSA 3072]----+
    | .++oooB=|
    | .E* +*o+|
    | @+o+.oo|
    | Bo..oo.=|
    | S +...o=+|
    | . +.oo..|
    | o o o.|
    | . =|
    | .o.|
    +----[SHA256]-----+

This generates a pair of SSH keys consisting of:
  • /home/student/.ssh/id_rsa: a private key (which you should never share!)
  • /home/student/.ssh/id_rsa.pub: a public key (which you can share)
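
To check whether a key pair already exists before generating a new one (ssh-keygen asks before overwriting an existing file), something like the following sketch can be used. It assumes the default RSA path from the transcript; has_ssh_key is a hypothetical helper name, not a standard command.

```shell
# Sketch: report whether a key pair already exists.
# has_ssh_key is a hypothetical helper; the default path is the one
# shown in the ssh-keygen transcript (~/.ssh/id_rsa).
has_ssh_key() {
    key="${1:-$HOME/.ssh/id_rsa}"
    # Both the private key and its .pub counterpart must be present.
    [ -f "$key" ] && [ -f "$key.pub" ]
}

if has_ssh_key; then
    echo "Key pair found; no need to run ssh-keygen."
else
    echo "No key pair found; run ssh-keygen to create one."
fi
```

If your key uses another name or type, pass its path as the first argument, for example has_ssh_key "$HOME/.ssh/my_other_key" (a made-up file name).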

Connection from a laptop to the lab machines

The lab machines are only accessible from the TSP network. To connect directly to one of the lab machines from outside, set up an SSH proxy jump by adding the following lines to the file $HOME/.ssh/config (you may need to create this file):

    ServerAliveInterval 300
    SendEnv LANG LC_*

    Host *
        ForwardAgent yes
        ForwardX11 yes
        ForwardX11Trusted yes

    Host tsp
        User trahay_f
        Hostname ssh1.imtbs-tsp.eu

    Host arcadia-slurm-controller
        HostName 157.159.104.130
        User trahay_f
        ProxyJump tsp

Replace trahay_f with your own TSP login. You should now be able to connect to the cluster frontend by running ssh arcadia-slurm-controller. If the machine asks for your password, copy your SSH public key to it with the following command:

    $ ssh-copy-id arcadia-slurm-controller
    /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/student/.ssh/id_rsa.pub"
    /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
    /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
    trahay_f@ssh1.imtbs-tsp.eu's password:

    Number of key(s) added: 1

    Now try logging into the machine, with: "ssh 'arcadia-slurm-controller'"
    and check to make sure that only the key(s) you wanted were added.
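
If the connection does not behave as expected, one way to check how OpenSSH interprets the configuration is ssh -G, which prints the resolved settings for a host alias without connecting. The sketch below assumes a reasonably recent OpenSSH client on your laptop; show_ssh_route is a hypothetical helper name.

```shell
# Sketch: print the host name, user and proxy jump that OpenSSH would
# use for a host alias, without opening a connection.
# show_ssh_route is a hypothetical helper, not an OpenSSH command.
show_ssh_route() {
    host="$1"
    config="${2:-$HOME/.ssh/config}"
    # ssh -G prints the resolved configuration for the host and exits;
    # -F selects the configuration file to read.
    ssh -G -F "$config" "$host" | grep -E '^(hostname|user|proxyjump) '
}

# Usage (run on your laptop):
#   show_ssh_route arcadia-slurm-controller
```

With the configuration above, the output should list the controller's address, your TSP login, and tsp as the jump host.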

Submitting Slurm jobs

First, connect to arcadia-slurm-controller, from which you can request compute nodes, e.g.:

    [my_laptop] $ ssh -X arcadia-slurm-controller
    [arcadia] $ srun --x11 --account=csc_5001 --partition=starfighter --qos=normal --time=02:00:00 --cpus-per-task=16 --mem=32G --gres=gpu:1 --pty bash
    [starfighter-slurm-node-01-1] $

Requesting nodes

Request one node in interactive mode:

    srun --x11 --account=csc_5001 --partition=starfighter --qos=normal --time=02:00:00 --cpus-per-task=16 --mem=32G --gres=gpu:1 --pty bash

Request one node in interactive mode (sharing with other students):

    srun --x11 --account=csc_5001 --partition=starfighter --qos=normal --time=02:00:00 --cpus-per-task=16 --mem=32G --oversubscribe --pty bash

Running a command on one node:

    srun --x11 --account=csc_5001 --partition=starfighter --qos=normal --time=02:00:00 --cpus-per-task=16 --mem=32G --gres=gpu:1 command

Running a command on 4 nodes:

    srun --x11 --account=csc_5001 --partition=starfighter --qos=normal --time=02:00:00 -N 4 --cpus-per-task=16 --mem=32G --gres=gpu:1 command
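
Since these requests share most of their options, a small shell function can save some typing. The following is only a sketch: csc_srun and the DRY_RUN variable are hypothetical names, the option values are copied from the examples in this section, and --x11 is left out so that it is only added when needed.

```shell
# Sketch: wrap the srun options used in this section.
# csc_srun NODES [EXTRA_SRUN_OPTIONS...] COMMAND [ARGS...]
# csc_srun and DRY_RUN are hypothetical names, not part of Slurm.
csc_srun() {
    nodes="$1"
    shift
    # Rebuild the argument list: fixed options first, then whatever
    # the caller passed (extra srun options and the command itself).
    set -- srun -N "$nodes" \
        --account=csc_5001 --partition=starfighter --qos=normal \
        --time=02:00:00 --cpus-per-task=16 --mem=32G --gres=gpu:1 "$@"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command line instead of running it.
        echo "$@"
    else
        "$@"
    fi
}

# Preview the request for a command on 4 nodes without submitting it.
DRY_RUN=1 csc_srun 4 hostname
```

Extra srun options can be placed before the command, for example csc_srun 1 --x11 --pty bash for an interactive session.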