Cortex

Specifications

Hardware

IBM AC-922 consisting of 2 NUMA nodes, each with the following configuration:

CPU	RAM	GPU
Power9 16 cores SMT4	128GB + 32GB	Nvidia Tesla V100 32GB SXM2

Storage available to users is provided by a 6TB NVMe card mounted in /home.

Operating System

Ubuntu 18.04.2 Bionic Beaver ppc64le

cortex> uname -a
Linux cortex.interne.mines-paristech.fr 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:20 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux

IBM Power9 and PowerAI documentation

Configuration

Users

Accounts on the server are divided into two types depending on the level of access granted to the user: regular users have the usual UNIX accounts and get a login shell on the global system, while dockerized users work in a dedicated Docker container.

Regular

Regular users are not subject to any resource limitation and are added to the group docker.

Dockerized

These users only have access to the machine through a Docker container and therefore do not see the global system:

The login shell is set to /usr/bin/dockerized_login, which logs them directly into their own container, for instance when connecting with ssh.
The home directory is located at /home/dockerized/export and is subject to disk quota; the directory is mounted under /home in the container.
The scratch directory used for computations and data sets is located at /home/dockerized/data, linked from /data and subject to disk quota; the directory is mounted under /data in the container.

Default directories in containers

Home directory	Scratch directory
`/home/${USER}`	`/data/${USER}`

Default resource limitations

CPU	RAM	Container space	Home directory	Scratch directory	GPU
2 threads	32GB	100GB	16GB	32GB	None

GPU access is granted per user, on request.

Job Scheduler

Computational jobs are managed using Slurm.

The following setup is in place:

User containers only act as submit nodes authenticated using munge to the control host cortex.
Submitted jobs are dispatched, based on the user, to specific Docker containers dedicated as compute nodes (their resources are limited by default).
Job accounting is enabled on the control host for each user and group.

Administration

User management

Dockerized

Accounts are created with useradd the usual way together with their home directory.
Users have no password: only SSH key authentication is allowed, and the keys are not writable by the user.
A corresponding Docker container is instantiated with the user name as identifier.
The user account is created in the container and the uid and gid are mapped.

Disk quota

Disk quota is enabled in /home and is limited to 16GB per user using BTRFS qgroups.

Enable quota on a volume

cortex# btrfs quota enable /home

List qgroups on a volume

cortex# btrfs qgroup show /home

List subvolumes of a volume

cortex# btrfs subvolume list /home

Create a new subvolume

cortex# btrfs subvolume create /home/dockerized/export/user

Set quota on subvolumes (here for user mig0)

cortex> pwd
/home/dockerized/export

cortex> btrfs subvolume list .
ID 257 gen 3086 top level 5 path @home
ID 6278 gen 3073 top level 257 path dockerized/data/mig0
ID 6279 gen 3072 top level 257 path dockerized/data/mig1
ID 6280 gen 3072 top level 257 path dockerized/data/mig2
ID 6281 gen 3072 top level 257 path dockerized/data/mig3
ID 6282 gen 3072 top level 257 path dockerized/data/mig4
ID 6283 gen 3072 top level 257 path dockerized/data/mig5
ID 6284 gen 3072 top level 257 path dockerized/data/mig6
ID 6285 gen 3084 top level 257 path dockerized/export/mig0
ID 6286 gen 3085 top level 257 path dockerized/export/mig1
ID 6287 gen 3085 top level 257 path dockerized/export/mig2
ID 6288 gen 3085 top level 257 path dockerized/export/mig3
ID 6289 gen 3085 top level 257 path dockerized/export/mig4
ID 6290 gen 3085 top level 257 path dockerized/export/mig5
ID 6291 gen 3085 top level 257 path dockerized/export/mig6

cortex> sudo btrfs qgroup limit 32g 0/6278 /home

cortex> sudo btrfs qgroup limit 16g 0/6285 /home
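In the example above, the subvolume ID passed to `btrfs qgroup limit` is read off the listing by hand. A minimal sketch of automating that lookup is given here; `subvol_id` is a hypothetical helper name (not part of btrfs-progs), and it assumes that subvolume paths contain no whitespace.

```shell
# subvol_id PATH
# Reads `btrfs subvolume list` output on stdin and prints the ID of the
# subvolume whose path (last field) is exactly PATH.
# Assumption: subvolume paths contain no whitespace.
subvol_id() {
    awk -v path="$1" '$1 == "ID" && $NF == path { print $2 }'
}
```

It could be used, as root, along the lines of `btrfs qgroup limit 16g "0/$(btrfs subvolume list /home | subvol_id dockerized/export/mig0)" /home`. If no subvolume matches, the helper prints nothing and the resulting qgroup identifier `0/` is incomplete, so the output should be checked before setting a limit.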

Docker containers

Docker containers can only be managed by members of the docker group. A container can be created for a user named ${USER} with primary group ${GROUP} from an existing image named ${DOCKER_IMAGE} using:

cortex> docker create \
        --hostname ${USER} \
        --name ${USER} \
        --cpus 2 \
        --memory 16g \
        --runtime=nvidia \
        --storage-opt size=100G \
        -v ${DATADIR}:/data \
        -v ${WORKDIR}:/home/${USER} \
        -w /home/${USER} \
        -it ${DOCKER_IMAGE} /bin/bash

The container can be started with:

cortex> docker start ${USER}

The user should then be created in the container with the same uid and gid as on the host:

cortex> docker exec -it ${USER} groupadd -g $(id -g ${USER}) ${GROUP}
cortex> docker exec -it ${USER} useradd -b /home -g ${GROUP} -u $(id -u ${USER}) -s /bin/bash ${USER}
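The three steps above (create, start, add the user) can be chained in one function so that a failure stops the sequence. The following is only a sketch under assumptions: `run`, `provision_container` and the `DRY_RUN` variable are hypothetical names introduced for illustration, the resource values are copied from the command above, and `-it` is dropped from `docker exec` so that the function also works without a terminal.

```shell
# Run a command, or only print it when DRY_RUN is set (hypothetical helper).
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# provision_container USER GROUP IMAGE DATADIR WORKDIR
# USER must already exist on the host: its uid and gid are reused
# inside the container.
provision_container() {
    user="$1" group="$2" image="$3" datadir="$4" workdir="$5"
    uid=$(id -u "$user") || return 1
    gid=$(id -g "$user") || return 1
    run docker create \
        --hostname "$user" --name "$user" \
        --cpus 2 --memory 16g --runtime=nvidia \
        --storage-opt size=100G \
        -v "$datadir:/data" -v "$workdir:/home/$user" \
        -w "/home/$user" -it "$image" /bin/bash || return 1
    run docker start "$user" || return 1
    run docker exec "$user" groupadd -g "$gid" "$group" || return 1
    run docker exec "$user" useradd -b /home -g "$group" -u "$uid" \
        -s /bin/bash "$user"
}
```

Setting `DRY_RUN=1` before the call prints the commands instead of executing them, which makes it possible to review them first.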

GPU resources

Available GPU resources can be inspected with the Nvidia utility:

cortex> nvidia-smi 
Sun Nov 17 20:34:12 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.01    Driver Version: 418.87.01    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000004:04:00.0 Off |                    0 |
| N/A   29C    P0    37W / 300W |      0MiB / 32480MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  On   | 00000035:03:00.0 Off |                    0 |
| N/A   27C    P0    36W / 300W |      0MiB / 32480MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Jobs that use GPU resources can be submitted to the Slurm queue with the GRES specification flag, for example for 1 GPU:

cortex> srun --gres=gpu:1 ./exe
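For non-interactive work, the same request can go into a batch script submitted with `sbatch`. A minimal sketch follows; `make_gpu_job` and `GRES_NAME` are hypothetical names introduced here, and the default GRES name `gpu` is an assumption about this site's configuration (the names actually configured can be listed with `sinfo -o "%N %G"`).

```shell
# make_gpu_job NGPUS COMMAND [ARGS...]
# Prints a minimal Slurm batch script that requests NGPUS GPUs and runs
# COMMAND. GRES_NAME is an assumption: the GRES name is set by the site
# configuration, "gpu" being the conventional one.
make_gpu_job() {
    ngpus="$1"; shift
    cat <<EOF
#!/bin/bash
#SBATCH --job-name=gpu-job
#SBATCH --gres=${GRES_NAME:-gpu}:${ngpus}
#SBATCH --output=slurm-%j.out
srun $*
EOF
}
```

The generated script can be piped straight to the scheduler with `make_gpu_job 1 ./exe | sbatch`, since `sbatch` reads the script from standard input when no file is given.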
