...
Variables can be exported to make it easier to copy and paste commands in the next sections of this guide. Alternatively, source an arbitrary file that sets them, for example: . /root/facilitate or source /root/facilitate.
Code Block
export MGMT_SERVER_IP=173.31.23.23
export SLURM_CONF_DIR=/opt/slurm/etc
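As a minimal sketch of the file-based option, the two variables could be written once to a file and sourced in each new shell. The helper name write_env_file is hypothetical and exists only for illustration; the values and the /root/facilitate path are the ones shown above.

```shell
# Hypothetical helper: write the guide's variables to a file that can be sourced later.
# $1 is the path of the file to create (for example /root/facilitate).
write_env_file() {
    cat > "$1" <<'EOF'
export MGMT_SERVER_IP=173.31.23.23
export SLURM_CONF_DIR=/opt/slurm/etc
EOF
}

# Example usage (uncomment to run; writing under /root needs root privileges):
# write_env_file /root/facilitate
# . /root/facilitate
```

Sourcing the file rather than executing it matters here: the exports must take effect in the current shell, not in a child process.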
Compute Environment
An AMI will be needed. The steps below give an example of how an AMI could be created in the AWS Console, leveraging the existing Slurm cluster with its resources and configuration. Alternatively, if the compute resources in the Slurm cluster are based on an existing AMI, that AMI can be used instead.
...
To create an AMI from a Slurm compute node:
Allocate a compute node:
Code Block salloc -N 1 -J ami-creation --no-shell --exclusive --nodelist=<NODENAME>
When the job is allocated, gather some information on the node running the job:
The salloc command above should have output a JOB_ID. The squeue command should show the JOB_ID running on a particular node. Issue the following command to capture information about that node:
Code Block scontrol show node <NODENAME>
Look for the NodeAddr= field in the output to find the Private IPv4 Address of the node running the ami-creation job.
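The lookup above can also be scripted. The sketch below assumes the usual scontrol layout of space-separated KEY=VALUE fields; the helper name node_addr_from_scontrol is hypothetical, and NODENAME is a placeholder for the node reported by squeue.

```shell
# Hypothetical helper: read `scontrol show node` output on stdin and print
# the value of the first NodeAddr= field.
node_addr_from_scontrol() {
    # Put each KEY=VALUE field on its own line, then strip the NodeAddr= prefix.
    tr ' ' '\n' | sed -n 's/^NodeAddr=//p' | head -n 1
}

# Example usage on the cluster (uncomment to run):
# squeue --name=ami-creation --noheader --format='%i %N'   # job ID and node name
# NODE_ADDR=$(scontrol show node "$NODENAME" | node_addr_from_scontrol)
# echo "$NODE_ADDR"
```

Capturing the address in a variable such as NODE_ADDR (a name chosen here for illustration) makes it easy to paste into the EC2 search box in the next step.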
Navigate to the EC2 Instances page in the AWS Console and search for the Private IPv4 Address.
Select that instance and click “Create an image” from the Actions button in the upper right corner.
AMI creation takes a variable amount of time to complete: it may take 15 minutes or longer in some regions at certain times of day. The information the AWS Console requests is fairly intuitive: name the AMI in a memorable way and generally accept the default values provided by the prompts.
NOTE: This action will reboot the node and kill the job, which is expected.
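For readers who prefer the command line, the console steps above could be approximated with the AWS CLI. This is a sketch under assumptions, not part of the original procedure: it assumes the AWS CLI is installed and configured with credentials and a default region, and the function name create_ami_from_private_ip is hypothetical. As in the console, create-image reboots the instance by default.

```shell
# Hypothetical helper: find the instance that owns a private IPv4 address
# and request an AMI from it. Prints the new AMI ID on success.
# $1 = private IPv4 address (the NodeAddr value), $2 = name for the AMI.
create_ami_from_private_ip() {
    ami_private_ip="$1"
    ami_name="$2"

    # Look up the instance ID by its private IP address.
    ami_instance_id=$(aws ec2 describe-instances \
        --filters "Name=private-ip-address,Values=${ami_private_ip}" \
        --query 'Reservations[].Instances[].InstanceId' \
        --output text) || return 1

    if [ -z "$ami_instance_id" ]; then
        echo "no instance found for ${ami_private_ip}" >&2
        return 1
    fi

    # Request the image; the instance is rebooted unless --no-reboot is passed.
    aws ec2 create-image \
        --instance-id "$ami_instance_id" \
        --name "$ami_name" \
        --query 'ImageId' \
        --output text
}

# Example usage (uncomment to run; the AMI name is only an illustration):
# AMI_ID=$(create_ami_from_private_ip "$NODE_ADDR" "slurm-compute-ami")
# aws ec2 wait image-available --image-ids "$AMI_ID"
```

The wait command blocks until the image reaches the available state, which covers the variable creation time mentioned above.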
...