
(v2.2.0.2) Checking Slurm Environments

To ensure optimal setup of Infrastructure Optimizer, please make a note of the following information that will be used during installation and integration:

Slurm Cluster Reference

Slurm Installation

  • SLURM_CONF_DIR : directory where slurm.conf is located

  • SLURM_BIN_DIR : directory where Slurm's binaries are located, usually on users' PATH
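
One way to collect these two values on a node where Slurm is already installed is sketched below. It assumes that scontrol is on the PATH and that scontrol show config reports a SLURM_CONF line, as it does on typical installations. The function names are hypothetical helpers for illustration and are not part of Infrastructure Optimizer.

```shell
# Sketch: derive SLURM_CONF_DIR and SLURM_BIN_DIR on a node with Slurm installed.
# Both function names are hypothetical helpers, not part of any product.

# Read `scontrol show config` output on stdin and print the directory
# that contains slurm.conf (taken from the "SLURM_CONF = ..." line).
slurm_conf_dir_from_config() {
    local conf
    conf=$(sed -n 's/^SLURM_CONF[[:space:]]*=[[:space:]]*//p')
    [ -n "$conf" ] && dirname "$conf"
}

# Print the directory that holds a given Slurm binary (default: sinfo).
slurm_bin_dir() {
    local bin
    bin=$(command -v "${1:-sinfo}") || return 1
    dirname "$bin"
}

# Usage on a cluster node:
#   SLURM_CONF_DIR=$(scontrol show config | slurm_conf_dir_from_config)
#   SLURM_BIN_DIR=$(slurm_bin_dir sinfo)
```

If the two helpers print nothing, Slurm's tools are probably not on the current PATH, and the directories need to be located manually.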

Compute Environment

  • AMI of workload-capable compute node

    • Allocate a compute node: salloc -N 1 -J ami-creation --no-shell --exclusive --nodelist=<NodeName>

    • When the job is allocated, gather some information on the node running the job:

      • The salloc command above should have output a JOB_ID.

      • The squeue command should show the JOB_ID running on a particular node.

      • Issue the following command to capture information about that node:

        • scontrol show node <NODENAME>

        • Look for the NodeAddr= field in the output to find the Private IPv4 Address of the node running the ami-creation job.

      • Navigate to the EC2 Instances page in the AWS Console and search for the Private IPv4 Address.

      • Select that instance and click “Create an image” from the Actions button in the upper right corner.

      • AMI creation takes a variable amount of time: it may take 15 minutes or longer in some regions at certain times of day. The prompts in the AWS Console are fairly intuitive: give the AMI a memorable name and generally accept the default values provided.

      • NOTE : creating the image will reboot the node and kill the ami-creation job, which is expected.
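
The node lookup in the steps above can be sketched as a small shell helper that pulls the NodeAddr value out of the scontrol show node output. The function name is a hypothetical helper. The AWS CLI lookup shown in the usage comments is an optional alternative to searching the console by hand; it assumes the AWS CLI is installed and configured for the cluster's account and region, which the procedure above does not require.

```shell
# Sketch: extract the private IPv4 address of the allocated node.
# node_addr_from_scontrol is a hypothetical helper name.

# Read `scontrol show node <NODENAME>` output on stdin and print the
# value of the NodeAddr= field.
node_addr_from_scontrol() {
    sed -n 's/.*NodeAddr=\([^ ]*\).*/\1/p'
}

# Usage, following the steps above:
#   salloc -N 1 -J ami-creation --no-shell --exclusive --nodelist=<NodeName>
#   squeue --name=ami-creation
#   ip=$(scontrol show node <NODENAME> | node_addr_from_scontrol)
#
# Optional (assumes a configured AWS CLI): find the matching instance ID
# instead of searching the EC2 Instances page manually.
#   aws ec2 describe-instances \
#       --filters "Name=private-ip-address,Values=$ip" \
#       --query 'Reservations[].Instances[].InstanceId' --output text
```

The printed address is the value to search for on the EC2 Instances page before creating the image.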