...

  1. Navigate to the exostellar directory where the configuration assets reside:

    1. Code Block
      cd ${SLURM_CONF_DIR}/exostellar
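    2. The cd above assumes SLURM_CONF_DIR is already set in the shell; if it is not, it can be derived from the running Slurm configuration. A small sketch (assumes scontrol is on the PATH):

    3. Code Block
      # Derive the directory holding slurm.conf from scontrol's output.
      SLURM_CONF_DIR=$(dirname "$(scontrol show config | awk '/^SLURM_CONF / {print $3}')")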
  2. Set up a timestamped folder in case there’s a need to roll back:

    1. Code Block
      PREVIOUS_DIR=$( date +%Y-%m-%d_%H-%M-%S )
      mkdir ${PREVIOUS_DIR}
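    2. Once the assets have been moved into it (next step), a rollback amounts to restoring that copy; a minimal sketch, assuming nothing else has changed since. Any JSON already pushed to the MGMT_SERVER would also need to be re-pushed from the restored files:

    3. Code Block
      # Restore the previously saved assets from the timestamped directory.
      cp -a ${PREVIOUS_DIR}/. ${SLURM_CONF_DIR}/exostellar/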
  3. Move the contents of the exostellar directory into the timestamped directory (mv will emit a benign warning that it cannot move the timestamped directory into itself, which can be ignored):

    1. Code Block
      mv * ${PREVIOUS_DIR}
  4. Make a new json folder, copy in the previous env0.json and profile0.json, and rename them for the new revision:

    1. Code Block
      mkdir json
      cp ${PREVIOUS_DIR}/json/env0.json ${PREVIOUS_DIR}/json/profile0.json json/
      cd json
      mv env0.json env1.json
      mv profile0.json profile1.json
  5. Edit env1.json as needed, e.g.:

    1. Add more pools if you need more CPU-core or memory options available in the partition.

    2. Increase the node count in pools.

    3. See Environment Configuration Information for reference.
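
    4. As an illustration only, a pool's node count could be bumped in place with jq; the key names below (pools, NodeCount) are hypothetical, so consult the Environment Configuration Information reference for the real schema:

    5. Code Block
      # Hypothetical field names -- check the actual env1.json structure first.
      jq '.pools[0].NodeCount = 20' env1.json > env1.json.tmp && mv env1.json.tmp env1.json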

  6. profile1.json likely does not need any modification.

    1. See Profile Configuration Information for reference.

  7. Validate the JSON asset with jq:

    1. Code Block
      jq . env1.json
  8. If jq can parse the file, it prints well-formatted JSON and the file contains no syntax errors; if jq prints an error message instead, the JSON is not valid.
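
    1. Since jq exits non-zero when it cannot parse its input, the check can also be scripted; a small convenience sketch:

    2. Code Block
      # jq's exit status signals whether the file parsed cleanly.
      if jq . env1.json > /dev/null; then
        echo "env1.json is valid JSON"
      else
        echo "env1.json failed validation; fix it before pushing" >&2
      fi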

  9. When the JSON is valid, the file can be pushed to the MGMT_SERVER:

    1. Code Block
      curl -d "@env1.json" -H 'Content-Type: application/json' -X PUT http://${MGMT_SERVER_IP}:5000/v1/env
  10. If the profile was changed, validate it with the same quick jq test:

    1. Code Block
      jq . profile1.json
  11. Push the changes live:

    1. Code Block
      curl -d "@profile1.json" -H 'Content-Type: application/json' -X PUT http://${MGMT_SERVER_IP}:5000/v1/profile
  12. Grab the assets from the MGMT_SERVER:

    1. Code Block
      curl -X GET http://${MGMT_SERVER_IP}:5000/v1/xcompute/download/slurm -o slurm.tgz
    2. If the EnvName was changed (above in Edit the Slurm Environment JSON for Your Purposes - Step 2), the following command can be used with your CustomEnvironmentName (the URL is quoted because it contains a ?):

    3. Code Block
      curl -X GET "http://${MGMT_SERVER_IP}:5000/v1/xcompute/download/slurm?envName=CustomEnvironmentName" -o slurm.tgz
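    4. Before unpacking, the archive contents can be listed as a quick sanity check:

    5. Code Block
      tar tf slurm.tgz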
  13. Unpack them into the exostellar folder:

    1. Code Block
      tar xf slurm.tgz -C ../
      cd ..
      mv assets/* .
      rmdir assets
  14. Edit resume_xspot.sh and add the sed command snippet ( | sed "s/XSPOT_NODENAME/$host/g" ) for every pool; note that in the example below the user_data path also gains the exostellar directory component. A one-pass alternative is sketched after the example:

    • Code Block
      user_data=$(cat /opt/slurm/etc/xvm16-_user_data | base64 -w 0)
    • becomes:

    • Code Block
      user_data=$(cat /opt/slurm/etc/exostellar/xvm16-_user_data | sed "s/XSPOT_NODENAME/$host/g" | base64 -w 0)
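    • If there are many pools, the sed snippet can be inserted on every user_data line in one pass; a sketch that assumes each original line ends in | base64 -w 0) exactly as shown, so review the result and run it only once (it does not adjust the path component):

    • Code Block
      # Insert the sed snippet ahead of the base64 stage on every user_data line.
      # '#' is the outer delimiter because the inserted text contains '/'.
      sed -i 's#| base64 -w 0)#| sed "s/XSPOT_NODENAME/$host/g" | base64 -w 0)#' resume_xspot.sh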
  15. Introducing new nodes into a Slurm cluster requires a restart of the Slurm control daemon:

    1. Code Block
      systemctl restart slurmctld
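    2. After the restart, confirm that the new nodes and the new partition are visible to Slurm:

    3. Code Block
      sinfo
      scontrol show partition NewPartitionName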
  16. The integration steps are complete; a job submission to the new partition is the final validation:

    1. As a user, navigate to a valid job submission directory and launch a job as normal, but be sure to specify the new partition:

      1. Code Block
        sbatch -p NewPartitionName < job-script.sh
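      2. A minimal job script along these lines can serve as a quick end-to-end check (this job-script.sh is a hypothetical example):

      3. Code Block
        #!/bin/bash
        #SBATCH -n 1
        # Print the execution host; it should be one of the newly added nodes.
        hostname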

...