...

  1. Copy default-lsf-env.json to something convenient like env0.json.

    1. Code Block
      cp default-lsf-env.json env0.json
  2. Note: The line numbers listed below reference the example file above. Once changes are made on the system, the line numbers may shift.

  3. Line 2 : "EnvName" is set to lsf by default, but you can specify something unique if needed.

  4. Lines 5-17 can be modified for a single pool of identical compute resources, or they can be duplicated and then modified for each "hardware" configuration or "pool" you choose. When duplicating, be sure to add a comma after the closing brace on line 17 of each pool declaration except the last one.

    1. PoolName: This determines the apparent hostnames of the compute resources provided to LSF.

      1. It is recommended that all pools share a common trunk or base in each PoolName.

    2. Priority: By default, LSF treats all pools as having equal priority and breaks ties alphabetically by pool name. It can be beneficial to give smaller nodes a lower priority, for example:

      1. 2-core nodes : Priority=10

      2. 4-core nodes : Priority=100

      3. 8-core nodes : Priority=1000

      4. This way, jobs are scheduled on the smallest node that fulfills the resource requirements of the job.

    3. PoolSize: This is the maximum number of these compute resources.

    4. ProfileName: This is the default profile name, az1. If this is changed, you will need to carry the change forward in subsequent steps.

    5. CPUs: This is the targeted CPU-core limit for this "hardware" configuration or pool.

    6. ImageName: This is tied to the AMI that will be used for your compute resources. This name will be used in subsequent steps.

    7. MaxMemory: This is the targeted memory limit for this "hardware" configuration or pool.

    8. MinMemory: reserved for future use; can be ignored currently.

    9. UserData: This string is a base64-encoded version of the user_data script.

      1. To generate it:

        1. cat user_data.sh | base64 -w 0

      2. To decode it:

        1. echo "<LongBase64EncodedString>" | base64 -d

      3. It’s not required to be perfectly fine-tuned at this stage; it will be refined and corrected later.

      4. You may format user_data.sh in the usual ways:

        1. Code Block
          #cloud-config
          
          runcmd:
            - [sh, -c, "set -x"]
            - [sh, -c, "hostname $( echo ip-$(hostname -I |sed 's/\./-/g' |sed 's/ //g' ) )"]
            - [sh, -c, "echo root:AAAAAA |chpasswd"]
            - [sh, -c, "sed -i.orig '3d' /etc/hosts"]
            - [sh, -c, "echo >> /etc/hosts"]
            - [sh, -c, "echo -e \"$( hostname -I )\t\t\t$( hostname )\" >> /etc/hosts"]
            - [sh, -c, "sed -i 's/awshost/xiohost/g' /opt/lsf/conf/lsf.conf"]
            # Each runcmd entry runs in its own shell, so profile.lsf must be
            # sourced in the same command that starts the LSF daemons.
            - [sh, -c, "source /opt/lsf/conf/profile.lsf && lsadmin limstartup && lsadmin resstartup && badmin hstartup"]

          or

        2. Code Block
          #!/bin/bash
          
          set -x
          
          #hostname XSPOT_NODENAME
          hostname $( echo ip-$(hostname -I |sed 's/\./-/g' |sed 's/ //g' ) )
          
          echo root:AAAAAA |chpasswd
          
          #delete an /etc/hosts entry
          sed -i.orig '3d' /etc/hosts
          
          #add an /etc/hosts self-identifying entry
          echo >> /etc/hosts
          echo -e "$( hostname -I )\t\t\t$( hostname )" >> /etc/hosts
          
          #change resource designation
          sed -i 's/awshost/xiohost/g' /opt/lsf/conf/lsf.conf
          . /opt/lsf/conf/profile.lsf
          lsadmin limstartup
          lsadmin resstartup
          badmin hstartup
    10. VolumeSize: reserved for future use; can be ignored currently.

  5. All other fields/lines in this asset can be ignored.
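
Pulling the per-pool fields above together, a hypothetical env0.json fragment with two pools might look like the following. All values, the pool names, and the surrounding structure (e.g. the array the pool objects live in) are illustrative; the exact layout in default-lsf-env.json may differ. Note the comma after the closing brace of the first pool but not after the last:

```json
{
  "EnvName": "lsf",
  "Pools": [
    {
      "PoolName": "worker-small",
      "Priority": 10,
      "PoolSize": 8,
      "ProfileName": "az1",
      "CPUs": 2,
      "ImageName": "lsf-compute",
      "MaxMemory": 4096,
      "MinMemory": 0,
      "UserData": "<LongBase64EncodedString>",
      "VolumeSize": 0
    },
    {
      "PoolName": "worker-large",
      "Priority": 1000,
      "PoolSize": 4,
      "ProfileName": "az1",
      "CPUs": 8,
      "ImageName": "lsf-compute",
      "MaxMemory": 16384,
      "MinMemory": 0,
      "UserData": "<LongBase64EncodedString>",
      "VolumeSize": 0
    }
  ]
}
```

Both pools share the "worker-" trunk in their PoolName, as recommended above.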

...

If the EnvName was changed (in Edit the Slurm LSF Environment JSON for Your Purposes, Step 2 above), then the following command can be used with your CustomEnvironmentName:

...

  1. The AMI-ID should be based on an LSF compute node from your cluster, capable of running your workloads.

  2. The AMI should be created by this account.

  3. The AMI should not have product codes.

  4. The Image Name was specified in the environment set up previously and will be used in this command.

  5. Additionally, we can pass -s script.sh if troubleshooting is required.
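
The ownership and product-code requirements above can be inspected with `aws ec2 describe-images --image-ids <AMI-ID> --output json`. A sketch of the check follows, run here against a saved sample response (the AMI-ID and account ID are hypothetical) so the parsing is visible end to end:

```shell
# On a live system, capture the real response instead:
#   aws ec2 describe-images --image-ids <AMI-ID> --output json > /tmp/image.json
cat > /tmp/image.json <<'EOF'
{"Images":[{"ImageId":"ami-0123456789abcdef0","OwnerId":"111122223333","ProductCodes":[]}]}
EOF

# The AMI must carry no product codes...
if grep -q '"ProductCodes":\[\]' /tmp/image.json; then
    echo "OK: no product codes"
else
    echo "WARNING: AMI has product codes" >&2
fi

# ...and its OwnerId should match your account ID.
grep -o '"OwnerId":"[0-9]*"' /tmp/image.json
```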

Validation of Migratable VM Joined to Your LSF Cluster

The script test_createVm.sh provides a quick validation that new compute resources can successfully connect to and register with the scheduler.
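
As a manual cross-check after running test_createVm.sh, the new host should appear with STATUS `ok` in `bhosts` output once it has registered. A sketch of that check, shown against a captured sample (the hostname `pool-worker-0` is hypothetical; on a live cluster, source profile.lsf and pipe `bhosts` output directly instead of using the sample):

```shell
# Sample output; on a live cluster use:
#   . /opt/lsf/conf/profile.lsf && bhosts pool-worker-0
bhosts_out='HOST_NAME          STATUS       JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV
pool-worker-0      ok              -      4      0      0      0      0      0'

# A host has registered with the scheduler once its STATUS column reads "ok";
# awk skips the header row and exits nonzero if no host is in that state.
echo "$bhosts_out" | awk 'NR > 1 && $2 == "ok" { print $1, "registered"; found = 1 }
                          END { exit !found }'
```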

...