
The following manual steps for EAR will soon be replaced with a simplified workflow for command-line users; alternatively, the Management Console (Web UI) will be able to replace most of these steps as well.

Connect to Your LSF Head Node

To integrate Infrastructure Optimizer with your LSF cluster, we need to update the configuration and add assets that perform the integration tasks. Completing these steps requires root or sudo access on the LSF Master node.

  1. Get a shell on the head node and navigate to the Resource Connector configuration directory (RC_CONF_DIR).

    1. Code Block
      cd $LSF_TOP/conf/resource_connector
  2. Make subdirectories here:

    1. Code Block
      mkdir -p exostellar/json exostellar/conf exostellar/scripts
      cd exostellar/json

Pull Down the Default LSF Environment Assets as a JSON Payload:

  1. The jq and curl packages are required:

    1. Code Block
      yum install -y jq curl
    2. CentOS EoL: Because Red Hat has ended support for CentOS 7, you may need to ensure yum can still function. The following command is an example mitigation for an internet-dependent yum repository:

    3. Code Block
      sed -i -e 's|^mirrorlist=|#mirrorlist=|g' -e 's|^#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-*.repo
      yum clean all
  2. Pull down the default assets from the MGMT_SERVER for customization:

    1. Code Block
      curl -X GET http://${MGMT_SERVER_IP}:5000/v1/env/lsf | jq > default-lsf-env.json
  3. The asset will look like:

Code Block: default-lsf-env.json
{
    "Description": "LSF environment",
    "EnvName": "lsf",
    "HeadAddress": "<HeadAddress>",
    "Pools": [
        {
            "PoolName": "xvm2-",
            "Priority": "10",
            "PoolSize": 10,
            "ProfileName": "az1",
            "VM": {
                "CPUs": 2,
                "MaxMemory": 6000,
                "ImageName": "Compute",
                "UserData": "IyEvYmluL2Jhc2gKCnNldCAteAoKI2hvc3RuYW1lIFhTUE9UX05PREVOQU1FCmhvc3RuYW1lICQoIGVjaG8gaXAtJCAoaG9zdG5hbWUgLUkgfHNlZCAncy9cLi8tL2cnIHxzZWQgJ3MvIC8vZycgKSApCgplY2hvIHJvb3Q6QUFBQUFBIHxjaHBhc3N3ZAoKI2RlbGV0ZSBhbiAvZXRjL2hvc3RzIGVudHJ5CnNlZCAtaS5vcmlnICczZCcgL2V0Yy9ob3N0cwoKI2FkZCBhbiAvZXRjL2hvc3RzIHNlbGYtaWRlbnRpZnlpbmcgZW50cnkKZWNobyA+PiAvZXRjL2hvc3RzCmVjaG8gLWUgIiQoIGhvc3RuYW1lIC1JIClcdFx0XHQkKCBob3N0bmFtZSApIiA+PiAvZXRjL2hvc3RzCgojY2hhbmdlIHJlc291cmNlIGRlc2lnbmF0aW9uCnNlZCAtaSAncy9hd3Nob3N0L3hpb2hvc3QvZycgL29wdC9sc2YvY29uZi9sc2YuY29uZgouIC9vcHQvbHNmL2NvbmYvcHJvZmlsZS5sc2YKbHNhZG1pbiBsaW1zdGFydHVwCmxzYWRtaW4gcmVzc3RhcnR1cApiYWRtaW4gaHN0YXJ0dXAK"
            }
        }
    ],
    "Type": "lsf",
    "Id": "cbbbaa7a-7ffa-4c1a-a6fc-701d18c63e7b"
}

Edit the LSF Environment JSON for Your Purposes:

  1. Copy default-lsf-env.json to something convenient like env0.json.

    1. Code Block
      cp default-lsf-env.json env0.json
  2. Note: Line numbers listed below reference the example file above. Once changes start being made on the system, the line numbers may change.

  3. Line 2 : "EnvName" is set to lsf by default, but you can specify something unique if needed.

    1. NOTE: Currently, - characters are not supported in values for EnvName.

  4. Lines 5-17 can be modified for a single pool of identical compute resources, or they can be duplicated and then modified for each “hardware” configuration or “pool” you choose (see the multi-pool sketch after this list). When duplicating, be sure to add a comma after the closing brace on line 17, except after the final pool declaration.

    1. PoolName: This will be the apparent hostnames of the compute resources provided for LSF.

      1. It is recommended that all pools share a common trunk or base in each PoolName.

    2. Priority: When equal, LSF will treat all pools as having equal priority and then make scheduling decisions based on alphabetical naming. It may be beneficial to set smaller nodes with a higher priority so that jobs are placed on the smallest fitting node, something like:

      1. 2-core nodes : Priority=1000

      2. 4-core nodes : Priority=100

      3. 8-core nodes : Priority=10

      4. This ensures jobs are scheduled on the smallest node that fulfills their resource requirements.

    3. PoolSize: This is the maximum number of these compute resources.

    4. ProfileName: This is the default profile name, az1. If this is changed, you will need to carry the change forward.

    5. CPUs: This is the targeted CPU-core limit for this "hardware" configuration or pool.

    6. ImageName: This is tied to the AMI that will be used for your compute resources. This name will be used in subsequent steps.

    7. MaxMemory: This is the targeted memory limit for this "hardware" configuration or pool.

    8. MinMemory: reserved for future use; can be ignored currently.

    9. UserData: This string is a base64-encoded version of the user_data script.

      1. To generate it:

        1. cat user_data.sh | base64 -w 0

      2. To decode it:

        1. echo "<LongBase64EncodedString>" | base64 -d

      3. It’s not required to be perfectly fine-tuned at this stage; it will be refined and corrected later.

      4. Example user_data.sh:

        1. Code Block
          #!/bin/bash
          
          set -x
          
          IP=$( hostname -I |awk '{print $1}' )
          NEW_HOSTNAME=ip-$( echo ${IP} |sed 's/\./-/g' )
          hostname ${NEW_HOSTNAME}
          
          echo >> /etc/hosts
          echo -e "${IP}\t\t${NEW_HOSTNAME}" >> /etc/hosts
          
          . /opt/lsf/conf/profile.lsf
          
          lsadmin limstartup
          lsadmin resstartup
          badmin hstartup
    10. VolumeSize: reserved for future use; can be ignored currently.

  5. All other fields/lines in this asset can be ignored currently.
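
For reference, below is a minimal sketch of a Pools array with two pools, as mentioned in item 4 above. The pool names, sizes, CPU and memory values are hypothetical, and <Base64EncodedUserData> is a placeholder for your encoded user_data; note the comma after the first pool's closing brace and none after the last:

Code Block
    "Pools": [
        {
            "PoolName": "xvm-small-",
            "Priority": "1000",
            "PoolSize": 10,
            "ProfileName": "az1",
            "VM": {
                "CPUs": 2,
                "MaxMemory": 6000,
                "ImageName": "Compute",
                "UserData": "<Base64EncodedUserData>"
            }
        },
        {
            "PoolName": "xvm-large-",
            "Priority": "10",
            "PoolSize": 5,
            "ProfileName": "az1",
            "VM": {
                "CPUs": 8,
                "MaxMemory": 24000,
                "ImageName": "Compute",
                "UserData": "<Base64EncodedUserData>"
            }
        }
    ],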

Validate and Push the Customized Environment to the MGMT_SERVER

  1. Validate the JSON asset with jq:

    1. Code Block
      jq . env0.json
    2. If jq can read the file, it will print well-formatted JSON, indicating no errors. If you see an error message, the JSON is not valid.

  2. When the JSON is valid, the file can be pushed to the MGMT_SERVER:

    1. Code Block
      curl -d "@env0.json" -H 'Content-Type: application/json' -X PUT http://${MGMT_SERVER_IP}:5000/v1/env

Pull Down the Default Profile Assets as a JSON Payload:

  1. The default is named az1.

    1. Code Block
      curl -X GET http://${MGMT_SERVER_IP}:5000/v1/profile/az1 |jq > default-profile.json
  2. Copy it to facilitate customization, leaving the default for future reference.

    1. Code Block
      cp default-profile.json profile0.json
  3. The asset will look like this:

  1. Code Block: default-profile.json
    {
      "AvailabilityZone": "us-east-2c",
      "Controller": {
        "IdentityRole": "arn:aws:iam::270000099005:instance-profile/io-apc05-ExostellarInstanceProfile-CmxXXD1CZAId",
        "InstanceTags": [
          {
            "Key": "exostellar.xspot-role",
            "Value": "xspot-controller"
          }
        ],
        "InstanceType": "c5d.xlarge",
        "SecurityGroupIds": [
          "sg-016bd6ead636fa5bb"
        ],
        "SubnetId": "subnet-02d2d57c0673d6a5a",
        "VolumeSize": 100,
        "ImageId": "ami-0d4c57b22746fe832"
      },
      "LogPath": "/xcompute/logs",
      "MaxControllers": 10,
      "ProfileName": "az1",
      "Region": "us-east-2",
      "Worker": {
        "IdentityRole": "arn:aws:iam::270000099005:instance-profile/io-apc05-ExostellarInstanceProfile-CmxXXD1CZAId",
        "InstanceTags": [
          {
            "Key": "exostellar.xspot-role",
            "Value": "xspot-worker"
          }
        ],
       "InstanceTypes": [
          "m5:0",
          "m6i:1"
        ],
        "SecurityGroupIds": [
          "sg-016bd6ead636fa5bb"
        ],
        "SpotFleetTypes": [
          "m5:1",
          "m5d:0",
          "m5n:0",
          "m6i:2"
        ],
        "SubnetId": "subnet-02d2d57c0673d6a5a",
        "ImageId": "ami-09559faf3fc003160"
      },
      "Xspot": {
        "EnableHyperthreading": true,
        "EnableBalloon": true
      },
      "XspotVersion": "xspot-2.2.0.1",
      "Id": "eb78d0e0-24a9-4dfd-a5ae-d7abcb9edcbd",
      "NodeGroupName": "wlby3xy1",
      "Status": "idle"
    }

Edit the Profile JSON for Your Purposes:

  1. Tagging instances created by the backend is controlled by two sections, depending on the function of the asset:

    1. Controllers are On-Demand instances that manage other instances. By default, they are tagged as seen on lines 6-9, above, and 1-4 below.

      1. Code Block
              {
                "Key": "exostellar.xspot-role",
                "Value": "xspot-controller"
              }
      2. To add additional tags, duplicate lines 1-4 as 5-8 below (as many times as you need), noting that an additional comma is added on line 4.

      3. Code Block
              {
                "Key": "exostellar.xspot-role",
                "Value": "xspot-controller'
              },
              {
                "Key": "MyCustomKey",
                "Value": "MyCustomValue"
              }
      4. Don’t forget the comma between tags.

    2. Workers will be created by Controllers as needed and they can be On-Demand/Reserved instances or EC2 Spot. By default, they are tagged as seen on lines 26-30, above, and 1-4 below:

      1. Code Block
              {
                "Key": "exostellar.xspot-role",
                "Value": "xspot-worker"
              }
      2. Add as many tags as needed.

      3. Code Block
              {
                "Key": "exostellar.xspot-role",
                "Value": "xspot-worker"
              },
              {
                "Key": "MyCustomKey",
                "Value": "MyCustomValue"
              }
      4. Don’t forget the comma between tags.

  2. Note: Line numbers listed below reference the example file above. Once changes start being made on the system, the line numbers may change.

  3. Line 11 - InstanceType: Controllers do not generally require large instances.

    1. In terms of performance, these On-Demand Instances can be set as c5.xlarge or m5.xlarge with no adverse effect.

  4. Line 20 - MaxControllers : This will define an upper bound for your configuration.

    1. Each Controller will manage up to 80 workers.

    2. The default upper bound is therefore 800 nodes joining your production cluster (10 Controllers × 80 workers): notice "MaxControllers": 10 on line 20.

    3. If you plan to autoscale past 800 nodes joining your production cluster, MaxControllers should be increased.

    4. If you want to lower that upper bound, MaxControllers should be decreased.

  5. Line 21 - ProfileName: This is used for your logical tracking, in the event you configure multiple profiles.

  6. Lines 31-34 - InstanceTypes: Here in the Worker section, this refers to On-Demand instances: if there is no EC2 Spot availability, these are the instance types your jobs will run on.

  7. Lines 38-43 - SpotFleetTypes: Here in the Worker section, this refers to EC2 Spot instance types; because of the discounts, you may be comfortable with a much broader range of instance types (see the sketch after this list).

    1. More types and families here means more opportunities for cost optimization.

    2. Priorities can be managed by appending a : and an integer, e.g. m5:1 is a higher priority than c5:0.

  8. Line 48 - EnableHyperthreading: Set true for hyperthreaded cores and false to disable hyperthreading.

  9. Line 49 - EnableBalloon: This should normally be set to true; it increases migration efficiency. Setting it to false may be useful in a troubleshooting scenario, but is not recommended under normal circumstances.

  10. Line 52 - NodeGroupName: This string appears in Controller Name tagging: <profile>-<NodeGroupName>-<count>.

  11. All other fields/lines in this asset can be ignored.
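
For illustration, a sketch of an edited Worker instance-type selection is shown below; the families and priorities are hypothetical and should be replaced with types suited to your workloads (a higher trailing integer is a higher priority):

Code Block
        "InstanceTypes": [
            "m5:1",
            "m6i:0"
        ],
        "SpotFleetTypes": [
            "m5:2",
            "m5d:1",
            "m6i:1",
            "r5:0"
        ],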

Validate and Push the Customized Profile to the MGMT_SERVER

  1. Validate the profile with the quick jq test.

    1. Code Block
      jq . profile0.json
  2. Push the changes live.

    1. Code Block
      curl -d "@profile0.json" -H 'Content-Type: application/json' -X PUT http://${MGMT_SERVER_IP}:5000/v1/profile

Download Scheduler Assets from the Management Server

Code Block
curl -X GET http://${MGMT_SERVER_IP}:5000/v1/xcompute/download/lsf -o lsf.tgz

Extract the archive into the exostellar directory, move the assets into place, and give lsfadmin ownership:

Code Block
tar xf lsf.tgz -C ../
cd ..
mv assets/* .
rmdir assets
chown lsfadmin.root -R ../exostellar

Ensure lsb.modules is prepared for Resource Connector

  1. The schmod_demand plugin must be enabled in ${LSF_TOP}/conf/lsbatch/<cluster-name>/configdir/lsb.modules.

Code Block
Begin PluginModule
SCH_PLUGIN                      RB_PLUGIN                    SCH_DISABLE_PHASES
...
schmod_demand                   ()                              ()
...
End PluginModule

Add New Queue and Resource Definitions to LSF for Resource Connector

  1. Add a new queue to lsb.queues:

    1. Code Block
      cd ${LSF_TOP}/conf/lsbatch/<ClusterName>/configdir
      vi lsb.queues
  2. Insert a Queue Declaration such as the example below:

Code Block: Queue Declaration
Begin Queue
QUEUE_NAME              = xio
PRIORITY                = 90
NICE                    = 20
FAIRSHARE               = USER_SHARES[[default,10]]
RC_DEMAND_POLICY        = THRESHOLD[ [1, 1] [10, 60] [100] ]
RC_HOSTS                = xiohost
RES_REQ                 = xiohost
RC_ACCOUNT              = xio
DESCRIPTION             = xspot
NEW_JOB_SCHED_DELAY     = 0
REQUEUE_EXIT_VALUES     = 99
End Queue
    1. NOTE: The xiohost strings are required in the queue declaration; this is not modifiable.

  3. Add resources to the Resource Definitions in ${LSF_TOP}/conf/lsf.shared:

    1. Code Block
      Begin Resource
      RESOURCENAME  TYPE    INTERVAL INCREASING  DESCRIPTION        # Keywords
      ...
         xiohost    Boolean    ()       ()       (instances from Infrastructure Optimizer)
         rc_account String     ()       ()       (account name for the external hosts)
         templateID   String   ()       ()       (template ID for the external hosts)
      ...
      End Resource
    2. Resource Connector leverages xiohost, rc_account, and templateID when provisioning compute capacity.

    3. NOTE: The xiohost string is required in this file; this is not modifiable.

Add Required Lines to lsf.conf

Code Block
#exostellar
LSB_RC_EXTERNAL_HOST_FLAG=xiohost
LSF_LOCAL_RESOURCES="[resource xiohost] [type X86_64]"
LSF_DYNAMIC_HOST_WAIT_TIME=2
LSF_DYNAMIC_HOST_TIMEOUT=10m
ENABLE_DYNAMIC_HOSTS=Y
LSF_REG_FLOAT_HOSTS=Y
EBROKERD_HOST_CLEAN_DELAY=5
LSF_MQ_BROKER_HOSTS=head        #equivalent to LSF Master, in this example
LSB_RC_EXTERNAL_HOST_IDLE_TIME=2
EGO_DEFINE_NCPUS=threads

The values assigned for variables with TIME and DELAY may be tuned for the best timing scenario of your cluster and assets. The LSF Admin may opt for different timing than above.


You may find that adding these settings to lsf.conf facilitates reading output from the lshosts and bhosts commands. Note: this is optional and the changes may have unintended consequences in your environment.

Code Block
LSB_BHOSTS_FORMAT="HOST_NAME:47 status:13 max:-8 njobs:-8 run:-8 ssusp:-8 ususp:-8 comments"
LSF_LSHOSTS_FORMAT="HOST_NAME:47 res:13 nprocs:-8 ncores:-8 nthreads:-8 ncpus:-8 maxmem:-8:S maxswp:8:S server:7 type"

NOTE: When adding a new provider to an already functioning Resource Connector, the LSB_RC_EXTERNAL_HOST_FLAG variable takes whitespace-separated strings, so the LSB_RC_EXTERNAL_HOST_FLAG line in the lsf.conf block above might be changed as follows when the awshost provider is already configured:

LSB_RC_EXTERNAL_HOST_FLAG="awshost xiohost"

Update the Host Providers for Resource Connector

The ${LSF_TOP}/10.1/resource_connector and ${LSF_TOP}/conf/resource_connector folders appear very similar and can be a source of confusion. This tutorial has focused on a fully patched (fix pack 14) LSF 10.1 installation. Exostellar assets were downloaded into ${LSF_TOP}/10.1/resource_connector/exostellar.

Both locations, under 10.1 or under conf, may contain a hostProviders.json. To simplify and reduce confusion, you may want to consider linking them:

Code Block
ln -s ${LSF_TOP}/10.1/resource_connector/hostProviders.json ${LSF_TOP}/conf/resource_connector/hostProviders.json

The hostProviders.json needs an entry for any host providers Resource Connector will make available to LSF. An Exostellar-only example will look like this:

Code Block
{
    "providers":[
        {
            "name": "xio",
            "type": "xioProv",
            "confPath": "resource_connector/exostellar",
            "scriptPath": "resource_connector/exostellar"
        }
    ]
}

and an example for both AWS and Exostellar providers in Resource Connector:

Code Block
{
    "providers":[
        {
            "name": "aws",
            "type": "awsProv",
            "confPath": "resource_connector/aws",
            "scriptPath": "resource_connector/aws"
        },
        {
            "name": "xio",
            "type": "xioProv",
            "confPath": "resource_connector/exostellar",
            "scriptPath": "resource_connector/exostellar"        
        }
    ]
}

Python Dependencies

Install the required Python components:

Code Block
yum -y install python3 python3-requests

Restart Select Services on the LSF Master

Following the directions in the files modified above, reconfigure and restart the relevant services:

  1. Code Block
    su lsfadmin -c "lsadmin reconfig"
  2. Code Block
    su lsfadmin -c "badmin mbdrestart"
  3. Code Block
    su lsfadmin -c "badmin reconfig"
  4. Validate configuration changes with several additional commands:

    1. Verify your new queue exists:

      1. Code Block
        su lsfadmin -c "bqueues"
      2. You should see the queue you defined in the list of available queues.

    2. Verify Resource Connector’s xioProvider has templates available for running jobs:

      1. Code Block
        su lsfadmin -c "badmin rc view -c templates -p xio"
      2. Resource Connector cycles every 30 seconds by default, and this command may need to be rerun after 30 seconds.

      3. You should now see as many templates defined in the output as PoolNames were configured in earlier steps.

LSF and Resource Connector are now configured for use with Exostellar’s Infrastructure Optimizer.

Compute AMI Import

During Prepping the LSF Integration, an AMI for compute nodes was identified or created. This step will import that AMI into Infrastructure Optimizer. Ideally, this AMI is capable of booting quickly.

Note: If SELinux is not already disabled in the target AMI, it will need to be disabled. Step 5 below offers a script argument to make the change.

Code Block
./parse_helper.sh -a <AMI-ID> -i <IMAGE_NAME>
  1. The AMI-ID should be based on an LSF compute node from your cluster, capable of running your workloads.

  2. The AMI should be created by this account.

  3. The AMI should not have product codes.

  4. The Image Name was specified in the environment set up previously and will be used in this command (see the example invocation after this list).

  5. Additionally, we can pass -s script.sh if troubleshooting is required.
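
For illustration only, an invocation might look like the following; the AMI ID here is hypothetical, and Compute is the ImageName used in the environment JSON example above:

Code Block
./parse_helper.sh -a ami-0123456789abcdef0 -i Compute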

Validation of Migratable VM Joined to Your LSF Cluster

The script test_createVm.sh exists for a quick validation that new compute resources can successfully connect and register with the scheduler.

Code Block
./test_createVm.sh -h xvm0 -i <IMAGE_NAME> -u user_data.sh
  1. The hostname specified with -h xvm0 is arbitrary.

  2. The Image Name specified with -i <IMAGE_NAME> should correspond to the Image Name from the parse_helper.sh command and the environment setup earlier.

  3. The -u user_data.sh is available for any customization that may be required: temporarily changing a password to facilitate logging in, for example.

  4. The test_createVm.sh script will continuously output updates until the VM is created. When the VM is ready, the script will exit and you’ll see all the fields in the output are now filled with values:

    1. Code Block
      Waiting for xvm0... (4)
      NodeName: xvm0
      Controller: az1-qeuiptjx-1
      Controller IP: 172.31.57.160
      Vm IP: 172.31.48.108
  5. With the Vm IP above, ssh to the node and inspect it. This step is meant to provide a migratable VM so that sanity checking may occur (see the example checks after this list):

    1. Have network mounts appeared as expected?

    2. Is authentication working as intended?

    3. What commands are required to finish bootstrapping?

    4. Et cetera.

  6. Iterate and validate as many times as required to satisfy all requirements.

  7. Lastly, LSF services should be started at the end of bootstrapping.

    1. It may take 5 minutes or longer for the LSF services to register with the LSF Master Host.

    2. When the RC Execution Host is properly registered, it will be visible via the lshosts command.

  8. To remove this temporary VM:

    1. Replace VM_NAME with the name of the VM (xvm0 in the -h xvm0 example above).

    2. Code Block
      curl -X DELETE  http://${MGMT_SERVER_IP}:5000/v1/xcompute/vm/VM_NAME
  9. When totally satisfied, place final versions of the user_data scripts in the ${LSF_RC_CONF_DIR}/exostellar/conf directory.

    1. There needs to be a per-pool user_data script in that folder. It can be overwritten any time a change is needed; the next time a node is instantiated from that pool, it will pick up the changes.

    2. A common scenario is that all the user_data scripts are identical, but it could be beneficial for different pools to have different user_data bootstrapping assets. If they are identical, links can be placed in the ${LSF_RC_CONF_DIR}/exostellar/conf directory instead of individual files.

    3. The auto-generated user_data scripts are initially located in ${LSF_RC_CONF_DIR}/exostellar/scripts. After copying them to ${LSF_RC_CONF_DIR}/exostellar/conf and modifying them, be sure to set their permissions so LSF can use them, e.g.:

      Code Block
      chown lsfadmin.root ${LSF_RC_CONF_DIR}/exostellar/conf/*_user_data.sh
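
As an illustration of the sanity checks in step 5, a few generic commands you might run over ssh on the test VM are sketched below; the mount point, username, and daemon names are assumptions to adapt to your environment:

Code Block
# Have network mounts appeared as expected? (hypothetical mount point)
df -h /shared
# Is authentication working as intended? (hypothetical user)
id someuser
# Are the LSF daemons running after bootstrapping?
ps -ef | egrep 'lim|res|sbatchd'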

Validate Integration with LSF

  1. Integration steps are complete and a job submission to the new queue is the last validation:

    1. As a user, navigate to a valid job submission directory and launch a job as normal, but be sure to specify the new queue:

      1. bsub -q <NewQueueName> < job-script.sh, for example: bsub -q xio < job-script.sh
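
A minimal follow-up check, assuming the example queue name xio: watch the job start and the dynamically provisioned hosts join using standard LSF commands.

Code Block
bjobs -q xio
bhosts
lshosts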