The following manual steps for EAR will soon be replaced with a simplified workflow for command-line users. Alternatively, the Management Console (Web UI) will be able to replace most of these steps as well.
...
#exostellar
LSB_STDOUT_DIRECT=Y #optional
LSB_RC_EXTERNAL_HOST_FLAG=xiohost
LSF_LOCAL_RESOURCES="[resource xiohost] [type X86_64]"
LSF_DYNAMIC_HOST_WAIT_TIME=2
LSF_DYNAMIC_HOST_TIMEOUT=10m
ENABLE_DYNAMIC_HOSTS=Y
LSF_REG_FLOAT_HOSTS=Y
EBROKERD_HOST_CLEAN_DELAY=5
LSF_MQ_BROKER_HOSTS=head #equivalent to LSF Master, in this example
LSB_RC_EXTERNAL_HOST_IDLE_TIME=2
EGO_DEFINE_NCPUS=threads
The values assigned to variables with TIME and DELAY in their names may be tuned for the best timing scenario of your cluster and assets. The LSF Admin may opt for different timing than above.
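Before tuning, it can help to see which timing values are currently set. The following is a minimal sketch, not part of the documented workflow: the function name show_timing_settings is hypothetical, and the path to lsf.conf is supplied by the caller.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the settings whose names contain TIME or DELAY
# from the lsf.conf file given as the first argument.
show_timing_settings() {
  local conf_file="$1"
  # Match only uncommented assignments; commented-out lines start with '#'.
  grep -E '^[[:space:]]*[A-Z_]*(TIME|DELAY)[A-Z_]*=' "$conf_file"
}
```

Assuming LSF_ENVDIR points at your LSF configuration directory, it could be called as show_timing_settings "${LSF_ENVDIR}/lsf.conf".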
...
The hostname specified with -h xvm0 is arbitrary.
The Image Name specified with -i <IMAGE_NAME> should correspond to the Image Name from the parse_helper.sh command and the environment setup earlier.
The -u user_data.sh option is available for any customization that may be required: temporarily changing a password to facilitate logging in, for example.
The test_createVm.sh script will continuously output updates until the VM is created. When the VM is ready, the script will exit and you’ll see that all the fields in the output are now filled with values:
Waiting for xvm0... (4)
NodeName: xvm0
Controller: az1-qeuiptjx-1
Controller IP: 172.31.57.160
Vm IP: 172.31.48.108
With the Vm IP above, ssh to the node and inspect the compute node. This step is meant to provide a migratable VM so that sanity checking may occur:
Have network mounts appeared as expected?
Is authentication working as intended?
What commands are required to finish bootstrapping?
Et cetera.
Iterate and validate as many times as required to satisfy all requirements.
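The first two sanity checks can be scripted so that each iteration is quick to repeat. This is only a sketch under stated assumptions: the function names check_mount and check_user are hypothetical, the mountpoint utility (from util-linux) is assumed to be installed on the node, and the directories and accounts to check are site-specific.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a directory is an active mount point.
check_mount() {
  local dir="$1"
  if mountpoint -q "$dir"; then
    echo "OK: $dir is mounted"
  else
    echo "MISSING: $dir is not a mount point"
    return 1
  fi
}

# Hypothetical helper: report whether a user account resolves on this node.
check_user() {
  local user="$1"
  if id "$user" >/dev/null 2>&1; then
    echo "OK: user $user resolves"
  else
    echo "MISSING: user $user does not resolve"
    return 1
  fi
}
```

Run on the VM after logging in, these could be called once per expected network mount and once per account that must be able to authenticate.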
Lastly, LSF services should be started at the end of bootstrapping.
It may take 5 minutes or longer for the LSF services to register with the LSF Master Host.
When the RC Execution Host is properly registered, it will be visible via the lshosts command.
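Because registration can take several minutes, polling is less tedious than rerunning the command by hand. The sketch below assumes the node registers under a host name you already know; the function name wait_for_host and its default timeout and interval are hypothetical choices, not values from the product.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll lshosts until the given host appears, or give up.
# Usage: wait_for_host HOSTNAME [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
wait_for_host() {
  local host="$1" timeout="${2:-600}" interval="${3:-30}" waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # lshosts -w prints one untruncated row per host; the name is column one.
    if lshosts -w 2>/dev/null | awk -v h="$host" '$1 == h { found = 1 } END { exit !found }'; then
      echo "$host is registered"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "$host did not register within ${timeout}s" >&2
  return 1
}
```

For example, wait_for_host xvm0 900 30 would check every 30 seconds for up to 15 minutes, assuming the node registers as xvm0.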
To remove this temporary VM:
Replace VM_NAME with the name of the VM (xvm0, from the -h xvm0 example above).
curl -X DELETE http://${MGMT_SERVER_IP}:5000/v1/xcompute/vm/VM_NAME
The above steps may need to be iterated through several times. When totally satisfied, stash the various commands required for successful bootstrapping and overwrite the user data scripts in the ${LSF_TOP}/conf/resource_connector/exostellar/conf directory.
There will be a per-pool user_data script in that folder. It can be overwritten any time a change is needed, and the next time a node is instantiated from that pool, the node will get the changes.
A common scenario is that all the user_data scripts are identical, but it could be beneficial for different pools to have different user_data bootstrapping assets.
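When overwriting a per-pool script, keeping a copy of the previous version makes it easy to roll back. The following is a minimal sketch: the function name install_user_data is hypothetical, and the source and destination file names are supplied by the caller because the per-pool file names depend on your configuration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace a per-pool user_data script with a validated one,
# keeping a timestamped backup of the file being replaced.
# Usage: install_user_data VALIDATED_SCRIPT POOL_SCRIPT
install_user_data() {
  local src="$1" dest="$2"
  if [ -f "$dest" ]; then
    # Preserve mode and timestamps on the backup copy.
    cp -p "$dest" "${dest}.bak.$(date +%Y%m%d%H%M%S)"
  fi
  cp "$src" "$dest"
}
```

If all pools share the same bootstrapping, the function can be called once per pool script with the same validated source file.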
...
Integration steps are complete and a job submission to the new queue is the last validation:
As a user, navigate to a valid job submission directory and launch a job as normal, but be sure to specify the new queue:
bsub -q NewQueueName xio < job-script.sh
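To follow the job after submission, the job ID can be captured from the message bsub prints. This sketch assumes the usual submission message of the form "Job <1234> is submitted to queue <NewQueueName>."; the function name job_id_from_bsub is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read bsub's submission message on stdin and print
# only the numeric job ID.
job_id_from_bsub() {
  sed -n 's/^Job <\([0-9][0-9]*\)> is submitted.*/\1/p'
}
```

Piping the output of the submission command into job_id_from_bsub yields an ID that can then be passed to bjobs -l to confirm the job was dispatched to a host from the new pool.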