Document toolboxDocument toolbox

v2.4.0.0 Customizing Profiles for Slurm Integration


Edit the Profile JSON for Your Purposes:

  1. Tagging instances created by the backend is controlled by two sections, depending on the function of the asset:

    1. Controllers are On-Demand instances that manage other instances. By default, they are tagged as seen on lines 6-9, above, and 1-4 below.

      1. { "Key": "exostellar.xspot-role", "Value": "xspot-controller" }
      2. To add additional tags, duplicate lines 1-4 as 5-8 below (as many times as you need), noting that an additional comma is added on line 4.

      3. { "Key": "exostellar.xspot-role", "Value": "xspot-controller' }, { "Key": "MyCustomKey", "Value": "MyCustomValue" }
      4. Don’t forget the comma between tags.

    2. Workers will be created by Controllers as needed and they can be On-Demand/Reserved instances or EC2 Spot. By default, they are tagged as seen on lines 26-30, above, and 1-4 below:

      1. { "Key": "exostellar.xspot-role", "Value": "xspot-worker" }
      2. Add as many tags as needed.

      3. Don’t forget the comma between tags.

  2. Note Line numbers listed below reference the above example file. Once changes start being made on the system, the line numbers may change.

  3. Line 11 - InstanceType: Controllers do not generally require large instances.

    1. In terms of performance, these On-Demand Instances can be set as c5.xlarge or m5.xlarge with no adverse effect.

  4. Line 20 - MaxControllers : This will define an upper bound for your configuration.

    1. Controllers will manage up to 80 workers.

    2. The default upper bound is 800 nodes joining your production cluster: notice line 20 "MaxControllers": 10,.

    3. If you plan to autoscale past 800 nodes joining your production cluster, MaxControllers should be increased.

    4. If you want to lower that upper bound, MaxControllers should be decreased.

  5. Line 21 - ProfileName: This is used for your logical tracking, in the event you configure multiple profiles.

  6. Lines 31-34 - InstanceTypes here in the Worker section, this refers to On-Demand instances – if there is no EC2 Spot availability, what instances do you want to run on.

  7. Lines 38-43 - SpotFleetTypes : here in the Worker section, this refers to EC2 Spot instance types – because of the discounts, you may be comfortable with a much broader range of instance types.

    1. More types and families here, means more opportunities for cost optimization.

    2. Priorities can be managed by appending a : and an integer, e.g. m5:1 is a higher priority than c5:0.

  8. Line 48 - EnableHyperthreading: Set to False to disable hyperthreading.

  9. Line 53 - NodeGroupName : This string appears in Controller Name tagging <profile>-NGN-count

  10. All other field/lines can be ignored in the asset.