(v2.3.0.0) AMI Parsing Error Code Reference Slurm

Parsing an AMI and validating the image’s ability to connect with the HPC scheduler can be a challenging task, depending on the requirements of the environment.

Note: SELINUX must be disabled in the target AMI. If it is not already disabled, disable it.

Error Codes from Parsing AMIs

The following table lists each error code, what it means, and how to debug it.

Code

Error

How To Debug

1

Catch-all error code for unhandled errors, added as a safety net.

If available, check the logs or script output for more details on the error. This applies to all error codes.

100

Could not get metadata (such as security groups, subnet, or availability zone) and/or could not determine the AMI ID of the AL2 AMI.

101

Could not create instance from the AMI that should be parsed.

Potential issues:

  • ssh-keygen was not able to generate a public key from a private key. If a --private-key-file was specified, ensure that the key file is valid.

  • The userdata could not be generated properly. If a --public-key-file was specified, ensure that it exists.

  • The EC2 run-instances call failed. This could be due to a lack of permissions or to invalid parameters, such as the specified AMI ID or --key-name.
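The first potential issue above can be checked locally before rerunning the script. The following is a minimal sketch, assuming ssh-keygen is on the PATH; the function name private_key_is_usable is hypothetical and not part of the parsing script. It relies on `ssh-keygen -y -f <file>`, which prints the public key for a given private key.

```python
import os
import subprocess


def private_key_is_usable(key_path: str) -> bool:
    """Return True if ssh-keygen can derive a public key from key_path."""
    if not os.path.isfile(key_path):
        return False
    try:
        # `ssh-keygen -y -f <file>` reads a private key and prints its public key.
        result = subprocess.run(
            ["ssh-keygen", "-y", "-f", key_path],
            stdin=subprocess.DEVNULL,
            capture_output=True,
            text=True,
            timeout=30,  # a passphrase prompt would otherwise block forever
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # ssh-keygen is not installed, or it did not finish in time.
        return False
    return result.returncode == 0 and result.stdout.strip() != ""
```

A False result for the file passed as --private-key-file suggests the key is missing, malformed, or passphrase-protected.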

102

Could not get root volume ID of primary instance.

Ensure required IAM permissions for the script are granted.

103

Could not boot secondary AL2 instance for EBS based image.

See error code 101; the run-instances call could have failed.

104

Could not wait for instances to reach ready state.

  • Ensure required IAM permissions for the script are granted.

  • Alternatively, the instances might not be able to reach a ready state. In the AWS console, look for instances tagged parseAMI to inspect them further.

105

Failed to ssh to primary instance.

  • Ensure that network traffic is allowed between instances assigned the security groups of the instance running the script.

  • In rare cases, SSH to the instance will fail due to a timeout if the AMI is very slow to boot or if it is slow to respond to SSH calls.
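To tell a blocked network path apart from a slow-booting AMI, a plain TCP check against the SSH port can help. This is a sketch only: the function name ssh_port_reachable is hypothetical, port 22 is assumed to be the SSH port, and the host is whatever address the instance running the script uses to reach the primary instance.

```python
import socket


def ssh_port_reachable(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

If the check keeps failing long after the instance is running, security group rules are the more likely cause. If it succeeds only after a delay, the AMI is probably slow to boot.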

106

Could not get /proc/cmdline of primary instance.

Some AMIs might mount the /home directory after booting, eclipsing the SSH keys used by default. If that is the case, rerun the script with the private_key_file and username options that match the user for the mounted directory.

107

The wildcard generated from the BOOT_IMAGE parameter to determine the initrd of the AMI is not valid.

Similar to error code 115, inspect the log output for init_wcard. If the wildcard is not a path, there may be an issue with the underlying BOOT_IMAGE parameter or with extracting the path from it.

108

Command to find initrd files with the init wildcard failed.

See error code 107.

109

No valid init file could be found.

See error code 107. If the wildcard is valid, inspect an instance booted from the AMI and verify that valid init images are present in the /boot directory.
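When inspecting an instance booted from the AMI, a short listing of candidate init images in /boot can speed up the check. The sketch below assumes the images have names starting with "init" (for example initrd or initramfs files); the actual wildcard used by the script is the one logged as init_wcard and may differ. The function name find_init_images is hypothetical.

```python
import glob
import os
from typing import List


def find_init_images(boot_dir: str = "/boot") -> List[str]:
    """List regular files in boot_dir whose names start with 'init'."""
    # Assumption: init images match "init*"; compare with the logged init_wcard.
    pattern = os.path.join(boot_dir, "init*")
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))
```

An empty list means no file in the directory matches the assumed pattern, which is consistent with error code 109.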

110

Could not change init file permissions.

Check log output to see why changing the permission failed.

111

Could not copy init file from booted instance to instance running the script.

Make sure permissions to copy to the local directory are granted.

112

Running the script with commands to update the instance failed.

Required

  • Ensure that the current user has permission to write files.

Optional

  • Ensure that the specified script exists and can be executed within 5 minutes.

113

Could not get snapshot of root volume of the booted instance.

  • Ensure required IAM permissions are granted.

  • There could be an issue with the booted instance(s) and the root volume. In the AWS console, look for instances tagged parseAMI to inspect them further.

114

Could not save example create-vm command to file.

Ensure the current user has permission to write files.
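Write permission, which also comes up under error codes 112 and 117, can be verified by actually creating a temporary file. This is a minimal sketch: the function name can_write_files is hypothetical, and the directory the script writes to is an assumption (the current working directory is used as the default here).

```python
import tempfile


def can_write_files(directory: str = ".") -> bool:
    """Return True if the current user can create a file in directory."""
    try:
        # The temporary file is deleted automatically when the block exits.
        with tempfile.NamedTemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers missing directories and permission errors.
        return False
```

Creating a file is more reliable than inspecting permission bits alone, because it also catches read-only file systems and missing directories.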

115

Could not find the BOOT_IMAGE parameter in the /proc/cmdline output.

Inspect the log output to see what is logged for boot_params. Alternatively, inspect the /proc/cmdline of an instance booted from the AMI to see whether the parameter is missing or whether there was an issue picking it up. A missing parameter could indicate an issue with the underlying AMI.
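To check a /proc/cmdline value by hand, the BOOT_IMAGE lookup can be reproduced in a few lines. This is a sketch of one plausible way to extract the parameter, not the script's actual implementation; the function name extract_boot_image is hypothetical, and it assumes the value contains no spaces.

```python
from typing import Optional


def extract_boot_image(cmdline: str) -> Optional[str]:
    """Return the value of BOOT_IMAGE from a kernel command line, or None."""
    # Assumption: parameters are whitespace-separated key=value tokens.
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "BOOT_IMAGE" and sep:
            return value
    return None
```

Paste the contents of /proc/cmdline from the instance into the function. A None result matches the condition reported by error code 115.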

116

Caught interrupt signal that terminated the script execution.

N/A

117

Could not create JSON manifest file containing all the image information.

  • Ensure required IAM permissions are granted.

  • Ensure the current user has permission to write files.

118

Could not generate tmp SSH key to gain access to the booted instance.

Verify that keys can be generated on the current instance with openssl.
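When the parsing script is called from automation, the codes in the table can be turned into readable messages with a small lookup. The sketch below assumes the codes are returned as the script's exit status and that 0 means success; neither is stated in the table. The names ERROR_CODES and describe_exit_code are hypothetical, and the descriptions are condensed from the table above.

```python
# Descriptions condensed from the error code table.
ERROR_CODES = {
    1: "Unhandled error (catch-all)",
    100: "Could not get metadata or determine the AL2 AMI ID",
    101: "Could not create instance from the AMI to be parsed",
    102: "Could not get root volume ID of primary instance",
    103: "Could not boot secondary AL2 instance for EBS based image",
    104: "Could not wait for instances to reach ready state",
    105: "Failed to ssh to primary instance",
    106: "Could not get /proc/cmdline of primary instance",
    107: "Wildcard generated from BOOT_IMAGE is not valid",
    108: "Command to find initrd files with the init wildcard failed",
    109: "No valid init file could be found",
    110: "Could not change init file permissions",
    111: "Could not copy init file from booted instance",
    112: "Running the script with commands to update the instance failed",
    113: "Could not get snapshot of root volume of the booted instance",
    114: "Could not save example create-vm command to file",
    115: "Could not find BOOT_IMAGE in the /proc/cmdline output",
    116: "Caught interrupt signal that terminated the script",
    117: "Could not create JSON manifest file",
    118: "Could not generate tmp SSH key",
}


def describe_exit_code(code: int) -> str:
    """Map an exit status to a short description from the table."""
    if code == 0:
        return "Success"  # assumption: 0 is the conventional success status
    return ERROR_CODES.get(code, f"Unknown exit code {code}")
```

Pass the exit status of the parsing run to describe_exit_code and log the result alongside the script output.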