Example User Data Scripts
Whether you’ve rolled your own Slurm cluster or you’re relying on AWS Parallel Cluster (APC), you might find these examples helpful.
APC Example with a Base CentOS 7 AMI
Longer user_data.sh
#!/bin/bash
set -x
hostname XSPOT_NODENAME
#echo 'root:AAAAAAA' |chpasswd
#sed -i 's/PasswordAuthentication no/PasswordAuthentication yes/g' /etc/ssh/sshd_config
#sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config
#sed -i 's/#PermitRootLogin yes/PermitRootLogin yes/g' /etc/ssh/sshd_config
#echo 'ssh-rsa AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA root@apchead' >> /root/.ssh/authorized_keys
#systemctl restart sshd
APCHEAD=172.31.60.46
#mounting APC NFS dirs
mkdir -p /home /opt/parallelcluster/shared /opt/intel /opt/slurm
for i in /home /opt/parallelcluster/shared /opt/intel /opt/slurm ; do
echo Mounting ${APCHEAD}:${i} ${i}
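# report SUCCESS only when the mount actually succeeded
if mount -t nfs ${APCHEAD}:${i} ${i}; then
echo Mounting ${APCHEAD}:${i} ${i} : SUCCESS.
else
echo Mounting ${APCHEAD}:${i} ${i} : FAILED.
fi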
done
#mounting EFS
mkdir -p /exoefs
echo 'fs-AAAAAAAAAAAAAAAAAA.efs.us-east-1.amazonaws.com:/ /exoefs nfs4 nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=30,retrans=2,noresvport,_netdev 0 0' >> /etc/fstab
mount -a
#add krs, slurm401, munge402 users
groupadd -g 899 exo
useradd -u 1001 -g 899 krs
groupadd -g 401 slurm
groupadd -g 402 munge
useradd -g 401 -u 401 slurm
useradd -g 402 -u 402 munge
rpm -ivh /opt/parallelcluster/shared/munge/x86_64/munge-0.5.14-1.el7.x86_64.rpm
cp -p /opt/parallelcluster/shared/munge/munge.key /etc/munge/
chown munge:munge /etc/munge /var/log/munge
mkdir -p /var/spool/slurmd
chown slurm:slurm /var/spool/slurmd
sleep 5
systemctl start munge
if [[ $? -ne 0 ]]; then
sleep 10
systemctl start munge
fi
SLURM_BIN_PATH=/opt/slurm/bin
SLURM_SBIN_PATH=/opt/slurm/sbin
SLURM_CONF_DIR=/opt/slurm/etc
${SLURM_BIN_PATH}/scontrol update nodename=XSPOT_NODENAME nodeaddr=$(hostname -I | cut -d" " -f1)
#systemctl start slurmd
${SLURM_SBIN_PATH}/slurmd -f ${SLURM_CONF_DIR}/slurm.conf -N XSPOT_NODENAME
Note: Capturing an AMI from a running compute node booted by AWS Parallel Cluster (APC) can be very tedious. APC nodes are not an efficient starting point for generating an AMI for parsing, because of the complexity inherent in how APC manages its resources. It is faster to start with a base image that contains no APC dependencies and then add those dependencies via a user_data.sh, as above.
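The longer script above starts munge, sleeps, and tries once more if the first attempt fails. The same idea can be written as a small helper, sketched below. The function name retry and its argument order are hypothetical and are not part of APC, Slurm, or munge.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script):
#   retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# sleeping DELAY_SECONDS between attempts.
retry() {
    local max_attempts=$1 delay=$2
    shift 2
    local attempt=1
    until "$@"; do
        if (( attempt >= max_attempts )); then
            echo "retry: '$*' failed after ${attempt} attempts" >&2
            return 1
        fi
        attempt=$(( attempt + 1 ))
        sleep "$delay"
    done
    return 0
}
```

With this helper in place, the munge block could become retry 2 10 systemctl start munge, and each NFS mount in the loop could be wrapped the same way. The attempt counts and delays are illustrative and would need tuning for a given cluster.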
Example with a non-APC-compute-node-based AMI
Simple user_data.sh
#!/bin/bash
set -x
hostname XSPOT_NODENAME
SLURM_BIN_PATH=/usr/bin
${SLURM_BIN_PATH}/scontrol update nodename=XSPOT_NODENAME nodeaddr=$(hostname -I | cut -d" " -f1)
systemctl start slurmd
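Both scripts take the first address printed by hostname -I and pass it to scontrol as nodeaddr. If the node has no address yet, that value is empty and the update is sent with a blank nodeaddr. A minimal sketch of a guard follows. The function name first_addr is hypothetical, and it splits on any whitespace rather than on a single space as cut does.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original scripts):
#   first_addr "ADDR1 ADDR2 ..."
# Prints the first whitespace-separated field of its argument and
# returns non-zero when the argument contains no fields at all.
first_addr() {
    local addrs=$1
    local -a fields
    # Split the string on whitespace into an array.
    read -r -a fields <<< "$addrs"
    if (( ${#fields[@]} == 0 )); then
        return 1
    fi
    printf '%s\n' "${fields[0]}"
}
```

One possible use is NODE_ADDR=$(first_addr "$(hostname -I)") || exit 1, followed by the scontrol update line with nodeaddr=${NODE_ADDR}, so the script stops early rather than registering the node without an address.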