[lustre-discuss] Assistance deploying Slurm HPC cluster with Lustre file system based on Google Cloud Platform (GCP)

Eyal Estrin eyale at hotmail.com
Sun Aug 4 06:18:20 PDT 2019


Hi all,
1. I am trying to deploy a Slurm HPC cluster on Google Cloud Platform with a Lustre file system, following the instructions below:
   https://codelabs.developers.google.com/codelabs/hpc-slurm-on-gcp/#0
   https://cloud.google.com/blog/products/storage-data-transfer/introducing-lustre-file-system-cloud-deployment-manager-scripts
   https://github.com/GoogleCloudPlatform/deploymentmanager-samples/tree/master/community/lustre
​
2. I have set up VPC peering between the Slurm network and the Lustre cluster network.

3. I have created firewall rules that allow all ports and protocols between the Slurm network and the Lustre cluster network.

4. I have added host entries for all the Lustre cluster machines to /etc/hosts on the Slurm master node.
​
5. I have installed the following Lustre client prerequisites on the Slurm master node:
   sudo yum install kernel kernel-devel kernel-headers kernel-abi-whitelists kernel-tools kernel-tools-libs kernel-tools-libs-devel
​
6. I have created /etc/yum.repos.d/lustre.repo with the following content:
[lustre-server]
name=CentOS-$releasever - Lustre
baseurl=https://downloads.hpdd.intel.com/public/lustre/latest-feature-release/el7/server/
gpgcheck=0

[e2fsprogs]
name=CentOS-$releasever - Ldiskfs
baseurl=https://downloads.hpdd.intel.com/public/e2fsprogs/latest/el7/
gpgcheck=0

[lustre-client]
name=CentOS-$releasever - Lustre
baseurl=https://downloads.hpdd.intel.com/public/lustre/latest-feature-release/el7/client/
gpgcheck=0
​
7. I have installed the Lustre client packages on the Slurm master node, using the following command:​
   sudo yum install e2fsprogs lustre-client​
​
8. I used the following commands on the Slurm master node to create a mount point for the Lustre file system:
   sudo mkdir -p /lustre
   sudo chmod 777 -R /lustre
​
9. Because my logged-in account on the Slurm master node is a Google G Suite account rather than root, the only way I can mount the file system and create a test file inside the mount point /lustre is with sudo:
    sudo mount -t lustre lustre-mds1:/lustre /lustre
    sudo touch /lustre/1.txt
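
Steps 6 and 7 above could also be scripted rather than done by hand. The following is only a sketch: it writes the same repo content as in step 6 to whatever path it is given. The function name write_lustre_repo and its path argument are illustrative and not part of any Lustre or Slurm tooling.

```shell
#!/bin/sh
# Write the Lustre yum repo definition from step 6 to the path given as $1.
# write_lustre_repo is a hypothetical helper name used only for this sketch.
write_lustre_repo() {
    # The quoted 'EOF' delimiter keeps $releasever literal,
    # so that yum expands it later instead of the shell.
    cat > "$1" <<'EOF'
[lustre-server]
name=CentOS-$releasever - Lustre
baseurl=https://downloads.hpdd.intel.com/public/lustre/latest-feature-release/el7/server/
gpgcheck=0

[e2fsprogs]
name=CentOS-$releasever - Ldiskfs
baseurl=https://downloads.hpdd.intel.com/public/e2fsprogs/latest/el7/
gpgcheck=0

[lustre-client]
name=CentOS-$releasever - Lustre
baseurl=https://downloads.hpdd.intel.com/public/lustre/latest-feature-release/el7/client/
gpgcheck=0
EOF
}
```

On the master node this would have to run as root, for example by calling write_lustre_repo /etc/yum.repos.d/lustre.repo from a root-run script (sudo cannot call a shell function directly), followed by the yum install command from step 7.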
​
I have a couple of problems with the above process:
A. Even though the mount point (/lustre) has mode 777, the directory is still owned by the root user and group, and I am still unable to write files to the /lustre mount point. How do I give Google G Suite accounts permission to read, write and delete files under /lustre?

B. How do I install the following packages as part of the Slurm deployment (https://github.com/SchedMD/slurm-gcp), on both the Slurm master node and all Slurm compute nodes?
   sudo yum install kernel kernel-devel kernel-headers kernel-abi-whitelists kernel-tools kernel-tools-libs kernel-tools-libs-devel
   sudo yum install e2fsprogs lustre-client
   Note: For the Lustre client installation, I need to add /etc/yum.repos.d/lustre.repo with specific content (as instructed here: http://wiki.lustre.org/Installing_the_Lustre_Software)
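
For problem A, one thing that might be worth checking is which owner and mode /lustre reports before and after the mount. In general, once a file system is mounted on a directory, the permissions shown are those of the root directory of the mounted file system, not those set earlier on the empty mount point, so the chmod in step 8 (run before the mount in step 9) may no longer be in effect. A minimal sketch of that check, assuming GNU coreutils stat; the helper name show_dir_perms is only for illustration:

```shell
#!/bin/sh
# Print "owner:group mode" for the directory given as $1.
# Uses the GNU stat format string (-c); show_dir_perms is a hypothetical name.
show_dir_perms() {
    stat -c '%U:%G %a' "$1"
}
```

Running show_dir_perms /lustre once before and once after the sudo mount command would show whether the 777 mode survives the mount. If it does not, repeating the chmod (or a chown to the relevant user) after mounting may be what is needed; this is untested on the GCP deployment described here.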



Thanks,

Eyal Estrin
