<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
Hi Thomas,<br>
<br>
It sounds like you are running into this issue: <br>
<a class="moz-txt-link-freetext" href="https://jira.whamcloud.com/browse/LU-14121">https://jira.whamcloud.com/browse/LU-14121</a><br>
<br>
I think I ran into the same issue, or at least something similar,
on our Slurm cluster running Lustre 2.15.x (servers and
clients).<br>
As I haven't had the spare cycles or equipment to dig into what was
going on, I have been using admin=1 and the legacy root squash
mechanism for our cluster nodes, as mentioned in the Jira ticket.<br>
<br>
<br>
Thanks,<br>
David<br>
<br>
<br>
<div class="moz-cite-prefix">On 1/9/2025 12:58 PM, Thomas Roth
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:92873aba-20ee-4b1a-89ae-9ae0e3f61760@googlemail.com">Yes,
yes,
<br>
I have an Admin nodemap comprising all Lustre servers and a
handful of administrative clients, and this nodemap has both admin
and trusted set to 1.
<br>
<br>
No, by now I rather think the cause is this: the Slurm daemon,
slurmstepd, runs as root, so on the batch nodes it comes in as
user 99. When a job wants to write output to, say,
/lustre/A/B/C/, and A, B and C are not world-readable (actually
octal '5', i.e. r-x for others), slurmstepd can't step into the
output directory and the job fails.
<br>
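To make the reasoning above concrete, here is a minimal Python sketch of the directory search check as I understand it. It is only a model, not Lustre or Slurm code: the names Dir, mapped_uid, may_search and can_reach are made up for illustration, and uid/gid 1000, mode 0o750, and the squash id 99 for both uid and gid are assumptions.
<br>
<br>
```python
from dataclasses import dataclass

SQUASH_ID = 99  # assumed squash uid/gid, taken from "user 99" in the message


@dataclass
class Dir:
    name: str
    owner: int
    group: int
    mode: int  # permission bits, e.g. 0o750


def mapped_uid(client_uid, admin):
    """Model of the nodemap 'admin' property: root is squashed unless admin=1."""
    if client_uid == 0 and not admin:
        return SQUASH_ID
    return client_uid


def may_search(d, uid, gids):
    """POSIX-style check of the search (execute) bit on a single directory."""
    if uid == 0:
        return True  # unsquashed root bypasses the permission check
    if uid == d.owner:
        return bool(d.mode & 0o100)
    if d.group in gids:
        return bool(d.mode & 0o010)
    return bool(d.mode & 0o001)


def can_reach(path, uid, gids):
    """True only if every directory along the path may be searched."""
    return all(may_search(d, uid, gids) for d in path)


if __name__ == "__main__":
    # Hypothetical user 1000 owns /lustre/A/B/C, nothing granted to others.
    private = [Dir(n, 1000, 1000, 0o750) for n in ("A", "B", "C")]

    squashed = mapped_uid(0, admin=False)   # slurmstepd as squashed root
    unsquashed = mapped_uid(0, admin=True)  # slurmstepd with admin=1

    print("user:", can_reach(private, 1000, {1000}))
    print("root, admin=0:", can_reach(private, squashed, {SQUASH_ID}))
    print("root, admin=1:", can_reach(private, unsquashed, {0}))
```
<br>
In this model, with admin=0 root arrives as uid 99, falls into the "other" class, and is stopped at the first directory that lacks the search bit for others. With admin=1 it stays uid 0 and passes, and the job's own user passes in either case.
<br>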
<br>
<br>
Regards,
<br>
Thomas
<br>
<br>
On 1/9/25 1:10 PM, Sebastien Buisson wrote:
<br>
<blockquote type="cite">Hi,
<br>
<br>
As explained in the Lustre Operations Manual in this section:
<br>
<a class="moz-txt-link-freetext" href="https://doc.lustre.org/lustre_manual.xhtml#idm139831573757696">https://doc.lustre.org/lustre_manual.xhtml#idm139831573757696</a>
it is required to define a nodemap that matches all server
nodes, with admin and trusted set to 1.
<br>
Have you?
<br>
<br>
Cheers,
<br>
Sebastien.
<br>
<br>
On 9 Jan 2025, at 13:03, Thomas Roth
<a class="moz-txt-link-rfc2396E" href="mailto:dibbegucker@googlemail.com">&lt;dibbegucker@googlemail.com&gt;</a> wrote:
<br>
<br>
Hi all,
<br>
<br>
We have just switched on nodemap on our 2.12 cluster, with all
batch clients being trusted=1 but admin=0, so basically
root-squashing.
<br>
<br>
The batch system is done by Slurm.
<br>
<br>
Now all jobs fail when the user's directory on Lustre is
not world-readable ("permission denied").
<br>
<br>
Read/write access from the shell is not a problem.
<br>
<br>
<br>
<br>
Has any site running Slurm encountered a similar issue?
<br>
<br>
<br>
Regards,
<br>
Thomas
<br>
<br>
<br>
Perhaps I should add that I have used the default nodemap for
this, to avoid having to specify many hundreds of non-contiguous
batch node IP ranges.
<br>
_______________________________________________
<br>
lustre-discuss mailing list
<br>
<a class="moz-txt-link-abbreviated" href="mailto:lustre-discuss@lists.lustre.org">lustre-discuss@lists.lustre.org</a>
<br>
<a class="moz-txt-link-freetext" href="https://urldefense.com/v3/__http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org__;!!PvDODwlR4mBZyAb0!SF1EJmHJokm42L888JwiZfsoKpgqKkTF25wvx8PcIkUgF3OktC0ll3zzI-gYrNeFHg_bhBFf2L6C2aLMG0MB_ZRRXw$">https://urldefense.com/v3/__http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org__;!!PvDODwlR4mBZyAb0!SF1EJmHJokm42L888JwiZfsoKpgqKkTF25wvx8PcIkUgF3OktC0ll3zzI-gYrNeFHg_bhBFf2L6C2aLMG0MB_ZRRXw$</a>
<br>
</blockquote>
<br>
</blockquote>
<br>
</body>
</html>