[lustre-discuss] Lustre chown and create operations failing for certain UIDs

Russell Dekema dekemar at umich.edu
Fri Jan 11 06:18:15 PST 2019


Good morning,

I've been bitten by that one in the past too. This time I checked, and
the md5sum of /etc/passwd is identical on the MDS nodes and the
compute/login nodes.
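
A comparison of this kind can be sketched roughly as below. This is only
an illustration, not the exact commands that were run: same_md5 is a
hypothetical helper name, and it assumes GNU md5sum is available.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre): prints "match" when two files
# have the same MD5 checksum and "differ" otherwise. Returns 2 if either
# file cannot be read.
same_md5() {
    a=$(md5sum "$1") || return 2
    b=$(md5sum "$2") || return 2
    # md5sum prints "<hash>  <filename>"; keep only the hash.
    a=${a%% *}
    b=${b%% *}
    if [ "$a" = "$b" ]; then
        echo match
    else
        echo differ
    fi
}
```

With a copy of the MDS passwd file fetched to a local path (for example
/tmp/passwd.mds, a placeholder), "same_md5 /etc/passwd /tmp/passwd.mds"
would print "match" for identical files. Note that identical files say
nothing about users that come from other NSS sources; "getent passwd
affected_username" on each node checks the lookup itself.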

Cheers,
Rusty D.

On Thu, Jan 10, 2019 at 7:16 PM Harr, Cameron <harr1 at llnl.gov> wrote:
>
> Russell,
>
> Your symptoms are a little different from what I see when the MDS node's
> passwd file is incomplete, but did you verify the affected_user has a
> proper /etc/passwd entry on the MDS node(s)?
>
> On 1/10/19 12:14 PM, Russell Dekema wrote:
> > We've got a Lustre system running lustre-2.5.42.28.ddn8 and are having
> > a problem with it that none of us here have ever seen before. We are
> > wondering if anyone here has seen this or has any idea what might be
> > causing it.
> >
> > (I have redacted the example affected username and its corresponding
> > UID in this message for privacy reasons, and replaced them with the
> > strings 'affected_username' and 'uid_of_the_affected_username'.)
> >
> > Of the 4,463 UIDs on our system, 24 cause Lustre to return an error
> > whenever they are used as the owner of a file in a chown or create
> > operation, as in:
> >
> > # chown affected_username testfile
> > chown: changing ownership of ‘testfile’: No such process
> >
> > This occurs both when root tries to chown a file for the affected UID
> > and when the user corresponding to that UID tries to create a file in
> > a directory to which they otherwise have the proper permissions.
> >
> > If, as root, we run "strace chown affected_username testfile", the
> > system call that fails appears to be:
> >
> > fchownat(AT_FDCWD, "testfile", uid_of_the_affected_username, -1, 0) = -1 ESRCH (No such process)
> >
> > Another unusual thing we have noticed about these users is that we are
> > unable to look up their quotas on the Lustre filesystem, even though
> > quota lookups work fine for other users:
> >
> > $ sudo lfs quota -u affected_username /scratch
> > usr quotas are not enabled.
> >
> > If anyone has any ideas about what might be causing this or what we
> > might try in order to fix or further diagnose it, I'll be glad to hear
> > them. We do have a ticket open with our vendor but wanted to see if
> > anyone else had heard of this while we await their response.
> >
> > Sincerely,
> > Rusty Dekema
> > University of Michigan
> > Advanced Research Computing - Technology Services
> > _______________________________________________
> > lustre-discuss mailing list
> > lustre-discuss at lists.lustre.org
> > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

