[lustre-discuss] Group lookup behaviour (v2.1.5) with ldap/sssd seemingly dependent on client version

Matt Raso-Barnett matt at rasobarnett.com
Mon Mar 21 03:37:26 PDT 2016


Hello,
I've recently seen some strange behaviour while trying to fix a problem
with secondary group lookups against an LDAP database not working in
Lustre. The Lustre server version is 2.1.5, and it has been running
seemingly without this problem for some time. However, after upgrading
the Lustre clients from version 2.4.1 to 2.5.41 (the latest Intel IEE
release), we noticed that group lookups are no longer working.

We've solved our immediate problem: going through the docs, we found a
Red Hat advisory stating that the getgrent() function (used by the
l_getidentity program in Lustre 2.1.5) does not return LDAP groups
with sssd unless the 'enumerate' setting is enabled in sssd.conf
(https://access.redhat.com/solutions/657713).

Enabling this fixes the problem. I also understand that more recent
versions of Lustre use the getgrouplist() function, which doesn't
have this problem with sssd.
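
For anyone wanting to check the same thing on their own MDS, here is a
rough sketch of how the two lookup styles can be compared from Python.
It is not what l_getidentity itself does; it only exercises the same
libc calls: grp.getgrall() walks the group database with getgrent(),
and os.getgrouplist() wraps getgrouplist(3). The helper names are my
own.

```python
import grp
import os
import pwd


def groups_by_enumeration(user):
    """GIDs of groups listing `user` as a member, found by walking every
    group entry (grp.getgrall() iterates with getgrent())."""
    return {g.gr_gid for g in grp.getgrall() if user in g.gr_mem}


def groups_by_grouplist(user, primary_gid):
    """GIDs returned by getgrouplist(3), which asks NSS for this one
    user's groups instead of enumerating all groups."""
    return set(os.getgrouplist(user, primary_gid))


def missing_from_enumeration(user):
    """GIDs that getgrouplist() reports but the enumeration scan misses."""
    pw = pwd.getpwnam(user)  # raises KeyError if the user is unknown
    enumerated = groups_by_enumeration(user) | {pw.pw_gid}
    return sorted(groups_by_grouplist(user, pw.pw_gid) - enumerated)
```

Calling missing_from_enumeration("someuser") (a placeholder user name)
on the MDS with sssd enumeration disabled should, if the advisory
applies, list the LDAP groups that the getgrent()-style scan cannot
see. An empty list means both methods agree for that user.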

However, what we find perplexing is that nodes running older Lustre
clients (I've tested with client versions 2.4.1 and 2.1.5) have no
problem accessing files/directories restricted to LDAP groups. So to
be clear: before fixing sssd, when l_getidentity on the MDS doesn't
return any LDAP groups, clients running 2.5.41 do not allow access to
directories restricted to LDAP groups, but clients running 2.4.1 and
2.1.5 *do* allow access to these same directories.
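
To make the comparison between client versions repeatable, something
like the following can be run as the same user on each client. It is
only a sketch: probe() is a name I made up, and the path to test is
whatever group-restricted directory you have. It reports the
directory's owning GID, the groups the local process carries, and
whether listing the directory actually succeeds.

```python
import os
import stat


def probe(path):
    """Report the owning group of `path`, this process's groups, and
    whether the directory can be listed from this client."""
    st = os.stat(path)
    # Groups of the calling process as resolved on this client node.
    my_groups = set(os.getgroups()) | {os.getegid()}
    try:
        os.listdir(path)
        can_list = True
    except PermissionError:
        can_list = False
    return {
        "path": path,
        "mode": stat.filemode(st.st_mode),
        "owner_gid": st.st_gid,
        "process_groups": sorted(my_groups),
        "gid_in_process_groups": st.st_gid in my_groups,
        "can_list": can_list,
    }
```

If gid_in_process_groups is true on every client but can_list differs
between client versions, that is the behaviour described above. The
process groups come from the client's own NSS lookup at login, so they
say nothing about what l_getidentity returns on the MDS.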

I've always thought that group lookups were *only* done by the MDS, so
I don't understand why the client version makes a difference here.

I wondered if anyone who understands this area of the codebase would
mind explaining the process by which group lookups are handled by both
the client and the server? I know that the MDT caches this information
for some time (by default I think it's 600 seconds) - does the client
cache this information also?

Is there a simple explanation for this that we are missing?

Kind regards,

Matt Raso-Barnett
HPCS, University of Cambridge

