[Lustre-discuss] problem with too many (default) ACLs on a directory

Frederik Ferner frederik.ferner at diamond.ac.uk
Wed May 12 05:15:57 PDT 2010


Hi,

We are having problems with ACLs at the moment. As far as we understand,
this is what happened.

We have a directory with 33 default ACL entries on it, in addition to 32
other ACL entries.

Our problem started when a user created a subdirectory in the directory
with the 33 default ACL entries. The mkdir succeeded, but the new
directory is now inaccessible. The number of non-default ACL entries on
the parent directory does not seem to matter.

I hope this will be clearer with a short example, taken from when I
managed to reproduce it on our test file system:

[bnh65367 at cs04r-sc-serv-06 testdir]$ getfacl .
# file: .
# owner: bnh65367
# group: bnh65367
user::rwx
group::rwx
group:dls_sysadmin:rwx
mask::rwx
other::r-x
default:user::rwx
default:group::rwx
default:group:dls_sysadmin:rwx
default:group:ltest1:r--
default:group:ltest2:r--
default:group:ltest3:r--
default:group:ltest4:r--
default:group:ltest5:r--
default:group:ltest6:r--
default:group:ltest7:r--
default:group:ltest8:r--
default:group:ltest9:r--
default:group:ltest10:r--
default:group:ltest11:r--
default:group:ltest12:r--
default:group:ltest13:r--
default:group:ltest14:r--
default:group:ltest15:r--
default:group:ltest16:r--
default:group:ltest17:r--
default:group:ltest18:r--
default:group:ltest19:r--
default:group:ltest20:r--
default:group:ltest21:r--
default:group:ltest22:r--
default:group:ltest23:r--
default:group:ltest24:r--
default:group:ltest25:r--
default:group:ltest26:r--
default:group:ltest27:r--
default:group:ltest28:r-x
default:mask::rwx
default:other::r-x
[bnh65367 at cs04r-sc-serv-06 testdir]$ mkdir testdir3
[bnh65367 at cs04r-sc-serv-06 testdir]$ ls -ld testdir3
ls: testdir3: Numerical result out of range
[bnh65367 at cs04r-sc-serv-06 testdir]$ ls -l
total 8
drwxrwxr-x+ 2 bnh65367 bnh65367 4096 May 12 11:23 testdir
?---------  ? ?        ?           ?            ? testdir2
?---------  ? ?        ?           ?            ? testdir3
-rw-rw-r--+ 1 bnh65367 bnh65367    0 May 12 11:24 testfile
[bnh65367 at cs04r-sc-serv-06 testdir]$ stat testdir3
stat: cannot stat `testdir3': Numerical result out of range
[bnh65367 at cs04r-sc-serv-06 testdir]$

(The other entries in there were created when the number of default
ACLs was only 32.)
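
For anyone who wants to check the count in the listing above, here is a
minimal Python sketch that tallies access and default entries from
getfacl text output. The function name count_acl_entries and the SAMPLE
string are made up for illustration; SAMPLE is rebuilt from the listing
above rather than captured from a live system.

```python
def count_acl_entries(getfacl_output):
    """Return (access_entries, default_entries) for getfacl text output."""
    access = 0
    default = 0
    for line in getfacl_output.splitlines():
        line = line.strip()
        # Skip blank lines and the "# file/owner/group" comment header.
        if not line or line.startswith("#"):
            continue
        if line.startswith("default:"):
            default += 1
        else:
            access += 1
    return access, default


# Rebuilt from the getfacl listing in this post (hypothetical helper data).
SAMPLE = "\n".join(
    ["# file: .", "# owner: bnh65367", "# group: bnh65367",
     "user::rwx", "group::rwx", "group:dls_sysadmin:rwx",
     "mask::rwx", "other::r-x",
     "default:user::rwx", "default:group::rwx",
     "default:group:dls_sysadmin:rwx"]
    + ["default:group:ltest%d:r--" % i for i in range(1, 28)]
    + ["default:group:ltest28:r-x", "default:mask::rwx", "default:other::r-x"]
)

if __name__ == "__main__":
    print(count_acl_entries(SAMPLE))
```

For the listing above this gives 5 access entries and 33 default
entries, which matches the numbers quoted in the text.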

We were also not able to create any files in that directory:
[bnh65367 at cs04r-sc-serv-06 testdir]$ touch testfile3
touch: cannot touch `testfile3': Numerical result out of range


The following log entry on the MDS seems related. I could not find any
error on the client.

May 12 12:50:56 cs04r-sc-mds02-01 kernel: LustreError: 3329:0:(handler.c:732:mds_pack_posix_acl()) buflen 260, get acl: -34
May 12 12:50:56 cs04r-sc-mds02-01 kernel: LustreError: 3329:0:(handler.c:732:mds_pack_posix_acl()) Skipped 3 previous similar messages

(-34 is -ERANGE)
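
The "buflen 260" in the log lines up with the 32-entry limit if one
assumes the buffer holds the ACL in the Linux POSIX ACL xattr layout: a
4-byte version header followed by 8 bytes per entry (tag, permissions,
id). The sketch below only does that arithmetic. That Lustre sizes this
particular buffer for exactly 32 entries is my assumption, not something
confirmed from the source, and the function names are made up.

```python
import errno
import struct

# Linux POSIX ACL xattr layout: little-endian, no padding.
HEADER = struct.Struct("<I")    # a_version
ENTRY = struct.Struct("<HHI")   # e_tag, e_perm, e_id


def acl_xattr_size(entries):
    """Bytes needed to encode an ACL with the given number of entries."""
    return HEADER.size + entries * ENTRY.size


def max_entries(buflen):
    """Largest number of ACL entries that fits into buflen bytes."""
    return (buflen - HEADER.size) // ENTRY.size


if __name__ == "__main__":
    print(acl_xattr_size(32), acl_xattr_size(33), max_entries(260))
    print(errno.ERANGE, errno.errorcode[errno.ERANGE])
```

Under that assumption, 32 entries need exactly 260 bytes and 33 entries
need 268, so fetching a 33-entry ACL into a 260-byte buffer would fail
with ERANGE, which is the -34 in the log.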

We found bug #17636, which seems related but is not quite the same issue
and is apparently fixed in the version we are using. In a test we were
able to apply up to 32 ACLs to a file; the 33rd ACL failed with the
message "Operation not supported".

Does anyone have any idea how we could regain access to these
directories? Just removing some of the ACLs did not work, as setfacl
seems to stat the directory first:

[bnh65367 at cs04r-sc-serv-06 testdir]$ setfacl -x group:ltest1: testdir3
setfacl: testdir3: Numerical result out of range
[bnh65367 at cs04r-sc-serv-06 testdir]$
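
To find out which entries are affected without reading through ls
output, something like the following Python sketch could be run on a
client. It lists the names in a directory whose lstat() fails with
ERANGE, which is the error stat and ls report above. The function name
find_unstatable is made up, and I am assuming that lstat() from Python
fails in the same way as stat(1) does here. It does not repair anything.

```python
import errno
import os


def find_unstatable(directory, err=errno.ERANGE):
    """Return sorted names in directory whose lstat() fails with errno err."""
    broken = []
    for name in sorted(os.listdir(directory)):
        try:
            os.lstat(os.path.join(directory, name))
        except OSError as exc:
            # Record only the error we are looking for; other failures
            # (for example a file removed meanwhile) are ignored.
            if exc.errno == err:
                broken.append(name)
    return broken


if __name__ == "__main__":
    for name in find_unstatable("."):
        print(name)
```

Run from inside the parent directory, this would be expected to print
testdir2 and testdir3 in the example above.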


This is with Lustre 1.6.7.2.ddn3.5 on both client and MDS. Both are
running RHEL5, if that makes a difference.

Kind regards,
Frederik

-- 
Frederik Ferner
Computer Systems Administrator		phone: +44 1235 77 8624
Diamond Light Source Ltd.		mob:   +44 7917 08 5110



