[Lustre-discuss] Strange bug: no handle for file close

Arne Brutschy arne.brutschy at ulb.ac.be
Fri Jul 3 03:41:24 PDT 2009


Hi all,

I recently created a test installation of Lustre on our cluster (Rocks
4.2.1, CentOS 4.7, Lustre 1.6.7.2). The setup is quite simple: 1 MGS/MDS
and 4 OSS, each with a single target.

I migrated each user's homedir from our raid5-nfs-shared head node to
the Lustre mount. I didn't have any problems; everything went fine (I
used the automounter for an easy transition). The setup has been running
fine for a week. Now I have added a new user -- still with the old
scripts, so I added the user first and tried to migrate him afterwards.

The result: the user cannot log in. Bash reports something like
"identifier removed"; apparently the user cannot read any file in his
home. Strangely, I can read and write all files fine as root. If I
revert the migration, the data is fine and the user can log in again.
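
As a side note, the "identifier removed" text looks like the standard
strerror string for an errno value, so it may help to decode the numeric
codes directly. The snippet below is only a minimal sketch in Python (it
is not part of our cluster scripts, and the helper name describe is made
up for illustration). It maps a negative return code, such as the -116
and -43 that appear in the MDS log further down, to its symbolic name on
the machine where it runs. On Linux x86, as far as I know, 43 is EIDRM
("Identifier removed") and 116 is ESTALE; other platforms may number
them differently.

```python
import errno
import os


def describe(rc):
    """Return (symbolic name, message) for a kernel-style return code.

    Kernel-style logs report errors as negative errno values, so the
    sign is dropped before the lookup. Unknown codes map to "UNKNOWN".
    """
    code = abs(rc)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# The two return codes taken from the MDS log in this mail.
for rc in (-116, -43):
    name, text = describe(rc)
    print(rc, name, text)
```

This should be run on a Linux node so that the numbering matches what
the MDS reports.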

On the MDS, I found the following messages in the log:

> LustreError: 3584:0:(mds_open.c:1567:mds_close()) @@@ no handle for file close ino 30863131: cookie 0xafec72814d4ff48a  req@f5f41400 x1257636/t0 o35->bb2a441c-fe74-2223-8d98-c2e40170b718@:0/0 lens 296/560 e 0 to 0 dl 1246615645 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 3583:0:(mds_open.c:1567:mds_close()) @@@ no handle for file close ino 34451538: cookie 0xafec728167e99feb  req@f3caee00 x1319105/t0 o35->98f257a6-4c8a-7f0b-25fc-d02a17efc2a6@:0/0 lens 296/560 e 0 to 0 dl 1246615646 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 3583:0:(mds_open.c:1567:mds_close()) Skipped 12 previous similar messages
> LustreError: 3584:0:(mds_open.c:1567:mds_close()) @@@ no handle for file close ino 30862821: cookie 0xafec728146cb237c  req@f35de600 x433424/t0 o35->3e8da5ff-20e6-19dd-f975-4837dc86654a@NET_0x200000affffd3_UUID:0/0 lens 296/560 e 0 to 0 dl 1246615648 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 3584:0:(mds_open.c:1567:mds_close()) Skipped 170 previous similar messages
> LustreError: 3584:0:(mds_open.c:1567:mds_close()) @@@ no handle for file close ino 30863197: cookie 0xafec728150bfaa7b  req@f6f6aa00 x1207616/t0 o35->b058f8e1-dc62-9e9d-f480-a38c8fe5f36d@:0/0 lens 296/560 e 0 to 0 dl 1246615651 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 3584:0:(mds_open.c:1567:mds_close()) Skipped 18 previous similar messages
> LustreError: 3583:0:(mds_open.c:1567:mds_close()) @@@ no handle for file close ino 34451538: cookie 0xafec728168149d03  req@f3779c00 x1072035/t0 o35->afec1fd6-045f-7d1e-50e5-bbd95a94f117@NET_0x200000affffc4_UUID:0/0 lens 296/560 e 0 to 0 dl 1246615656 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 3583:0:(mds_open.c:1567:mds_close()) Skipped 548 previous similar messages
> LustreError: 3584:0:(mds_open.c:1567:mds_close()) @@@ no handle for file close ino 34451538: cookie 0xafec728168b5e2b6  req@f7a4662c x318943/t0 o35->8b9a4c7c-d0ca-5a07-2fed-76b2c29a3953@:0/0 lens 296/560 e 0 to 0 dl 1246615664 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 3584:0:(mds_open.c:1567:mds_close()) Skipped 174 previous similar messages
> LustreError: 5072:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error (-116)  req@f5da4600 x458268/t0 o35->e99372df-e6db-3adb-0884-d238b1ef8a4e@:0/0 lens 296/560 e 0 to 0 dl 1246615672 ref 1 fl Interpret:/0/0 rc -116/0
> LustreError: 5072:0:(ldlm_lib.c:1643:target_send_reply_msg()) Skipped 1291 previous similar messages
> LustreError: 5126:0:(mds_open.c:1567:mds_close()) @@@ no handle for file close ino 30863061: cookie 0xafec72814a9a70e8  req@f37cbe00 x3264550/t0 o35->90ff7cc4-8f14-5c0e-90ce-1e7fb80533ce@:0/0 lens 296/560 e 0 to 0 dl 1246615680 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 5126:0:(mds_open.c:1567:mds_close()) Skipped 435 previous similar messages
> LustreError: 5148:0:(mds_open.c:1567:mds_close()) @@@ no handle for file close ino 34524483: cookie 0xafec728168c287d4  req@f3690c00 x383668/t0 o35->cc70305b-4ca6-dc64-6f55-97299fc52fd5@:0/0 lens 296/560 e 0 to 0 dl 1246615720 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 5148:0:(mds_open.c:1567:mds_close()) Skipped 51 previous similar messages
> LustreError: 3546:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error (-43)  req@f5f46c00 x3532/t0 o36->25465654-6df2-739c-a30a-e215b53e324e@:0/0 lens 344/296 e 0 to 0 dl 1246616320 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 3546:0:(ldlm_lib.c:1643:target_send_reply_msg()) Skipped 440 previous similar messages

The list grows with each attempted access. The inodes are those of
the files in the user's homedir. The OSS logs show no errors.
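
To get an overview of such an excerpt, the lines can be tallied per
inode. This is a rough sketch in Python, assuming the message formats
stay exactly as in the log above; the function name summarize and the
shortened sample lines are only for illustration. It counts "no handle
for file close" messages per inode, counts "processing error" codes, and
sums the "Skipped N previous similar messages" counters.

```python
import re
from collections import Counter

# Patterns for the three kinds of lines seen in the MDS log excerpt.
CLOSE_RE = re.compile(r"no handle for file close ino (\d+)")
ERROR_RE = re.compile(r"processing error \((-?\d+)\)")
SKIP_RE = re.compile(r"Skipped (\d+) previous similar messages")


def summarize(lines):
    """Return (per-inode counts, per-error-code counts, skipped total)."""
    inodes = Counter()
    errors = Counter()
    skipped = 0
    for line in lines:
        match = CLOSE_RE.search(line)
        if match:
            inodes[int(match.group(1))] += 1
            continue
        match = ERROR_RE.search(line)
        if match:
            errors[int(match.group(1))] += 1
            continue
        match = SKIP_RE.search(line)
        if match:
            skipped += int(match.group(1))
    return inodes, errors, skipped


# Shortened copies of lines from the log (everything after the cookie
# or error code is cut off).
sample = [
    "LustreError: 3584:0:(mds_open.c:1567:mds_close()) @@@ no handle for file close ino 30863131: cookie 0xafec72814d4ff48a",
    "LustreError: 3583:0:(mds_open.c:1567:mds_close()) @@@ no handle for file close ino 34451538: cookie 0xafec728167e99feb",
    "LustreError: 3583:0:(mds_open.c:1567:mds_close()) Skipped 12 previous similar messages",
    "LustreError: 3583:0:(mds_open.c:1567:mds_close()) @@@ no handle for file close ino 34451538: cookie 0xafec728168149d03",
    "LustreError: 5072:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error (-116)",
]

inodes, errors, skipped = summarize(sample)
for ino, count in inodes.most_common():
    print("ino", ino, "logged", count, "time(s)")
print("error codes:", dict(errors))
print("skipped messages:", skipped)
```

Because the patterns use search rather than match, the "> " quoting
prefix of a pasted log does not need to be stripped first.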

Does anyone have a clue why this happens, and why only with this user?
All other users are working fine.

Cheers
Arne
-- 
Arne Brutschy
Ph.D. Student                    Email    arne.brutschy(AT)ulb.ac.be
IRIDIA CP 194/6                  Web      iridia.ulb.ac.be/~abrutschy
Universite' Libre de Bruxelles   Tel      +32 2 650 3168
Avenue Franklin Roosevelt 50     Fax      +32 2 650 2715
1050 Bruxelles, Belgium          (Fax at IRIDIA secretary)



