[Lustre-discuss] possibly corrupt stat() returns

John White jwhite at lbl.gov
Tue May 31 15:25:17 PDT 2011


Hello Folks,
	We have a Perl script that performs our purges based on the atime returned from a stat() call.  Over the weekend, it appears, the script got back millions of corrupted or misreported atimes and, lucky for us, unlinked a whole bunch of files (on the order of 45TB).  The only indication that anything might have happened was the following in the dmesg of the Lustre client that hosts the purge script:
*snip*
LustreError: 27061:0:(file.c:3312:ll_inode_revalidate_fini()) failure -2 inode 181017071
LustreError: 27061:0:(file.c:3312:ll_inode_revalidate_fini()) Skipped 19 previous similar messages
LustreError: 13381:0:(file.c:3312:ll_inode_revalidate_fini()) failure -2 inode 161973341
LustreError: 13381:0:(file.c:3312:ll_inode_revalidate_fini()) Skipped 25 previous similar messages
LustreError: 27061:0:(file.c:3312:ll_inode_revalidate_fini()) failure -2 inode 162433196
LustreError: 27061:0:(file.c:3312:ll_inode_revalidate_fini()) Skipped 32 previous similar messages
LustreError: 27061:0:(file.c:3312:ll_inode_revalidate_fini()) failure -2 inode 174530765
LustreError: 27061:0:(file.c:3312:ll_inode_revalidate_fini()) Skipped 33 previous similar messages
*snip*
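
For anyone reading the log lines above: assuming the "failure -2" value follows the usual Linux kernel convention of returning a negated errno (an assumption, not something confirmed here), it can be decoded with a couple of lines of Python. This is only a lookup sketch and says nothing about why the client hit the error.

```python
import errno
import os

# Absolute value of the "failure -2" reported by ll_inode_revalidate_fini().
# Assumption: the number is a negated Linux errno.
code = 2

name = errno.errorcode[code]   # symbolic name of the errno
text = os.strerror(code)       # human-readable description
print(name, "-", text)
```

On Linux, errno 2 is ENOENT ("No such file or directory").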

Any ideas, or experience with misreported atimes under Lustre?
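
To make the failure mode concrete, here is a minimal sketch of a more defensive age check. It is not the purge script described above (that one is Perl); it is in Python purely for illustration. The function names, the one-day clock-skew allowance, and the idea of requiring atime, mtime and ctime to all agree before a file is treated as purgeable are assumptions, not anything Lustre prescribes.

```python
import os
import time

DAY = 86400  # seconds per day


def timestamps_agree_old(atime, mtime, ctime, now, max_age_days):
    """Return True only if all three timestamps are plausible and old.

    Hypothetical helper: a single bogus atime cannot trigger a purge,
    because mtime and ctime must also be older than the cutoff.
    """
    cutoff = now - max_age_days * DAY
    times = (atime, mtime, ctime)
    # Reject implausible values: zero/negative, or more than a day in the
    # future (the one-day skew allowance is an arbitrary assumption).
    if any(t <= 0 or t > now + DAY for t in times):
        return False
    return all(t < cutoff for t in times)


def purge_candidate(path, max_age_days, now=None):
    """Return True if `path` looks safe to purge; never True on a stat error."""
    now = time.time() if now is None else now
    try:
        st = os.lstat(path)
    except OSError:
        # stat failed (ENOENT or anything else): skip the file.
        return False
    return timestamps_agree_old(st.st_atime, st.st_mtime, st.st_ctime,
                                now, max_age_days)
```

A check like this is deliberately conservative: a file whose metadata changed recently, or whose stat() call failed, is skipped even if its atime alone would qualify it. The cost is that some genuinely old files survive a purge pass, which is usually the cheaper mistake.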


----------------
John White
High Performance Computing Services (HPCS)
(510) 486-7307
One Cyclotron Rd, MS: 50B-3209C
Lawrence Berkeley National Lab
Berkeley, CA 94720



