[Lustre-discuss] Lustre locking up on login/interactive nodes

Brock Palen brockp at umich.edu
Mon Jul 21 09:04:54 PDT 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Jul 21, 2008, at 11:51 AM, Brian J. Murrell wrote:
> On Mon, 2008-07-21 at 11:43 -0400, Brock Palen wrote:
>> Every so often Lustre locks up. It will recover eventually. The
>> processes show themselves in 'D' (uninterruptible I/O wait). In this
>> case it was 'ar' making an archive.
>>
>> Dmesg then shows:
>
> Syslog is usually a better place to get messages from as it gives some
> context as to the time of the messages.

OK, will keep that in mind. It looks the same, though. It's odd: if I
log in to the same machine I can move to that directory, list the
files, etc., and read files on those OSTs. Note that this was an
eviction by the MDS.

I see no lost network connections or network errors. Strange, and not
good at all.
The syslog data is the same; it's below:

Brock


Jul 21 11:38:39 nyx-login1 kernel: Lustre: nobackup-MDT0000-mdc-00000101fc467800: Connection to service nobackup-MDT0000 via nid 141.212.30.184@tcp was lost; in progress operations using this service will wait for recovery to complete.
Jul 21 11:38:39 nyx-login1 kernel: LustreError: 167-0: This client was evicted by nobackup-MDT0000; in progress operations using this service will fail.
Jul 21 11:38:39 nyx-login1 kernel: LustreError: 17575:0:(client.c:519:ptlrpc_import_delay_req()) @@@ IMP_INVALID  req@0000010189e2f400 x912452/t0 o101->nobackup-MDT0000_UUID@141.212.30.184@tcp:12 lens 488/768 ref 1 fl Rpc:P/0/0 rc 0/0
Jul 21 11:38:39 nyx-login1 kernel: LustreError: 17575:0:(mdc_locks.c:423:mdc_finish_enqueue()) ldlm_cli_enqueue: -108
Jul 21 11:38:39 nyx-login1 kernel: LustreError: 27076:0:(client.c:519:ptlrpc_import_delay_req()) @@@ IMP_INVALID  req@00000101ed528a00 x912464/t0 o101->nobackup-MDT0000_UUID@141.212.30.184@tcp:12 lens 440/768 ref 1 fl Rpc:/0/0 rc 0/0
Jul 21 11:38:39 nyx-login1 kernel: LustreError: 27076:0:(mdc_locks.c:423:mdc_finish_enqueue()) ldlm_cli_enqueue: -108
Jul 21 11:38:39 nyx-login1 kernel: LustreError: 27489:0:(file.c:97:ll_close_inode_openhandle()) inode 12653753 mdc close failed: rc = -108
Jul 21 11:38:39 nyx-login1 kernel: LustreError: 27489:0:(file.c:97:ll_close_inode_openhandle()) inode 12195682 mdc close failed: rc = -108
Jul 21 11:38:39 nyx-login1 kernel: LustreError: 27489:0:(file.c:97:ll_close_inode_openhandle()) Skipped 46 previous similar messages
Jul 21 11:38:39 nyx-login1 kernel: Lustre: nobackup-MDT0000-mdc-00000101fc467800: Connection restored to service nobackup-MDT0000 using nid 141.212.30.184@tcp.
Jul 21 11:38:39 nyx-login1 kernel: LustreError: 11-0: an error occurred while communicating with 141.212.30.184@tcp. The mds_close operation failed with -116
Jul 21 11:38:39 nyx-login1 kernel: LustreError: 11-0: an error occurred while communicating with 141.212.30.184@tcp. The mds_close operation failed with -116
Jul 21 11:38:39 nyx-login1 kernel: LustreError: 26930:0:(file.c:97:ll_close_inode_openhandle()) inode 11441446 mdc close failed: rc = -116
Jul 21 11:38:39 nyx-login1 kernel: LustreError: 26930:0:(file.c:97:ll_close_inode_openhandle()) Skipped 113 previous similar messages
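
For anyone reading this log later: the negative numbers at the end of the LustreError lines are Linux errno values. On Linux, 108 is ESHUTDOWN and 116 is ESTALE. A minimal sketch for tallying them from syslog lines might look like the following. The function name, the regular expression, and the hardcoded errno table are my own assumptions for illustration, not anything Lustre provides, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Linux errno values seen in the log above. Hardcoded (an assumption for
# this sketch) so the result does not depend on the local platform's table.
LINUX_ERRNO = {
    108: "ESHUTDOWN",  # cannot send after transport endpoint shutdown
    116: "ESTALE",     # stale file handle
}

# Matches a negative return code at the end of a LustreError line, e.g.
# "rc = -108", "failed with -116", or "ldlm_cli_enqueue: -108".
RC_PATTERN = re.compile(r"(?:rc = |failed with |ldlm_cli_enqueue: )-(\d+)\s*$")


def summarize_return_codes(lines):
    """Count how many log lines end in each errno, keyed by errno name."""
    counts = Counter()
    for line in lines:
        match = RC_PATTERN.search(line)
        if match:
            code = int(match.group(1))
            counts[LINUX_ERRNO.get(code, "errno %d" % code)] += 1
    return counts


sample = [
    "Jul 21 11:38:39 nyx-login1 kernel: LustreError: 17575:0:(mdc_locks.c:423:mdc_finish_enqueue()) ldlm_cli_enqueue: -108",
    "Jul 21 11:38:39 nyx-login1 kernel: LustreError: 27489:0:(file.c:97:ll_close_inode_openhandle()) inode 12653753 mdc close failed: rc = -108",
    "Jul 21 11:38:39 nyx-login1 kernel: LustreError: 11-0: an error occurred while communicating with 141.212.30.184@tcp. The mds_close operation failed with -116",
    "Jul 21 11:38:39 nyx-login1 kernel: LustreError: 26930:0:(file.c:97:ll_close_inode_openhandle()) Skipped 113 previous similar messages",
]

if __name__ == "__main__":
    for name, count in sorted(summarize_return_codes(sample).items()):
        print(name, count)
```

In this log the -108 lines are logged before the "Connection restored" message and the -116 lines after it. Keep in mind that the "Skipped N previous similar messages" lines mean the real number of failures is higher than what a tally like this reports.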

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (Darwin)

iD8DBQFIhLOqMFCQB4Bvz5QRAgWvAJ9HhQAo9JZdcS2iyMFb19HzcgkwcQCdGosB
sHaligENGxnJHdMu5116D5U=
=GOlg
-----END PGP SIGNATURE-----


