[Lustre-discuss] LustreError: 5920:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error (-19)

Roger Spellman Roger.Spellman at terascala.com
Sun Aug 8 12:19:11 PDT 2010


Hi,

We have a customer that is down right now, and is not able to run jobs.

The file system was up and running fine for weeks.  Then, on 8/1, one of the OSTs was having IB problems.  I unmounted the OST, fixed the IB problem (reseated the cable), then tried mounting the OST, but the mount never completed.

I tried a soft reboot, but that failed.  So, I did a hard reboot.

When it came back up, I ran tunefs.lustre, and it gave this strange output:

checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata
   Read previous values:
Target:
Index:      unassigned
Lustre FS:  lustre
Mount type: ldiskfs
Flags:      0x70
              (needs_index first_time update )
Persistent mount opts:
Parameters:
tunefs.lustre: exiting with 22 (Invalid argument)


I realized that something was corrupted.  So, I mounted it as ldiskfs, removed the last_rcvd file, and ran:

tunefs.lustre --verbose --erase-param --mgsnode=192.168.2.11@o2ib --mgsnode=192.168.2.12@o2ib \
--writeconf --fsname=tslstr --ost --index=1 /dev/mapper/map0

I then ran tunefs.lustre on this node, and it looked as expected, namely:

checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata
   Read previous values:
Target:     tslstr-OST0001
Index:      1
Lustre FS:  tslstr
Mount type: ldiskfs
Flags:      0x142
              (OST update writeconf )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=192.168.2.11@o2ib mgsnode=192.168.2.12@o2ib


   Permanent disk data:
Target:     tslstr-OST0001
Index:      1
Lustre FS:  tslstr
Mount type: ldiskfs
Flags:      0x142
              (OST update writeconf )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=192.168.2.11@o2ib mgsnode=192.168.2.12@o2ib

exiting before disk write.
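
For reference, the Flags values in the two tunefs.lustre outputs above can be decoded by hand. The following is only a minimal Python sketch, not part of Lustre: the helper name decode_ldd_flags and the table KNOWN_FLAGS are hypothetical, and the bit-to-name pairs are inferred solely from the two outputs shown (0x70 printed as "needs_index first_time update", 0x142 printed as "OST update writeconf"), assuming the names are printed in ascending bit order. Other flag bits exist and are deliberately not covered.

```python
# Bit-to-name table inferred from the two tunefs.lustre outputs above.
# Assumption: tunefs.lustre prints flag names in ascending bit order.
# Bits that do not appear in those outputs are intentionally left out.
KNOWN_FLAGS = {
    0x002: "OST",
    0x010: "needs_index",
    0x020: "first_time",
    0x040: "update",
    0x100: "writeconf",
}


def decode_ldd_flags(flags):
    """Return (names, unknown_bits) for a tunefs.lustre Flags value."""
    names = [name for bit, name in sorted(KNOWN_FLAGS.items()) if flags & bit]
    known_mask = 0
    for bit in KNOWN_FLAGS:
        known_mask |= bit
    # Any bit outside the table is reported rather than guessed at.
    return names, flags & ~known_mask


if __name__ == "__main__":
    for value in (0x70, 0x142):
        names, unknown = decode_ldd_flags(value)
        print(hex(value), " ".join(names), "unknown bits:", hex(unknown))
```

Read this way, the first output (0x70) has no server-type bit set and still carries first_time and needs_index, while the second (0x142) identifies the target as an OST with writeconf pending.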


I remounted the OST, and I am now getting:


LDISKFS-fs: file extents enabled
LDISKFS-fs: mballoc enabled
Lustre: MGC192.168.2.11@o2ib: Reactivating import
LustreError: 137-5: UUID 'tslstr-OST0001_UUID' is not available  for connect (no target)
LustreError: 5920:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error (-19)
req@ffff81011207c800 x913142/t0 o8-><?>@<?>:0/0 lens 304/0 e 0 to 0 dl 1281120645 ref 1 fl
Interpret:/0/0 rc -19/0
LustreError: 137-5: UUID 'tslstr-OST0001_UUID' is not available  for connect (no target)
LustreError: 5921:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error (-19)
req@ffff81011207c400 x913148/t0 o8-><?>@<?>:0/0 lens 304/0 e 0 to 0 dl 1281120670 ref 1 fl
Interpret:/0/0 rc -19/0
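
The numeric codes in the output above (rc -19 in the log, and exit code 22 from the first tunefs.lustre run) are standard errno values. As a small sketch, assuming Linux errno numbering, they can be looked up with Python's standard library; describe_rc is a hypothetical helper name used only for illustration.

```python
import errno
import os


def describe_rc(rc):
    """Map a return code (kernel-style negative or plain positive) to (symbol, message)."""
    code = abs(rc)
    # errno.errorcode maps numbers to symbolic names on the current platform.
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


if __name__ == "__main__":
    for rc in (-19, 22):
        symbol, message = describe_rc(rc)
        print(rc, symbol, message)
```

On Linux, 19 is ENODEV and 22 is EINVAL, the latter matching the "Invalid argument" text that tunefs.lustre printed.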

The customer is unable to make any progress right now.  Any help would be greatly appreciated.

Thanks.

Roger
