[Lustre-discuss] LustreError: 5920:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error (-19)
Roger Spellman
Roger.Spellman at terascala.com
Sun Aug 8 12:19:11 PDT 2010
Hi,
We have a customer that is down right now, and is not able to run jobs.
The file system was up and running fine for weeks. Then, on 8/1, one of the OSTs started having IB problems. I unmounted the OST, fixed the IB problem (reseated the cable), then tried mounting the OST, but the mount never completed.
I tried a soft reboot, but that failed. So, I did another hard reboot.
When it came back up, I ran tunefs.lustre, and it gave this strange output:
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata
Read previous values:
Target:
Index: unassigned
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x70
(needs_index first_time update )
Persistent mount opts:
Parameters:
tunefs.lustre: exiting with 22 (Invalid argument)
I realized that something was corrupted. So, I mounted it as ldiskfs, removed the last_rcvd file, and ran:
tunefs.lustre --verbose --erase-param --mgsnode=192.168.2.11@o2ib --mgsnode=192.168.2.12@o2ib --writeconf --fsname=tslstr --ost --index=1 /dev/mapper/map0
I then ran tunefs.lustre on this node, and it looked as expected, namely:
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata
Read previous values:
Target: tslstr-OST0001
Index: 1
Lustre FS: tslstr
Mount type: ldiskfs
Flags: 0x142
(OST update writeconf )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=192.168.2.11@o2ib mgsnode=192.168.2.12@o2ib
Permanent disk data:
Target: tslstr-OST0001
Index: 1
Lustre FS: tslstr
Mount type: ldiskfs
Flags: 0x142
(OST update writeconf )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=192.168.2.11@o2ib mgsnode=192.168.2.12@o2ib
exiting before disk write.
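As a side note on reading the two dumps above: the Flags value is a bitmask, and the names in parentheses are its decoded bits. The following is a minimal Python sketch of that decoding. The bit values are an assumption based on the LDD_F_* definitions in lustre_disk.h for Lustre releases of this period, so check them against the source tree actually in use. The function name decode_ldd_flags is hypothetical and is not part of any Lustre tool.

```python
# Assumed LDD_F_* bit values (verify against lustre_disk.h for your release).
# The names match the words tunefs.lustre prints in parentheses.
LDD_FLAGS = [
    (0x0001, "MDT"),
    (0x0002, "OST"),
    (0x0004, "MGS"),
    (0x0010, "needs_index"),
    (0x0020, "first_time"),
    (0x0040, "update"),
    (0x0080, "rewrite_ldd"),
    (0x0100, "writeconf"),
]


def decode_ldd_flags(flags):
    """Return the names of the bits set in a mountdata Flags value."""
    names = [name for bit, name in LDD_FLAGS if flags & bit]
    known = sum(bit for bit, _ in LDD_FLAGS)
    unknown = flags & ~known
    if unknown:
        # Report any bits this table does not cover instead of hiding them.
        names.append("unknown(0x%x)" % unknown)
    return names


print(decode_ldd_flags(0x70))   # flags from the first dump
print(decode_ldd_flags(0x142))  # flags from the second dump
```

Under these assumed values, 0x70 decodes to needs_index, first_time and update with no target type bit set, while 0x142 decodes to OST, update and writeconf, which agrees with what tunefs.lustre printed in each case.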
I remounted the OST, and I am now getting:
LDISKFS-fs: file extents enabled
LDISKFS-fs: mballoc enabled
Lustre: MGC192.168.2.11@o2ib: Reactivating import
LustreError: 137-5: UUID 'tslstr-OST0001_UUID' is not available for connect (no target)
LustreError: 5920:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error (-19)
req@ffff81011207c800 x913142/t0 o8-><?>@<?>:0/0 lens 304/0 e 0 to 0 dl 1281120645 ref 1 fl Interpret:/0/0 rc -19/0
LustreError: 137-5: UUID 'tslstr-OST0001_UUID' is not available for connect (no target)
LustreError: 5921:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error (-19)
req@ffff81011207c400 x913148/t0 o8-><?>@<?>:0/0 lens 304/0 e 0 to 0 dl 1281120670 ref 1 fl Interpret:/0/0 rc -19/0
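For anyone reading the numeric codes in these messages: the rc values are negated Linux errno numbers, and the 22 that tunefs.lustre exited with earlier is a plain errno as well. The following is a minimal sketch using Python's standard errno module, assuming it is run on a Linux host so that the numbering matches the servers. The helper name describe_rc is hypothetical.

```python
import errno
import os


def describe_rc(rc):
    """Map a kernel-style return code (e.g. -19) to (errno name, message)."""
    code = abs(rc)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# -19 is the rc in the LustreError lines; 22 is the tunefs.lustre exit code.
for rc in (-19, 22):
    name, message = describe_rc(rc)
    print(rc, name, message)
```

On Linux, -19 maps to ENODEV, which is consistent with the "not available for connect (no target)" message logged just before it, and 22 maps to EINVAL, matching the "Invalid argument" text that tunefs.lustre printed.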
The customer is unable to make any progress right now. Any help would be greatly appreciated.
Thanks.
Roger