[Lustre-discuss] OSTs hanging while running IOR

Wed Sep 9 15:30:18 PDT 2009

I'm attaching the messages file (only the error part) so we don't have these mail formatting problems.

------

Can you provide a bit more of the log before the above so we can see what the stack trace is in reference to?  Also, try to
eliminate the white-space between lines.  Are you getting any other errors or messages from Lustre prior to that?

Perhaps you are getting some messages saying that various operations are "slow"?

>> Even if it is slow, the OST should still respond, right? Instead it "hangs".

Have you tuned these OSSes with respect to the number of OST threads needed to drive (and not over-drive) your disks?  The lustre
iokit is useful for that tuning.

>> OK, tuning for performance is fine, but hanging with 20 nodes (IOR over MPI) is strange, right?

b.

-----

I'm using 3 RAID 5 arrays with 8 disks each, and 256 OST threads on each OSS.

root@a02n00:~# cat /etc/mdadm.conf
ARRAY /dev/md10 level=raid5 num-devices=8 devices=/dev/dm-0,/dev/dm-1,/dev/dm-2,/dev/dm-3,/dev/dm-4,/dev/dm-5,/dev/dm-6,/dev/dm-7
ARRAY /dev/md11 level=raid5 num-devices=8 devices=/dev/dm-8,/dev/dm-9,/dev/dm-10,/dev/dm-11,/dev/dm-12,/dev/dm-13,/dev/dm-14,/dev/dm-15
ARRAY /dev/md12 level=raid5 num-devices=8 devices=/dev/dm-16,/dev/dm-17,/dev/dm-18,/dev/dm-19,/dev/dm-20,/dev/dm-21,/dev/dm-22,/dev/dm-23

All my OSTs were created with an internal journal (for test purposes).

mkfs.lustre --r --ost --fsname=work --mkfsoptions="-b 4096 -E stride=32,stripe-width=224 -m 0" \
    --mgsnid=a03n00@o2ib --mgsnid=b03n00@o2ib /dev/md[10|11|12]
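
As a sanity check on the -E values above: stride is normally the RAID chunk size expressed in filesystem blocks, and stripe-width is stride times the number of data-bearing disks. The sketch below is illustrative only. The 128 KiB chunk size is an assumption inferred from stride=32 with 4096-byte blocks (it is not stated in the mdadm.conf shown), and the function name is made up for this example.

```python
def ext_raid_options(chunk_kib, block_bytes, total_disks, parity_disks):
    """Return (stride, stripe_width) in filesystem blocks for mke2fs -E."""
    # stride: how many filesystem blocks fit in one RAID chunk
    stride = chunk_kib * 1024 // block_bytes
    # stripe-width: blocks in one full stripe, counting only data disks
    data_disks = total_disks - parity_disks
    return stride, stride * data_disks


# Assumed 128 KiB chunk, 4096-byte blocks, 8-disk RAID 5 (7 data + 1 parity)
stride, stripe_width = ext_raid_options(128, 4096, 8, 1)
print(f"-E stride={stride},stripe-width={stripe_width}")
# prints: -E stride=32,stripe-width=224
```

Under that assumption the result matches the OST mkfs options, so the OST alignment looks consistent with an 8-disk RAID 5 layout.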

I'm using separate MDT and MGS:

# MGS
mkfs.lustre --fsname=work --r --mgs --mkfsoptions="-b 4096 -E stride=4,stripe-width=4 -m 0" --mountfsoptions=acl \
    --failnode=b03n00@o2ib /dev/sdb1

# MDT
mkfs.lustre --fsname=work --r --mgsnid=a03n00@o2ib --mgsnid=b03n00@o2ib --mdt \
    --mkfsoptions="-b 4096 -E stride=4,stripe-width=40 -m 0" --mountfsoptions=acl --failnode=b03n00@o2ib /dev/sdc1

I'm using these packages on the server:
----------
root@a03n00:~# rpm -aq | grep -i lustre
lustre-modules-1.8.1-2.6.18_128.1.14.el5_lustre.1.8.1
lustre-client-modules-1.8.1-2.6.18_128.1.14.el5_lustre.1.8.1
lustre-ldiskfs-3.0.9-2.6.18_128.1.14.el5_lustre.1.8.1
kernel-lustre-headers-2.6.18-128.1.14.el5_lustre.1.8.1
kernel-lustre-2.6.18-128.1.14.el5_lustre.1.8.1
lustre-client-1.8.1-2.6.18_128.1.14.el5_lustre.1.8.1
kernel-lustre-devel-2.6.18-128.1.14.el5_lustre.1.8.1
lustre-1.8.1-2.6.18_128.1.14.el5_lustre.1.8.1
kernel-ib-1.4.1-2.6.18_128.1.14.el5_lustre.1.8.1
----------
On the client I've compiled kernel 2.6.18-128.el5 without INFINIBAND support,
then compiled OFED 1.4.1, and after that compiled the patchless client.
The patchless client was compiled with:
--ofa-kernel=/usr/src/ofa_kernel
----------

* THE ERROR

Running, for example:

root@b00n00:~# mpirun -hostfile ./lustre.hosts -np 20 /hpc/IOR -w -r -C -i 2 -b 1G -t 512k -F -o /work/stripe12/teste

makes the OSTs start "hanging", and then the whole filesystem "hangs".
Any attempt to rm or read a file (or to run df -kh) hangs forever (not even kill -9 helps).

Once that happens, I cannot umount my OSTs on the OSSs.
I have to reboot the server, and then my RAID arrays start resyncing.
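
To put the load in perspective, the IOR options above can be turned into rough numbers. This is only a back-of-the-envelope sketch, assuming IOR's size suffixes are binary (k = 1024, G = 1024^3) and the default of one segment per task; the helper name parse_size is hypothetical.

```python
def parse_size(text):
    """Convert an IOR-style size such as '512k' or '1G' to bytes (binary units assumed)."""
    units = {"k": 1024, "m": 1024**2, "g": 1024**3}
    suffix = text[-1].lower()
    if suffix in units:
        return int(text[:-1]) * units[suffix]
    return int(text)


tasks = 20                     # mpirun -np 20
block = parse_size("1G")       # -b 1G: data written by each task per pass
transfer = parse_size("512k")  # -t 512k: size of each individual I/O call

transfers_per_task = block // transfer   # I/O calls per task per pass
aggregate_gib = tasks * block / 1024**3  # data written per pass, all tasks

print(transfers_per_task, aggregate_gib)
# prints: 2048 20.0
```

So each pass writes about 20 GiB in 512 KiB transfers, into 20 separate files because of -F, and then reads it back; with -i 2 the whole cycle runs twice.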

Tinoco