[Lustre-discuss] lustre client 1.6.5.1 hangs

Brian J. Murrell Brian.Murrell at Sun.COM
Thu Jul 10 10:35:57 PDT 2008


On Thu, 2008-07-10 at 10:25 +0200, Heiko Schroeter wrote:
> Hello,

Hi.

> OST lustre mkfs:
> "mkfs.lustre --param="failover.mode=failout" --fsname 
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Given this (above) parameter setting...

> scia --ost --mkfsoptions='-i 2097152 -E stride=16 -b 4096'
> --mgsnode=mds1lustre@tcp0 /dev/sdb"

> The following procedure hangs a client:
> 1) copy files to the lustre system
> 2) do a 'du -sh /mnt/testfs/willi' while copying
> 3) unmount an OST (here OST0003) while copying

Do you expect that the copy and du (which are both running at the same
time while you unmount the OST, right?) should both get EIOs?
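
For reference, the three quoted steps could be sketched roughly as the
script below. This is only an illustration of the sequence, not the
reporter's actual commands: /mnt/testfs/willi is taken from the report,
while SRC and OST_MNT are hypothetical placeholders for the source
directory on the client and the mount point of OST0003 on its OSS.

```shell
#!/bin/sh
# Sketch of the reproduction; only /mnt/testfs/willi comes from the report.
SRC="${SRC:-/tmp/testdata}"        # hypothetical source directory on the client
DST="${DST:-/mnt/testfs/willi}"    # Lustre directory named in the report
OST_MNT="${OST_MNT:-/mnt/ost3}"    # hypothetical mount point of OST0003 on the OSS

# Run on the client: start the copy in the background, then run du
# while the copy is still in progress.
client_steps() {
    cp -r "$SRC" "$DST" &          # step 1: copy files to the Lustre file system
    du -sh "$DST"                  # step 2: du while copying
    wait                           # wait for the background copy to finish
}

# Run on the OSS while the client copy is still running.
oss_step() {
    umount "$OST_MNT"              # step 3: unmount the OST
}
```

The two functions are meant to be called by hand on the respective
nodes, client_steps on the client and oss_step on the OSS.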

> Deactivating/Reactivating or remounting the OST does not have any effect on
> the 'du' job. The 'du' job (#29665 see process list below) and the
> corresponding lustre thread (#29694) cannot be killed manually.

The latter process (ll_sa_29665) is the statahead thread at work.

> What is the proper way (besides avoiding the use of 'du') to reactivate the
> client file system?

Well, in fact, with failout set, the du and the copy should both get an
EIO when they try to access the unmounted OST.

Can you get a stack trace (sysrq-t) on the client after you have
unmounted the OST and processes are hung/blocked?
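
A minimal sketch of one way to collect that, assuming a root shell on
the client. Writing "t" to /proc/sysrq-trigger asks the kernel to dump
every task's state and stack to the kernel log. The helper names and the
OUT path are made up for this example, and on a busy node the kernel
log buffer may be too small to hold the whole dump, in which case the
syslog file is the better place to look.

```shell
#!/bin/sh
# Sketch: capture a sysrq-t task dump on the hung client (run as root).
OUT="${OUT:-/tmp/sysrq-t.txt}"     # hypothetical output file

# Filter "pid stat wchan comm" lines on stdin, keeping only tasks in
# uninterruptible sleep (state D), which is how blocked tasks show up.
blocked_tasks() {
    awk '$2 ~ /^D/'
}

capture_sysrq_t() {
    # Ask the kernel to dump all task states and stacks to the kernel log.
    echo t > /proc/sysrq-trigger
    # Save the kernel log, which now includes the stack traces.
    dmesg > "$OUT"
}

# Usage on the client once du is stuck:
#   ps -eo pid,stat,wchan:32,comm | blocked_tasks
#   capture_sysrq_t
```

The ps filter is just a quick way to confirm which tasks are blocked
before sending the full trace.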

b.
