[Lustre-discuss] lustre client 1.6.5.1 hangs

Heiko Schroeter schroete at iup.physik.uni-bremen.de
Thu Jul 10 23:24:31 PDT 2008


On Thursday, 10 July 2008 19:35:57 you wrote:

Hi.
>
> > OST lustre mkfs:
> > "mkfs.lustre --param="failover.mode=failout" --fsname
>
>                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Given this (above) parameter setting...

Is 'failout' not OK? We would like to use it because we want to use the
Lustre system as a huge, expandable data archive. If one OST breaks down
and the data on it is destroyed, we can restore it.

> > scia --ost --mkfsoptions='-i 2097152 -E stride=16 -b
> > 4096' --mgsnode=mds1lustre at tcp0 /dev/sdb"
> >
> > The following procedure hangs a client:
> > 1) copy files to the lustre system
> > 2) do a 'du -sh /mnt/testfs/willi' while copying
> > 3) unmount an OST (here OST0003) while copying
>
> Do you expect that the copy and du (which are both running at the same
> time while you unmount the OST, right?

Right.

> ) should both get EIOs? 

Actually, I expect the client not to hang any job that accesses the file
system at that moment. If that requires an EIO and a KILL of that process,
that is fine by me.

> > What is the proper way (besides avoiding the use of 'du') to reactivate
> > the client file system ?
>
> Well, in fact the du and the copy should both EIO when they get to
> trying to write to the unmounted OST.
>
> Can you get a stack trace (sysrq-t) on the client after you have
> unmounted the OST and processes are hung/blocked?

I will get this done today. If the output is very large, can I zip it and
attach it?
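
For reference, a minimal sketch of one way such a sysrq-t trace could be
captured and compressed on the client. It assumes root access and a kernel
built with magic SysRq support; the function name, the output path and the
LOG_CMD override are hypothetical and only for illustration. If the kernel
ring buffer is small, the trace may be truncated there.

```shell
#!/bin/sh
# Sketch: dump all task stack traces via magic SysRq and save the kernel log.
# Assumptions: run as root on the hung client; magic SysRq is available;
# the default output path below is a hypothetical example.

capture_task_dump() {
    trigger="${1:-/proc/sysrq-trigger}"   # SysRq trigger file
    out="${2:-/tmp/sysrq-t.txt}"          # hypothetical output path
    log_cmd="${LOG_CMD:-dmesg}"           # command that prints the kernel log

    # 't' asks the kernel to log the state and stack of every task.
    echo t > "$trigger" || return 1

    # Save the kernel log, then compress it so it can be attached.
    $log_cmd > "$out" || return 1
    gzip -f "$out"                        # produces "$out.gz"
}

# Usage on the client, as root, after the OST has been unmounted
# and the processes are blocked:
#   capture_task_dump
```

The two positional arguments exist only so the function can be pointed at
other files; called without arguments it uses the real trigger file.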

Thank you.
Heiko


