[Lustre-discuss] Stalled autofs + lustre summary

Heiko Schröter schroete at iup.physik.uni-bremen.de
Mon Nov 23 05:36:57 PST 2009


On Friday 20 November 2009 14:15:17, Brian J. Murrell wrote:

> > When lustre is *NOT* mounted, a user can stall the client mount with 'ls /lustre_automount/myfile' (no asterisk after myfile!)
> 
> IOW, an invalid filename?

Yes, this behaviour is 100% reproducible with the lustre/autofs versions mentioned.

> > lustre: 1.6.6
> > vanilla-kernel 2.6.22.19
> 
> Ideally on one of the platforms you can download binary RPMs from us for
> (i.e. RHEL5 or SLES10)?

An upgrade to 1.8.x is scheduled for Jan/Feb 2010. Until then I cannot interrupt the system because of some important deadlines coming up.
We are tied to the Gentoo distribution, so a RHEL5/SLES10 kernel probably won't help.
Installing lustre from an RPM would probably not work either, because it is compiled against different libraries.

Are there any "killer" kernel options that are crucial for lustre+autofs?
Would it make any difference to update only a client? That could be done quite easily.

> > Nov 19 17:43:10 quadcore2 LustreError: 25321:0:(lib-move.c:111:lnet_try_match_md()) Matching packet from 12345-192.168.16.122 at tcp, match 776 length 1336 too big: 1272 left, 1272 allowed
> 
> I think this is the key to this issue.  There was one or more bugs
> around this symptom fixed in the 1.6.6-1.6.7 time frame.

Is it known whether that is fixed in 1.8.x.x?

We turned off autofs+lustre last week (week 47), and since then we have not had any problems with the fs.

Thanks and Regards
Heiko


