[Lustre-discuss] Issues with Lustre Client 1.8.4 and Server 1.8.1.1

Robin Humble robin.humble+lustre at anu.edu.au
Wed Oct 13 16:21:16 PDT 2010


On Wed, Oct 13, 2010 at 02:33:35PM -0700, Jagga Soorma wrote:
>Doing a ps just hangs on the system and I need to just close and reopen a
>session to the affected system.  The application (gsnap) is running from the
>lustre filesystem and doing all IO to the lustre fs.  Here is a strace of
>where ps hangs:

One possible cause of hung processes (that's not Lustre related) is the
kernel's VM subsystem tying itself in knots. Are your clients NUMA machines?
Is /proc/sys/vm/zone_reclaim_mode set to 0?

I guess this explanation is a bit unlikely if your only change is the
client kernel version. But you don't say what you changed it from, and
I'm not familiar with SLES, so the possibility is there. It's also an
easy fix (or actually a dodgy workaround) if that's the problem.
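
The two questions above (is the client a NUMA machine, and is zone reclaim
switched on) can be checked with a few lines of script. This is only a rough
sketch, not something from the original thread: it assumes a Linux client that
exposes NUMA nodes under /sys/devices/system/node, and the function names are
made up for illustration.

```python
import glob
import os

# Standard Linux locations; passed as defaults so the helpers can also be
# pointed at a copy of the files from another machine.
ZONE_RECLAIM_PATH = "/proc/sys/vm/zone_reclaim_mode"
NUMA_NODE_DIR = "/sys/devices/system/node"  # assumed sysfs layout


def read_zone_reclaim_mode(path=ZONE_RECLAIM_PATH):
    """Return the current zone_reclaim_mode as an int, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def count_numa_nodes(node_dir=NUMA_NODE_DIR):
    """Count nodeN directories under sysfs; 0 means no NUMA info was found."""
    pattern = os.path.join(node_dir, "node[0-9]*")
    return sum(1 for p in glob.glob(pattern) if os.path.isdir(p))


def zone_reclaim_suspect(mode, nodes):
    """True when there is more than one NUMA node and zone reclaim is enabled."""
    return mode is not None and mode != 0 and nodes > 1


if __name__ == "__main__":
    mode = read_zone_reclaim_mode()
    nodes = count_numa_nodes()
    print("NUMA nodes:", nodes)
    print("zone_reclaim_mode:", mode)
    if zone_reclaim_suspect(mode, nodes):
        print("zone reclaim is enabled on a multi-node client; worth ruling out")
    else:
        print("zone reclaim does not look like the culprit on this client")
```

If this reports a non-zero mode on a multi-node client, the workaround hinted
at is presumably to set the value to 0 as root (for example with
`sysctl -w vm.zone_reclaim_mode=0`) and see whether the hangs go away.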

--
Dr Robin Humble, HPC Systems Analyst, NCI National Facility


