[Lustre-discuss] Memory (?) problem with 1.8.1

Mon Oct 12 17:44:02 PDT 2009

Do you have the OSS read cache enabled?

Check out
https://bugzilla.lustre.org/show_bug.cgi?id=20778
and
https://bugzilla.lustre.org/show_bug.cgi?id=18571
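
To compare your OSSs against those reports, it may help to log the "Buffers"
value from /proc/meminfo at intervals and check whether it ever drops. The
following is only a minimal Python sketch of that idea, not a tested
monitoring tool. The function names, the sample count, and the sampling
interval are assumptions made for illustration.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are
    plain counts, so the unit suffix is simply ignored here.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def never_decreases(samples):
    """Return True if each sample is >= the one before it."""
    return all(b >= a for a, b in zip(samples, samples[1:]))


def sample_buffers(path="/proc/meminfo", count=5, interval=60.0):
    """Read the Buffers value `count` times, `interval` seconds apart.

    `count` and `interval` are arbitrary illustrative defaults.
    """
    samples = []
    for i in range(count):
        with open(path) as f:
            samples.append(parse_meminfo(f.read())["Buffers"])
        if i < count - 1:
            time.sleep(interval)
    return samples
```

Calling never_decreases(sample_buffers()) on an OSS during sustained I/O,
and again after I/O has stopped, would show whether "Buffers" behaves the
way the message below describes.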

David

David Simas wrote:
> Hello,
> 
> We have a Lustre 1.8.1 file system about 60 TB in size running on
> RHEL 5 x86_64.  (I can provide hardware details if anyone thinks
> they'd be relevant.)  We are seeing memory problems after several
> days of sustained I/O into that file system.  We are writing from
> a small number of clients (4 - 5) at an average rate of 50 MB/s, with
> peaks of 350 MB/s.  We read all the data at least twice before deleting
> them.  During this operation, we notice the value of "buffers"
> reported in '/proc/meminfo' on the OSSs involved increasing monotonically
> until it apparently takes up all the system's memory - 32 GB.  Then 'kswapd'
> starts consuming a large amount of CPU, the load increases (100+), and the
> system, including Lustre, slows to a crawl and becomes quite useless.  If we
> stop Lustre I/O at this point, 'kswapd' and the system load calm down, but
> the "buffers" value does not decrease.  Any subsequent I/O on the system
> (dd if=/dev/urandom of=/tmp/test ...) will cause 'kswapd' to run away
> again.  We have observed the monotonically increasing "buffers" condition
> with non-Lustre I/O on systems running the Lustre 1.8.1 kernel
> (2.6.18-128.1.14.el5_lustre.1.8.1), but we haven't gotten them to the point
> where 'kswapd' goes wild.
> 
> Has anybody else seen anything like this?
> 
> David Simas
> SLAC