[lustre-discuss] MDS crashing: unable to handle kernel paging request at 00000000deadbeef (iam_container_init+0x18/0x70)

Wed Apr 13 09:47:12 PDT 2016

> On Apr 12, 2016, at 6:46 PM, Mark Hahn <hahn at mcmaster.ca> wrote:
> 
> all our existing Lustre MDSes run happily with vm.zone_reclaim_mode=0,
> and making this one consistent appears to have resolved a problem
> (in which one family of Lustre kernel threads would appear to spin,
> with "perf top" showing nearly all time spent in spinlock_irq, iirc.)
> 
> might your system have had a *lot* of memory?  ours tend to be fairly modest (32-64G, dual-socket intel.)

I have 64 GB on my servers.
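
For anyone wanting to compare the vm.zone_reclaim_mode setting across
servers, something like the following could be used. It is only a rough
sketch: the helper name read_zone_reclaim_mode is made up for illustration,
and it assumes the standard Linux procfs path for this sysctl.

```python
from pathlib import Path

# Assumption: standard Linux procfs location for the vm.zone_reclaim_mode sysctl.
ZONE_RECLAIM_PATH = "/proc/sys/vm/zone_reclaim_mode"


def read_zone_reclaim_mode(path=ZONE_RECLAIM_PATH):
    """Return vm.zone_reclaim_mode as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing (non-Linux host, no NUMA support) or unexpected contents.
        return None


if __name__ == "__main__":
    mode = read_zone_reclaim_mode()
    if mode is None:
        print("vm.zone_reclaim_mode is not readable on this host")
    elif mode == 0:
        print("vm.zone_reclaim_mode=0 (zone reclaim disabled)")
    else:
        print(f"vm.zone_reclaim_mode={mode} (zone reclaim enabled)")
```

The same value can be read with "sysctl vm.zone_reclaim_mode"; the script
only makes it easier to collect from several nodes at once.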

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu