[lustre-discuss] MDS crashing: unable to handle kernel paging request at 00000000deadbeef (iam_container_init+0x18/0x70)

Mohr Jr, Richard Frank (Rick Mohr) rmohr at utk.edu
Wed Apr 13 11:16:09 PDT 2016


> On Apr 13, 2016, at 8:02 AM, Tommi T <tommi_t77 at yahoo.com> wrote:
> 
> We had to use lustre-2.5.3.90 on the MDS servers because of memory leak.
> 
> https://jira.hpdd.intel.com/browse/LU-5726

Mark,

If you don’t have the patch for LU-5726, then you should definitely try to get it.  If nothing else, reading through the bug report might be useful.  It details some of the MDS OOM problems I had and mentions setting vm.zone_reclaim_mode=0.  It also has Robin Humble’s suggestion of setting "options libcfs cpu_npartitions=1" (which is something that I have started doing as well).
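
For reference, here is a rough sketch of how those two settings are usually made persistent on an MDS.  The file names are only assumptions (any file under /etc/sysctl.d/ or /etc/modprobe.d/ should work, and your site may already have its own); the two setting lines themselves are the ones mentioned above.

```
# /etc/sysctl.d/90-lustre-mds.conf   (hypothetical file name)
# Turn off NUMA zone reclaim, as discussed in LU-5726.
vm.zone_reclaim_mode = 0

# /etc/modprobe.d/lustre.conf        (hypothetical file name)
# Use a single CPU partition in libcfs (Robin Humble's suggestion).
options libcfs cpu_npartitions=1
```

The sysctl value can also be applied to a running system with "sysctl -w vm.zone_reclaim_mode=0" and checked in /proc/sys/vm/zone_reclaim_mode.  The libcfs option is a module parameter, so it should only take effect the next time the Lustre modules are loaded.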

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu


