[lustre-discuss] MDS crashing: unable to handle kernel paging request at 00000000deadbeef (iam_container_init+0x18/0x70)
Mohr Jr, Richard Frank (Rick Mohr)
rmohr at utk.edu
Wed Apr 13 11:16:09 PDT 2016
> On Apr 13, 2016, at 8:02 AM, Tommi T <tommi_t77 at yahoo.com> wrote:
>
> We had to use lustre-2.5.3.90 on the MDS servers because of a memory leak.
>
> https://jira.hpdd.intel.com/browse/LU-5726
Mark,
If you don’t have the patch for LU-5726, you should definitely try to get it. If nothing else, reading through the bug report might be useful: it details some of the MDS OOM problems I had and mentions setting vm.zone_reclaim_mode=0. It also has Robin Humble’s suggestion of setting "options libcfs cpu_npartitions=1" (which I started doing as well).
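
For anyone who wants to check whether a node already has these two settings, here is a minimal Python sketch. It is only an illustration: the helper names are made up for this example, and it assumes the module options live in a *.conf file under /etc/modprobe.d and that the kernel exposes /proc/sys/vm/zone_reclaim_mode (NUMA-enabled kernels do). It only reads the settings and does not change anything.

```python
from pathlib import Path


def zone_reclaim_disabled(proc_text):
    """Return True if the text read from zone_reclaim_mode is 0."""
    return int(proc_text.strip()) == 0


def libcfs_npartitions(modprobe_text):
    """Return cpu_npartitions from an 'options libcfs ...' line, or None."""
    for line in modprobe_text.splitlines():
        # Drop trailing comments, then split into whitespace-separated fields.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "options" and fields[1] == "libcfs":
            for opt in fields[2:]:
                key, _, value = opt.partition("=")
                if key == "cpu_npartitions":
                    return int(value)
    return None


if __name__ == "__main__":
    # Assumed locations; adjust for your distribution.
    proc = Path("/proc/sys/vm/zone_reclaim_mode")
    if proc.exists():
        print("zone_reclaim_mode is 0:", zone_reclaim_disabled(proc.read_text()))
    else:
        print("zone_reclaim_mode not exposed by this kernel")

    for conf in sorted(Path("/etc/modprobe.d").glob("*.conf")):
        try:
            value = libcfs_npartitions(conf.read_text())
        except OSError:
            continue  # unreadable file, skip it
        if value is not None:
            print(f"{conf}: cpu_npartitions={value}")
```

Note that the libcfs option is a module parameter, so a changed value would only take effect the next time the module is loaded.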
--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu