[lustre-discuss] MDS crashing: unable to handle kernel paging request at 00000000deadbeef (iam_container_init+0x18/0x70)
Mohr Jr, Richard Frank (Rick Mohr)
rmohr at utk.edu
Tue Apr 12 14:53:53 PDT 2016
> On Apr 12, 2016, at 4:49 PM, Mark Hahn <hahn at mcmaster.ca> wrote:
> Our problem seems to correlate with an unintentional creation of a tree of >500M files. Some of the crashes we've had since then appeared
> to be related to vm.zone_reclaim_mode=1. We also enabled quotas right after the 500M file thing, and were thinking that inconsistent
> quota records might cause this sort of crash.
Have you set vm.zone_reclaim_mode=0 yet? I had an issue with this on my file system a while back when it was set to 1.
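For anyone wanting to check the setting on a server, here is a minimal Python sketch. It assumes the standard Linux procfs path /proc/sys/vm/zone_reclaim_mode, which is only present on NUMA-enabled kernels. The function names are hypothetical helpers for illustration and are not part of Lustre or any existing tool.

```python
from pathlib import Path

# Standard procfs location for this sysctl (assumed: Linux, NUMA-enabled kernel).
ZONE_RECLAIM_PATH = Path("/proc/sys/vm/zone_reclaim_mode")


def read_zone_reclaim_mode(path=ZONE_RECLAIM_PATH):
    """Return the current value as an int, or None if the file does not exist."""
    try:
        return int(Path(path).read_text().strip())
    except FileNotFoundError:
        return None


def set_zone_reclaim_mode(value, path=ZONE_RECLAIM_PATH):
    """Write a new value. On a real system this requires root privileges."""
    Path(path).write_text(f"{value}\n")


if __name__ == "__main__":
    mode = read_zone_reclaim_mode()
    if mode is None:
        print("vm.zone_reclaim_mode is not available on this kernel")
    elif mode != 0:
        print(f"vm.zone_reclaim_mode={mode}; consider setting it to 0")
    else:
        print("vm.zone_reclaim_mode is already 0")
```

A value written this way does not survive a reboot. To make it persistent, the usual approach is a line such as "vm.zone_reclaim_mode = 0" in /etc/sysctl.conf or a file under /etc/sysctl.d/.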
Senior HPC System Administrator
National Institute for Computational Sciences