[lustre-discuss] Is anybody using an LBUG lnet upcall script?
Alexander Zarochentsev
alexander.zarochentsev at seagate.com
Mon Jul 25 12:23:21 PDT 2016
Hello,
there is an LBUG lnet upcall that is called right after LBUG() and
before entering panic().
It is asynchronous, but because of the memory allocation inside the
kernel's call_usermodehelper() it may hang, so that panic() is never
invoked. As a result, a sysadmin cannot get a crash dump for debugging
the issue that triggered the LBUG. The fact that lnet_upcall points to
a nonexistent path by default doesn't change anything; the system may
hang anyway.
I filed LU-8418 with a patch for that. The patch makes it possible to
skip any attempt to call the lnet upcall, which makes getting a crash
dump more reliable.
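To make the control flow concrete, here is a minimal user-space sketch
in Python. It is only a model of the sequence described above, not the
patch from LU-8418 and not Lustre code. The "NONE" sentinel, the
handle_lbug() function, and the injected run_upcall/panic callables are
hypothetical names chosen for illustration.

```python
"""User-space model of the LBUG path (illustrative only, not Lustre code)."""

# Hypothetical sentinel meaning "do not attempt the upcall at all".
UPCALL_DISABLED = "NONE"


def handle_lbug(upcall_path, run_upcall, panic):
    """Model the sequence LBUG() -> optional upcall -> panic().

    run_upcall stands in for the kernel's call_usermodehelper() and
    panic stands in for panic(). Both are injected so that the order of
    events can be observed. Returns the list of steps taken, in order.
    """
    steps = ["lbug"]
    if upcall_path != UPCALL_DISABLED:
        # If this call blocks (for example on memory allocation), the
        # panic step below is never reached and no crash dump is taken.
        run_upcall(upcall_path)
        steps.append("upcall")
    panic()
    steps.append("panic")
    return steps
```

With the upcall disabled, the model goes straight from the LBUG to the
panic step, so nothing that could block stands between the two.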
The question is whether the default behavior of libcfs_lnet_upcall()
should be changed so that it does not call lnet_upcall. The patch
contains such a change.
The idea behind changing the default behavior is that the Lustre source
does not contain any implementation of an lnet upcall, and the default
value of "lnet_upcall" points nowhere. I think very few Lustre
installations use their own LBUG lnet upcall script, while the others
just use the default settings and a non-working lnet upcall, with a
potential risk of panic() not being called after LBUG(), i.e. the
system does not reboot or does not produce a crash dump when expected.
So, does anybody really use an LBUG lnet upcall script nowadays?
Thanks,
--
Alexander Zarochentsev
Seagate Technology, LLC
www.seagate.com