[lustre-discuss] anybody is using lbug lnet upcall script?

Alexander Zarochentsev alexander.zarochentsev at seagate.com
Mon Jul 25 12:23:21 PDT 2016


Hello,

There is an LBUG lnet upcall that is called right after LBUG() and before
entering panic().
It is asynchronous, but because of a memory allocation inside the kernel's
call_usermodehelper() it may hang, so that panic() is never invoked. As a
result, a sysadmin cannot get a crash dump for debugging the original
issue. The fact that lnet_upcall points to a nonexistent path by
default doesn't change anything; the system may hang anyway.

I filed LU-8418 with a patch for that. The patch makes it possible to
skip any attempt to call the lnet upcall, which makes getting a crash
dump more reliable.

The question is whether the default behavior of libcfs_lnet_upcall()
should be changed so that it does not call lnet_upcall. The patch
contains such a change.
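
To make the ordering concrete, here is a rough model of the control flow
in Python. It is only a sketch and not the code from the patch: the
handle_lbug() function, its callbacks and the "NONE" sentinel are names
made up for illustration. The point is that panic() is reached only if
the upcall step returns, so skipping that step removes the place where
the hang can occur.

```python
# Hypothetical sentinel meaning "do not attempt the upcall".
# This value is an assumption for illustration, not taken from LU-8418.
SKIP_UPCALL = "NONE"


def handle_lbug(upcall_path, run_upcall, panic):
    """Toy model of the LBUG path: optional upcall first, then panic."""
    if upcall_path and upcall_path != SKIP_UPCALL:
        # In the kernel this step may block on a memory allocation;
        # if it never returns, panic() below is never reached.
        run_upcall(upcall_path)
    panic()


# Usage: record which steps run when the upcall is skipped.
events = []
handle_lbug(
    SKIP_UPCALL,
    lambda path: events.append(("upcall", path)),
    lambda: events.append(("panic",)),
)
print(events)
```

With the sentinel set, only the panic step is recorded; with a real path,
the upcall is attempted first and panic() depends on it returning.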

The idea behind changing the default behavior is that the Lustre source
doesn't contain any implementation of the lnet upcall, and the default
value of "lnet_upcall" points to nowhere. I think very few Lustre
installations use their own LBUG lnet upcall script, while other installs
just use the default settings and a non-working lnet upcall, with a
potential risk of not calling panic() after LBUG(), i.e. the system does
not reboot or does not produce a crash dump when expected.

I am curious: does anybody really use an LBUG lnet upcall script nowadays?

Thanks,

-- 
Alexander Zarochentsev
Seagate Technology, LLC
www.seagate.com

