[lustre-devel] lnet_upcall on LBUG & LU-8418

Oleg Drokin oleg.drokin at intel.com
Thu Sep 1 10:37:24 PDT 2016


On Aug 30, 2016, at 11:16 AM, Patrick Farrell wrote:

> Hello,
> 
> Currently, on LBUG, Lustre tries to call a usermode helper at '/usr/lib/lustre/lnet_upcall'.  This is for some sort of binary that a user would like executed before the LBUG itself (by default, a panic) happens.  Lustre does not include an lnet_upcall script, so by default, the call fails.

I think this is a throwback to prehistoric times, when BUG did not cause a panic
by default: a way to try and copy stuff off a node with no local storage
before it's killed.
Modern crashdumping has more or less superseded it.

> Unfortunately, in extremely low memory situations, the attempt to make this call can hang, resulting in a node which is in an invalid state but will not actually panic.  This is quite problematic as it can, for example, prevent failover or dump collection (for debugging purposes), depending on how a system is configured.
> 
> LU-8418 (from Alexander Zarochentsev) is looking to disable this by default.  As Andreas Dilger pointed out in the patch review (http://review.whamcloud.com/#/c/21440/), this would break any existing users who had put their script in that location.
> 
> But I suspect no one is actually using this feature.
> 
> So:
> Do you use (or know of anyone using) the lnet_upcall feature to call a binary before LBUG?  (I'm looking for end user uses; if a developer is using it, I think it's reasonable to ask them to set it manually.)

I think it's unused, so it should be safe to kill it, but let's see if anybody
does show up.

Bye,
    Oleg

