[Lustre-discuss] LustreError: server_bulk_callback

Andreas Dilger adilger at sun.com
Fri Sep 26 02:15:15 PDT 2008


On Sep 24, 2008  17:22 -0600, Nathan Dauchy wrote:
> We have 4 OSS nodes and 2 MDS nodes configured in HA pairs, running
> 2.6.18-53.1.14.el5_lustre.1.6.5smp, and using the o2ib network
> transport.  We had multiple failovers recently (possibly due to hardware
> problems, but no root cause yet) and managed to get things back again to
> what I _thought_ was a normal state.
> 
> However, in the system log we are seeing many "server_bulk_callback"
> error messages at the rate of ~6 per second.  Interestingly, they only
> come from one HA pair of OSS nodes:
> 
> Sep 24 23:03:14 lfs-oss-0-3 kernel: LustreError:
> 20694:0:(events.c:361:server_bulk_callback()) event type 4, status -103,
> desc ffff81019fce6000
> Sep 24 23:03:14 lfs-oss-0-3 kernel: LustreError:
> 20694:0:(events.c:361:server_bulk_callback()) event type 2, status -103,
> desc ffff81019fce6000
> Sep 24 23:03:16 lfs-oss-0-2 kernel: LustreError:
> 27257:0:(events.c:361:server_bulk_callback()) event type 4, status -103,
> desc ffff8101b52b8000
> Sep 24 23:03:16 lfs-oss-0-2 kernel: LustreError:
> 27257:0:(events.c:361:server_bulk_callback()) event type 2, status -103,
> desc ffff8101b52b8000
> 
> Can anyone direct me to documentation to decipher these messages?
> What does "server_bulk_callback" do, and does "status -103" indicate a
> severe problem for event types 2 and 4?

Lustre error numbers are the standard Linux errno values, as defined in
/usr/include/asm/errno.h.  In this case, -103 = -ECONNABORTED.  My guess
is that LNET is hitting some kind of networking issue, because the
Lustre filesystem code itself does not use that error.
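
For anyone decoding these by hand, here is a minimal sketch (not part of
Lustre) that pulls the "status" value out of a log line like the ones
quoted above and looks it up in the local errno table.  The helper names
decode_status and decode_log_line are hypothetical, and the lookup
assumes the script runs on Linux, since errno numbering differs on other
platforms.

```python
import errno
import os
import re

# Matches the "status -103" field seen in the server_bulk_callback messages.
STATUS_RE = re.compile(r"status (-?\d+)")


def decode_status(status):
    """Map a (usually negative) kernel-style status to (symbol, description)."""
    code = abs(status)
    # errno.errorcode maps numeric values to names on the local platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


def decode_log_line(line):
    """Return (status, symbol, description) for a log line, or None if no status."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    status = int(match.group(1))
    name, description = decode_status(status)
    return status, name, description


if __name__ == "__main__":
    # Log line taken from the report above, joined onto a single line.
    line = (
        "Sep 24 23:03:14 lfs-oss-0-3 kernel: LustreError: "
        "20694:0:(events.c:361:server_bulk_callback()) event type 4, "
        "status -103, desc ffff81019fce6000"
    )
    print(decode_log_line(line))
```

Piping the syslog through something like this makes it easier to see
whether all of the messages carry the same error or a mix of them.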

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



