[Lustre-discuss] Rx failures

Bernd Schubert bs_lists at aakef.fastmail.fm
Thu Feb 11 15:31:52 PST 2010


On Thursday 11 February 2010, Ulrich Sibiller wrote:
> Ulrich Sibiller wrote:
> > Feb 10 13:33:24 hpc9master02 kernel: LustreError:
> > 4475:0:(lib-move.c:2436:LNetPut()) Error sending PUT to
> > 12345-192.168.60.239@o2ib: -113
> >
> > Feb  2 16:08:19 hpc9oss1 kernel: Lustre:
> > 7937:0:(o2iblnd_cb.c:2220:kiblnd_passive_connect()) Conn stale
> > 192.168.60.226@o2ib [old ver: 12, new ver: 12]
> >
> > Feb  2 15:59:27 hpc9mds1 kernel: Lustre:
> > 5008:0:(o2iblnd_cb.c:2232:kiblnd_passive_connect()) Conn race
> > 192.168.60.226@o2ib
> 
> For the record: I finally found the source of these problems. We had two
> IPoIB interfaces in the fabric using the same IP address
> (192.168.60.226)...

I guess next time you should run lnet_selftest and "lctl ping".
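
A rough sketch of such a pre-flight check, in Python. It assumes you have
already collected a (host, interface, IP) inventory by some means. The host
names and the ib0 interface name below are made up for illustration; only the
two IP addresses come from the logs above. The helper names are hypothetical
too. It flags addresses claimed by more than one interface and builds, without
running it, one "lctl ping <nid>" command line per address.

```python
from collections import defaultdict


def find_duplicate_ips(inventory):
    """Return {ip: [(host, iface), ...]} for every IP claimed more than once."""
    owners = defaultdict(list)
    for host, iface, ip in inventory:
        owners[ip].append((host, iface))
    return {ip: who for ip, who in owners.items() if len(who) > 1}


def lctl_ping_commands(ips, net="o2ib"):
    """Build (but do not run) one 'lctl ping <nid>' argv per unique IP."""
    return [["lctl", "ping", f"{ip}@{net}"] for ip in sorted(set(ips))]


# Hypothetical inventory: host names and interface names are assumptions.
inventory = [
    ("nodeA", "ib0", "192.168.60.226"),
    ("nodeB", "ib0", "192.168.60.226"),  # same address as nodeA
    ("nodeC", "ib0", "192.168.60.239"),
]

duplicates = find_duplicate_ips(inventory)
commands = lctl_ping_commands(ip for _, _, ip in inventory)

for ip, who in duplicates.items():
    print(f"duplicate address {ip}: {who}")
for argv in commands:
    print(" ".join(argv))
```

The command lists could be handed to subprocess.run() on a Lustre node; that
step is left out here so the sketch runs anywhere.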


Greetings from Tübingen,
Bernd


-- 
Bernd Schubert
DataDirect Networks


