[lustre-discuss] REJ reason: stale connection

suresh babu sureshbabu at msystechnologies.com
Tue Jun 23 03:50:06 PDT 2015


Dear Team,

I have two OSS nodes with 4 OST targets mounted on each, 8 OST targets in
total. Whenever I power off or restart an OSS node, the OST targets
belonging to that node fail over to the other OSS node.

I am using the stonith_admin (fencing) command to restart the OSS nodes,
e.g. stonith_admin -B node1. Once the node has restarted successfully, the
OST targets belonging to that node are mounted on it again properly. But
sometimes I get the errors below after node1 restarts, and the OST targets
are not mounted at all:

scsi host7: ib_srp: Connection failed
scsi host8: ib_srp: Connection failed
scsi host9: ib_srp: Connection failed
scsi host10: ib_srp: Connection failed
scsi host11: ib_srp: Connection failed
scsi host12: ib_srp: Connection failed

scsi host7:   REJ reason: stale connection
scsi host7: ib_srp: retrying stale connection
scsi host7:   REJ reason: stale connection
scsi host7: ib_srp: retrying stale connection
scsi host7:   REJ reason: stale connection
scsi host7: ib_srp: retrying stale connection
scsi host7:   REJ reason: stale connection
scsi host7: ib_srp: giving up on stale connection
scsi host8:   REJ reason: stale connection
scsi host8: ib_srp: retrying stale connection
scsi host8:   REJ reason: stale connection
scsi host8: ib_srp: retrying stale connection
scsi host8:   REJ reason: stale connection
scsi host8: ib_srp: retrying stale connection
scsi host8:   REJ reason: stale connection
scsi host8: ib_srp: giving up on stale connection
scsi host9:   REJ reason: stale connection
scsi host9: ib_srp: retrying stale connection
scsi host9:   REJ reason: stale connection
scsi host9: ib_srp: retrying stale connection
scsi host9:   REJ reason: stale connection
scsi host9: ib_srp: retrying stale connection
scsi host9:   REJ reason: stale connection
scsi host9: ib_srp: giving up on stale connection
scsi host10:   REJ reason: stale connection
scsi host10: ib_srp: retrying stale connection
scsi host10:   REJ reason: stale connection
scsi host10: ib_srp: retrying stale connection
scsi host10:   REJ reason: stale connection
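
For reference, the messages above can be tallied per SCSI host with a short Python sketch. It only assumes that the kernel log is available as text in the same format as quoted here; the helper name summarize_stale is made up for illustration and is not part of any Lustre or SRP tool. It counts the "REJ reason: stale connection" rejections per host and notes which hosts reached "giving up on stale connection".

```python
import re
import sys
from collections import defaultdict

# Patterns match the ib_srp kernel messages exactly as quoted in this post.
REJECT = re.compile(r"scsi (host\d+):\s+REJ reason: stale connection")
GAVE_UP = re.compile(r"scsi (host\d+): ib_srp: giving up on stale connection")


def summarize_stale(log_text):
    """Return {host: {"rejects": int, "gave_up": bool}} for each SCSI host.

    Hypothetical helper: it only reads log text and changes nothing.
    """
    summary = defaultdict(lambda: {"rejects": 0, "gave_up": False})
    for line in log_text.splitlines():
        match = REJECT.search(line)
        if match:
            summary[match.group(1)]["rejects"] += 1
            continue
        match = GAVE_UP.search(line)
        if match:
            summary[match.group(1)]["gave_up"] = True
    return dict(summary)


if __name__ == "__main__":
    # Read the kernel log from standard input and print one line per host,
    # ordered by host number.
    result = summarize_stale(sys.stdin.read())
    for host in sorted(result, key=lambda name: int(name[4:])):
        info = result[host]
        state = "gave up" if info["gave_up"] else "no give-up message seen"
        print(f"{host}: {info['rejects']} stale rejects, {state}")
```

Saved under a file name of your choice (summarize_stale.py here is only an example), it can be fed from the kernel log with `dmesg | python3 summarize_stale.py`.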

Could I get help on avoiding these stale connections?

Regards,
Suresh Babu