[Lustre-discuss] 1.8.1(-ish) client vs. 1.6.7.2 server

Brian J. Murrell Brian.Murrell at Sun.COM
Wed Jul 15 08:59:54 PDT 2009


On Wed, 2009-07-15 at 11:22 -0400, Robin Humble wrote:
> all kernels are compiled with the rhel5 kernel tree's standard OFED.
> I think 1.3.2 is what's in rhel5.3/centos5.3?

Yeah, something like that IIRC.

> the error messages are just on the initial mount of the first lustre fs.
> subsequent mounts of other lustre fs's don't get any messages, so it
> seems like it's just an extremely noisy protocol/version negotiation
> the first time the 1.8.1 lnet fires up and tries to talk to 1.6.7.2
> servers??

Maybe one of our LNET experts has some additional information to
offer.

> another data point is that the above errors don't happen with
> 2.6.18-128.1.14.el5 patched with 1.8.0.1 and using the same in-kernel
> OFED, so it's probably something that's happened between 1.8.0.1 and
> 1.8.1-pre.
> or I guess it could be a rhel change between 2.6.18-128.1.14.el5 and
> 2.6.18-128.1.16.el5, but that seems less likely.
> I can spin up a 2.6.18-128.1.14.el5 with b_release_1_8_1 if you like...

Yeah, that would be a useful troubleshooting data point: whether the
same kernel on the clients and servers, with the different Lustre
versions, shows the same problem.  That would either implicate or rule
out the difference in OFED stacks.
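
The isolation test discussed above can be sketched as a small decision
table. This is only an illustration: the function name `diagnose` and
the result labels are hypothetical, and the two recorded observations
are the ones reported in this thread, not new measurements.

```python
# Observations reported in the thread, keyed by (client kernel, Lustre version).
# True means the noisy errors appeared on the first Lustre mount.
OBSERVED = {
    ("2.6.18-128.1.14.el5", "1.8.0.1"): False,
    ("2.6.18-128.1.16.el5", "1.8.1-pre"): True,
}


def diagnose(errors_on_old_kernel_new_lustre: bool) -> str:
    """Interpret the proposed test: 2.6.18-128.1.14.el5 + b_release_1_8_1.

    The kernel is held at the known-good version, so only Lustre changes.
    Returns a hypothetical label naming the more likely culprit.
    """
    if errors_on_old_kernel_new_lustre:
        # Errors follow the Lustre version: something changed
        # between 1.8.0.1 and 1.8.1-pre.
        return "lustre"
    # The newer Lustre is clean on the old kernel: suspect the rhel
    # change between 128.1.14 and 128.1.16 (including its OFED stack).
    return "kernel"


if __name__ == "__main__":
    for result in (True, False):
        print(f"errors={result}: suspect {diagnose(result)}")
```

Either outcome narrows the search to one of the two changes, which is
why the single extra build is worth doing.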

> cool. thanks for the explanation.

NP.

b.
