[Lustre-discuss] 1.8.1(-ish) client vs. 1.6.7.2 server

Robin Humble robin.humble+lustre at anu.edu.au
Tue Jul 21 08:05:57 PDT 2009


I added this to bugzilla.
  https://bugzilla.lustre.org/show_bug.cgi?id=20227

cheers,
robin

On Wed, Jul 15, 2009 at 01:09:33PM -0400, Robin Humble wrote:
>On Wed, Jul 15, 2009 at 08:46:12AM -0400, Robin Humble wrote:
>>I get a ferocious set of error messages when I mount a 1.6.7.2
>>filesystem on a b_release_1_8_1 client.
>>is this expected?
>
>just to annotate the below a bit in case it's not clear... sorry -
>should have done that in the first email :-/
>
>10.8.30.244 is MGS and one MDS, 10.8.30.245 is the other MDS in the
>failover pair. 10.8.30.201 -> 208 are OSSes (one OST per OSS), and the
>fs is mounted in the usual failover way, e.g.
>  mount -t lustre 10.8.30.244@o2ib:10.8.30.245@o2ib:/system /system
>
>from the below (and other similar logs) it kinda looks like the client
>fails and then renegotiates with all the servers.
>
>cheers,
>robin
>--
>Dr Robin Humble, HPC Systems Analyst, NCI National Facility
>
>>  Lustre: 13800:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.8.30.244@o2ib failed: 5
>>  Lustre: 13799:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.8.30.244@o2ib failed: 5
>>  Lustre: 615:0:(o2iblnd_cb.c:2384:kiblnd_reconnect()) 10.8.30.244@o2ib: retrying (version negotiation), 12, 11, queue_dep: 8, max_frag: 256, msg_size: 4096
>>  Lustre: MGC10.8.30.244@o2ib: Reactivating import
>>  Lustre: 13797:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.8.30.245@o2ib failed: 5
>>  Lustre: 13798:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.8.30.245@o2ib failed: 5
>>  Lustre: 615:0:(o2iblnd_cb.c:2384:kiblnd_reconnect()) 10.8.30.245@o2ib: retrying (version negotiation), 12, 11, queue_dep: 8, max_frag: 256, msg_size: 4096
>>  Lustre: Client system-client has started
>>  Lustre: 13798:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.8.30.201@o2ib failed: 5
>>  ... last message repeated 17 times ...
>>  Lustre: 615:0:(o2iblnd_cb.c:2384:kiblnd_reconnect()) 10.8.30.201@o2ib: retrying (version negotiation), 12, 11, queue_dep: 8, max_frag: 256, msg_size: 4096
>>  Lustre: 615:0:(o2iblnd_cb.c:2384:kiblnd_reconnect()) 10.8.30.202@o2ib: retrying (version negotiation), 12, 11, queue_dep: 8, max_frag: 256, msg_size: 4096
>>  Lustre: 13798:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.8.30.203@o2ib failed: 5
>>  Lustre: 615:0:(o2iblnd_cb.c:2384:kiblnd_reconnect()) 10.8.30.203@o2ib: retrying (version negotiation), 12, 11, queue_dep: 8, max_frag: 256, msg_size: 4096
>>  Lustre: 615:0:(o2iblnd_cb.c:2384:kiblnd_reconnect()) 10.8.30.204@o2ib: retrying (version negotiation), 12, 11, queue_dep: 8, max_frag: 256, msg_size: 4096
>>  Lustre: 13797:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.8.30.205@o2ib failed: 5
>>  Lustre: 615:0:(o2iblnd_cb.c:2384:kiblnd_reconnect()) 10.8.30.205@o2ib: retrying (version negotiation), 12, 11, queue_dep: 8, max_frag: 256, msg_size: 4096
>>  Lustre: 615:0:(o2iblnd_cb.c:2384:kiblnd_reconnect()) 10.8.30.206@o2ib: retrying (version negotiation), 12, 11, queue_dep: 8, max_frag: 256, msg_size: 4096
>>  Lustre: 615:0:(o2iblnd_cb.c:2384:kiblnd_reconnect()) 10.8.30.207@o2ib: retrying (version negotiation), 12, 11, queue_dep: 8, max_frag: 256, msg_size: 4096
>>  Lustre: 615:0:(o2iblnd_cb.c:2384:kiblnd_reconnect()) 10.8.30.208@o2ib: retrying (version negotiation), 12, 11, queue_dep: 8, max_frag: 256, msg_size: 4096
>>  Lustre: 13800:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.8.30.208@o2ib failed: 5
>>
>>looks like it succeeds in the end, but only after a struggle.
>>
>>I don't have any problems with 1.8.1 <-> 1.8.1 or 1.6.7.2 <-> 1.6.7.2.
>>
>>servers are rhel5 x86_64 2.6.18-92.1.26.el5 1.6.7.2 + bz18793 (group
>>quota fix).
>>client is rhel5 x86_64 patched 2.6.18-128.1.16.el5-b_release_1_8_1 from
>>cvs 20090712131220 + bz18793 again.
>>
>>BTW, should I be using cvs tag v1_8_1_RC1 instead of b_release_1_8_1?
>>I'm confused about which is closest to the final 1.8.1 :-/
>>
>>cheers,
>>robin
>>--
>>Dr Robin Humble, HPC Systems Analyst, NCI National Facility
>>_______________________________________________
>>Lustre-discuss mailing list
>>Lustre-discuss at lists.lustre.org
>>http://lists.lustre.org/mailman/listinfo/lustre-discuss
>_______________________________________________
>Lustre-discuss mailing list
>Lustre-discuss at lists.lustre.org
>http://lists.lustre.org/mailman/listinfo/lustre-discuss
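
For anyone comparing logs like the one quoted above, here is a minimal sketch that tallies, per server NID, how many "Rx ... failed" messages and how many "retrying (version negotiation)" messages a client log contains. It is not a Lustre tool: the function name summarize() and the regular expressions are assumptions based only on the message formats in the quoted log, and it accepts NIDs written either with "@" or in the list archive's " at " form.

```python
import re
from collections import Counter

# A NID as it appears in the quoted log: "10.8.30.244@o2ib", or the
# archive-obfuscated form "10.8.30.244 at o2ib". (Pattern is an assumption
# derived from the log above, IPv4-style addresses only.)
NID = r"(\d+\.\d+\.\d+\.\d+)(?:@| at )(\w+)"
RX_FAILED = re.compile(r"kiblnd_rx_complete\(\)\) Rx from " + NID + r" failed: (-?\d+)")
RETRY = re.compile(r"kiblnd_reconnect\(\)\) " + NID + r": retrying \(([^)]*)\)")
REPEAT = re.compile(r"last message repeated (\d+) times")


def summarize(lines):
    """Return (rx_failures, retries), two Counters keyed by 'addr@net'."""
    rx_failures = Counter()
    retries = Counter()
    last = None  # (counter, key) of the immediately preceding counted message
    for line in lines:
        m = RX_FAILED.search(line)
        if m:
            key = f"{m.group(1)}@{m.group(2)}"
            rx_failures[key] += 1
            last = (rx_failures, key)
            continue
        m = RETRY.search(line)
        if m:
            key = f"{m.group(1)}@{m.group(2)}"
            retries[key] += 1
            last = (retries, key)
            continue
        m = REPEAT.search(line)
        if m and last is not None:
            # syslog-style "last message repeated N times": credit the
            # repeats to whichever message was counted just before it.
            counter, key = last
            counter[key] += int(m.group(1))
            continue
        # Any other line breaks the "last message" chain.
        last = None
    return rx_failures, retries


if __name__ == "__main__":
    demo = [
        "Lustre: 13798:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.8.30.201@o2ib failed: 5",
        "... last message repeated 17 times ...",
        "Lustre: 615:0:(o2iblnd_cb.c:2384:kiblnd_reconnect()) 10.8.30.201@o2ib: "
        "retrying (version negotiation), 12, 11, queue_dep: 8, max_frag: 256, msg_size: 4096",
    ]
    rx, rt = summarize(demo)
    for nid in sorted(set(rx) | set(rt)):
        print(f"{nid}: rx_failed={rx[nid]} retries={rt[nid]}")
```

To run it over a real log, pass an open file (or any iterable of lines) to summarize(). Servers with Rx failures but no retry line, or the reverse, stand out in the output.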


