[lustre-discuss] Interoperability 2.12.7 client <-> 2.12.8 server

Hans Henrik Happe happe at nbi.dk
Mon Mar 7 00:05:22 PST 2022


Hi Thomas,

They should work together, but there are other requirements that need to 
be fulfilled:

https://wiki.lustre.org/Lustre_2.12.8_Changelog

I guess your servers are CentOS 7.9 as required for 2.12.8.

I had an issue with Rocky 8.5 and the latest kernel with 2.12.8. While 
RHEL 8.5 is supported, something changed in the kernels after 
4.18.0-348.2.1.el8_5 that caused problems. I found an LU ticket fixing 
it post 2.12.8 (can't remember the number), but downgrading to 
4.18.0-348.2.1.el8_5 was the quick fix.
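
To check whether a client runs a kernel newer than that known-good one,
a minimal sketch along these lines might help. The helper names and the
regular expression are only illustrative assumptions, not part of any
Lustre tooling, and the parsing assumes an EL-style release string such
as the output of `uname -r`:

```python
import re

# Kernel release reported above as working with 2.12.8 on EL 8.5.
KNOWN_GOOD = "4.18.0-348.2.1.el8_5"


def kernel_key(release):
    """Split an EL kernel release into comparable integer tuples.

    "4.18.0-348.2.1.el8_5.x86_64" -> ((4, 18, 0), (348, 2, 1))
    """
    m = re.match(r"(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)\.el\d", release)
    if m is None:
        raise ValueError("not an EL-style kernel release: %r" % release)
    version = tuple(int(p) for p in m.group(1).split("."))
    build = tuple(int(p) for p in m.group(2).split("."))
    return version, build


def is_newer_than_known_good(release):
    """True if `release` sorts after the known-good kernel."""
    return kernel_key(release) > kernel_key(KNOWN_GOOD)


if __name__ == "__main__":
    # Example inputs only; they say nothing about which kernels are affected.
    for rel in ("4.18.0-348.el8.x86_64", "4.18.0-348.7.1.el8_5.x86_64"):
        print(rel, is_newer_than_known_good(rel))
```

On a client, the argument could come from `platform.release()`. A True
result only means the kernel is newer than the one that worked here, not
that it is necessarily broken.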

Cheers,
Hans Henrik

On 03.03.2022 08.40, Thomas Roth via lustre-discuss wrote:
> Dear all,
>
> this might be just something I forgot or did not read thoroughly, but 
> shouldn't a 2.12.7 client work with 2.12.8 servers?
>
> The 2.12.8-changelog has the standard disclaimer
>> Interoperability Support:
>>    Clients & Servers: Latest 2.10.X and Latest 2.11.X
>
>
>
> I have this test cluster that I upgraded recently to 2.12.8 on the 
> servers.
>
> The first client I attached now is a fresh install of rhel 8.5 (Alma).
> I installed 'kmod-lustre-client' and 'lustre-client' from 
> https://downloads.whamcloud.com/public/lustre/lustre-2.12.8/el8.5.2111/
> I copied a directory containing ~5000 files - no visible issues
>
>
> The next client was also installed with rhel 8.5 (Alma), but now using 
> 'lustre-client-2.12.7-1' and 'lustre-client-dkms-2.12.7-1' from
> https://downloads.whamcloud.com/public/lustre/lustre-2.12.7/el8/client/RPMS/x86_64/ 
>
>
> As on my first client, I copied a directory containing ~5000 files. 
> The copy stalled, and the OSTs exploded in my face
>
>> kernel: LustreError: 23345:0:(events.c:310:request_in_callback()) event type 2, status -103, service ost_io
>> kernel: LustreError: 40265:0:(pack_generic.c:605:__lustre_unpack_msg()) message length 0 too small for magic/version check
>> kernel: LustreError: 40265:0:(sec.c:2217:sptlrpc_svc_unwrap_request()) error unpacking request from 12345-10.20.2.167@o2ib6 x1726208297906176
>> kernel: LustreError: 23345:0:(events.c:310:request_in_callback()) event type 2, status -103, service ost_io
>
>
> The latter message is repeated ad infinitum.
>
> The client log blames the network:
>> Request sent has failed due to network error
>>  Connection to was lost; in progress operations using this service 
>> will wait for recovery to complete
>
>> LustreError: 181316:0:(events.c:205:client_bulk_callback()) event type 1, status -103, desc 0000000086e248d6
>> LustreError: 181315:0:(events.c:205:client_bulk_callback()) event type 1, status -5, desc 00000000e569130f
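
The status values in these log lines are negative Linux errno codes. A
small sketch for decoding them follows; the table is hard-coded with the
Linux values so the result does not depend on the machine running the
script, and `decode_status` is a hypothetical helper name, not a Lustre
tool:

```python
import re

# Linux errno numbers for the two codes seen in the logs above.
# Hard-coded on purpose: other operating systems number them differently.
LINUX_ERRNO = {
    5: ("EIO", "Input/output error"),
    103: ("ECONNABORTED", "Software caused connection abort"),
}


def decode_status(line):
    """Return (name, description) for the 'status -N' field of a log line.

    Returns None if the line has no status field or the code is not in
    the table.
    """
    m = re.search(r"status (-?\d+)", line)
    if m is None:
        return None
    return LINUX_ERRNO.get(abs(int(m.group(1))))


if __name__ == "__main__":
    print(decode_status("event type 2, status -103, service ost_io"))
    print(decode_status("event type 1, status -5, desc 00000000e569130f"))
```

So -103 is ECONNABORTED and -5 is EIO, which only says that the bulk
transfer was aborted, not why.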
>
>
>
> There is also a client running Debian 9 and Lustre 2.12.6 (compiled 
> from git) - no trouble at all.
>
>
> Then I swapped those two rhel 8.5 clients: reinstalled the OS, gave the 
> first one the 2.12.7 packages and the second one the 2.12.8 packages. 
> The error followed the version: again the client running 
> 'lustre-client-dkms-2.12.7-1' immediately ran into trouble, causing 
> the same error messages in the logs.
> So this is not a network problem in the sense of broken hardware etc.
>
>
> What did I miss?
> Some important Jira I did not read?
>
>
> Regards
> Thomas
>
>
