[lustre-discuss] Pools disappeared after upgrade to Lustre 2.15.5
Etienne Aujames
eaujames at ddn.com
Tue Aug 6 09:00:56 PDT 2024
Hi,
*Question 1: Is it expected behavior that pools disappeared after the Lustre upgrade?*
No, this should not happen. Pools disappear only if you run a --writeconf on the targets. You should check the kernel logs to see what happened.
Maybe you have some kind of configuration corruption or mismatch. You can dump the MGS configuration logs with:
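As a rough sketch of that kernel-log check: Lustre kernel messages normally carry a "Lustre:" or "LustreError:" prefix, so they can be filtered out of dmesg. The helper name lustre_kmsg and the keyword list (pool, mgs, llog) are only illustrative assumptions, not a complete list of what to look for.

```shell
# Hypothetical helper: read kernel messages on stdin and keep only the
# Lustre ones that mention pools, the MGS or config logs.
# The keyword list is a guess; widen it if nothing relevant shows up.
lustre_kmsg() {
    grep -E 'Lustre(Error)?:' | grep -i -E 'pool|mgs|llog'
}

# Example, run by hand on the MGS/MDS and on a client:
#   dmesg -T | lustre_kmsg
```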
[root@mgs ~] debugfs -c -R 'dump CONFIGS/cluster-client /tmp/cluster-client' /dev/<mgt_device> && llog_reader /tmp/cluster-client
[root@mgs ~] debugfs -c -R 'dump CONFIGS/cluster-MDT0000 /tmp/cluster-MDT0000' /dev/<mgt_device> && llog_reader /tmp/cluster-MDT0000
...
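The same dump can be looped over several config logs, keeping only the records that mention pools. This is a minimal sketch: the function names are hypothetical, the MGT device is passed in by the caller, and because the llog_reader record layout differs between Lustre versions the filter only does a case-insensitive match on the word "pool".

```shell
# Hypothetical helper: print the lines of a decoded config log that
# mention pools (case-insensitive, no assumption about record layout).
pool_records() {
    grep -i 'pool' -- "$1"
}

# Hypothetical wrapper around the debugfs/llog_reader commands above.
# $1 is the MGT device, the remaining arguments are config log names.
dump_pool_records() {
    mgt_dev=$1
    shift
    for log in "$@"; do
        debugfs -c -R "dump CONFIGS/$log /tmp/$log" "$mgt_dev" &&
            llog_reader "/tmp/$log" > "/tmp/$log.txt" &&
            pool_records "/tmp/$log.txt"
    done
}

# Example, run by hand on the MGS:
#   dump_pool_records /dev/<mgt_device> cluster-client cluster-MDT0000
```

If no pool records show up in a log where they are expected, that points to the configuration logs rather than to the clients.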
*Question 2: Is it safe to ignore warning messages "OST ... not found in pool ..." when adding OSTs to the pool?*
Sometimes it takes a while to synchronize all clients and targets (for example because one or several nodes are unresponsive). But this warning can also hide an MGS communication issue, so you should check the kernel messages.
You can verify the client state with "lfs pool_list cluster.ssd" and the MDT state with "lctl pool_list cluster.ssd" (or "lctl get_param lov.cluster-*.pools.ssd" for both).
This kind of behavior will be fixed by 53202 <https://review.whamcloud.com/c/fs/lustre-release/+/53202>: LU-17308 <https://jira.whamcloud.com/browse/LU-17308> mgs: move pool_cmd check to the kernel.
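To compare the client and MDT views of a pool mentioned above, one option is to save both pool_list outputs and diff the member lists. A minimal sketch, assuming the output is a "Pool: ..." header followed by one OST UUID per line; the function names and the /tmp file names are hypothetical.

```shell
# Hypothetical helper: list the members in a saved pool_list output,
# dropping the "Pool: ..." header and blank lines, sorted for comparison.
pool_members() {
    grep -v '^Pool:' -- "$1" | sed '/^[[:space:]]*$/d' | sort
}

# Returns 0 when both saved outputs list the same members.
same_pool_view() {
    a=$(pool_members "$1")
    b=$(pool_members "$2")
    [ "$a" = "$b" ]
}

# Example, run by hand:
#   on a client:  lfs pool_list cluster.ssd  > /tmp/client-ssd.txt
#   on the MDS:   lctl pool_list cluster.ssd > /tmp/mdt-ssd.txt
#   same_pool_view /tmp/client-ssd.txt /tmp/mdt-ssd.txt || echo "views differ"
```

A difference right after pool_add may only mean that synchronization has not finished yet, so it is worth repeating the comparison before drawing conclusions.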
Etienne
On 8/1/24 14:00, Pavlo Khmel via lustre-discuss wrote:
> Hi,
>
> I upgraded Lustre from 2.12.8 to 2.15.5 (servers and clients). After the upgrade, I found that all Lustre pools had disappeared. There were 2 pools:
> - cluster.ssd
> - cluster.hdd
>
> [root@mds1 ~]# lctl pool_list cluster
> Pools from cluster:
>
> [root@mds1 ~]# lctl pool_list cluster.ssd
> Pool: cluster.ssd
> lctl pool_list: cannot open /proc/fs/lustre/lov/cluster-MDT0000-mdtlov/pools/ssd: No such file or directory (2)
>
> [root@mds1 ~]# ls -la /proc/fs/lustre/lov/cluster-MDT0000-mdtlov/pools/
> total 0
> dr-xr-xr-x 2 root root 0 Aug 1 10:49 .
> dr-xr-xr-x 4 root root 0 Aug 1 10:49 ..
>
> On the client side "lfs getstripe" still shows pool names.
>
> [root@login2 ~]# lfs getstripe /cluster
> . . .
> /cluster/home
> stripe_count: 1 stripe_size: 1048576 pattern: raid0 stripe_offset: -1 pool: hdd
> /cluster/apps
> stripe_count: 1 stripe_size: 1048576 pattern: raid0 stripe_offset: -1 pool: ssd
>
> I created pools again:
>
> [root@mds1 ~]# lctl pool_new cluster.ssd
> Pool cluster.ssd created
>
> [root@mds1 ~]# lctl pool_add cluster.ssd OST[0-3]
> Warning, OST cluster-OST0000_UUID not found in pool cluster.ssd
> OST cluster-OST0001_UUID added to pool cluster.ssd
> . . .
>
> Adding OSTs to the pool sometimes shows a warning message "OST ... not found in pool ...".
>
> Question 1: Is it expected behavior that pools disappeared after the Lustre upgrade?
> Question 2: Is it safe to ignore warning messages "OST ... not found in pool ..." when adding OSTs to the pool?
>
> Best regards,
> Pavlo Khmel