[lustre-discuss] Supporting more than 4 parity OST’s

Patrick Farrell pfarrell at ddn.com
Wed Apr 8 11:26:13 PDT 2026


Stepan,

You say OSS here, I think you mean OST - OSS is "object storage server", OST is "object storage target".

That aside, your note is clear.

So: 24 OSTs on 6 OSSes, 4 per server.  Your setup looks fine - you are likely hitting a bug.

Here, this is the most recent version - not all the bugs are fixed, but it is worth giving this one a try:
https://review.whamcloud.com/c/fs/lustre-release/+/64970

This includes an EC performance improvement and a number of bug fixes.

If you still hit that issue, can you please gather debug logs like this on the client:

lctl set_param *debug=-1 debug_mb=10000
lctl clear
lfs mirror resync [file]
lctl dk > /tmp/log

Then tar up the log file - it will be too big for the mailing list, but you could put it on Google Drive?
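For the tar step, something like the following should do (the log path matches the lctl dk command above; the archive name is just a suggestion):

```shell
# Compress the captured debug log before uploading.
# /tmp/log is the file written by "lctl dk > /tmp/log" above;
# the tarball name is arbitrary.
tar -czf /tmp/lustre-debug.tar.gz -C /tmp log
ls -lh /tmp/lustre-debug.tar.gz
```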

Also, how are you taking the nodes down?  What did you do to 'lose' them - shut them down, or ...?  Just helpful background.

Thanks!
________________________________
From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Stepan Beskrovnyy via lustre-discuss <lustre-discuss at lists.lustre.org>
Sent: Wednesday, April 8, 2026 9:18 AM
To: lustre-discuss at lists.lustre.org <lustre-discuss at lists.lustre.org>
Subject: [lustre-discuss] Supporting more than 4 parity OST’s

Hello everyone!

In my diploma project I use the WIP EC branch, and I need more parity OSS's than are currently supported (more than 4; in my case, 8).

I use a cluster topology with 24 OSS's spread across 6 servers, 4 OSS's per node. (Due to performance issues I can't use fewer; I previously started a thread about the client performance limitation, and I try to work around it by adding OSS's.) I want to be able to lose 2 of the 6 nodes (a requirement from my professor). In that case I need 8 parity OSS's, with an n+8 scheme (16+8, for example).

I found a way to use custom EC layouts with the --ec-expert flag and set a 16+8 layout. Then I killed 8 random OSS's, and the mirror resync function fails with the error "Resource temporarily unavailable".

Is there any way to use layouts with more than 4 parity OSS's and run resync on them? I know EC is a very WIP branch, but has anyone tested this feature with such layouts yet? I'm ready to use it at my own risk and will fix bugs if I find any, but I need help with how to set up a 16+8 layout and rebuild a file on it.

Thanks,
Stepan