[lustre-discuss] Supporting more than 4 parity OST’s
Andreas Dilger
adilger at thelustrecollective.com
Wed Apr 8 17:43:39 PDT 2026
On Apr 8, 2026, at 08:18, Stepan Beskrovnyy <bsm099 at gmail.com> wrote:
>
> Hello everyone!
>
> In my diploma project I use the WIP EC branch, and I need to have more parity OSS's than currently supported (more than 4; in my case, 8).
>
> I use a cluster topology with 24 OSS's laid out on 6 servers, 4 OSS's per node. (Due to performance issues I can't use fewer; I posted a topic about the client performance limitation, and I try to work around it by adding OSS's.) I want to be able to lose 2 of the 6 nodes (a requirement from my professor). In this case I need 8 parity OSS's, with an N+8 scheme (16+8, for example).
>
> I found a way to use custom EC layouts with the --ec-expert flag and set a 16+8 layout. Then I killed 8 random OSS's, and the mirror resync function fails with the error "Resource temporarily unavailable".
>
> Is there any way to use layouts with more than 4 parity OSS's and run resync on them? I know EC is a very WIP branch, but has anyone tested this feature with such layouts yet? I'm ready to use it at my own risk and will fix bugs if I find some, but I need help with how to set up a 16+8 layout and rebuild a file on it.
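The sizing behind the 16+8 request can be checked with a quick sketch (all values taken from the message above; the worst-case assumption is mine):

```python
# Failure-domain sizing sketch for the cluster in the question.
NODES, OSTS_PER_NODE = 6, 4
NODES_THAT_MAY_FAIL = 2   # requirement: survive the loss of any 2 of 6 nodes

# Worst case (assumed): a file has one stripe on every OST of both
# failed nodes, so all of those stripes vanish at once.
max_lost_stripes = NODES_THAT_MAY_FAIL * OSTS_PER_NODE
print(max_lost_stripes)   # -> 8, hence an N+8 layout such as 16+8
```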
It is important to note that FLR-EC does not have "parity OSS" (or OSTs), but rather "parity *stripes*" (or "parity objects"). Each file will store data on different OST objects and the EC parity will be stored on separate OST objects (in a different "failure_domain", later in the patch series).
Since this is done on a per-file basis, the data and parity will naturally be "declustered" across OSTs, so if an OST is offline then the parity read workload will be distributed across all of the OSTs in the filesystem.
As for "16+8" it _should_ work, as Patrick mentioned, but probably has not been tested very much (or at all) yet. The format will support up to "255+15" data+parity (in '--ec-expert' mode) EC layouts, but more practically it will use "raid sets" to cover subsets of widely-striped files. So if a single file is striped across 1024 OSTs, a raid set may cover "16+2" or "16+3" data+parity stripes, and there would be 1024/16 = 64 raid sets in the file. This significantly reduces the overhead of parity generation and reconstruction. Otherwise, with a 255+15 EC it would need to read 254 data + 1 parity stripe to reconstruct a 1MB chunk of data, and the CPU overhead gets quite expensive.
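The raid-set arithmetic above can be written out explicitly; the stripe and raid-set sizes are the ones used as examples in the paragraph:

```python
# Raid-set arithmetic for a widely-striped file.
STRIPE_COUNT = 1024          # file striped across 1024 OSTs
RS_DATA, RS_PARITY = 16, 2   # one "16+2" raid set

raid_sets = STRIPE_COUNT // RS_DATA
print(raid_sets)             # -> 64 raid sets, matching 1024/16

# Rebuilding one lost 1MB chunk inside a 16+2 raid set reads the 15
# surviving data stripes plus 1 parity stripe:
reads_raid_set = (RS_DATA - 1) + 1
print(reads_raid_set)        # -> 16

# A flat 255+15 EC layout would instead read 254 data + 1 parity:
reads_flat = (255 - 1) + 1
print(reads_flat)            # -> 255
```

The roughly 16x difference in reads (and in Reed-Solomon decode work) per reconstructed chunk is why raid sets keep parity generation and rebuild overhead manageable on wide stripes.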
Thanks for testing, and keep in touch.
Cheers, Andreas
---
Andreas Dilger
Principal Lustre Architect
adilger at thelustrecollective.com