[lustre-devel] RFC: Spill device for Lustre OSD
Jinshan Xiong
jinshan.xiong at gmail.com
Mon Nov 3 16:04:15 PST 2025
> On Nov 3, 2025, at 15:14, Oleg Drokin <green at whamcloud.com> wrote:
>
> On Mon, 2025-11-03 at 21:59 +0000, Day, Timothy via lustre-devel wrote:
>>
>>
>> This begs the question: if we're already doing this work to support
>> writing Lustre objects to any arbitrary filesystem via VFS and we're only
>> intending to support OSTs with this proposal, why not implement
>> an OST-only VFS OSD and handle tiering in the filesystem layer?
>
> The problem with pure VFS is it does not actually provide us what we
> want.
> So OSD talks to the underlying FS via VFS + some more stuff (we do have
> the hidden mount for ldiskfs after all).
> The "more stuff" is things like expanded transaction boundaries beyond
> what POSIX requires so we can update more than one thing.
> If Linux VFS provided all these abilities we would not need to really
> know much about the underlying disk fs I suspect.
>
> But currently it's just a way to add OSTs, not move objects laterally
> from one OST to another and hence this proposal I imagine - where OSTs
> would grow "warts" for less wanted data.
>
> I am not sure it's a much better idea than the already existing HSM
> capabilities we have that would allow you to have "offline" objects
> that would be pulled back in when used, but are otherwise just visible
> in the metadata only.
> The underlying capabilities are pretty rich esp. if we also take into
> account the eventual WBC stuff.
The major problem with the current HSM is that it needs dedicated clients to move data. Also, scanning the entire Lustre file system takes a very long time, so it resorts to a database in order to decide which files should be released. Over time, the two systems drift out of sync. That makes it practically unusable.
>
> If the argument is "but OSTs know best what stuff is used" (which I am
> not sure I buy, after all before you could use something off OSTs you
> need to open a file I would hope) even then OSTs could just signal a
> list of "inactive objects" that then a higher level system would take
> care of by relocating somewhere more sensical and changing the layout
> to indicate those objects now live elsewhere.
>
> The plus here is you don't need to attach this "Wart" to every OST and
> configure it everywhere and such, but rather have a central location
> that is centrally managed.
>