[lustre-devel] RFC: Spill device for Lustre OSD

Day, Timothy timday at amazon.com
Tue Nov 4 15:37:32 PST 2025


>> If we can implement a no-transaction osd-vfs, that would expose a
>> lot of flexibility for other reasons as well. Possibly the osd-vfs
>> could implement a journal or other logging layer internally to make
>> up for lack of transactions, whether initially or at a later stage?
>
>That's actually an interesting idea, but probably not very practical.
>The journal itself is probably not really feasible with VFS api alone
>because you need to touch the journal together with whatever you are
>modifying, unless you update the journal first and then everything
>else, but that is likely going to be slow due to all the overhead?
>That's probably why all the journaling filesystems hide the journal
>inside themselves out of reach for the VFS api.
>Of course the VFS api could probably be extended eventually if there's
>a good justification, but who knows how long it'll take and what the
>final agreed-upon implementation would actually look like.

Databases do this kind of journaling, and they use normal filesystem
APIs. They manage it with a performance profile similar to that of a
Lustre OSS or MDS. An OSD is pretty much a database on top of a normal
filesystem. So I think it's possible.
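
To make the database comparison concrete, here is a rough user-space
sketch of journal-first ordering using only ordinary file calls. It is
an analogy, not osd-vfs code: a real osd-vfs would sit in the kernel,
and every name here (struct wal_rec, wal_csum, wal_write, wal_recover)
is hypothetical. It assumes one pending record at a time and ignores
directory fsync for newly created files.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical journal record: one pending overwrite of the data file. */
struct wal_rec {
	uint64_t off;		/* offset in the data file */
	uint32_t len;		/* number of valid payload bytes */
	uint32_t csum;		/* checksum of the payload */
	char	 data[64];	/* payload */
};

/* Toy checksum, only good enough to detect a torn journal write. */
static uint32_t wal_csum(const char *p, uint32_t len)
{
	uint32_t s = 0;

	for (uint32_t i = 0; i < len; i++)
		s = s * 31 + (unsigned char)p[i];
	return s;
}

/*
 * Journal the intent and make it durable, apply it, then retire it.
 * Callers must run wal_recover() first: O_TRUNC drops any old record.
 */
int wal_write(const char *journal, const char *datafile,
	      uint64_t off, const char *buf, uint32_t len)
{
	struct wal_rec rec;
	int jfd, dfd, rc = -1;

	if (len > sizeof(rec.data))
		return -1;
	memset(&rec, 0, sizeof(rec));
	rec.off = off;
	rec.len = len;
	memcpy(rec.data, buf, len);
	rec.csum = wal_csum(rec.data, len);

	jfd = open(journal, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (jfd < 0)
		return -1;
	dfd = open(datafile, O_WRONLY | O_CREAT, 0600);
	if (dfd < 0)
		goto out_j;
	/* 1. journal first, durable before the data file is touched */
	if (write(jfd, &rec, sizeof(rec)) != (ssize_t)sizeof(rec) || fsync(jfd))
		goto out;
	/* 2. apply the change in place */
	if (pwrite(dfd, buf, len, (off_t)off) != (ssize_t)len || fsync(dfd))
		goto out;
	/* 3. retire the record */
	if (ftruncate(jfd, 0) || fsync(jfd))
		goto out;
	rc = 0;
out:
	close(dfd);
out_j:
	close(jfd);
	return rc;
}

/* After a crash: replay a complete record, ignore a missing or torn one. */
int wal_recover(const char *journal, const char *datafile)
{
	struct wal_rec rec;
	int jfd, dfd, rc = -1;

	jfd = open(journal, O_RDWR);
	if (jfd < 0)
		return 0;	/* no journal: treated as nothing to replay */
	if (read(jfd, &rec, sizeof(rec)) != (ssize_t)sizeof(rec) ||
	    rec.len > sizeof(rec.data) ||
	    rec.csum != wal_csum(rec.data, rec.len)) {
		close(jfd);	/* empty or torn: step 2 never started */
		return 0;
	}
	dfd = open(datafile, O_WRONLY | O_CREAT, 0600);
	if (dfd < 0)
		goto out_j;
	if (pwrite(dfd, rec.data, rec.len, (off_t)rec.off) == (ssize_t)rec.len &&
	    !fsync(dfd) && !ftruncate(jfd, 0) && !fsync(jfd))
		rc = 0;
	close(dfd);
out_j:
	close(jfd);
	return rc;
}
```

The cost is the extra fsync() per update, which is the overhead the
quoted reply worries about. Databases typically amortize it by batching
many records into one journal flush.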

If we had a no-atomic-transaction osd-vfs, we could perhaps use it
for a stand-alone MGS. Before ending each write transaction, the osd-vfs
could fsync() the whole filesystem. This isn't feasible for an MDS or
OSS, of course, but I suspect the performance demands on an MGS are low
enough that this would work.
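
As a rough illustration of the "sync everything before ending the
transaction" idea, here is a user-space sketch using Linux syncfs().
The vfs_trans structure and the vfs_trans_start/vfs_trans_stop names
are hypothetical and not existing OSD interfaces. An in-kernel osd-vfs
would use in-kernel equivalents instead. Note that this only buys
durability at transaction end, not atomicity: a crash in the middle of
a transaction can still leave partial updates on disk.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* syncfs() is Linux-specific */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical transaction handle for a no-atomic-transaction backend. */
struct vfs_trans {
	int root_fd;	/* any open fd on the backing filesystem */
	int dirty;	/* set by the caller once anything was modified */
};

/* Open the backing filesystem root; returns 0 or a negative errno. */
int vfs_trans_start(struct vfs_trans *th, const char *root)
{
	th->dirty = 0;
	th->root_fd = open(root, O_RDONLY | O_DIRECTORY);
	return th->root_fd < 0 ? -errno : 0;
}

/* Flush the whole backing filesystem before the transaction ends. */
int vfs_trans_stop(struct vfs_trans *th)
{
	int rc = 0;

	/* read-only transactions skip the (expensive) flush */
	if (th->dirty && syncfs(th->root_fd) < 0)
		rc = -errno;
	close(th->root_fd);
	th->root_fd = -1;
	return rc;
}
```

Every write transaction pays for a full filesystem flush here, which
is why this only looks plausible for a low-traffic target like the MGS.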


