<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">
On Jan 26, 2024, at 12:57, Saisha Kamat via lustre-devel &lt;<a href="mailto:lustre-devel@lists.lustre.org" class="">lustre-devel@lists.lustre.org</a>&gt; wrote:
<div>
<blockquote type="cite" class="">
<div class="">
<div class=""><br class="">
I am a Ph.D. student at UNC-Charlotte, focusing on research related to<br class="">
the Lustre File System. As part of my project, I am investigating<br class="">
scenarios involving the direct modification of xattr metadata on the<br class="">
Lustre disk, without unmounting the Lustre servers.<br class="">
</div>
</div>
</blockquote>
<div><br class="">
</div>
It would be helpful to know what the high-level goal of your research is.</div>
<div>Is this some type of fault injection mechanism, or are you trying to store</div>
<div>useful data directly into the xattr, or something else? Note that there</div>
<div>have already been a few papers published about this. If you are looking</div>
<div>for research ideas related to Lustre, I could definitely give you a few; please</div>
<div>contact me if interested. Doubly so if you actually implement something</div>
<div>that is useful at the end of your Ph.D. and not a throw-away project.</div>
<div><br class="">
<blockquote type="cite" class="">
<div class="">
<div class="">To achieve this, I have attempted to open the MDS (Metadata Server)<br class="">
disk partition as a file descriptor, locate the target file and its<br class="">
xattr, and write a faulty value. However, I have encountered an<br class="">
unexpected issue where my changes appear to be saved to memory and are<br class="">
not being synchronized with the disk.<br class="">
</div>
</div>
</blockquote>
<div><br class="">
</div>
In general, this is also a good way to corrupt the filesystem. If the xattr</div>
<div>is stored directly in the inode (as most of them are) then you will also be</div>
<div>overwriting the live inode that is also in memory. In many cases, whatever</div>
<div>was written directly to disk will be overwritten and lost when the inode is</div>
<div>flushed from memory.</div>
<div><br class="">
</div>
<div>Alternatively, if the inode is already in memory, the xattr will be read from</div>
<div>RAM (either from the client cache or from the MDS cache).</div>
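<div><br class="">
</div>
<div>As a rough illustration of the two points above, here is a toy Python model of</div>
<div>a write-back inode cache sitting in front of a "disk". It is not Lustre or ldiskfs</div>
<div>code: the ToyServer class, its method names, and the "user.tag" xattr are all</div>
<div>made up for this sketch, and real servers also have journals, locking, and a</div>
<div>client-side cache that this ignores.</div>
<pre class="">
```python
# Toy model only: a write-back inode cache in front of a "disk".
# Every name here is hypothetical; none of this is Lustre or ldiskfs code.
class ToyServer:
    def __init__(self):
        self.disk = {}    # inode number: xattrs as stored on the block device
        self.cache = {}   # inode number: xattrs held in server memory

    def setxattr(self, ino, name, value):
        # Supported path: update the in-memory inode; it reaches disk on flush.
        self.cache.setdefault(ino, dict(self.disk.get(ino, {})))[name] = value

    def getxattr(self, ino, name):
        # Reads are served from memory whenever the inode is already cached.
        if ino not in self.cache:
            self.cache[ino] = dict(self.disk.get(ino, {}))
        return self.cache[ino][name]

    def flush(self, ino):
        # Write-back: the in-memory inode replaces whatever is on disk.
        self.disk[ino] = dict(self.cache[ino])

    def raw_disk_write(self, ino, name, value):
        # Stand-in for writing to the block device behind the server's back.
        self.disk.setdefault(ino, {})[name] = value


def demo():
    srv = ToyServer()
    srv.setxattr(1, "user.tag", "good")
    srv.flush(1)
    srv.raw_disk_write(1, "user.tag", "BAD")
    seen_by_getxattr = srv.getxattr(1, "user.tag")    # served from the cache
    on_disk_before_flush = srv.disk[1]["user.tag"]    # raw read sees the edit
    srv.flush(1)                                      # cached inode written back
    on_disk_after_flush = srv.disk[1]["user.tag"]     # raw edit is now lost
    return seen_by_getxattr, on_disk_before_flush, on_disk_after_flush
```
</pre>
<div>In this model a raw read of the "disk" shows the modified value while the</div>
<div>normal read path still returns the cached one, and the next flush silently</div>
<div>discards the raw edit.</div>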
<div><br class="">
</div>
<div>If you create a large xattr it will be written to a separate block, which</div>
<div>would at least avoid massive filesystem corruption.</div>
<div><br class="">
<blockquote type="cite" class="">
<div class="">
<div class="">After completing the write operation, when I read the same xattr<br class="">
again, it reflects the corrupted value. Strangely, when using the<br class="">
"getfattr" command, the original, correct value is displayed. This<br class="">
discrepancy has raised doubts about whether Lustre permits direct<br class="">
modifications to its metadata on the disk.<br class="">
</div>
</div>
</blockquote>
<div><br class="">
</div>
The xattr contents are also cached on the client, and direct writes</div>
<div>to the storage would not invalidate that cache because they bypass</div>
<div>all of the proper access controls and locking.</div>
<div><br class="">
<blockquote type="cite" class="">
<div class="">
<div class="">Furthermore, I observed that even after unmounting and remounting the<br class="">
Lustre file system, the xattr continues to display the corrupted value<br class="">
upon reading, whereas "getfattr" still returns the original, correct<br class="">
value.<br class="">
</div>
</div>
</blockquote>
<div><br class="">
</div>
That really depends on how you modified the "xattr" and where "getfattr"</div>
<div>is actually getting the data from. I suspect you aren't doing what you</div>
<div>think you are doing.</div>
<div><br class="">
<blockquote type="cite" class="">
<div class="">
<div class="">Please help me understand whether Lustre allows direct modifications<br class="">
to its metadata on the disk and if there are any inherent limitations<br class="">
or considerations that I should be aware of.<br class="">
</div>
</div>
</blockquote>
<div><br class="">
</div>
No, of course Lustre and ext4 do not "allow" this, just as no filesystem</div>
<div>"allows" you to run "dd if=/dev/zero of=/dev/sda1" and erase the</div>
<div>data from the partition.</div>
<div><br class="">
</div>
<div>
<blockquote type="cite" class="">
<div class="">
<div class="">Additionally, any recommendations or alternative approaches for<br class="">
simulating faulty conditions for testing purposes would be highly<br class="">
valuable to my research.<br class="">
</div>
</div>
</blockquote>
<br class="">
</div>
<div>That really depends on what your research is trying to achieve. Lustre</div>
<div>depends on reliable (RAID) storage underneath the MDT and OST. It</div>
<div>is possible to use ldiskfs (ext4) or ZFS as underlying storage, and they</div>
<div>have different reliability vs. performance properties. If your tests</div>
<div>directly corrupt on-disk storage, then you are really testing those disk</div>
<div>filesystems, and Lustre does not add additional data redundancy layers</div>
<div>on top of them for metadata today, though there are *some* types of</div>
<div>internal metadata redundancy that can help recover from storage errors</div>
<div>(e.g. LFSCK can rebuild the Lustre file layout after errors on the MDT,</div>
<div>along with some types of directory breakage from the "link" xattr).</div>
<div><br class="">
</div>
<div>ZFS should be able to withstand such data/metadata block corruption</div>
<div>up to a certain level without any errors, until it just refuses to work at all.</div>
<div>ldiskfs would *not* be able to handle outright corruption of the on-disk</div>
<div>data (which is why you use RAID underneath it), but most corruption</div>
<div>would be localized and the filesystem would generally continue to work</div>
<div>(modulo the broken bits) even in the face of massive corruption. Kind</div>
<div>of like the difference between digital and analog audio signals.</div>
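<div><br class="">
</div>
<div>To make that contrast concrete, here is a small Python sketch under loud</div>
<div>assumptions: it models a store that keeps a checksum and a second copy of each</div>
<div>block, which is the general idea behind the ZFS behaviour described above, and</div>
<div>compares it with a reader that trusts whatever it finds. The function names and</div>
<div>the payload are hypothetical, and the unchecked reader is only a stand-in for a</div>
<div>store without block checksums or redundancy, not a description of ldiskfs internals.</div>
<pre class="">
```python
import hashlib

# Toy model only: checksummed, mirrored blocks versus an unchecked read.
# All names are hypothetical; this is not ZFS or ldiskfs code.

def store(block):
    # Keep the payload together with a checksum computed at write time.
    return {"data": bytes(block), "sum": hashlib.sha256(block).hexdigest()}

def read_checked(copies):
    # Return the first copy whose checksum still matches; fail if none does.
    for copy in copies:
        if hashlib.sha256(copy["data"]).hexdigest() == copy["sum"]:
            return copy["data"]
    raise IOError("all copies failed checksum verification")

def read_unchecked(copies):
    # No verification: whatever is in the first copy is returned as-is.
    return copies[0]["data"]

def corrupt(copy):
    # Flip one bit in the first byte to simulate on-disk corruption.
    damaged = bytearray(copy["data"])
    damaged[0] ^= 0x01
    copy["data"] = bytes(damaged)

def demo():
    payload = b"example xattr payload"
    mirror = [store(payload), store(payload)]
    corrupt(mirror[0])
    checked_ok = read_checked(mirror) == payload      # falls back to copy 2
    unchecked_ok = read_unchecked(mirror) == payload  # returns damaged data
    corrupt(mirror[1])                                # now both copies are bad
    try:
        read_checked(mirror)
        refused = False
    except IOError:
        refused = True
    return checked_ok, unchecked_ok, refused
```
</pre>
<div>With one damaged copy the checked reader still returns correct data, the</div>
<div>unchecked reader returns damaged data without noticing, and once every copy</div>
<div>is damaged the checked reader refuses to return anything.</div>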
<br class="">
<div class="">
<div dir="auto" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">
<div dir="auto" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">
<div dir="auto" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">
<div dir="auto" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">
<div dir="auto" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">
<div dir="auto" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">
<div>Cheers, Andreas</div>
<div>--</div>
<div>Andreas Dilger</div>
<div>Lustre Principal Architect</div>
<div>Whamcloud</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<br class="">
</body>
</html>