<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Hello Peter,</div>
<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
This is indeed that bug and, as the ticket says, it is fixed only in 2.16.0, not in 2.15.x.</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
I have no idea whether the fix will ever make it into 2.15.x.</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Aurélien</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> lustre-discuss &lt;lustre-discuss-bounces@lists.lustre.org&gt; on behalf of Peter Grandi &lt;pg@lustre.list.sabi.co.UK&gt;<br>
<b>Sent:</b> Thursday, 2 January 2025 13:45<br>
<b>To:</b> list Linux fs Lustre &lt;lustre-discuss@lists.Lustre.org&gt;<br>
<b>Subject:</b> [lustre-discuss] LBUG: 2.5.16, EL8 Linux 4.18.0-553.30.1 in 'll_truncate_inode_pages</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
<div class="PlainText">
On a 200-machine cluster I occasionally get an LBUG on the clients.<br>
It seems to be triggered by specific access patterns (most jobs do<br>
not trigger it) and looks quite similar to:<br>
<br>
<a href="https://jira.whamcloud.com/browse/LU-16637">https://jira.whamcloud.com/browse/LU-16637</a><br>
<a href="http://lists.lustre.org/pipermail/lustre-devel-lustre.org/2023-April/011016.html">http://lists.lustre.org/pipermail/lustre-devel-lustre.org/2023-April/011016.html</a><br>
<a href="https://git.whamcloud.com/?p=fs/lustre-release.git;a=commit;h=7bb1e211d217d5a82ac2d5e4edad5ae018090761">https://git.whamcloud.com/?p=fs/lustre-release.git;a=commit;h=7bb1e211d217d5a82ac2d5e4edad5ae018090761</a><br>
<br>
Since the LBUG is fatal, all I get is the backtrace from the crash dump:<br>
<br>
lbug_with_loc.cold.8+0x18<br>
ll_truncate_inode_pages_final+0xab<br>
vvp_prune+0x181<br>
cl_object_prune+0x58<br>
lov_layout_change.isra.49+0x1ba<br>
lov_conf_set+0x391<br>
cl_conf_set+0x60<br>
ll_layout_conf+0x14b<br>
? _ptlrpc_req_finished+0x54d<br>
ll_layout_lock_set+0x3df<br>
? ll_take_md_lock+0x148<br>
ll_layout_refresh+0x1cc<br>
vvp_io_init+0x22e<br>
cl_io_init0.isra.14+0x86<br>
ll_file_io_generic+0x388<br>
? file_update_time+0x62<br>
? srso_return_thunk+0x5<br>
? __generic_file_write_iter+0x102<br>
ll_file_write_iter+0x558<br>
? kmem_cache_free+0x116<br>
new_sync_write+0x112<br>
vfs_write+0x5a<br>
<br>
If this is a manifestation of LU-16637, a fix exists. However, the<br>
changelogs list LU-16637 as applied to 2.16.0, and it does not seem<br>
to be listed in any of the 2.15.[1-6] changelogs.<br>
_______________________________________________<br>
lustre-discuss mailing list<br>
lustre-discuss@lists.lustre.org<br>
<a href="http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org">http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org</a><br>
</div>
</span></font></div>
</body>
</html>