[lustre-discuss] LBUG: 2.15.6, EL8 Linux 4.18.0-553.30.1 in 'll_truncate_inode_pages_final'
Peter Grandi
pg at lustre.list.sabi.co.UK
Thu Jan 2 04:45:04 PST 2025
Relatively rarely, across a 200-machine cluster, I get an LBUG on the
clients. It seems to be triggered by specific access patterns (most jobs
do not trigger it) and looks quite similar to:
https://jira.whamcloud.com/browse/LU-16637
http://lists.lustre.org/pipermail/lustre-devel-lustre.org/2023-April/011016.html
https://git.whamcloud.com/?p=fs/lustre-release.git;a=commit;h=7bb1e211d217d5a82ac2d5e4edad5ae018090761
Since the LBUG is fatal all I get is the backtrace from the crash dump:
lbug_with_loc.cold.8+0x18
ll_truncate_inode_pages_final+0xab
vvp_prune+0x181
cl_object_prune+0x58
lov_layout_change.isra.49+0x1ba
lov_conf_set+0x391
cl_conf_set+0x60
ll_layout_conf+0x14b
? _ptlrpc_req_finished+0x54d
ll_layout_lock_set+0x3df
? ll_take_md_lock+0x148
ll_layout_refresh+0x1cc
vvp_io_init+0x22e
cl_io_init0.isra.14+0x86
ll_file_io_generic+0x388
? file_update_time+0x62
? srso_return_thunk+0x5
? __generic_file_write_iter+0x102
ll_file_write_iter+0x558
? kmem_cache_free+0x116
new_sync_write+0x112
vfs_write+0x5a
If this is a manifestation of LU-16637 there is a fix. I have checked
the changelogs, though: LU-16637 is listed as applied to 2.16.0, but it
does not seem to be listed in the 2.15.[1-6] changelogs.
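
For anyone wanting to repeat the changelog check, here is a minimal sketch of how it could be scripted. It assumes the changelog of each release has already been saved as plain text; the function names and the sample entries are hypothetical placeholders, not real changelog contents.

```python
import re

# Matches Lustre JIRA ticket ids such as "LU-16637".
TICKET_RE = re.compile(r"\bLU-\d+\b")


def tickets_in(changelog_text):
    """Return the set of LU-nnnn ticket ids mentioned in a changelog."""
    return set(TICKET_RE.findall(changelog_text))


def releases_listing(ticket, changelogs):
    """Return the sorted release names whose changelog mentions `ticket`.

    `changelogs` maps a release name to the plain text of its changelog.
    """
    return sorted(
        release
        for release, text in changelogs.items()
        if ticket in tickets_in(text)
    )


if __name__ == "__main__":
    # Placeholder text only, NOT real changelog contents.
    sample = {
        "release-a": "LU-11111 placeholder entry\nLU-16637 placeholder entry\n",
        "release-b": "LU-22222 placeholder entry\n",
    }
    print(releases_listing("LU-16637", sample))
```

Ticket ids are compared as whole tokens rather than substrings, so a longer id that merely starts with the same digits is not counted as a match.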