[lustre-discuss] sync command hangs in Lustre version 2.17.0
Tung-Han Hsieh
tunghan.hsieh at gmail.com
Tue Apr 28 06:53:35 PDT 2026
Greetings,
Has anyone experienced the following problem? On a system running
Lustre 2.17.0 with a zfs-2.3.6 backend for both MDT and OST, has anyone
seen the "sync" command hang?
We just found that on our system the "sync" command hangs forever and
cannot be killed. The ps command shows:
2927047 pts/7 D 0:00 sync
3092934 pts/1 D 0:00 sync
3094210 ? D 0:00 sync
3094779 ? D 0:00 sync
3097873 ? D 0:00 sync
It seems that the "sync" command entered a system call and got stuck in the
kernel (all of the processes above are in state D, uninterruptible sleep),
which pushes the load average of the client to around 60.0.
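To see where each stuck process is sleeping, the D-state tasks can be listed together with their kernel wait channel. The following is only a sketch, assuming a Linux client with /proc mounted; the helper names parse_stat and blocked_tasks are made up for illustration, and /proc/<pid>/wchan may show just "0" on kernels that do not expose the symbol.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the first '(' and the last ')' instead of splitting on spaces.
    lparen = text.index("(")
    rparen = text.rindex(")")
    pid = int(text[:lparen].strip())
    comm = text[lparen + 1:rparen]
    state = text[rparen + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc_root="/proc"):
    """Yield (pid, comm, wchan) for every process in state D."""
    if not os.path.isdir(proc_root):
        return
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited (or stat was unreadable) meanwhile
        if state != "D":
            continue
        try:
            with open(os.path.join(proc_root, entry, "wchan")) as f:
                wchan = f.read().strip()
        except OSError:
            wchan = "?"
        yield pid, comm, wchan


if __name__ == "__main__":
    for pid, comm, wchan in blocked_tasks():
        print(pid, comm, wchan)
```

If every hung "sync" reports the same wait channel, that would be worth including in a bug report along with the dmesg output.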
The MDT, OST, and client nodes all run Debian Linux 12.13 with a vanilla
Linux kernel 5.4.279. The clients mount the Lustre file system with the
"-o flock" option. The client's dmesg contains these messages:
[883364.582531] LustreError: 2815962:0:(osc_cache.c:943:osc_extent_wait())
extent 00000000f472f689@{[16128 -> 16383/16383],
[3|1|-|active|wiuY|000000007cd2cc8f],
[1703936|256|+|-|000000000bd3bac6|256|0000000000000000]}
chome2-OST0001-osc-ffff9e85b32e8000: wait ext to 0 timedout, recovery in
progress?
[883364.582779] LustreError: 2815962:0:(osc_cache.c:943:osc_extent_wait())
### extent: 00000000f472f689 ns: chome2-OST0001-osc-ffff9e85b32e8000 lock:
000000000bd3bac6/0xf7ea3edb91058483 lrc: 4/0,1 mode: PW/PW res:
[0x9ea04:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req
0->79904767) gid 0 flags: 0x800020000020000 nid: local remote:
0x3e94ff34acdc5236 expref: -99 pid: 2815962 timeout: 0 lvb_type: 1
[886477.421380] LustreError: 2585374:0:(osc_cache.c:943:osc_extent_wait())
extent 00000000f472f689@{[16128 -> 16383/16383],
[4|1|-|active|wiuY|000000007cd2cc8f],
[1703936|256|+|+|000000000bd3bac6|256|0000000000000000]}
chome2-OST0001-osc-ffff9e85b32e8000: wait ext to 0 timedout, recovery in
progress?
[886477.421596] LustreError: 2585374:0:(osc_cache.c:943:osc_extent_wait())
### extent: 00000000f472f689 ns: chome2-OST0001-osc-ffff9e85b32e8000 lock:
000000000bd3bac6/0xf7ea3edb91058483 lrc: 4/0,1 mode: PW/PW res:
[0x9ea04:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req
0->79904767) gid 0 flags: 0x800020000020000 nid: local remote:
0x3e94ff34acdc5236 expref: -99 pid: 2815962 timeout: 0 lvb_type: 1
There are no relevant errors in the dmesg output of the MDT and OST servers.
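To check whether all of the client timeouts point at the same OST, the messages can be tallied per OSC target. This is a rough sketch, not an official Lustre tool: the regular expression is modeled only on the "wait ext to 0 timedout" lines quoted above, and the name count_timeouts is hypothetical.

```python
import re
import sys
from collections import Counter

# Matches e.g. "chome2-OST0001-osc-ffff9e85b32e8000: wait ext to 0 timedout"
# (pattern derived from the quoted messages; other formats are not handled).
TIMEOUT_RE = re.compile(
    r"(\S+-OST[0-9a-fA-F]+-osc-[0-9a-fA-F]+): wait ext to 0 timedout"
)


def count_timeouts(log_text):
    """Count extent-wait timeouts per OSC target name in dmesg text."""
    return Counter(TIMEOUT_RE.findall(log_text))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: save "dmesg" output to a file and pass its path as argument.
    with open(sys.argv[1]) as f:
        for target, n in count_timeouts(f.read()).most_common():
            print(n, target)
```

In the excerpt above, both timeouts refer to chome2-OST0001, so a tally over the full log would show whether other OSTs are affected as well.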
Since no newer release of Lustre is available yet, I am wondering whether
there is something we could do to work around this.
Thanks for your help.
Best Regards,
T.H.Hsieh