[lustre-discuss] Writing barrier stuck in failed state.

Andreas Dilger adilger at ddn.com
Mon Jul 7 23:16:19 PDT 2025


On Jul 7, 2025, at 01:47, Sergey Vergun via lustre-discuss <lustre-discuss at lists.lustre.org> wrote:

openat(AT_FDCWD, "/dev/obd", O_RDWR)    = 3
ioctl(3, _IOC(_IOC_READ|_IOC_WRITE, 0x66, 0x7f, 0x8), 0x7ffea24f1a90) = 0

This is OBD_IOC_NAME2DEV = _IOWR('f', 127, OBD_IOC_DATA_TYPE) which is needed to find the Lustre OBD device number to call the ioctl() on.
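
For readers who want to decode such strace lines themselves, here is a minimal sketch. It assumes the generic Linux ioctl layout from asm-generic/ioctl.h (8-bit number, 8-bit type, 14-bit size, 2-bit direction), which is what x86_64 uses. The helper names ioc() and decode_ioc() are made up for this illustration and are not part of Lustre.

```python
# Sketch: build and decode an ioctl request number, assuming the generic
# Linux layout: nr in bits 0-7, type in bits 8-15, size in bits 16-29,
# direction in bits 30-31.
IOC_NRSHIFT = 0
IOC_TYPESHIFT = 8
IOC_SIZESHIFT = 16
IOC_DIRSHIFT = 30
IOC_WRITE, IOC_READ = 1, 2


def ioc(direction, type_, nr, size):
    """Combine the four fields the same way strace prints them in _IOC(...)."""
    return ((direction << IOC_DIRSHIFT) | (type_ << IOC_TYPESHIFT) |
            (nr << IOC_NRSHIFT) | (size << IOC_SIZESHIFT))


def decode_ioc(request):
    """Split a request number back into its fields (hypothetical helper)."""
    return {
        "dir": (request >> IOC_DIRSHIFT) & 0x3,
        "type": chr((request >> IOC_TYPESHIFT) & 0xFF),
        "nr": (request >> IOC_NRSHIFT) & 0xFF,
        "size": (request >> IOC_SIZESHIFT) & 0x3FFF,
    }


# Field values copied from the first ioctl() line in the strace output.
name2dev = ioc(IOC_READ | IOC_WRITE, 0x66, 0x7F, 0x8)
print(hex(name2dev), decode_ioc(name2dev))
```

The decoded type 'f' and number 127 match the OBD_IOC_NAME2DEV definition quoted above.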

ioctl(3, _IOC(_IOC_READ|_IOC_WRITE, 0x67, 0x5, 0x8), 0x7ffea24f3da0) = -1 EINVAL (Invalid argument)

This is OBD_IOC_BARRIER = _IOWR('g', 5, OBD_IOC_DATA_TYPE), which is the old ioctl number for barrier control on releases < 2.16. It was originally defined as _IOWR('f', 261, OBD_IOC_DATA_TYPE) by mistake, and since 261 does not fit in the 8-bit ioctl number field, that definition encodes to the same value as 'g', 5. This is the correct ioctl number for your 2.15.4 release.
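
A small sketch of why the 'f', 261 definition appears in strace as type 0x67 ('g') and number 0x5. It assumes the generic Linux _IOC() macro, which ORs the shifted fields together without masking them, and takes the size value 8 from the strace output rather than from the Lustre headers. The iowr() helper is hypothetical.

```python
# Sketch: an ioctl number above 255 spills into the type byte.
# Assumes the generic Linux layout (nr at bit 0, type at bit 8,
# size at bit 16, direction at bit 30) with no masking of the fields.
IOC_TYPESHIFT = 8
IOC_SIZESHIFT = 16
IOC_DIRSHIFT = 30
IOC_WRITE, IOC_READ = 1, 2


def iowr(type_char, nr, size):
    """Mimic _IOWR(): OR the fields together with no range check on nr."""
    return (((IOC_READ | IOC_WRITE) << IOC_DIRSHIFT) |
            (ord(type_char) << IOC_TYPESHIFT) |
            nr |
            (size << IOC_SIZESHIFT))


SIZE = 8  # size field as printed by strace (0x8); an assumption for this sketch

as_defined = iowr('f', 261, SIZE)  # the original, mistaken definition
as_seen = iowr('g', 5, SIZE)       # what strace decodes: type 0x67, nr 0x5

# 261 == 0x105: the low byte 0x05 stays in the number field, and the extra
# bit 0x100 lands in the lowest bit of the type byte, turning 0x66 into 0x67.
print(hex(as_defined), hex(as_seen), as_defined == as_seen)
```

Both calls produce the same request value, which is why the second strace line shows 'g' and 5 even though the header was written in terms of 'f'.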

Strace shows it trying both the old and new ioctl numbers ('g' 5 and 'f' 105) and failing on the old one (the new one was added in LU-16634). Is that expected?

There is a new OBD_IOC_BARRIER_V2 = _IOW('f', 105, struct obd_ioctl_data) for 2.16+, to avoid the old number overflowing the 255-ioctl limit for 'f', but that should only be used for 2.16.0+ utilities + modules, and there is compatibility for both old+new utilities and code, so I don't think that is related here.

Have you tried restarting your MGS?  What does "lctl barrier_stat FSNAME" show?


On Tue, Apr 22, 2025 at 12:04, Sergey Vergun <sewergun at gmail.com> wrote:
Hello. We have Lustre 2.15.4 on Rocky 8.9,
with two MDSes on a ZFS backend.
After taking snapshots for some time, the write barrier got stuck in the failed state.

The filesystem seems to be working, but I can't take snapshots anymore. barrier_thaw or rescan fails with the error "Invalid argument", and the "-b off" option for snapshot_create has no effect. lsnapshot.log says that snapshot creation is called with the barrier ON even when I ask not to use it.

How can I find out what caused the failed state, so that I can prevent it in the future, and is it possible to unlock the barrier?

Cheers, Andreas
—
Andreas Dilger
Lustre Principal Architect
Whamcloud/DDN



