[lustre-discuss] lustre issue with OST setting to read-only mode as soon as writes are attempted. using Lustre 1.8.8
Kurt Strosahl
strosahl at jlab.org
Thu May 7 07:54:26 PDT 2015
Good Morning,
We recently had an OST encounter a problem with what appears to be its journal. The OST sits on a partition atop a RAID6 array, which was rebuilding after a disk failure. The OST's journal is on an external mirrored disk. We unmounted the OST and ran the following: e2fsck -y -C 0 /dev/sdc2 -j /dev/sdd5
After that we remounted the OST, and as soon as the first client tried to write to it after recovery, it went back to read-only. We unmounted it again and ran e2fsck again, and again it flipped to read-only the moment writes were sent to it. (I had set it to read-only on the MDS and let it sit for a few minutes before setting it back to read/write, to confirm that the problem occurs only on a write.)
May 7 10:28:48 kernel:
May 7 10:28:48 kernel: Aborting journal on device sdd5.
May 7 10:28:48 kernel: LDISKFS-fs (sdc2): Remounting filesystem read-only
May 7 10:28:48 kernel: LDISKFS-fs error (device sdc2) in ldiskfs_mb_free_blocks: IO failure
May 7 10:28:48 kernel: LDISKFS-fs error (device sdc2) in ldiskfs_reserve_inode_write: Journal has aborted
May 7 10:28:48 kernel: LDISKFS-fs error (device sdc2) in ldiskfs_reserve_inode_write: Journal has aborted
May 7 10:28:48 kernel: LDISKFS-fs error (device sdc2) in ldiskfs_ext_remove_space: Journal has aborted
May 7 10:28:48 kernel: LDISKFS-fs error (device sdc2) in ldiskfs_reserve_inode_write: Journal has aborted
May 7 10:28:48 kernel: LDISKFS-fs error (device sdc2) in ldiskfs_orphan_del: Journal has aborted
May 7 10:28:48 kernel: LDISKFS-fs error (device sdc2) in ldiskfs_reserve_inode_write: Journal has aborted
May 7 10:28:48 kernel: LDISKFS-fs error (device sdc2) in ldiskfs_ext_truncate: Journal has aborted
May 7 10:28:48 kernel: LustreError: 2436:0:(filter_log.c:174:filter_recov_log_unlink_cb()) error destroying object 2760722: -30
May 7 10:28:48 kernel: LustreError: 2434:0:(llog_cat.c:441:llog_cat_process_thread()) llog_cat_process() failed -30
May 7 10:28:58 kernel: LustreError: 8791:0:(fsfilt-ldiskfs.c:501:fsfilt_ldiskfs_brw_start()) can't get handle for 47 credits: rc = -30
May 7 10:28:58 kernel: LustreError: 8791:0:(fsfilt-ldiskfs.c:501:fsfilt_ldiskfs_brw_start()) Skipped 54 previous similar messages
May 7 10:28:58 kernel: LustreError: 8791:0:(filter_io_26.c:705:filter_commitrw_write()) error starting transaction: rc = -30
May 7 10:28:59 kernel: LustreError: 5245:0:(fsfilt-ldiskfs.c:367:fsfilt_ldiskfs_start()) error starting handle for op 4 (108 credits): rc -30
May 7 10:28:59 kernel: LustreError: 5245:0:(fsfilt-ldiskfs.c:367:fsfilt_ldiskfs_start()) Skipped 18 previous similar messages
May 7 10:29:03 kernel: LustreError: 8793:0:(filter_io_26.c:705:filter_commitrw_write()) error starting transaction: rc = -30
May 7 10:29:07 kernel: LustreError: 8711:0:(filter_io_26.c:705:filter_commitrw_write()) error starting transaction: rc = -30
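For anyone reading the log above, the trailing return codes are negated Linux errno values. Here is a minimal sketch in Python for decoding them. The helper name decode_rc and the two sample lines are only illustrative, and the mapping assumes the script runs on a Linux host, where 30 is EROFS ("Read-only file system"). That matches the read-only remount reported by LDISKFS rather than pointing to a separate failure.

```python
import errno
import os
import re

# Two sample messages copied from the log above.
LOG_LINES = [
    "LustreError: 2436:0:(filter_log.c:174:filter_recov_log_unlink_cb()) "
    "error destroying object 2760722: -30",
    "LustreError: 8791:0:(filter_io_26.c:705:filter_commitrw_write()) "
    "error starting transaction: rc = -30",
]


def decode_rc(line):
    """Return (errno name, description) for a trailing negative code, or None.

    Hypothetical helper: it only looks for "-<digits>" at the end of the line.
    """
    match = re.search(r"-(\d+)\s*$", line)
    if match is None:
        return None
    code = int(match.group(1))
    # errno.errorcode maps numeric values to symbolic names for this platform.
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


for line in LOG_LINES:
    print(decode_rc(line))
```

Lines without a trailing code, such as the "Skipped 54 previous similar messages" entries, return None and can be ignored.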
Kurt J. Strosahl
System Administrator
Scientific Computing Group, Thomas Jefferson National Accelerator Facility