[lustre-discuss] lustre issue with OST setting to read-only mode as soon as writes are attempted. using Lustre 1.8.8

Kurt Strosahl strosahl at jlab.org
Thu May 7 07:54:26 PDT 2015


Good Morning,

     We recently had an OST encounter an issue with what appears to be its journal.  The OST sits on a partition atop a RAID6 array, which was rebuilding due to a failed disk.  The OST has its journal on an external mirrored disk.  We unmounted the OST and ran the following: e2fsck -y -C 0 /dev/sdc2 -j /dev/sdd5

     After that we remounted the OST, and as soon as the first client tried to write to it after recovery, it went back to read-only.  We unmounted it again, ran e2fsck again, and again it flipped to read-only the moment writes were sent to it (I had set it to read-only on the MDS and let it sit for a few minutes before setting it back to read/write, to make sure that the problem only happened on a write).  The kernel log from the second attempt follows:

May  7 10:28:48  kernel:
May  7 10:28:48  kernel: Aborting journal on device sdd5.
May  7 10:28:48  kernel: LDISKFS-fs (sdc2): Remounting filesystem read-only
May  7 10:28:48  kernel: LDISKFS-fs error (device sdc2) in ldiskfs_mb_free_blocks: IO failure
May  7 10:28:48  kernel: LDISKFS-fs error (device sdc2) in ldiskfs_reserve_inode_write: Journal has aborted
May  7 10:28:48  kernel: LDISKFS-fs error (device sdc2) in ldiskfs_reserve_inode_write: Journal has aborted
May  7 10:28:48  kernel: LDISKFS-fs error (device sdc2) in ldiskfs_ext_remove_space: Journal has aborted
May  7 10:28:48  kernel: LDISKFS-fs error (device sdc2) in ldiskfs_reserve_inode_write: Journal has aborted
May  7 10:28:48  kernel: LDISKFS-fs error (device sdc2) in ldiskfs_orphan_del: Journal has aborted
May  7 10:28:48  kernel: LDISKFS-fs error (device sdc2) in ldiskfs_reserve_inode_write: Journal has aborted
May  7 10:28:48  kernel: LDISKFS-fs error (device sdc2) in ldiskfs_ext_truncate: Journal has aborted
May  7 10:28:48  kernel: LustreError: 2436:0:(filter_log.c:174:filter_recov_log_unlink_cb()) error destroying object 2760722: -30
May  7 10:28:48  kernel: LustreError: 2434:0:(llog_cat.c:441:llog_cat_process_thread()) llog_cat_process() failed -30
May  7 10:28:58  kernel: LustreError: 8791:0:(fsfilt-ldiskfs.c:501:fsfilt_ldiskfs_brw_start()) can't get handle for 47 credits: rc = -30
May  7 10:28:58  kernel: LustreError: 8791:0:(fsfilt-ldiskfs.c:501:fsfilt_ldiskfs_brw_start()) Skipped 54 previous similar messages
May  7 10:28:58  kernel: LustreError: 8791:0:(filter_io_26.c:705:filter_commitrw_write()) error starting transaction: rc = -30
May  7 10:28:59  kernel: LustreError: 5245:0:(fsfilt-ldiskfs.c:367:fsfilt_ldiskfs_start()) error starting handle for op 4 (108 credits): rc -30
May  7 10:28:59  kernel: LustreError: 5245:0:(fsfilt-ldiskfs.c:367:fsfilt_ldiskfs_start()) Skipped 18 previous similar messages
May  7 10:29:03  kernel: LustreError: 8793:0:(filter_io_26.c:705:filter_commitrw_write()) error starting transaction: rc = -30
May  7 10:29:07  kernel: LustreError: 8711:0:(filter_io_26.c:705:filter_commitrw_write()) error starting transaction: rc = -30
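
For reading the log above: the -30 return code on the LustreError lines is a negated Linux errno, and 30 is EROFS (read-only file system), which matches the "Remounting filesystem read-only" message. A minimal sketch for decoding such codes from log lines, assuming it runs on a Linux host; the helper name decode_rc and the regular expression are made up for illustration and are not part of Lustre:

```python
import errno
import re

# Matches the trailing return-code forms seen in the log above:
# "rc = -30", "rc -30", "failed -30" and ": -30".
RC_PATTERN = re.compile(r"(?:rc\s*=?\s*|failed\s+|:\s)(-\d+)\s*$")


def decode_rc(line):
    """Return (code, errno name) for a trailing negative return code, or None."""
    match = RC_PATTERN.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    # Kernel code reports errors as negated errno values, so flip the sign.
    return code, errno.errorcode.get(-code, "UNKNOWN")


if __name__ == "__main__":
    sample = ("May  7 10:28:58  kernel: LustreError: 8791:0:"
              "(filter_io_26.c:705:filter_commitrw_write()) "
              "error starting transaction: rc = -30")
    print(decode_rc(sample))
```

Lines without a trailing return code, such as the "Skipped 54 previous similar messages" entries, give None.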

Kurt J. Strosahl
System Administrator
Scientific Computing Group, Thomas Jefferson National Accelerator Facility

