[lustre-discuss] Lustre 2.8.0 - MDT/MGT failing to mount

Steve Barnet barnet at icecube.wisc.edu
Thu May 4 08:03:56 PDT 2017


Hi Rick,

On 5/4/17 10:01 AM, Mohr Jr, Richard Frank (Rick Mohr) wrote:
> Did you try doing a writeconf to regenerate the config logs for the file system?


Not yet, but it is quick enough to try. Should I do this for the
MDT/MGT first, then the OSTs?

Thanks much!

Best,

---Steve


>
> --
> Rick Mohr
> Senior HPC System Administrator
> National Institute for Computational Sciences
> http://www.nics.tennessee.edu
>
>
>> On May 4, 2017, at 10:03 AM, Steve Barnet <barnet at icecube.wisc.edu> wrote:
>>
>> Hi all,
>>
>>  This is Lustre 2.8.0 community edition, combined MGS/MDT.
>>
>> I was adding storage to a filesystem and mistakenly duplicated an
>> index for one of the OSTs at creation time. Since these OSTs were
>> new and no data had been written, I made the mistake of reformatting
>> the affected OSTs (including the first one I successfully mounted).
>>
>>  When I tried to remount the newly formatted OST, the MDS kernel
>> panicked (log attached). After a device level backup and an e2fsck,
>> I can mount the MDT as ldiskfs. e2fsck did correct some orphaned
>> inodes, but those appear to be user files only, nothing from the
>> Lustre metadata files themselves.
>>
>>  However, the MDT/MGT still will not mount. The logs indicate
>> that the original definition of the duplicated OST still exists
>> somewhere. I checked the CONFIGS directory, and indeed there was
>> a file associated with the OST in question. I copied that file
>> out of the CONFIGS directory and attempted to mount the MDT/MGT
>> again, but no change.
>>
>> The logs read:
>>
>> May  4 06:41:22 lfs4-mds kernel: Lustre: MGS: Connection restored to MGC10.128.11.174@tcp1_0 (at 0@lo)
>> May  4 06:41:22 lfs4-mds kernel: LustreError: 12300:0:(genops.c:334:class_newdev()) Device lfs4-OST000e-osc-MDT0000 already exists at 22, won't add
>> May  4 06:41:22 lfs4-mds kernel: LustreError: 12300:0:(obd_config.c:370:class_attach()) Cannot create device lfs4-OST000e-osc-MDT0000 of type osp : -17
>> May  4 06:41:22 lfs4-mds kernel: LustreError: 12300:0:(obd_config.c:1666:class_config_llog_handler()) MGC10.128.11.174@tcp1: cfg command failed: rc = -17
>> May  4 06:41:22 lfs4-mds kernel: Lustre:    cmd=cf001 0:lfs4-OST000e-osc-MDT0000  1:osp  2:lfs4-MDT0000-mdtlov_UUID
>> May  4 06:41:22 lfs4-mds kernel:
>> May  4 06:41:22 lfs4-mds kernel: LustreError: 15c-8: MGC10.128.11.174@tcp1: The configuration from log 'lfs4-MDT0000' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
>> May  4 06:41:22 lfs4-mds kernel: LustreError: 12213:0:(obd_mount_server.c:1309:server_start_targets()) failed to start server lfs4-MDT0000: -17
>> May  4 06:41:22 lfs4-mds kernel: LustreError: 12213:0:(obd_mount_server.c:1798:server_fill_super()) Unable to start targets: -17
>> May  4 06:41:22 lfs4-mds kernel: Lustre: Failing over lfs4-MDT0000
>> May  4 06:41:28 lfs4-mds kernel: Lustre: 12213:0:(client.c:2063:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1493898082/real 1493898082] req@ffff8803113459c0 x1566404887184424/t0(0) o251->MGC10.128.11.174@tcp1@0@lo:26/25 lens 224/224 e 0 to 1 dl 1493898088 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> May  4 06:41:28 lfs4-mds kernel: Lustre: server umount lfs4-MDT0000 complete
>> May  4 06:41:28 lfs4-mds kernel: LustreError: 12213:0:(obd_mount.c:1426:lustre_fill_super()) Unable to mount  (-17)
>> May  4 06:45:04 lfs4-mds kernel: LDISKFS-fs (sdb): mounted filesystem with ordered data mode. quota=on. Opts:
>>
>>
>> Again, no data was written to these OSTs. I was poking around a bit with
>> the procedure for fixing a bad LAST_ID. From what I was able to
>> piece together, it doesn't look like the MDT has any notion of
>> precreated objects on this OST yet, so I am suspecting something
>> in mountdata, perhaps.
>>
>> Any ideas?
>>
>> Thanks much!
>>
>> Best,
>>
>> ---Steve
>>
>> <mds-panic.txt>
>
>
>
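
For reference, the "-17" that recurs in the quoted log is a negated Linux
errno value, and the "000e" in the device name is the OST index in hex.
The sketch below decodes both from one of the quoted log lines. The helper
names decode_rc and ost_index are made up for illustration and are not
part of any Lustre tool; it also assumes the Python host uses the same
errno numbering as the Linux kernel that wrote the log.

```python
import errno
import os
import re

# One line copied from the quoted syslog excerpt.
LOG_LINE = (
    "LustreError: 12300:0:(obd_config.c:370:class_attach()) "
    "Cannot create device lfs4-OST000e-osc-MDT0000 of type osp : -17"
)


def decode_rc(line):
    """Return (errno name, description) for a trailing negative code, or None.

    Hypothetical helper: it only looks at a "-N" or "(-N)" at the end of
    the line, which is how the return code appears in the lines above.
    """
    match = re.search(r"-(\d+)\)?\s*$", line)
    if match is None:
        return None
    code = int(match.group(1))
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


def ost_index(line):
    """Return the OST index as an integer, parsed from 'OSTxxxx' (hex)."""
    match = re.search(r"OST([0-9a-fA-F]{4})", line)
    return int(match.group(1), 16) if match else None


if __name__ == "__main__":
    print(decode_rc(LOG_LINE))
    print(ost_index(LOG_LINE))
```

So the failure is EEXIST for OST index 14 (0x000e), which matches the
"already exists at 22, won't add" message from class_newdev().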


