[Lustre-discuss] The client profile could not be read from the MGS
Wojciech Turek
wjt27 at cam.ac.uk
Tue Jan 5 09:57:33 PST 2010
Hello everyone and Happy New Year,
On my MDS server I have two file systems, work and work2. Yesterday I
reconfigured the file system named 'work' and ran writeconf in order to
regenerate its configuration logs. I ran writeconf while the other file
system, work2, was running. Both file systems share the same MGS, and I
think that writeconf cleared the CONFIGS directory on the MGS for both of
them. I didn't see any problems immediately after I ran writeconf
until I unmounted work2 from one of the client servers. When I tried
to mount it back, this message appeared:
mount.lustre: mount 10.44.245.203@tcp:/work2 at /scratch2 failed:
Invalid argument
This may have multiple causes.
Is 'work2' the correct filesystem name?
Are the mount options correct?
Check the syslog for more info.
And the syslog on the clients says:
Jan 5 17:15:47 node-h01 kernel: LustreError: 156-2: The client
profile 'work2-client' could not be read from the MGS. Does that
filesystem exist?
Jan 5 17:15:47 node-h01 kernel: LustreError:
7936:0:(ldlm_request.c:996:ldlm_cli_cancel_req()) Got rc -108 from
cancel RPC: canceling anyway
Jan 5 17:15:47 node-h01 kernel: LustreError:
7936:0:(ldlm_request.c:1605:ldlm_cli_cancel_list())
ldlm_cli_cancel_list: -108
Jan 5 17:15:47 node-h01 kernel: Lustre: client ffff81016d4dd000 umount complete
Jan 5 17:15:47 node-h01 kernel: LustreError:
7936:0:(obd_mount.c:1980:lustre_fill_super()) Unable to mount (-22)
I did some searching and found one similar problem reported on
this mailing list.
The suggestion was to check whether the client profile file exists in the
CONFIGS directory.
On my MDS node I ran this command:
debugfs -c -R 'ls -l CONFIGS' /dev/drbd_mds03_vg/mgs_lv
debugfs 1.40.7.sun3 (28-Feb-2008)
/dev/drbd_mds03_vg/mgs_lv: catastrophic mode - not reading inode or
group bitmaps
303105 40777 (2) 0 0 4096 4-Jan-2010 11:39 .
2 40755 (2) 0 0 4096 22-May-2009 10:59 ..
303106 100644 (1) 0 0 12288 22-May-2009 10:59 mountdata
303107 100644 (1) 0 0 28704 4-Jan-2010 05:15 work-client
303108 100644 (1) 0 0 27936 4-Jan-2010 05:15 work-MDT0000
303109 100644 (1) 0 0 8880 4-Jan-2010 05:16 work-OST0000
303110 100644 (1) 0 0 8880 4-Jan-2010 05:16 work-OST0001
303111 100644 (1) 0 0 8880 4-Jan-2010 05:17 work-OST0002
303112 100644 (1) 0 0 8880 4-Jan-2010 05:17 work-OST0003
303113 100644 (1) 0 0 8880 4-Jan-2010 05:18 work-OST0004
303114 100644 (1) 0 0 8880 4-Jan-2010 05:21 work-OST0005
303115 100644 (1) 0 0 8880 4-Jan-2010 05:21 work-OST0006
303116 100644 (1) 0 0 8880 4-Jan-2010 05:21 work-OST0007
303117 100644 (1) 0 0 8880 4-Jan-2010 05:22 work-OST0008
303118 100644 (1) 0 0 8880 4-Jan-2010 05:23 work-OST0009
303119 100644 (1) 0 0 8880 4-Jan-2010 05:23 work-OST000a
303120 100644 (1) 0 0 8880 4-Jan-2010 05:23 work-OST000b
303121 100644 (1) 0 0 0 4-Jan-2010 11:39 work2-client
The work2-client file is zero size, and all the OST and MDT files for the work2
file system are missing.
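The same check can be scripted. Below is a minimal sketch in Python, assuming
the listing format shown above (inode, mode, type, uid, gid, size, date, time,
name). The names parse_listing and check_fs are hypothetical helpers, not part
of Lustre or e2fsprogs, and the sample is an abridged copy of the listing.

```python
import re

# Abridged copy of the 'ls -l CONFIGS' output shown above.
SAMPLE = """\
303105 40777 (2) 0 0 4096 4-Jan-2010 11:39 .
303106 100644 (1) 0 0 12288 22-May-2009 10:59 mountdata
303107 100644 (1) 0 0 28704 4-Jan-2010 05:15 work-client
303108 100644 (1) 0 0 27936 4-Jan-2010 05:15 work-MDT0000
303109 100644 (1) 0 0 8880 4-Jan-2010 05:16 work-OST0000
303121 100644 (1) 0 0 0 4-Jan-2010 11:39 work2-client
"""

# Assumed column layout: inode mode (type) uid gid size date time name
LINE_RE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+\((\d+)\)\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)\s*$"
)


def parse_listing(text):
    """Return {name: size} for the regular files in a debugfs listing."""
    sizes = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # banner or warning line, not a directory entry
        mode, size, name = m.group(2), int(m.group(6)), m.group(9)
        if mode.startswith("100"):  # octal mode 100xxx is a regular file
            sizes[name] = size
    return sizes


def check_fs(sizes, fsname):
    """Summarize the configuration logs found for one file system."""
    logs = {n: s for n, s in sizes.items() if n.startswith(fsname + "-")}
    prefix = re.escape(fsname)
    return {
        "empty": sorted(n for n, s in logs.items() if s == 0),
        "has_client": (fsname + "-client") in logs,
        "has_mdt": any(re.fullmatch(prefix + r"-MDT[0-9a-f]{4}", n) for n in logs),
        "has_ost": any(re.fullmatch(prefix + r"-OST[0-9a-f]{4}", n) for n in logs),
    }


if __name__ == "__main__":
    sizes = parse_listing(SAMPLE)
    for fs in ("work", "work2"):
        print(fs, check_fs(sizes, fs))
```

To use it on a live MGS, the output of the debugfs command above would be
passed to parse_listing in place of SAMPLE. A zero-size client log or a
missing MDT/OST log for a file system that is supposed to be configured
matches the symptom described here.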
Is there a way to recover these files without stopping the work2 file system?
If I unmount all work2 OSTs and the MDT, run writeconf on them, and then
mount them back, would this recreate the missing files?
Also, can I do the above without unmounting the clients (letting them wait
until the Lustre targets come back), and would this kill any jobs running on
them?
Many thanks for your input
Cheers
Wojciech
--
Wojciech Turek
Assistant System Manager
High Performance Computing Service
University of Cambridge
Email: wjt27 at cam.ac.uk
Tel: (+)44 1223 763517