[Lustre-discuss] OSTs not activating following MGS/MDS move

Christopher J. Walker C.J.Walker at qmul.ac.uk
Thu Mar 7 07:31:44 PST 2013


On 26/02/13 17:30, Colin Faber wrote:
> Hi,
> 
> 
> As a follow-up (for archival reasons): the issue Patrick experienced was
> CATALOG file corruption. Truncating the CATALOG file on the MDS, with the
> MDT mounted as ldiskfs, corrected his problem.
> 

Thanks for the follow up.

I'm about to undertake a similar move (though on 1.8.8-wc1) and would
like to avoid similar problems, so it would be useful to know whether
the CATALOG file corruption was caused by the procedure or was a
coincidence.

Chris


> -cf
> 
> 
> On 02/26/2013 08:43 AM, Patrick Shopbell wrote:
>> Hello everyone,
>> I am having an odd problem here, on our small Lustre
>> installation. We have a single MGS/MDS and 3 OSS's with
>> 7 OSTs total. I just tried moving the MDS/MGS to a faster
>> machine, following the instructions in sections 17.3 and 17.4
>> of the Lustre manual: with the system offline, I mounted
>> the file systems as "ldiskfs" and then used the Lustre tar
>> command to make a copy of everything. I checked a bunch of the
>> xattrs - all looked to match fine.
>>
>> Finally, I reset the system configs on the MDS/MGS with:
>>
>>> tunefs.lustre --writeconf /dev/md126
>> and on the OSSs with something like:
>>
>>> tunefs.lustre --writeconf /dev/sdb
>>> tunefs.lustre --erase-param --mgsnode=192.168.30.113@tcp --index=0 --writeconf /dev/sdb
>> where I kept the indices the same as in my original setup.
>>
>> I can now mount the MGS/MDS, and then mount the OSTs. However,
>> I get these three errors on the MGS, when an OST mounts:
>>
>> Feb 25 22:38:38 yupana kernel: LustreError: 3636:0:(lov_log.c:155:lov_llog_origin_connect()) error osc_llog_connect tgt 6 (-107)
>> Feb 25 22:38:38 yupana kernel: LustreError: 3636:0:(mds_lov.c:873:__mds_lov_synchronize()) lustre-OST0006_UUID failed at llog_origin_connect: -107
>> Feb 25 22:38:38 yupana kernel: LustreError: 3636:0:(mds_lov.c:903:__mds_lov_synchronize()) lustre-OST0006_UUID sync failed -107, deactivating
>>
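
For anyone reading this in the archive: the three messages above all carry
the same return code, -107, which on Linux is ENOTCONN ("Transport endpoint
is not connected"). Below is a minimal sketch, not taken from the thread,
for pulling the deactivated target and its return code out of such log
lines. The function names sync_failures and errno_name are made up for
illustration, and the pattern assumes the exact message wording quoted
above.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of Lustre.

# Read kernel log lines on stdin and print "<target_uuid> <rc>" for each
# "sync failed ..., deactivating" message (wording as quoted in this thread).
sync_failures() {
    sed -n 's/.* \([^ ]*_UUID\) sync failed \(-[0-9]*\), deactivating.*/\1 \2/p'
}

# Map a (negative) return code to its Linux errno name.
# Only the code seen in this thread is covered; anything else is "unknown".
errno_name() {
    case "${1#-}" in
        107) echo "ENOTCONN" ;;
        *)   echo "unknown" ;;
    esac
}
```

It could be fed from the syslog, for example "grep LustreError
/var/log/messages | sync_failures" (the log path is an assumption and
varies by distribution).
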
>> And when I run 'lctl dl', the OSTs are apparently all inactive:
>>
>>   5 IN osc lustre-OST0000-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
>>
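
The "IN" in the second column of the 'lctl dl' line above is what marks the
device as inactive. Below is a small sketch, not taken from the thread, for
listing every inactive osc device at once. The function name
inactive_devices is made up for illustration, and the column layout (index,
state, type, name, UUID, refcount) is assumed from the single line quoted
above.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative, not part of Lustre.

# Read 'lctl dl' output on stdin and print the name of every osc device
# whose state column is "IN" (inactive).
inactive_devices() {
    awk '$2 == "IN" && $3 == "osc" { print $4 }'
}
```

On the MDS it would be run as "lctl dl | inactive_devices". It only reports
the state; it does not change it.
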
>> Any ideas what I need to do to activate these? I am running
>> Lustre 2.3 on all nodes. I can see the file system on a client
>> and, it seems, read files, but I cannot create any new
>> files, presumably because the OSTs are not active.
>>
>> Thanks for your suggestions,
>> Patrick
>>
>> *---------------------------------------------------------------*
>> | Patrick Shopbell           Department of Astronomy            |
>> | pls at astro.caltech.edu      Mail Code 249-17                   |
>> | (626) 395-4097             California Institute of Technology |
>> | (626) 568-9352  (FAX)      Pasadena, CA  91125                |
>> | WWW: http://www.astro.caltech.edu/~pls/                       |
>> *---------------------------------------------------------------*
>>
>>
>>
>>
>> _______________________________________________
>> Lustre-discuss mailing list
>> Lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
> 
