[Lustre-discuss] OST acting up

Ron Croonenberg ronc at lanl.gov
Thu Nov 13 09:35:12 PST 2014


> Hi Fernando,
>
> It looks like I had to clear the logs not only on the MDT but
> apparently on all OSTs as well.  There was nothing wrong with OST03;
> it looks like the MDS just didn't see it.
>
>
> Ron
>
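Assuming "clearing the logs" above refers to regenerating the Lustre configuration logs with `tunefs.lustre --writeconf`, the order of operations can be sketched as a dry run that only prints the steps. The MDT dataset name and every OST dataset other than `OST03/ost03` are hypothetical placeholders, since the thread does not name them; check the Lustre manual for your version before running anything for real.

```shell
#!/bin/sh
# Dry-run sketch: prints the order of steps, executes nothing.

# Hypothetical MDT dataset (not named in the thread).
MDT_DATASET="MDT00/mdt00"
# OST datasets; only OST03/ost03 appears in the thread, the rest are placeholders.
OST_DATASETS="OST00/ost00 OST01/ost01 OST02/ost02 OST03/ost03"

writeconf_plan() {
    echo "# 1. on every client: umount /mnt/l2-lustre"
    echo "# 2. unmount the MDT on the MDS, then the OSTs on each OSS"
    # The MDT is regenerated first, then every OST.
    echo "tunefs.lustre --writeconf $MDT_DATASET"
    for ost in $OST_DATASETS; do
        echo "tunefs.lustre --writeconf $ost"
    done
    echo "# 3. remount in order: MGS/MDT, then OSTs, then clients"
}

writeconf_plan
```

Printing the plan first makes it easy to review the target list on each server before any configuration log is actually erased.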
> On 11/13/2014 10:29 AM, Fernando Pérez wrote:
>> I had the same problem in the past with Lustre 2.4.2 when I replaced
>> an OST because of a hardware problem.
>>
>> Stopping Lustre after reformatting the new OST (unmount clients, stop
>> MGS/MDS, unmount OSTs) and starting everything again was the only
>> thing that solved this problem for me.
>>
>> Regards.
>>
>> =============================================
>> Fernando Pérez
>> Institut de Ciències del Mar (CMIMA-CSIC)
>> Departament Oceanografía Física i Tecnològica
>> Passeig Marítim de la Barceloneta,37-49
>> 08003 Barcelona
>> Phone:  (+34) 93 230 96 35
>> =============================================
>>
>>> On 13/11/2014, at 17:33, Ron Croonenberg <ronc at lanl.gov> wrote:
>>>
>>> Actually that OSS has 4 OSTs  (I'll check the logs again for some
>>> obvious stuff)
>>>
>>> I tried a few things:
>>>
>>> Since the MDS doesn't seem to know about that OST, I tried this:
>>>
>>> mkfs.lustre --ost --reformat --backfstype=zfs --fsname=l2
>>> --mgsnode=10.1.17.1@o2ib42 --failover=10.1.17.22@o2ib42 --index=3
>>> OST03/ost0
>>>
>>> when I try to mount that OST I get:
>>> mount.lustre: mount OST03/ost03 at /lustre/l2/ost03 failed: Address
>>> already in use
>>> The target service's index is already in use. (OST03/ost03)
>>>
>>> (also if I try to do an mkfs.lustre with a --writeconf)
>>>
>>>
>>> When I do an mkfs.lustre with the --replace option (saying this OST
>>> is going to replace the one with index 3), it 'seems' to work.
>>>
>>> With the replace option I can mount the OST on the OSS, and an lctl dl
>>> shows it's up, but it doesn't show up on the MDS.
>>>
>>> Ron
>>>
>>>
>>> On 11/13/2014 08:54 AM, Mohr Jr, Richard Frank (Rick Mohr) wrote:
>>>>
>>>> On Nov 13, 2014, at 10:49 AM, Ron Croonenberg <ronc at lanl.gov>
>>>>   wrote:
>>>>
>>>>> I am using Lustre 2.4.2 and have an OST that doesn't seem to be
>>>>> written to.
>>>>>
>>>>> When I check the MDS with 'lctl dl' I do not see that OST in the list.
>>>>> However, when I check the OSS that OST belongs to, I can see it is
>>>>> mounted and up:
>>>>>
>>>>>   0 UP osd-zfs l2-OST0003-osd l2-OST0003-osd_UUID 5
>>>>>   3 UP obdfilter l2-OST0003 l2-OST0003_UUID 5
>>>>>   4 UP lwp l2-MDT0000-lwp-OST0003 l2-MDT0000-lwp-OST0003_UUID 5
>>>>>
>>>>>
>>>>> Since it isn't written to (the MDS doesn't seem to know about it), I
>>>>> created a directory. The index of that OST is 3, so I did a "lfs
>>>>> setstripe -i 3 -c 1 /mnt/l2-lustre/test-37" to force anything
>>>>> written in that directory to go to OST03.
>>>>>
>>>>> However when I issue that command I get:
>>>>>
>>>>> -bash-4.1# lfs setstripe -i 3 -c 1 /mnt/l2-lustre/test-37
>>>>> error on ioctl 0x4008669a for '/mnt/l2-lustre/test-37' (3): Invalid
>>>>> argument
>>>>> error: setstripe: create stripe file '/mnt/l2-lustre/test-37' failed
>>>>
>>>>
>>>>
>>>> Does that OSS server only have one OST?  If so, could there be a
>>>> communication problem between the MDS server and the OSS server?  Is
>>>> there anything in the log files that indicates the OSS server is
>>>> trying to connect to the MDS server but fails for some reason?
>>>>
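The comparison described in the original report (the OST appears in `lctl dl` on the OSS but not on the MDS) can be scripted. The following is a minimal sketch: `has_target` is a hypothetical helper name, and it assumes the device name is the fourth whitespace-separated column, as in the listing quoted in this thread.

```shell
#!/bin/sh
# has_target LABEL: read `lctl dl` output on stdin and succeed if any
# device name (4th column) contains LABEL.
has_target() {
    awk -v label="$1" 'index($4, label) { found = 1 } END { exit !found }'
}

# Sample taken from the OSS listing quoted in this thread.
oss_listing='  0 UP osd-zfs l2-OST0003-osd l2-OST0003-osd_UUID 5
  3 UP obdfilter l2-OST0003 l2-OST0003_UUID 5
  4 UP lwp l2-MDT0000-lwp-OST0003 l2-MDT0000-lwp-OST0003_UUID 5'

if printf '%s\n' "$oss_listing" | has_target OST0003; then
    echo "OST0003 present"
else
    echo "OST0003 missing"
fi
```

On a live system the same function could be fed directly, for example `lctl dl | has_target OST0003` on the MDS; a non-zero exit status there would match the symptom reported above.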
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> Lustre-discuss at lists.lustre.org
>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
