[Lustre-discuss] OST acting up

Fernando Pérez fperez at icm.csic.es
Thu Nov 13 09:29:02 PST 2014


I had the same problem in the past with Lustre 2.4.2 when I replaced one OST because of a hardware problem.

The only thing that solved it for me was to stop Lustre completely after reformatting the new OST (unmount the clients, stop the MGS/MDS, unmount the OSTs) and then start everything again.
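
The sequence above can be sketched roughly as follows. This is only an illustration, not a tested procedure: the script prints the commands instead of running them, the function names restart_plan and ost_listed are hypothetical, and every mount point and device name is a placeholder supplied by the caller. It also assumes the MGS and MDT share one target and that things are started in the usual order (MGS/MDS first, then OSTs, then clients), since the description above only says to start everything again.

```shell
#!/bin/sh
# Sketch only: prints a full stop/start sequence; it does not execute anything.
# Every path and device name passed in is a placeholder chosen by the caller.

# Print the restart plan.
# Args: client mount point, MDT mount point, OST mount point, MDT device, OST device
restart_plan() {
    client_mnt=$1; mdt_mnt=$2; ost_mnt=$3; mdt_dev=$4; ost_dev=$5

    # Stop order, as described above: clients, then MGS/MDS, then OSTs.
    echo "umount $client_mnt"
    echo "umount $mdt_mnt"
    echo "umount $ost_mnt"

    # Start order (assumed): MGS/MDS first, then OSTs; clients are remounted last.
    echo "mount -t lustre $mdt_dev $mdt_mnt"
    echo "mount -t lustre $ost_dev $ost_mnt"
}

# Succeed if the named target appears in 'lctl dl' output read from stdin.
ost_listed() {
    grep -qF -- "$1"
}
```

After the restart, running `lctl dl | ost_listed l2-OST0003` on the MDS should succeed if the MDS now knows about the OST. The umount lines have to be repeated on every client and for every OST on every OSS.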

Regards.

=============================================
Fernando Pérez
Institut de Ciències del Mar (CMIMA-CSIC)
Departament Oceanografía Física i Tecnològica
Passeig Marítim de la Barceloneta,37-49
08003 Barcelona
Phone:  (+34) 93 230 96 35
=============================================

> On 13/11/2014, at 17:33, Ron Croonenberg <ronc at lanl.gov> wrote:
> 
> Actually that OSS has 4 OSTs  (I'll check the logs again for some obvious stuff)
> 
> I tried a few things:
> 
> Since the MDS doesn't seem to know about that OST, I tried this:
> 
> mkfs.lustre --ost --reformat --backfstype=zfs --fsname=l2 --mgsnode=10.1.17.1@o2ib42 --failover=10.1.17.22@o2ib42 --index=3 OST03/ost0
> 
> when I try to mount that OST I get:
> mount.lustre: mount OST03/ost03 at /lustre/l2/ost03 failed: Address already in use
> The target service's index is already in use. (OST03/ost03)
> 
> (I get the same error if I run mkfs.lustre with --writeconf.)
> 
> 
> When I run mkfs.lustre with the --replace option (saying this OST is going to replace the one with index 3), it 'seems' to work.
> 
> With the --replace option I can mount the OST on the OSS, and 'lctl dl' shows it is up, but it doesn't show up on the MDS.
> 
> Ron
> 
> 
> On 11/13/2014 08:54 AM, Mohr Jr, Richard Frank (Rick Mohr) wrote:
>> 
>> On Nov 13, 2014, at 10:49 AM, Ron Croonenberg <ronc at lanl.gov> wrote:
>> 
>>> I am using Lustre 2.4.2 and have an OST that doesn't seem to be written to.
>>> 
>>> When I check the MDS with 'lctl dl' I do not see that OST in the list.
>>> However, when I check the OSS that OST belongs to, I can see it is mounted and up:
>>> 
>>>  0 UP osd-zfs l2-OST0003-osd l2-OST0003-osd_UUID 5
>>>  3 UP obdfilter l2-OST0003 l2-OST0003_UUID 5
>>>  4 UP lwp l2-MDT0000-lwp-OST0003 l2-MDT0000-lwp-OST0003_UUID 5
>>> 
>>> 
>>> Since it isn't being written to (the MDS doesn't seem to know about it), I created a directory. The index of that OST is 3, so I ran "lfs setstripe -i 3 -c 1 /mnt/l2-lustre/test-37" to force anything written in that directory onto OST03.
>>> 
>>> However when I issue that command I get:
>>> 
>>> -bash-4.1# lfs setstripe -i 3 -c 1 /mnt/l2-lustre/test-37
>>> error on ioctl 0x4008669a for '/mnt/l2-lustre/test-37' (3): Invalid argument
>>> error: setstripe: create stripe file '/mnt/l2-lustre/test-37' failed
>> 
>> 
>> 
>> Does that OSS server only have one OST?  If so, could there be a communication problem between the MDS server and the OSS server?  Is there anything in the log files that indicates the OSS server is trying to connect to the MDS server but fails for some reason?
>> 



