[Lustre-discuss] Another server question.

Robert Minvielle robert at lite3d.com
Tue Feb 3 08:29:42 PST 2009


I have been testing more since my last (premature) post. Some questions
come to mind and I am most likely just doing something wrong here...

I have five OSS nodes, one of which is also the MGS/MDT. Yes, it is a
totally bad idea to have the MGS/MDT on the same node as an OST/OSS, but
this is only a test. I shut down one of the servers (a normal shutdown,
and not the MGS of course).
OK, so the clients seem to be frozen with regard to Lustre. Many here
have noted that it should be OK, with the exception of files that were
stored on the downed server, but that does not seem to be the case here.
That is not my main concern, however. The real question is this: I bring
the server back up and check its ID by issuing lctl dl; on the MGS, a
cat /proc/fs/lustre/devices shows that ID as UP. So all seems well again,
but the client is still (somewhat) stuck. I unmount and remount the client
as per the Lustre FAQ, but it still has problems. I reboot the client, and
it still cannot perform certain filesystem operations (ls -lR, df, du,
find . all hang). I can create files, and read files if I know their
location, but I cannot seem to perform any "recursive" type actions on
the mount point on the client.

I also note that the client seems to see all of the servers in a
cat /proc/fs/lustre/devices.
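
For anyone who wants to compare the device lists across nodes, the check
described above can be sketched in Python. This is only a rough sketch: it
assumes each line of /proc/fs/lustre/devices is whitespace-separated as
index, status, type, name, UUID, refcount. The sample text, the "testfs"
names, and the helper names parse_devices, not_up and read_devices are all
hypothetical, not output from this setup.

```python
def parse_devices(text):
    """Parse device-list text into dicts (assumed layout: index status type name uuid ...)."""
    devices = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines or anything that does not match the assumed layout.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        devices.append({
            "index": int(fields[0]),
            "status": fields[1],
            "type": fields[2],
            "name": fields[3],
            "uuid": fields[4],
        })
    return devices


def not_up(devices):
    """Return the devices whose status column is anything other than UP."""
    return [d for d in devices if d["status"] != "UP"]


def read_devices(path="/proc/fs/lustre/devices"):
    """Read and parse the device list on a node that has Lustre loaded."""
    with open(path) as handle:
        return parse_devices(handle.read())


# Hypothetical sample text, only to illustrate the assumed layout.
SAMPLE = """\
  0 UP mgc MGC10.0.0.1@tcp 1a2b3c-uuid 5
  1 UP osc testfs-OST0000-osc testfs-clilov-uuid 5
  2 IN osc testfs-OST0001-osc testfs-clilov-uuid 5
"""

if __name__ == "__main__":
    for dev in not_up(parse_devices(SAMPLE)):
        print(dev["index"], dev["status"], dev["name"])
```

On a real node, not_up(read_devices()) would list whatever is not UP. An
empty result would only say that the devices are configured, not that the
client has finished reconnecting to the restarted server.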

I was going to restart the MGS/OSS servers, but the last time I did that
nothing worked afterwards and I had to start over. I have to be missing
something here. I thought you could reboot an OST at will with more or
less no side effects, other than clients not seeing the files that were on
that OST. I assume that is actually true, and that I am doing something
wrong when bringing it back up. Any ideas?



