[Lustre-discuss] Replace OST

Dilger, Andreas andreas.dilger at intel.com
Wed Mar 19 12:51:41 PDT 2014


On 2014/03/19, 9:35 AM, "Lee, Brett" <brett.lee at intel.com> wrote:

>Hi Jon,
>
>Looks like you are jumping in with both feet (new to Lustre and trying to
>replace an OST).  Pretty ambitious... ;)
>
>From what I can tell you are following section 14.8.3 in the latest
>Lustre manual:
>
>14.8.3.  Removing an OST from the File System
>http://build.whamcloud.com/job/lustre-manual/lastSuccessfulBuild/artifact/lustre_manual.xhtml
>
>I'm not familiar with the process (having not needed to do that), but I
>believe that step describes how to remove an *active* OST from the file
>system.
>
>Here's what I'd suggest:
>
>1.  Verify that each step in the *latest* manual has been executed
>successfully.
>
>2.  If the "lfs find" process has not completed yet, run "strace -p
><pid>" on the parent process and send the output to the list.

This looks like https://jira.hpdd.intel.com/browse/LU-1738, which hasn't
been fixed yet.  In the meantime, you can use "lfs getstripe" instead of
"lfs find".
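
A rough sketch of that workaround, for illustration only.  The function
name list_ost_files and the LFS override variable are hypothetical helpers,
not part of Lustre.  The OST name and /home are taken from the original
command; whether the name needs a "_UUID" suffix is worth checking against
"lfs df" output on the client.

```shell
# List files under a directory that have at least one object on a given OST.
# LFS is a hypothetical override so the command can be previewed with
# LFS=echo; by default the real lfs binary is used.
list_ost_files() {
    ost="$1"   # OST name, e.g. lustre-OST0002 (assumed form, see note above)
    dir="$2"   # directory on a mounted Lustre client, e.g. /home
    # --obd limits the listing to files with objects on that OST,
    # -r descends into subdirectories.
    ${LFS:-lfs} getstripe -r --obd "$ost" "$dir"
}

# Example, run on a client:
#   list_ost_files lustre-OST0002 /home > /tmp/getstripe_ost0002.txt
```

Note that "lfs getstripe" prints stripe layout details along with each file
name, so its output would need filtering before it could be fed to
"xargs unlink" the way the "lfs find -print0" output is.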

>3.  Tell the list which version of Lustre you are running.
>
>4.  Send the list the relevant syslog messages from the client, MDS, MGS,
>and the OSS with the deactivated OST.
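
For items 3 and 4, something along these lines may be a starting point.
This is a sketch under assumptions: the helper names and the LCTL override
are hypothetical, and the log path in the example (/var/log/messages) is
only a common default that may differ on your distribution.

```shell
# Print the Lustre version of the node this runs on.
# LCTL is a hypothetical override (e.g. LCTL=echo for a dry run).
lustre_version() {
    ${LCTL:-lctl} get_param -n version
}

# Print lines from a syslog file that carry the kernel's "Lustre:" or
# "LustreError:" message prefixes.
lustre_log_lines() {
    logfile="$1"
    grep -E 'Lustre(Error)?:' "$logfile"
}

# Example, run on each of the client, MDS/MGS and OSS:
#   lustre_version
#   lustre_log_lines /var/log/messages | tail -n 100
```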
>
>Note that for replacing a *fully* failed OST, I believe you would
>reformat the OST and then follow 14.8.5 (in the latest manual).
>
>Dr. Brett Lee, Solutions Architect
>High Performance Data Division, Intel
>+1.303.625.3595
>
>
>
>
>
>> -----Original Message-----
>> From: lustre-discuss-bounces at lists.lustre.org [mailto:lustre-discuss-
>> bounces at lists.lustre.org] On Behalf Of Jon Tegner
>> Sent: Monday, March 17, 2014 10:28 AM
>> To: lustre-discuss at lists.lustre.org
>> Subject: [Lustre-discuss] Replace OST
>> 
>> Hi,
>> 
>> I'm new to Lustre, so please excuse what are probably some stupid
>> questions.
>> 
>> I have set up a small test system, consisting of
>> 
>> * 1 MGS/MDT
>> * 2 OSS/OSTs
>> * 6 clients on infiniband and one on gigabit.
>> 
>> I have verified the scaling effect (increased performance with two OSTs
>> compared to one). I also wanted to gain some experience with component
>> failures, and as a first test I wanted to replace one of my OSS/OSTs.
>> I did the following (trying to follow chapter 14.7 in the manual):
>> 
>> 1. Unmounted the OST on one of my two OSS/OSTs (simulating a crash).
>> 
>> 2. Deactivated this OST on the MGS/MDT.
>> 
>> 3. Tried to remove the files located on this OST with the command:
>> 
>> lfs find --obd lustre-OST0002 -print0 /home |  tee /tmp/files_to_restore
>> | xargs -0 -n 1 unlink
>> 
>> The last command was issued from one of the clients, but it just hangs.
>> 
>> Is there something wrong with the way I'm trying to do this? Any help
>> would be greatly appreciated!
>> 
>> Thanks!
>> 
>> /jon
>> _______________________________________________
>> Lustre-discuss mailing list
>> Lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss


Cheers, Andreas
-- 
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division




