[Lustre-discuss] clarification on mkfs.lustre options

Sebastian Gutierrez gutseb at cs.stanford.edu
Mon Aug 2 22:06:17 PDT 2010


Hello

During this upgrade my plan was to add an OSS and new OSTs to my FS, 
deactivate my old OSTs, and migrate the data by copying the old data into 
/lustre/tmp.  Once most of the data was moved over I was going to 
schedule an outage to rsync the deltas.  I was then going to empty the 
old OSTs, upgrade the disks in those OSTs, copy the /CONFIG* data 
over to the newly created disks, and rebalance the data across the 
OSTs.  However, I ran into an issue caused by a typo, detailed below.


> For RAID-1+0 the alignment is much less important. While there is still some negative effect if the 1MB read or write is not aligned (because it will make an extra pair of disks active to fill the RPC), this is not nearly so bad as RAID-5/6, where it will also cause the parity chunk to be rewritten.
>
> If you are using a 6-disk RAID-1+0 then it would be OK, for example, to configure the RAID chunksize to be 128kB. While this means that a 1MB IO would handle 3*128kB from two of the disk pairs and 2*128kB from the third pair (each IO would be sequential though).
>
>   It means a given pair of disks would do a bit more work than the others for a given RPC, but since the IO is sequential (assuming the request itself is sequential) it will not need an extra seek for the last disk and the extra IO is a minimal effort.
>
>    
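The arithmetic in the quoted explanation can be checked quickly: an aligned 1MB RPC is 8 chunks of 128kB, which land round-robin across the 3 data-disk pairs as 3 + 3 + 2 (mirrors add no capacity, so only the pairs count). A minimal shell sketch, using the geometry from the quote:

```shell
# How a 1MB RPC maps onto 128kB chunks across 3 data-disk pairs
io_kb=1024
chunk_kb=128
pairs=3

chunks=$(( io_kb / chunk_kb ))    # 8 chunks per aligned 1MB RPC
base=$(( chunks / pairs ))        # every pair serves at least this many
extra=$(( chunks % pairs ))       # this many pairs serve one chunk more

echo "chunks=$chunks base=$base extra=$extra"   # chunks=8 base=2 extra=2
```

So two pairs serve 3 chunks each and the third serves 2, which is the "a bit more work" imbalance the quote describes.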

It looks like I had a typo in my config the first time I created the 
FS.  I had planned ahead and have extra disks to shuffle things around.  
While migrating data off of the old OSTs, these settings seem to have 
left me missing about 5 TB out of 20 TB.

The first time I created the FS on my 6-disk RAID-1+0 arrays, I used 
the following (incorrect) options:
--mkfsoptions="-E stripe=256 -E stride=32"

To recover from this I am performing the following:

I am recreating the FS with the settings below, cp'ing the contents of 
OST.old to OST.new, then remounting OST.new as OST.old.

<stripe-width> = (<chunk size> * <number of data disks>) / <block size>
96 = (128kB * 3) / 4kB
--mkfsoptions="-E stripe-width=96,stride=32"
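For anyone repeating this, the two mke2fs extended options fall out of the same geometry. A small sketch (the 6-disk RAID-1+0 / 128kB chunk / 4kB block numbers are from this setup; substitute your own):

```shell
# Derive mke2fs -E options for a RAID-1+0 of 3 data pairs,
# 128kB chunk size, 4kB filesystem block size.
chunk_kb=128
data_disks=3     # mirrored pairs count once
block_kb=4

stride=$(( chunk_kb / block_kb ))         # 32: fs blocks per RAID chunk
stripe_width=$(( stride * data_disks ))   # 96: fs blocks per full stripe

echo "--mkfsoptions=\"-E stride=$stride,stripe-width=$stripe_width\""
```

This prints the stride=32, stripe-width=96 combination used above.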

I have a couple of sanity-check questions.
If I have the old OST and new OST mounted side by side, would it be 
enough to do a cp -ar of the /ost/O dir, or should I use a different 
migration procedure?

I have found some mention on lustre-discuss that using a tool that 
backs up the xattrs is preferable.  I am assuming that cp -a should be 
sufficient, since it is supposed to preserve everything.  In the 
lustre-discuss articles I only saw mention of the patched tar and 
rsync.  Is there any reason not to trust cp?
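One way to settle this before trusting a copy tool with OST data is to test whether it actually carries xattrs on your particular kernel/filesystem/coreutils combination. A throwaway sketch (the temp paths are illustrative, not the real OST; getfattr/setfattr are from the attr package):

```shell
# Quick check: does `cp -a` preserve user xattrs on this system?
tmp=$(mktemp -d)
echo data > "$tmp/src"

# set a user xattr if the filesystem supports it
setfattr -n user.demo -v lustre "$tmp/src" 2>/dev/null \
    || echo "xattrs unsupported on this filesystem"

cp -a "$tmp/src" "$tmp/dst"

# dump the xattrs of both copies; the dumps should match
getfattr -d -m - "$tmp/src" 2>/dev/null
getfattr -d -m - "$tmp/dst" 2>/dev/null
rm -rf "$tmp"
```

rsync needs -X explicitly to copy xattrs, and plain unpatched tar drops them, which may be why those two come up on the list in their patched/flagged forms.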

On a related tangent: I also found that the documentation in the manual 
is a bit out of date.  The manual refers to the <stripe-width> option 
as <stripe>, but the current version of mkfs.lustre only takes 
<stripe-width> as a valid option.  I will submit a documentation bug 
for this tomorrow.

Thank You
Sebastian
