[Lustre-discuss] MD1000 woes and OSS migration suggestions

Wojciech Turek wjt27 at cam.ac.uk
Wed Dec 30 17:55:41 PST 2009


Hi Nick,

I don't think you should invest in a new MD1000 brick just to run it
in split mode. Split mode doesn't give you much: your 15 disks are
split between two servers, but each server cannot see the other half
of the MD1000 storage. That means you get no extra redundancy and no
failover functionality in Lustre.
I think the best approach here would be to buy an MD3000 RAID array
enclosure (which is basically an MD1000 plus two built-in RAID
controller modules). It costs around £1.5k more than an MD1000 but it
is definitely worth it.
The MD3000 lets you connect up to two servers with fully redundant
data paths from each server to any virtual disk configured on the
MD3000 controller; see Figure 2-9, "Cabling Two Hosts (with Dual
HBAs) Using Redundant Data Paths", at the link below. You can also
connect up to four servers with non-redundant data paths; see Figure
2-6, "Cabling Up to Four Hosts with Non-redundant Data Paths".
http://support.dell.com/support/edocs/systems/md3000/en/2ndGen/HOM/HTML/operate.htm
In addition, you can hook up two extra MD1000 enclosures to a single
MD3000 array. They will be managed by the MD3000 RAID controllers,
which will make your life much easier.

To migrate your data from the Lustre file system 'lustre1' located on
OSS1, I suggest setting up a brand new Lustre file system 'lustre2'
on OSS2 connected to the MD3000 enclosure, then using your third
server (acting as a Lustre client) to mount both file systems and
copy the data from lustre1 to lustre2.
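
The copy step would look roughly like this (a minimal sketch; the NID
mds1@tcp0 and the mount points are made up, and lustre2 will need its
own MDT registered with your MGS):

  # on the third server (Lustre client): mount both file systems
  mount -t lustre mds1@tcp0:/lustre1 /mnt/lustre1
  mount -t lustre mds1@tcp0:/lustre2 /mnt/lustre2

  # initial bulk copy while lustre1 is still in production
  rsync -aH /mnt/lustre1/ /mnt/lustre2/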
At some point you will need to make lustre1 quiescent so that no new
writes are done to it; you can do that by deactivating all lustre1
OSTs on the MDS, and then make a final rsync pass between lustre1 and
lustre2. Once this is done you can unmount lustre1 and lustre2 and
mount lustre2 back under the lustre1 mount point.
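
Something along these lines (the device index and NID are examples
only; get the real lustre1 OSC device numbers from 'lctl dl' on the
MDS):

  # on the MDS: deactivate every lustre1 OST so no new objects land on it
  lctl dl                        # find the lustre1 OSC device numbers
  lctl --device 7 deactivate     # repeat for each lustre1 OST

  # on the client: final catch-up pass, then swap the mount points
  rsync -aH --delete /mnt/lustre1/ /mnt/lustre2/
  umount /mnt/lustre1
  umount /mnt/lustre2
  mount -t lustre mds1@tcp0:/lustre2 /mnt/lustre1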
Once you have your production Lustre file system running on lustre2,
you can disconnect the MD1000 from OSS1 and connect it to the MD3000
expansion ports, and connect OSS1 to the MD3000 controller ports.
This way you get extra space from the added MD1000, which you can use
to configure new OSTs and add them to the lustre2 file system. Since
both OSS1 and OSS2 can see each other's OSTs (thanks to the MD3000),
you can configure Lustre failover on these servers.
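
For the failover part, you declare each server as the failover
partner for the other's OSTs, for example (NIDs and device names here
are again made up for illustration):

  # when formatting a new OST on OSS1, name OSS2 as its failover node
  mkfs.lustre --ost --fsname=lustre2 --mgsnode=mds1@tcp0 \
      --failnode=oss2@tcp0 /dev/mapper/md1000_vd0

  # or add a failover NID to an already formatted OST
  tunefs.lustre --param="failover.node=oss1@tcp0" /dev/mapper/md3000_vd0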
If you need more capacity in the future, you can just connect a
second MD1000 to your MD3000 controller.

In my cluster I have six (MD3000 + MD1000 + MD1000) triplets
configured as a single large Lustre file system, which provides
around 180TB of usable RAID6 space and works pretty well, delivering
very good aggregate bandwidth.

If you have more questions, don't hesitate to drop me an email. I
have a bit of experience (bad and good) with this Dell hardware and
am happy to help.

Best regards,

Wojciech



2009/12/30 Nick Jennings <nick at creativemotiondesign.com>:
> Hi Everyone,
>
>  We've been using an MD1000 as our storage array for close to a year
> now, just hooked up to one OSS (LVM+ldiskfs). I recently ordered 2 more
> servers, one to be hooked up to the MD1000 to help distribute the load,
> the other to act as a lustre client (web node).
>
>  The hosting company informs me that the MD1000 was never setup to
> operate in split mode (which I asked for in the beginning) so basically
> only one server can be connected to it.
>
>  I'm now faced with a tough call: we can't bring the filesystem down
> for any extended period of time (a few minutes is OK, though zero
> downtime would be perfect!) and I'm not sure how to proceed in a way
> that causes the least amount of headache.
>
>  The only thing I can think of is to set up a second MD1000
> (configured for split mode), connect it to OSS2 (the new one, which
> is not yet in use), add it to the Lustre filesystem, and then somehow
> migrate the data from OSS1 (old MD1000) to OSS2 (new MD1000) ...
> then bring OSS1 offline, connect it to the second partition of the
> new MD1000, and bring that end online once more.
>
>  I've never done anything like this and am not entirely sure if this is
> the best method. Any suggestions, alternatives, docs or things to look
> out for would be greatly appreciated.
>
> Thanks,
> Nick
>
> --
> Nick Jennings
> Director of Technology
> Creative Motion Design
> www.creativemotiondesign.com



--
Wojciech Turek

Assistant System Manager

High Performance Computing Service
University of Cambridge
Email: wjt27 at cam.ac.uk
Tel: (+)44 1223 763517


