<br><font size=2 color=#800080 face="sans-serif">Thanks much Daire,</font>

<br>

<br><font size=2 color=#800080 face="sans-serif">Your insights are very

much appreciated.</font>

<br>

<br><tt><font size=2>> Interesting configuration. Is there a particular

reason why you decided<br>

> on using Xen VMs? Is failover better with Xen instances? I'm guessing

<br>

> you don't have hundreds of clients hammering the hardware.</font></tt>

<br>

<br><font size=2 color=#800080 face="sans-serif">Cost was the deciding

factor for XEN.  True, failover and STONITH are FAST with XEN (it's

mostly all up in memory and killing a domU is easy), and boot time of

XEN domUs is measured in seconds (35 - 50) rather than minutes.  But it all came

down to cost.  We only get about 5k folks a day knocking at the Customer

Portal door, so not a lot of traffic (we're not Google by any stretch of

the imagination).</font>

<br>

<br><font size=2 color=#800080 face="sans-serif">A small group within our

ITS department got together to architect this solution in 3 days (Security,

Portal Apps Devel, App Admins, System Admins, Network Admins, Storage Admins).

 The choice to use XEN VMs really was based on cost.  We were

pleasantly surprised when our interim VP gave us X number of dollars to

create a high availability solution for our client facing customer portal

and those X number of dollars weren't a lot and were only available for a

limited time, so we had to hustle (use it or lose it for the year).  Out

of that bucket we needed to purchase network and server hardware, OS support,

App and Portal support, backup server client licenses, LDAP support and

high availability disk solution support (Lustre training and ongoing support

and initial configuration).   So the funds got gobbled up fast and

anywhere we could save a buck was reviewed and held weight.  Purchasing

RHEL 5 Advanced Platform Premium gave us 24x7 support and an unlimited number

of XEN servers (right value, right price).</font>

<br>

<br><font size=2 color=#800080 face="sans-serif">We bought low cost IBM

Intel xSeries servers for DEV, CERT, and PROD.  The DEV server is

running 5 XEN domUs.  The 3 CERT servers all together are running

13 XEN domUs.  The 5 PROD servers are all together running 22 XEN

domUs.  Nine physical servers total, 40 virtual, hardware that's maxed

out and configured with multiple dual HBAs and quad GigE NICs to the tune

of about $200k I think.  We've got WAS and WPS servers, HTTP servers,

LDAP servers, DB2 servers (thankfully free with the Portal server), Lustre

MDS / OSS all running side by side with each other.</font>

<br>

<br><font size=2 color=#800080 face="sans-serif">Our current customer facing

portal (we haven't cut over to the new hardware yet) consists of 16 servers

and dropping to 9 reduces our "carbon footprint", but definitely

increases the complexity of our environment.  Our data center, like

many, is power constrained (fully using our UPSes) and we have a large internal

push to consolidate and virtualize to realize full server utilization potential

(do more with less) as well as reduce energy costs.  We currently

use RHEL's GFS / CS, but we're on 3U8 (which uses disk pools) and from

RHEL4 and up, GFS uses LVM instead of disk pools.  This requires a

complete rebuild and hardware refresh any way you look at it, so we opened

the playing field to all HA disk solutions.  We wanted to decouple

the HA disk from the Application Server layer (to allow the app layer to

remain up when GFS panics...and yes, GFS has panicked the entire 6 node

cluster before and brought down the Application layer; bye-bye portal access;

RHEL 3 was very buggy).  Lustre allows us to do that and has a great

support base.</font>

<br>

<br><tt><font size=2>> I'm curious as to why you created 5 filesystems

on the same "hardware" <br>

> instead of one big filesystem?</font></tt>

<br>

<br><font size=2 color=#800080 face="sans-serif">Legacy filesystems, before

my time and reaching far into the past.  Those filesystems have migrated

from an IBM Regatta class p690 server, to the current RHEL GFS 16 server

environment, to this new environment.  We're working with the data

content owners to establish new filesystem guidelines which will include

archiving old / unused data and better recognizing data ownership.  But

that's Phase 2 of this project and another type of migration.  Phase

1 is the migration and implementation of a solid HA hardware / software

environment (or as SPOF free as we can make it).  Phase 2 will change

the entire file system structure and provide us with tools to enforce disk

usage and accountability (along with establishing better control over disk

growth and who to charge back for SAN expansion).  To change those

filesystems now would mean our entire development and publishing structure

would break down (automated publishing scripts would all break, several

integral connected servers that check data existence would go nuts, VPs

would have words with VPs, not a pretty scene).  Basically, politics

and corporate culture are the current reasons.  We just need time

to plan and carefully coordinate with all parties to develop a new filesystem

structure and get folks to start posting new data and migrating existing

data to new filesystems.</font>

<br>

<br><tt><font size=2>> I'm not sure it is possible to migrate an existing

filesystem to LVM <br>

> easily - you would need to do a file backup of your MDT first and

restore <br>

> to the LVM device (section 15.1.3.1 of the manual). So in your case

to <br>

> wipe (!) a single MDT and create a new one I'd do something like:<br>

> <br>

>   pvcreate /dev/xvdj<br>

>   vgcreate lusfs01 /dev/xvdj<br>

>   vgchange -a y lusfs01<br>

>   lvcreate -L3G -nmdt lusfs01</font></tt>

<br>

<br><font size=2 color=#800080 face="sans-serif">I have another limited

time opportunity open to me.  Into the middle of this large Customer

Portal project another project got dropped: a new SAN with all the fun

that comes with a new SAN (migration, migration, migration).   As

I'm being asked to migrate all my virtual domU OSes (which are located on

the old SAN) along with their corresponding data disks (also on the old

SAN), I figured I'd take advantage of that migration and instead rebuild

Lustre with LVM to get the benefits of journaling and snapshotting,

as you'd mentioned.  Thank you for making clear that the MDTs are

where you'd recommend using LVM, I wasn't sure if it was just the MDS servers

or both MDS + OSS.</font>

<br>

<br><font size=2 color=#800080 face="sans-serif">And thank you for taking

the time to answer.  Your reply is absolutely brilliant and what I'd

hoped for (it's exactly what I need to present my case to the business).

 We're not live, let's recreate and bring on board these additional

LVM features!  Give me some new SAN data disks for building up LVM,

I'll build it alongside the old SAN data disks, transfer the MDT data and

then drop the old SAN.  This is faster and more efficient than using

our SAN vendor's migration solution for the data disks (I still have to

use it for the OS disk though, but rebuilding with LVM is still a time

savings and a known procedure).</font>
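
<br>

<br><font size=2 color=#800080 face="sans-serif">For what it's worth, here is a rough dry-run sketch of how I picture that plan for a single MDT.  It only prints the commands and runs nothing.  The device, volume group, volume name and size are taken from your example; the old SAN device, the mount point and the staging directory are made-up placeholders, and the backup step is my reading of the file-level backup you pointed to in the manual, so treat it as an assumption rather than a tested procedure.</font>

<pre>
```shell
#!/bin/sh
# Dry-run sketch: prints the planned commands, executes nothing.
# /dev/xvdj, lusfs01, mdt and -L3G come from the quoted example.
# OLD_MDT, OLD_MNT and BACKUP are hypothetical placeholders.
NEW_PV=/dev/xvdj
VG=lusfs01
LV=mdt
OLD_MDT=/dev/OLD_SAN_MDT   # hypothetical: MDT device on the old SAN
OLD_MNT=/mnt/mdt_old       # hypothetical mount point
BACKUP=/root/mdt_backup    # hypothetical staging directory

plan() {
    # 1. Build the LVM-backed MDT volume on the new SAN disk.
    echo "pvcreate $NEW_PV"
    echo "vgcreate $VG $NEW_PV"
    echo "vgchange -a y $VG"
    echo "lvcreate -L3G -n$LV $VG"
    # 2. File-level backup of the old MDT (with Lustre stopped),
    #    saving the extended attributes alongside the file data.
    echo "mount -t ldiskfs $OLD_MDT $OLD_MNT"
    echo "cd $OLD_MNT"
    echo "getfattr -R -d -m '.*' -e hex -P . > $BACKUP/ea.bak"
    echo "tar czf $BACKUP/mdt.tgz --sparse ."
    echo "cd /"
    echo "umount $OLD_MNT"
    # 3. The restore onto /dev/$VG/$LV would follow the manual.
}

plan
```
</pre>

<br><font size=2 color=#800080 face="sans-serif">I'd review the printed list against the manual before running any of it by hand, and repeat it per filesystem with the right device names.</font>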

<br>

<br><font size=2 color=#800080 face="sans-serif">Cheers and many thanks

again, Daire,</font>

<br>

<br><font size=2 face="sans-serif">Ms. Andrea D. Rucks<br>

Sr. Unix Systems Administrator,<br>

Lawson ITS Unix Server Team</font>