[Lustre-discuss] SAN, shared storage, iscsi using lustre?

Kit Westneat kwestneat at datadirectnet.com
Tue Aug 12 13:17:40 PDT 2008


Lustre is different from something like StorNext or other clustered 
filesystems in that the clients never actually touch the storage; instead 
they communicate with the servers, which in turn communicate with the 
storage. That's why it really doesn't matter what Lustre uses as its backing 
filesystem: the backing filesystem is only ever mounted by the storage server.
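
To make that concrete, here is roughly what a minimal setup looks like
(Lustre 1.6-style commands; the hostnames, devices and fsname below are just
examples, not a recommendation for your hardware):

  # on the MDS: format and mount a combined MGS/MDT
  mkfs.lustre --fsname=testfs --mgs --mdt /dev/sda
  mount -t lustre /dev/sda /mnt/mdt

  # on each OSS: format and mount an OST, pointing it at the MGS
  mkfs.lustre --fsname=testfs --ost --mgsnode=mds1@tcp0 /dev/sdb
  mount -t lustre /dev/sdb /mnt/ost0

  # on a client: mount the whole filesystem over the network
  mount -t lustre mds1@tcp0:/testfs /mnt/testfs

Only the servers ever mount the ext3-backed block devices; the clients only
do that last mount, over the network.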

This white paper is a good intro to the architecture of Lustre:
www.sun.com/software/products/lustre/docs/lustrefilesystem_wp.pdf

- Kit

Alex wrote:
> On Tuesday 12 August 2008 19:45, Brian J. Murrell wrote:
>   
>> On Tue, 2008-08-12 at 19:20 +0300, Alex wrote:
>>     
>>> My problem is described below:
>>>
>>> Let's say that I have:
>>> - N computers (N>8) sharing their volumes (volX, where X=1..N). Each volX is
>>> around 120GB.
>>>       
>> What exactly do you mean "sharing their volumes"?
>>     
>
> I mean that I'm exporting a block device via iSCSI (it could be an entire hard
> disk or just a slice). In this case it is an entire hard disk, 120GB large,
> named for simplicity volX (vol1, vol2 ... vol8), because I have 8 computers
> doing this.
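
For reference, I'm assuming each computerX exports its disk with something
like iSCSI Enterprise Target; the target name and device below are only
examples:

  # /etc/ietd.conf on computerX
  Target iqn.2008-08.lab.example:volX
          Lun 0 Path=/dev/sdb,Type=fileio

Any iSCSI target implementation behaves the same from Lustre's point of view,
since the server just sees a block device.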
>
>   
>>> - M servers (M>3), which are accessing a clustered shared storage volume
>>> (read/write)
>>>       
>> Where/what is this clustered shared storage volume these servers are
>> accessing?
>>     
>
> For example, it could be a GFS shared volume over volX. Here with Lustre I
> don't know ... you tell me ...
>
>   
>>> Now, I want:
>>> - to somehow build a cluster file system on top of the vol1, vol2, ... volN
>>> volumes
>>>       
>> Do you mean "disk" or "partition" when you say "volumes" here and are
>> these disks/partitions in the "N computers" you refer to above?
>>     
>
> Yes. There are 8 disks, one exported via iSCSI by each computer in our test
> lab. Call it, informally, a SAN. It does not matter whether they are disks or
> partitions; they are block devices which can be accessed by each of our
> SERVERS (SERVER1, SERVER2 and SERVER3) using iSCSI, and attached locally
> as /dev/sda, /dev/sdb, ... /dev/sdh! I tried GFS before posting here, but
> because Red Hat does not support software RAID in a GFS cluster, I could not
> set up failover. For example, I can't group /dev/sda and /dev/sdb into
> /dev/md0, and so on up to /dev/md3, and after that use LVM to unify
> md0+md1+md2+md3 into one logical volume (mylv) carrying a clustered file
> system (GFS) on top!
>
> I can't mkfs.gfs ... /dev/myvg/mylv and then mount -t
> gfs /dev/myvg/mylv /var/shared_data on ALL our SERVERS!
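
If I follow that correctly, the stack you tried looks roughly like the
commands below; the RAID level, cluster name and journal count are my
assumptions, not something you stated:

  # on one SERVER, over the imported iSCSI disks
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb
  mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc /dev/sdd
  mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sde /dev/sdf
  mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/sdg /dev/sdh
  pvcreate /dev/md0 /dev/md1 /dev/md2 /dev/md3
  vgcreate myvg /dev/md0 /dev/md1 /dev/md2 /dev/md3
  lvcreate -l 100%FREE -n mylv myvg
  mkfs.gfs -p lock_dlm -t mycluster:shared_data -j 3 /dev/myvg/mylv

The same md/LVM stack is fine under Lustre, precisely because only one server
ever mounts the backing filesystem at a time.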
>
>   
>>> - the resulting logical volume to be used on SERVER1, SERVER2 and SERVER3
>>> (read/write access at the same time)
>>>       
>> Hrm.  This is all quite confusing, probably because you are not yet
>> understanding the Lustre architecture.  To try to map what you are
>> describing to Lustre, I'd say your "N computers" are an MDS and OSSes
>> and their 120GB "volumes" are an MDT and OSTs (respectively) and your "M
>> servers" are Lustre clients.
>>     
>
> I don't know Lustre; that's why I asked. I just want to know if it is
> possible ... If the answer is yes, my question is: who will be the MDS and
> WHO will be the OSSes? How MANY MDSes and HOW MANY OSSes do I NEED in order
> to obtain what I want!
>
>   
>>> - Using Lustre, can I join all volX (exported via iSCSI) together into one
>>> bigger volume (using RAID/LVM) and have fault-tolerant SHARED STORAGE
>>> (failure of a single drive (volX) or server (computerX) should not bring
>>> down the storage)?
>>>       
>> I don't think this computes within the Lustre architecture.  You
>> probably need to review what Lustre does and how it works again.
>>     
>
> No. My questions refer to the situation described above and also to the
> Lustre FAQ.
>
> [snip]
> Can you run Lustre on LVM volumes, software RAID, etc?
>
> Yes. You can use any Linux block device as storage for a backend Lustre server
> file system, including LVM or software RAID devices.
> [end snip]
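
In practice that means an OSS can treat a software-RAID or LVM device exactly
like any other disk when formatting an OST. A rough sketch, with the fsname
and MGS address as examples only:

  # on an OSS, over an md or LVM device built from the iSCSI LUNs
  mkfs.lustre --fsname=testfs --ost --mgsnode=mds1@tcp0 /dev/myvg/mylv
  mount -t lustre /dev/myvg/mylv /mnt/ost0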
>
> And more from the FAQ....
>
> [snip]
> Which storage interconnects are supported?
>
> Just to be clear: Lustre does not require a SAN, nor does it require a fabric
> like iSCSI. It will work just fine over simple IDE block devices. But because
> many people already have SANs, or want some amount of shared storage for
> failover, this is a common question.
>
> For storage behind server nodes, FibreChannel, InfiniBand, iSCSI, or any other
> block storage protocol can be used. Failover functionality requires shared
> storage (each partition used active/passive) between a pair of nodes on a
> fabric like SCSI, FC or SATA.
> [end snip]
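
The failover piece corresponds to the --failnode option at format time plus a
target that both OSS nodes can reach (only one mounts it at any moment). Again
just a sketch, with the node names as examples:

  # OST reachable from both oss1 and oss2; oss2 is the failover node
  mkfs.lustre --fsname=testfs --ost --mgsnode=mds1@tcp0 \
      --failnode=oss2@tcp0 /dev/sdb
  # normally mounted on oss1; after a failure, oss2 mounts the same device
  mount -t lustre /dev/sdb /mnt/ost0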
>
> So, for me, reading this, it is very clear even without being an expert that
> Lustre supports BLOCK DEVICES in any RAID/LVM configuration... Also, Lustre
> can work with my iSCSI block devices /dev/sd[a-h] ...
>
> I asked here hoping that someone is using RAID/LVM over BLOCK DEVICES
> IMPORTED via iSCSI in production and can confirm that what I want is not a
> utopia!
>
>   
>>> - I have one doubt regarding Lustre: I saw that it is using EXT3 on top,
>>> which is a LOCAL FILE SYSTEM not suitable for SHARED STORAGE (different
>>> computers accessing the same volume and writing to it at the same time).
>>>       
>> This is moot.  Lustre manages the ext3 filesystem as its backing store
>> and provides shared access.
>>     
>
> This is not clear at all... Generally speaking, ext3 is a local file system
> (used on one computer). Reading the FAQ, I didn't find an answer, so I asked
> here...
>
>   
>>> - So, using Lustre's patched kernels and tools, ext3 becomes suitable for
>>> SHARED STORAGE?
>>>       
>> You probably just need to ignore that Lustre uses ext3 "under the hood"
>> and trust that Lustre deals with it appropriately.
>>     
>
> No, I can't ignore it ... I want to be sure that the ext3 used by Lustre is a
> clustered file system. Red Hat NEVER recommended their ext3 as a file system
> for clusters; they use GFS for that. I saw a lot of other howtos on the net
> which ignore the parallel-write problem in cluster configurations and teach
> people how to use, for example, xfs to mount the same partition on several
> servers and write to it at the same time... So, if Lustre's ext3 file system
> is clustered, why does nobody add a note to the FAQ saying: "we are using a
> patched ext3 version, which differs from Red Hat's ext3 because it supports
> clustered use, like GFS"...
>
> Regards,
> Alx
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>   


-- 
---
Kit Westneat
kwestneat at datadirectnet.com
812-484-8485



