[Lustre-discuss] GlusterFS and Lustre

laytonjb at charter.net laytonjb at charter.net
Wed Apr 30 09:15:59 PDT 2008


---- Craig Tierney <Craig.Tierney at noaa.gov> wrote: 
> laytonjb at charter.net wrote:
> > ---- Craig Tierney <Craig.Tierney at noaa.gov> wrote: 
> >> rishi pathak wrote:
> >>> I came across this www.gluster.org <http://www.gluster.org>
> >>> Has anyone tried it?
> >>> Is it a true parallel file system, allowing concurrent reads and writes to
> >>> a file by many processes?
> >>> Will it be suitable for HPC applications?
> >>>
> >>>
> >> I wouldn't call GlusterFS a parallel filesystem in the same way I would
> >> refer to Lustre or PVFS. GlusterFS is a distributed filesystem,
> >> where complete files are contained on one of multiple servers. 
> > 
> > This isn't quite accurate. Depending upon the translators you use, the files
> > can be striped across servers. For clusters it is almost always the case that
> > the files will be striped.
> > 
> >> striping, even they say striping for their implementation is bad 

Hmm... The last time I talked to AB he suggested using striping for better
performance. But as you say below, it depends upon the stripe size and
other translators in use (I've seen that drive performance).
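
To make the stripe-size point concrete, here is a minimal sketch of generic
round-robin (RAID-0 style) striping. It is an illustration only and is not
claimed to match the on-disk layout of the GlusterFS stripe translator or of
Lustre; the function names and the 1 MiB stripe size are assumptions chosen
for the example.

```python
def locate_stripe(offset, stripe_size, num_servers):
    """Map a file byte offset to (server index, offset within that server's piece).

    Assumes plain round-robin striping: stripe 0 goes to server 0,
    stripe 1 to server 1, and so on, wrapping around.
    """
    if offset < 0 or stripe_size <= 0 or num_servers <= 0:
        raise ValueError("offset must be >= 0; stripe_size and num_servers must be > 0")
    stripe_index = offset // stripe_size
    server = stripe_index % num_servers
    # Each server stores every num_servers-th stripe back to back.
    local_offset = (stripe_index // num_servers) * stripe_size + offset % stripe_size
    return server, local_offset


def servers_touched(offset, length, stripe_size, num_servers):
    """Return the sorted list of servers a single contiguous request hits."""
    if length <= 0:
        raise ValueError("length must be > 0")
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    if last - first + 1 >= num_servers:
        return list(range(num_servers))
    return sorted({s % num_servers for s in range(first, last + 1)})


if __name__ == "__main__":
    MiB = 1024 * 1024  # hypothetical stripe size
    print(locate_stripe(5 * MiB + 10, MiB, 4))
    print(servers_touched(0, 4096, MiB, 4))      # small request
    print(servers_touched(0, 8 * MiB, MiB, 4))   # large streaming request
```

Under these assumptions a small request lands on a single server, while a
large streaming request is spread over all of them, which is one reason the
stripe size relative to the typical I/O size matters.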

> >> (http://www.gluster.org/docs/index.php/GlusterFS_FAQ#Why_is_striping_bad.3F).
> >> Because of GlusterFS's modular architecture it was easy for them to implement.
> >> They do have MPI-IO support on their roadmap, so maybe they are planning to work around
> >> the issues described in the link above in user space.
> >>
> 
> Yes, there is a translator that will stripe files.  However, see the above comment.
> Even they say it isn't a good idea to use it.
> 
> I don't see why, for clusters, it would always be the case that files will be striped. Are
> you implying that "clusters" means "large distributed HPC systems that read/write very large files"?

I like the idea of striped files from the perspective that, if I lose the server
where the file is located, I've lost access to the file until it's restored. I can
mirror the file, but that wastes space.

But, as you point out, it depends upon the application(s). (I think I'll get a
tattoo that says that :)  ).

> There is implicit overhead in reconstructing a striped file that will impact performance
> (but it could be minimal; I haven't tested it).

Yep - good comment. I haven't tested reconstruction, either manual or via AFR.

> Streaming performance may be better, but what
> about random IO patterns?  If my codes don't do parallel IO, why would I necessarily
> add the complexity?
> 
> I know Lustre does striping quite well, but not all applications require it.
> 
> >> GlusterFS is much more like Ibrix or Netapp/GX than Lustre.  It seems best as a distributed NFS
> >> replacement.  In my minimal testing, performance scales linearly as you add data servers.
> >> Metadata performance is reasonable (by feel, not by actual measurements).
> > 
> > One of the design ideas behind GlusterFS is that it doesn't have a metadata
> > server. So I'm not sure what you were measuring. It may have been the
> > metadata performance for the underlying file system rather than GlusterFS.
> 
> By metadata performance, I meant IOPS.  It doesn't have a dedicated metadata server,
> but all servers perform the function.  The streaming performance is quite
> good, but what if I need to use NetCDF files, compile code, or use the filesystem
> as a large distributed mailserver?
> 
> When I say streaming performance is good, I mean I have been able to get a single server
> to push about 300 MB/s.  This is a limitation of my storage device, not the
> filesystem.  I don't know how it performs over the IB transport when a faster
> disk array is used.
> 
> > 
> > I haven't tested it yet, but it has some interesting ideas (all in user-space so
> > there are no kernel mods to worry about, no metadata server, stackable
> > translators for tuning performance). 
> > 
> 
> Yes, these features are very nice.  I liked that I could get it running on an older
> kernel in only a few minutes (a kernel not supported by the Lustre server).  So far it is
> meeting my needs for a small application.  I haven't been using it long, so
> I cannot comment on long term stability.  When I have some larger storage servers,
> I plan to test it further (as well as Lustre).
> 


Jeff


