[Lustre-discuss] GlusterFS and Lustre

Craig Tierney Craig.Tierney at noaa.gov
Wed Apr 30 09:00:27 PDT 2008


laytonjb at charter.net wrote:
> ---- Craig Tierney <Craig.Tierney at noaa.gov> wrote: 
>> rishi pathak wrote:
>>> I came across this www.gluster.org <http://www.gluster.org>
>>> Has anyone tried it?
>>> Is it a true parallel file system allowing concurrent read and write to
>>> a file by many processes?
>>> Will it be suitable for HPC applications?
>>>
>>>
>> I wouldn't call GlusterFS a parallel filesystem in the same way I would
>> refer to Lustre or PVFS. GlusterFS is a distributed filesystem,
>> where complete files are contained on one of multiple servers. 
> 
> This isn't quite accurate. Depending upon the translators you use, the files
> can be striped across servers. For clusters it is almost always the case that
> the files will be striped.
> 
>> They do support striping, even though they themselves say striping in their implementation is bad
>> (http://www.gluster.org/docs/index.php/GlusterFS_FAQ#Why_is_striping_bad.3F).
>> Because of GlusterFS's modular architecture it was easy for them to implement.
>> They do have MPI-IO support on their roadmap, so maybe they are planning to work around
>> the issues described in the link above in user space.
>>

Yes, there is a translator that will stripe files.  However, see the above comment.
Even they say it isn't a good idea to use it.
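For what it's worth, that striping is set up through a translator stanza in the volume-spec file. The sketch below is illustrative only — the server names, block size, and exact option syntax are assumptions and vary across GlusterFS releases, so check the volfile docs for your version:

```
# Client-side volume spec (hypothetical): stripe across two servers.
volume remote1
  type protocol/client
  option transport-type tcp/client
  option remote-host server1
  option remote-subvolume brick
end-volume

volume remote2
  type protocol/client
  option transport-type tcp/client
  option remote-host server2
  option remote-subvolume brick
end-volume

volume stripe0
  type cluster/stripe
  option block-size 1MB        # chunk written to each server; syntax varies by release
  subvolumes remote1 remote2
end-volume
```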

I don't see why, for clusters, it would always be the case that files are striped.  Are
you implying that "clusters" means large distributed HPC systems that read/write very large files?
There is implicit overhead in reconstructing a striped file that will impact performance
(though it could be minimal; I haven't tested it).  Streaming performance may be better, but what
about random IO patterns?  If my codes don't do parallel IO, why would I necessarily
add the complexity?

I know Lustre does striping quite well, but not all applications require it.

>> GlusterFS is much more like Ibrix or Netapp/GX than Lustre.  It seems best as a distributed NFS
>> replacement.  In my minimal testing, performance scales linearly as you add data servers.
>> Metadata performance is reasonable (by feel, not by actual measurements).
> 
> One of the design ideas behind GlusterFS is that it doesn't have a metadata
> server. So I'm not sure what you were measuring. It may have been the
> metadata performance for the underlying file system rather than GlusterFS.

By metadata performance, I meant IOPS.  It doesn't have a dedicated metadata server;
all servers perform that function.  The streaming performance is quite
good, but what if I need to use NetCDF files, compile code, or use the filesystem
as a large distributed mail server?
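To make "by feel" a little more concrete: a crude way to gauge small-file metadata rates is just to time a batch of creates and unlinks. The path below is a stand-in, not a real mount — point DIR at your own GlusterFS mount to exercise the filesystem:

```shell
#!/bin/sh
# Crude metadata/IOPS feel: time creating and unlinking N empty files.
# DIR is a hypothetical path; set it to your GlusterFS mount point.
DIR=${DIR:-/tmp/gluster-mdtest}
N=${N:-1000}

mkdir -p "$DIR"
start=$(date +%s)
i=0
while [ "$i" -lt "$N" ]; do
    : > "$DIR/f$i"          # create an empty file
    i=$((i + 1))
done
i=0
while [ "$i" -lt "$N" ]; do
    rm -f "$DIR/f$i"        # unlink it again
    i=$((i + 1))
done
end=$(date +%s)
echo "create+unlink of $N files: $((end - start))s"
```

This is nowhere near a real benchmark (no concurrency, no cache control), but it separates "metadata feel" from streaming numbers.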

When I say streaming performance is good, I mean I have been able to get a single server
to push about 300 MB/s.  This is a limitation of my storage device, not the
filesystem.  I don't know how it performs over the IB transport when a faster
disk array is used.
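For reference, the kind of single-stream number quoted above can be eyeballed with a sequential dd write; dd prints the achieved rate when it finishes. The target path is a stand-in — point it at a file on the GlusterFS mount:

```shell
#!/bin/sh
# Rough streaming-write check: write COUNT MiB sequentially and let dd
# report the rate.  TARGET is a hypothetical path; use your GlusterFS mount.
TARGET=${TARGET:-/tmp/gluster-stream.dat}
COUNT=${COUNT:-256}
dd if=/dev/zero of="$TARGET" bs=1M count="$COUNT" conv=fsync 2>&1 | tail -n 1
rm -f "$TARGET"
```

conv=fsync makes dd flush before reporting, so the number reflects the storage rather than the page cache.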

> 
> I haven't tested it yet, but it has some interesting ideas (all in user-space so
> there are no kernel mods to worry about, no metadata server, stackable
> translators for tuning performance). 
> 

Yes, these features are very nice.  I liked that I could get it running on an older
kernel (one without Lustre server support) in only a few minutes.  So far it is
meeting my needs for a small application.  I haven't been using it long, so
I cannot comment on long-term stability.  When I have some larger storage servers,
I plan to test it further (as well as Lustre).

Craig



> Jeff
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
> 


-- 
Craig Tierney (craig.tierney at noaa.gov)


