[lustre-discuss] Announce: Lustre Systems Administration Guide

Marcin Dulak marcin.dulak at gmail.com
Sat Nov 18 02:47:01 PST 2017

On Sat, Nov 18, 2017 at 4:20 AM, Stu Midgley <sdm900 at gmail.com> wrote:

> Thank you both for the documentation.  I know how hard it is to maintain.
> I've asked all my admin staff to read it - even if some of it doesn't
> directly apply to our environment.
> What we would like is well organised, comprehensive, accurate and up to
> date documentation.  Most of the time when I dive into the manual, or other
> online material, I find it isn't quite right (paths slightly wrong or
> outdated etc).  I also have difficulty finding all the information I want
> in a single location and in a logical fashion.  These aren't new issues and
> blight all documentation, but having the definitive source in a wiki might
> open it up to more transparency, greater use and thus, ultimately, being
> kept up to date, even if it's by others outside Intel.

Documentation should be treated in the same way as code, i.e. automatically
tested. This is not a new idea, and with access to various kinds of
virtualization it is feasible now.
There are Python projects
(https://gitlab.com/ase/ase/tree/master/doc/tutorials) that make use of
this idea thanks to http://www.sphinx-doc.org, which allows one to execute
embedded Python commands while building the documentation in HTML or PDF
format from rst (reStructuredText) files.
There is a system that stores LFS (Linux From Scratch) in an XML format from
which the commands can be extracted and executed
(http://www.linuxfromscratch.org/alfs/, https://github.com/ojab/jhalfs),
but it does not seem to be under continuous automated testing.
However, projects like https://docs.openstack.org/install-guide/
surprisingly do not use this idea, and it takes months to correct a small
inconsistency in the documentation.
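
To illustrate what "executable documentation" means, here is a minimal sketch
that uses only Python's standard doctest module rather than Sphinx itself.
The rst fragment, its numbers and the check_snippet helper are made up for
the example and are not taken from any of the projects above.

```python
import doctest

# Hypothetical rst fragment; the values are illustrative only.
RST_SNIPPET = """
Worked example
==============

With 4 servers and 2 targets per server, the total number of targets is:

>>> servers = 4
>>> targets_per_server = 2
>>> servers * targets_per_server
8
"""


def check_snippet(text, name="snippet"):
    """Run every ``>>>`` example found in ``text``.

    Returns a (failed, attempted) tuple; failures are reported on stdout.
    """
    parser = doctest.DocTestParser()
    test = parser.get_doctest(text, globs={}, name=name, filename=None, lineno=0)
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted


if __name__ == "__main__":
    failed, attempted = check_snippet(RST_SNIPPET)
    print(f"{attempted} examples run, {failed} failed")
```

If the text and the expected output drift apart, the check fails at build
time instead of being discovered by a reader months later.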

It is not very difficult to create a virtual setup consisting of several
Lustre servers in an unattended way
(https://github.com/marcindulak/vagrant-lustre-tutorial-centos6) and use that
to test the Lustre documentation.
An alternative to making the Lustre documentation executable would be to
abstract the basics of Lustre using a supported configuration management
system (is there any progress on
https://www.youtube.com/watch?v=WX00LQLYf2w ?) and test that using
standard CI tools.
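
As a rough sketch of the first approach, the commands could be pulled out of
the documentation text and run one by one, stopping at the first failure.
Everything here is an assumption made for illustration: the convention that
indented lines starting with "# " are root-shell commands, the DOC fragment
(harmless placeholder commands, not real Lustre steps) and the helper names.

```python
import subprocess

# Hypothetical documentation fragment with placeholder commands.
DOC = """
Check that the node is reachable before continuing::

    # echo node-ready
    # true

Then continue with the next section.
"""


def extract_commands(text, prompt="# "):
    """Return the commands from indented lines that start with the prompt."""
    commands = []
    for line in text.splitlines():
        stripped = line.strip()
        if line.startswith("    ") and stripped.startswith(prompt):
            commands.append(stripped[len(prompt):])
    return commands


def run_commands(commands, execute=None):
    """Run each command in order and stop at the first non-zero exit code.

    ``execute`` takes a command string and returns its exit code; by default
    the command is run in a local shell.
    Returns a list of (command, returncode) pairs for the commands that ran.
    """
    if execute is None:
        def execute(cmd):
            return subprocess.run(cmd, shell=True).returncode
    results = []
    for cmd in commands:
        code = execute(cmd)
        results.append((cmd, code))
        if code != 0:
            break
    return results


if __name__ == "__main__":
    for cmd, code in run_commands(extract_commands(DOC)):
        print(f"{code}  {cmd}")
```

In a real setup the execute callable would run the command inside the test
machine instead of locally, for example by wrapping "vagrant ssh -c"; that
wiring is left out here.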



> I'd also like a section where people can post their experiences and
> solutions.  For example, in recent times, we have battled bad interactions
> with ZFS+Lustre which led to poor performance and ZFS corruption.  While
> we have now tuned both Lustre and ZFS and the bugs have mostly been fixed,
> the lessons learned, troubleshooting methods etc. should be preserved and
> might help others diagnose tricky problems in the future.

> That's my 5c.
> On Sat, Nov 18, 2017 at 6:03 AM, Dilger, Andreas <andreas.dilger at intel.com> wrote:
>> On Nov 16, 2017, at 22:41, Cowe, Malcolm J <malcolm.j.cowe at intel.com> wrote:
>> >
>> > I am pleased to announce the availability of a new systems
>> > administration guide for the Lustre file system, which has been published
>> > to wiki.lustre.org. The content can be accessed directly from the front
>> > page of the wiki, or from the following URL:
>> >
>> > http://wiki.lustre.org/Category:Lustre_Systems_Administration
>> >
>> > The guide is intended to provide comprehensive instructions for the
>> > installation and configuration of production-ready Lustre storage clusters.
>> > Topics covered:
>> >
>> >       • Introduction to Lustre
>> >       • Lustre File System Components
>> >       • Lustre Software Installation
>> >       • Lustre Networking (LNet)
>> >       • LNet Router Configuration
>> >       • Lustre Object Storage Devices (OSDs)
>> >       • Creating Lustre File System Services
>> >       • Mounting a Lustre File System on Client Nodes
>> >       • Starting and Stopping Lustre Services
>> >       • Lustre High Availability
>> >
>> > Refer to the front page of the guide for the complete table of contents.
>> Malcolm,
>> thanks so much for your work on this.  It is definitely improving the
>> state of the documentation available today.
>> I was wondering if people have an opinion on whether we should remove
>> some/all of the administration content from the Lustre Operations Manual,
>> and make that more of a reference manual that contains details of
>> commands, architecture, features, etc. as a second-level reference from
>> the wiki admin guide?
>> For that matter, should we export the XML Manual into the wiki and
>> leave it there?  We'd have to make sure that the wiki is being indexed
>> by Google for easier searching before we could do that.
>> Cheers, Andreas
>> > In addition, for people who are new to Lustre, there is a high-level
>> > introduction to Lustre concepts, available as a PDF download:
>> >
>> > http://wiki.lustre.org/images/6/64/LustreArchitecture-v4.pdf
>> >
>> >
>> > Malcolm Cowe
>> > High Performance Data Division
>> >
>> > Intel Corporation | www.intel.com
>> >
>> > _______________________________________________
>> > lustre-discuss mailing list
>> > lustre-discuss at lists.lustre.org
>> > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>> Cheers, Andreas
>> --
>> Andreas Dilger
>> Lustre Principal Architect
>> Intel Corporation
> --
> Dr Stuart Midgley
> sdm900 at gmail.com