[lustre-discuss] refresh file layout error

Patrick Farrell paf at cray.com
Fri Sep 4 14:08:39 PDT 2015


Oops, I'm sorry - this was supposed to be a reply to Amit Kumar's thread.  Apologies.
________________________________________
From: lustre-discuss [lustre-discuss-bounces at lists.lustre.org] on behalf of Patrick Farrell [paf at cray.com]
Sent: Friday, September 04, 2015 4:07 PM
To: lustre-discuss at lists.lustre.org; E.S. Rosenberg; Wahl, Edward
Subject: Re: [lustre-discuss] refresh file layout error

Martin might know about that short read thing, since his site has a nice wiki page on it:
https://wickie.hlrs.de/platforms/index.php/Lustre_short_read

Technically, Lustre is allowed to return fewer bytes than requested, as it says on that page, but it normally doesn't.  LU-6389 is a bug where short reads can happen fairly often.  They are permitted, but they shouldn't really happen in practice, which is why LU-6389 is a bug.

So perhaps Gaussian does not retry short reads?  If memory serves, it's closed source, so you can't check - but perhaps you could ask the vendor?
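
For reference, retrying a short read usually looks something like the sketch below. This is only an illustration in Python, not anything taken from Gaussian or from Lustre itself; the helper name read_fully is made up for this example. It keeps calling os.read() until it has the requested number of bytes or hits end of file.

```python
import os

def read_fully(fd, nbytes):
    """Read up to nbytes from fd, retrying after short reads.

    Hypothetical helper for illustration: os.read() may return fewer
    bytes than requested, so loop until we have everything or reach EOF.
    """
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:
            # An empty result means end of file, so stop retrying.
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

An application that calls read() once and treats a short result as an error (or as end of file) would not tolerate the behaviour described in LU-6389, whereas a loop like this would.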
________________________________________
From: lustre-discuss [lustre-discuss-bounces at lists.lustre.org] on behalf of Martin Hecht [hecht at hlrs.de]
Sent: Friday, September 04, 2015 8:53 AM
To: E.S. Rosenberg; Wahl, Edward
Cc: lustre-discuss at lists.lustre.org
Subject: Re: [lustre-discuss] refresh file layout error

On 09/03/2015 07:22 AM, E.S. Rosenberg wrote:
> On Wed, Sep 2, 2015 at 8:47 PM, Wahl, Edward <ewahl at osc.edu> wrote:
>
>> That would be my guess here.  Any chance this is across NFS?  Seen that a
>> great deal with this error, it used to cause crashes.
>>
> Strictly speaking it is not, but it may be because a part of the path the
> server 'sees'/'knows' is a symlink to the lustre filesystem which lives on
> nfs...
>
Ah, I can remember a problem we had some years ago, when users with
their $HOME on NFS were accessing many files in directories on Lustre
via symlink. Somehow the NAS box serving the NFS file system didn't
immediately notice that the files weren't on its own file system and
repeatedly had to look them up in its cache, just to notice that the files
are somewhere else behind a symlink. If I recall correctly, the problem
could be avoided in one of two ways:
- Either access the file via an absolute path, or cd into the directory
(in both cases via the mount point, not (!) via the symlink)
- Or make the symlink an absolute one (I'm not 100% sure, but I believe
the problem only occurred with relative links pointing upwards out of the
NFS file system, across the mount point, and down again into the Lustre
file system).
It could be something similar here. Do you have any chance to access the
files via absolute path in your setup and web server configuration?
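
One way to check which case applies is sketched below in Python. It is only an illustration under assumptions: POSIX-style paths are assumed, and the helper name describe_path is invented for this example. It returns the symlink-free absolute path, which could be used in the web server configuration, together with any relative symlinks found along the original path.

```python
import os

def describe_path(path):
    """Return (resolved_path, relative_links) for a POSIX-style path.

    Hypothetical helper for illustration. resolved_path is the absolute
    path with all symlinks resolved; relative_links lists every
    (link, target) pair on the way whose target is a relative path.
    """
    relative_links = []
    parts = os.path.abspath(path).split(os.sep)
    prefix = os.sep
    for part in parts[1:]:
        prefix = os.path.join(prefix, part)
        # islink() inspects only the last component of prefix itself.
        if os.path.islink(prefix):
            target = os.readlink(prefix)
            if not os.path.isabs(target):
                relative_links.append((prefix, target))
    return os.path.realpath(path), relative_links
```

If relative_links comes back non-empty for a path that crosses from NFS into Lustre, that would match the situation described above.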

best regards, Martin

_______________________________________________
lustre-discuss mailing list
lustre-discuss at lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

