[Lustre-discuss] Path lost when accessing files
styr at free.fr
Fri Jun 17 00:47:28 PDT 2011
Thanks Sebastien,
I hadn't come across that bug report. I'll start following it.
----- Original Message -----
From: "Sebastien Piechurski" <spiechurski at sgi.com>
To: styr at free.fr, "lustre-discuss" <lustre-discuss at lists.lustre.org>
Sent: Thursday 16 June 2011 15:30:38 GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: RE: [Lustre-discuss] Path lost when accessing files
Hi,
This problem is documented in bug 23978 (http://bugzilla.lustre.org/show_bug.cgi?id=23978).
To summarize: the Fortran runtime calls getcwd() to build the full path of a file that was given as a relative path.
Lustre sometimes fails to answer this syscall, which then returns an uninitialized buffer and an error code, BUT the Fortran runtime does not check the getcwd() return value and uses the buffer as-is.
The uninitialized buffer is what you see as " @", followed by the relative path.
A patch is currently under inspection.
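To illustrate the failure mode described above, here is a minimal sketch in C of the defensive pattern the runtime is missing. It is not the Fortran runtime's code and not the Lustre patch; the helper name resolve_path and the CWD_MAX buffer size are made up for this example. The only point is that getcwd() returns NULL on failure, and the buffer must not be used in that case.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Assumed buffer size for this sketch; real code may prefer PATH_MAX. */
enum { CWD_MAX = 4096 };

/*
 * Hypothetical helper: build an absolute path from `rel` into `out`.
 * Returns 0 on success, -1 on failure with errno set.
 * On failure `out` is left as an empty string, never as garbage.
 */
int resolve_path(const char *rel, char *out, size_t outlen)
{
    char cwd[CWD_MAX];
    int n;

    if (rel == NULL || out == NULL || outlen == 0) {
        errno = EINVAL;
        return -1;
    }
    out[0] = '\0';

    if (rel[0] == '/') {
        /* Already absolute: copy it through unchanged. */
        if (strlen(rel) >= outlen) {
            errno = ENAMETOOLONG;
            return -1;
        }
        strcpy(out, rel);
        return 0;
    }

    /*
     * The step that matters: when getcwd() fails it returns NULL and
     * the contents of cwd are undefined, so bail out instead of
     * concatenating whatever happens to be in the buffer.
     */
    if (getcwd(cwd, sizeof cwd) == NULL)
        return -1; /* errno was set by getcwd() */

    n = snprintf(out, outlen, "%s/%s", cwd, rel);
    if (n < 0 || (size_t)n >= outlen) {
        out[0] = '\0';
        errno = ENAMETOOLONG;
        return -1;
    }
    return 0;
}
```

A caller would check the return value before opening the file, for example by reporting strerror(errno) or retrying when resolve_path("suivi.d000", buf, sizeof buf) returns -1, rather than passing a half-built path such as " @/suivi.d000" to open().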
> -----Original Message-----
> From: lustre-discuss-bounces at lists.lustre.org
> [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of
> styr at free.fr
> Sent: Thursday 16 June 2011 12:17
> To: lustre-discuss
> Subject: [Lustre-discuss] Path lost when accessing files
>
> Hi Lustre users,
>
> we currently have a small problem with jobs running on our
> cluster and using Lustre. Sometimes we get these errors:
> forrtl: No such file or directory
> forrtl: severe (29): file not found, unit 213, file @/suivi.d000
>
> It does not only happen with forrtl but also sometimes with
> other programs. Here it tries to access a file located at
> @/suivi.d000. We have also had errors where it tried to access
> files as if they were at the root of the FS, in this example
> /suivi.d000.
>
> It's as if it were losing or corrupting the PWD environment variable.
>
> The funny thing is that when we run the same job again,
> it works perfectly. We haven't managed to reproduce the
> errors, but they still happen from time to time.
>
> I didn't find any Lustre errors in my logs related to these problems.
>
> We're using Lustre 1.8.5 on SLES 11SP1 nodes, and SLES 10 OSS and MDS.
>
> Do you have any clue?
>
> Thanks,
>
> Jay N.
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>