[Lustre-discuss] Path lost when accessing files
spiechurski at sgi.com
Thu Jun 16 06:30:38 PDT 2011
This problem is documented in bug 23978 (http://bugzilla.lustre.org/show_bug.cgi?id=23978).
To summarize: the Fortran runtime calls getcwd() to build the full path to a file that was given as a relative path.
Lustre sometimes fails to answer this syscall, which then returns an error code and leaves the buffer uninitialized, BUT the Fortran runtime does not test the getcwd() return code and uses the buffer as-is.
The uninitialized buffer is what you see as " @", followed by the relative path.
A patch is currently under inspection.
> -----Original Message-----
> From: lustre-discuss-bounces at lists.lustre.org
> [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of
> styr at free.fr
> Sent: Thursday, June 16, 2011 12:17
> To: lustre-discuss
> Subject: [Lustre-discuss] Path lost when accessing files
> Hi Lustre users,
> we currently have a small problem with jobs running on our
> cluster and using Lustre. Sometimes, we get these errors:
> forrtl: No such file or directory
> forrtl: severe (29): file not found, unit 213, file @/suivi.d000
> It does not only happen with forrtl but also sometimes with
> other files. It tries to access a file located at:
> @/suivi.d000. We also had errors when it was trying to access
> files as if they were at the root of the FS, in this example
> It's as if it were losing or corrupting the PWD environment variable.
> The funny thing is that when we execute this same job again,
> it works perfectly. We didn't succeed in reproducing the
> errors but they still happen from time to time.
> I didn't find any Lustre errors in my logs related to these problems.
> We're using Lustre 1.8.5 on SLES 11SP1 nodes, and SLES 10 OSS and MDS.
> Do you have any clue?
> Jay N.