[lustre-discuss] File locking errors.
Prentice Bisbal
pbisbal at pppl.gov
Fri Feb 16 07:58:37 PST 2018
I'm using CentOS 6, too.
Prentice
On 02/15/2018 05:17 PM, E.S. Rosenberg wrote:
>
>
> On Fri, Feb 16, 2018 at 12:00 AM, Colin Faber <cfaber at gmail.com> wrote:
>
> If the mount on the user's clients had the various options enabled,
> and those aren't present in fstab, you'd end up with such
> behavior. Also, 2.8? Can you upgrade to 2.10 LTS?
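
The fstab-versus-live-mount mismatch described above can be checked mechanically. This is a minimal sketch, assuming the usual whitespace-separated layout of /etc/fstab and /proc/mounts (device, mount point, type, options) and that the client lists its flock-related option in /proc/mounts. The function names are hypothetical.

```python
FLOCK_OPTIONS = ("flock", "localflock", "noflock")


def flock_option(options):
    """Return the flock-related option in a comma-separated string, or None."""
    for opt in options.split(","):
        if opt in FLOCK_OPTIONS:
            return opt
    return None


def lustre_entries(text):
    """Map each Lustre mount point in fstab-style text to its flock option."""
    entries = {}
    for line in text.splitlines():
        if line.strip().startswith("#"):
            continue  # skip fstab comment lines
        fields = line.split()
        if len(fields) >= 4 and fields[2] == "lustre":
            entries[fields[1]] = flock_option(fields[3])
    return entries


def flock_mismatches(fstab_text, mounts_text):
    """List (mount_point, fstab_option, live_option) where the two disagree."""
    configured = lustre_entries(fstab_text)
    live = lustre_entries(mounts_text)
    return [
        (mount_point, configured[mount_point], live[mount_point])
        for mount_point in configured
        if mount_point in live and configured[mount_point] != live[mount_point]
    ]
```

Feeding it the contents of /etc/fstab and /proc/mounts from a client would list any Lustre mount whose live flock setting differs from what a reboot or remount would restore.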
>
> Depending on when they installed their system, that may not be such a
> 'small' change. Our 2.8 is running on CentOS 6.8, so an upgrade to 2.10
> requires us to also upgrade the OS from 6.x to 7.x. Though I very
> much want to do that, it is a more intensive process that so far I
> have not had the time for, and I can imagine others have the same issue.
> Regards,
> Eli
>
>
>
>
> On Feb 15, 2018 1:06 PM, "Prentice Bisbal" <pbisbal at pppl.gov> wrote:
>
> No. Several others have asked me the same thing, so that seems
> like it might be the issue. The only problem with that
> solution is that the user claimed his program worked just fine
> up until a couple of weeks ago, so if that is the issue, I'll
> still be scratching my head trying to figure out how/what changed.
>
>
> Prentice
>
> On 02/15/2018 12:31 PM, Alexander I Kulyavtsev wrote:
>> Do you have the *flock* option in fstab for the lustre mount, or in
>> the command you use to mount lustre on the client?
>>
>> Search for flock on the lustre wiki
>> http://wiki.lustre.org/Mounting_a_Lustre_File_System_on_Client_Nodes
>> or in the lustre manual
>> http://doc.lustre.org/lustre_manual.pdf
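
One way to answer the question above on a running client is to read the live mount table rather than fstab. This is a rough sketch, assuming the client reports its flock setting among the options in /proc/mounts. The function name and the "not listed" label are illustrative only.

```python
import os


def lustre_flock_status(mounts_text):
    """Map each Lustre mount point to 'flock', 'localflock', or 'not listed'."""
    status = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts layout: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "lustre":
            continue
        options = fields[3].split(",")
        if "flock" in options:
            status[fields[1]] = "flock"
        elif "localflock" in options:
            status[fields[1]] = "localflock"
        else:
            status[fields[1]] = "not listed"
    return status


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as mounts:
        for mount_point, mode in lustre_flock_status(mounts.read()).items():
            print(mount_point, mode)
```

A mount reported as "not listed" has neither option in its live option string, which would be consistent with flock() failing on it.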
>>
>> Here are links for getting started with lustre:
>> * http://lustre.org/getting-started-with-lustre/
>> * http://wiki.lustre.org
>> * https://wiki.hpdd.intel.com
>> * http://jira.hpdd.intel.com
>> * http://opensfs.org/lustre/
>>
>> Alex.
>>
>>> On Feb 15, 2018, at 11:02 AM, Prentice Bisbal
>>> <pbisbal at pppl.gov> wrote:
>>>
>>> Hi.
>>>
>>> I'm an experienced HPC system admin, but I know almost
>>> nothing about Lustre administration. The system admin who
>>> administered our small Lustre filesystem recently retired,
>>> and no one has filled that gap yet. A user recently reported
>>> they are now getting file-locking errors from a program
>>> they've run repeatedly on Lustre in the past. When they run
>>> the same program on an NFS filesystem, the error goes away.
>>> I've cut-and-pasted the error messages below.
>>>
>>> Since I have no real experience as a Lustre admin, I turned to
>>> Google, and it looks like it might be that the file-locking
>>> daemon died (if Lustre has a separate file-lock daemon), or
>>> somehow file-locking was recently disabled. If that is
>>> possible, how do I check this, and restart or re-enable it if
>>> necessary? I skimmed the user manual, and could not find
>>> anything on either of these issues.
>>>
>>> Any and all help will be greatly appreciated.
>>>
>>> Some of the error messages:
>>>
>>> HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) MPI-process 9:
>>> #000: H5F.c line 579 in H5Fopen(): unable to open file
>>> major: File accessibilty
>>> minor: Unable to open file
>>> #001: H5Fint.c line 1168 in H5F_open(): unable to lock the
>>> file or initialize file structure
>>> major: File accessibilty
>>> minor: Unable to open file
>>> #002: H5FD.c line 1821 in H5FD_lock(): driver lock request
>>> failed
>>> major: Virtual File Layer
>>> minor: Can't update object
>>> #003: H5FDsec2.c line 939 in H5FD_sec2_lock(): unable to
>>> flock file, errno = 38, error message = 'Function not
>>> implemented'
>>> major: File accessibilty
>>> minor: Bad file ID accessed
>>> Error: couldn't open file HDF5-DIAG: Error detected in HDF5
>>> (1.10.0-patch1) MPI-process 13:
>>> #000: H5F.c line 579 in H5Fopen(): unable to open file
>>> major: File accessibilty
>>> minor: Unable to open file
>>> #001: H5Fint.c line 1168 in H5F_open(): unable to lock the
>>> file or initialize file structure
>>> major: File accessibilty
>>> minor: Unable to open file
>>> #002: H5FD.c line 1821 in H5FD_lock(): driver lock request
>>> failed
>>> major: Virtual File Layer
>>> minor: Can't update object
>>> #003: H5FDsec2.c line 939 in H5FD_sec2_lock(): unable to
>>> flock file, errno = 38, error message = 'Function not
>>> implemented'
>>> major: File accessibilty
>>> minor: Bad file ID accessed
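
The trace above bottoms out in a failed flock call with errno 38 ("Function not implemented"), so the failure can likely be reproduced without HDF5 or MPI. This is a small sketch using Python's standard fcntl module. The helper name is hypothetical, and the lock mode is an assumption, since the trace does not say which mode HDF5 requested.

```python
import fcntl


def probe_flock(path):
    """Try a non-blocking shared flock() on path.

    Returns None if the lock was taken (and released), otherwise the errno
    reported by the failed call.
    """
    with open(path, "rb") as handle:
        try:
            # Shared, non-blocking lock; assumed mode, not taken from the trace.
            fcntl.flock(handle, fcntl.LOCK_SH | fcntl.LOCK_NB)
        except OSError as exc:
            return exc.errno
        fcntl.flock(handle, fcntl.LOCK_UN)
    return None
```

Calling probe_flock on an existing file under the Lustre mount and again on one under NFS should show whether only the Lustre path returns errno.ENOSYS (38 on Linux), which would point at the client mount rather than at the application.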
>>>
>>> --
>>> Prentice
>>>
>>> _______________________________________________
>>> lustre-discuss mailing list
>>> lustre-discuss at lists.lustre.org
>>> <mailto:lustre-discuss at lists.lustre.org>
>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>> <http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org>
>>
>
>