[lustre-discuss] File locking errors.

E.S. Rosenberg esr+lustre at mail.hebrew.edu
Thu Feb 15 14:17:56 PST 2018


On Fri, Feb 16, 2018 at 12:00 AM, Colin Faber <cfaber at gmail.com> wrote:

> If the mount on the users clients had the various options enabled, and
> those aren't present in fstab, you'd end up with such behavior. Also 2.8?
> Can you upgrade to 2.10 LTS??
>
Depending on when they installed their system, that may not be such a
'small' change. Our 2.8 is running on CentOS 6.8, so an upgrade to 2.10
requires us to also upgrade the OS from 6.x to 7.x. Though I very much
want to do that, it is a more intensive process that so far I have not had
the time for, and I can imagine others have the same issue.
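
On the flock question further down the thread: errno 38 (ENOSYS) from
flock() is what I would expect from a client whose Lustre mount has neither
the flock nor the localflock option. Below is a minimal Python sketch for
checking that on a client. It assumes the option shows up in the fourth
field of /proc/mounts, and the function names are my own, not part of any
Lustre tool, so treat it as a starting point rather than a definitive check.

```python
import errno
import fcntl
import tempfile


def lustre_flock_status(mounts_text):
    """Map each Lustre mount point to 'flock', 'localflock', or 'noflock'.

    Assumption: mounts_text has the /proc/mounts layout
    (device, mount point, fs type, options, dump, pass).
    """
    status = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "lustre":
            continue  # skip malformed lines and non-Lustre filesystems
        options = fields[3].split(",")
        if "flock" in options:
            status[fields[1]] = "flock"
        elif "localflock" in options:
            status[fields[1]] = "localflock"
        else:
            status[fields[1]] = "noflock"
    return status


def flock_works(directory):
    """Try flock() on a scratch file in `directory`.

    Returns False if the call fails with ENOSYS (the error in the HDF5
    trace), True if the lock is granted; other errors are re-raised.
    """
    with tempfile.NamedTemporaryFile(dir=directory) as scratch:
        try:
            fcntl.flock(scratch.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            if exc.errno == errno.ENOSYS:
                return False
            raise
        fcntl.flock(scratch.fileno(), fcntl.LOCK_UN)
        return True


if __name__ == "__main__":
    with open("/proc/mounts") as mounts:
        print(lustre_flock_status(mounts.read()))
```

Running flock_works() against a directory on the Lustre mount and against
one on NFS should show the same difference the user is seeing, without
needing the HDF5 program itself.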
Regards,
Eli

>
>
>
> On Feb 15, 2018 1:06 PM, "Prentice Bisbal" <pbisbal at pppl.gov> wrote:
>
>> No. Several others have asked me the same thing, so that seems like it
>> might be the issue. The only problem with that solution is that the user
>> claimed his program worked just fine up until a couple of weeks ago, so if
>> that is the issue, I'll still be scratching my head trying to figure out
>> how/what changed
>>
>>
>> Prentice
>>
>> On 02/15/2018 12:31 PM, Alexander I Kulyavtsev wrote:
>>
>> Do you have the *flock* option in fstab for the Lustre mount, or in the
>> command you use to mount Lustre on the client?
>>
>> Search for flock on lustre wiki
>> http://wiki.lustre.org/Mounting_a_Lustre_File_System_on_Client_Nodes
>> or lustre manual
>> http://doc.lustre.org/lustre_manual.pdf
>>
>> Here are some links for getting started with Lustre:
>> * http://lustre.org/getting-started-with-lustre/
>> * http://wiki.lustre.org
>> * https://wiki.hpdd.intel.com
>> * jira.hpdd.intel.com
>> * http://opensfs.org/lustre/
>>
>> Alex.
>>
>> On Feb 15, 2018, at 11:02 AM, Prentice Bisbal <pbisbal at pppl.gov> wrote:
>>
>> Hi.
>>
>> I'm an experienced HPC system admin, but I know almost nothing about
>> Lustre administration. The system admin who administered our small Lustre
>> filesystem recently retired, and no one has filled that gap yet. A user
>> recently reported they are now getting file-locking errors from a program
>> they've run repeatedly on Lustre in the past. When they run the same program
>> on an NFS filesystem, the error goes away. I've cut-and-pasted the error
>> messages below.
>>
>> Since I have no real experience as a Lustre admin, I turned to Google, and
>> it looks like either the file-locking daemon died (if Lustre has
>> a separate file-lock daemon) or file-locking was somehow recently
>> disabled. If that is possible, how do I check this, and restart or
>> re-enable it if necessary? I skimmed the user manual and could not find
>> anything on either of these issues.
>>
>> Any and all help will be greatly appreciated.
>>
>> Some of the error messages:
>>
>> HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) MPI-process 9:
>>   #000: H5F.c line 579 in H5Fopen(): unable to open file
>>     major: File accessibilty
>>     minor: Unable to open file
>>   #001: H5Fint.c line 1168 in H5F_open(): unable to lock the file or
>> initialize file structure
>>     major: File accessibilty
>>     minor: Unable to open file
>>   #002: H5FD.c line 1821 in H5FD_lock(): driver lock request failed
>>     major: Virtual File Layer
>>     minor: Can't update object
>>   #003: H5FDsec2.c line 939 in H5FD_sec2_lock(): unable to flock file,
>> errno = 38, error message = 'Function not implemented'
>>     major: File accessibilty
>>     minor: Bad file ID accessed
>> Error: couldn't open file
>> HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) MPI-process 13:
>>   #000: H5F.c line 579 in H5Fopen(): unable to open file
>>     major: File accessibilty
>>     minor: Unable to open file
>>   #001: H5Fint.c line 1168 in H5F_open(): unable to lock the file or
>> initialize file structure
>>     major: File accessibilty
>>     minor: Unable to open file
>>   #002: H5FD.c line 1821 in H5FD_lock(): driver lock request failed
>>     major: Virtual File Layer
>>     minor: Can't update object
>>   #003: H5FDsec2.c line 939 in H5FD_sec2_lock(): unable to flock file,
>> errno = 38, error message = 'Function not implemented'
>>     major: File accessibilty
>>     minor: Bad file ID accessed
>>
>> --
>> Prentice
>>
>> _______________________________________________
>> lustre-discuss mailing list
>> lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>