[Lustre-discuss] Weird behavior on lustre clients

Jagga Soorma jagga13 at gmail.com
Sun Aug 8 20:04:01 PDT 2010


Andreas,

Yes, these I/O errors occur on every NFS filesystem mounted on the Lustre
clients.  Even though these NFS mounts have nothing to do with Lustre,
something specific to clients with the kernel-ib and Lustre client modules
installed seems to be causing the problem.

I believe Lustre caches data locally and flushes it out on a regular basis,
but I don't know enough to rule Lustre out.  The issue seems to occur every
8-10 minutes.  Is Lustre doing something on the system, such as flushing
some type of cache, that might be causing this problem?  If I run df every
5 minutes or so, I never see the problem.
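
One way to pin down the interval is to probe an NFS mount after different
idle periods and log which probes fail.  The following is only a rough
sketch: the function names, the mount path /mnt/nfs and the log file are
hypothetical, and it assumes the failure is tied to how long the mount has
been idle, which is not confirmed.  Since running df regularly seems to keep
the problem away, the idle periods should go past the suspected 8-10 minute
window.

```shell
#!/bin/sh
# Hypothetical helpers for measuring when df on an NFS mount starts failing.
# Nothing here is Lustre-specific; the mount point is supplied by the caller.

# Run df once against a mount point and print a timestamped OK/FAIL line.
probe_mount() {
    probe_mnt="$1"
    probe_ts=$(date '+%Y-%m-%d %H:%M:%S')
    if df -P "$probe_mnt" >/dev/null 2>&1; then
        echo "$probe_ts OK $probe_mnt"
    else
        echo "$probe_ts FAIL $probe_mnt"
    fi
}

# Leave the mount untouched for a number of seconds, then probe it once.
probe_after_idle() {
    sleep "$1"
    probe_mount "$2"
}

# Example sweep (commented out so sourcing this file has no side effects):
# for idle in 120 300 480 600 900; do
#     probe_after_idle "$idle" /mnt/nfs >> /tmp/nfs-probe.log
# done
#
# The cron workaround mentioned in this thread would look roughly like:
# */2 * * * * df >/dev/null 2>&1
```

If FAIL lines only show up once the idle period passes some threshold, that
threshold can be compared against timers and cache settings on the client.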

I have run out of things to try and wanted to check the Lustre route as a
last resort, in the hope of getting more information that might help me find
a permanent solution for this issue.

Any assistance/comments would be appreciated.

Thanks,
-J

On Sun, Aug 8, 2010 at 6:53 PM, Andreas Dilger <andreas.dilger at oracle.com> wrote:

> On 2010-08-08, at 16:44, Jagga Soorma wrote:
> > One other piece of information.  It seems like I have found a workaround
> > by adding a cronjob that runs a df command every 2 minutes.  Is there
> > some caching issue that might be caused by lustre?
>
> Are the IO errors on NFS filesystems that have nothing to do with Lustre,
> or is this from NFS re-exporting of a Lustre filesystem?
>
> >> I am experiencing some weird behavior on my lustre clients.  I have
> >> worked with Novell support and they keep pointing to lustre as the
> >> culprit for these issues.  I am getting intermittent I/O errors when
> >> running df/ls on any nfs mounts without anything being logged in syslog.
> >> After putting nfs and rpc in debug mode by running:
> >
> > I am using all supported packages/kernels for lustre and on servers
> > without the lustre clients installed I have no issues with nfs.  Does the
> > interval between these errors mean anything?
> >
> > Any help would be greatly appreciated.
> >
> > reshpc115:~ # uname -a
> > Linux reshpc115 2.6.27.29-0.1-default #1 SMP 2009-08-15 17:53:59 +0200 x86_64 x86_64 x86_64 GNU/Linux
> > reshpc115:~ # rpm -qa | grep -i lustre
> > lustre-client-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
> > lustre-client-modules-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
> > reshpc115:~ # rpm -qa | grep -i kernel-ib
> > kernel-ib-1.4.2-2.6.27.29_0.1_default
>
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Technical Lead
> Oracle Corporation Canada Inc.
>
>