[Lustre-discuss] Weird behavior on lustre clients

Jagga Soorma jagga13 at gmail.com
Sun Aug 8 13:44:57 PDT 2010


One other piece of information: I seem to have found a workaround by adding
a cron job that runs a df command every 2 minutes.  Could lustre be causing
some caching issue?
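
For reference, a minimal sketch of what such a cron entry could look like. The exact schedule syntax and the output redirection are assumptions, since only "every 2 minutes" and "df" are given above, and new_crontab is a hypothetical helper name:

```shell
# Crontab line for the workaround: run df every 2 minutes and discard output.
# (Assumed form; the original entry is not shown in this message.)
CRON_LINE='*/2 * * * * df > /dev/null 2>&1'

# Print the current user's crontab followed by the new line.
# Nothing is installed until the output is piped into `crontab -`.
new_crontab() {
    crontab -l 2>/dev/null
    printf '%s\n' "$CRON_LINE"
}
```

Running `new_crontab | crontab -` would install it for the current user; reviewing the output of `new_crontab` first is probably wise.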

Thanks,
-J

On Sun, Aug 8, 2010 at 3:15 AM, Jagga Soorma <jagga13 at gmail.com> wrote:

> Hello,
>
> I am experiencing some weird behavior on my lustre clients.  I have worked
> with Novell support and they keep pointing to lustre as the culprit for
> these issues.  I am getting intermittent I/O errors when running df/ls on
> any nfs mounts without anything being logged in syslog.  After putting nfs
> and rpc in debug mode by running:
>
> rpcdebug -m nfs -s all
> rpcdebug -m rpc -s all
>
> I now see the following errors in my logs:
>
> ..snip..
> Aug  8 02:32:56 reshpc115 kernel: RPC:  2440 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug  8 02:32:56 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug  8 02:32:59 reshpc115 kernel: RPC:  2441 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug  8 02:32:59 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug  8 02:47:59 reshpc115 kernel: RPC:  2447 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug  8 02:47:59 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug  8 02:57:59 reshpc115 kernel: RPC:  2451 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug  8 02:57:59 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug  8 02:58:00 reshpc115 kernel: RPC:  2452 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug  8 02:58:00 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug  8 02:58:13 reshpc115 kernel: RPC:  2453 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug  8 02:58:13 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug  8 02:58:26 reshpc115 kernel: RPC:  2454 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug  8 02:58:26 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug  8 02:58:30 reshpc115 kernel: RPC:  2455 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug  8 02:58:30 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug  8 02:58:32 reshpc115 kernel: RPC:  2456 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug  8 02:58:32 reshpc115 kernel: nfs_statfs: statfs error = 5
> ..snip..
>
> I am using all supported packages/kernels for lustre, and on servers without
> the lustre clients installed I have no issues with nfs.  Does the interval
> between these errors mean anything?
>
> Any help would be greatly appreciated.
>
> Thanks,
> -J
>
> --
> reshpc115:~ # uname -a
> Linux reshpc115 2.6.27.29-0.1-default #1 SMP 2009-08-15 17:53:59 +0200 x86_64 x86_64 x86_64 GNU/Linux
> reshpc115:~ # rpm -qa | grep -i lustre
> lustre-client-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
> lustre-client-modules-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
> reshpc115:~ # rpm -qa | grep -i kernel-ib
> kernel-ib-1.4.2-2.6.27.29_0.1_default
> --
>
>
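
On the interval question in the quoted message, one way to look at it is to print the gap between consecutive statfs errors. The following is only a rough sketch: statfs_gaps is a hypothetical helper name, and it assumes syslog-style HH:MM:SS timestamps with all entries falling within a single day.

```shell
# Print the gap in seconds between consecutive "statfs error" log lines.
# Reads the files given as arguments, or stdin when none are given.
statfs_gaps() {
    awk '/nfs_statfs: statfs error/ {
        # Find the HH:MM:SS field wherever it sits, so that ">"-quoted
        # lines and plain syslog lines are both handled.
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/) {
                split($i, t, ":")
                now = t[1] * 3600 + t[2] * 60 + t[3]
                if (seen) print now - prev
                prev = now
                seen = 1
                break
            }
        }
    }' "$@"
}
```

For example, `statfs_gaps /var/log/messages` (the log path is an assumption). Working from the timestamps in the excerpt above, the gaps are 3, 900, 600, 1, 13, 13, 4 and 2 seconds.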