[Lustre-discuss] Weird behavior on lustre clients
Jagga Soorma
jagga13 at gmail.com
Sun Aug 8 13:44:57 PDT 2010
One other piece of information: I seem to have found a workaround by adding
a cron job that runs a df command every 2 minutes. Could lustre be causing
some kind of caching issue?
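
For reference, a minimal sketch of what such a cron job could look like. Only the two-minute interval and the use of df come from the description above. The script path, the crontab line, and the default mount point are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical keep-alive script, e.g. saved as /usr/local/sbin/nfs-keepalive.sh
# (the path is an assumption). An assumed crontab entry to run it every 2 minutes:
#   */2 * * * * /usr/local/sbin/nfs-keepalive.sh /path/to/nfs/mount

# Run df against each given mount point and report any that fail.
check_mounts() {
    rc=0
    for mnt in "$@"; do
        if ! df -P "$mnt" >/dev/null 2>&1; then
            echo "df failed on $mnt" >&2
            rc=1
        fi
    done
    return $rc
}

# "/" is only a harmless default for this sketch; pass the real NFS
# mount points as arguments.
if [ "$#" -eq 0 ]; then
    set -- /
fi
check_mounts "$@"
```

Logging the failures instead of discarding them would also give a record of how often df still fails while the workaround is in place.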
Thanks,
-J
On Sun, Aug 8, 2010 at 3:15 AM, Jagga Soorma <jagga13 at gmail.com> wrote:
> Hello,
>
> I am experiencing some weird behavior on my lustre clients. I have worked
> with Novell support and they keep pointing to lustre as the culprit for
> these issues. I am getting intermittent I/O errors when running df/ls on
> any nfs mounts without anything being logged in syslog. After putting nfs
> and rpc in debug mode by running:
>
> rpcdebug -m nfs -s all
> rpcdebug -m rpc -s all
>
> I now see the following errors in my logs:
>
> ..snip..
> Aug 8 02:32:56 reshpc115 kernel: RPC: 2440 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug 8 02:32:56 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug 8 02:32:59 reshpc115 kernel: RPC: 2441 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug 8 02:32:59 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug 8 02:47:59 reshpc115 kernel: RPC: 2447 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug 8 02:47:59 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug 8 02:57:59 reshpc115 kernel: RPC: 2451 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug 8 02:57:59 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug 8 02:58:00 reshpc115 kernel: RPC: 2452 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug 8 02:58:00 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug 8 02:58:13 reshpc115 kernel: RPC: 2453 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug 8 02:58:13 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug 8 02:58:26 reshpc115 kernel: RPC: 2454 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug 8 02:58:26 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug 8 02:58:30 reshpc115 kernel: RPC: 2455 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug 8 02:58:30 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug 8 02:58:32 reshpc115 kernel: RPC: 2456 xprt_connect_status: error 99 connecting to server nas-rwc-is2
> Aug 8 02:58:32 reshpc115 kernel: nfs_statfs: statfs error = 5
> ..snip..
>
> I am using all supported packages/kernels for lustre and on servers without
> the lustre clients installed I have no issues with nfs. Does the interval
> between these errors mean anything?
>
> Any help would be greatly appreciated.
>
> Thanks,
> -J
>
> --
> reshpc115:~ # uname -a
> Linux reshpc115 2.6.27.29-0.1-default #1 SMP 2009-08-15 17:53:59 +0200 x86_64 x86_64 x86_64 GNU/Linux
> reshpc115:~ # rpm -qa | grep -i lustre
> lustre-client-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
> lustre-client-modules-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
> reshpc115:~ # rpm -qa | grep -i kernel-ib
> kernel-ib-1.4.2-2.6.27.29_0.1_default
> --
>
>
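
The numeric codes in the quoted log can be read as Linux errno values. On x86_64, which is what the uname output above reports, 5 is EIO and 99 is EADDRNOTAVAIL. The sketch below decodes only those two codes. The function names are made up for illustration, and it says nothing about what is actually causing the errors.

```shell
#!/bin/sh
# Map the two errno values seen in the quoted log to their Linux names.
# Any other value is reported as unknown rather than guessed.
errno_name() {
    case "$1" in
        5)  echo "EIO (Input/output error)" ;;
        99) echo "EADDRNOTAVAIL (Cannot assign requested address)" ;;
        *)  echo "unknown ($1)" ;;
    esac
}

# Pull the number that follows "error" out of a log line such as
# "xprt_connect_status: error 99" or "nfs_statfs: statfs error = 5".
decode_line() {
    code=$(printf '%s\n' "$1" | sed -n 's/.*error[ =]*\([0-9][0-9]*\).*/\1/p')
    if [ -n "$code" ]; then
        errno_name "$code"
    fi
}

decode_line "nfs_statfs: statfs error = 5"
decode_line "RPC: 2440 xprt_connect_status: error 99"
```

Read this way, each statfs failure (EIO) in the log directly follows an RPC connect attempt that failed with EADDRNOTAVAIL.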