[Lustre-discuss] l_getgroups message
Andreas Dilger
adilger at sun.com
Mon Aug 18 16:59:09 PDT 2008
On Aug 18, 2008 15:46 +0200, Heiko Schroeter wrote:
> from time to time we see these messages on our MDS 1.6.5.1 during
> copying data onto lustre.
>
> Is this just informational, or does it indicate a broken setup or network
> load problems?
>
> We checked the group rights and they look ok to us. The lustre MDS system
> including clients runs with YP setup. It seems to us that the message comes
> from "lustre-1.6.5.1/lustre/utils/l_getgroups.c". But we cannot nail down
> the real reason.
> Aug 18 14:39:53 mds1 l_getgroups: LONG OP getgrent loop: 25 elapsed, 3 expected
> Aug 18 14:39:53 mds1 l_getgroups: LONG OP get_groups_local: 25 elapsed, 10 expected
It looks as though YP is taking longer to answer than Lustre expects. You
should be able to silence these messages by increasing the timeout in
/proc/fs/lustre/mds/{mds}/group_acquire_expire. You might also reduce the
load on the YP server by lengthening the group cache lifetime, i.e. by raising
/proc/fs/lustre/mds/{mds}/group_expire_interval (default 600 seconds).
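
As a rough illustration of adjusting the two tunables named above, here is a
minimal Python sketch. The helper names (get_mds_tunable, set_mds_tunable) are
hypothetical and not part of Lustre; the sketch simply assumes the files under
/proc/fs/lustre/mds/{mds}/ accept a plain number written as text, and it
matches every MDS directory with a glob rather than naming one. It must run as
root on the MDS, and the new values are for you to choose.

```python
import glob
import os

# Default location of the per-MDS tunables mentioned in this thread.
DEFAULT_ROOT = "/proc/fs/lustre/mds"


def get_mds_tunable(name, root=DEFAULT_ROOT):
    """Return {path: current value} for <root>/<mds>/<name> on every MDS found."""
    values = {}
    for path in sorted(glob.glob(os.path.join(root, "*", name))):
        with open(path) as f:
            values[path] = f.read().strip()
    return values


def set_mds_tunable(name, value, root=DEFAULT_ROOT):
    """Write value to <root>/<mds>/<name> on every MDS found.

    Returns the list of paths written, so an empty list means
    no matching tunable was found.
    """
    written = []
    for path in sorted(glob.glob(os.path.join(root, "*", name))):
        with open(path, "w") as f:
            f.write(f"{value}\n")
        written.append(path)
    return written
```

For example, get_mds_tunable("group_expire_interval") shows the current cache
lifetime, and set_mds_tunable("group_expire_interval", 1200) would double the
600 second default; 1200 is only an illustrative number. The same call with
"group_acquire_expire" adjusts the timeout.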
The message itself isn't harmful, just a warning at this point that your name
services are taking longer than expected.
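
If you want a rough idea of how slow the name service is, one option is to
time a full walk of the group database from the MDS. The sketch below is only
an approximation of the "getgrent loop" named in the log, not the l_getgroups
code itself: it uses Python's standard grp.getgrall(), which goes through the
system's configured group sources, so whether YP entries are included depends
on the local name service configuration. The function name is hypothetical.

```python
import grp
import time


def time_group_enumeration():
    """Enumerate all group entries and return (entry_count, elapsed_seconds)."""
    start = time.monotonic()
    groups = grp.getgrall()  # walks the whole group database
    elapsed = time.monotonic() - start
    return len(groups), elapsed
```

Calling time_group_enumeration() a few times while a copy is running should
show whether lookups slow down under load. The thread does not state the units
of the "elapsed" and "expected" figures in the log, so treat any comparison
with them as indicative only.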
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.