[Lustre-discuss] Looping in __d_lookup
Alex Lyashkov
Alexey.Lyashkov at Sun.COM
Wed May 21 23:22:04 PDT 2008
On Wed, 2008-05-21 at 21:05 +0200, Jakob Goldbach wrote:
> >
> > kernel is 2.6.23.17 with patchless lustre 1.6.4.3,
>
> I'm running 1.6.4.3 patchless as well against a 2.6.18 vanilla kernel.
> Or at least that is what I thought. OpenVz patch effectively makes the
> kernel a 2.6.18++ kernel because they add features from newer kernels in
> their maintained 2.6.18 based kernel.
>
> So the lockup in __d_lookup may just relate to newer patchless clients.
>
> I got a debug patch from the OpenVz community which indicates dcache
> chain corruption in a lustre code path.
>
> Patch snippet is
>
> --- ./fs/dcache.c.ddebug2 2008-05-21 14:52:15.000000000 +0400
> +++ ./fs/dcache.c 2008-05-21 15:10:06.000000000 +0400
> @@ -1350,6 +1350,18 @@ static void __d_rehash(struct dentry * e
>  {
>
>  	entry->d_flags &= ~DCACHE_UNHASHED;
> +	if (!spin_is_locked(&dcache_lock)) {
> +		printk(KERN_ERR "Dcache lock is not taken on add\n");
> +		dump_stack();
> +	} else if (list->first != NULL &&
> +			list->first->pprev != &list->first) {
> +		printk(KERN_ERR "Dcache chain corruption:\n");
> +		printk(KERN_ERR "Chain %p --next-> %p\n",
> +				list, list->first);
> +		printk(KERN_ERR "First %p <-pprev- %p\n",
> +				list->first, list->first->pprev);
> +		dump_stack();
> +	}
>  	hlist_add_head_rcu(&entry->d_hash, list);
>  }
>
> and stack trace
>
> [ 6447.548789] Dcache chain corruption:
> [ 6447.549529] Chain ffff8100010de880 --next-> ffff8100b4ce00b0
> [ 6447.550699] First ffff8100b4ce00b0 <-pprev- 0000000000200200
> [ 6447.551711]
> [ 6447.551713] Call Trace:
> [ 6447.552809] [<ffffffff8020ae20>] show_trace+0xae/0x360
> [ 6447.553784] [<ffffffff8020b0e7>] dump_stack+0x15/0x17
> [ 6447.554727] [<ffffffff8029ee94>] __d_rehash+0x75/0x97
> [ 6447.555797] [<ffffffff8029ef2a>] d_rehash+0x74/0x91
> [ 6447.556846] [<ffffffff883b4c6a>] :lustre:ll_revalidate_it+0xa1a/0xd90
> [ 6447.557966] [<ffffffff883b529c>] :lustre:ll_revalidate_nd+0x2bc/0x360
> [ 6447.559082] [<ffffffff80295741>] do_lookup+0x15d/0x193
> [ 6447.560142] [<ffffffff80296fd9>] __link_path_walk+0x409/0x10ac
> [snip]
>
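This patch and backtrace show that the dcache chain was damaged _before_
Lustre was entered. The check runs before hlist_add_head_rcu(), so it
reports the state the chain was already in: Lustre starts adding an entry
at its new position in the dentry cache and finds a damaged entry already
in the list.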
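
To illustrate the point above: the 0000000000200200 printed as pprev
matches LIST_POISON2 (0x00200200 in include/linux/poison.h), the value
hlist_del_rcu() stores in pprev when a node is unhashed. So the first
entry of the chain looks like one that was already removed but is still
reachable from the chain head. The following is only a simplified
user-space sketch, not the kernel code: the types are cut-down stand-ins,
RCU and locking are left out, and chain_head_ok() and hlist_del_poison()
are hypothetical helper names (the first wraps the condition from the
debug patch).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down user-space stand-ins for the kernel hlist types (assumption). */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* Same numeric value as the kernel's LIST_POISON2. */
#define LIST_POISON2 ((struct hlist_node **)(uintptr_t)0x00200200)

/* Insert n at the head of chain h (no RCU barriers in this sketch). */
void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	assert(n != NULL && h != NULL);
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink n from its chain and poison pprev, as hlist_del_rcu() does. */
void hlist_del_poison(struct hlist_node *n)
{
	struct hlist_node *next = n->next;

	*n->pprev = next;
	if (next)
		next->pprev = n->pprev;
	n->pprev = LIST_POISON2;
}

/* Condition from the debug patch: the first node must point back at
 * the chain head. Returns 1 if consistent, 0 if corrupted. */
int chain_head_ok(const struct hlist_head *h)
{
	return h->first == NULL || h->first->pprev == &h->first;
}
```

For example, add nodes a and b to an empty head h, call
hlist_del_poison(&b), then set h.first = &b by hand to imitate a stale
head pointer (an assumed failure mode, for illustration only).
chain_head_ok(&h) then returns 0 and b.pprev equals LIST_POISON2, which
is the pattern in the trace above. The sketch does not say which code
path left the chain in that state.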
--
Alex Lyashkov <Alexey.lyashkov at sun.com>
Lustre Group, Sun Microsystems