[Lustre-discuss] Looping in __d_lookup

Jakob Goldbach jakob at goldbach.dk
Wed May 21 12:05:09 PDT 2008


> 
> kernel is 2.6.23.17 with patchless lustre 1.6.4.3, 

I'm running 1.6.4.3 patchless as well, against a 2.6.18 vanilla kernel.
Or at least that is what I thought. The OpenVZ patch effectively makes the
kernel a 2.6.18++ kernel, because they add features from newer kernels to
their maintained 2.6.18-based kernel.

So the lockup in __d_lookup may simply be related to patchless clients on
newer kernels.

I got a debug patch from the OpenVZ community which indicates dcache
chain corruption in a Lustre code path.

The patch snippet is:

--- ./fs/dcache.c.ddebug2	2008-05-21 14:52:15.000000000 +0400
+++ ./fs/dcache.c	2008-05-21 15:10:06.000000000 +0400
@@ -1350,6 +1350,18 @@ static void __d_rehash(struct dentry * e
 {
 
  	entry->d_flags &= ~DCACHE_UNHASHED;
+	if (!spin_is_locked(&dcache_lock)) {
+		printk(KERN_ERR "Dcache lock is not taken on add\n");
+		dump_stack();
+	} else if (list->first != NULL &&
+			list->first->pprev != &list->first) {
+		printk(KERN_ERR "Dcache chain corruption:\n");
+		printk(KERN_ERR "Chain %p --next-> %p\n",
+				list, list->first);
+		printk(KERN_ERR "First %p <-pprev- %p\n",
+				list->first, list->first->pprev);
+		dump_stack();
+	}
  	hlist_add_head_rcu(&entry->d_hash, list);
 }
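
For anyone not familiar with the hlist layout, the invariant this patch
tests can be modelled in user space. This is only a minimal sketch, not
kernel code: the two structs mirror the kernel's hlist_head and hlist_node,
the add function follows hlist_add_head() without the RCU barriers, and
chain_head_ok() is a hypothetical helper name made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* User-space copies of the kernel's hash-list types. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* Same linking steps as the kernel's hlist_add_head(), minus RCU barriers. */
static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	assert(n != NULL && h != NULL);
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/*
 * Hypothetical helper: the condition the debug patch checks before adding.
 * In a healthy chain the first node's pprev points back at head->first.
 */
static int chain_head_ok(struct hlist_head *h)
{
	return h->first == NULL || h->first->pprev == &h->first;
}
```

Adding nodes through hlist_add_head() keeps chain_head_ok() true. If the
head is made to point at a node whose pprev refers somewhere else, the
helper returns false, which is the case the patch reports as "Dcache chain
corruption".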

and the resulting stack trace is:

[ 6447.548789] Dcache chain corruption:
[ 6447.549529] Chain ffff8100010de880 --next-> ffff8100b4ce00b0
[ 6447.550699] First ffff8100b4ce00b0 <-pprev- 0000000000200200
[ 6447.551711] 
[ 6447.551713] Call Trace:
[ 6447.552809]  [<ffffffff8020ae20>] show_trace+0xae/0x360
[ 6447.553784]  [<ffffffff8020b0e7>] dump_stack+0x15/0x17
[ 6447.554727]  [<ffffffff8029ee94>] __d_rehash+0x75/0x97
[ 6447.555797]  [<ffffffff8029ef2a>] d_rehash+0x74/0x91
[ 6447.556846]  [<ffffffff883b4c6a>] :lustre:ll_revalidate_it+0xa1a/0xd90
[ 6447.557966]  [<ffffffff883b529c>] :lustre:ll_revalidate_nd+0x2bc/0x360
[ 6447.559082]  [<ffffffff80295741>] do_lookup+0x15d/0x193
[ 6447.560142]  [<ffffffff80296fd9>] __link_path_walk+0x409/0x10ac
[snip]
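
One observation on the trace: the pprev value 0000000000200200 is the same
as LIST_POISON2 (0x00200200), which, as far as I can tell from the kernel
list headers, hlist_del() and hlist_del_rcu() store in pprev after unlinking
a node. So the chain head appears to point at a dentry that had already been
unhashed. How the chain got into that state is not shown by the trace. The
sketch below only demonstrates where the value comes from. It is a
user-space model under those assumptions: hlist_del_rcu_model() and
pprev_is_poisoned() are hypothetical names, and the assert() guard is part
of the model, not of the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* User-space copies of the kernel's hash-list types. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* Poison constant as defined for 2.6.18-era kernels. */
#define LIST_POISON2 ((void *)0x00200200)

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/*
 * Modelled on hlist_del_rcu(): unlink the node, leave n->next intact
 * for concurrent readers, and poison n->pprev.
 */
static void hlist_del_rcu_model(struct hlist_node *n)
{
	struct hlist_node *next = n->next;
	struct hlist_node **pprev = n->pprev;

	/* Model-only guard: a second delete would write through the poison. */
	assert(pprev != LIST_POISON2);

	*pprev = next;
	if (next)
		next->pprev = pprev;
	n->pprev = LIST_POISON2;
}

/* Hypothetical helper: true if the node looks already unhashed. */
static int pprev_is_poisoned(const struct hlist_node *n)
{
	return n->pprev == LIST_POISON2;
}
```

After hlist_del_rcu_model() the node is no longer reachable from the head
and its pprev holds 0x200200. A chain head that still references such a
node would produce exactly the "First ... <-pprev- 0000000000200200" line
above.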

See details in http://bugzilla.openvz.org/show_bug.cgi?id=895

/Jakob



