[Lustre-discuss] 2.6.22

Papp Tamás tompos at martos.bme.hu
Tue Jun 17 08:01:23 PDT 2008


Bernd Schubert wrote:
> Hello Tamás,
>
> On Tuesday 17 June 2008 16:41:55 Papp Tamás wrote:
>   
>> Dear All,
>>
>> Is there any reason not to use 2.6.22.x kernels newer than
>> 2.6.22.14, or should they work?
>>
>>
>> I've just compiled it against 2.6.22.19 and I can mount the cluster, but
>> after the first ls command it oopses and hangs at that point.
>>     
>
> I didn't have the time to test lustre-1.6.5, but it would be quite helpful if 
> you could paste the oops.
>   

Hello!

Sure.

This is from dmesg:

PM: Adding info for No Bus:lnet
Lustre: OBD class driver, info@clusterfs.com
        Lustre Version: 1.6.5
        Build Version: 
1.6.5-19700101010000-PRISTINE-.usr.src.linux-2.6.22.19.-2.6.22.19
PM: Adding info for No Bus:obd_psdev
Lustre: Added LNI 192.168.0.123@tcp [8/256]
Lustre: Accept secure, port 988
LustreError: 2007:0:(router_proc.c:1013:lnet_proc_init()) couldn't 
create proc entry sys/lnet/stats
Lustre: Lustre Client File System; info@clusterfs.com
Lustre: Request x1 sent from MGC10.1.1.1@tcp to NID 10.1.1.1@tcp 5s ago 
has timed out (limit 5s).
Lustre: Changing connection for MGC10.1.1.1@tcp to 
MGC10.1.1.1@tcp_1/10.1.1.2@tcp
Lustre: Request x3 sent from MGC10.1.1.1@tcp to NID 10.1.1.2@tcp 5s ago 
has timed out (limit 5s).
LustreError: 2005:0:(client.c:716:ptlrpc_import_delay_req()) @@@ 
IMP_INVALID  req@ffff810073341000 x4/t0 
o501->MGS@MGC10.1.1.1@tcp_1:26/25 lens 136/248 e 0 to 100 dl 0 ref 1 fl 
Rpc:/0/0 rc 0/0
LustreError: 15c-8: MGC10.1.1.1@tcp: The configuration from log 
'cubefs-client' failed (-108). This may be the result of communication 
errors between this node and the MGS, a bad configuration, or other 
errors. See the syslog for more information.
LustreError: 2005:0:(llite_lib.c:1061:ll_fill_super()) Unable to process 
log: -108
Lustre: client ffff8100734e6000 umount complete
LustreError: 2005:0:(obd_mount.c:1951:lustre_fill_super()) Unable to 
mount  (-108)
PM: Adding info for No Bus:vcs1
PM: Adding info for No Bus:vcsa1
PM: Removing info for No Bus:vcs1
PM: Removing info for No Bus:vcsa1
PM: Adding info for No Bus:vcs1
PM: Adding info for No Bus:vcsa1
PM: Removing info for No Bus:vcs1
PM: Removing info for No Bus:vcsa1
PM: Adding info for No Bus:vcs1
PM: Adding info for No Bus:vcsa1
PM: Removing info for No Bus:vcs1
PM: Removing info for No Bus:vcsa1
PM: Adding info for No Bus:vcs1
PM: Adding info for No Bus:vcsa1
PM: Adding info for No Bus:vcs3
PM: Adding info for No Bus:vcsa3
PM: Removing info for No Bus:vcs3
PM: Removing info for No Bus:vcsa3
PM: Adding info for No Bus:vcs3
PM: Adding info for No Bus:vcsa3
PM: Adding info for No Bus:vcs4
PM: Adding info for No Bus:vcsa4
PM: Removing info for No Bus:vcs4
PM: Removing info for No Bus:vcsa4
PM: Adding info for No Bus:vcs4
PM: Adding info for No Bus:vcsa4
PM: Adding info for No Bus:vcs2
PM: Adding info for No Bus:vcsa2
PM: Removing info for No Bus:vcs2
PM: Removing info for No Bus:vcsa2
PM: Adding info for No Bus:vcs2
PM: Adding info for No Bus:vcsa2
PM: Adding info for No Bus:vcs5
PM: Adding info for No Bus:vcsa5
PM: Removing info for No Bus:vcs5
PM: Removing info for No Bus:vcsa5
PM: Adding info for No Bus:vcs5
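
For reference, the -108 in the mount errors above looks like a negated Linux errno value; 108 is ESHUTDOWN in asm-generic/errno.h. Below is a minimal sketch, assuming the log has been saved as plain text, that pulls such return codes out of mail-wrapped LustreError lines. The function name and the small errno table are made up for illustration and are not part of Lustre.

```python
import re

# Linux values from asm-generic/errno.h, hard-coded so the sketch gives the
# same answer on any host. Only a few network-related codes are listed.
LINUX_ERRNO = {107: "ENOTCONN", 108: "ESHUTDOWN", 110: "ETIMEDOUT"}

# A negative number directly after "(" or ": ", as in "(-108)" or "log: -108".
RETURN_CODE = re.compile(r"(?:\(|:\s)(-\d+)\b")


def decode_return_codes(lines):
    """Return sorted, de-duplicated (code, errno name) pairs found in lines."""
    # Re-join wrapped lines so a code split from its message is still found.
    text = " ".join(line.strip() for line in lines)
    codes = sorted({int(code) for code in RETURN_CODE.findall(text)})
    return [(code, LINUX_ERRNO.get(-code, "unknown")) for code in codes]


# Two of the wrapped error lines from the dmesg output above.
sample = [
    "LustreError: 2005:0:(llite_lib.c:1061:ll_fill_super()) Unable to process ",
    "log: -108",
    "LustreError: 2005:0:(obd_mount.c:1951:lustre_fill_super()) Unable to ",
    "mount  (-108)",
]

for code, name in decode_return_codes(sample):
    print(code, name)
```

Codes missing from the table are reported as "unknown" rather than guessed.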


And this is from the messages log:

Jun 17 16:06:20 core-123 kernel: Lustre: OBD class driver, 
info@clusterfs.com
Jun 17 16:06:20 core-123 kernel:         Lustre Version: 1.6.5
Jun 17 16:06:20 core-123 kernel:         Build Version: 
1.6.5-19700101010000-PRISTINE-.usr.src.linux-2.6.22.19.-2.6.22.19
Jun 17 16:06:20 core-123 kernel: Lustre: Added LNI 192.168.0.123@tcp [8/256]
Jun 17 16:06:20 core-123 kernel: Lustre: Accept secure, port 988
Jun 17 16:06:20 core-123 kernel: LustreError: 
2014:0:(router_proc.c:1013:lnet_proc_init()) couldn't create proc entry 
sys/lnet/stats
Jun 17 16:06:21 core-123 kernel: Lustre: Lustre Client File System; 
info@clusterfs.com
Jun 17 16:06:21 core-123 kernel: Lustre: 
cubefs-clilov-ffff810076303800.lov: set parameter stripesize=8388608
Jun 17 16:06:26 core-123 kernel: Lustre: Request x8 sent from 
cubefs-MDT0000-mdc-ffff810076303800 to NID 10.1.1.2@tcp 5s ago has timed 
out (limit 5s).
Jun 17 16:06:46 core-123 kernel: Lustre: Changing connection for 
cubefs-MDT0000-mdc-ffff810076303800 to 10.1.1.1@tcp/10.1.1.1@tcp
Jun 17 16:06:46 core-123 kernel: Lustre: Client cubefs-client has started
Jun 17 16:06:54 core-123 pcscd: winscard.c:219:SCardConnect() Reader 
E-Gate 0 0 Not Found
Jun 17 16:06:54 core-123 pcscd:last message repeated 3 times
Jun 17 16:06:54 core-123 acpid: client connected from 2208[0:0]
Jun 17 16:06:58 core-123 acpid: client connected from 2218[0:0]
Jun 17 16:07:03 core-123 acpid: client connected from 2226[0:0]
Jun 17 16:09:16 core-123 kernel: Unable to handle kernel paging request 
at 00000000ffffffff RIP:
Jun 17 16:09:16 core-123 kernel:  [<ffffffff8108beef>] 
kmem_cache_alloc+0x2f/0x60
Jun 17 16:09:16 core-123 kernel: PGD 73cd9067 PUD 0
Jun 17 16:09:16 core-123 kernel: Oops: 0000 [1] SMP
Jun 17 16:09:16 core-123 kernel: CPU 0
Jun 17 16:09:16 core-123 kernel: Modules linked in: cifs mgc lustre lov 
mdc lquota osc ksocklnd ptlrpc obdclass lnet lvfs libcfs rfcomm l2cap 
bluetooth nfs lockd nfs_acl fuse sunrpc snd_hda_intel snd_seq_dummy 
snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss 
snd_mixer_oss snd_pcm snd_timer iTCO_wdt iTCO_vendor_support r8169 snd 
soundcore snd_page_alloc floppy sg parport_pc parport ata_piix 
ata_generic libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd 
ehci_hcd
Jun 17 16:09:16 core-123 kernel: Pid: 2339, comm: ll_sa_2337 Not tainted 
2.6.22.19 #3
Jun 17 16:09:16 core-123 kernel: RIP: 0010:[<ffffffff8108beef>]  
[<ffffffff8108beef>] kmem_cache_alloc+0x2f/0x60
Jun 17 16:09:16 core-123 kernel: RSP: 0018:ffff810073d47da0  EFLAGS: 
00010006
Jun 17 16:09:16 core-123 kernel: RAX: 0000000000000000 RBX: 
0000000000000246 RCX: 00000000ffffffff
Jun 17 16:09:16 core-123 kernel: RDX: ffff81007fcf7210 RSI: 
00000000000000d0 RDI: ffff81000103e800
Jun 17 16:09:16 core-123 kernel: RBP: ffff810073c9607c R08: 
ffff810073d46000 R09: 0000000000000002
Jun 17 16:09:16 core-123 kernel: R10: 00000000ffffffff R11: 
0000000000000001 R12: ffff810073d2e680
Jun 17 16:09:16 core-123 kernel: R13: ffff810073d47ee0 R14: 
0000000000000000 R15: ffff81007945cab8
Jun 17 16:09:16 core-123 kernel: FS:  00002b9794ef7710(0000) 
GS:ffffffff81362000(0000) knlGS:0000000000000000
Jun 17 16:09:16 core-123 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 
000000008005003b
Jun 17 16:09:16 core-123 kernel: CR2: 00000000ffffffff CR3: 
0000000073c41000 CR4: 00000000000006e0
Jun 17 16:09:16 core-123 kernel: Process ll_sa_2337 (pid: 2339, 
threadinfo ffff810073d46000, task ffff810073c195f0)
Jun 17 16:09:16 core-123 kernel: Stack:  ffff810073d47ee0 
ffffffff810a4f12 ffff810073d47ee0 ffff810073c9607c
Jun 17 16:09:16 core-123 kernel:  ffff81007b3e9580 0000000000000000 
ffff810079c34b40 ffffffff884e9b87
Jun 17 16:09:16 core-123 kernel:  0000000000000000 0000000000000000 
0000000000000000 0000000000000000
Jun 17 16:09:16 core-123 kernel: Call Trace:
Jun 17 16:09:16 core-123 kernel:  [<ffffffff810a4f12>] d_alloc+0x22/0x1d0
Jun 17 16:09:16 core-123 kernel:  [<ffffffff884e9b87>] 
:lustre:ll_statahead_thread+0xed7/0x1610
Jun 17 16:09:16 core-123 kernel:  [<ffffffff81029150>] 
default_wake_function+0x0/0x10
Jun 17 16:09:16 core-123 kernel:  [<ffffffff8100acc8>] child_rip+0xa/0x12
Jun 17 16:09:16 core-123 kernel:  [<ffffffff884a9270>] 
:lustre:ll_inode_permission+0x0/0xc0
Jun 17 16:09:16 core-123 kernel:  [<ffffffff884e8cb0>] 
:lustre:ll_statahead_thread+0x0/0x1610
Jun 17 16:09:16 core-123 kernel:  [<ffffffff8100acbe>] child_rip+0x0/0x12
Jun 17 16:09:16 core-123 kernel:
Jun 17 16:09:16 core-123 kernel:
Jun 17 16:09:16 core-123 kernel: Code: 48 8b 04 c1 48 89 42 10 53 9d 5b 
48 89 c8 c3 66 90 49 89 d0
Jun 17 16:09:16 core-123 kernel: RIP  [<ffffffff8108beef>] 
kmem_cache_alloc+0x2f/0x60
Jun 17 16:09:16 core-123 kernel:  RSP <ffff810073d47da0>
Jun 17 16:09:16 core-123 kernel: CR2: 00000000ffffffff
Jun 17 16:09:36 core-123 kernel: Unable to handle kernel paging request 
at 00000000fffffffe RIP:
Jun 17 16:09:36 core-123 kernel:  [<ffffffff8108beef>] 
kmem_cache_alloc+0x2f/0x60
Jun 17 16:09:36 core-123 kernel: PGD 73c73067 PUD 0
Jun 17 16:09:36 core-123 kernel: Oops: 0000 [2] SMP
Jun 17 16:09:36 core-123 kernel: CPU 0
Jun 17 16:09:36 core-123 kernel: Modules linked in: cifs mgc lustre lov 
mdc lquota osc ksocklnd ptlrpc obdclass lnet lvfs libcfs rfcomm l2cap 
bluetooth nfs lockd nfs_acl fuse sunrpc snd_hda_intel snd_seq_dummy 
snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss 
snd_mixer_oss snd_pcm snd_timer iTCO_wdt iTCO_vendor_support r8169 snd 
soundcore snd_page_alloc floppy sg parport_pc parport ata_piix 
ata_generic libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd 
ehci_hcd
Jun 17 16:09:36 core-123 kernel: Pid: 2342, comm: sshd Not tainted 
2.6.22.19 #3
Jun 17 16:09:36 core-123 kernel: RIP: 0010:[<ffffffff8108beef>]  
[<ffffffff8108beef>] kmem_cache_alloc+0x2f/0x60
Jun 17 16:09:36 core-123 kernel: RSP: 0018:ffff810073d4fbc8  EFLAGS: 
00010002
Jun 17 16:09:36 core-123 kernel: RAX: 0000000000000000 RBX: 
0000000000000246 RCX: 00000000fffffffe
Jun 17 16:09:36 core-123 kernel: RDX: ffff81007fcf7210 RSI: 
00000000000000d0 RDI: ffff81000103e800
Jun 17 16:09:36 core-123 kernel: RBP: ffff81007e2df340 R08: 
0000000000000001 R09: 0000000000000000
Jun 17 16:09:36 core-123 kernel: R10: 0000000000000085 R11: 
0000000000000001 R12: ffff81007e2df340
Jun 17 16:09:36 core-123 kernel: R13: ffff810073d4fc78 R14: 
0000000000000000 R15: ffff81007e2e4020
Jun 17 16:09:36 core-123 kernel: FS:  00002b21291a53b0(0000) 
GS:ffffffff81362000(0000) knlGS:0000000000000000
Jun 17 16:09:36 core-123 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 
000000008005003b
Jun 17 16:09:36 core-123 kernel: CR2: 00000000fffffffe CR3: 
00000000741eb000 CR4: 00000000000006e0
Jun 17 16:09:36 core-123 kernel: Process sshd (pid: 2342, threadinfo 
ffff810073d4e000, task ffff81007417eea0)
Jun 17 16:09:36 core-123 kernel: Stack:  0000000000000000 
ffffffff810a4f12 0000000000000000 ffff81007e2df340
Jun 17 16:09:36 core-123 kernel:  ffff810073d4fea8 ffff810073d4fc78 
ffff810073d4fc88 ffffffff81099d0c
Jun 17 16:09:36 core-123 kernel:  ffff810073d4fca8 ffff810037f36600 
ffff81007e2e40d8 ffff810073d4c00c
Jun 17 16:09:36 core-123 kernel: Call Trace:
Jun 17 16:09:36 core-123 kernel:  [<ffffffff810a4f12>] d_alloc+0x22/0x1d0
Jun 17 16:09:36 core-123 kernel:  [<ffffffff81099d0c>] do_lookup+0x19c/0x210
Jun 17 16:09:36 core-123 kernel:  [<ffffffff8109bf22>] 
__link_path_walk+0x882/0xe20
Jun 17 16:09:36 core-123 kernel:  [<ffffffff810a3e5f>] dput+0x1f/0x130
Jun 17 16:09:36 core-123 kernel:  [<ffffffff8109c51b>] 
link_path_walk+0x5b/0x100
Jun 17 16:09:36 core-123 kernel:  [<ffffffff8109c7dc>] 
do_path_lookup+0x8c/0x260
Jun 17 16:09:36 core-123 kernel:  [<ffffffff8109d67a>] 
__path_lookup_intent_open+0x6a/0xd0
Jun 17 16:09:36 core-123 kernel:  [<ffffffff8109d8ab>] open_namei+0x8b/0x6f0
Jun 17 16:09:36 core-123 kernel:  [<ffffffff81091024>] sys_statfs+0x94/0xc0
Jun 17 16:09:36 core-123 kernel:  [<ffffffff8107df8d>] 
free_pages_and_swap_cache+0x8d/0xb0
Jun 17 16:09:36 core-123 kernel:  [<ffffffff8109010c>] 
do_filp_open+0x1c/0x50
Jun 17 16:09:36 core-123 kernel:  [<ffffffff81090194>] do_sys_open+0x54/0xf0
Jun 17 16:09:36 core-123 kernel:  [<ffffffff8100a02c>] tracesys+0xdc/0xe1
Jun 17 16:09:36 core-123 kernel:
Jun 17 16:09:36 core-123 kernel:
Jun 17 16:09:36 core-123 kernel: Code: 48 8b 04 c1 48 89 42 10 53 9d 5b 
48 89 c8 c3 66 90 49 89 d0
Jun 17 16:09:36 core-123 kernel: RIP  [<ffffffff8108beef>] 
kmem_cache_alloc+0x2f/0x60
Jun 17 16:09:36 core-123 kernel:  RSP <ffff810073d4fbc8>
Jun 17 16:09:36 core-123 kernel: CR2: 00000000fffffffe
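
Both reports above fault in kmem_cache_alloc called from d_alloc, first in the statahead thread (ll_sa_2337) and then in sshd. Below is a rough sketch for condensing wrapped syslog output like this into one tuple per oops. The regular expressions and the function names are assumptions made for illustration, and they only cover the 2.6.22-era x86_64 format shown here.

```python
import re

# Syslog prefix as it appears in this log, e.g. "Jun 17 16:09:16 core-123 kernel: "
SYSLOG_PREFIX = re.compile(r"[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d \S+ kernel: ?")
MARKER = "Unable to handle kernel paging request"


def _first(pattern, text):
    """Return group 1 of the first match, or None if nothing matches."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


def summarize_oopses(lines):
    """Return one (count, fault_address, function, pid, comm) tuple per oops."""
    # Re-join mail-wrapped lines, then drop the syslog prefixes.
    text = " ".join(line.strip() for line in lines)
    text = SYSLOG_PREFIX.sub("", text)
    reports = []
    # Everything after each marker is treated as one oops report.
    for chunk in text.split(MARKER)[1:]:
        count = _first(r"Oops: [0-9a-f]+ \[(\d+)\]", chunk)
        pid = _first(r"Pid: (\d+),", chunk)
        reports.append((
            int(count) if count else None,
            _first(r"\bat ([0-9a-f]+)", chunk),               # faulting address
            _first(r"\[<[0-9a-f]+>\]\s+(\w+)\+0x", chunk),    # function at RIP
            int(pid) if pid else None,
            _first(r"comm: (\S+)", chunk),
        ))
    return reports


# The identifying lines of the first oops, wrapped as in the mail.
sample = [
    "Jun 17 16:09:16 core-123 kernel: Unable to handle kernel paging request ",
    "at 00000000ffffffff RIP:",
    "Jun 17 16:09:16 core-123 kernel:  [<ffffffff8108beef>] ",
    "kmem_cache_alloc+0x2f/0x60",
    "Jun 17 16:09:16 core-123 kernel: Oops: 0000 [1] SMP",
    "Jun 17 16:09:16 core-123 kernel: Pid: 2339, comm: ll_sa_2337 Not tainted ",
    "2.6.22.19 #3",
]

for report in summarize_oopses(sample):
    print(report)
```

Fields that cannot be found are returned as None, so a truncated report still yields a tuple.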


And I rebooted it.

The system is an up-to-date FC8.

Thank you,

tamas


