[Lustre-discuss] Kernel oops after cat on /proc/fs/lustre/mgs/MGS/exports/*/stats

Wojciech Turek wjt27 at cam.ac.uk
Fri Apr 23 07:07:53 PDT 2010


Hi,

This is a known bug that is fixed in Lustre 1.8.2:

https://bugzilla.lustre.org/show_bug.cgi?id=21420
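
As a quick sanity check on the quoted trace (this is only a sketch and is not
taken from the bug report): the faulting address in CR2 appears to be RAX plus
0x20, and the first bytes of the "Code:" line decode to a load from 0x20(%rax),
so the oops looks like a dereference of a bad pointer held in RAX. This assumes
the "Code:" line starts at the faulting RIP, as I believe x86_64 kernels of
that generation print it. The helper name decode_mov_disp8 is made up for
illustration and handles only this one instruction form.

```python
# Values copied from the quoted oops report.
RAX = 0xFFFFFFFF00040004
CR2 = 0xFFFFFFFF00040024
CODE = bytes.fromhex("488b5020")  # first four bytes of the "Code:" line


def decode_mov_disp8(insn: bytes):
    """Decode only 'REX.W 8B /r' with mod=01 (base register + 8-bit offset).

    Returns (dest_reg, base_reg, displacement). Hypothetical helper, not a
    general disassembler.
    """
    regs = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
    rex, opcode, modrm, disp = insn
    assert rex == 0x48 and opcode == 0x8B, "only REX.W MOV r64, r/m64 handled"
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    assert mod == 1 and rm != 4, "only base+disp8 without a SIB byte handled"
    if disp >= 128:  # disp8 is sign-extended
        disp -= 256
    return regs[reg], regs[rm], disp


dest, base, disp = decode_mov_disp8(CODE)
fault_addr = (RAX + disp) & 0xFFFFFFFFFFFFFFFF  # wrap to 64 bits

print(f"mov {disp:#x}(%{base}),%{dest}")
print(hex(fault_addr), fault_addr == CR2)
```

If the computed address matches CR2, the register dump and the "Code:" bytes
are at least consistent with each other; it says nothing about where the bad
pointer came from.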

Best regards

Wojciech

On 23 April 2010 13:18, Christopher Huhn <C.Huhn at gsi.de> wrote:

> Dear lustre wizards,
>
> we are experiencing problems on our MDS and our Lustre expert is abroad
> (he just attended the LUG meeting).
>
> One of the symptoms we observe is a reproducible kernel oops when
> viewing some stats files beneath /proc/fs/lustre/mgs/MGS/exports:
>
>    mds:~# cat /proc/fs/lustre/mgs/MGS/exports/10.12... at tcp/stats
>    Killed
>    mds:~#  mds kernel: Oops: 0000 [38] SMP
>    Apr 23 13:23:19 mds kernel: Unable to handle kernel paging request
>    at ffffffff00040024 RIP:
>    Apr 23 13:23:19 mds kernel: [<ffffffff883d6680>]
>    :obdclass:lprocfs_stats_seq_show+0x80/0x1e0
>    Apr 23 13:23:19 mds kernel: PGD 203067 PUD 0
>    Apr 23 13:23:19 mds kernel: Oops: 0000 [38] SMP
>    Apr 23 13:23:20 mds kernel: CPU 7
>    Apr 23 13:23:20 mds kernel: Modules linked in: mds fsfilt_ldiskfs(F)
>    mgs mgc ldiskfs crc16 lustre lov mdc lquota osc ksocklnd ptlrpc
>    obdclass lnet lvfs libcfs xt_tcpudp iptable_filter ip_tables
>    x_tables drbd cn button ac battery bonding xfs ipmi_si ipmi_devintf
>    ipmi_msghandler serio_raw psmouse joydev pcspkr i2c_i801 i2c_core
>    shpchp pci_hotplug evdev parport_pc parport ext3 jbd mbcache
>    dm_mirror dm_snapshot dm_mod raid10 raid456 xor raid1 raid0
>    multipath linear md_mod sd_mod ide_cd cdrom ata_generic libata
>    generic usbhid hid piix 3w_9xxx floppy ide_core ehci_hcd uhci_hcd
>    e1000 scsi_mod thermal processor fan
>    Apr 23 13:23:20 mds kernel: Pid: 7293, comm: cat Tainted: GF
>    2.6.22+lustre1.6.7.2+0.credativ.etch.1 #2
>    Apr 23 13:23:20 mds kernel: RIP: 0010:[<ffffffff883d6680>]
>    [<ffffffff883d6680>] :obdclass:lprocfs_stats_seq_show+0x80/0x1e0
>    Apr 23 13:23:20 mds kernel: RSP: 0018:ffff8103ba5f9e48  EFLAGS: 00010282
>    Apr 23 13:23:20 mds kernel: RAX: ffffffff00040004 RBX:
>    7fffffffffffffff RCX: 0000000000000006
>    Apr 23 13:23:20 mds kernel: RDX: 0101010101010101 RSI:
>    0000000000000000 RDI: 0000000000000000
>    Apr 23 13:23:20 mds kernel: RBP: 0000000000000000 R08:
>    0000000000000008 R09: 0000000000000000
>    Apr 23 13:23:20 mds kernel: R10: 0000000000000000 R11:
>    0000000000000000 R12: 0000000000000000
>    Apr 23 13:23:20 mds kernel: R13: 0000000000000000 R14:
>    0000000000000000 R15: ffff8108000a1760
>    Apr 23 13:23:20 mds kernel: FS:  00002b4a366786d0(0000)
>    GS:ffff81081004b840(0000) knlGS:0000000000000000
>    Apr 23 13:23:20 mds kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
>    000000008005003b
>    Apr 23 13:23:20 mds kernel: CR2: ffffffff00040024 CR3:
>    000000078f018000 CR4: 00000000000006e0
>    Apr 23 13:23:20 mds kernel: Process cat (pid: 7293, threadinfo
>    ffff8103ba5f8000, task ffff8107dc299530)
>    Apr 23 13:23:20 mds kernel: Stack:  0000000000000202
>    ffffffff00000000 ffffffff00040004 ffff81067dae2640
>    Apr 23 13:23:20 mds kernel: 000000004bd18327 00000000000ca54d
>    0000000000000000 ffff81067dae2640
>    Apr 23 13:23:20 mds kernel: ffffffff00040004 0000000000040004
>    0000000000000400 0000000000000000
>    Apr 23 13:23:20 mds kernel: Call Trace:
>    Apr 23 13:23:20 mds kernel: [<ffffffff8029c0ac>] seq_read+0x105/0x28d
>    Apr 23 13:23:20 mds kernel: [<ffffffff80283f23>] vfs_read+0xcb/0x153
>    Apr 23 13:23:20 mds kernel: [<ffffffff802842bf>] sys_read+0x45/0x6e
>    Apr 23 13:23:20 mds kernel: [<ffffffff80209d8e>] system_call+0x7e/0x83
>    Apr 23 13:23:20 mds kernel:
>    Apr 23 13:23:20 mds kernel:
>    Apr 23 13:23:20 mds kernel: Code: 48 8b 50 20 48 8b 48 28 4c 03 60
>    10 4c 03 68 18 48 39 d3 48
>    Apr 23 13:23:20 mds kernel: RIP  [<ffffffff883d6680>]
>    :obdclass:lprocfs_stats_seq_show+0x80/0x1e0
>     mds kernel: CR2: ffffffff00040024
>    Apr 23 13:23:20 mds kernel: RSP <ffff8103ba5f9e48>
>    Apr 23 13:23:20 mds kernel: CR2: ffffffff00040024
>
>
> Server and affected client both run Lustre 1.6.7.2 on Debian Etch/x86_64
> in this case. The behavior does not change after a client reboot.
>
> All hints on how to solve this are really appreciated.
>
> Kind regards,
>    Christopher
>
> --
> Christopher Huhn
> Linux therapist
>
> GSI Helmholtzzentrum fuer Schwerionenforschung GmbH
> Planckstr. 1
> 64291 Darmstadt
> http://www.gsi.de/
>
> Gesellschaft mit beschraenkter Haftung
>
> Sitz der Gesellschaft / Registered Office:                    Darmstadt
> Handelsregister       / Commercial Register:
>                                        Amtsgericht Darmstadt, HRB 1528
>
> Geschaeftsfuehrung    / Managing Directors:
>                                 Professor Dr. Dr. h.c. Horst Stoecker,
>                                                    Christiane Neumann,
>                                                   Dr. Hartmut Eickhoff
> Vorsitzende des Aufsichtsrates / Supervisory Board Chair:
>                                           Dr. Beatrix Vierkorn-Rudolph
> Stellvertreter        / Deputy Chair:                 Dr. Rolf Bernhard
>
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>



-- 
--
Wojciech Turek

Assistant System Manager

High Performance Computing Service
University of Cambridge
Email: wjt27 at cam.ac.uk
Tel: (+)44 1223 763517