[lustre-discuss] Lustre 2.10.0 mmap() Issues

Christopher Johnston chjohnst at gmail.com
Thu Aug 10 06:46:57 PDT 2017


Sure can Peter, will do that later this morning.

On Thu, Aug 10, 2017 at 8:58 AM, Jones, Peter A <peter.a.jones at intel.com>
wrote:

> Christopher
>
> Could you please open a JIRA ticket about this?
>
> Thanks
>
> Peter
>
> On 8/8/17, 8:58 AM, "lustre-discuss on behalf of Christopher Johnston" <
> lustre-discuss-bounces at lists.lustre.org on behalf of chjohnst at gmail.com>
> wrote:
>
> At my company we use mmap() exclusively for accessing our data on Lustre.
> For starters we are seeing some very weird (or maybe expected) poor random
> read/write performance for these types of access patterns.  I decided to
> give Lustre 2.10.0 a try with ZFS 0.7.0 as the backend instead of ldiskfs and
> after compiling and building the RPMs, the filesystem mounted up just
> fine.  I then started doing some iozone runs to test the stability of the
> filesystem and although iozone does complete its benchmark I am seeing a lot
> of stack traces coming out of various kernel threads.  Note I am only
> seeing this when using mmap().  We also ran our application just to
> verify.  I am also going to try with an ldiskfs format to see if
> this changes anything.
>
> My ZFS settings are modest, with 50% of memory allocated to the ARC:
>
> options zfs zfs_arc_max=3921674240 zfs_prefetch_disable=1
> recordsize=1M
> compression=on
> dedup=off
> xattr=sa
> dnodesize=auto
>
>
> Below is the output from the stack trace:
>
> Aug  8 09:38:04 dev-gc01-oss001 kernel: BUG: Bad page state: 87 messages
> suppressed
> Aug  8 09:38:04 dev-gc01-oss001 kernel: BUG: Bad page state in process
> socknal_sd00_01  pfn:1cbac1
> Aug  8 09:38:04 dev-gc01-oss001 kernel: page:ffffea00072eb040 count:0
> mapcount:-1 mapping:          (null) index:0x0
> Aug  8 09:38:04 dev-gc01-oss001 kernel: page flags: 0x2fffff00008000(tail)
> Aug  8 09:38:04 dev-gc01-oss001 kernel: page dumped because: nonzero
> mapcount
> Aug  8 09:38:04 dev-gc01-oss001 kernel: Modules linked in: 8021q garp mrp
> stp llc osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE)
> fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE)
> iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul
> glue_helper ablk_helper cryptd ppdev sg i2c_piix4 parport_pc i2c_core
> parport pcspkr nfsd nfs_acl lockd grace binfmt_misc auth_rpcgss sunrpc
> ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zavl(POE) icp(POE)
> zcommon(POE) znvpair(POE) spl(OE) zlib_deflate sd_mod crc_t10dif
> crct10dif_generic virtio_net virtio_scsi crct10dif_pclmul crct10dif_common
> crc32c_intel serio_raw virtio_pci virtio_ring virtio
> Aug  8 09:38:04 dev-gc01-oss001 kernel: CPU: 0 PID: 2558 Comm:
> socknal_sd00_01 Tainted: P    B      OE  ------------
> 3.10.0-514.26.2.el7.x86_64 #1
> Aug  8 09:38:04 dev-gc01-oss001 kernel: Hardware name: Google Google
> Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Aug  8 09:38:04 dev-gc01-oss001 kernel: ffffea00072eb040 00000000005e265e
> ffff8800b879f5a8 ffffffff81687133
> Aug  8 09:38:04 dev-gc01-oss001 kernel: ffff8800b879f5d0 ffffffff81682368
> ffffea00072eb040 0000000000000000
> Aug  8 09:38:04 dev-gc01-oss001 kernel: 000fffff00000000 ffff8800b879f618
> ffffffff8118946d fff00000fe000000
> Aug  8 09:38:04 dev-gc01-oss001 kernel: Call Trace:
> Aug  8 09:38:04 dev-gc01-oss001 kernel: [<ffffffff81687133>]
> dump_stack+0x19/0x1b
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff81682368>]
> bad_page.part.75+0xdf/0xfc
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8118946d>]
> free_pages_prepare+0x16d/0x190
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff811897b9>]
> __free_pages_ok+0x19/0xd0
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8118988b>]
> free_compound_page+0x1b/0x20
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff81683526>]
> __put_compound_page+0x1f/0x22
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff81683698>]
> put_compound_page+0x16f/0x17d
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8118edfc>]
> put_page+0x4c/0x60
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8155ec1f>]
> skb_release_data+0x8f/0x140
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8155ecf4>]
> skb_release_all+0x24/0x30
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8155f1ec>]
> consume_skb+0x2c/0x80
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8156f06d>]
> __dev_kfree_skb_any+0x3d/0x50
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffffa0019afb>]
> free_old_xmit_skbs.isra.32+0x6b/0xc0 [virtio_net]
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffffa0019baf>]
> start_xmit+0x5f/0x4f0 [virtio_net]
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8156f9a1>]
> dev_hard_start_xmit+0x171/0x3b0
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff81597574>]
> sch_direct_xmit+0x104/0x200
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8157252c>]
> __dev_queue_xmit+0x23c/0x570
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff81572870>]
> dev_queue_xmit+0x10/0x20
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815b6876>]
> ip_finish_output+0x466/0x750
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815b7873>]
> ip_output+0x73/0xe0
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815b5531>]
> ip_local_out_sk+0x31/0x40
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815b58a3>]
> ip_queue_xmit+0x143/0x3a0
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815cf04f>]
> tcp_transmit_skb+0x4af/0x990
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815cf68a>]
> tcp_write_xmit+0x15a/0xce0
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815d048e>]
> __tcp_push_pending_frames+0x2e/0xc0
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815bed2c>]
> tcp_push+0xec/0x120
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815c25b8>]
> tcp_sendmsg+0xc8/0xc40
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815ed854>]
> inet_sendmsg+0x64/0xb0
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff81555ff0>]
> sock_sendmsg+0xb0/0xf0
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8168ea3b>] ?
> _raw_spin_unlock_bh+0x1b/0x40
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff81556067>]
> kernel_sendmsg+0x37/0x50
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffffa09f40d9>]
> ksocknal_lib_send_iov+0xd9/0x140 [ksocklnd]
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffffa09ed32f>]
> ksocknal_process_transmit+0x2af/0xb90 [ksocklnd]
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffffa09f1b84>]
> ksocknal_scheduler+0x204/0x670 [ksocklnd]
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff810b1b20>] ?
> wake_up_atomic_t+0x30/0x30
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffffa09f1980>] ?
> ksocknal_recv+0x2a0/0x2a0 [ksocklnd]
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff810b0a4f>]
> kthread+0xcf/0xe0
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff810b0980>] ?
> kthread_create_on_node+0x140/0x140
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff81697758>]
> ret_from_fork+0x58/0x90
> Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff810b0980>] ?
> kthread_create_on_node+0x140/0x140
>
> -Chris
>
>
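For anyone trying to reproduce the access pattern described above, here is a
minimal sketch of random, block-aligned reads through a read-only mmap()
mapping. It is not the application or the iozone run from the report: the
function names, the block size, and the temporary test file are assumptions
made for illustration. In a real test the path would point at a file on the
Lustre client mount.

```python
import mmap
import os
import random
import tempfile

# Assumption: one page per read; the report does not state the I/O size used.
BLOCK = mmap.PAGESIZE


def make_test_file(n_blocks, block=BLOCK):
    """Create a temporary file of n_blocks * block random bytes (hypothetical helper)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(n_blocks * block))
    return path


def random_mmap_reads(path, n_reads, block=BLOCK, seed=0):
    """Read n_reads blocks at random block-aligned offsets via mmap; return bytes read."""
    rng = random.Random(seed)
    size = os.path.getsize(path)
    n_blocks = max(1, size // block)
    total = 0
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for _ in range(n_reads):
                off = rng.randrange(n_blocks) * block
                # Slicing the mapping copies the bytes out, which touches the pages.
                total += len(m[off:off + block])
    return total


if __name__ == "__main__":
    path = make_test_file(8)
    try:
        print(random_mmap_reads(path, 100), "bytes read")
    finally:
        os.remove(path)
```

Wrapping the call in a timer and pointing it at a large file on the mount
gives a rough random-read number to compare between the ZFS and ldiskfs
backends, though iozone remains the better tool for real measurements.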