[lustre-discuss] Lustre 2.10.0 mmap() Issues

Christopher Johnston chjohnst at gmail.com
Tue Aug 8 08:58:07 PDT 2017


At my company we use mmap() exclusively for accessing our data on Lustre.
For starters, we are seeing oddly poor (or maybe expected) random
read/write performance with this access pattern.  I decided to give
Lustre 2.10.0 a try with ZFS 0.7.0 as the backend instead of ldiskfs, and
after compiling and building the RPMs, the filesystem mounted just fine.
I then started doing some iozone runs to test the stability of the
filesystem, and although iozone does complete its benchmark, I am seeing a
lot of stack traces coming out of various kernel threads.  Note that I am
only seeing this when using mmap().  We also ran our own application to
verify.  I am also going to try an ldiskfs-formatted backend to see if
this changes anything.
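
For anyone who wants to exercise the same kind of access pattern without
iozone or our application, here is a minimal sketch of random block-sized
reads and writes through a shared mmap() mapping.  The function name, file
name, block size, and operation count are all assumptions made for
illustration; this is not the workload that produced the traces below.

```python
import mmap
import os
import random
import tempfile


def mmap_random_rw(path, file_size=1 << 20, block=4096, ops=256, seed=0):
    """Do `ops` random block-sized reads/writes through a shared mapping.

    Returns (bytes_read, bytes_written). `file_size` should be a multiple
    of `block`. All defaults are illustrative, not tuned values.
    """
    rng = random.Random(seed)

    # Pre-size the file so the whole range can be mapped.
    with open(path, "wb") as f:
        f.truncate(file_size)

    bytes_read = 0
    bytes_written = 0
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), file_size, access=mmap.ACCESS_WRITE) as m:
            n_blocks = file_size // block
            for _ in range(ops):
                off = rng.randrange(n_blocks) * block
                if rng.random() < 0.5:
                    # Read path: slicing copies the block out of the mapping.
                    data = m[off:off + block]
                    bytes_read += len(data)
                else:
                    # Write path: dirty one block of the mapping.
                    m[off:off + block] = b"\xab" * block
                    bytes_written += block
            # Push dirty pages back to the file before unmapping.
            m.flush()
    return bytes_read, bytes_written


if __name__ == "__main__":
    # A temporary directory is used here; to test Lustre, point `path`
    # at a file on the client mount instead (hypothetical location).
    with tempfile.TemporaryDirectory() as d:
        print(mmap_random_rw(os.path.join(d, "mmap_test.dat")))
```

Pointing the path at a file on the Lustre client mount should produce
mmap()-driven random I/O against the OSS, though at a much smaller scale
than a real benchmark run.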

My ZFS settings are modest, with 50% of memory allocated to the ARC:

options zfs zfs_arc_max=3921674240 zfs_prefetch_disable=1
recordsize=1M
compression=on
dedup=off
xattr=sa
dnodesize=auto
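
As a quick check of the ARC figure above: 3921674240 bytes is exactly
3740 MiB, so a 50% allocation implies about 7480 MiB (roughly 7.3 GiB) of
usable memory on the OSS.  The sketch below redoes that arithmetic and
formats the module option line; the helper names are made up for this
example and are not part of any ZFS tooling.

```python
MIB = 1 << 20


def arc_max_bytes(total_mem_bytes, fraction=0.5):
    """Return the ARC cap in bytes for a given fraction of total memory."""
    return int(total_mem_bytes * fraction)


def modprobe_line(arc_max, prefetch_disable=1):
    """Format the zfs module options in the form used in the post."""
    return (
        f"options zfs zfs_arc_max={arc_max} "
        f"zfs_prefetch_disable={prefetch_disable}"
    )


if __name__ == "__main__":
    arc_max = 3921674240           # value from the post
    implied_total = arc_max * 2    # assumes the stated 50% allocation
    print(arc_max // MIB, "MiB ARC cap")
    print(implied_total // MIB, "MiB implied total memory")
    print(modprobe_line(arc_max_bytes(implied_total)))
```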


Below is the output from the stack trace:

Aug  8 09:38:04 dev-gc01-oss001 kernel: BUG: Bad page state: 87 messages
suppressed
Aug  8 09:38:04 dev-gc01-oss001 kernel: BUG: Bad page state in process
socknal_sd00_01  pfn:1cbac1
Aug  8 09:38:04 dev-gc01-oss001 kernel: page:ffffea00072eb040 count:0
mapcount:-1 mapping:          (null) index:0x0
Aug  8 09:38:04 dev-gc01-oss001 kernel: page flags: 0x2fffff00008000(tail)
Aug  8 09:38:04 dev-gc01-oss001 kernel: page dumped because: nonzero
mapcount
Aug  8 09:38:04 dev-gc01-oss001 kernel: Modules linked in: 8021q garp mrp
stp llc osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE)
fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE)
iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul
glue_helper ablk_helper cryptd ppdev sg i2c_piix4 parport_pc i2c_core
parport pcspkr nfsd nfs_acl lockd grace binfmt_misc auth_rpcgss sunrpc
ip_tables xfs libcrc32c zfs(POE) zunicode(POE) zavl(POE) icp(POE)
zcommon(POE) znvpair(POE) spl(OE) zlib_deflate sd_mod crc_t10dif
crct10dif_generic virtio_net virtio_scsi crct10dif_pclmul crct10dif_common
crc32c_intel serio_raw virtio_pci virtio_ring virtio
Aug  8 09:38:04 dev-gc01-oss001 kernel: CPU: 0 PID: 2558 Comm:
socknal_sd00_01 Tainted: P    B      OE  ------------
3.10.0-514.26.2.el7.x86_64 #1
Aug  8 09:38:04 dev-gc01-oss001 kernel: Hardware name: Google Google
Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Aug  8 09:38:04 dev-gc01-oss001 kernel: ffffea00072eb040 00000000005e265e
ffff8800b879f5a8 ffffffff81687133
Aug  8 09:38:04 dev-gc01-oss001 kernel: ffff8800b879f5d0 ffffffff81682368
ffffea00072eb040 0000000000000000
Aug  8 09:38:04 dev-gc01-oss001 kernel: 000fffff00000000 ffff8800b879f618
ffffffff8118946d fff00000fe000000
Aug  8 09:38:04 dev-gc01-oss001 kernel: Call Trace:
Aug  8 09:38:04 dev-gc01-oss001 kernel: [<ffffffff81687133>]
dump_stack+0x19/0x1b
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff81682368>]
bad_page.part.75+0xdf/0xfc
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8118946d>]
free_pages_prepare+0x16d/0x190
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff811897b9>]
__free_pages_ok+0x19/0xd0
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8118988b>]
free_compound_page+0x1b/0x20
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff81683526>]
__put_compound_page+0x1f/0x22
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff81683698>]
put_compound_page+0x16f/0x17d
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8118edfc>]
put_page+0x4c/0x60
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8155ec1f>]
skb_release_data+0x8f/0x140
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8155ecf4>]
skb_release_all+0x24/0x30
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8155f1ec>]
consume_skb+0x2c/0x80
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8156f06d>]
__dev_kfree_skb_any+0x3d/0x50
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffffa0019afb>]
free_old_xmit_skbs.isra.32+0x6b/0xc0 [virtio_net]
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffffa0019baf>]
start_xmit+0x5f/0x4f0 [virtio_net]
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8156f9a1>]
dev_hard_start_xmit+0x171/0x3b0
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff81597574>]
sch_direct_xmit+0x104/0x200
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8157252c>]
__dev_queue_xmit+0x23c/0x570
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff81572870>]
dev_queue_xmit+0x10/0x20
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815b6876>]
ip_finish_output+0x466/0x750
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815b7873>]
ip_output+0x73/0xe0
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815b5531>]
ip_local_out_sk+0x31/0x40
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815b58a3>]
ip_queue_xmit+0x143/0x3a0
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815cf04f>]
tcp_transmit_skb+0x4af/0x990
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815cf68a>]
tcp_write_xmit+0x15a/0xce0
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815d048e>]
__tcp_push_pending_frames+0x2e/0xc0
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815bed2c>]
tcp_push+0xec/0x120
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815c25b8>]
tcp_sendmsg+0xc8/0xc40
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff815ed854>]
inet_sendmsg+0x64/0xb0
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff81555ff0>]
sock_sendmsg+0xb0/0xf0
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff8168ea3b>] ?
_raw_spin_unlock_bh+0x1b/0x40
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff81556067>]
kernel_sendmsg+0x37/0x50
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffffa09f40d9>]
ksocknal_lib_send_iov+0xd9/0x140 [ksocklnd]
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffffa09ed32f>]
ksocknal_process_transmit+0x2af/0xb90 [ksocklnd]
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffffa09f1b84>]
ksocknal_scheduler+0x204/0x670 [ksocklnd]
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff810b1b20>] ?
wake_up_atomic_t+0x30/0x30
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffffa09f1980>] ?
ksocknal_recv+0x2a0/0x2a0 [ksocklnd]
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff810b0a4f>]
kthread+0xcf/0xe0
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff810b0980>] ?
kthread_create_on_node+0x140/0x140
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff81697758>]
ret_from_fork+0x58/0x90
Aug  8 09:38:05 dev-gc01-oss001 kernel: [<ffffffff810b0980>] ?
kthread_create_on_node+0x140/0x140

-Chris