[Lustre-discuss] o2iblnd no resources
Liang Zhen
Zhen.Liang at Sun.COM
Fri Feb 1 23:39:09 PST 2008
Hi Kilian,
I think it's because o2iblnd uses fragmented RDMA by default (up to 256
fragments), so we have to set max_send_wr to (concurrent_sends * (256 + 1))
when creating the QP with rdma_create_qp(). This consumes a lot of
resources and can sometimes drive a busy server out of memory.
To resolve this problem, we have to use FMR to map the fragmented buffers
to a virtually contiguous I/O address; that way there is always only one
fragment per RDMA.
Here is a patch for this problem (using FMR in o2iblnd):
https://bugzilla.lustre.org/attachment.cgi?id=15144
Regards
Liang
Kilian CAVALOTTI wrote:
> Hi all,
>
> What can cause a client to receive a "o2iblnd no resources" message
> from an OSS?
> ---------------------------------------------------------------------------
> Feb 1 15:24:24 node-5-8 kernel: LustreError: 1893:0:(o2iblnd_cb.c:2448:kiblnd_rejected()) 10.10.60.3 at o2ib rejected: o2iblnd no resources
> ---------------------------------------------------------------------------
>
> I suspect an out-of-memory problem, and indeed the OSS logs are filled
> up with the following:
> ---------------------------------------------------------------------------
> ib_cm/3: page allocation failure. order:4, mode:0xd0
>
> Call Trace:<ffffffff8015c847>{__alloc_pages+777} <ffffffff801727e9>{alloc_page_interleave+61}
> <ffffffff8015c8e0>{__get_free_pages+11} <ffffffff8015facd>{kmem_getpages+36}
> <ffffffff80160262>{cache_alloc_refill+609} <ffffffff8015ff30>{__kmalloc+123}
> <ffffffffa014ee75>{:ib_mthca:mthca_alloc_qp_common+668}
> <ffffffffa014f42d>{:ib_mthca:mthca_alloc_qp+178} <ffffffffa0153e3a>{:ib_mthca:mthca_create_qp+311}
> <ffffffffa00d5b1b>{:ib_core:ib_create_qp+20} <ffffffffa021a5f9>{:rdma_cm:rdma_create_qp+43}
> <ffffffff8024b7b5>{dma_pool_free+245} <ffffffffa014b257>{:ib_mthca:mthca_init_cq+1073}
> <ffffffffa01540cf>{:ib_mthca:mthca_create_cq+282} <ffffffff801727e9>{alloc_page_interleave+61}
> <ffffffffa0400c10>{:ko2iblnd:kiblnd_cq_completion+0}
> <ffffffffa0400d50>{:ko2iblnd:kiblnd_cq_event+0} <ffffffffa00d5cc1>{:ib_core:ib_create_cq+33}
> <ffffffffa03f56bd>{:ko2iblnd:kiblnd_create_conn+3565}
> <ffffffffa0276f38>{:libcfs:cfs_alloc+40} <ffffffffa03fe457>{:ko2iblnd:kiblnd_passive_connect+2215}
> <ffffffffa00d8595>{:ib_core:ib_find_cached_gid+244}
> <ffffffffa021a278>{:rdma_cm:cma_acquire_dev+293} <ffffffffa03ff540>{:ko2iblnd:kiblnd_cm_callback+64}
> <ffffffffa03ff500>{:ko2iblnd:kiblnd_cm_callback+0}
> <ffffffffa021b19a>{:rdma_cm:cma_req_handler+863} <ffffffff801e8427>{alloc_layer+67}
> <ffffffff801e8645>{idr_get_new_above_int+423} <ffffffffa00fa0ab>{:ib_cm:cm_process_work+101}
> <ffffffffa00faa57>{:ib_cm:cm_req_handler+2398} <ffffffffa00fae3c>{:ib_cm:cm_work_handler+0}
> <ffffffffa00fae6a>{:ib_cm:cm_work_handler+46} <ffffffff80146fca>{worker_thread+419}
> <ffffffff80133566>{default_wake_function+0} <ffffffff801335b7>{__wake_up_common+67}
> <ffffffff80133566>{default_wake_function+0} <ffffffff8014ad18>{keventd_create_kthread+0}
> <ffffffff80146e27>{worker_thread+0} <ffffffff8014ad18>{keventd_create_kthread+0}
> <ffffffff8014acef>{kthread+200} <ffffffff80110de3>{child_rip+8}
> <ffffffff8014ad18>{keventd_create_kthread+0} <ffffffff8014ac27>{kthread+0}
> <ffffffff80110ddb>{child_rip+0}
> Mem-info:
> Node 0 DMA per-cpu:
> cpu 0 hot: low 2, high 6, batch 1
> cpu 0 cold: low 0, high 2, batch 1
> cpu 1 hot: low 2, high 6, batch 1
> cpu 1 cold: low 0, high 2, batch 1
> cpu 2 hot: low 2, high 6, batch 1
> cpu 2 cold: low 0, high 2, batch 1
> cpu 3 hot: low 2, high 6, batch 1
> cpu 3 cold: low 0, high 2, batch 1
> Node 0 Normal per-cpu:
> cpu 0 hot: low 32, high 96, batch 16
> cpu 0 cold: low 0, high 32, batch 16
> cpu 1 hot: low 32, high 96, batch 16
> cpu 1 cold: low 0, high 32, batch 16
> cpu 2 hot: low 32, high 96, batch 16
> cpu 2 cold: low 0, high 32, batch 16
> cpu 3 hot: low 32, high 96, batch 16
> cpu 3 cold: low 0, high 32, batch 16
> Node 0 HighMem per-cpu: empty
>
> Free pages: 35336kB (0kB HighMem)
> Active:534156 inactive:127091 dirty:1072 writeback:0 unstable:0 free:8834 slab:146612 mapped:26222 pagetables:1035
> Node 0 DMA free:9832kB min:52kB low:64kB high:76kB active:0kB inactive:0kB present:16384kB pages_scanned:37 all_unreclaimable? yes
> protections[]: 0 510200 510200
> Node 0 Normal free:25504kB min:16328kB low:20408kB high:24492kB active:2136624kB inactive:508364kB present:4964352kB pages_scanned:0 all_unreclaimable? no
> protections[]: 0 0 0
> Node 0 HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
> protections[]: 0 0 0
> Node 0 DMA: 2*4kB 2*8kB 1*16kB 0*32kB 1*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 0*2048kB 2*4096kB = 9832kB
> Node 0 Normal: 1284*4kB 2290*8kB 126*16kB 1*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 25504kB
> Node 0 HighMem: empty
> Swap cache: add 111, delete 111, find 23/36, race 0+0
> Free swap: 4096360kB
> 1245184 pages of RAM
> 235840 reserved pages
> 659867 pages shared
> 0 pages swap cached
> ---------------------------------------------------------------------------
>
> IB links are up and working on both the client and the OSS:
> ---------------------------------------------------------------------------
> client# ibstatus
> Infiniband device 'mthca0' port 1 status:
> default gid: fe80:0000:0000:0000:0005:ad00:0008:af71
> base lid: 0x83
> sm lid: 0x130
> state: 4: ACTIVE
> phys state: 5: LinkUp
> rate: 20 Gb/sec (4X DDR)
> oss# ibstatus
> Infiniband device 'mthca0' port 1 status:
> default gid: fe80:0000:0000:0000:0005:ad00:0008:cb11
> base lid: 0x126
> sm lid: 0x130
> state: 4: ACTIVE
> phys state: 5: LinkUp
> rate: 20 Gb/sec (4X DDR)
> ---------------------------------------------------------------------------
> And the Subnet Manager doesn't expose any unusual error or skyrocketed
> counter (I use OFED 1.2, kernel 2.6.9-55.0.9.EL_lustre.1.6.4.1smp).
>
> What I don't really get is that most clients can access files on this
> OSS with no issue, and besides, my limited understanding of the kernel
> memory mechanisms tends to lead me to believe that this OSS is not out
> of memory:
> ---------------------------------------------------------------------------
> # cat /proc/meminfo
> MemTotal: 4037380 kB
> MemFree: 31688 kB
> Buffers: 1333536 kB
> Cached: 1231900 kB
> SwapCached: 0 kB
> Active: 2138948 kB
> Inactive: 507720 kB
> HighTotal: 0 kB
> HighFree: 0 kB
> LowTotal: 4037380 kB
> LowFree: 31688 kB
> SwapTotal: 4096564 kB
> SwapFree: 4096360 kB
> Dirty: 6868 kB
> Writeback: 0 kB
> Mapped: 106984 kB
> Slab: 588200 kB
> CommitLimit: 6115252 kB
> Committed_AS: 860508 kB
> PageTables: 4304 kB
> VmallocTotal: 536870911 kB
> VmallocUsed: 274788 kB
> VmallocChunk: 536596091 kB
> HugePages_Total: 0
> HugePages_Free: 0
> Hugepagesize: 2048 kB
> ---------------------------------------------------------------------------
>
> This only appeared lately, after several weeks of continuous use of the
> filesystem without any problem. Is there anything like a memory leak
> somewhere? Any help diagnosing the problem would be greatly appreciated.
>
> Thanks!
>