[Lustre-discuss] Lustre Client - Memory Issue
Jagga Soorma
jagga13 at gmail.com
Mon Apr 19 14:04:44 PDT 2010
Also, here is the /proc/sys/lustre/memused output from each of my compute nodes:
--
580093506
138839275
26861811
1192253563
137585179
138928251
2214512411
1872972173
138846915
142122291
131465011
139968027
4626653800
188594515
23944887
2495517689
--
Can someone help me put this information together and make sense of it? I am
fairly sure this is related to Lustre, but I am not sure what might be eating
up the memory.
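The raw counters are easier to compare once converted to human units. A quick
sketch, on the understanding that memused on 1.8-era clients counts bytes
allocated by the Lustre modules:

```shell
# Convert the per-node memused counters above (bytes) to GiB.
# Across these 16 nodes the values range from ~23 MiB to ~4.3 GiB.
awk '{ total += $1; printf "node %2d: %7.2f GiB\n", NR, $1 / 2^30 }
     END { printf "total:   %7.2f GiB across %d nodes\n", total / 2^30, NR }' <<'EOF'
580093506
138839275
26861811
1192253563
137585179
138928251
2214512411
1872972173
138846915
142122291
131465011
139968027
4626653800
188594515
23944887
2495517689
EOF
```

At a few GiB per node at most, the tracked Lustre allocations are far smaller
than the ~95 GiB of unreclaimable slab reported below, which suggests the
missing memory is in slab objects Lustre no longer accounts for.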
Thanks,
-J
On Mon, Apr 19, 2010 at 1:28 PM, Jagga Soorma <jagga13 at gmail.com> wrote:
> I have tried:
>
> echo 1 > /proc/sys/vm/drop_caches
> and
> echo 3 > /proc/sys/vm/drop_caches
>
> However, the free memory does not change at all. Any ideas what might be
> going on?
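That behavior is consistent with unreclaimable slab: drop_caches can only
release clean page cache and reclaimable slab, never SUnreclaim. A minimal
sketch (root is needed to actually drop anything):

```shell
# drop_caches: 1 = pagecache, 2 = reclaimable slab (dentries/inodes), 3 = both.
# None of these touch unreclaimable slab (SUnreclaim in /proc/meminfo).
sync                                    # write back dirty pages first
sh -c 'echo 3 > /proc/sys/vm/drop_caches' 2>/dev/null \
    || echo 'drop_caches needs root'
# Compare the reclaimable vs. unreclaimable split afterwards:
grep -E 'SReclaimable|SUnreclaim' /proc/meminfo
```

If SUnreclaim stays huge after a full drop, the memory is pinned by live (or
leaked) kernel objects, not by cache.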
>
> -Simran
>
>
> On Mon, Apr 19, 2010 at 11:37 AM, Jagga Soorma <jagga13 at gmail.com> wrote:
>
>> Could it be locking? I do have the flock option enabled.
>>
>> --
>> lustre_inode_cache 123 192 896 4 1 : tunables 54 27
>> 8 : slabdata 48 48 0
>> lov_oinfo 128 228 320 12 1 : tunables 54 27 8
>> : slabdata 19 19 0
>> ldlm_locks 1550 3992 512 8 1 : tunables 54 27 8
>> : slabdata 499 499 0
>> ldlm_resources 1449 3600 384 10 1 : tunables 54 27 8
>> : slabdata 360 360 0
>> --
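The lock counts shown look too small to matter. A rough back-of-the-envelope
from the slab line, plus the lctl knobs worth checking (parameter paths as in
Lustre 1.8; please verify against your build):

```shell
# Estimate memory held by DLM locks: active_objs ($2) * objsize ($4).
# 1550 locks * 512 B is well under 1 MiB, so locks alone cannot explain
# gigabytes of unreclaimable slab. (/proc/slabinfo is root-readable.)
awk '$1 == "ldlm_locks" { printf "%.1f MiB in %d locks\n", $2 * $4 / 2^20, $2 }' \
    /proc/slabinfo

# If the DLM LRU were the culprit, it could be inspected and flushed with:
#   lctl get_param ldlm.namespaces.*.lru_size
#   lctl set_param ldlm.namespaces.*.lru_size=clear   # drop cached locks
```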
>>
>> Thanks,
>> -J
>>
>>
>> On Mon, Apr 19, 2010 at 11:26 AM, Jagga Soorma <jagga13 at gmail.com> wrote:
>>
>>> Here is something from April 12 that I see in the client logs. Not sure
>>> if this is related:
>>>
>>> --
>>> Apr 12 14:51:16 manak kernel: Lustre: 7359:0:(rw.c:2092:ll_readpage())
>>> ino 424411146 page 0 (0) not covered by a lock (mmap?). check debug logs.
>>> Apr 12 14:51:16 manak kernel: Lustre: 7359:0:(rw.c:2092:ll_readpage())
>>> ino 424411146 page 1480 (6062080) not covered by a lock (mmap?). check
>>> debug logs.
>>> Apr 12 14:51:16 manak kernel: Lustre: 7359:0:(rw.c:2092:ll_readpage())
>>> Skipped 1479 previous similar messages
>>> Apr 12 14:51:17 manak kernel: Lustre: 7359:0:(rw.c:2092:ll_readpage())
>>> ino 424411146 page 273025 (1118310400) not covered by a lock (mmap?). check
>>> debug logs.
>>> Apr 12 14:51:17 manak kernel: Lustre: 7359:0:(rw.c:2092:ll_readpage())
>>> Skipped 271544 previous similar messages
>>> --
>>>
>>> -J
>>>
>>>
>>> On Mon, Apr 19, 2010 at 11:02 AM, Jagga Soorma <jagga13 at gmail.com>wrote:
>>>
>>>> Andreas,
>>>>
>>>> I am seeing the problem again on one of my hosts, and here is a live
>>>> capture of the data. Can you assist with this?
>>>>
>>>> --
>>>> # free
>>>> total used free shared buffers
>>>> cached
>>>> Mem: 198091444 197636852 454592 0 4260
>>>> 34251452
>>>> -/+ buffers/cache: 163381140 34710304
>>>> Swap: 75505460 10281796 65223664
>>>>
>>>> # cat /proc/meminfo
>>>> MemTotal: 198091444 kB
>>>> MemFree: 458048 kB
>>>> Buffers: 4268 kB
>>>> Cached: 34099372 kB
>>>> SwapCached: 7730744 kB
>>>> Active: 62919152 kB
>>>> Inactive: 34107188 kB
>>>> SwapTotal: 75505460 kB
>>>> SwapFree: 65220676 kB
>>>> Dirty: 444 kB
>>>> Writeback: 0 kB
>>>> AnonPages: 58704728 kB
>>>> Mapped: 12036 kB
>>>> Slab: 99806476 kB
>>>> SReclaimable: 118532 kB
>>>> SUnreclaim: 99687944 kB
>>>> PageTables: 131200 kB
>>>>
>>>> NFS_Unstable: 0 kB
>>>> Bounce: 0 kB
>>>> WritebackTmp: 0 kB
>>>> CommitLimit: 174551180 kB
>>>> Committed_AS: 65739660 kB
>>>>
>>>> VmallocTotal: 34359738367 kB
>>>> VmallocUsed: 588416 kB
>>>> VmallocChunk: 34359149923 kB
>>>> HugePages_Total: 0
>>>> HugePages_Free: 0
>>>> HugePages_Rsvd: 0
>>>> HugePages_Surp: 0
>>>> Hugepagesize: 2048 kB
>>>> DirectMap4k: 8432 kB
>>>> DirectMap2M: 201308160 kB
>>>>
>>>> # cat /proc/slabinfo
>>>> slabinfo - version: 2.1
>>>> # name <active_objs> <num_objs> <objsize> <objperslab>
>>>> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata
>>>> <active_slabs> <num_slabs> <sharedavail>
>>>> nfs_direct_cache 0 0 128 30 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> nfs_write_data 36 44 704 11 2 : tunables 54 27
>>>> 8 : slabdata 4 4 0
>>>> nfs_read_data 32 33 704 11 2 : tunables 54 27
>>>> 8 : slabdata 3 3 0
>>>> nfs_inode_cache 0 0 984 4 1 : tunables 54 27
>>>> 8 : slabdata 0 0 0
>>>> nfs_page 0 0 128 30 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> rpc_buffers 8 8 2048 2 1 : tunables 24 12
>>>> 8 : slabdata 4 4 0
>>>> rpc_tasks 8 12 320 12 1 : tunables 54 27
>>>> 8 : slabdata 1 1 0
>>>> rpc_inode_cache 0 0 832 4 1 : tunables 54 27
>>>> 8 : slabdata 0 0 0
>>>> ll_async_page 8494811 8507076 320 12 1 : tunables 54
>>>> 27 8 : slabdata 708923 708923 216
>>>> ll_file_data 16 40 192 20 1 : tunables 120 60
>>>> 8 : slabdata 2 2 0
>>>> lustre_inode_cache 95 184 896 4 1 : tunables 54
>>>> 27 8 : slabdata 46 46 0
>>>> lov_oinfo 56 180 320 12 1 : tunables 54 27
>>>> 8 : slabdata 15 15 0
>>>>
>>>> osc_quota_info 0 0 32 112 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> ll_qunit_cache 0 0 112 34 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> llcd_cache 0 0 3952 1 1 : tunables 24 12
>>>> 8 : slabdata 0 0 0
>>>> ptlrpc_cbdatas 0 0 32 112 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> interval_node 1680 5730 128 30 1 : tunables 120 60
>>>> 8 : slabdata 191 191 0
>>>> ldlm_locks 2255 6232 512 8 1 : tunables 54 27
>>>> 8 : slabdata 779 779 0
>>>> ldlm_resources 2227 5570 384 10 1 : tunables 54 27
>>>> 8 : slabdata 557 557 0
>>>>
>>>> ll_import_cache 0 0 1248 3 1 : tunables 24 12
>>>> 8 : slabdata 0 0 0
>>>> ll_obdo_cache 0 459630919 208 19 1 : tunables 120
>>>> 60 8 : slabdata 0 24191101 0
>>>>
>>>> ll_obd_dev_cache 13 13 5672 1 2 : tunables 8 4
>>>> 0 : slabdata 13 13 0
>>>> obd_lvfs_ctxt_cache 0 0 96 40 1 : tunables 120
>>>> 60 8 : slabdata 0 0 0
>>>> SDP 0 0 1728 4 2 : tunables 24 12
>>>> 8 : slabdata 0 0 0
>>>> fib6_nodes 7 59 64 59 1 : tunables 120
>>>> 60 8 : slabdata 1 1 0
>>>> ip6_dst_cache 10 24 320 12 1 : tunables 54 27
>>>> 8 : slabdata 2 2 0
>>>>
>>>> ndisc_cache 3 30 256 15 1 : tunables 120 60
>>>> 8 : slabdata 2 2 0
>>>> RAWv6 35 36 960 4 1 : tunables 54 27
>>>> 8 : slabdata 9 9 0
>>>> UDPLITEv6 0 0 960 4 1 : tunables 54 27
>>>> 8 : slabdata 0 0 0
>>>> UDPv6 7 12 960 4 1 : tunables 54 27
>>>> 8 : slabdata 3 3 0
>>>> tw_sock_TCPv6 0 0 192 20 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> request_sock_TCPv6 0 0 192 20 1 : tunables 120
>>>> 60 8 : slabdata 0 0 0
>>>> TCPv6 3 4 1792 2 1 : tunables 24 12
>>>> 8 : slabdata 2 2 0
>>>> ib_mad 2051 2096 448 8 1 : tunables 54 27
>>>> 8 : slabdata 262 262 0
>>>>
>>>> fuse_request 0 0 608 6 1 : tunables 54 27
>>>> 8 : slabdata 0 0 0
>>>> fuse_inode 0 0 704 11 2 : tunables 54 27
>>>> 8 : slabdata 0 0 0
>>>> kcopyd_job 0 0 360 11 1 : tunables 54 27
>>>> 8 : slabdata 0 0 0
>>>> dm_uevent 0 0 2608 3 2 : tunables 24 12
>>>> 8 : slabdata 0 0 0
>>>> dm_clone_bio_info 0 0 16 202 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> dm_rq_target_io 0 0 408 9 1 : tunables 54 27
>>>> 8 : slabdata 0 0 0
>>>> dm_target_io 0 0 24 144 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> dm_io 0 0 32 112 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> uhci_urb_priv 1 67 56 67 1 : tunables 120 60
>>>> 8 : slabdata 1 1 0
>>>> ext3_inode_cache 2472 2610 768 5 1 : tunables 54 27
>>>> 8 : slabdata 522 522 0
>>>>
>>>> ext3_xattr 0 0 88 44 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> journal_handle 56 288 24 144 1 : tunables 120 60
>>>> 8 : slabdata 2 2 0
>>>> journal_head 216 240 96 40 1 : tunables 120 60
>>>> 8 : slabdata 6 6 0
>>>>
>>>> revoke_table 4 202 16 202 1 : tunables 120 60
>>>> 8 : slabdata 1 1 0
>>>> revoke_record 0 0 32 112 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> sgpool-128 2 2 4096 1 1 : tunables 24 12
>>>> 8 : slabdata 2 2 0
>>>> sgpool-64 2 2 2048 2 1 : tunables 24 12
>>>> 8 : slabdata 1 1 0
>>>> sgpool-32 2 4 1024 4 1 : tunables 54 27
>>>> 8 : slabdata 1 1 0
>>>> sgpool-16 2 8 512 8 1 : tunables 54 27
>>>> 8 : slabdata 1 1 0
>>>> sgpool-8 2 15 256 15 1 : tunables 120 60
>>>> 8 : slabdata 1 1 0
>>>> scsi_data_buffer 0 0 24 144 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> scsi_io_context 0 0 112 34 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> flow_cache 0 0 96 40 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> cfq_io_context 58 207 168 23 1 : tunables 120 60
>>>> 8 : slabdata 9 9 0
>>>> cfq_queue 56 308 136 28 1 : tunables 120 60
>>>> 8 : slabdata 11 11 0
>>>>
>>>> bsg_cmd 0 0 312 12 1 : tunables 54 27
>>>> 8 : slabdata 0 0 0
>>>> mqueue_inode_cache 1 4 896 4 1 : tunables 54
>>>> 27 8 : slabdata 1 1 0
>>>> isofs_inode_cache 0 0 608 6 1 : tunables 54 27
>>>> 8 : slabdata 0 0 0
>>>> minix_inode_cache 0 0 624 6 1 : tunables 54 27
>>>> 8 : slabdata 0 0 0
>>>> hugetlbfs_inode_cache 1 7 576 7 1 : tunables 54
>>>> 27 8 : slabdata 1 1 0
>>>> dnotify_cache 0 0 40 92 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> dquot 0 0 256 15 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> inotify_event_cache 0 0 40 92 1 : tunables 120
>>>> 60 8 : slabdata 0 0 0
>>>> inotify_watch_cache 94 159 72 53 1 : tunables 120
>>>> 60 8 : slabdata 3 3 0
>>>>
>>>> kioctx 0 0 384 10 1 : tunables 54 27
>>>> 8 : slabdata 0 0 0
>>>> kiocb 0 0 256 15 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> fasync_cache 0 0 24 144 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> shmem_inode_cache 878 1040 784 5 1 : tunables 54 27
>>>> 8 : slabdata 208 208 0
>>>>
>>>> pid_namespace 0 0 2112 3 2 : tunables 24 12
>>>> 8 : slabdata 0 0 0
>>>> nsproxy 0 0 56 67 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> posix_timers_cache 0 0 192 20 1 : tunables 120
>>>> 60 8 : slabdata 0 0 0
>>>> uid_cache 7 60 128 30 1 : tunables 120 60
>>>> 8 : slabdata 2 2 0
>>>> UNIX 128 220 704 11 2 : tunables 54 27
>>>> 8 : slabdata 20 20 0
>>>>
>>>> ip_mrt_cache 0 0 128 30 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> UDP-Lite 0 0 832 9 2 : tunables 54 27
>>>> 8 : slabdata 0 0 0
>>>> tcp_bind_bucket 15 118 64 59 1 : tunables 120 60
>>>> 8 : slabdata 2 2 0
>>>>
>>>> inet_peer_cache 1 59 64 59 1 : tunables 120 60
>>>> 8 : slabdata 1 1 0
>>>> secpath_cache 0 0 64 59 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> xfrm_dst_cache 0 0 384 10 1 : tunables 54 27
>>>> 8 : slabdata 0 0 0
>>>> ip_fib_alias 0 0 32 112 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> ip_fib_hash 15 106 72 53 1 : tunables 120 60
>>>> 8 : slabdata 2 2 0
>>>> ip_dst_cache 40 84 320 12 1 : tunables 54 27
>>>> 8 : slabdata 7 7 0
>>>>
>>>> arp_cache 8 15 256 15 1 : tunables 120 60
>>>> 8 : slabdata 1 1 0
>>>> RAW 33 35 768 5 1 : tunables 54 27
>>>> 8 : slabdata 7 7 0
>>>> UDP 11 36 832 9 2 : tunables 54 27
>>>> 8 : slabdata 4 4 0
>>>> tw_sock_TCP 4 20 192 20 1 : tunables 120 60
>>>> 8 : slabdata 1 1 0
>>>>
>>>> request_sock_TCP 0 0 128 30 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> TCP 16 24 1664 4 2 : tunables 24 12
>>>> 8 : slabdata 6 6 0
>>>> eventpoll_pwq 69 159 72 53 1 : tunables 120 60
>>>> 8 : slabdata 3 3 0
>>>> eventpoll_epi 69 150 128 30 1 : tunables 120 60
>>>> 8 : slabdata 5 5 0
>>>>
>>>> pfm_event_set 0 0 57344 1 16 : tunables 8 4
>>>> 0 : slabdata 0 0 0
>>>> pfm_context 0 0 8192 1 2 : tunables 8 4
>>>> 0 : slabdata 0 0 0
>>>> blkdev_integrity 0 0 112 34 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> blkdev_queue 10 12 2264 3 2 : tunables 24 12
>>>> 8 : slabdata 4 4 0
>>>> blkdev_requests 91 130 368 10 1 : tunables 54 27
>>>> 8 : slabdata 13 13 27
>>>> blkdev_ioc 56 371 72 53 1 : tunables 120 60
>>>> 8 : slabdata 7 7 0
>>>>
>>>> biovec-256 2 2 4096 1 1 : tunables 24 12
>>>> 8 : slabdata 2 2 0
>>>> biovec-128 2 4 2048 2 1 : tunables 24 12
>>>> 8 : slabdata 2 2 0
>>>> biovec-64 2 8 1024 4 1 : tunables 54 27
>>>> 8 : slabdata 2 2 0
>>>> biovec-16 2 30 256 15 1 : tunables 120 60
>>>> 8 : slabdata 2 2 0
>>>> biovec-4 2 118 64 59 1 : tunables 120 60
>>>> 8 : slabdata 2 2 0
>>>> biovec-1 223 606 16 202 1 : tunables 120 60
>>>> 8 : slabdata 3 3 70
>>>>
>>>> bio_integrity_payload 2 60 128 30 1 : tunables 120
>>>> 60 8 : slabdata 2 2 0
>>>> bio 205 330 128 30 1 : tunables 120
>>>> 60 8 : slabdata 11 11 70
>>>> sock_inode_cache 245 300 640 6 1 : tunables 54 27
>>>> 8 : slabdata 50 50 0
>>>> skbuff_fclone_cache 14 14 512 7 1 : tunables 54
>>>> 27 8 : slabdata 2 2 0
>>>> skbuff_head_cache 5121 5985 256 15 1 : tunables 120 60
>>>> 8 : slabdata 399 399 68
>>>> file_lock_cache 4 22 176 22 1 : tunables 120 60
>>>> 8 : slabdata 1 1 0
>>>> Acpi-Operand 889 1749 72 53 1 : tunables 120 60
>>>> 8 : slabdata 33 33 0
>>>>
>>>> Acpi-ParseExt 0 0 72 53 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> Acpi-Parse 0 0 48 77 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> Acpi-State 0 0 80 48 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> Acpi-Namespace 617 672 32 112 1 : tunables 120 60
>>>> 8 : slabdata 6 6 0
>>>> task_delay_info 389 884 112 34 1 : tunables 120 60
>>>> 8 : slabdata 26 26 0
>>>>
>>>> taskstats 0 0 328 12 1 : tunables 54 27
>>>> 8 : slabdata 0 0 0
>>>> page_cgroup 0 0 40 92 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> proc_inode_cache 1397 1446 608 6 1 : tunables 54 27
>>>> 8 : slabdata 240 241 190
>>>> sigqueue 29 96 160 24 1 : tunables 120 60
>>>> 8 : slabdata 4 4 1
>>>> radix_tree_node 193120 196672 552 7 1 : tunables 54 27
>>>> 8 : slabdata 28096 28096 216
>>>> bdev_cache 5 15 768 5 1 : tunables 54 27
>>>> 8 : slabdata 3 3 0
>>>>
>>>> sysfs_dir_cache 19120 19296 80 48 1 : tunables 120 60
>>>> 8 : slabdata 402 402 0
>>>> mnt_cache 30 105 256 15 1 : tunables 120 60
>>>> 8 : slabdata 7 7 0
>>>> inode_cache 1128 1176 560 7 1 : tunables 54 27
>>>> 8 : slabdata 166 168 24
>>>> dentry 4651 8189 208 19 1 : tunables 120 60
>>>> 8 : slabdata 431 431 0
>>>> filp 1563 2720 192 20 1 : tunables 120 60
>>>> 8 : slabdata 136 136 242
>>>> names_cache 142 142 4096 1 1 : tunables 24 12
>>>> 8 : slabdata 142 142 96
>>>>
>>>> key_jar 0 0 192 20 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> buffer_head 1129 3071 104 37 1 : tunables 120 60
>>>> 8 : slabdata 83 83 0
>>>> mm_struct 86 136 896 4 1 : tunables 54 27
>>>> 8 : slabdata 34 34 1
>>>> vm_area_struct 3406 4136 176 22 1 : tunables 120 60
>>>> 8 : slabdata 188 188 26
>>>> fs_cache 140 531 64 59 1 : tunables 120 60
>>>> 8 : slabdata 9 9 1
>>>> files_cache 83 150 768 5 1 : tunables 54 27
>>>> 8 : slabdata 30 30 1
>>>> signal_cache 325 388 960 4 1 : tunables 54 27
>>>> 8 : slabdata 97 97 0
>>>> sighand_cache 317 369 2112 3 2 : tunables 24 12
>>>> 8 : slabdata 123 123 0
>>>> task_xstate 155 256 512 8 1 : tunables 54 27
>>>> 8 : slabdata 32 32 2
>>>> task_struct 368 372 5872 1 2 : tunables 8 4
>>>> 0 : slabdata 368 372 0
>>>> anon_vma 966 1728 24 144 1 : tunables 120 60
>>>> 8 : slabdata 12 12 0
>>>> pid 377 960 128 30 1 : tunables 120 60
>>>> 8 : slabdata 32 32 0
>>>>
>>>> shared_policy_node 0 0 48 77 1 : tunables 120
>>>> 60 8 : slabdata 0 0 0
>>>> numa_policy 15 112 136 28 1 : tunables 120 60
>>>> 8 : slabdata 4 4 0
>>>> idr_layer_cache 284 322 544 7 1 : tunables 54 27
>>>> 8 : slabdata 46 46 0
>>>>
>>>> size-4194304(DMA) 0 0 4194304 1 1024 : tunables 1
>>>> 1 0 : slabdata 0 0 0
>>>> size-4194304 0 0 4194304 1 1024 : tunables 1
>>>> 1 0 : slabdata 0 0 0
>>>> size-2097152(DMA) 0 0 2097152 1 512 : tunables 1
>>>> 1 0 : slabdata 0 0 0
>>>> size-2097152 0 0 2097152 1 512 : tunables 1
>>>> 1 0 : slabdata 0 0 0
>>>> size-1048576(DMA) 0 0 1048576 1 256 : tunables 1
>>>> 1 0 : slabdata 0 0 0
>>>> size-1048576 0 0 1048576 1 256 : tunables 1
>>>> 1 0 : slabdata 0 0 0
>>>> size-524288(DMA) 0 0 524288 1 128 : tunables 1 1
>>>> 0 : slabdata 0 0 0
>>>> size-524288 0 0 524288 1 128 : tunables 1 1
>>>> 0 : slabdata 0 0 0
>>>> size-262144(DMA) 0 0 262144 1 64 : tunables 1 1
>>>> 0 : slabdata 0 0 0
>>>> size-262144 0 0 262144 1 64 : tunables 1 1
>>>> 0 : slabdata 0 0 0
>>>> size-131072(DMA) 0 0 131072 1 32 : tunables 8 4
>>>> 0 : slabdata 0 0 0
>>>> size-131072 3 3 131072 1 32 : tunables 8 4
>>>> 0 : slabdata 3 3 0
>>>> size-65536(DMA) 0 0 65536 1 16 : tunables 8 4
>>>> 0 : slabdata 0 0 0
>>>> size-65536 6 6 65536 1 16 : tunables 8 4
>>>> 0 : slabdata 6 6 0
>>>> size-32768(DMA) 0 0 32768 1 8 : tunables 8 4
>>>> 0 : slabdata 0 0 0
>>>> size-32768 10 10 32768 1 8 : tunables 8 4
>>>> 0 : slabdata 10 10 0
>>>>
>>>> size-16384(DMA) 0 0 16384 1 4 : tunables 8 4
>>>> 0 : slabdata 0 0 0
>>>> size-16384 44 44 16384 1 4 : tunables 8 4
>>>> 0 : slabdata 44 44 0
>>>>
>>>> size-8192(DMA) 0 0 8192 1 2 : tunables 8 4
>>>> 0 : slabdata 0 0 0
>>>> size-8192 3611 3611 8192 1 2 : tunables 8 4
>>>> 0 : slabdata 3611 3611 0
>>>>
>>>> size-4096(DMA) 0 0 4096 1 1 : tunables 24 12
>>>> 8 : slabdata 0 0 0
>>>> size-4096 1771 1771 4096 1 1 : tunables 24 12
>>>> 8 : slabdata 1771 1771 0
>>>>
>>>> size-2048(DMA) 0 0 2048 2 1 : tunables 24 12
>>>> 8 : slabdata 0 0 0
>>>> size-2048 4609 4714 2048 2 1 : tunables 24 12
>>>> 8 : slabdata 2357 2357 0
>>>>
>>>> size-1024(DMA) 0 0 1024 4 1 : tunables 54 27
>>>> 8 : slabdata 0 0 0
>>>> size-1024 4829 4900 1024 4 1 : tunables 54 27
>>>> 8 : slabdata 1225 1225 0
>>>>
>>>> size-512(DMA) 0 0 512 8 1 : tunables 54 27
>>>> 8 : slabdata 0 0 0
>>>> size-512 1478 1520 512 8 1 : tunables 54 27
>>>> 8 : slabdata 190 190 39
>>>>
>>>> size-256(DMA) 0 0 256 15 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> size-256 4662 5550 256 15 1 : tunables 120 60
>>>> 8 : slabdata 370 370 1
>>>>
>>>> size-128(DMA) 0 0 128 30 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> size-64(DMA) 0 0 64 59 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> size-64 17232 29382 64 59 1 : tunables 120 60
>>>> 8 : slabdata 498 498 0
>>>>
>>>> size-32(DMA) 0 0 32 112 1 : tunables 120 60
>>>> 8 : slabdata 0 0 0
>>>> size-128 9907 16140 128 30 1 : tunables 120 60
>>>> 8 : slabdata 538 538 0
>>>> size-32 12487 13104 32 112 1 : tunables 120 60
>>>> 8 : slabdata 117 117 0
>>>>
>>>> kmem_cache 181 181 4224 1 2 : tunables 8 4
>>>> 0 : slabdata 181 181 0
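One slab line dominates this dump: ll_obdo_cache reports 0 active objects but
~24.2 million slabs allocated. A sketch for ranking caches by footprint
(num_slabs * pagesperslab, assuming 4 KiB pages):

```shell
# Rank slab caches by total memory: num_slabs (second-to-last field) *
# pagesperslab ($6) * 4 KiB. Against the dump above, ll_obdo_cache works
# out to ~92 GiB with zero active objects -- roughly the SUnreclaim figure,
# which looks like leaked obdo structures rather than cached data.
awk 'NR > 2 { printf "%10.1f MiB  %s\n", $(NF-1) * $6 * 4096 / 2^20, $1 }' \
    /proc/slabinfo | sort -rn | head -5
```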
>>>>
>>>>
>>>> Tasks: 278 total, 1 running, 276 sleeping, 0 stopped, 1 zombie
>>>> Cpu(s): 3.8%us, 0.1%sy, 0.0%ni, 96.0%id, 0.0%wa, 0.0%hi, 0.0%si,
>>>> 0.0%st
>>>> Mem: 198091444k total, 197636988k used, 454456k free, 4544k
>>>> buffers
>>>> Swap: 75505460k total, 8567448k used, 66938012k free, 29144008k cached
>>>>
>>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
>>>> COMMAND
>>>>
>>>> 107 root 15 -5 0 0 0 D 10 0.0 5:06.43
>>>> kswapd1
>>>>
>>>> 19328 user1 20 0 66.5g 60g 2268 D 4 32.0 31:48.49
>>>> R
>>>>
>>>> 1 root 20 0 1064 64 32 S 0 0.0 0:21.20
>>>> init
>>>>
>>>> 2 root 15 -5 0 0 0 S 0 0.0 0:00.06
>>>> kthreadd
>>>>
>>>> 3 root RT -5 0 0 0 S 0 0.0 0:00.24
>>>> migration/0
>>>>
>>>> 4 root 15 -5 0 0 0 S 0 0.0 1:01.12
>>>> ksoftirqd/0
>>>>
>>>> 5 root RT -5 0 0 0 S 0 0.0 0:00.30
>>>> migration/1
>>>>
>>>> 6 root 15 -5 0 0 0 S 0 0.0 0:00.50
>>>> ksoftirqd/1
>>>>
>>>> 7 root RT -5 0 0 0 S 0 0.0 0:00.22
>>>> migration/2
>>>>
>>>> 8 root 15 -5 0 0 0 S 0 0.0 0:00.36
>>>> ksoftirqd/2
>>>>
>>>> 9 root RT -5 0 0 0 S 0 0.0 0:00.28
>>>> migration/3
>>>>
>>>> 10 root 15 -5 0 0 0 S 0 0.0 0:00.60
>>>> ksoftirqd/3
>>>>
>>>> 11 root RT -5 0 0 0 S 0 0.0 0:00.18
>>>> migration/4
>>>>
>>>> 12 root 15 -5 0 0 0 S 0 0.0 0:00.40
>>>> ksoftirqd/4
>>>>
>>>> 13 root RT -5 0 0 0 S 0 0.0 0:00.26
>>>> migration/5
>>>>
>>>> 14 root 15 -5 0 0 0 S 0 0.0 0:00.76
>>>> ksoftirqd/5
>>>>
>>>> 15 root RT -5 0 0 0 S 0 0.0 0:00.20
>>>> migration/6
>>>>
>>>> 16 root 15 -5 0 0 0 S 0 0.0 0:00.36
>>>> ksoftirqd/6
>>>>
>>>> 17 root RT -5 0 0 0 S 0 0.0 0:00.26
>>>> migration/7
>>>>
>>>> 18 root 15 -5 0 0 0 S 0 0.0 0:00.68
>>>> ksoftirqd/7
>>>>
>>>> 19 root RT -5 0 0 0 S 0 0.0 0:00.88
>>>> migration/8
>>>>
>>>> 20 root 15 -5 0 0 0 S 0 0.0 0:07.70
>>>> ksoftirqd/8
>>>>
>>>> 21 root RT -5 0 0 0 S 0 0.0 0:01.12
>>>> migration/9
>>>>
>>>> 22 root 15 -5 0 0 0 S 0 0.0 0:01.20
>>>> ksoftirqd/9
>>>>
>>>> 23 root RT -5 0 0 0 S 0 0.0 0:03.50
>>>> migration/10
>>>>
>>>> 24 root 15 -5 0 0 0 S 0 0.0 0:01.22
>>>> ksoftirqd/10
>>>>
>>>> 25 root RT -5 0 0 0 S 0 0.0 0:04.84
>>>> migration/11
>>>>
>>>> 26 root 15 -5 0 0 0 S 0 0.0 0:01.90
>>>> ksoftirqd/11
>>>>
>>>> 27 root RT -5 0 0 0 S 0 0.0 0:01.46
>>>> migration/12
>>>>
>>>> 28 root 15 -5 0 0 0 S 0 0.0 0:01.42
>>>> ksoftirqd/12
>>>>
>>>> 29 root RT -5 0 0 0 S 0 0.0 0:01.62
>>>> migration/13
>>>>
>>>> 30 root 15 -5 0 0 0 S 0 0.0 0:01.84
>>>> ksoftirqd/13
>>>>
>>>> 31 root RT -5 0 0 0 S 0 0.0 0:01.90
>>>> migration/14
>>>>
>>>> 32 root 15 -5 0 0 0 S 0 0.0 0:01.18
>>>> ksoftirqd/14
>>>> --
>>>>
>>>> Thanks,
>>>> -J
>>>>
>>>> On Mon, Apr 19, 2010 at 10:07 AM, Andreas Dilger <
>>>> andreas.dilger at oracle.com> wrote:
>>>>
>>>>> There is a known problem with the DLM LRU size that may be affecting
>>>>> you. It may be something else too. Please check /proc/{slabinfo,meminfo} to
>>>>> see what is using the memory on the client.
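The suggested check boils down to comparing a handful of meminfo fields; a
small sketch that prints the interesting ones in GiB:

```shell
# Summarize where client memory sits (meminfo values are in kB).
awk '/^(MemTotal|MemFree|Cached|AnonPages|Slab|SUnreclaim):/ \
     { printf "%-12s %7.1f GiB\n", $1, $2 / 2^20 }' /proc/meminfo
```

On the capture above this yields roughly 188.9 GiB total, 95.1 GiB of it
unreclaimable slab, which is why dropping caches frees nothing.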
>>>>>
>>>>> Cheers, Andreas
>>>>>
>>>>>
>>>>> On 2010-04-19, at 10:43, Jagga Soorma <jagga13 at gmail.com> wrote:
>>>>>
>>>>> Hi Guys,
>>>>>>
>>>>>> My users are reporting some issues with memory on our Lustre 1.8.1
>>>>>> clients. When they submit a single job at a time, the run time is about
>>>>>> 4.5 minutes. However, when they run multiple jobs (10 or fewer) on a
>>>>>> single client node with 192GB of memory, the run time for each job
>>>>>> exceeds 3-4X that of the single job. They also noticed that swap usage
>>>>>> kept climbing even though there was plenty of free memory on the
>>>>>> system. Could this be related to the Lustre client? Does it reserve
>>>>>> memory that other processes cannot access even when it is not in use?
>>>>>>
>>>>>> Thanks much,
>>>>>> -J
>>>>>> _______________________________________________
>>>>>> Lustre-discuss mailing list
>>>>>> Lustre-discuss at lists.lustre.org
>>>>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>>>>>
>>>>>
>>>>
>>>
>>
>