[Lustre-discuss] lustre ram usage (contd)
Balagopal Pillai
pillai at mathstat.dal.ca
Tue Dec 25 04:35:47 PST 2007
On Mon, 24 Dec 2007, Andreas Dilger wrote:
Hi Andreas,
Here is the memory usage after doubling the RAM yesterday
on both OSSes. The rsync completed successfully last night, but almost
5.4G of RAM is used up!
             total       used       free     shared    buffers     cached
Mem:       8166408    8094468      71940          0    2597688      48124
-/+ buffers/cache:    5448656    2717752
Swap:      4096564        224    4096340
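(As a sanity check, the "-/+ buffers/cache" line is just the raw figures
with the reclaimable buffers and page cache shifted across; the numbers
from the free output above work out exactly:)

```shell
# free(1) counts reclaimable buffers and page cache as "used" on the
# Mem: line; the "-/+ buffers/cache" line moves them to the free side.
used=8094468 free=71940 buffers=2597688 cached=48124   # KiB, from above
echo $(( used - buffers - cached ))   # 5448656 = memory really in use
echo $(( free + buffers + cached ))   # 2717752 = effectively free
```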
Here is the vmstat -m. This time, ldiskfs_inode_cache is the
biggest occupant; the ext3 inode cache is smaller and the dentry cache is
quite big. If the problem turns out to be ext3-related, I can get around it
by exporting the backup volume via iSCSI from the OSS and mounting it on the
Lustre client with an iSCSI initiator. The nodes have 16GB and that should
be enough for all the caches. But ldiskfs_inode_cache is also becoming quite
big. The only difference between last time and this time is that I have
re-enabled all the needed rsyncs, with one copy of the data going to an
NFS-mounted ext3 volume and another copy to another big Lustre volume. That
could explain the growth of ldiskfs_inode_cache this time. The current
vmstat -m of both OSSes is pasted below -
1st OSS + MDS -
Cache Num Total Size Pages
ll_fmd_cache 0 0 56 69
osc_quota_info 0 0 32 119
lustre_dquot_cache 0 0 144 27
fsfilt_ldiskfs_fcb 0 0 56 69
ldiskfs_inode_cache 3969899 3969960 920 4
ldiskfs_xattr 0 0 88 45
ldiskfs_prealloc_space 5536 5662 104 38
ll_file_data 0 0 128 31
lustre_inode_cache 0 0 896 4
lov_oinfo 0 0 256 15
ll_qunit_cache 0 0 72 54
ldlm_locks 86258 110698 512 7
ldlm_resources 85847 103725 256 15
ll_import_cache 0 0 440 9
ll_obdo_cache 0 0 208 19
ll_obd_dev_cache 40 40 5328 1
fib6_nodes 11 61 64 61
ip6_dst_cache 16 24 320 12
ndisc_cache 1 15 256 15
rawv6_sock 10 12 1024 4
udpv6_sock 1 4 1024 4
tcpv6_sock 3 4 1728 4
rpc_buffers 8 8 2048 2
rpc_tasks 8 12 320 12
rpc_inode_cache 6 8 832 4
msi_cache 4 4 5760 1
ip_fib_alias 10 119 32 119
ip_fib_hash 10 61 64 61
dm_tio 0 0 24 156
dm_io 0 0 40 96
dm-bvec-(256) 0 0 4096 1
dm-bvec-128 0 0 2048 2
dm-bvec-64 0 0 1024 4
dm-bvec-16 0 0 256 15
dm-bvec-4 0 0 64 61
dm-bvec-1 0 0 16 225
dm-bio 0 0 128 31
uhci_urb_priv 2 45 88 45
ext3_inode_cache 6104 20520 856 4
ext3_xattr 0 0 88 45
journal_handle 20 81 48 81
journal_head 482 2610 88 45
revoke_table 38 225 16 225
revoke_record 0 0 32 119
scsi_cmd_cache 7 7 512 7
unix_sock 103 150 768 5
ip_mrt_cache 0 0 128 31
tcp_tw_bucket 0 0 192 20
tcp_bind_bucket 14 119 32 119
tcp_open_request 0 0 128 31
inet_peer_cache 0 0 128 31
secpath_cache 0 0 192 20
xfrm_dst_cache 0 0 384 10
ip_dst_cache 38 90 384 10
arp_cache 16 30 256 15
raw_sock 9 9 832 9
udp_sock 14 54 832 9
tcp_sock 56 60 1536 5
flow_cache 0 0 128 31
mqueue_inode_cache 1 4 896 4
relayfs_inode_cache 0 0 592 13
isofs_inode_cache 0 0 632 6
hugetlbfs_inode_cache 1 6 624 6
ext2_inode_cache 0 0 752 5
ext2_xattr 0 0 88 45
dquot 0 0 224 17
eventpoll_pwq 3 54 72 54
eventpoll_epi 3 20 192 20
kioctx 0 0 384 10
kiocb 0 0 256 15
dnotify_cache 2 96 40 96
fasync_cache 1 156 24 156
shmem_inode_cache 379 415 816 5
posix_timers_cache 0 0 184 21
uid_cache 5 31 128 31
sgpool-256 32 32 8192 1
sgpool-128 32 32 4096 1
sgpool-64 32 32 2048 2
sgpool-32 36 36 1024 4
sgpool-16 32 32 512 8
sgpool-8 45 45 256 15
cfq_pool 98 207 56 69
crq_pool 80 324 72 54
deadline_drq 0 0 96 41
as_arq 0 0 112 35
blkdev_ioc 360 476 32 119
blkdev_queue 33 63 856 9
blkdev_requests 80 120 264 15
biovec-(256) 256 256 4096 1
biovec-128 256 256 2048 2
biovec-64 256 256 1024 4
biovec-16 256 270 256 15
biovec-4 256 305 64 61
biovec-1 332 450 16 225
bio 310 310 128 31
file_lock_cache 3 75 160 25
sock_inode_cache 207 210 704 5
skbuff_head_cache 16465 21900 320 12
sock 6 12 640 6
proc_inode_cache 2637 2658 616 6
sigqueue 45 46 168 23
radix_tree_node 182213 186375 536 7
bdev_cache 52 52 832 4
mnt_cache 60 100 192 20
inode_cache 917 1239 584 7
dentry_cache 2880362 2882112 240 16
filp 731 816 320 12
names_cache 4 5 4096 1
avc_node 12 432 72 54
key_jar 10 40 192 20
idr_layer_cache 111 119 528 7
buffer_head 650238 742680 88 45
mm_struct 45 112 1152 7
vm_area_struct 1626 2904 176 22
fs_cache 427 549 64 61
files_cache 48 126 832 9
signal_cache 534 615 256 15
sighand_cache 530 543 2112 3
task_struct 555 560 2000 2
anon_vma 679 1248 24 156
shared_policy_node 0 0 56 69
numa_policy 82 675 16 225
size-131072(DMA) 0 0 131072 1
size-131072 12 12 131072 1
size-65536(DMA) 0 0 65536 1
size-65536 229 229 65536 1
size-32768(DMA) 0 0 32768 1
size-32768 0 0 32768 1
size-16384(DMA) 0 0 16384 1
size-16384 1286 1286 16384 1
size-8192(DMA) 0 0 8192 1
size-8192 4884 4884 8192 1
size-4096(DMA) 0 0 4096 1
size-4096 744 786 4096 1
size-2048(DMA) 0 0 2048 2
size-2048 9114 9120 2048 2
size-1620(DMA) 0 0 1664 4
size-1620 86 100 1664 4
size-1024(DMA) 0 0 1024 4
size-1024 15217 16132 1024 4
size-512(DMA) 0 0 512 8
size-512 1213 2752 512 8
size-256(DMA) 0 0 256 15
size-256 10441 11310 256 15
size-128(DMA) 0 0 128 31
size-128 205487 218488 128 31
size-64(DMA) 0 0 64 61
size-64 777658 891088 64 61
size-32(DMA) 0 0 32 119
size-32 43033 86632 32 119
kmem_cache 225 225 256 15
2nd OSS -
             total       used       free     shared    buffers     cached
Mem:       8166340    5462540    2703800          0    1515664     448516
-/+ buffers/cache:    3498360    4667980
Swap:      4096440          0    4096440
[root@lustre2 ~]# vmstat -m
Cache Num Total Size Pages
ll_fmd_cache 0 0 56 69
fsfilt_ldiskfs_fcb 4 69 56 69
ldiskfs_inode_cache 1971539 1971548 920 4
ldiskfs_xattr 0 0 88 45
ldiskfs_prealloc_space 9090 9120 104 38
ll_file_data 0 0 128 31
lustre_inode_cache 0 0 896 4
lov_oinfo 0 0 256 15
ll_qunit_cache 0 0 72 54
ldlm_locks 228 1253 512 7
ldlm_resources 226 2235 256 15
ll_import_cache 0 0 440 9
ll_obdo_cache 0 0 208 19
ll_obd_dev_cache 10 10 5328 1
fib6_nodes 11 61 64 61
ip6_dst_cache 16 24 320 12
ndisc_cache 1 15 256 15
rawv6_sock 10 12 1024 4
udpv6_sock 1 4 1024 4
tcpv6_sock 3 4 1728 4
rpc_buffers 8 8 2048 2
rpc_tasks 8 12 320 12
rpc_inode_cache 6 8 832 4
msi_cache 4 4 5760 1
ip_fib_alias 10 119 32 119
ip_fib_hash 10 61 64 61
dm_tio 0 0 24 156
dm_io 0 0 40 96
dm-bvec-(256) 0 0 4096 1
dm-bvec-128 0 0 2048 2
dm-bvec-64 0 0 1024 4
dm-bvec-16 0 0 256 15
dm-bvec-4 0 0 64 61
dm-bvec-1 0 0 16 225
dm-bio 0 0 128 31
uhci_urb_priv 2 90 88 45
ext3_inode_cache 393257 393260 856 4
ext3_xattr 0 0 88 45
journal_handle 8 81 48 81
journal_head 653 2295 88 45
revoke_table 24 225 16 225
revoke_record 0 0 32 119
scsi_cmd_cache 10 49 512 7
unix_sock 106 150 768 5
ip_mrt_cache 0 0 128 31
tcp_tw_bucket 0 0 192 20
tcp_bind_bucket 17 119 32 119
tcp_open_request 0 0 128 31
inet_peer_cache 0 0 128 31
secpath_cache 0 0 192 20
xfrm_dst_cache 0 0 384 10
ip_dst_cache 38 80 384 10
arp_cache 16 30 256 15
raw_sock 9 9 832 9
udp_sock 15 36 832 9
tcp_sock 56 65 1536 5
flow_cache 0 0 128 31
mqueue_inode_cache 1 4 896 4
relayfs_inode_cache 0 0 592 13
isofs_inode_cache 0 0 632 6
hugetlbfs_inode_cache 1 6 624 6
ext2_inode_cache 0 0 752 5
ext2_xattr 0 0 88 45
dquot 0 0 224 17
eventpoll_pwq 3 54 72 54
eventpoll_epi 3 20 192 20
kioctx 0 0 384 10
kiocb 0 0 256 15
dnotify_cache 2 96 40 96
fasync_cache 1 156 24 156
shmem_inode_cache 369 390 816 5
posix_timers_cache 0 0 184 21
uid_cache 6 62 128 31
sgpool-256 32 32 8192 1
sgpool-128 32 32 4096 1
sgpool-64 32 32 2048 2
sgpool-32 32 32 1024 4
sgpool-16 33 40 512 8
sgpool-8 45 90 256 15
cfq_pool 95 207 56 69
crq_pool 87 216 72 54
deadline_drq 0 0 96 41
as_arq 0 0 112 35
blkdev_ioc 300 357 32 119
blkdev_queue 35 72 856 9
blkdev_requests 95 135 264 15
biovec-(256) 256 256 4096 1
biovec-128 256 256 2048 2
biovec-64 256 256 1024 4
biovec-16 256 270 256 15
biovec-4 256 305 64 61
biovec-1 324 450 16 225
bio 305 372 128 31
file_lock_cache 3 50 160 25
sock_inode_cache 211 230 704 5
skbuff_head_cache 16556 21324 320 12
sock 6 12 640 6
proc_inode_cache 2361 2364 616 6
sigqueue 33 46 168 23
radix_tree_node 212941 212954 536 7
bdev_cache 45 56 832 4
mnt_cache 46 60 192 20
inode_cache 2730 2779 584 7
dentry_cache 2373901 2374016 240 16
filp 718 804 320 12
names_cache 5 10 4096 1
avc_node 12 378 72 54
key_jar 12 40 192 20
idr_layer_cache 88 91 528 7
buffer_head 387120 387180 88 45
mm_struct 52 119 1152 7
vm_area_struct 1707 2706 176 22
fs_cache 349 488 64 61
files_cache 47 153 832 9
signal_cache 452 630 256 15
sighand_cache 448 465 2112 3
task_struct 476 492 2000 2
anon_vma 665 1248 24 156
shared_policy_node 0 0 56 69
numa_policy 82 450 16 225
size-131072(DMA) 0 0 131072 1
size-131072 12 12 131072 1
size-65536(DMA) 0 0 65536 1
size-65536 126 126 65536 1
size-32768(DMA) 0 0 32768 1
size-32768 0 0 32768 1
size-16384(DMA) 0 0 16384 1
size-16384 1210 1210 16384 1
size-8192(DMA) 0 0 8192 1
size-8192 2615 2616 8192 1
size-4096(DMA) 0 0 4096 1
size-4096 488 496 4096 1
size-2048(DMA) 0 0 2048 2
size-2048 9050 9102 2048 2
size-1620(DMA) 0 0 1664 4
size-1620 88 108 1664 4
size-1024(DMA) 0 0 1024 4
size-1024 13138 14816 1024 4
size-512(DMA) 0 0 512 8
size-512 854 2752 512 8
size-256(DMA) 0 0 256 15
size-256 6495 7770 256 15
size-128(DMA) 0 0 128 31
size-128 198380 198586 128 31
size-64(DMA) 0 0 64 61
size-64 20477 36478 64 61
size-32(DMA) 0 0 32 119
size-32 43283 50932 32 119
kmem_cache 180 180 256 15
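To make the dumps above easier to compare, here is a quick one-liner to
rank the caches by approximate memory footprint (Num x object Size, the
second and fourth columns of vmstat -m); the sample input here is a few
lines taken from the first OSS dump:

```shell
# Rank slab caches by approximate in-use memory (Num * Size bytes).
# Replace the here-doc with e.g. `vmstat -m | tail -n +2` on a live box.
awk '{ printf "%-24s %8.1f MiB\n", $1, $2 * $4 / 1048576 }' <<'EOF' | sort -k2 -rn
ldiskfs_inode_cache 3969899 3969960 920 4
dentry_cache 2880362 2882112 240 16
buffer_head 650238 742680 88 45
ext3_inode_cache 6104 20520 856 4
EOF
# ldiskfs_inode_cache     3483.1 MiB
# dentry_cache             659.3 MiB
```

That puts ldiskfs_inode_cache at roughly 3.4GB on the first OSS, which
matches the missing RAM pretty closely.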
The collectl stats during the rsync are available at
http://cluster.mathstat.dal.ca/lustre2-20071225-000104.raw.gz
They show the cache building up after 4 am. Thanks very much for any
recommendations and help. We still have a bit of headroom in the available
RAM; I hope these caches don't continue to build every day and crash the
OSS again.
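Would something like the following be a sane mitigation? This is an
untested sketch on my part, assuming a kernel that exposes these vm
tunables (drop_caches in particular only exists on 2.6.16 and later, so it
may not be available on our RHEL4-era kernel):

```shell
# Bias VFS reclaim toward dentry/inode slab objects; the default
# vm.vfs_cache_pressure is 100, higher values reclaim more aggressively.
sysctl -w vm.vfs_cache_pressure=200

# On 2.6.16+ kernels only: flush reclaimable slab objects (dentries and
# inodes) on demand after the nightly rsync finishes. Needs root.
sync
echo 2 > /proc/sys/vm/drop_caches
```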
Regards
Balagopal
> On Dec 23, 2007 18:01 -0400, Balagopal Pillai wrote:
> > The cluster was made idle on the weekend to look at the Lustre
> > RAM consumption issue. The RAM used during yesterday's rsync is still not
> > freed up. Here is the output from free
> >
> >              total       used       free     shared    buffers     cached
> > Mem:       4041880    3958744      83136          0     876132     144276
> > -/+ buffers/cache:    2938336    1103544
> > Swap:      4096564        240    4096324
>
> Note that this is normal behaviour for Linux. RAM that is unused provides
> no value, so all available RAM is used for cache until something else
> needs the memory.
>
> > Looking at vmstat -m, there is something odd. It seems that
> > ext3_inode_cache and dentry_cache are the biggest occupants of
> > RAM; ldiskfs_inode_cache is comparatively smaller.
> > -
> >
> > Cache Num Total Size Pages
> > ldiskfs_inode_cache 430199 440044 920 4
> > ldlm_locks 10509 12005 512 7
> > ldlm_resources 10291 11325 256 15
> > buffer_head 230970 393300 88 45
>
> > ext3_inode_cache 1636505 1636556 856 4
> > dentry_cache 1349923 1361216 240 16
>
> This is odd, because Lustre doesn't use ext3 at all. It uses ldiskfs
> (which is ext3 renamed + patches), so it is some non-Lustre filesystem
> usage which is consuming most of your memory.
>
> >
> > Is there anything in proc as explained in
> > http://www.redhat.com/docs/manuals/enterprise/RHEL-4-Manual/ref-guide/s1-proc-directories.html
> > that can force the kernel to flush out the dentry_cache and
> > ext3_inode_cache when the rsync is over and cache is not needed anymore?
> > Thanks very much.
>
> Only to unmount and remount the filesystem, on the server. On Lustre
> clients there is a mechanism to flush Lustre cache, but that doesn't
> help you here.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>