[Lustre-discuss] lustre ram usage (contd)
Mark Seger
Mark.Seger at hp.com
Mon Dec 24 08:38:20 PST 2007
In my opinion there are a couple of problems with cron jobs that do
monitoring. On the positive side they're quick and easy, but on the
downside you have extra work to do if you want timestamps, and then
there's the issue of all the other potential system metrics you're
missing out on. The neat thing about collectl is it essentially does it
all! In the case of Lustre that means if you run it with the defaults
you'll get CPU, memory, network, and more in addition to the slab data.
However, if you really want to get crazy, you can get the performance by
OST and even the RPC stats. The one negative with collectl is that while
it can do a lot, that translates into a lot of options, which can be
confusing at first.
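To make the "extra work" point concrete: a cron-driven approach has to add its own timestamps and do its own parsing of /proc/slabinfo, which is exactly what collectl automates. A minimal sketch of that hand-rolled work; the sample line below is illustrative, not taken from this thread.

```python
import time

# Sketch of what a cron job would have to do itself: timestamp each sample
# and parse /proc/slabinfo-style lines (name, active_objs, num_objs, objsize).
def parse_slabinfo(text):
    rows = []
    for line in text.splitlines():
        # Skip the version banner, the column-header comment, and blank lines.
        if line.startswith(("slabinfo", "#")) or not line.strip():
            continue
        fields = line.split()
        name, active, num, objsize = fields[0], int(fields[1]), int(fields[2]), int(fields[3])
        rows.append((name, active, num, objsize))
    return rows

# Illustrative sample in /proc/slabinfo format (not real data from this thread).
sample = "dentry_cache 1349923 1361216 240 16 1 : tunables 120 60 8"

# The timestamping a bare cron+vmstat pipeline would otherwise lack.
stamp = time.strftime("%H:%M:%S")
for name, active, num, objsize in parse_slabinfo(sample):
    print(f"{stamp} {name:20s} inuse={active} total={num} bytes={num * objsize}")
```

In production you would read `open("/proc/slabinfo")` instead of the sample string; the point is only how much bookkeeping collectl does for free.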
-mark
Balagopal Pillai wrote:
> Thanks Mark. This looks handy. I was about to put a cron job with vmstat
> to see how the memory utilization progresses with the early morning rsync.
> Since I put another 4G on both OSSes this morning, hopefully it should be
> enough for its operation.
>
> Regards
> Balagopal
>
>
> Mark Seger wrote:
>
>> If you're really interested in tracking memory utilization, collectl
>> - see http://collectl.sourceforge.net/ - when run as a daemon will
>> collect/log all slab data once a minute, and you can change the
>> frequency to anything you like. You can then later play it back and
>> see exactly what is happening over time. As another approach, you can
>> run it interactively, and if you specify the -oS switch you'll only
>> see changes as they occur. Including the 'T' will timestamp them, as
>> in the example below:
>>
>> [root@cag-dl380-01 root]# collectl -sY -oST -i:1
>> # SLAB DETAIL
>> #                      <-----------Objects-----------><------Slab Allocation------->
>> # Name                 InUse   Bytes   Alloc   Bytes   InUse   Bytes   Total   Bytes
>> 11:02:02 size-512        146   74752     208  106496      21   86016      26  106496
>> 11:02:07 sigqueue        319   42108     319   42108      11   45056      11   45056
>> 11:02:07 size-512        208  106496     208  106496      26  106496      26  106496
>>
>> Since this isn't a Lustre system there isn't a whole lot of activity...
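For readers decoding the columns in the collectl example above, the byte figures are simple products: object counts times the object size, and slab counts times the slab size. A quick sketch of that arithmetic for the first size-512 row, assuming 512-byte objects and 4096-byte (one-page) slabs:

```python
# Reproduce the byte columns of the "11:02:02 size-512" row above.
# Assumptions: size-512 objects are 512 bytes; each slab is one 4096-byte page.
OBJ_SIZE = 512
SLAB_SIZE = 4096

inuse_objs, alloc_objs = 146, 208     # Objects: InUse, Alloc
inuse_slabs, total_slabs = 21, 26     # Slab Allocation: InUse, Total

obj_inuse_bytes = inuse_objs * OBJ_SIZE      # 74752
obj_alloc_bytes = alloc_objs * OBJ_SIZE      # 106496
slab_inuse_bytes = inuse_slabs * SLAB_SIZE   # 86016
slab_total_bytes = total_slabs * SLAB_SIZE   # 106496

print(obj_inuse_bytes, obj_alloc_bytes, slab_inuse_bytes, slab_total_bytes)
```

The gap between object InUse bytes and slab Total bytes is the slack the slab allocator is holding for future allocations.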
>>
>> -mark
>>
>> Andreas Dilger wrote:
>>
>>> On Dec 23, 2007 18:01 -0400, Balagopal Pillai wrote:
>>>
>>>
>>>> The cluster was made idle on the weekend to look at the
>>>> Lustre RAM consumption issue. The RAM used during yesterday's rsync
>>>> is still not freed up. Here is the output from free:
>>>>
>>>>              total       used       free     shared    buffers     cached
>>>> Mem:       4041880    3958744      83136          0     876132     144276
>>>> -/+ buffers/cache:    2938336    1103544
>>>> Swap:      4096564        240    4096324
>>>>
>>>>
Note that this is normal behaviour for Linux. RAM that is unused provides
no value, so all available RAM is used for cache until something else
needs that memory.
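The "-/+ buffers/cache" line in the free output above illustrates this: free subtracts the reclaimable buffer and page cache from "used" to show what is genuinely consumed. A quick sketch of that arithmetic using the numbers from the quoted output:

```python
# Recompute free's "-/+ buffers/cache" line from the Mem: row quoted above (KB).
total, used, free_kb = 4041880, 3958744, 83136
shared, buffers, cached = 0, 876132, 144276

# buffers and cached are reclaimable, so they count as available, not used.
used_minus_cache = used - buffers - cached   # real consumption
free_plus_cache = free_kb + buffers + cached # effectively available

print(used_minus_cache, free_plus_cache)
```

So of the ~3.9 GB "used", about 1 GB is cache the kernel will give back under memory pressure.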
>>>
>>>
>>>
Looking at vmstat -m, there is something odd: ext3_inode_cache
and dentry_cache seem to be the biggest occupants of RAM, while
ldiskfs_inode_cache is comparatively smaller:

Cache                    Num    Total   Size  Pages
ldiskfs_inode_cache   430199   440044    920      4
ldlm_locks             10509    12005    512      7
ldlm_resources         10291    11325    256     15
buffer_head           230970   393300     88     45
>>>>
>>>>
>>>
>>>
ext3_inode_cache     1636505  1636556    856      4
dentry_cache         1349923  1361216    240     16
>>>>
>>>>
This is odd, because Lustre doesn't use ext3 at all. It uses ldiskfs
(which is ext3, renamed and patched), so it is some non-Lustre filesystem
usage that is consuming most of your memory.
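Multiplying the Num and Size columns from the vmstat -m output quoted above makes this point concrete. A rough sketch (slab and page overhead are ignored, so these are lower bounds):

```python
# Rough per-cache memory from the vmstat -m output above: Num * Size, in MB.
# Slab/page overhead is ignored, so each figure is a lower bound.
caches = {
    "ext3_inode_cache":    (1636505, 856),
    "dentry_cache":        (1349923, 240),
    "ldiskfs_inode_cache": ( 430199, 920),
}
usage_mb = {name: num * size / 2**20 for name, (num, size) in caches.items()}
for name, mb in sorted(usage_mb.items(), key=lambda kv: -kv[1]):
    print(f"{name:22s} {mb:7.0f} MB")
```

By these numbers ext3_inode_cache alone accounts for roughly 1.3 GB, dwarfing both the dentry cache and the Lustre-side ldiskfs_inode_cache.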
>>>
>>>
>>>
Is there anything in /proc, as explained in
http://www.redhat.com/docs/manuals/enterprise/RHEL-4-Manual/ref-guide/s1-proc-directories.html
that can force the kernel to flush out the dentry_cache and
ext3_inode_cache when the rsync is over and the cache is not needed
anymore? Thanks very much.
>>>>
>>>>
Only by unmounting and remounting the filesystem on the server. On
Lustre clients there is a mechanism to flush the Lustre cache, but that
doesn't help you here.
>>>
>>> Cheers, Andreas
>>> --
>>> Andreas Dilger
>>> Sr. Staff Engineer, Lustre Group
>>> Sun Microsystems of Canada, Inc.
>>>
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> Lustre-discuss at clusterfs.com
>>> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
>>>
>>>
>
>