[Lustre-discuss] lustre ram usage
Balagopal Pillai
pillai at mathstat.dal.ca
Sat Dec 22 15:15:45 PST 2007
Hi,
One of our Lustre installations has the following hardware:
one Dell PE1950 with 4 MD1000s connected and 8 OSTs exported. The
second OSS server, a Dell 2950 (which also functions as the MGS and
MDS), also has 4 MD1000s connected and 8 OSTs exported. The 16 OSTs from
both OSSes serve 6 volumes of about 50 TB in total. Both servers have dual
quad-core Xeons and 4 GB of RAM. The 2950 also has a second PERC 5
that serves the MGS and MDS.
Now one of the volumes is getting full, and the overnight rsync
crashes the 2950 (and sometimes the 1950 too), likely because it exhausts
the RAM. The crash always seems to happen about half an hour after the
rsync's scheduled early-morning start. Today I ran the rsync manually and found that
it takes almost 3 GB of RAM on the 2950 and more than 2 GB on the 1950 for the rsync to
complete. Rsync is run from one of the compute nodes, where the Lustre volumes
are mounted. I reduced the journal size on all OSTs to 128 MB, as per the
advice in one of the emails on the list, but that still doesn't reduce
the memory consumption on the Lustre OSS when a Lustre client runs an
rsync backup. Here is the current status on the 2950:
             total       used       free     shared    buffers     cached
Mem:       4041880    3985712      56168          0     883388      53448
-/+ buffers/cache:    3048876     993004
Swap:      4096564        240    4096324
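
As a cross-check on the figures above, the "-/+ buffers/cache" line can be re-derived from the "Mem:" line. This is only a small sketch in Python; the function name buffers_cache_adjusted is made up for illustration, and the values are assumed to be in KB (the default unit of free).

```python
def buffers_cache_adjusted(used, free, buffers, cached):
    """Return (used, free) after discounting buffers and page cache."""
    # Buffers and page cache are memory the kernel can normally reclaim,
    # so free subtracts them from "used" and adds them to "free".
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable


# Values taken from the Mem: line reported for the 2950 (assumed KB).
used_adj, free_adj = buffers_cache_adjusted(
    used=3985712, free=56168, buffers=883388, cached=53448
)
print(used_adj, free_adj)  # 3048876 993004
```

The result matches the "-/+ buffers/cache" line, so roughly 3 GB stays in use on the 2950 even after buffers and cache are discounted.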
Even after I terminated the rsync, the used RAM doesn't
seem to get freed up. In hindsight, I could have got 16 GB per OSS, but I wasn't expecting such
high memory usage by Lustre. The version of Lustre is 1.6.3. The RAM
utilization on the OSS seems to go up in the "building file list" stage of the
rsync. Other than the occasional hang of the NFS server that re-exports
Lustre volumes to other servers, Lustre is working fine and is stable
when the cluster is fully loaded with compute jobs.
Is there anything else that can be tried (maybe
ost_num_threads, for example), other than adding more RAM to both OSSes, that
could solve the high memory consumption problem during the rsync backup on
the two OSSes? Thanks in advance.
Regards
Balagopal Pillai