[Lustre-discuss] noatime or atime_diff for Lustre 1.8.7?

Grigory Shamov gas5x at yahoo.com
Thu Dec 6 11:06:47 PST 2012


Hi,

On our cluster, when the Lustre FS is under load, it sometimes slows down precipitously, and the OSS servers log a great many "slow IO" and "slow setattr" messages:

=======
[2988758.408968] Lustre: scratch-OST0004: slow i_mutex 51s due to heavy IO load
[2988758.408974] Lustre: Skipped 276 previous similar messages
[2988760.309388] Lustre: scratch-OST0004: slow setattr 50s due to heavy IO load
[2988822.617865] Lustre: scratch-OST0004: slow setattr 62s due to heavy IO load
[2988822.689819] Lustre: scratch-OST0004: slow journal start 48s due to heavy IO load
[2988822.690627] Lustre: scratch-OST0004: slow journal start 56s due to heavy IO load
[2988823.125410] Lustre: scratch-OST0004: slow parent lock 55s due to heavy IO load
[2988823.125419] Lustre: Skipped 1 previous similar message
[2988823.125432] Lustre: scratch-OST0004: slow preprw_write setup 55s due to heavy IO load
[2988856.236914] Lustre: scratch-OST0004: slow direct_io 33s due to heavy IO load
[2988856.236922] Lustre: Skipped 323 previous similar messages
[2988892.543942] Lustre: scratch-OST0004: slow i_mutex 48s due to heavy IO load
[2988892.543950] Lustre: Skipped 280 previous similar messages
[2988892.545310] Lustre: scratch-OST0004: slow setattr 55s due to heavy IO load
[2988892.547328] Lustre: scratch-OST0004: slow parent lock 42s due to heavy IO load
[2988892.547334] Lustre: Skipped 4 previous similar messages
[2988958.306720] Lustre: scratch-OST0004: slow setattr 52s due to heavy IO load
[2988958.306724] Lustre: Skipped 1 previous similar message
[2988958.310818] Lustre: scratch-OST0004: slow parent lock 59s due to heavy IO load
[2989040.406738] Lustre: scratch-OST0004: slow setattr 50s due to heavy IO load
=========
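
To get a feel for which operations dominate in a log like the one above, a minimal Python sketch along these lines could tally the messages per target and operation. The names `SLOW_RE` and `summarize_slow_ops` are made up for illustration, and the pattern assumes the exact wording shown in the excerpt ("slow <op> <N>s due to heavy IO load"). The "Skipped N previous similar messages" lines are ignored, so the counts are only a lower bound.

```python
import re
from collections import defaultdict

# Assumed message shape, taken from the excerpt above, e.g.:
# [2988758.408968] Lustre: scratch-OST0004: slow i_mutex 51s due to heavy IO load
SLOW_RE = re.compile(
    r"Lustre: (?P<target>\S+): slow (?P<op>.+?) (?P<secs>\d+)s due to heavy IO load"
)


def summarize_slow_ops(lines):
    """Return {(target, op): (count, max_seconds)} for the 'slow ...' messages.

    Hypothetical helper: lines that do not match (such as the
    'Skipped N previous similar messages' lines) are not counted.
    """
    stats = defaultdict(lambda: [0, 0])
    for line in lines:
        m = SLOW_RE.search(line)
        if m is None:
            continue
        key = (m.group("target"), m.group("op"))
        stats[key][0] += 1
        stats[key][1] = max(stats[key][1], int(m.group("secs")))
    return {key: tuple(value) for key, value in stats.items()}
```

Passing it the lines of a saved console or dmesg capture and printing the result sorted by count shows at a glance whether, say, "setattr" or "i_mutex" is the most frequent complaint.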

I wonder if mounting it on the clients with "noatime" and/or changing atime_diff would help get rid of these Lustre slowdowns? Right now, /proc/fs/lustre/mds/scratch-MDT0000/atime_diff on our MDS server is 60.

I tried Googling it first, and found that apparently "noatime" is not supported in 1.8, and that changing atime_diff is the preferred way?

Could you please advise me which way is better/possible, and how one changes atime_diff? Will it help? Does it require, say, a remount on the clients?

Any ideas and advice would be greatly appreciated! Thank you very much in advance.


--
Grigory Shamov
HPC Analyst, Westgrid/Compute Canada
E2-588 EITC Building, University of Manitoba
(204) 474-9625




