[Lustre-discuss] noatime or atime_diff for Lustre 1.8.7?

Colin Faber colin_faber at xyratex.com
Thu Dec 6 11:28:00 PST 2012


Hi,

The messages indicate overloaded backend storage. You could try that. 
Another option may be to statically set the maximum number of threads on 
the OSS. This should reduce the load on the system and (hopefully) push 
the backlog to your clients.
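
To see which operations are stalling and how badly before changing 
anything, it may help to tally the "slow ..." lines from the OSS kernel 
log. What follows is only a rough sketch and not part of Lustre: the 
function name summarize_slow_ops and the regular expression are 
assumptions, modelled solely on the message format quoted below.

```python
import re
from collections import defaultdict

# Assumed pattern, derived only from the log excerpt in this thread, e.g.
# "Lustre: scratch-OST0004: slow setattr 50s due to heavy IO load"
SLOW_RE = re.compile(r"Lustre: (\S+): slow (.+?) (\d+)s due to heavy IO load")


def summarize_slow_ops(lines):
    """Return {(target, operation): (count, max_seconds)} for 'slow' messages.

    Lines that do not match (such as "Skipped N previous similar messages")
    are ignored, so the counts are a lower bound on what really happened.
    """
    stats = defaultdict(lambda: [0, 0])
    for line in lines:
        match = SLOW_RE.search(line)
        if match is None:
            continue
        target, operation = match.group(1), match.group(2)
        seconds = int(match.group(3))
        entry = stats[(target, operation)]
        entry[0] += 1                        # number of logged occurrences
        entry[1] = max(entry[1], seconds)    # worst delay seen so far
    return {key: tuple(value) for key, value in stats.items()}
```

Passing it the lines of the kernel log (for instance an open file object 
holding saved dmesg output) gives one entry per OST and operation, which 
makes it easier to compare the situation before and after a tuning change.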

-cf


On 12/06/2012 12:06 PM, Grigory Shamov wrote:
> Hi,
>
> On our cluster, when there is load on the Lustre FS, at some point it slows down precipitously, and there are a great many "slow IO" and "slow setattr" messages on the OSS servers:
>
> =======
> [2988758.408968] Lustre: scratch-OST0004: slow i_mutex 51s due to heavy IO load
> [2988758.408974] Lustre: Skipped 276 previous similar messages
> [2988760.309388] Lustre: scratch-OST0004: slow setattr 50s due to heavy IO load
> [2988822.617865] Lustre: scratch-OST0004: slow setattr 62s due to heavy IO load
> [2988822.689819] Lustre: scratch-OST0004: slow journal start 48s due to heavy IO load
> [2988822.690627] Lustre: scratch-OST0004: slow journal start 56s due to heavy IO load
> [2988823.125410] Lustre: scratch-OST0004: slow parent lock 55s due to heavy IO load
> [2988823.125419] Lustre: Skipped 1 previous similar message
> [2988823.125432] Lustre: scratch-OST0004: slow preprw_write setup 55s due to heavy IO load
> [2988856.236914] Lustre: scratch-OST0004: slow direct_io 33s due to heavy IO load
> [2988856.236922] Lustre: Skipped 323 previous similar messages
> [2988892.543942] Lustre: scratch-OST0004: slow i_mutex 48s due to heavy IO load
> [2988892.543950] Lustre: Skipped 280 previous similar messages
> [2988892.545310] Lustre: scratch-OST0004: slow setattr 55s due to heavy IO load
> [2988892.547328] Lustre: scratch-OST0004: slow parent lock 42s due to heavy IO load
> [2988892.547334] Lustre: Skipped 4 previous similar messages
> [2988958.306720] Lustre: scratch-OST0004: slow setattr 52s due to heavy IO load
> [2988958.306724] Lustre: Skipped 1 previous similar message
> [2988958.310818] Lustre: scratch-OST0004: slow parent lock 59s due to heavy IO load
> [2989040.406738] Lustre: scratch-OST0004: slow setattr 50s due to heavy IO load
> =========
>
> I wonder if mounting it on the clients with "noatime" and/or changing atime_diff would help to get rid of these Lustre slowdowns? Right now, /proc/fs/lustre/mds/scratch-MDT0000/atime_diff on our MDS server is 60.
>
> I tried Googling it first, and found that "noatime" is apparently not supported in 1.8, and that changing atime_diff is the preferred way?
>
> Could you please advise me which way is better (or possible), and how one changes atime_diff? Will it help? Does it require, say, a client remount?
>
> Any ideas and advice would be greatly appreciated! Thank you very much in advance.
>
>
> --
> Grigory Shamov
> HPC Analyst, Westgrid/Compute Canada
> E2-588 EITC Building, University of Manitoba
> (204) 474-9625



