[Lustre-discuss] un-even distribution of data over OSTs

Wed Mar 7 12:43:17 PST 2012

On 2012-03-08, at 0:33, Roland Laifer <roland.laifer at kit.edu> wrote:
> I recently had a similar problem: very uneven OST space usage, but I could not 
> find a correspondingly large file. 
> 
> Finally I found a process with parent 1 that was still appending data to a 
> file which was already deleted by the user, i.e. this was the reason why 
> I could not find a corresponding large file with "find". 

It is possible to find open-unlinked files on the clients by using "lsof | grep deleted", since deleted files get " (deleted)" added at the end. 

Sometimes this is normal, e.g. for temporary files that are meant to disappear when the process exits, but usually it is not. 

These files can be accessed via /proc/{PID}/fd/{fileno}, though I've never checked whether "lfs getstripe" works there. 
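
If lsof is not installed on a client, the same check can be sketched in Python by walking the per-process fd directories under /proc directly. This is only a minimal sketch, assuming a Linux client where the kernel reports the link target of an unlinked file with a trailing " (deleted)"; the function name find_open_unlinked and the proc_root parameter are made up for illustration, and a file whose real name happens to end in " (deleted)" would be a false positive.

```python
import os

# Suffix that Linux shows on /proc/<pid>/fd/<n> link targets
# when the file name has already been unlinked.
DELETED_SUFFIX = " (deleted)"


def find_open_unlinked(proc_root="/proc"):
    """Return (pid, fd, path, size) for each open file whose name was unlinked.

    proc_root is a parameter only so the function can be pointed at a
    test directory; on a real client it is simply /proc.
    """
    found = []
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or no permission to inspect it
        for fd in fds:
            link = os.path.join(fd_dir, fd)
            try:
                target = os.readlink(link)
            except OSError:
                continue  # descriptor was closed while scanning
            if not target.endswith(DELETED_SUFFIX):
                continue
            try:
                # stat() follows the link to the still-open file
                size = os.stat(link).st_size
            except OSError:
                size = None  # size could not be determined
            path = target[: -len(DELETED_SUFFIX)]
            found.append((int(pid), int(fd), path, size))
    return found
```

Run as root on a suspect client, find_open_unlinked() returns one tuple per open-unlinked file. Sorting the result by size should bring a runaway file like the one described above to the top.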

> I found that process because the corresponding client was reporting 
> "The ost_write operation failed with -28" LustreError messages and because 
> I was lucky that only a few user processes were running on that client. 
> The owner of that process had Lustre quotas of 1.5 TB but "du -hs" on his 
> home directory only showed 80 GB. After killing the process Lustre quotas 
> went down and "lfs df" showed that OST usage was going down, too. 
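
The gap between the quota figure and the "du" figure is the useful signal here. A rough sketch of that arithmetic follows, assuming the 1.5 TB is the usage reported by the quota system and that both numbers are in decimal units ("du -h" normally prints binary units, so the real gap differs slightly). The helper name hidden_usage is hypothetical.

```python
def hidden_usage(quota_used_bytes, du_visible_bytes):
    """Return (gap_in_bytes, gap_as_fraction_of_quota_usage).

    A large gap suggests space held by files that no longer have a
    name in the namespace, e.g. open-unlinked files.
    """
    gap = quota_used_bytes - du_visible_bytes
    fraction = gap / quota_used_bytes if quota_used_bytes else 0.0
    return gap, fraction


GB = 10**9  # decimal gigabyte, an assumption about the reported units

# Figures from the report above: 1.5 TB charged to the user, 80 GB visible.
gap, fraction = hidden_usage(1500 * GB, 80 * GB)
print(f"unaccounted: {gap // GB} GB ({fraction:.0%} of quota usage)")
```

With these figures, roughly 1.42 TB of the user's usage has no visible file behind it, which is of the same order as the missing space on a single OST.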
> 
> Regards, 
>  Roland 
> 
> 
> On Wed, Mar 07, 2012 at 07:41:28AM -0800, Grigory Shamov wrote:
>> Dear Lustre-Users,
>> 
>> Recently we had an issue with file data distribution over our Lustre OSTs. We have a Lustre storage cluster here, consisting of two OSS servers in active-active failover mode. The Lustre version is 1.8, possibly with DDN patches. 
>> 
>> The cluster has 12 OSTs, 7.3 TB each. Normally they are filled to about 60% (4.5 TB or so), but recently one of them filled up completely (99%), with two others close behind (80%). The rest of the OSTs stayed at the usual 60%. 
>> 
>> Why would that happen? Shouldn't Lustre try to distribute the space evenly? I have checked the full OSTs for large files; none was large enough to explain the difference (i.e. of the order of the gap between 99% and 60% occupancy, 2-3 TB). Some users did have large directories, but the files in them were about 5-10 GB each.
>> 
>> I have checked our Lustre parameters: qos_prio_free seems to be at the default of 90%, qos_threshold_rr is 16%, and the stripe count is 1. 
>> 
>> Could you please suggest what might have caused this behavior, and whether there are any tunables or better threshold values that would help avoid such imbalances? 
>> 
>> Thank you very much in advance!
>> 
>> --
>> Grigory Shamov
>> HPC Analyst,
>> University of Manitoba
>> Winnipeg MB Canada
>> 
>> _______________________________________________
>> Lustre-discuss mailing list
>> Lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
> 
> 
> -- 
> Karlsruhe Institute of Technology (KIT)
> Steinbuch Centre for Computing (SCC)
> 
> Roland Laifer
> Scientific Computing and Simulation (SCS)
> 
> Zirkel 2, Building 20.21, Room 209
> 76131 Karlsruhe, Germany
> Phone: +49 721 608 44861
> Fax: +49 721 32550
> Email: roland.laifer at kit.edu
> Web: http://www.scc.kit.edu
> 
> KIT – University of the State of Baden-Wuerttemberg and 
> National Laboratory of the Helmholtz Association
> 