[lustre-discuss] MDT partition getting full

Radu Popescu radu.popescu at amoma.com
Fri Apr 24 00:21:07 PDT 2015


OK, so I got an automated reply saying my post was too big, so I've uploaded the screenshots elsewhere (links below).

So, first, here's the listing of a few random directories under /O/1/:

debugfs:  ls -l /O/1/d14
 30408754   40700 (2)      0      0   12288 24-Apr-2015 05:36 .
 30408739   40755 (2)      0      0    4096  6-Apr-2015 08:53 ..
    216  100644 (17)      0      0   4153280 23-Apr-2015 14:20 7662
    339  100644 (17)      0      0   4153280 23-Apr-2015 19:24 7790
    369  100644 (17)      0      0   4153280 23-Apr-2015 20:29 7822
    402  100644 (17)      0      0   4153280 23-Apr-2015 21:37 7854
    433  100644 (17)      0      0   4153280 23-Apr-2015 22:56 7886
    491  100644 (17)      0      0   4153280 24-Apr-2015 01:52 7950
    547  100644 (17)      0      0   4153280 24-Apr-2015 05:36 8014
   7575  100644 (17)      0      0   4153280 23-Apr-2015 07:34 7534
    120  100644 (17)      0      0   4153280 23-Apr-2015 09:29 7566
    153  100644 (17)      0      0   4153280 23-Apr-2015 11:08 7598
    184  100644 (17)      0      0   4153280 23-Apr-2015 12:51 7630
    248  100644 (17)      0      0   4153280 23-Apr-2015 15:46 7694
    308  100644 (17)      0      0   4153280 23-Apr-2015 18:15 7758
    464  100644 (17)      0      0   4153280 24-Apr-2015 00:20 7918
    521  100644 (17)      0      0   4153280 24-Apr-2015 03:35 7982

debugfs:  ls -l /O/1/d13
 30408753   40700 (2)      0      0   12288 24-Apr-2015 05:31 .
 30408739   40755 (2)      0      0    4096  6-Apr-2015 08:53 ..
    152  100644 (17)      0      0   4153280 23-Apr-2015 11:04 7597
    215  100644 (17)      0      0   4153280 23-Apr-2015 14:17 7661
    401  100644 (17)      0      0   4153280 23-Apr-2015 21:35 7853
    463  100644 (17)      0      0   4153280 24-Apr-2015 00:18 7917
    490  100644 (17)      0      0   4153280 24-Apr-2015 01:49 7949
    546  100644 (17)      0      0   4153280 24-Apr-2015 05:31 8013
    119  100644 (17)      0      0   4153280 23-Apr-2015 09:26 7565
    183  100644 (17)      0      0   4153280 23-Apr-2015 12:48 7629
    247  100644 (17)      0      0   4153280 23-Apr-2015 15:44 7693
    278  100644 (17)      0      0   4153280 23-Apr-2015 17:02 7725
    307  100644 (17)      0      0   4153280 23-Apr-2015 18:13 7757
    338  100644 (17)      0      0   4153280 23-Apr-2015 19:22 7789
    370  100644 (17)      0      0   4153280 23-Apr-2015 20:27 7821
    432  100644 (17)      0      0   4153280 23-Apr-2015 22:53 7885
    522  100644 (17)      0      0   4153280 24-Apr-2015 03:31 7981

debugfs:  ls -l /O/1/d12
 30408752   40700 (2)      0      0   16384 24-Apr-2015 05:27 .
 30408739   40755 (2)      0      0    4096  6-Apr-2015 08:53 ..
    214  100644 (17)      0      0   4153280 23-Apr-2015 14:14 7660
    246  100644 (17)      0      0   4153280 23-Apr-2015 15:42 7692
    306  100644 (17)      0      0   4153280 23-Apr-2015 18:10 7756
    337  100644 (17)      0      0   4153280 23-Apr-2015 19:20 7788
    462  100644 (17)      0      0   4153280 24-Apr-2015 00:16 7916
    489  100644 (17)      0      0   4153280 24-Apr-2015 01:46 7948
    151  100644 (1)      0      0   4153280 23-Apr-2015 11:01 7596
    118  100644 (17)      0      0   4153280 23-Apr-2015 09:23 7564
    400  100644 (17)      0      0   4153280 23-Apr-2015 21:33 7852
    544  100644 (17)      0      0   4153280 24-Apr-2015 05:27 8012
    182  100644 (17)      0      0   4153280 23-Apr-2015 12:46 7628
    431  100644 (17)      0      0   4153280 23-Apr-2015 22:51 7884

Apparently, all of them are about 4 MB (4153280 bytes each), and the number of files seems to be growing: some were created yesterday and some today.
I don't know whether screenshots attached to this email will show up in the thread, but I'll try anyway. Basically, after unmounting and re-mounting the Lustre partitions on the server yesterday (I actually tried the same on 2 of them), I could see on the graphs that used storage was decreasing (strictly speaking, the graphs plot free storage, which was increasing). So could this be some sort of caching?
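
To check whether the files are piling up day by day, the debugfs listings above can be tallied per date. This is only a rough sketch: it assumes the `ls -l` output has been saved as text in the column layout shown above, and the function name `tally_by_day` and the shortened sample are illustrative, not part of any Lustre or e2fsprogs tool.

```python
import re
from collections import defaultdict

# One line of debugfs "ls -l" output, in the column layout shown in this post:
# inode, mode, (type), uid, gid, size, date, time, name
LINE_RE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+\((\d+)\)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\d{1,2}-[A-Za-z]{3}-\d{4})\s+(\d{2}:\d{2})\s+(\S+)\s*$"
)

def tally_by_day(listing):
    """Return {date: (file_count, total_bytes)} for regular files only."""
    totals = defaultdict(lambda: [0, 0])
    for line in listing.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # prompt lines, blank lines, anything unexpected
        mode, size, day = m.group(2), int(m.group(6)), m.group(7)
        if not mode.startswith("100"):
            continue  # skip "." and ".." (directory mode 40xxx)
        totals[day][0] += 1
        totals[day][1] += size
    return {day: tuple(counts) for day, counts in totals.items()}

# A few lines copied from the /O/1/d14 listing above.
sample = """\
 30408754   40700 (2)      0      0   12288 24-Apr-2015 05:36 .
 30408739   40755 (2)      0      0    4096  6-Apr-2015 08:53 ..
     216  100644 (17)      0      0   4153280 23-Apr-2015 14:20 7662
     339  100644 (17)      0      0   4153280 23-Apr-2015 19:24 7790
     491  100644 (17)      0      0   4153280 24-Apr-2015 01:52 7950
"""

for day, (count, size) in sorted(tally_by_day(sample).items()):
    print(f"{day}: {count} files, {size / 2**20:.2f} MiB")
```

Running this over the listings of all the d* directories, and again a day later, would show whether the per-day counts keep growing or stay fixed after a remount.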

The links:
http://5.175.193.50/ps1.png
http://5.175.193.50/ps2.png

Thanks,
Radu

> On 23 Apr 2015, at 21:01, Mohr Jr, Richard Frank (Rick Mohr) <rmohr at utk.edu> wrote:
> 
> 
>> On Apr 23, 2015, at 1:07 PM, Colin Faber <cfaber at gmail.com> wrote:
>> 
>> 
>> Based on the directory structure here, this appears to be an OST. Are you sure your targets are correctly named?
>> 
> 
> That is what I would have guessed until I took a look at my own MDT.  Sure enough, I have the directories /O/1/d[0-31] and each one seems to have 3 files that are about 3.5MB each (along with some other smaller ones).  Here is what one of those directories looks like:
> 
> debugfs:  ls -l /O/1/d14
> 16777293   40700 (2)      0      0    4096 20-Apr-2015 14:09 .
> 16777278   40755 (2)      0      0    4096 13-May-2014 16:51 ..
>  58129  100644 (1)      0      0    8256 13-May-2014 16:51 14
>  58162  100644 (1)      0      0    8256 13-May-2014 16:51 46
>  58197  100644 (1)      0      0    8256 13-May-2014 16:51 78
>  58237  100644 (1)      0      0   37632 13-May-2014 19:28 110
>  58271  100644 (1)      0      0   38464 13-May-2014 19:28 142
>  58305  100644 (1)      0      0   37888 13-May-2014 19:28 174
>  58343  100644 (1)      0      0   37184 13-May-2014 19:28 206
>  58396  100644 (1)      0      0   37312 13-May-2014 19:28 238
>  58429  100644 (1)      0      0   36160 13-May-2014 19:28 270
>  12824  100644 (1)      0      0   3623232 20-Apr-2015 14:09 43150
>  12915  100644 (1)      0      0   3800960 20-Apr-2015 14:09 43182
>  12954  100644 (1)      0      0   3769216 20-Apr-2015 14:09 43214
> 
> The three large files seem to have been created the last time the MDT was mounted.  The timestamps for the other smaller files coincide with the Lustre upgrade we performed last year.  But I am not sure what is contained in these files.
> 
> Radu: Are the timestamps for all of your files the same?  Or is the system gradually accumulating them over time for some reason?
> 
> --
> Rick Mohr
> Senior HPC System Administrator
> National Institute for Computational Sciences
> http://www.nics.tennessee.edu
> 
