[Lustre-discuss] No space left on device for just one file

Michael Robbert mrobbert at mines.edu
Wed Jan 13 06:37:44 PST 2010


Andreas,
I never saw this message either. It is showing up in the output of dmesg, but it is not written to any log file that I can find. I miscounted the number of files in my original message; the actual number is a little more than 11 million files. I am working with the user to decrease the number of files needed in a single directory. At this point I think we've given up trying to use this directory for anything and will just purge it. I expect that will be a long process. Does anybody have any suggestions for making the removal of a directory with 11 million files a little less painful?
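
For the purge, the best I've come up with so far is something along these lines; a minimal sketch, assuming GNU find and that nothing else is still using the directory:

    cd /lustre/scratch/smoqbel/Cenval/CLM/Met.Forcing
    find 18X11 -type f -delete    # unlink entries as they are found, rather than expanding a huge glob
    rmdir 18X11                   # remove the (now empty) directory itself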

As for moving forward, I'm waiting for the user to make some code changes that will allow these files to be split into 8 separate directories, with the possibility of also stacking time-step files to further reduce the number of files. I still expect the number of files to be fairly large, so I was considering using the loopback-filesystem trick to store them, since once they are created they will be read-only. Any suggestions for doing this? Initial tests indicate that keeping the loopback file on a local disk while writing may be faster; then I can copy it to Lustre and come up with some configuration on the compute nodes so that the user can mount it read-only when his jobs start.
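
To make the loopback idea concrete, this is roughly the workflow I have in mind; the sizes, image paths, and the choice of ext3 below are placeholders rather than tested recommendations:

    # On a node with fast local disk: create and populate the image.
    dd if=/dev/zero of=/local/steps.img bs=1M count=0 seek=20480   # sparse ~20GB image file (size is a guess)
    mkfs.ext3 -F /local/steps.img                                  # build a filesystem inside the image
    mkdir -p /mnt/steps
    mount -o loop /local/steps.img /mnt/steps
    # ... the job writes its time-step files into /mnt/steps ...
    umount /mnt/steps

    # Publish the finished image to Lustre.
    cp /local/steps.img /lustre/scratch/smoqbel/steps.img

    # On each compute node, at job start, mount it read-only.
    mkdir -p /mnt/steps
    mount -o loop,ro /lustre/scratch/smoqbel/steps.img /mnt/steps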

Thanks,
Mike Robbert

On Jan 12, 2010, at 7:49 PM, Andreas Dilger wrote:

> On 2010-01-12, at 15:30, Bernd Schubert wrote:
>> you really should file a ticket with us (DDN). I think your problem is
>> from these MDS messages:
>>
>> LDISKFS-fs warning (device dm-1): ldiskfs_dx_add_entry: Directory
>> index full!
>> LDISKFS-fs warning (device dm-1): ldiskfs_dx_add_entry: Directory
>> index full!
>
> Hmm, I didn't see this in any of the emails.  That definitely would have
> made it obvious what the problem was.  I didn't think that the size of
> the index would be a problem, since the filename is only 27 characters
> long, and Michael said there were only a million files in the directory.
> That works out to a directory size of about 40MB, which isn't close to
> the upper limit.
>
> There might be a problem if you have a 1M-file directory and are
> repeatedly creating and deleting long filenames in the same directory,
> which might leave some directory leaf blocks full while those blocks
> cannot be split to redistribute the entries therein.
>
> It should be possible to unmount the MDT and run "e2fsck -fD /dev/{mdsdev}"
> so that it re-indexes the directory and reduces the number of blocks the
> directory is using.
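
A rough sketch of that repair sequence as I understand it, run on the MDS
(the mount point below is a placeholder for our setup, /dev/dm-1 is the
scratch MDT per Bernd's message, and this is untested):

    umount /mnt/scratch-mdt                       # stop the scratch MDT
    e2fsck -fD /dev/dm-1                          # -f: force a full check, -D: optimize/re-index directories
    mount -t lustre /dev/dm-1 /mnt/scratch-mdt    # bring the MDT back online
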
>
>> And /dev/dm-1 is also the scratch MDT.
>>
>>
>> Cheers,
>> Bernd
>>
>> On Tuesday 12 January 2010, Michael Robbert wrote:
>>> Andreas,
>>> Here are the results of my debugging. This problem does show up on
>>> multiple (presumably all) clients. I followed your instructions,
>>> changing lustre to lnet in step 2, and got debug output on both
>>> machines, but the -28 text only showed up on the client.
>>>
>>> [root@ra 18X11]# grep -- "-28" /tmp/debug.client
>>> 00000100:00000200:5:1263315233.100525:0:22069:0:(client.c:841:ptlrpc_check_reply()) @@@ rc = 1 for  req@00000103a5820800 x200609397/t0 o36->scratch-MDT0000_UUID@172.16.34.1@o2ib:12/10 lens 376/424 e 0 to 1 dl 1263315433 ref 1 fl Rpc:R/0/0 rc 0/-28
>>> 00000100:00000200:5:1263315233.100538:0:22069:0:(events.c:95:reply_in_callback()) @@@ type 5, status 0  req@00000103a5820800 x200609397/t0 o36->scratch-MDT0000_UUID@172.16.34.1@o2ib:12/10 lens 376/424 e 0 to 1 dl 1263315433 ref 1 fl Rpc:R/0/0 rc 0/-28
>>> 00000100:00100000:5:1263315233.100543:0:22069:0:(events.c:115:reply_in_callback()) @@@ unlink  req@00000103a5820800 x200609397/t0 o36->scratch-MDT0000_UUID@172.16.34.1@o2ib:12/10 lens 376/424 e 0 to 1 dl 1263315433 ref 1 fl Rpc:R/0/0 rc 0/-28
>>> 00000100:00000040:5:1263315233.100565:0:22069:0:(client.c:863:ptlrpc_check_status()) @@@ status is -28  req@00000103a5820800 x200609397/t0 o36->scratch-MDT0000_UUID@172.16.34.1@o2ib:12/10 lens 376/424 e 0 to 1 dl 1263315433 ref 1 fl Rpc:R/0/0 rc 0/-28
>>> 00000100:00000001:5:1263315233.100570:0:22069:0:(client.c:869:ptlrpc_check_status()) Process leaving (rc=18446744073709551588 : -28 : ffffffffffffffe4)
>>> 00000100:00000001:5:1263315233.100578:0:22069:0:(client.c:955:after_reply()) Process leaving (rc=18446744073709551588 : -28 : ffffffffffffffe4)
>>> 00000100:00100000:5:1263315233.100581:0:22069:0:(lustre_net.h:984:ptlrpc_rqphase_move()) @@@ move req "Rpc" -> "Interpret"  req@00000103a5820800 x200609397/t0 o36->scratch-MDT0000_UUID@172.16.34.1@o2ib:12/10 lens 376/424 e 0 to 1 dl 1263315433 ref 1 fl Rpc:R/0/0 rc 0/-28
>>> 00000100:00000001:5:1263315233.100586:0:22069:0:(client.c:2094:ptlrpc_queue_wait()) Process leaving (rc=18446744073709551588 : -28 : ffffffffffffffe4)
>>> 00000002:00000040:5:1263315233.100590:0:22069:0:(mdc_reint.c:67:mdc_reint()) error in handling -28
>>> 00000002:00000001:5:1263315233.100593:0:22069:0:(mdc_reint.c:227:mdc_create()) Process leaving (rc=18446744073709551588 : -28 : ffffffffffffffe4)
>>> 00000080:00000001:5:1263315233.100596:0:22069:0:(namei.c:881:ll_new_node()) Process leaving via err_exit (rc=18446744073709551588 : -28 : ffffffffffffffe4)
>>> 00000100:00000040:5:1263315233.100600:0:22069:0:(client.c:1629:__ptlrpc_req_finished()) @@@ refcount now 0  req@00000103a5820800 x200609397/t0 o36->scratch-MDT0000_UUID@172.16.34.1@o2ib:12/10 lens 376/424 e 0 to 1 dl 1263315433 ref 1 fl Interpret:R/0/0 rc 0/-28
>>> 00000080:00000001:5:1263315233.100620:0:22069:0:(namei.c:930:ll_mknod_generic()) Process leaving (rc=18446744073709551588 : -28 : ffffffffffffffe4)
>>>
>>> Finally here is the lfs df output:
>>>
>>> [root@ra 18X11]# lfs df
>>> UUID                 1K-blocks      Used Available  Use% Mounted on
>>> home-MDT0000_UUID    5127574032   2034740 4832512272    0% /lustre/home[MDT:0]
>>> home-OST0000_UUID    5768577552 1392861480 4082688968   24% /lustre/home[OST:0]
>>> home-OST0001_UUID    5768577552 1206861808 4268688824   20% /lustre/home[OST:1]
>>> home-OST0002_UUID    5768577552 1500109508 3975439928   26% /lustre/home[OST:2]
>>> home-OST0003_UUID    5768577552 1233475740 4242074712   21% /lustre/home[OST:3]
>>> home-OST0004_UUID    5768577552 1197398768 4278150628   20% /lustre/home[OST:4]
>>> home-OST0005_UUID    5768577552 1186058976 4289491656   20% /lustre/home[OST:5]
>>>
>>> filesystem summary:  34611465312 7716766280 25136534716   22% /lustre/home
>>>
>>> UUID                 1K-blocks      Used Available  Use% Mounted on
>>> scratch-MDT0000_UUID 5127569936   9913156 4824629964    0% /lustre/scratch[MDT:0]
>>> scratch-OST0000_UUID 5768577552 4446029104 1029519960   77% /lustre/scratch[OST:0]
>>> scratch-OST0001_UUID 5768577552 3914730392 1560819220   67% /lustre/scratch[OST:1]
>>> scratch-OST0002_UUID 5768577552 4268932844 1206616396   74% /lustre/scratch[OST:2]
>>> scratch-OST0003_UUID 5768577552 4307085048 1168464192   74% /lustre/scratch[OST:3]
>>> scratch-OST0004_UUID 5768577552 3920023888 1555525724   67% /lustre/scratch[OST:4]
>>> scratch-OST0005_UUID 5768577552 3590710852 1884838760   62% /lustre/scratch[OST:5]
>>> scratch-OST0006_UUID 5768577552 4649048836 826500028   80% /lustre/scratch[OST:6]
>>> scratch-OST0007_UUID 5768577552 4089658692 1385890920   70% /lustre/scratch[OST:7]
>>> scratch-OST0008_UUID 5768577552 4151458292 1324090948   71% /lustre/scratch[OST:8]
>>> scratch-OST0009_UUID 5768577552 4116646240 1358902348   71% /lustre/scratch[OST:9]
>>> scratch-OST000a_UUID 5768577552 3750259568 1725290032   65% /lustre/scratch[OST:10]
>>> scratch-OST000b_UUID 5768577552 4346406836 1129141752   75% /lustre/scratch[OST:11]
>>> scratch-OST000c_UUID 5768577552 4376152100 1099396768   75% /lustre/scratch[OST:12]
>>> scratch-OST000d_UUID 5768577552 4312773056 1162776184   74% /lustre/scratch[OST:13]
>>> scratch-OST000e_UUID 5768577552 4900307080 575242532   84% /lustre/scratch[OST:14]
>>> scratch-OST000f_UUID 5768577552 4044304276 1431243940   70% /lustre/scratch[OST:15]
>>> scratch-OST0010_UUID 5768577552 3827521672 1648026552   66% /lustre/scratch[OST:16]
>>> scratch-OST0011_UUID 5768577552 3789120072 1686427400   65% /lustre/scratch[OST:17]
>>> scratch-OST0012_UUID 5768577552 4023497048 1452052192   69% /lustre/scratch[OST:18]
>>> scratch-OST0013_UUID 5768577552 4133682544 1341866324   71% /lustre/scratch[OST:19]
>>> scratch-OST0014_UUID 5768577552 3690021408 1785527832   63% /lustre/scratch[OST:20]
>>> scratch-OST0015_UUID 5768577552 3891559096 1583990144   67% /lustre/scratch[OST:21]
>>> scratch-OST0016_UUID 5768577552 4404600712 1070948896   76% /lustre/scratch[OST:22]
>>> scratch-OST0017_UUID 5768577552 4792223084 683326528   83% /lustre/scratch[OST:23]
>>> scratch-OST0018_UUID 5768577552 4486070024 989478844   77% /lustre/scratch[OST:24]
>>> scratch-OST0019_UUID 5768577552 4471754448 1003795164   77% /lustre/scratch[OST:25]
>>> scratch-OST001a_UUID 5768577552 4517349052 958199536   78% /lustre/scratch[OST:26]
>>> scratch-OST001b_UUID 5768577552 3989325372 1486223000   69% /lustre/scratch[OST:27]
>>> scratch-OST001c_UUID 5768577552 4024754964 1450793904   69% /lustre/scratch[OST:28]
>>> scratch-OST001d_UUID 5768577552 3883873220 1591676392   67% /lustre/scratch[OST:29]
>>> scratch-OST001e_UUID 5768577552 4928383088 547166152   85% /lustre/scratch[OST:30]
>>> scratch-OST001f_UUID 5768577552 4291418836 1184130776   74% /lustre/scratch[OST:31]
>>>
>>> filesystem summary:  184594481664 134329681744 40887889340   72% /lustre/scratch
>>>
>>> [root@ra 18X11]# lfs df -i
>>> UUID                    Inodes     IUsed     IFree IUse% Mounted on
>>> home-MDT0000_UUID    1287101228   5716405 1281384823    0% /lustre/home[MDT:0]
>>> home-OST0000_UUID    366288896    871143 365417753    0% /lustre/home[OST:0]
>>> home-OST0001_UUID    366288896    900011 365388885    0% /lustre/home[OST:1]
>>> home-OST0002_UUID    366288896    804892 365484004    0% /lustre/home[OST:2]
>>> home-OST0003_UUID    366288896    836213 365452683    0% /lustre/home[OST:3]
>>> home-OST0004_UUID    366288896    836852 365452044    0% /lustre/home[OST:4]
>>> home-OST0005_UUID    366288896    850446 365438450    0% /lustre/home[OST:5]
>>>
>>> filesystem summary:  1287101228   5716405 1281384823    0% /lustre/home
>>>
>>> UUID                    Inodes     IUsed     IFree IUse% Mounted on
>>> scratch-MDT0000_UUID 1453492963 174078773 1279414190   11% /lustre/scratch[MDT:0]
>>> scratch-OST0000_UUID 337257280   6621404 330635876    1% /lustre/scratch[OST:0]
>>> scratch-OST0001_UUID 366288896   6697629 359591267    1% /lustre/scratch[OST:1]
>>> scratch-OST0002_UUID 366288896   5272904 361015992    1% /lustre/scratch[OST:2]
>>> scratch-OST0003_UUID 366288896   5161903 361126993    1% /lustre/scratch[OST:3]
>>> scratch-OST0004_UUID 366288896   5327683 360961213    1% /lustre/scratch[OST:4]
>>> scratch-OST0005_UUID 366288896   5582579 360706317    1% /lustre/scratch[OST:5]
>>> scratch-OST0006_UUID 285040431   5158974 279881457    1% /lustre/scratch[OST:6]
>>> scratch-OST0007_UUID 366288896   5307157 360981739    1% /lustre/scratch[OST:7]
>>> scratch-OST0008_UUID 366288896   5387313 360901583    1% /lustre/scratch[OST:8]
>>> scratch-OST0009_UUID 366288896   5426523 360862373    1% /lustre/scratch[OST:9]
>>> scratch-OST000a_UUID 366288896   5424803 360864093    1% /lustre/scratch[OST:10]
>>> scratch-OST000b_UUID 360664073   5122378 355541695    1% /lustre/scratch[OST:11]
>>> scratch-OST000c_UUID 353235316   5129413 348105903    1% /lustre/scratch[OST:12]
>>> scratch-OST000d_UUID 366288896   5053936 361234960    1% /lustre/scratch[OST:13]
>>> scratch-OST000e_UUID 222189585   5122229 217067356    2% /lustre/scratch[OST:14]
>>> scratch-OST000f_UUID 366288896   5281196 361007700    1% /lustre/scratch[OST:15]
>>> scratch-OST0010_UUID 366288896   5274738 361014158    1% /lustre/scratch[OST:16]
>>> scratch-OST0011_UUID 366288896   5409560 360879336    1% /lustre/scratch[OST:17]
>>> scratch-OST0012_UUID 366288896   5369406 360919490    1% /lustre/scratch[OST:18]
>>> scratch-OST0013_UUID 366288896   5502974 360785922    1% /lustre/scratch[OST:19]
>>> scratch-OST0014_UUID 366288896   5521406 360767490    1% /lustre/scratch[OST:20]
>>> scratch-OST0015_UUID 366288896   5550606 360738290    1% /lustre/scratch[OST:21]
>>> scratch-OST0016_UUID 345993048   4999552 340993496    1% /lustre/scratch[OST:22]
>>> scratch-OST0017_UUID 249051056   4963064 244087992    1% /lustre/scratch[OST:23]
>>> scratch-OST0018_UUID 325734426   5108454 320625972    1% /lustre/scratch[OST:24]
>>> scratch-OST0019_UUID 329427010   5222114 324204896    1% /lustre/scratch[OST:25]
>>> scratch-OST001a_UUID 317921820   5115591 312806229    1% /lustre/scratch[OST:26]
>>> scratch-OST001b_UUID 366288896   5353229 360935667    1% /lustre/scratch[OST:27]
>>> scratch-OST001c_UUID 366288896   5383473 360905423    1% /lustre/scratch[OST:28]
>>> scratch-OST001d_UUID 366288896   5411890 360877006    1% /lustre/scratch[OST:29]
>>> scratch-OST001e_UUID 216236615   6188887 210047728    2% /lustre/scratch[OST:30]
>>> scratch-OST001f_UUID 366288896   6465049 359823847    1% /lustre/scratch[OST:31]
>>>
>>> filesystem summary:  1453492963 174078773 1279414190   11% /lustre/scratch
>>>
>>>
>>> Thanks,
>>> Mike Robbert
>>>
>>> On Jan 11, 2010, at 7:24 PM, Andreas Dilger wrote:
>>>> On 2010-01-11, at 15:59, Michael Robbert wrote:
>>>>> There is nothing unique about the filename. I can create a file with the
>>>>> same name in another directory or on another Lustre filesystem. It is
>>>>> just this exact path on this filesystem. The full path is:
>>>>> /lustre/scratch/smoqbel/Cenval/CLM/Met.Forcing/18X11/NLDAS.APCP.007100.pfb.00164
>>>>> The mount point for this filesystem is /lustre/scratch/
>>>>
>>>> Michael,
>>>> does the same problem happen on multiple client nodes, or is it only
>>>> happening on a single client?  Are there any messages on the MDS and/or
>>>> the OSSes when this problem is happening?  This problem is somewhat
>>>> unusual, since I'm not aware of any places outside the disk filesystem
>>>> code that would cause ENOSPC when creating a file.
>>>>
>>>> Can you please do a bit of debugging on the system:
>>>>
>>>>     {client}# cd /lustre/scratch/smoqbel/Cenval/CLM/Met.Forcing/18X11
>>>> {mds,client}# echo -1 > /proc/sys/lustre/debug       # enable full debug
>>>> {mds,client}# lctl clear                             # clear debug logs
>>>>     {client}# touch NLDAS.APCP.007100.pfb.00164
>>>> {mds,client}# lctl dk > /tmp/debug.{mds,client}      # dump debug logs
>>>>
>>>> For now, please extract just the ENOSPC errors from the logs; that will
>>>> be much shorter, may be enough to identify where the problem is located,
>>>> and will be a lot friendlier to the list:
>>>>
>>>> grep -- "-28" /tmp/debug.{mds,client} > /tmp/debug-28.{mds,client}::
>>>>
>>>> along with the "lfs df" and "lfs df -i" output.
>>>>
>>>> If this is only on a single client, just dropping the locks on the
>>>> client might be enough to resolve the problem:
>>>>
>>>> for L in /proc/fs/lustre/ldlm/namespaces/*; do
>>>>   echo clear > $L/lru_size     # drop the cached locks in this namespace
>>>> done
>>>>
>>>> If, on the other hand, this same problem is happening on all clients
>>>> then the problem is likely on the MDS.
>>>>
>>>>>> On Fri, Jan 8, 2010 at 1:36 PM, Michael Robbert
>>>>>>
>>>>>> <mrobbert at mines.edu> wrote:
>>>>>>> I have a user that reported a problem creating a file on our
>>>>>>> Lustre filesystem. When I investigated I found that the problem
>>>>>>> appears to be unique to just one filename in one directory. I have
>>>>>>> tried numerous ways of creating the file, including echo, touch, and
>>>>>>> "lfs setstripe"; all return "No space left on device". I have checked
>>>>>>> the filesystem with df and "lfs df"; both show that the filesystem and
>>>>>>> all OSTs are far from full for both blocks and inodes. Files with
>>>>>>> slightly changed names are created fine. We had a kernel panic on the
>>>>>>> MDS yesterday, and it was quite possible that the user had a compute
>>>>>>> job working in this directory at the time of that problem. I am
>>>>>>> guessing we have some kind of corruption in the directory. This
>>>>>>> directory has around 1 million files, so moving the data around may
>>>>>>> not be a quick operation, but we're willing to do it. I just want to
>>>>>>> know the best way, short of taking the filesystem offline, to fix this
>>>>>>> problem.
>>>>>>>
>>>>>>> Any ideas? Thanks in advance,
>>>>>>> Mike Robbert
>>>>>
>>>>
>>>> Cheers, Andreas
>>>> --
>>>> Andreas Dilger
>>>> Sr. Staff Engineer, Lustre Group
>>>> Sun Microsystems of Canada, Inc.
>>>
>>>
>>
>>
>> --
>> Bernd Schubert
>> DataDirect Networks
>
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>



