[Lustre-discuss] OST node filling up and aborting write
Nick Jennings
nick at creativemotiondesign.com
Sat Feb 28 06:22:44 PST 2009
Hi Brian,
(Thanks for pointing out the -1 as opposed to 1, I missed that)
Brian J. Murrell wrote:
> On Sat, 2009-02-28 at 02:34 +0100, Nick Jennings wrote:
>> Hi Everyone,
>
> Hi Nick,
>
>> I've got 4 OSTs (each 2gigs in size) on one lustre file system. I dd a
>> 4 gig file to the filesystem and after the first OST fills up, the write
>> fails (not enough space on device):
>
> Writes do not "cascade" over to another OST when one fills up.
I see. I guess I have misunderstood the way striping works.
If you set stripesize=1MB and stripecount=-1, then I would assume
this means: split each write into 1MB chunks and stripe them across all
OSTs. By "write" I mean one single file being written to disk. I've
read over Chapter 25 as well, but it doesn't seem to clarify this for me
(I'm probably letting something fly over my head).
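To make that reading of striping concrete, here is a minimal sketch of round-robin (RAID-0 style) chunk placement in plain shell arithmetic. It is only an illustration: the function name stripe_index is made up for this example, and it assumes stripecount=-1 resolves to all 4 OSTs. It reports the stripe number within the file, not the actual OST index, which also depends on where the layout starts.

```shell
#!/bin/sh
# stripe_index OFFSET STRIPE_SIZE STRIPE_COUNT
# Hypothetical helper: prints which stripe (0-based) of a file holds the
# byte at OFFSET, assuming plain round-robin striping.
stripe_index() {
    echo $(( ($1 / $2) % $3 ))
}

STRIPE_SIZE=1048576   # 1 MB, as in the post
STRIPE_COUNT=4        # assumption: -1 resolves to all 4 OSTs

# Show where the first six 1 MB chunks of a file would land.
chunk=0
while [ "$chunk" -lt 6 ]; do
    offset=$(( chunk * STRIPE_SIZE ))
    echo "chunk $chunk (offset $offset) -> stripe $(stripe_index "$offset" "$STRIPE_SIZE" "$STRIPE_COUNT")"
    chunk=$(( chunk + 1 ))
done
```

Under these assumptions, chunks 0 to 3 go to stripes 0 to 3 and chunk 4 wraps back to stripe 0, so a file striped this way would grow on all four OSTs at roughly the same rate.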
>> I initially thought this could be solved by enabling striping, but from
>> HowTo (which doesn't say much on the subject admittedly) I gathered
>> striping was already enabled?
>
> No. By default, stripecount == 1. In order to get a single file onto
> multiple OSTs you will need to explicitly set a striping policy either
> on the file you are going to write into or the directory the file is in.
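As a rough sketch of what setting a policy on a directory could look like, the snippet below uses the lfs utility's setstripe and getstripe subcommands. The directory /mnt/testfs/striped and both function names are hypothetical, chosen only for this example. The -c option is assumed to be available in the installed release; option spelling has varied between Lustre versions, so check lfs help on the client first.

```shell
#!/bin/sh
# Hypothetical directory on the mounted client; not from the original post.
DIR=${DIR:-/mnt/testfs/striped}

stripe_dir_across_all_osts() {
    # -c -1 asks for new files in the directory to be striped over all OSTs.
    mkdir -p "$1" && lfs setstripe -c -1 "$1"
}

show_layout() {
    # Prints the striping information recorded for a file or directory.
    lfs getstripe "$1"
}

# Only act when the lfs utility is present, i.e. on a Lustre client.
if command -v lfs >/dev/null 2>&1; then
    stripe_dir_across_all_osts "$DIR" && show_layout "$DIR"
else
    echo "lfs not found; run this on a Lustre client"
fi
```

Files created inside that directory afterwards should inherit the policy. Running lfs getstripe on an existing file such as the dd target shows which layout it actually received.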
Then what is stripecount=-1 used for (when specified for the filesystem,
and not for a file or a directory)? Can you give me an example?
--
Write Test #2
--
# lctl conf_param testfs-MDT0000.lov.stripecount=-1
/proc/fs/lustre/lov/testfs-clilov-c464c000/stripecount:-1
/proc/fs/lustre/lov/testfs-clilov-c464c000/stripeoffset:0
/proc/fs/lustre/lov/testfs-clilov-c464c000/stripesize:1048576
/proc/fs/lustre/lov/testfs-clilov-c464c000/stripetype:1
/proc/fs/lustre/lov/testfs-mdtlov/stripecount:-1
/proc/fs/lustre/lov/testfs-mdtlov/stripeoffset:0
/proc/fs/lustre/lov/testfs-mdtlov/stripesize:1048576
/proc/fs/lustre/lov/testfs-mdtlov/stripetype:1
# dd if=/dev/zero of=/mnt/testfs/testfile1 bs=4096 count=614400
dd: writing `/mnt/testfs/testfile1': No space left on device
437506+0 records in
437505+0 records out
1792020480 bytes (1.8 GB) copied, 52.5727 seconds, 34.1 MB/s
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda1              15G  7.7G  5.9G  57% /
tmpfs                 252M     0  252M   0% /dev/shm
/dev/hda5             4.1G  198M  3.7G   6% /mnt/lustre/mdt
/dev/hda6             1.9G  1.8G   68K 100% /mnt/lustre/ost0
/dev/hda7             1.9G   80M  1.7G   5% /mnt/lustre/ost1
/dev/hda8             1.9G   80M  1.7G   5% /mnt/lustre/ost2
/dev/hda9             1.9G   80M  1.7G   5% /mnt/lustre/ost3
192.168.0.149@tcp0:/testfs
                      7.4G  2.0G  5.1G  29% /mnt/testfs
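The figures in the dd and df output above can be cross-checked with a little shell arithmetic. This is only a sanity check on the numbers already shown. The variable names are made up for the example.

```shell
#!/bin/sh
BS=4096                # dd block size from the command line
COUNT=614400           # blocks requested
WRITTEN_BLOCKS=437505  # "records out" reported by dd

requested=$(( BS * COUNT ))
written=$(( BS * WRITTEN_BLOCKS ))

echo "requested: $requested bytes ($(( requested / 1048576 )) MiB)"
echo "written:   $written bytes ($(( written / 1048576 )) MiB)"
```

The write asked for about 2400 MiB and stopped after about 1709 MiB. That lines up with ost0 alone being full (1.8G used of 1.9G) while the other three OSTs stay at 80M, so the whole file appears to have landed on a single OST.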
Thanks for your help,
-Nick