[Lustre-discuss] lustre no longer allows reads/writes (stopped working)?

Arden Wiebe albert682 at yahoo.com
Fri Jan 30 15:15:26 PST 2009

>I have set up a lustre system for testing consisting of four OSTs and one
>MDT. It seems to work fine for about a day. At the end of about 24 hours,
>the clients can no longer read or write the mount point (although a file
>listing (ls) works). 

That is the problem. Your clients are mounting incorrectly, and the nodes were formatted incorrectly.

>For example, a mkdir yields a "cannot create directory
>'/datafs/temp': Identifier removed", and the temp dir does not exist. 
>A file listing of the /datafs directory comes back complete and correct,
>but if I try to ls a subdirectory it gives me the error "ls: /datafs/test2:
>Identifier removed".

Please review, via your bash history, the exact commands you used to make the underlying filesystem. Be certain everything points to the correct filesystem and to the correct directories.

>The client is mounting the dir to /datafs. This worked fine earlier, I left
>for the day, came back in and this error is occurring on all clients (albeit
>I only have three clients for testing). All clients/servers are running
>RHEL5, and the lustre was installed via rpms as per the manual. 

If you followed the manual 100% (which takes practice), the client should be mounting the filesystem from your combined MDT/MGS node, using that node's IP address over your network (for example tcp0), onto a local mount point such as /mnt/datafs.
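
As a rough sketch of what such a client mount looks like, assuming an MGS NID of 192.168.0.7@tcp0 (the address used in the command history further down) and a filesystem named datafs. The helper name client_mount_cmd is made up for illustration, and it only prints the command so it can be checked before it is run:

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the client mount command.
# Arguments: MGS NID, filesystem name, local mount point.
client_mount_cmd() {
    mgs_nid=$1
    fsname=$2
    mountpoint=$3
    # One colon and one slash sit between the NID and the filesystem name.
    printf 'mount -t lustre %s:/%s %s\n' "$mgs_nid" "$fsname" "$mountpoint"
}

# Example values only; substitute your own NID, fsname, and mount point.
client_mount_cmd 192.168.0.7@tcp0 datafs /mnt/datafs
```

The printed line can be pasted into a root shell on the client once the values look right.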

I found that it helps to change the filesystem name used in the manual's examples to something other than datafs, testfs, or spfs. In your case I would recommend the word litefs. There are also some ambiguities with slashes in the examples and, I might add, with the use or misuse of the = sign after fsname.
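
When picking a new name, keep in mind that Lustre limits the filesystem name to 8 characters. A minimal sketch of a sanity check follows; valid_fsname is a hypothetical helper, and restricting the name to letters, digits, and underscores is a conservative assumption rather than the exact rule Lustre enforces:

```shell
#!/bin/sh
# Hypothetical check: succeeds if the name is 1-8 characters long and
# made only of letters, digits, or underscores (character set assumed).
valid_fsname() {
    case $1 in
        '' | *[!A-Za-z0-9_]*) return 1 ;;
    esac
    [ "${#1}" -le 8 ]
}

# Example names only.
for name in litefs ioio averylongname; do
    if valid_fsname "$name"; then
        echo "$name: ok"
    else
        echo "$name: rejected"
    fi
done
```

The same name then has to be used consistently: as --fsname=litefs at format time and as :/litefs in the client mount.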

By far the best example is further into the manual, in the part about mounting external journals. Also, from everything I have read, it is best to keep the MGS and MDT separate. Otherwise your combined MDT/MGS node must have two mount points, /mnt/mgs and /mnt/data/mdt.

>Out of curiosity, if I go to the server and do an ls on /mnt/data/mdt or
>to the OST server and do an ls on /mnt/data/ost1, I get an error that
>it is not a directory (although that could be normal, I am not sure). 

Yes, that is normal, because those are mount points for Lustre targets, not ordinary directories you can browse.

>A cat of /proc/fs/lustre/devices on the mdt does not show anything out of place
>(or at least, it is the same as when I started the lustre and mounted
>the servers/clients) 

So we assume your combined MDT/MGS is up and running, but is it formatted properly and mounted properly?
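
One way to check the "mounted properly" half is to list what the kernel reports as mounted with type lustre. This is only a sketch; lustre_mounts is a hypothetical helper that reads a mounts table in /proc/mounts format:

```shell
#!/bin/sh
# Hypothetical helper: prints "device mountpoint" for every entry of
# type lustre in a mounts table (defaults to /proc/mounts).
lustre_mounts() {
    awk '$3 == "lustre" { print $1, $2 }' "${1:-/proc/mounts}"
}

# Only query the live table where it exists.
if [ -r /proc/mounts ]; then
    lustre_mounts
fi
```

On a server you would expect each target device to show up against the mount point you intended, for example /dev/md0 on /mnt/data/mdt. Running lctl dl afterwards lists the Lustre devices that are actually set up.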

>I have configured it all according to
>http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustreExamples.html#50548848_pgfId-1286919
>as per section 6.1.1.2 Configuration Generation and Application, using one server
>for the MGT and MDS, and I have four OSTs, just like the example.

>Has anyone seen this before? 

Yes, and it is common until you have had enough practice creating Lustre filesystems to know which formatting and mounting steps interact to make a live filesystem, and have settled on a procedure you know to be sound.

Robert, to simplify things I'll include some of my .bash_history from the nodes for you to examine. This should considerably shorten your initial configuration time.

My configuration differs in that I opt for separate MGS and MDT nodes. This first history is from the MGS.

umount /mnt/mgs
mdadm -S /dev/md2
mdadm -S /dev/md1
mdadm -S /dev/md0
mdadm --zero-superblock /dev/sdb
mdadm --zero-superblock /dev/sdc
mdadm --zero-superblock /dev/sdd
mdadm --zero-superblock /dev/sde
mdadm --zero-superblock /dev/sdf
mdadm -v --create --assume-clean /dev/md0 --level=raid10 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
sfdisk -uC /dev/sdf << EOF
mke2fs -b 4096 -O journal_dev /dev/sdf1
cat /proc/mdstat
mkfs.lustre --mgs --fsname=ioio --mkfsoptions="-J device=/dev/sdf1" --reformat /dev/md0
rm /etc/mdadm.conf
mdadm --detail --scan --verbose > /etc/mdadm.conf
mount -t lustre /dev/md0 /mnt/mgs
e2label /dev/md0
vi /etc/fstab
e2label /dev/md0
cat /proc/mdstat
mount -t lustre 192.168.0.7@tcp0:/ioio /mnt/ioio
lctl dl
lfs df -h

This shows a single MGS with an external journal on /dev/sdf1. The MGS lives on the /dev/md0 device and is mounted on /mnt/mgs. Its e2label is MGS, so the /etc/fstab entry starts with LABEL=MGS, followed by the mount point and mount options. You can also see that I connect a client to the MGS to test the filesystem, but only after the MDT and the OSSs are mounted.
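
For illustration, an /etc/fstab entry of that shape could be generated as below. fstab_line is a hypothetical helper, and the options defaults,_netdev are an assumption on my part, not copied from a working fstab; check the manual for the options your version expects:

```shell
#!/bin/sh
# Hypothetical helper: prints an /etc/fstab line for a Lustre target
# identified by its label. The mount options are an assumption.
# Arguments: label, mount point.
fstab_line() {
    printf 'LABEL=%s\t%s\tlustre\tdefaults,_netdev\t0 0\n' "$1" "$2"
}

# Example for the MGS above; review the line before adding it to /etc/fstab.
fstab_line MGS /mnt/mgs
```

The label passed in should be whatever e2label reports for the device once it has been mounted for the first time.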

On the MDT

umount /mnt/data/mdt
mdadm -S /dev/md2
mdadm -S /dev/md0
mdadm -S /dev/md1
mdadm --zero-superblock /dev/sdb
mdadm --zero-superblock /dev/sdc
mdadm --zero-superblock /dev/sdd
mdadm --zero-superblock /dev/sde
mdadm --zero-superblock /dev/sdf
mdadm -v --create --assume-clean /dev/md0 --level=raid10 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
sfdisk -uC /dev/sdf << EOF
mke2fs -b 4096 -O journal_dev /dev/sdf1
cat /proc/mdstat
mkfs.lustre --mdt --fsname=ioio --mgsnode=192.168.0.7@tcp0 --mkfsoptions="-J device=/dev/sdf1" --reformat /dev/md0
mount -t lustre /dev/md0 /mnt/data/mdt
rm /etc/mdadm.conf
mdadm --detail --scan --verbose > /etc/mdadm.conf
e2label /dev/md0
vi /etc/fstab
shutdown -r now

When this MDT comes back online, your filesystems should be mounted correctly, which you can confirm with lctl dl.

And here is a typical OST. Choose whatever RAID level you require.

umount /mnt/data/ost0
cat /proc/mdstat
mdadm -S /dev/md0
mdadm --zero-superblock /dev/sdb
mdadm --zero-superblock /dev/sdc
mdadm --zero-superblock /dev/sdd
mdadm --zero-superblock /dev/sde
mdadm --zero-superblock /dev/sdf
mdadm --zero-superblock /dev/sdg
mdadm --zero-superblock /dev/sdh
mdadm -v --create --assume-clean /dev/md0 --level=raid10 --raid-devices=6 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh
cat /proc/mdstat
sfdisk -uC /dev/sdb << EOF
mke2fs -b 4096 -O journal_dev /dev/sdb1
mkfs.lustre --ost --fsname=ioio --mgsnode=192.168.0.7@tcp0 --mkfsoptions="-J device=/dev/sdb1" --reformat /dev/md0
mount -t lustre /dev/md0 /mnt/data/ost0
rm /etc/mdadm.conf
mdadm --detail --scan --verbose > /etc/mdadm.conf
e2label /dev/md0
vi /etc/fstab
cat /proc/mdstat
shutdown -r now

When this box comes back up, the newly formatted OST should be mounted. If it is not, the label in your /etc/fstab is probably wrong. This does happen: as the manual mentions, e2label won't report the correct label until the device has been mounted for the first time.
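
A small sketch of how one might sanity-check a label before putting it in /etc/fstab. label_state is a hypothetical helper. It assumes labels of the form fsname-OSTxxxx or fsname-MDTxxxx with a four-digit hex index, and that an index of ffff means the target has not yet been mounted and registered; both are assumptions to verify against your own e2label output:

```shell
#!/bin/sh
# Hypothetical check: classify a target label for a given filesystem.
# Arguments: filesystem name, label as reported by e2label.
# Assumes <fsname>-OSTxxxx / <fsname>-MDTxxxx labels and that "ffff"
# marks an index not yet assigned by a first mount.
label_state() {
    fsname=$1
    label=$2
    case $label in
        "$fsname"-OSTffff) echo "unregistered" ;;
        "$fsname"-MDTffff) echo "unregistered" ;;
        "$fsname"-OST[0-9a-f][0-9a-f][0-9a-f][0-9a-f]) echo "ok" ;;
        "$fsname"-MDT[0-9a-f][0-9a-f][0-9a-f][0-9a-f]) echo "ok" ;;
        *) echo "unexpected" ;;
    esac
}

# Example labels only; on a real node pass "$(e2label /dev/md0)" instead.
label_state ioio ioio-OST0000
label_state ioio ioio-OSTffff
label_state ioio datafs-OST0000
```

A result of "unexpected" for a label carrying a different filesystem name usually means the device was formatted with a different --fsname than the one you are mounting.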

Robert, I hope this helps to speed up your test deployment. It will probably take you two or three attempts to get a viable filesystem, with all the variables in play and your naming conventions. Eventually you will end up wanting external journals as laid out above. Also be sure to follow your directory naming conventions right through. For example, if you mount the OST (the /dev/md0 device) on /mnt/data/ost0, don't shorten that path elsewhere, as I suspect you have on your OST mounts.



Robert
_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss