[Lustre-discuss] lustre error following SAN outage

Andreas Dilger adilger at sun.com
Wed Feb 4 01:54:28 PST 2009


On Feb 04, 2009  16:22 +1000, Marcus Schull wrote:
> We have been using Lustre for a few months now to serve a few TB of
> data to multiple client computers.  Earlier today, as I created some
> new volumes on a Sun StorageTek 6140 FC disk controller, some FC
> connections appear to have dropped briefly, causing I/O errors on the
> Lustre server (which is actually acting as MGS, MDS and OSS).  We are
> hoping to move to a more robust architecture with separate nodes,
> failover, etc. in the near future.
> Having said that, we have been running the current setup (on RHEL 5.2  
> 64bit with Lustre 1.6.5.1) for a few months without issue.
> 
> While the initial cause of the I/O errors has passed (a change in the
> disk configuration exported by the 6140 presumably triggered some
> kind of SAN outage, which we have raised with Sun), we have been
> getting the error below at least once a second since then:
> 
> lustre1 kernel: LustreError: 9643:0:(filter_io_26.c:707:filter_commitrw_write()) error starting transaction: rc = -30

Lustre generally reports the standard Linux/POSIX error codes, so -30
is -EROFS (per /usr/include/asm-generic/errno-base.h; it would be a
lot easier to look up if the definitions were kept in the top-level
/usr/include/errno.h header).
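
For anyone who would rather not grep the headers, here is a minimal
sketch of the same lookup using Python's standard errno module. It
assumes a Linux host, where the numbering matches the kernel headers.
The helper names rc_from_log and describe_rc are made up for this
example; the sample line is the one quoted above, rejoined onto one line.

```python
import errno
import os
import re

# Sample line copied from the report above (rejoined onto one line).
LOG_LINE = (
    "lustre1 kernel: LustreError: 9643:0:(filter_io_26.c:707:"
    "filter_commitrw_write()) error starting transaction: rc = -30"
)


def rc_from_log(line):
    """Return the integer after 'rc = ' in a log line, or None if absent."""
    match = re.search(r"rc = (-?\d+)", line)
    return int(match.group(1)) if match else None


def describe_rc(rc):
    """Turn a negative kernel-style return code into '-NAME (number): message'."""
    code = abs(rc)
    # "UNKNOWN" is only a fallback label for this sketch, not a real errno name.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"-{name} ({code}): {os.strerror(code)}"


print(describe_rc(rc_from_log(LOG_LINE)))
```

On Linux this should name the code as EROFS; the message text after the
colon comes from the local C library.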

In any case, this means the filesystem backing your OST has remounted
itself read-only after seeing some underlying error, in order to avoid
corruption.  You need to unmount and remount the OST to clear it.

> [root@sarton srs]# touch anewfile
> touch: setting times of `anewfile': Read-only file system

This is "-EROFS" as mentioned:
#define EROFS           30      /* Read-only file system */
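
If you want to check for this condition from a script rather than by
hand, something along these lines may do: a rough sketch that writes a
small temporary file into a directory and reports whether the attempt
fails with EROFS. The function name is_read_only is invented for this
example. Exactly where a Lustre client surfaces the error (create,
write or fsync) is an assumption here, so the probe tries all three.

```python
import errno
import os
import tempfile


def is_read_only(directory):
    """Return True if a small test write in `directory` fails with EROFS."""
    try:
        # The temporary file is removed automatically when the block exits.
        with tempfile.NamedTemporaryFile(dir=directory) as probe:
            probe.write(b"x")
            probe.flush()
            os.fsync(probe.fileno())  # push the data out so errors surface
    except OSError as exc:
        if exc.errno == errno.EROFS:
            return True
        raise  # any other failure (permissions, ENOSPC, ...) is not ours to hide
    return False


# Example: probe the system temporary directory.
print(is_read_only(tempfile.gettempdir()))
```

Other errors are re-raised on purpose, so a permissions problem is not
mistaken for a read-only filesystem.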

> Secondly, is there an fsck.lustre command that we could/should run
> following situations where I/O errors are known to have occurred?

That would be "fsck.ext3", as supplied by the e2fsprogs package on
the Sun download site.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



