[Lustre-discuss] Lustre 1.6.5.1 on X4200 and STK 6140 Issues
Malcolm Cowe
Malcolm.Cowe at Sun.COM
Mon Oct 6 02:58:28 PDT 2008
Hi Folks,
We are trying to create a small Lustre environment on behalf of a
customer. There are two X4200m2 MDS servers, both dual-attached to an STK
6140 array over FC. This is an active-passive arrangement with a single
shared volume; Heartbeat is used to co-ordinate file system failover.
There is a single X4500 OSS server, the storage for which is split into
6 OSTs. Finally, we have two X4600m2 clients, just for kicks.
All systems are connected over both Ethernet and InfiniBand, with the
IB network used for Lustre, and every system runs RHEL 4.5 AS. The
X4500 OST volumes are created using software RAID, while the X4200m2
MDT is accessed via DM Multipath. We downloaded the Lustre binary
packages from Sun's web site and installed them onto each of the
servers.
Unfortunately, the resulting system is very unstable and is prone to
lock-ups on the servers (uptimes are measured in hours). These lock-ups
happen without warning, and with very little, if any, debug information
in the system logs. We have also observed the servers locking up on
shutdown (kernel panics). Based on the documentation in the Lustre
operations manual, we installed the RPMs as follows:
rpm -Uvh --force e2fsprogs-1.40.7.sun3-0redhat.x86_64.rpm
rpm -ivh kernel-lustre-smp-2.6.9-67.0.7.EL_lustre.1.6.5.1.x86_64.rpm
rpm -ivh kernel-lustre-source-2.6.9-67.0.7.EL_lustre.1.6.5.1.x86_64.rpm
rpm -ivh lustre-modules-1.6.5.1-2.6.9_67.0.7.EL_lustre.1.6.5.1smp.x86_64.rpm   # (many "unknown symbol" warnings)
rpm -ivh lustre-1.6.5.1-2.6.9_67.0.7.EL_lustre.1.6.5.1smp.x86_64.rpm
rpm -ivh lustre-source-1.6.5.1-2.6.9_67.0.7.EL_lustre.1.6.5.1smp.x86_64.rpm
rpm -ivh lustre-ldiskfs-3.0.4-2.6.9_67.0.7.EL_lustre.1.6.5.1smp.x86_64.rpm   # (many "unknown symbol" warnings)
mv /etc/init.d/openibd /etc/init.d/openibd.rhel4default
rpm -ivh --force kernel-ib-1.3-2.6.9_67.0.7.EL_lustre.1.6.5.1smp.x86_64.rpm
cp /etc/init.d/openibd /etc/init.d/openibd.lustre.1.6.5.1
We then reboot the system into the Lustre-patched kernel and install
the Voltaire OFED software:
1. Copy the kernel config used to build the Lustre patched kernel
into the Lustre kernel source tree:
cp /boot/config-2.6.9-67.0.7.EL_lustre.1.6.5.1smp \
/usr/src/linux-2.6.9-67.0.7.EL_lustre.1.6.5.1/.config
2. Change into the Lustre kernel source tree and edit the Makefile,
changing the "custom" suffix to "smp" in the "EXTRAVERSION" variable.
3. Still in the Lustre kernel source tree, run these setup commands:
make oldconfig || make menuconfig
make include/asm
make include/linux/version.h
make SUBDIRS=scripts
4. Change into the "-obj" directory and run these setup commands:
cd /usr/src/linux-2.6.9-67.0.7.EL_lustre.1.6.5.1-obj/x86_64/smp
ln -s /usr/src/linux-2.6.9-67.0.7.EL_lustre.1.6.5.1/include .
5. Unpack the Voltaire OFED tar-ball:
tar zxf VoltaireOFED-5.1.3.1_5.tgz
6. Change to the unpacked software directory and run the installation
script. To build the OFED packages with the Voltaire certified
configuration, run the following commands:
cd VoltaireOFED-5.1.3.1_5
./install.pl -c ofed.conf.Volt
7. Once complete, reboot.
8. Configure any IPoIB interfaces as required.
9. Add the following into /etc/modprobe.conf:
options lnet networks="o2ib0(ib0)"
10. Load the Lustre LNET kernel module.
modprobe lnet
11. Start the Lustre core networking service.
lctl network up
12. Check the system log (/var/log/messages) for confirmation.
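As a sanity check at this point, the LNet state can also be queried directly from the shell. This is a hedged sketch: the NIDs and the peer hostname (mds-2, taken from the failover setup below) are site-specific:

```shell
# Confirm the local NID is on the o2ib network configured in modprobe.conf
lctl list_nids          # expect an address of the form <ib0-ip>@o2ib

# Ping a peer's NID to confirm LNet connectivity over IB
lctl ping mds-2@o2ib0
```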
Create the MGS/MDT Lustre Volume:
1. Format the MGS/MDT device.
mkfs.lustre [ --reformat ] --fsname lfs01 --mdt --mgs \
    --failnode=mds-2@o2ib0 /dev/dm-0
2. Create the MGS/MDT file system mount point.
mkdir -p /lustre/mdt/lfs01
3. Mount the file system. This will initiate MGS and MDT services for
Lustre.
mount -t lustre /dev/dm-0 /lustre/mdt/lfs01
With the exception of the OST volume creation, we use an equivalent
process to bring the OSS online.
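For completeness, the OST creation on the X4500 looks something like the following. This is a sketch only: the MD device name, mount point, and the MDS hostnames (mds-1/mds-2) are assumptions based on the configuration described above:

```shell
# Format one OST for file system lfs01, registering it with the MGS on
# the MDS failover pair (both nodes listed so the OST can find either)
mkfs.lustre --fsname lfs01 --ost \
    --mgsnode=mds-1@o2ib0 --mgsnode=mds-2@o2ib0 /dev/md0

# Mount the OST volume to start the OST service
mkdir -p /lustre/ost/lfs01-ost0
mount -t lustre /dev/md0 /lustre/ost/lfs01-ost0
```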
The cabling has been checked and verified. We then re-built the system
from scratch and applied only Sun's RDAC modules and Voltaire OFED to
the stock RHEL 4.5 kernel (2.6.9-55.ELsmp). We removed the second MDS
from the h/w configuration and did not install Heartbeat. The shared
storage was re-formatted as a regular EXT3 file system using the DM
multipathing device, /dev/dm-0, and mounted on the host. Running I/O
tests against the mounted file system over an extended period did not
elicit a single error or warning in the log relating to the
multipathing or the SCSI device.
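The multipath state can also be inspected directly while the I/O tests run. A sketch; the exact output depends on the RDAC and multipathd versions in use:

```shell
# Show the multipath topology for the shared 6140 volume:
# both paths should be listed, with states "active" and "ghost"
# for an active-passive array
multipath -ll

# Watch the kernel log for path failover or SCSI errors during the tests
tail -f /var/log/messages | grep -iE 'multipath|scsi'
```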
Once we were confident that the system was running in a consistent and
stable manner, we re-installed the Lustre packages, omitting the
kernel-ib packages. We had to re-build and re-install the RDAC support
as well. This means that the system has support for the Lustre file
system but no infiniband support at all. /etc/modprobe.conf is updated
such that the lnet networks option is set to "tcp". The MDS/MGS volume
is recreated on the DM device.
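For the Ethernet-only runs, the LNet line in /etc/modprobe.conf becomes the following (eth0 is an assumption; substitute the actual interface):

```
options lnet networks="tcp0(eth0)"
```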
We have tried the following configurations on the X4200m2:
* RHEL vanilla kernel, multipathd, RDAC. EXT3 file system. PASSED.
* RHEL vanilla kernel, multipathd, RDAC, Voltaire OFED. EXT3 file
system. PASSED.
* Lustre-supplied kernel, Lustre software. No IB. MDS/MGS file
system. FAILED.
* Lustre-supplied kernel, Lustre software, RDAC. No IB. MDS/MGS file
system (full Lustre FS over Ethernet). FAILED.
* Lustre-supplied kernel, Lustre software, RDAC, Voltaire OFED.
EXT3 file system. FAILED.
* Lustre-supplied kernel, Lustre software, RDAC, Voltaire OFED.
MDS/MGS file system (full Lustre FS over IB). FAILED.
Our findings indicate that there is a problem within the binary
distribution of Lustre. This may be because we are applying the
2.6.9-67 RHEL kernel to a platform based on 2.6.9-55, or it may be a
more subtle issue in the interaction with the underlying hardware. We
could use some advice on how best to proceed, since our deadline is
fast approaching. For example, is our build process, as documented
above, clean? Currently, we're looking at building from source to see
if this results in a more stable environment.
Regards,
Malcolm.
--
*Malcolm Cowe*
/Solutions Integration Engineer/
*Sun Microsystems, Inc.*
Blackness Road
Linlithgow, West Lothian EH49 7LR UK
Phone: x73602 / +44 1506 673 602
Email: Malcolm.Cowe at Sun.COM