[Lustre-discuss] Plateau around 200MiB/s bond0
Arden Wiebe
albert682 at yahoo.com
Sat Jan 24 18:04:21 PST 2009
1-2948-SFP Plus Baseline 3Com Switch
1-MGS bond0(eth0,eth1,eth2,eth3,eth4,eth5) raid1
1-MDT bond0(eth0,eth1,eth2,eth3,eth4,eth5) raid1
2-OSS bond0(eth0,eth1,eth2,eth3,eth4,eth5) raid6
1-MGS-CLIENT bond0(eth0,eth1,eth2,eth3,eth4,eth5)
1-CLIENT bond0(eth0,eth1)
1-CLIENT eth0
1-CLIENT eth0
So far I have failed to create an external journal for the MDT, MGS and the two OSSs. How do I add the external journal to /etc/fstab? Specifically, do I use the output of e2label /dev/sdb, and followed by what fstab options?
[root@lustreone ~]# cat /proc/fs/lustre/devices
0 UP mgs MGS MGS 17
1 UP mgc MGC192.168.0.7@tcp 876c20af-aaec-1da0-5486-1fc61ec8cd15 5
2 UP lov ioio-clilov-ffff810209363c00 7307490a-4a12-4e8c-56ea-448e030a82e4 4
3 UP mdc ioio-MDT0000-mdc-ffff810209363c00 7307490a-4a12-4e8c-56ea-448e030a82e4 5
4 UP osc ioio-OST0000-osc-ffff810209363c00 7307490a-4a12-4e8c-56ea-448e030a82e4 5
5 UP osc ioio-OST0001-osc-ffff810209363c00 7307490a-4a12-4e8c-56ea-448e030a82e4 5
[root@lustreone ~]# lfs df -h
UUID                  bytes    Used  Available  Use%  Mounted on
ioio-MDT0000_UUID    815.0G  534.0M     767.9G    0%  /mnt/ioio[MDT:0]
ioio-OST0000_UUID      3.6T   28.4G       3.4T    0%  /mnt/ioio[OST:0]
ioio-OST0001_UUID      3.6T   18.0G       3.4T    0%  /mnt/ioio[OST:1]
filesystem summary:    7.2T   46.4G       6.8T    0%  /mnt/ioio
[root@lustreone ~]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.2.4 (January 28, 2008)
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
802.3ad info
LACP rate: slow
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 1
Actor Key: 17
Partner Key: 1
Partner Mac Address: 00:00:00:00:00:00
Slave Interface: eth0
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:1b:21:28:77:db
Aggregator ID: 1
Slave Interface: eth1
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:1b:21:28:77:6c
Aggregator ID: 2
Slave Interface: eth3
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:22:15:06:3a:94
Aggregator ID: 3
Slave Interface: eth2
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:22:15:06:3a:93
Aggregator ID: 4
Slave Interface: eth4
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:22:15:06:3a:95
Aggregator ID: 5
Slave Interface: eth5
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:22:15:06:3a:96
Aggregator ID: 6
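In the output above, the active aggregator reports "Number of ports: 1" and a partner MAC address of all zeros, and every slave carries a different aggregator ID. With 802.3ad, that pattern typically means the bond is not receiving LACP replies from the switch ports, so only the slave in the active aggregator carries traffic. A minimal sketch for counting how many slaves share the active aggregator, assuming the usual /proc/net/bonding layout (the function name count_active_slaves is made up for illustration):

```shell
# Count the slaves whose "Aggregator ID" matches the active aggregator.
# Reads /proc/net/bonding/bond0 by default; pass another file to override.
count_active_slaves() {
    awk '
        /^Active Aggregator Info:/ { in_active = 1 }
        /^Slave Interface:/        { in_active = 0; in_slave = 1 }
        /^[ \t]*Aggregator ID:/ {
            if (in_active)
                active = $NF              # ID of the active aggregator
            else if (in_slave && $NF == active)
                n++                       # this slave is in the active aggregator
        }
        END { print n + 0 }
    ' "${1:-/proc/net/bonding/bond0}"
}
```

Run against the listing above, this would report 1 of the 6 slaves, which agrees with the "Number of ports: 1" line.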
[root@lustreone ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb[0] sdc[1]
976762496 blocks [2/2] [UU]
unused devices: <none>
[root@lustreone ~]# cat /etc/fstab
LABEL=/ / ext3 defaults 1 1
tmpfs /dev/shm tmpfs defaults 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
LABEL=MGS /mnt/mgs lustre defaults,_netdev 0 0
192.168.0.7@tcp0:/ioio /mnt/ioio lustre defaults,_netdev,noauto 0 0
[root@lustreone ~]# ifconfig
bond0 Link encap:Ethernet HWaddr 00:1B:21:28:77:DB
inet addr:192.168.0.7 Bcast:192.168.0.255 Mask:255.255.255.0
inet6 addr: fe80::21b:21ff:fe28:77db/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:9000 Metric:1
RX packets:5457486 errors:0 dropped:0 overruns:0 frame:0
TX packets:4665580 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:12376680079 (11.5 GiB) TX bytes:34438742885 (32.0 GiB)
eth0 Link encap:Ethernet HWaddr 00:1B:21:28:77:DB
inet6 addr: fe80::21b:21ff:fe28:77db/64 Scope:Link
UP BROADCAST RUNNING SLAVE MULTICAST MTU:9000 Metric:1
RX packets:3808615 errors:0 dropped:0 overruns:0 frame:0
TX packets:4664270 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:12290700380 (11.4 GiB) TX bytes:34438581771 (32.0 GiB)
Base address:0xec00 Memory:febe0000-fec00000
From what I have read, not having an external journal configured for the OSTs is a sure recipe for slowness, which I would rather not have, considering the goal is around 350MiB/s or more, which should be obtainable.
Here is how I formatted the raid6 device on both OSSs, which are identical:
[root@lustrefour ~]# fdisk -l
Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 121601 976760001 83 Linux
Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdb doesn't contain a valid partition table
Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdc doesn't contain a valid partition table
Disk /dev/sdd: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdd doesn't contain a valid partition table
Disk /dev/sde: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sde doesn't contain a valid partition table
Disk /dev/sdf: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdf doesn't contain a valid partition table
Disk /dev/sdg: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdg doesn't contain a valid partition table
Disk /dev/sdh: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdh doesn't contain a valid partition table
Disk /dev/md0: 4000.8 GB, 4000819183616 bytes
2 heads, 4 sectors/track, 976762496 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Disk /dev/md0 doesn't contain a valid partition table
[root@lustrefour ~]#
[root@lustrefour ~]# mdadm --create --assume-clean /dev/md0 --level=6 --chunk=128 --raid-devices=6 /dev/sd[cdefgh]
[root@lustrefour ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdc[0] sdh[5] sdg[4] sdf[3] sde[2] sdd[1]
3907049984 blocks level 6, 128k chunk, algorithm 2 [6/6] [UUUUUU]
in: 16674 reads, 16217479 writes; out: 3022788 reads, 32865192 writes
7712698 in raid5d, 8264 out of stripes, 25661224 handle called
reads: 0 for rmw, 1710975 for rcw. zcopy writes: 4864584, copied writes: 16115932
0 delayed, 0 bit delayed, 0 active, queues: 0 in, 0 out
0 expanding overlap
unused devices: <none>
Followed with:
[root@lustrefour ~]# mkfs.lustre --ost --fsname=ioio --mgsnode=192.168.0.7@tcp0 --mkfsoptions="-J device=/dev/sdb1" --reformat /dev/md0
[root@lustrefour ~]# mke2fs -b 4096 -O journal_dev /dev/sdb1
But that is hard to reassemble on reboot, or at least it was before I used e2label and labeled things properly. Question: how do I label the external journal in fstab, if at all? Right now I am only running:
[root@lustrefour ~]# mkfs.lustre --fsname=ioio --ost --mgsnode=192.168.0.7@tcp0 --reformat /dev/md0
So just raid6, no external journal.
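For reference, the two formatting commands shown earlier depend on each other: the journal device has to be created (with the same 4096-byte block size as the filesystem) before mkfs.lustre can attach it through -J device=. The following is only a dry-run sketch that prints the commands in that order. The function name plan_ost_with_journal is made up, and the defaults /dev/sdb1 and /dev/md0 are simply the devices named in this post; note that the fdisk listing above shows no partition table on /dev/sdb, so /dev/sdb1 may not exist yet.

```shell
# Dry run: print the formatting commands in dependency order.
# Nothing is executed; arguments default to the devices named in this post.
plan_ost_with_journal() {
    journal_dev="${1:-/dev/sdb1}"
    ost_dev="${2:-/dev/md0}"

    # 1. Create the external journal device first.
    echo "mke2fs -b 4096 -O journal_dev ${journal_dev}"
    # 2. Then format the OST and point it at that journal.
    echo "mkfs.lustre --fsname=ioio --ost --mgsnode=192.168.0.7@tcp0 --mkfsoptions=\"-J device=${journal_dev}\" --reformat ${ost_dev}"
}

plan_ost_with_journal
```

Printing rather than running keeps the sketch safe to try on a live node, since both commands destroy existing data on the named devices.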
[root@lustrefour ~]# cat /etc/fstab
LABEL=/ / ext3 defaults 1 1
tmpfs /dev/shm tmpfs defaults 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
LABEL=ioio-OST0001 /mnt/ost00 lustre defaults,_netdev 0 0
192.168.0.7@tcp0:/ioio /mnt/ioio lustre defaults,_netdev,noauto 0 0
[root@lustrefour ~]#
[root@lustreone bin]# ./ost-survey -s 4096 /mnt/ioio
./ost-survey: 01/24/09 OST speed survey on /mnt/ioio from 192.168.0.7@tcp
Number of Active OST devices : 2
Worst Read OST indx: 0 speed: 38.789337
Best Read OST indx: 1 speed: 40.017201
Read Average: 39.403269 +/- 0.613932 MB/s
Worst Write OST indx: 0 speed: 49.227064
Best Write OST indx: 1 speed: 78.673564
Write Average: 63.950314 +/- 14.723250 MB/s
Ost#  Read(MB/s)  Write(MB/s)  Read-time  Write-time
----------------------------------------------------
0     38.789      49.227       105.596    83.206
1     40.017      78.674       102.356    52.063
[root@lustreone bin]# ./ost-survey -s 1024 /mnt/ioio
./ost-survey: 01/24/09 OST speed survey on /mnt/ioio from 192.168.0.7@tcp
Number of Active OST devices : 2
Worst Read OST indx: 0 speed: 38.559620
Best Read OST indx: 1 speed: 40.053787
Read Average: 39.306704 +/- 0.747083 MB/s
Worst Write OST indx: 0 speed: 71.623744
Best Write OST indx: 1 speed: 82.764897
Write Average: 77.194320 +/- 5.570577 MB/s
Ost#  Read(MB/s)  Write(MB/s)  Read-time  Write-time
----------------------------------------------------
0     38.560      71.624       26.556     14.297
1     40.054      82.765       25.566     12.372
[root@lustreone bin]# dd of=/mnt/ioio/bigfileMGS if=/dev/zero bs=1048576
3536+0 records in
3536+0 records out
3707764736 bytes (3.7 GB) copied, 38.4775 seconds, 96.4 MB/s
lustreone, lustretwo, lustrethree and lustrefour all have the same modprobe.conf:
[root@lustrefour ~]# cat /etc/modprobe.conf
alias eth0 e1000
alias eth1 e1000
alias scsi_hostadapter pata_marvell
alias scsi_hostadapter1 ata_piix
options lnet networks=tcp
alias eth2 sky2
alias eth3 sky2
alias eth4 sky2
alias eth5 sky2
alias bond0 bonding
options bonding miimon=100 mode=4
[root@lustrefour ~]#
When I do the same from all clients, I can watch ./usr/bin/gnome-system-monitor, and the send and receive from the various nodes reaches a plateau of 209 MiB/s. Uggh.
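To put that plateau in context, it helps to express it in units of 1 Gbit/s links. With the layer2 transmit hash policy shown in the bonding output, all traffic between one pair of MAC addresses is sent over a single slave, so one peer-to-peer flow cannot exceed one link regardless of how many slaves are bonded. The arithmetic below is a rough sketch that ignores Ethernet, TCP and LNET overhead; the function names links_to_mib and mib_to_links are made up for illustration, and this does not by itself establish the cause of the plateau.

```shell
# Raw capacity of N 1 Gbit/s links in MiB/s (1 MiB = 1048576 bytes).
links_to_mib() {
    LC_ALL=C awk -v n="$1" 'BEGIN { printf "%.1f\n", n * 1e9 / 8 / 1048576 }'
}

# A measured MiB/s figure expressed as a number of 1 Gbit/s links.
mib_to_links() {
    LC_ALL=C awk -v m="$1" 'BEGIN { printf "%.2f\n", m * 8 * 1048576 / 1e9 }'
}

links_to_mib 1     # one link:  119.2 MiB/s
links_to_mib 2     # two links: 238.4 MiB/s
mib_to_links 209   # the observed plateau: 1.75 links
```

So 209 MiB/s is a little under two links' worth of raw capacity, and the 350 MiB/s goal would need at least three links carrying traffic at once.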