[Lustre-discuss] Disappearing OSTs
jrs
botemout at gmail.com
Mon May 5 13:05:06 PDT 2008
Well, things have changed again as I'm trying to get back to something
that works, but below is what /proc/partitions shows on one of the MDSs.
Below that I have the output of 'multipath -l'. I have dual-port HBAs and
multiple paths to the backend storage, so it looks a little complex. I've modified
the /etc/multipath.conf file to give the logical names you see, e.g.,
ost_lustre03-04_04_oss01_dm_7_mds01.
Even though it looks a little scary, remember that things work fine and
can even survive a random number of reboots before an OST disappears.
Since the last time I posted, I had an MDT go away too.
Does anyone think that I might have better luck running Red Hat?
I've looked through the /etc/init.d/* files but can't see anything
that might be destroying the partition.
Thanks
John
$ cat /proc/partitions
major minor #blocks name
104 0 71652960 cciss/c0d0
104 1 2104483 cciss/c0d0p1
104 2 69545385 cciss/c0d0p2
8 0 5860157184 sda
8 16 5860157184 sdb
8 32 5860230912 sdc
8 48 5860156250 sdd
8 64 5860156250 sde
8 80 5860156250 sdf
8 96 5860156250 sdg
8 112 5860156250 sdh
8 128 5860156250 sdi
8 144 5860157184 sdj
8 160 5860157184 sdk
8 176 5860230912 sdl
8 192 5860156250 sdm
8 193 5860156216 sdm1
8 208 5860156250 sdn
8 224 5860156250 sdo
8 240 5860157184 sdp
65 0 5860157184 sdq
65 16 5860230912 sdr
65 32 5860156250 sds
65 33 5860156216 sds1
65 48 5860156250 sdt
65 64 5860156250 sdu
65 80 5860157184 sdv
65 96 5860157184 sdw
65 112 5860230912 sdx
65 128 5860156250 sdy
65 129 5860156216 sdy1
65 144 5860156250 sdz
65 160 5860156250 sdaa
65 176 5860157184 sdab
65 192 5860157184 sdac
65 208 5860230912 sdad
65 224 5860156250 sdae
65 225 5860156216 sdae1
65 240 5860156250 sdaf
66 0 5860156250 sdag
66 16 5860157184 sdah
66 32 5860157184 sdai
66 48 5860230912 sdaj
66 64 5860157184 sdak
66 80 5860157184 sdal
66 96 5860230912 sdam
66 112 5860156250 sdan
66 128 5860156250 sdao
66 144 5860156250 sdap
66 160 5860156250 sdaq
66 176 5860156250 sdar
66 192 5860156250 sdas
66 208 5860157184 sdat
66 224 5860157184 sdau
66 240 5860230912 sdav
253 0 5860156250 dm-0
253 1 5860157184 dm-1
253 2 5860157184 dm-2
253 3 5860230912 dm-3
253 4 5860156250 dm-4
253 5 5860156250 dm-5
253 6 5860157184 dm-6
253 7 5860157184 dm-7
253 8 5860230912 dm-8
253 9 5860156250 dm-9
253 10 5860156250 dm-10
253 11 5860156250 dm-11
253 12 5860156216 dm-12
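With this many devices it is easy to miss which ones carry a partition entry. The following is a minimal Python sketch, not something from the original post, that scans /proc/partitions-style text and reports the sdX disks that have numbered partitions. The function name partitioned_disks and the sample text are illustrative assumptions; the sample rows are copied from the listing above. It only looks at sdX names, so cciss and dm devices are ignored.

```python
import re

def partitioned_disks(text):
    """Return {disk: [partition names]} for sdX entries in /proc/partitions text."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have four fields: major, minor, #blocks, name.
        # The header row is skipped because its first field is not a number.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        # A partition is named after its parent disk plus a number, e.g. sdm1.
        m = re.fullmatch(r"(sd[a-z]+)(\d+)", fields[3])
        if m:
            found.setdefault(m.group(1), []).append(fields[3])
    return found

# Sample rows taken from the listing above (hypothetical variable name).
sample = """\
major minor #blocks name
8 192 5860156250 sdm
8 193 5860156216 sdm1
8 208 5860156250 sdn
65 224 5860156250 sdae
65 225 5860156216 sdae1
253 12 5860156216 dm-12
"""

print(partitioned_disks(sample))
```

Run against the full listing, this approach would flag sdm, sds, sdy and sdae, the only sdX devices shown with a partition entry. dm-12 has the same block count (5860156216) as those partitions.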
$ multipath -l
ost_lustre03-04_04_oss01_dm_7_mds01 (36000402001fc308260c0ace100000000) dm-7 NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
\_ 1:0:3:4 sdai 66:32 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:7:4 sdau 66:224 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:3:4 sdk 8:160 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:7:4 sdw 65:96 [active][undef]
ost_lustre01-02_04_oss01_dm_5_mds01 (36000402001fc14596ef496fd00000000) dm-5 NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
\_ 1:0:2:1 sdaf 65:240 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:4:1 sdn 8:208 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:6:1 sdt 65:48 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:0:1 sdz 65:144 [active][undef]
ost_lustre03-04_02_oss01_dm_3_mds01 (36000402001fc308260c0af3700000000) dm-3 NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
\_ 1:0:1:2 sdad 65:208 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:4:2 sdam 66:96 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:0:2 sdc 8:32 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:5:2 sdr 65:16 [active][undef]
ost_lustre01-02_02_oss01_dm_11_mds01 (36000402001fc14596ef497ee00000000) dm-11 NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
\_ 1:0:5:5 sdap 66:144 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:6:5 sdas 66:192 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:1:5 sdf 8:80 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:2:5 sdi 8:128 [active][undef]
ost_lustre01-02_05_oss02_dm_0_mds01 (36000402001fc14596ef4970e00000000) dm-0 NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
\_ 1:0:0:2 sdaa 65:160 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:2:2 sdag 66:0 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:4:2 sdo 8:224 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:6:2 sdu 65:64 [active][undef]
ost_lustre01-02_01_oss02_dm_10_mds01 (36000402001fc14596ef497dc00000000) dm-10 NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
\_ 1:0:5:4 sdao 66:128 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:6:4 sdar 66:176 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:1:4 sde 8:64 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:2:4 sdh 8:112 [active][undef]
mdt_lustre03-04_00_dm_8_mds01 (36000402001fc308260c0ac9e00000000) dm-8 NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
\_ 1:0:3:5 sdaj 66:48 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:7:5 sdav 66:240 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:3:5 sdl 8:176 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:7:5 sdx 65:112 [active][undef]
ost_lustre03-04_03_oss02_dm_6_mds01 (36000402001fc308260c0acc200000000) dm-6 NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
\_ 1:0:3:3 sdah 66:16 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:7:3 sdat 66:208 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:3:3 sdj 8:144 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:7:3 sdv 65:80 [active][undef]
ost_lustre01-02_03_oss02_dm_4_mds01 (36000402001fc14596ef496ed00000000) dm-4 NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
\_ 1:0:2:0 sdae 65:224 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:4:0 sdm 8:192 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:6:0 sds 65:32 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:0:0 sdy 65:128 [active][undef]
ost_lustre03-04_01_oss02_dm_2_mds01 (36000402001fc308260c0af1600000000) dm-2 NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
\_ 1:0:1:1 sdac 65:192 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:4:1 sdal 66:80 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:0:1 sdb 8:16 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:5:1 sdq 65:0 [active][undef]
ost_lustre01-02_00_oss01_dm_9_mds01 (36000402001fc14596ef497cc00000000) dm-9 NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
\_ 1:0:5:3 sdan 66:112 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:6:3 sdaq 66:160 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:1:3 sdd 8:48 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:2:3 sdg 8:96 [active][undef]
ost_lustre03-04_00_oss01_dm_1_mds01 (36000402001fc308260c0af5b00000000) dm-1 NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
\_ 1:0:1:0 sdab 65:176 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:4:0 sdak 66:64 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:0:0 sda 8:0 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:5:0 sdp 8:240 [active][undef]
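To relate the two listings, it can help to group the path devices under each multipath map. The following is a rough Python sketch, assuming the 'multipath -l' layout shown above (newer multipath-tools releases print a different format). The name paths_by_map and both regular expressions are illustrative assumptions, and the sample is an excerpt of the dm-4 entry above.

```python
import re

# Map header line: "<alias> (<wwid>) dm-N <vendor>,<product>"
HEADER = re.compile(r"^(\S+)\s+\(([0-9a-fA-F]+)\)\s+(dm-\d+)")
# Path line: "\_ H:C:T:L <device> <major>:<minor> [state]..."
PATH = re.compile(r"\\_\s+\d+:\d+:\d+:\d+\s+(\S+)\s+\d+:\d+")

def paths_by_map(text):
    """Return {dm name: {"alias": str, "paths": [device names]}}."""
    maps = {}
    current = None
    for line in text.splitlines():
        header = HEADER.match(line.strip())
        if header:
            current = header.group(3)
            maps[current] = {"alias": header.group(1), "paths": []}
            continue
        path = PATH.search(line)
        # Path-group lines ("\_ round-robin ...") do not match PATH and are skipped.
        if path and current is not None:
            maps[current]["paths"].append(path.group(1))
    return maps

# Excerpt of the dm-4 entry from the output above (hypothetical variable name).
sample = r"""
ost_lustre01-02_03_oss02_dm_4_mds01 (36000402001fc14596ef496ed00000000) dm-4 NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
\_ 1:0:2:0 sdae 65:224 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 0:0:4:0 sdm 8:192 [active][undef]
"""

for dm, info in paths_by_map(sample).items():
    print(dm, info["alias"], info["paths"])
```

In the full output above, the four paths of dm-4 are sdae, sdm, sds and sdy, which are the same four devices that show a partition (sdae1, sdm1, sds1, sdy1) in /proc/partitions. None of the other maps has a path with a partition entry.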
Bernd Schubert wrote:
> On Mon, May 05, 2008 at 12:30:23PM -0400, jrs wrote:
>> I wonder if I'd have better luck, with the disappearing OST bug, if
>> I actually explicitly partitioned the device and then used, to take
>> the example above
>>
>> /dev/mapper/ost_oss01_lustre0304_02-part1
>>
>> rather than the whole disk.
>>
>
> What does /proc/partitions say?