[Lustre-discuss] lustre OSS IP change

Brendon b at brendon.com
Thu Jan 13 09:41:44 PST 2011


On Tue, Jan 11, 2011 at 3:35 PM, Wojciech Turek <wjt27 at cam.ac.uk> wrote:
> Hi Brendon,
>
> Can you please provide following:
> 1) output of ifconfig run on each OSS MDS and at least one client
> 2) output of lctl list_nids run on each OSS MDS and at least one client
> 3) output of tunefs.lustre --print --dryrun /dev/<OST_block_device> from
> each OSS
>
> Wojciech

Someone who had read my emails grabbed me on IRC. After some
discussion, their reading was that everything should be working and I
just needed to wait for a repair to run to completion. I then learned
two things. First, a client has to connect before a repair will start.
Second, the code isn't perfect: the MDS kernel oopsed twice before a
repair finally completed. I was in the process of disabling panic on
oops when it finished successfully. Once that was done, I got a clean
bill of health.
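
For anyone wondering what "disabling panic on oops" looks like in
practice, here is a rough sketch. It assumes a Linux kernel that exposes
/proc/sys/kernel/panic_on_oops; the function name and its optional file
argument are made up for illustration, and the argument exists only so
the check can be tried against a plain file.

```shell
# Report whether the kernel is set to panic when it hits an oops.
# Reads /proc/sys/kernel/panic_on_oops unless another file is given
# (the override is a hypothetical convenience for testing).
panic_on_oops_state() {
    file="${1:-/proc/sys/kernel/panic_on_oops}"
    value=$(cat "$file" 2>/dev/null) || { echo "unknown"; return 1; }
    case "$value" in
        0) echo "disabled" ;;
        1) echo "enabled" ;;
        *) echo "unknown" ;;
    esac
}

# To turn it off until the next reboot (as root):
#   sysctl -w kernel.panic_on_oops=0
# To keep it off across reboots, add this line to /etc/sysctl.conf:
#   kernel.panic_on_oops = 0
```

Whether it is wise to keep a server running after an oops is a separate
question; this only shows where the setting lives.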

Just to complete this discussion, I have listed the requested output.
I might still learn something :)

...Looks like I did learn something. OSS0 has a problem with its root
FS, which was remounted read-only. I discovered this when running
tunefs.lustre --print --dryrun /dev/sda5.
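
A quick way to confirm a read-only remount like this one is to look at
the options field in /proc/mounts. Rough sketch only: the function name
and its second argument are made up for illustration, and the sample
table in the usage comment is not taken from OSS0.

```shell
# Succeed (exit 0) if the given mount point is currently mounted
# read-only. Input is in /proc/mounts format:
#   device mountpoint fstype options dump pass
mount_is_readonly() {
    mountpoint="$1"
    table="${2:-/proc/mounts}"
    # Keep the last matching entry, since later mounts shadow earlier ones.
    opts=$(awk -v mp="$mountpoint" '$2 == mp { o = $4 } END { print o }' "$table")
    case ",$opts," in
        *,ro,*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Usage on a live system:
#   if mount_is_readonly /; then echo "root FS is read-only"; fi
```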

The fun never ends :)
-Brendon

1) ifconfig info
MDS: # ifconfig
eth0      Link encap:Ethernet  HWaddr 00:15:17:5E:46:64
         inet addr:10.1.1.1  Bcast:10.1.1.255  Mask:255.255.255.0
         inet6 addr: fe80::215:17ff:fe5e:4664/64 Scope:Link
         UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
         RX packets:49140546 errors:0 dropped:0 overruns:0 frame:0
         TX packets:63644404 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:1000
         RX bytes:18963170801 (17.6 GiB)  TX bytes:65261762295 (60.7 GiB)
         Base address:0xcc00 Memory:f58e0000-f5900000

eth1      Link encap:Ethernet  HWaddr 00:15:17:5E:46:65
         inet addr:192.168.0.181  Bcast:192.168.0.255  Mask:255.255.255.0
         inet6 addr: fe80::215:17ff:fe5e:4665/64 Scope:Link
         UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
         RX packets:236738842 errors:0 dropped:0 overruns:0 frame:0
         TX packets:458503163 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:100
         RX bytes:15562858193 (14.4 GiB)  TX bytes:686167422947 (639.0 GiB)
         Base address:0xc880 Memory:f5880000-f58a0000

OSS: # ifconfig
eth0      Link encap:Ethernet  HWaddr 00:1D:60:E0:5B:B2
         inet addr:10.1.1.2  Bcast:10.1.1.255  Mask:255.255.255.0
         inet6 addr: fe80::21d:60ff:fee0:5bb2/64 Scope:Link
         UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
         RX packets:3092588 errors:0 dropped:0 overruns:0 frame:0
         TX packets:3547204 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:1000
         RX bytes:1320521551 (1.2 GiB)  TX bytes:2670089148 (2.4 GiB)
         Interrupt:233

client: # ifconfig
eth0      Link encap:Ethernet  HWaddr 00:1E:8C:39:E4:69
         inet addr:10.1.1.5  Bcast:10.1.1.255  Mask:255.255.255.0
         inet6 addr: fe80::21e:8cff:fe39:e469/64 Scope:Link
         UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
         RX packets:727922 errors:0 dropped:0 overruns:0 frame:0
         TX packets:884188 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:1000
         RX bytes:433349006 (413.2 MiB)  TX bytes:231985578 (221.2 MiB)
         Interrupt:50



2) lctl list_nids

client: lctl list_nids
10.1.1.5@tcp

MDS: lctl list_nids
10.1.1.1@tcp

OSS: lctl list_nids
10.1.1.2@tcp
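
After an IP change, the point of comparing the ifconfig and lctl
list_nids output is to see that each node's NID carries the address it
now has. A minimal sketch of that check, assuming NIDs in the usual
address@network form, one per line; the function name is made up for
illustration.

```shell
# Read NIDs from stdin (one per line, as printed by `lctl list_nids`)
# and fail if any of them does not use the expected address.
nids_match_addr() {
    expected="$1"
    status=0
    while IFS= read -r nid; do
        [ -n "$nid" ] || continue      # skip blank lines
        addr=${nid%%@*}                # strip the "@network" suffix
        if [ "$addr" != "$expected" ]; then
            echo "unexpected NID: $nid" >&2
            status=1
        fi
    done
    return $status
}

# Usage on the OSS from the listing above:
#   lctl list_nids | nids_match_addr 10.1.1.2 && echo "NIDs look right"
```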

3) tunefs.lustre --print --dryrun /dev/sda5
OSS0: # tunefs.lustre --print --dryrun /dev/sda5
checking for existing Lustre data: found CONFIGS/mountdata
tunefs.lustre: Can't create temporary directory /tmp/dirCZXt3k: Read-only file system

tunefs.lustre FATAL: Failed to read previous Lustre data from /dev/sda5 (30)
tunefs.lustre: exiting with 30 (Read-only file system)

OSS1: # tunefs.lustre --print --dryrun /dev/sda5
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata

  Read previous values:
Target:     mylustre-OST0001
Index:      1
Lustre FS:  mylustre
Mount type: ldiskfs
Flags:      0x2
             (OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=10.1.1.1@tcp


  Permanent disk data:
Target:     mylustre-OST0001
Index:      1
Lustre FS:  mylustre
Mount type: ldiskfs
Flags:      0x2
             (OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=10.1.1.1@tcp

exiting before disk write.


OSS2: # tunefs.lustre --print --dryrun /dev/sda5
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata

  Read previous values:
Target:     mylustre-OST0002
Index:      2
Lustre FS:  mylustre
Mount type: ldiskfs
Flags:      0x2
             (OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=10.1.1.1@tcp


  Permanent disk data:
Target:     mylustre-OST0002
Index:      2
Lustre FS:  mylustre
Mount type: ldiskfs
Flags:      0x2
             (OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=10.1.1.1@tcp

exiting before disk write.
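
The part of this output that matters after an address change is the
mgsnode parameter, which should agree with the NID the MDS reports. A
small sketch for pulling it out of the tunefs.lustre --print output,
assuming the "Parameters:" line holds space-separated key=value pairs as
in the listings above; the function name is made up for illustration.

```shell
# Print the distinct mgsnode values found in `tunefs.lustre --print`
# output read from stdin. The output repeats the Parameters line under
# "Read previous values" and "Permanent disk data", hence the sort -u.
mgsnodes_from_tunefs() {
    awk '/^Parameters:/ {
             for (i = 2; i <= NF; i++)
                 if ($i ~ /^mgsnode=/) {
                     sub(/^mgsnode=/, "", $i)
                     print $i
                 }
         }' | sort -u
}

# Usage on an OSS (2>&1 is a guess in case some output goes to stderr):
#   tunefs.lustre --print --dryrun /dev/sda5 2>&1 | mgsnodes_from_tunefs
```

If the OSSs print different values, or a value that doesn't match the
MDS's lctl list_nids output, that is the first thing to chase.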


