[lustre-discuss] Filesystem hanging....

Phill Harvey-Smith p.harvey-smith at warwick.ac.uk
Fri Aug 12 05:37:56 PDT 2016


On 11/08/2016 16:10, Colin Faber wrote:
>> First glance indicates you're having network connectivity problems,
>> (possibly driver issue with your NIC?)

I don't seem to have had any problems with any other services running on 
the cluster, and there are no messages in the journal or the /var/log 
files relating to network errors.

Oddly though, when the /home filesystem hangs, the /storage and /scratch 
filesystems, which are served by the same Lustre servers, continue to 
respond without problems.

What does seem to have some bearing on it is that the first few writes 
succeed and then it hangs. Though it was first noticed through samba, 
it also happens when logged in to the console directly.

>> (Check MTU settings, etc?)

Pasting as quotation as it stops thunderbird from wrapping the text.....

> root@test-r710:~# ifconfig
> eno1      Link encap:Ethernet  HWaddr 00:26:b9:84:c7:8d
>           inet addr:192.168.1.80  Bcast:192.168.1.255  Mask:255.255.255.0
>           inet6 addr: fe80::226:b9ff:fe84:c78d/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:8516 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:23199 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:5297958 (5.2 MB)  TX bytes:3222616 (3.2 MB)
>
> eno2      Link encap:Ethernet  HWaddr 00:26:b9:84:c7:8f
>           inet addr:192.168.0.80  Bcast:192.168.0.255  Mask:255.255.255.0
>           inet6 addr: fe80::226:b9ff:fe84:c78f/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:1374513 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:168485 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:2026863011 (2.0 GB)  TX bytes:21861558 (21.8 MB)
>
> eno4      Link encap:Ethernet  HWaddr 00:26:b9:84:c7:93
>           inet addr:137.205.232.159  Bcast:137.205.232.255  Mask:255.255.255.128
>           inet6 addr: fe80::226:b9ff:fe84:c793/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:11483 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:10560 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:3504764 (3.5 MB)  TX bytes:5731764 (5.7 MB)
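
For anyone wanting to check the same things on their own client, here is a
minimal sketch that pulls the MTU and the error/dropped counters out of
old-style ifconfig output like the above, and works out the largest ICMP
payload that fits in one unfragmented IPv4 frame. The function names
(summarise, max_icmp_payload) are made up for illustration, and the parser
assumes the net-tools "Link encap:" layout shown here, not the newer
ifconfig format.

```python
import re

# Hypothetical helpers, not part of any Lustre or net-tools package.
IFACE_RE = re.compile(r"^(\S+)\s+Link encap:")
MTU_RE = re.compile(r"MTU:(\d+)")
COUNTER_RE = re.compile(r"^(RX|TX) packets:\d+ errors:(\d+) dropped:(\d+)")


def summarise(text):
    """Return {iface: {"mtu": int, "errors": int, "dropped": int}}.

    Errors and drops are summed over the RX and TX lines.
    """
    stats = {}
    current = None
    for raw in text.splitlines():
        # Strip mail quoting ("> ") and indentation before matching.
        line = raw.lstrip("> ").rstrip()
        m = IFACE_RE.match(line)
        if m:
            current = stats.setdefault(
                m.group(1), {"mtu": None, "errors": 0, "dropped": 0}
            )
            continue
        if current is None:
            continue
        m = MTU_RE.search(line)
        if m:
            current["mtu"] = int(m.group(1))
            continue
        m = COUNTER_RE.match(line)
        if m:
            current["errors"] += int(m.group(2))
            current["dropped"] += int(m.group(3))
    return stats


def max_icmp_payload(mtu):
    """Largest ping payload that fits in one IPv4 frame of this MTU.

    Assumes a 20-byte IPv4 header (no options) plus an 8-byte ICMP header.
    """
    return mtu - 28


SAMPLE = """\
eno2      Link encap:Ethernet  HWaddr 00:26:b9:84:c7:8f
          inet addr:192.168.0.80  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1374513 errors:0 dropped:0 overruns:0 frame:0
          TX packets:168485 errors:0 dropped:0 overruns:0 carrier:0
"""

if __name__ == "__main__":
    for name, s in summarise(SAMPLE).items():
        print(name, s, "max ping payload:", max_icmp_payload(s["mtu"]))
```

With an MTU of 1500 the payload works out to 1472 bytes, so on Linux
something like "ping -M do -s 1472 <server address>" should show whether
full-size frames reach the Lustre server without fragmentation.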


> root@test-r710:~# route -n
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> 0.0.0.0         137.205.232.254 0.0.0.0         UG    0      0        0 eno4
> 137.205.232.128 0.0.0.0         255.255.255.128 U     0      0        0 eno4
> 192.168.0.0     0.0.0.0         255.255.255.0   U     0      0        0 eno2
> 192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 eno1

Lustre mounts in fstab :
> # Lustre mounted
> 192.168.0.4@tcp0:/storage       /storage        lustre  defaults,_netdev,flock 0 0
> 192.168.0.4@tcp0:/home          /home           lustre  defaults,_netdev,flock 0 0
> 192.168.0.4@tcp0:/scratch       /scratch        lustre  defaults,_netdev,flock 0 0

I've also tried compiling the latest source and installing those modules 
(Lustre: Build Version: 2.8.56_26_g6fad3ab). This does not seem to have 
the problem with matlab (mentioned about a month or so ago), but it still 
has the hanging problem.

The Lustre startup logs in the journal are here :
> Aug 12 12:57:10 test-r710 kernel: Lustre: Lustre: Build Version: 2.8.56_26_g6fad3ab
> Aug 12 12:57:10 test-r710 kernel: Lustre: Server MGS version (2.1.0.0) is much older than client. Consider upgrading server (2.8.56_26_g6fad3ab)
> Aug 12 12:57:10 test-r710 kernel: Lustre: Trying to mount a client with IR setting not compatible with current mgc. Force to use current mgc setting that is IR disabled.
> Aug 12 12:57:10 test-r710 kernel: Lustre: Mounted home-client


Cheers.

Phill.







