[Lustre-discuss] Failure to communicate with MDS via o2ib

Isaac Huang He.Huang at Sun.COM
Tue May 27 07:13:13 PDT 2008


On Tue, May 27, 2008 at 09:50:38AM -0400, Charles Taylor wrote:
>    Whoops, I meant to include the mount-time error message....
> 
> /etc/init.d/lustre-client start
> IB HCA detected - will try to sleep until link state becomes ACTIVE
>   State becomes ACTIVE
> Loading Lustre lnet module with option networks=o2ib:      [  OK  ]
> Loading Lustre kernel module:                              [  OK  ]
> mount -t lustre 10.13.24.40@o2ib:/ufhpc /ufhpc/scratch:
> 
> 
> mount.lustre: mount 10.13.24.40@o2ib:/ufhpc at /ufhpc/scratch failed: Cannot
> send after transport endpoint shutdown
>                                                            [FAILED]
> Error: Failed to mount 10.13.24.40@o2ib:/ufhpc
> mount -t lustre 10.13.24.90@o2ib:/crn /crn/scratch:  mount.lustre: mount
> 10.13.24.90@o2ib:/crn at /crn/scratch failed: Cannot send after transport
> endpoint shutdown
>                                                            [FAILED]
> Error: Failed to mount 10.13.24.90@o2ib:/crn
> mount -t lustre 10.13.24.85@o2ib:/hpcdata /ufhpc/hpcdata:  mount.lustre: mount
> 10.13.24.85@o2ib:/hpcdata at /ufhpc/hpcdata failed: Cannot send after transport
> endpoint shutdown
>                                                            [FAILED]
> Error: Failed to mount 10.13.24.85@o2ib:/hpcdata

Were there any error messages in 'dmesg'? Can you try 'lctl ping
10.13.24.90@o2ib', and also send the output of 'lctl list_nids',
'lctl --net o2ib peer_list' and 'lctl --net o2ib conn_list'?
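
To gather all of that in one pass, something like the sketch below might help.
It only prints the commands named above for a given server NID, so they can be
reviewed before being run. The function name lnet_diag_cmds and the 50-line
dmesg tail are assumptions made for illustration, not part of Lustre.

```shell
#!/bin/sh
# Hypothetical helper: print the LNET diagnostics requested above for one NID.
# It does not run anything itself; pipe the output to a shell to execute it.
lnet_diag_cmds() {
    # Server NID to test; defaults to one of the NIDs from the failed mounts.
    nid=${1:-10.13.24.90@o2ib}
    # Network name is the part after the last '@' (here: o2ib).
    net=${nid##*@}
    printf '%s\n' \
        "dmesg | tail -n 50" \
        "lctl list_nids" \
        "lctl ping $nid" \
        "lctl --net $net peer_list" \
        "lctl --net $net conn_list"
}
```

For example, as root on the client, 'lnet_diag_cmds 10.13.24.40@o2ib | sh -x'
would run each command in turn and echo it before its output.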

Isaac