[Lustre-discuss] Singlehomed to multihomed upgrade
Wojciech Turek
wjt27 at cam.ac.uk
Thu Jan 8 08:46:39 PST 2009
Hi,
Do you have just one Lustre server acting as both OSS and MDS/MGS?
Can you paste the output of `lctl ping <server_nid>` run on the client?
Does the Ethernet client have only one interface, or are there more?
Did you also set the lnet option (in modprobe.conf) on the clients?
Can you send the output of `lctl list_nids` run on the server(s),
and also the output of `tunefs.lustre --print /dev/<lustre_target>` run on
the server?
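
When comparing the outputs requested above, it can help to filter a NID list
down to the one network the client can actually reach. The following is only a
rough sketch: `nids_on_net` is a hypothetical helper (not part of Lustre), and
it assumes NIDs are written as address@network, one per line, with a bare
network name such as "tcp" meaning the same as "tcp0".

```shell
#!/bin/sh
# nids_on_net NET
# Hypothetical helper, not a Lustre tool. Reads NIDs from stdin (one per
# line, in the form address@network) and prints only those whose network
# matches NET. A network name without a trailing number is treated as
# network 0, so "tcp" and "tcp0" are considered equal.
nids_on_net() {
    awk -v want="$1" '
        # Append "0" when the network name carries no instance number.
        function norm(n) { return (n ~ /[0-9]$/) ? n : n "0" }
        BEGIN { want = norm(want) }
        {
            i = index($0, "@")
            if (i > 0 && norm(substr($0, i + 1)) == want)
                print $0
        }'
}

# Possible use on the server (requires the Lustre utilities):
#   lctl list_nids | nids_on_net tcp
```

If the filter prints nothing for the network the client uses, the server has no
NID on that network, and `lctl ping` from the client would not be expected to
succeed.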
Cheers
Wojciech
Lukas Hejtmanek wrote:
> Hello,
>
> I have a setup with a Lustre server and Lustre clients using o2ib. It works.
> I decided to add more clients; unfortunately, the new clients do not have an IB
> card. So I added this option on the server:
> options lnet networks="o2ib,tcp0"
>
> /usr/local/lustre/sbin/lctl list_nids
> 10.0.0.1@o2ib
> 192.168.0.1@tcp
>
> However, a client using tcp complains about:
> mount -t lustre 192.168.0.1@tcp:/spfs /mnt/lustre/
> mount.lustre: mount 192.168.0.1@tcp:/spfs at /mnt/lustre failed: No such file or
> directory
> Is the MGS specification correct?
> Is the filesystem name correct?
> If upgrading, is the copied client log valid? (see upgrade docs)
>
> This is from dmesg:
> LustreError: 15342:0:(events.c:454:ptlrpc_uuid_to_peer()) No NID found for
> 10.0.0.1@o2ib
> LustreError: 15342:0:(client.c:58:ptlrpc_uuid_to_connection()) cannot find
> peer 10.0.0.1@o2ib!
> LustreError: 15342:0:(ldlm_lib.c:321:client_obd_setup()) can't add initial
> connection
> LustreError: 17831:0:(connection.c:144:ptlrpc_put_connection()) NULL
> connection
> LustreError: 15342:0:(obd_config.c:336:class_setup()) setup
> spfs-MDT0000-mdc-ffff8801d1d67c00 failed (-2)
> LustreError: 15342:0:(obd_config.c:1074:class_config_llog_handler()) Err -2 on
> cfg command:
> Lustre: cmd=cf003 0:spfs-MDT0000-mdc 1:spfs-MDT0000_UUID 2:10.0.0.1@o2ib
> LustreError: 15c-8: MGC192.168.0.1@tcp: The configuration from log
> 'spfs-client' failed (-2). This may be the result of communication errors
> between this node and the MGS, a bad configuration, or other errors. See the
> syslog for more information.
> LustreError: 15314:0:(llite_lib.c:1063:ll_fill_super()) Unable to process log:
> -2
> LustreError: 15314:0:(obd_config.c:403:class_cleanup()) Device 2 not setup
> LustreError: 15314:0:(ldlm_request.c:984:ldlm_cli_cancel_req()) Got rc -108
> from cancel RPC: canceling anyway
> LustreError: 15314:0:(ldlm_request.c:1593:ldlm_cli_cancel_list())
> ldlm_cli_cancel_list: -108
> Lustre: client ffff8801d1d67c00 umount complete
> LustreError: 15314:0:(obd_mount.c:1957:lustre_fill_super()) Unable to mount
> (-2)
>
> Is there a way I can upgrade the singlehomed server to a multihomed server?
> Do I really need to set up a router? How does it work? Is there any slowdown
> due to routing?
>
>
--
Wojciech Turek
Assistant System Manager
High Performance Computing Service
University of Cambridge
Email: wjt27 at cam.ac.uk
Tel: (+)44 1223 763517