[Lustre-discuss] Singlehomed to multihomed upgrade

Lukas Hejtmanek xhejtman at ics.muni.cz
Thu Jan 8 08:06:57 PST 2009


Hello,

I have a setup with a Lustre server and Lustre clients using o2ib, and it works.
I decided to add more clients; unfortunately, the new clients do not have IB
cards. So I added this option on the server:
options lnet networks="o2ib,tcp0"

/usr/local/lustre/sbin/lctl list_nids
10.0.0.1@o2ib
192.168.0.1@tcp
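
To double-check which LNet networks the server actually brought up, the network part of each NID can be pulled out of that listing. This is only a minimal sketch: `nid_networks` is a hypothetical helper name (not part of Lustre), and it assumes one NID per line in the usual `address@network` form.

```shell
# nid_networks (hypothetical helper, not a Lustre tool):
# read NIDs of the form address@network on stdin, one per line,
# and print only the network part of each.
nid_networks() {
    sed -n 's/^[^@]*@\(.*\)$/\1/p'
}

# Possible usage on the server (lctl path as used in this post):
#   /usr/local/lustre/sbin/lctl list_nids | nid_networks
```

Both `o2ib` and `tcp` should be printed if the `networks=` option took effect.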

However, a client using tcp fails to mount:
mount -t lustre 192.168.0.1@tcp:/spfs /mnt/lustre/
mount.lustre: mount 192.168.0.1@tcp:/spfs at /mnt/lustre failed: No such file or
directory
Is the MGS specification correct?
Is the filesystem name correct?
If upgrading, is the copied client log valid? (see upgrade docs)

This is from dmesg:
LustreError: 15342:0:(events.c:454:ptlrpc_uuid_to_peer()) No NID found for
10.0.0.1@o2ib
LustreError: 15342:0:(client.c:58:ptlrpc_uuid_to_connection()) cannot find
peer 10.0.0.1@o2ib!
LustreError: 15342:0:(ldlm_lib.c:321:client_obd_setup()) can't add initial
connection
LustreError: 17831:0:(connection.c:144:ptlrpc_put_connection()) NULL
connection
LustreError: 15342:0:(obd_config.c:336:class_setup()) setup
spfs-MDT0000-mdc-ffff8801d1d67c00 failed (-2)
LustreError: 15342:0:(obd_config.c:1074:class_config_llog_handler()) Err -2 on
cfg command:
Lustre:    cmd=cf003 0:spfs-MDT0000-mdc  1:spfs-MDT0000_UUID  2:10.0.0.1@o2ib
LustreError: 15c-8: MGC192.168.0.1@tcp: The configuration from log
'spfs-client' failed (-2). This may be the result of communication errors
between this node and the MGS, a bad configuration, or other errors. See the
syslog for more information.
LustreError: 15314:0:(llite_lib.c:1063:ll_fill_super()) Unable to process log:
-2
LustreError: 15314:0:(obd_config.c:403:class_cleanup()) Device 2 not setup
LustreError: 15314:0:(ldlm_request.c:984:ldlm_cli_cancel_req()) Got rc -108
from cancel RPC: canceling anyway
LustreError: 15314:0:(ldlm_request.c:1593:ldlm_cli_cancel_list())
ldlm_cli_cancel_list: -108
Lustre: client ffff8801d1d67c00 umount complete
LustreError: 15314:0:(obd_mount.c:1957:lustre_fill_super()) Unable to mount
(-2)
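
In these messages the tcp client is being handed the server's o2ib NID from the 'spfs-client' log, which it has no way to reach. A small sketch for pulling such unreachable NIDs out of the kernel log follows. `missing_nids` is a hypothetical helper name, and it assumes the raw dmesg lines are not wrapped (unlike in this mail) and carry NIDs in `address@network` form.

```shell
# missing_nids (hypothetical helper, not a Lustre tool):
# read kernel log lines on stdin and print each distinct NID that
# appears in a "No NID found for <nid>" message.
missing_nids() {
    sed -n 's/.*No NID found for \([^ ]*\).*/\1/p' | sort -u
}

# Possible usage on the client:
#   dmesg | missing_nids
```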

Is there a way to upgrade a singlehomed server to a multihomed one?
Do I really need to set up a router? If so, how does it work, and is there any
slowdown due to routing?

-- 
Lukáš Hejtmánek


