[Lustre-discuss] odd mount behavior

John White jwhite at lbl.gov
Sun Mar 28 00:18:00 PDT 2010


Unfortunately, in this case there is a clear network path and no firewalls.  A quick telnet test at least confirms that something is listening on the port, though it closes the connection pretty quickly.  If networking were the cause, wouldn't I still see connection errors for the @tcp NID?
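
For anyone wanting to repeat the telnet check from a script, here is a minimal sketch in Python. It only tests whether a TCP connection to the port can be opened (988 is the port Andreas mentions below); it says nothing about whether the LNet handshake itself succeeds. The function name and the timeout value are my own choices for illustration, not anything from Lustre.

```python
import socket


def port_is_open(host, port=988, timeout=3.0):
    """Return True if a TCP connection to host:port can be established.

    This mirrors a manual telnet test: it checks reachability only and
    does not speak the LNet protocol.
    """
    try:
        # create_connection resolves the host and applies the timeout
        # to the connect attempt; the socket is closed on leaving the block.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling port_is_open() with the MDS hostname from the client should return True if nothing (hosts.deny, firewall rules) is blocking the connection at the TCP level.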
----------------
John White
High Performance Computing Services (HPCS)
(510) 486-7307
One Cyclotron Rd, MS: 50B-3209C
Lawrence Berkeley National Lab
Berkeley, CA 94720

On Mar 26, 2010, at 9:49 PM, Andreas Dilger wrote:

> On 2010-03-26, at 17:45, John White wrote:
>> 	We've got a new client we're trying to get to mount an existing file system.  The host cluster is set up with 2 NIDs for the MDT (o2ib, tcp), same with the client.  When I try mounting via tcp (mount -t lustre -o flock n0006.lustre@tcp:/vulcan /clusterfs/vulcan/pscratch), it just hangs there indefinitely with repeated messages like:
>> 
>> Lustre: Request x264 sent from vulcan-MDT0000-mdc-ffff8101e39bb400 to NID 10.4.200.6@o2ib 5s ago has timed out (limit 5s).
>> 
>> I'm confused why it's attempting the o2ib NID repeatedly and never tries the tcp NID... Ideas?
> 
> 
> A common cause for newly-installed systems is hosts.deny or firewall rules that are preventing connections on port 988.
> 
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
> 



