[Lustre-discuss] help needed.

Aaron Knister aaron at iges.org
Sun Dec 23 11:27:19 PST 2007


Can you check the firewall on each of those machines (`iptables -L`)  
and paste the output here? Also, is this network dedicated to Lustre? Lustre  
can easily saturate a network interface under load, to the point that it  
becomes difficult to log in to a node if it only has one interface. I'd  
recommend using a separate interface if you can.
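Since a blocked firewall is the most common cause of that "Is the MGS running?" error, it may also be worth probing the MGS port directly from each OSS. A minimal sketch, assuming Lustre's default acceptor port 988 (adjust if you changed it) and the MGS address from the output below:

```shell
#!/bin/bash
# Root-free check that the MGS TCP port is reachable from an OSS.
# 988 is Lustre's default acceptor port (an assumption -- adjust if
# you changed it); 132.66.176.211 is the MGS from this thread.
port_open() {
    # bash's /dev/tcp pseudo-device attempts a plain TCP connect
    timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

if port_open 132.66.176.211 988; then
    echo "988/tcp on the MGS is reachable"
else
    echo "988/tcp on the MGS is blocked, closed, or unroutable"
fi
```

If the port shows as blocked while the MGS is mounted, the firewall (or an intermediate router) is the place to look.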

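The failing `lctl ping` on both OSS nodes can also mean LNET is bound to the wrong interface. It's worth confirming the module options match on every node; a sketch for a RHEL4-era modprobe.conf, where `eth0` is an assumption (use whichever interface actually carries the 132.66.176.x addresses):

```
# /etc/modprobe.conf -- the same networks line on MGS, OSS, and clients
options lnet networks=tcp0(eth0)
```

After editing, unload and reload the Lustre modules (or reboot) so the new setting takes effect, then re-check with `lctl list_nids`.
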
On Dec 23, 2007, at 11:03 AM, Avi Gershon wrote:

> node 1 132.66.176.212
> node 2 132.66.176.215
>
> [root at x-math20 ~]# ssh 132.66.176.215
> root at 132.66.176.215's password:
> ssh(21957) Permission denied, please try again.
> root at 132.66.176.215's password:
> Last login: Sun Dec 23 14:32:51 2007 from x-math20.tau.ac.il
> [root at x-mathr11 ~]#  lctl ping 132.66.176.211 at tcp0
> failed to ping 132.66.176.211 at tcp: Input/output error
> [root at x-mathr11 ~]#  lctl list_nids
> 132.66.176.215 at tcp
> [root at x-mathr11 ~]# ssh 132.66.176.212
> The authenticity of host '132.66.176.212 (132.66.176.212)' can't be established.
> RSA1 key fingerprint is 85:2a:c1:47:84:b7:b5:a6:cd:c4:57:86:af:ce:7e:74.
> Are you sure you want to continue connecting (yes/no)? yes
> ssh(11526) Warning: Permanently added '132.66.176.212' (RSA1) to the list of known hosts.
> root at 132.66.176.212's password:
> Last login: Sun Dec 23 15:24:41 2007 from x-math20.tau.ac.il
> [root at localhost ~]# lctl ping 132.66.176.211 at tcp0
> failed to ping 132.66.176.211 at tcp: Input/output error
> [root at localhost ~]# lctl list_nids
> 132.66.176.212 at tcp
> [root at localhost ~]#
>
> thanks for helping!!
> Avi
>
> On Dec 23, 2007 5:32 PM, Aaron Knister <aaron at iges.org> wrote:
> On the OSS, can you ping the MDS/MGS using this command:
>
> lctl ping 132.66.176.211 at tcp0
>
> If it doesn't ping, list the NIDs on each node by running
>
> lctl list_nids
>
> and tell me what comes back.
>
> -Aaron
>
>
> On Dec 23, 2007, at 9:22 AM, Avi Gershon wrote:
>
>> Hi, I could use some help.
>> I installed Lustre on 3 computers.
>> MDT/MGS:
>>
>> ************************************************************************
>> [root at x-math20 ~]# mkfs.lustre --reformat --fsname spfs --mdt --mgs /dev/hdb
>>
>>    Permanent disk data:
>> Target:     spfs-MDTffff
>> Index:      unassigned
>> Lustre FS:  spfs
>> Mount type: ldiskfs
>> Flags:      0x75
>>               (MDT MGS needs_index first_time update )
>> Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
>> Parameters:
>>
>> device size = 19092MB
>> formatting backing filesystem ldiskfs on /dev/hdb
>>         target name  spfs-MDTffff
>>         4k blocks     0
>>         options        -J size=400 -i 4096 -I 512 -q -O dir_index -F
>> mkfs_cmd = mkfs.ext2 -j -b 4096 -L spfs-MDTffff -J size=400 -i 4096 -I 512 -q -O dir_index -F /dev/hdb
>> Writing CONFIGS/mountdata
>> [root at x-math20 ~]# df
>> Filesystem           1K-blocks      Used Available Use% Mounted on
>> /dev/hda1             19228276   4855244  13396284  27% /
>> none                    127432         0    127432   0% /dev/shm
>> /dev/hdb              17105436    455152  15672728   3% /mnt/test/mdt
>> [root at x-math20 ~]# cat /proc/fs/lustre/devices
>>   0 UP mgs MGS MGS 5
>>   1 UP mgc MGC132.66.176.211 at tcp 5f5ba729-6412-3843-2229-1310a0b48f71 5
>>   2 UP mdt MDS MDS_uuid 3
>>   3 UP lov spfs-mdtlov spfs-mdtlov_UUID 4
>>   4 UP mds spfs-MDT0000 spfs-MDT0000_UUID 3
>> [root at x-math20 ~]#
>> ************************************************************ end mdt ************************************************************
>> So you can see that the MGS is up,
>> and on the OSTs I get an error!! Please help...
>>
>> OST:
>> **********************************************************************
>> [root at x-mathr11 ~]# mkfs.lustre --reformat --fsname spfs --ost --mgsnode=132.66.176.211 at tcp0 /dev/hdb1
>>
>>    Permanent disk data:
>> Target:     spfs-OSTffff
>> Index:      unassigned
>> Lustre FS:  spfs
>> Mount type: ldiskfs
>> Flags:      0x72
>>               (OST needs_index first_time update )
>> Persistent mount opts: errors=remount-ro,extents,mballoc
>> Parameters: mgsnode=132.66.176.211 at tcp
>>
>> device size = 19594MB
>> formatting backing filesystem ldiskfs on /dev/hdb1
>>         target name  spfs-OSTffff
>>         4k blocks     0
>>         options        -J size=400 -i 16384 -I 256 -q -O dir_index -F
>> mkfs_cmd = mkfs.ext2 -j -b 4096 -L spfs-OSTffff -J size=400 -i 16384 -I 256 -q -O dir_index -F /dev/hdb1
>> Writing CONFIGS/mountdata
>> [root at x-mathr11 ~]# /CONFIGS/mountdata
>> -bash: /CONFIGS/mountdata: No such file or directory
>> [root at x-mathr11 ~]# mount -t lustre /dev/hdb1 /mnt/test/ost1
>> mount.lustre: mount /dev/hdb1 at /mnt/test/ost1 failed: Input/output error
>> Is the MGS running?
>> *********************************************** end ost ***********************************************
>>
>> Can anyone point out the problem?
>> Thanks, Avi.
>>
>>
>> _______________________________________________
>> Lustre-discuss mailing list
>> Lustre-discuss at clusterfs.com
>> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
>
> Aaron Knister
> Associate Systems Administrator/Web Designer
> Center for Research on Environment and Water
>
> (301) 595-7001
> aaron at iges.org
>
>
>
>

Aaron Knister
Associate Systems Administrator/Web Designer
Center for Research on Environment and Water

(301) 595-7001
aaron at iges.org


