[Lustre-discuss] problem with few partitions

Giacinto Donvito giacinto.donvito at ba.infn.it
Tue Jan 26 06:48:25 PST 2010


Hi all, 

I am having a problem getting clients connected to a few partitions of a Lustre file system.

In particular, some days ago we had serious issues on two RAID sets.
After a reboot the partitions became available again, at least locally on the file server. I tried to mount the partition after running "e2fsck" on it. The e2fsck found some issues and fixed them.

It seems that most of the nodes stayed connected to the partitions that experienced problems, but the nodes on which those partitions were "deactivated" are not able to rejoin the affected partitions.

In particular, on the server side I see these errors in the logs:

Jan 26 14:43:59 dot1-se-01 kernel: LustreError: 6542:0:(filter_io_26.c:684:filter_commitrw_write()) error starting transaction: rc = -30
Jan 26 14:44:08 dot1-se-01 kernel: LustreError: 6578:0:(filter_io_26.c:684:filter_commitrw_write()) error starting transaction: rc = -30
Jan 26 14:44:18 dot1-se-01 kernel: LustreError: 6485:0:(filter_io_26.c:684:filter_commitrw_write()) error starting transaction: rc = -30
Jan 26 14:44:18 dot1-se-01 kernel: LustreError: 6555:0:(filter_io_26.c:684:filter_commitrw_write()) error starting transaction: rc = -30
Jan 26 14:44:26 dot1-se-01 kernel: LustreError: 6496:0:(filter_io_26.c:684:filter_commitrw_write()) error starting transaction: rc = -30
Jan 26 14:44:28 dot1-se-01 kernel: LustreError: 6512:0:(filter_io_26.c:684:filter_commitrw_write()) error starting transaction: rc = -30

while on the client I see: 

Jan 26 15:24:06 pccms35 kernel: LustreError: 11-0: an error occurred while communicating with 212.189.205.34 at tcp. The ost_connect operation failed with -30
Jan 26 15:24:06 pccms35 kernel: LustreError: Skipped 77 previous similar messages
Jan 26 15:25:46 pccms35 kernel: Lustre: 9624:0:(import.c:508:import_select_connection()) lustre-OST0001-osc-ffff81019f1d3800: tried all connections, increasing latency to 36s
Jan 26 15:25:46 pccms35 kernel: Lustre: 9624:0:(import.c:508:import_select_connection()) Skipped 77 previous similar messages
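
As a reading aid for the logs above: the negative numbers are Linux errno values reported by the kernel, and 30 corresponds to EROFS (read-only file system). Below is a minimal Python sketch, not part of any Lustre tooling, that pulls the code out of lines like these and names it. The function name `decode_rc` and the regular expression are illustrative assumptions, and the sample lines are shortened copies of the messages above.

```python
import errno
import os
import re

# Matches the two forms seen in the logs: "rc = -30" and "failed with -30".
# The pattern is an assumption based only on the lines quoted in this message.
RC_PATTERN = re.compile(r"(?:rc = |failed with )-(\d+)")


def decode_rc(line):
    """Return (code, errno name, description) for the first error code
    found in a log line, or None if the line carries no such code."""
    match = RC_PATTERN.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    name = errno.errorcode.get(code, "UNKNOWN")
    return code, name, os.strerror(code)


server_line = ("LustreError: 6542:0:(filter_io_26.c:684:filter_commitrw_write()) "
               "error starting transaction: rc = -30")
client_line = "The ost_connect operation failed with -30"

for line in (server_line, client_line):
    print(decode_rc(line))
```

On Linux both lines decode to EROFS. That would be consistent with the file system backing the OST having gone read-only on the server, but this is only a guess from the error code, not something the logs themselves confirm.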

The same behavior is also shown by "new" clients joining the cluster.

Any hints on this kind of issue?

Best Regards, 
Cheers,
Giacinto

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Giacinto Donvito    LIBI -- EGEE3 SA1 INFN - Bari ITALY
------------------------------------------------------------------
giacinto.donvito at ba.infn.it                   | GTalk/GMail: donvito.giacinto at gmail.com
tel. +39 080 5443244   Fax  +39 0805442470    | Skype: giacinto_it
VOIP:  +41225481596           | MSN: donvito.giacinto at hotmail.it
AIM/iChat: gdonvito1                          | Yahoo: eric1_it 
------------------------------------------------------------------
"At least once in a lifetime
it is convenient to put everything to discussion"
Descartes
