<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
You can deactivate it on the MDT side only. That stops new objects from
being allocated on it, so it is effectively read-only for new files. Leave
it active on the clients so they can still access the existing files on it.<br>
<br>
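As a side note, the -28 and -16 in your log are negated Linux errno
values: 28 is ENOSPC (no space left on device) and 16 is EBUSY. Here is a
minimal sketch for decoding them from log lines, assuming Python is
available on the client. The names FAILED_RE and decode_failure are made
up for this example and are not part of Lustre.<br>
<pre>
```python
import errno
import os
import re

# Matches the tail of a "LustreError: 11-0" line, for example
# "The ost_write operation failed with -28".
FAILED_RE = re.compile(r"The\s+(\w+)\s+operation\s+failed\s+with\s+(-?\d+)")


def decode_failure(line):
    """Return (operation, errno_name, description) for a log line, or None."""
    match = FAILED_RE.search(line)
    if match is None:
        return None
    operation = match.group(1)
    # The kernel logs the errno negated, so take the absolute value.
    code = abs(int(match.group(2)))
    name = errno.errorcode.get(code, "UNKNOWN")
    return operation, name, os.strerror(code)


if __name__ == "__main__":
    samples = [
        "Feb 15 09:38:07 reshpc116 kernel: LustreError: 11-0: an error "
        "occurred while communicating with 10.0.250.47@o2ib3. The ost_write "
        "operation failed with -28",
        "Feb 15 10:16:54 reshpc116 kernel: LustreError: 11-0: an error "
        "occurred while communicating with 10.0.250.47@o2ib3. The "
        "ost_connect operation failed with -16",
    ]
    for sample in samples:
        print(decode_failure(sample))
```
</pre>
So the ost_write failures say the OST had run out of space, and the
ost_connect failures say the target was busy while the client was trying
to reconnect.<br>
<br>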
bob<br>
<br>
On 2/15/2011 1:57 PM, Jagga Soorma wrote:
<blockquote
cite="mid:AANLkTi=zpGDZVpwsWv1V7VPPANc6=dBdu=dHAtW+OGkC@mail.gmail.com"
type="cite">Hi Guys,<br>
<br>
One of my clients got a hung lustre mount this morning and I saw
the following errors in my logs:<br>
<br>
--<br>
..snip..<br>
Feb 15 09:38:07 reshpc116 kernel: LustreError: 11-0: an error
occurred while communicating with 10.0.250.47@o2ib3. The ost_write
operation failed with -28<br>
Feb 15 09:38:07 reshpc116 kernel: LustreError: Skipped 4755836
previous similar messages<br>
Feb 15 09:48:07 reshpc116 kernel: LustreError: 11-0: an error
occurred while communicating with 10.0.250.47@o2ib3. The ost_write
operation failed with -28<br>
Feb 15 09:48:07 reshpc116 kernel: LustreError: Skipped 4649141
previous similar messages<br>
Feb 15 10:16:54 reshpc116 kernel: Lustre:
6254:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
x1360125198261945 sent from reshpcfs-OST0005-osc-ffff8830175c8400
to NID 10.0.250.47@o2ib3 1344s ago has timed out (1344s prior to
deadline).<br>
Feb 15 10:16:54 reshpc116 kernel: Lustre:
reshpcfs-OST0005-osc-ffff8830175c8400: Connection to service
reshpcfs-OST0005 via nid 10.0.250.47@o2ib3 was lost; in progress
operations using this service will wait for recovery to complete.<br>
Feb 15 10:16:54 reshpc116 kernel: LustreError: 11-0: an error
occurred while communicating with 10.0.250.47@o2ib3. The
ost_connect operation failed with -16<br>
Feb 15 10:16:54 reshpc116 kernel: LustreError: Skipped 2888779
previous similar messages<br>
Feb 15 10:16:55 reshpc116 kernel: Lustre:
6254:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
x1360125198261947 sent from reshpcfs-OST0005-osc-ffff8830175c8400
to NID 10.0.250.47@o2ib3 1344s ago has timed out (1344s prior to
deadline).<br>
Feb 15 10:18:11 reshpc116 kernel: LustreError: 11-0: an error
occurred while communicating with 10.0.250.47@o2ib3. The
ost_connect operation failed with -16<br>
Feb 15 10:18:11 reshpc116 kernel: LustreError: Skipped 10 previous
similar messages<br>
Feb 15 10:20:45 reshpc116 kernel: LustreError: 11-0: an error
occurred while communicating with 10.0.250.47@o2ib3. The
ost_connect operation failed with -16<br>
Feb 15 10:20:45 reshpc116 kernel: LustreError: Skipped 21 previous
similar messages<br>
Feb 15 10:25:46 reshpc116 kernel: LustreError: 11-0: an error
occurred while communicating with 10.0.250.47@o2ib3. The
ost_connect operation failed with -16<br>
Feb 15 10:25:46 reshpc116 kernel: LustreError: Skipped 42 previous
similar messages<br>
Feb 15 10:31:43 reshpc116 kernel: Lustre:
reshpcfs-OST0005-osc-ffff8830175c8400: Connection restored to
service reshpcfs-OST0005 using nid 10.0.250.47@o2ib3.<br>
--<br>
<br>
Due to disk space issues on my Lustre filesystem, one of the OSTs
was full and I deactivated that OST this morning. I thought that
operation just puts it in a read-only state and that clients can
still access the data on that OST. After I activated this OST
again, the client reconnected and was okay. How
else would you deal with an OST that is close to 100% full? Is it
okay to leave the OST active, and will the clients know not to
write data to that OST?<br>
<br>
Thanks,<br>
-J<br>
</blockquote>
</body>
</html>