[Lustre-discuss] Recovery Problem

Stefano Elmopi stefano.elmopi at sociale.it
Fri May 21 04:49:41 PDT 2010



Hi,

I realized that the system time differed considerably across the machines:
there was a difference of at least a few hours.
I am only running tests and had not been paying attention to time
synchronization, but I have now aligned the clocks on all servers and
configured the ntpd service, and the problem no longer occurs.
I suspect the cause of the problem was simply the clock misalignment.
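
For anyone who hits the same thing, here is a rough sketch of the kind of
sanity check I should have done first. It assumes you have already collected
one timestamp per host by some means (for example by running `date` on each
machine at roughly the same moment); the host names and the one-second
threshold are made up for illustration and are not a Lustre requirement.

```python
from datetime import datetime, timedelta


def find_skewed_hosts(reported, reference, max_skew=timedelta(seconds=1)):
    """Return {host: offset} for hosts whose clock differs from the
    reference time by more than max_skew (an arbitrary threshold)."""
    skewed = {}
    for host, timestamp in reported.items():
        offset = timestamp - reference
        if abs(offset) > max_skew:
            skewed[host] = offset
    return skewed


# Hypothetical sample data: one server is three hours ahead.
reference = datetime(2010, 5, 21, 11, 0, 0)
reported = {
    "mds01": reference,
    "oss01": reference + timedelta(hours=3),
    "client01": reference - timedelta(seconds=0.2),
}

for host, offset in sorted(find_skewed_hosts(reported, reference).items()):
    print(f"{host}: off by {offset.total_seconds():.0f} s")
```

This only reports the skew; fixing it is still a matter of setting the clocks
and keeping ntpd running on every node.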
Thanks, and sorry for the trouble!



Cheers, Stefano


Ing. Stefano Elmopi
Gruppo Darco - Resp. ICT Sistemi
Via Ostiense 131/L Corpo B, 00154 Roma

cell. 3466147165
tel.  0657060500
email:stefano.elmopi at sociale.it

"Pursuant to the law on the protection of personal privacy
(Legislative Decree no. 196/2003), this email is intended solely for the
persons indicated above, and the information it contains is to be considered
strictly confidential. Reading, copying, using or disseminating the contents
of this email without authorization is prohibited. If you have received this
message in error, please return it to the sender. Thank you."

On 20 May 2010, at 13:28, Johann Lombardi wrote:

> On Thu, May 20, 2010 at 12:29:41PM +0200, Stefano Elmopi wrote:
>> Hi Andreas
>> My version of Lustre 1.8.3
>> Sorry for my bad English but I used the wrong word, "crash" is not the
>> right word.
>> I try to explain better, I start copying a large file on the file system
>> and while the copy process continues, I reboot the server OSS,
>> and the copy process enters state "- stalled -".
>> I expected that once the server back online, the copy process to resume
>> normal and complete copy of the file, instead the copy process fault.
>> Therefore the copy process that goes wrong, Lustre continues to perform
>> good.
>
> May 19 13:46:31 mdt01prdpom kernel: LustreError: 167-0: This client was evicted by lustre01-OST0000; in progress operations using this service will fail.
>
> The cp process failed because the client got evicted by the OSS.
> We need to look at the OSS logs to figure out the root cause of
> the eviction.
>
> Johann
