[lustre-discuss] lost files on ZFS

Thomas Roth t.roth at gsi.de
Sun Oct 30 05:33:12 PDT 2016


Hi all,

We have a large number of files that show ??? in 'ls' output, together
with the error "Cannot allocate memory".
The corresponding error on the OSS is
"lvbo_init failed for resource ... rc = -2"
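
One way to enumerate the affected files, as a minimal sketch: 'ls' prints
??? for an entry when stat() on it fails, so a script on a client can walk
a directory tree and collect the paths whose lstat() fails with ENOMEM
("Cannot allocate memory"). The function name find_unstatable is made up
for illustration, and nothing in it is Lustre-specific.

```python
import errno
import os


def find_unstatable(root, wanted_errno=errno.ENOMEM):
    """Return paths of files under root whose lstat() fails with wanted_errno."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                os.lstat(path)
            except OSError as exc:
                # Keep only the failure we are looking for; other errors
                # (e.g. permission problems) are ignored in this sketch.
                if exc.errno == wanted_errno:
                    bad.append(path)
    return bad
```

Calling find_unstatable() on a directory inside the client mount point
would return the list of broken paths, which could then be grouped by OST
to check the distribution.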

This seems similar to LU-5457 (although the OSTs do not go into disconn 
state).
Our filesystem has been running Lustre 2.5.3 with ZFS 0.6.3 from the
start. So, per Oleg's explanation,
"this could be fallout from earlier sync failures where OST announced it 
created some objects, failed to sync that to disk and then after dying 
and restarting the objects that were handed out by MDTs out of this pool 
are no longer there"
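
For reference, the "rc = -2" in the OSS message follows the kernel
convention of returning a negated errno value; on Linux, errno 2 is ENOENT
("No such file or directory"), which fits the explanation that the object
is no longer there. A small sketch for decoding such return codes (the
helper name decode_rc is hypothetical):

```python
import errno
import os


def decode_rc(rc):
    """Map a kernel-style negative return code to (errno name, message)."""
    code = abs(rc)
    # errno.errorcode maps numeric codes to symbolic names on this platform.
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)
```

The same helper applied to the client side shows that "Cannot allocate
memory" corresponds to ENOMEM (errno 12 on Linux).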

The affected OSTs are evenly distributed, however.
Determining the creation time of those files is difficult at best, but I
am not aware of any series of crashes affecting so many OSSes in recent
months.
And how can this happen with ZFS-OSTs? Should this be possible so easily?

Regards,
Thomas

------------------------------------------------------------------------
-- 
Thomas Roth
Department: Informationstechnologie
Location: SB3 1.250
Phone: +49-6159-71 1453  Fax: +49-6159-71 2986

GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1
64291 Darmstadt
www.gsi.de

Limited liability company (GmbH)
Registered office: Darmstadt
Commercial register: Amtsgericht Darmstadt, HRB 1528

Managing directors: Ursula Weyrich
Professor Dr. Karlheinz Langanke
Jörg Blaurock

Chair of the Supervisory Board: St Dr. Georg Schütte
Deputy: Ministerialdirigent Dr. Rolf Bernhardt

