[lustre-discuss] File writes blocking on Lustre 2.7.0
Bob Ball
ball at umich.edu
Fri Jun 5 11:25:27 PDT 2015
OK, this was just odd to me. We have a Lustre 2.7.0 system running now.
After setting up our first OSS and copying over all files from the old
system, we brought the new file system online. The new backing store is
ZFS. All was well with the world.
Meanwhile, all the old file servers were rebuilt with a ZFS backing
store, and I am now using lfs_migrate to balance out the files.
So, two questions for which I would like opinions.
1. A single disk failed on one of the old, rebuilt file servers, and ALL
lfs_migrate threads blocked upon the failure, all at the same time.
This behavior was unexpected. Should I have been surprised? This was
not the case for Lustre 2.1.6 (well, that was upgraded from 1.8.x). Is
there a configuration setting that I can change to alter this
behavior? The lfs_migrate threads resumed once the failure condition on
that one OSS/OST was cleared.
2. As I was starting to migrate 22M files off the original 2.7.0 server,
I deactivated those OSTs on the combined mgs/mdt machine. I saw at this
point that the occupied space was, apparently, not dropping on any OST
of that original OSS, while it was now growing on the other OSS/OSTs. I
found LU-4825 about "lfs migrate not freeing space on OST".
Re-activating these OSTs re-established the used space correctly. Is
there another way to prevent new files from going to these "migrating
off from" OSTs than by deactivating them? It just seems to me that
such a huge llog replay, assuming I leave the OSTs deactivated during the
whole migration, is just not a good idea; or is that just my fear and
ignorance speaking?
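For reference, the drain sequence described in question 2 can be
sketched roughly as below. This is a dry run that only prints the
commands; it does not execute them. The file system name "lustre", the
mount point /mnt/lustre, the OST index 0004, and the device number 7
(as would be reported by "lctl dl" on the MDS) are all hypothetical
placeholders, not values from our site.

```shell
#!/bin/sh
# Dry-run sketch of draining one OST: prints commands, runs nothing.
# FSNAME and MOUNT are hypothetical placeholders; override as needed.
FSNAME="${FSNAME:-lustre}"
MOUNT="${MOUNT:-/mnt/lustre}"

drain_plan() {
    ost_index="$1"   # four-digit hex OST index, e.g. 0004 (assumed)
    devno="$2"       # device number from `lctl dl` on the MDS (assumed)
    ost_uuid="${FSNAME}-OST${ost_index}_UUID"

    # On the MDS: stop new objects from being allocated on this OST.
    echo "lctl --device ${devno} deactivate"
    # On a client: find files with objects on that OST and migrate them.
    echo "lfs find --obd ${ost_uuid} ${MOUNT} | lfs_migrate -y"
    # On the MDS: re-activate so that freed space is accounted for again.
    echo "lctl --device ${devno} activate"
}

drain_plan 0004 7
```

The final "activate" step reflects what I observed: used space on the
drained OSTs was only corrected after re-activation.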
Thanks for any advice on these 2 questions.
bob