[lustre-discuss] Suspended jobs and rebooting lustre servers

Raj Ayyampalayam ansraj at gmail.com
Wed Feb 20 10:08:58 PST 2019


Hello,

We are planning to expand our storage by adding more OSTs to our Lustre
file system. We originally planned to add the new OSTs to the live file
system, but it looks like the expansion would be easier if we bring the
file system down and perform the necessary operations. We are planning to
suspend all the jobs running on the cluster beforehand.

We are trying to determine the potential impact on the suspended jobs if we
bring down the file system for the upgrade.
In particular, what would happen to suspended processes that hold open file
handles on the Lustre file system when it is brought down for the upgrade?
Will they recover from the client eviction?

We do have vendor support and have engaged them, but I also wanted to ask
the community for feedback.

Thanks,
-Raj

