[Lustre-devel] [slurm-dev] Lustre 1.8.2 client, Text file busy
Christopher J. Morrone
morrone2 at llnl.gov
Mon Mar 29 16:48:00 PDT 2010
We opened bug 22492 on this issue. Feel free to attach your reproducer
script and observations there!
Kent Engström wrote:
> [Cc: to the slurm-dev list as this has been discussed there.]
> After an upgrade to Lustre 1.8.2 (patchless client on top of CentOS 5.4)
> on one of our compute clusters, we have been getting reports of
> spurious "Text file busy" messages.
> I have not seen any reports on the Lustre lists about this yet.
> A colleague of mine was able to reproduce it reliably, and I've written
> a small reproducer script:
> $ cat reproducer.sh
> #!/bin/sh
> rm myscript
> cat <<EOF >myscript
> #!/bin/sh
> echo "running"
> EOF
> chmod +x myscript
> rm mycopy
> while :; do
>     i=$(expr $i + 1)
>     echo COPY $i
>     cp myscript mycopy
>     echo RUN $i
>     ./mycopy
>     sleep 1
> done
> When I run this on a Lustre filesystem, I invariably get:
> $ ./reproducer.sh
> COPY 1
> RUN 1
> COPY 2
> RUN 2
> ./reproducer.sh: ./mycopy: /bin/sh: bad interpreter: Text file busy
> COPY 3
> RUN 3
> COPY 4
> RUN 4
> COPY 5
> RUN 5
> COPY 6
> RUN 6
> If I insert an "rm mycopy" command before the copy, I get no error.
> $ uname -r; rpm -q lustre
> (patchless client built from the 1.8.2 source with "make rpms")
> The servers for the filesystem are running
> I've tested the same code on another cluster that mounts the same
> filesystem. It runs CentOS 4 with patchless client
> The error cannot be reproduced there.
> I also expect that there will be no "Text file busy" error when I revert
> a node on the first cluster to 18.104.22.168 and run the test script, which I
> will proceed to do now.
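
For reference, the variant mentioned in the quoted message (an "rm mycopy" before each copy) could look roughly like the sketch below. It is an illustration only, not the script that was actually run: the function name run_with_unlink, its two arguments, and the bounded loop are assumptions made for this example. With the rm in place, cp creates a new file each time rather than rewriting the file that was just executed, which is the condition the quoted message reports as error-free.

```shell
#!/bin/sh
# Sketch of the workaround variant: unlink mycopy before every copy.
# run_with_unlink and its arguments are hypothetical names for this example.

run_with_unlink() {
    dir=$1      # directory to test in, e.g. a path on the Lustre mount
    count=$2    # number of copy/run iterations
    (
        cd "$dir" || exit 1
        # Same two-line test script as in the reproducer.
        printf '#!/bin/sh\necho "running"\n' >myscript
        chmod +x myscript
        i=0
        while [ "$i" -lt "$count" ]; do
            i=$((i + 1))
            echo "COPY $i"
            rm -f mycopy          # the extra step reported to avoid the error
            cp myscript mycopy
            echo "RUN $i"
            ./mycopy || exit 1    # stop at the first failed exec
            sleep 1
        done
    )
}
```

Calling "run_with_unlink /path/on/lustre 10" (the path is a placeholder) runs ten iterations and returns non-zero as soon as one execution of ./mycopy fails, so the result can be compared directly with the unmodified reproducer on the same node.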