[Lustre-discuss] obdfilter-survey crashing

John Hammond jhammond at ices.utexas.edu
Wed Jan 5 09:25:29 PST 2011


On 01/04/2011 02:14 PM, robert wrote:
> Hi Everyone!
> 
> I just set up a Lustre system on CentOS 5.5 with Lustre 1.8.5. There are
> three identical OSSes with four OSTs each.
> 
> After having fantastic write rates but low read rates, I ran the
> obdfilter-survey script to get a hint of what may cause this.
> 
> Unfortunately, obdfilter-survey in case=disk mode freezes on two of my
> three OSSes at the write task of the 4 objs, 16 threads line and leaves
> the system in an unstable state requiring a reboot. The other OSS runs
> through the script without problems. To exclude a problem in the
> system's setup, I booted one of the bad OSSes with the working OSS's disk,
> with the same faulty result. Creating a new filesystem on all OSTs of
> one of the problem OSSes did not do the trick either.
> 
> Any ideas what may cause this behavior? Thanks!

Do you have panic_on_lbug set?
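
A minimal sketch for checking that setting. It assumes the tunable is
exposed at /proc/sys/lnet/panic_on_lbug, which I believe is the case on
1.8-era nodes; the function name check_panic_on_lbug is only for
illustration, and the path can be overridden if your system differs.

```shell
# Report whether panic_on_lbug is enabled.
# Assumption: the setting lives at /proc/sys/lnet/panic_on_lbug;
# pass a different path as the first argument if yours differs.
check_panic_on_lbug() {
    file="${1:-/proc/sys/lnet/panic_on_lbug}"
    if [ ! -r "$file" ]; then
        # File missing or unreadable (e.g. Lustre modules not loaded)
        echo "unknown"
        return 2
    fi
    if [ "$(cat "$file")" = "1" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}
```

Run check_panic_on_lbug with no arguments on each OSS. If it is disabled,
an LBUG does not panic the node, which would fit a system that stays up
but is unstable until rebooted.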

It's easy to LBUG Lustre by interrupting (Ctrl-C/SIGINT/Arrivederci Roma) a
running obdfilter-survey.  Using 1.8.4 on RHEL 5.5:

[root@oss21 obdfilter-survey]# nobjhi=2 thrhi=2 size=1024 case=disk sh obdfilter-survey
Wed Jan  5 10:51:05 CST 2011 Obdfilter-survey for case=disk from oss21.ranger.tacc.utexas.edu
ost  6 sz  6291456K rsz 1024K obj    6 thr    6 write
^C

[root@oss21 ~]# dmesg
[87251.960393] Lustre: 11759:(echo_client.c:1409:echo_client_cleanup()) ASSERTION(eco->eco_refcount == 0) failed
[87251.960451] Lustre: 11759:(echo_client.c:1409:echo_client_cleanup()) LBUG()
[87251.960482] Pid: 11759, comm: lctl
...

See https://bugzilla.lustre.org/show_bug.cgi?id=21745
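
To see whether the frozen OSSes hit the same assertion, the kernel log can
be filtered for the two markers shown in the dmesg output above. This is
only a sketch; find_lbug_lines is a made-up helper name, and it simply
wraps grep.

```shell
# Print only the lines that mention an LBUG or a failed ASSERTION.
# Reads from standard input so it works on live dmesg output or a saved log.
find_lbug_lines() {
    grep -E 'LBUG|ASSERTION'
}
```

For example, run "dmesg | find_lbug_lines" on a node right after the
survey hangs, before rebooting it.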

-- 
John L. Hammond, Ph.D.
ICES, The University of Texas at Austin
jhammond at ices.utexas.edu
(512) 471-9304


