[Lustre-discuss] Lustre-discuss Digest, Vol 35, Issue 6
Denise Hummel
denise_hummel at nrel.gov
Thu Dec 4 07:39:58 PST 2008
Hi Brian;
I guess I am not being very clear. The next message in the log shows -
Dec 3 14:53:01 oss1 syslogd 1.4.1: restart.
When all the nodes locked up, I went to the console on the OSS where I
saw the OOPS - Kernel Panic message and the system was down. The
messages I sent were the last messages written to the log before the
panic/restart. I was hoping the messages would point to something I
should look at, and you suggested checking for the optimal number of threads.
Thanks, and sorry I was not clear in my first email.
Denise
On Thu, 2008-12-04 at 07:14 -0800,
lustre-discuss-request at lists.lustre.org wrote:
> Send Lustre-discuss mailing list submissions to
> lustre-discuss at lists.lustre.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
> or, via email, send a message with subject or body 'help' to
> lustre-discuss-request at lists.lustre.org
>
> You can reach the person managing the list at
> lustre-discuss-owner at lists.lustre.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Lustre-discuss digest..."
>
>
> Today's Topics:
>
> 1. Low performance of 64bit OSS (Lu Wang)
> 2. Re: Mount OST with a new journal device (Ralf Utermann)
> 3. More: setquota fails, mds adjust qunit failed (Thomas Roth)
> 4. Re: NFS Stale Handling with Lustre on RHEL 4 U7 x86_64
> (Alex Lyashkov)
> 5. Re: Lustre-discuss Digest, Vol 35, Issue 5 (Denise Hummel)
> 6. Re: Lustre-discuss Digest, Vol 35, Issue 5 (Brian J. Murrell)
> 7. Re: Low performance of 64bit OSS (Brian J. Murrell)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 4 Dec 2008 18:36:24 +0800
> From: "Lu Wang" <wanglu at ihep.ac.cn>
> Subject: [Lustre-discuss] Low performance of 64bit OSS
> To: "lustre-discuss" <lustre-discuss at lists.lustre.org>
> Message-ID: <200812041836243753373 at ihep.ac.cn>
> Content-Type: text/plain; charset="gb2312"
>
> Dear list,
>
> After upgrading our OSSs to lustre-1.6.6-2.6.9_67.0.22.EL_lustre.1.6.6smp.x86_64, the frequent crashes have disappeared. However, the two OSSs deliver low performance: 100MB/s read and 200MB/s write. I/O wait on client nodes sometimes exceeds 90%. The OSS servers are attached to 10Gb/s Ethernet and two 4Gb Express channel disk arrays. Is this a problem caused by mixing 32bit clients with 64bit OSS servers?
> Thanks.
>
>
> --------------
> Lu Wang
> Computing Center
> Institute of High Energy Physics, China
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Thu, 04 Dec 2008 14:03:25 +0100
> From: Ralf Utermann <ralf.utermann at physik.uni-augsburg.de>
> Subject: Re: [Lustre-discuss] Mount OST with a new journal device
> To: Andreas Dilger <adilger at sun.com>
> Cc: lustre-discuss at lists.lustre.org
> Message-ID: <4937D51D.1050900 at physik.uni-augsburg.de>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Andreas Dilger wrote:
> [...]
> >
> > tune2fs -O ^has_journal {filesystem_dev}
> > e2fsck -f {filesystem_dev} (not sure if required)
> > tune2fs -J device={journal_dev} {filesystem_dev}
> We are just starting with Lustre, so this has been more or less a
> test environment. We will set up the journal on shared storage for
> production.
>
> For this case: I could not remove the has_journal entry, probably
> because I removed the old journal device before doing this:
>
> alcc-ost1:~# tune2fs -f -O ^has_journal /dev/vgoss1/ost1
> tune2fs 1.40.11.sun1 (17-June-2008)
> The needs_recovery flag is set. Please run e2fsck before clearing
> the has_journal flag.
> alcc-ost1:~# e2fsck /dev/vgoss1/ost1
> e2fsck 1.40.11.sun1 (17-June-2008)
> External journal does not support this filesystem
>
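A sketch of the order of operations that should avoid the error above, assuming the old journal device is still attached when e2fsck runs (the new journal device path is illustrative, not from this thread):

```shell
# Sketch only - /dev/new_journal is an illustrative placeholder.
# 1. Replay/clear the journal while the old journal device is still attached,
#    which clears the needs_recovery flag:
e2fsck -f /dev/vgoss1/ost1
# 2. Only now detach the journal:
tune2fs -O ^has_journal /dev/vgoss1/ost1
# 3. Attach the new external journal device:
tune2fs -J device=/dev/new_journal /dev/vgoss1/ost1
```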
> Thanks for your help, Frank and Andreas,
>
> Bye, Ralf
> --
> Ralf Utermann
> _____________________________________________________________________
> Universität Augsburg, Institut für Physik -- EDV-Betreuer
> Universitätsstr. 1
> D-86135 Augsburg Phone: +49-821-598-3231
> SMTP: Ralf.Utermann at Physik.Uni-Augsburg.DE Fax: -3411
>
>
> ------------------------------
>
> Message: 3
> Date: Thu, 04 Dec 2008 14:46:54 +0100
> From: Thomas Roth <t.roth at gsi.de>
> Subject: [Lustre-discuss] More: setquota fails, mds adjust qunit
> failed
> To: lustre-discuss at lists.lustre.org
> Message-ID: <4937DF4E.6020504 at gsi.de>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hi,
>
> I'm still having these problems with resetting and setting quota. My
> Lustre system seems to be forever 'setquota failed: Device or resource
> busy'.
> Right now, I have tried to write as much as my current quota setting
> allows:
>
> # lfs quota -u troth /lustre
> Disk quotas for user troth:
> Filesystem  kbytes   quota    limit   grace   files   quota   limit   grace
> /lustre 4 3072000 309200 1 11000 10000
> lust-MDT0000_UUID
> 4* 1 1 6400
> lust-OST0000_UUID
> 0 16384
> lust-OST0001_UUID
> 0 22528
> ...
>
> I wrote some ~ 100 MB with 'dd', deleted them and tried to copy a
> directory - "Disk quota exceeded"
> Now there are several questions: the listing above indicates that on the
> MDT I have exceeded my quota - there's a 4* - without any data in my
> Lustre directory. But this is only 4kB - who knows what could take up
> 4kB. (Another question is how I managed to set the quota on the MDT to 1
> kB in the first place - unfortunately I did not write down my previous
> "lfs setquota" commands while they were still successful.)
> Still - how can I write 1 file with 2MB in this situation, and why can I
> not even make the directory (the one I wanted to copy), without any
> files in it, before the quota blocks everything?
> But wait - the story goes on. When I try to write with dd of=/dev/zero
> ..., the log of the MDT says
>
> Dec 4 14:20:39 lustre kernel: LustreError:
> 3837:0:(quota_master.c:478:mds_quota_adjust()) mds adjust qunit failed!
> (opc:4 rc:-16)
>
> This is reproducible and correlates with my write attempts.
>
> So something might be broken here?
>
> I have read further on in the Lustre Manual about quota. It keeps
> talking about parameters found "/proc/fs/lustre/lquota/..." I don't have
> a subdirectory "lquota" there - neither on the MDT nor on the OSTs. The
> parameters can be found, however, in "/proc/fs/lustre/mds/lust-MDT0000/"
> and "/proc/fs/lustre/obdfilter/lust-OSTxxxx".
> Disturbingly enough, "/proc/fs/lustre/mds/lust-MDT0000/quota_type" reads
> "off2"
> On one OST, I found it to be "off" . There, I tried "tunefs.lustre
> --param ost.quota_type=ug /dev/sdb1 ", as mentioned in the manual.
> Reading the parameters off the partition with tunefs tells me that the
> quota_type is "ug", the entry
> /proc/fs/lustre/mds/lust-MDT0000/quota_type is still "off".
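For reference, a sketch of how quota_type is typically set on both server types in Lustre 1.6, run with the targets unmounted (device names are illustrative except /dev/sdb1, which is from the thread):

```shell
# Sketch only - run with the targets unmounted; /dev/mdt_device is illustrative.
# On the MDS:
tunefs.lustre --param mdt.quota_type=ug /dev/mdt_device
# On each OSS:
tunefs.lustre --param ost.quota_type=ug /dev/sdb1
# After remounting everything, re-run quotacheck from a client:
lfs quotacheck -ug /lustre
```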
>
>
> We have had problems with quotas before, but in those cases already
> "lfs quotacheck" would fail. On this system, not only did quotacheck
> work, but while I still had quotas set to sensible values, the
> quota mechanism itself worked as desired. I conclude that this trouble
> is not because I forgot to activate quota at some earlier stage such as
> kernel compilation or formatting of the Lustre partitions.
>
> So I'm lost now and would appreciate any hint.
>
> Oh, all of these servers are running Debian Etch 64bit, kernel 2.6.22,
> Lustre 1.6.5.1
>
> Thomas
>
> Andrew Perepechko wrote:
> > Thomas,
> >
> > setquota (from quota-tools) would not work with Lustre filesystems, so
> > you cannot run it like "~# setquota -u troth 0 0 0 0 /lustre".
> >
> > lfs can be used either to set quota limits or to reset them and
> > " ~# lfs setquota -u troth 0 0 0 0 /lustre" is the correct way to
> > reset quotas.
> >
> > AFAIU, the cause of the "Device or resource busy" when setting quota
> > in your case could be that the MDS was performing setquota or quota recovery
> > for the user troth. Could you check whether the MDS is stuck inside the
> > mds_set_dqblk or mds_quota_recovery functions (you can dump
> > stack traces of running threads into the kernel log with alt-sysrq-t, provided
> > the sysctl variable kernel.sysrq equals 1)?
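Andrew's suggestion can be sketched as follows (a sketch, not from the thread; requires root on the MDS):

```shell
# Sketch: dump stack traces of all running threads into the kernel log.
sysctl -w kernel.sysrq=1        # allow magic SysRq requests
echo t > /proc/sysrq-trigger    # same effect as alt-sysrq-t on the console
# Then look for the suspect functions in the resulting traces:
dmesg | grep -B2 -A20 'mds_set_dqblk\|mds_quota_recovery'
```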
> >
> > Andrew.
> >
> > On Friday 28 November 2008 17:50:51 Thomas Roth wrote:
> >> Hi all,
> >>
> >> on an empty and unused Lustre 1.6.5.1 system I cannot reset or set the
> >>
> >> quota:
> >> > ~# lfs quota -u troth /lustre
> >> > Disk quotas for user troth:
> >> > Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
> >>
> >> > /lustre 4 3072000 309200 1 11000 10000
> >> > MDT0000_UUID
> >> > 4* 1 1 6400
> >> > OST0000_UUID
> >> > 0 16384
> >>
> >> Try to reset this quota:
> >> > ~# lfs setquota -u troth 0 0 0 0 /lustre
> >> > setquota failed: Device or resource busy
> >>
> >> Use "some" values instead:
> >> > ~# lfs setquota -u troth 104000000 105000000 100000 100000 /lustre
> >> > setquota failed: Device or resource busy
> >>
> >> I know the manual says not to use "lfs setquota" to reset quotas but -
> >> that is yet another question - of course there is a command "setquota",
> >> but it doesn't know about Lustre
> >>
> >> > ~# setquota -u troth 0 0 0 0 /lustre
> >> > setquota: Mountpoint (or device) /lustre not found.
> >> > setquota: Not all specified mountpoints are using quota.
> >>
> >> as is to be expected. Mistake in the manual?
> >>
> >> However I'm mainly interested in what causes my system to be busy, when
> >> it is not - no writes, not even reads.
> >> I did rerun "lfs quotacheck", but that didn't help, either.
> >>
> >> Anybody got any hints what to do to manipulate quotas?
> >>
> >> Thanks,
> >> Thomas
> >> _______________________________________________
> >> Lustre-discuss mailing list
> >> Lustre-discuss at lists.lustre.org
> >> http://lists.lustre.org/mailman/listinfo/lustre-discuss
> >
>
> --
> --------------------------------------------------------------------
> Thomas Roth
> Department: Informationstechnologie
> Location: SB3 1.262
> Phone: +49-6159-71 1453 Fax: +49-6159-71 2986
>
> GSI Helmholtzzentrum für Schwerionenforschung GmbH
> Planckstraße 1
> D-64291 Darmstadt
> www.gsi.de
>
> Gesellschaft mit beschränkter Haftung
> Sitz der Gesellschaft: Darmstadt
> Handelsregister: Amtsgericht Darmstadt, HRB 1528
>
> Geschäftsführer: Professor Dr. Horst Stöcker
>
> Vorsitzende des Aufsichtsrates: Dr. Beatrix Vierkorn-Rudolph,
> Stellvertreter: Ministerialdirigent Dr. Rolf Bernhardt
>
>
> ------------------------------
>
> Message: 4
> Date: Thu, 04 Dec 2008 15:53:09 +0200
> From: Alex Lyashkov <Alexey.Lyashkov at Sun.COM>
> Subject: Re: [Lustre-discuss] NFS Stale Handling with Lustre on RHEL 4
> U7 x86_64
> To: anil kumar <anil.k.kv at gmail.com>
> Cc: lustre-discuss at lists.lustre.org
> Message-ID: <1228398789.4233.10.camel at bear.shadowland>
> Content-Type: text/plain
>
> On Thu, 2008-12-04 at 13:18 +0530, anil kumar wrote:
> > Alex,
> >
> > We are working on validating Lustre's scalability so that we can
> > adopt it in our production infrastructure. Below are the details of
> > our setup, the tests conducted, and the issues faced so far.
> > Setup details :
> > --------------------
> >
> > Hardware Used - HP DL360
> > MDT/MGS - 1
> > OST - 13 (13 HP DL360 servers used, 1 OSS = 1 OST, 700gb x 13 )
> >
> > Issue1
> > ---------
> > Test Environment:
> >
> > Operating System - Redhat EL4 Update 7 ,x86_64
> > Lustre Version - 1.6.5.1
> > Lustre Kernel -
> > kernel-lustre-smp-2.6.9-67.0.7.EL_lustre.1.6.5.1.x86_64
> I think this is for the server?
>
> > Lustre Client - Xen Virtual Machines with 2.6.9-78.0.0.0.1.ELxenU
> > kernel( patchless )
> A 2.6.9 kernel for a patchless client is dangerous - some problems cannot
> be fixed due to kernel-internal limitations. I suggest applying the
> vfs_intent and dcache patches.
>
>
>
> >
> > Test Conducted: Performed heavy read/write ops from 190 Lustre
> > clients. Each client tries to read & write 14000 files in parallel.
> >
> > Errors noticed: Multiple clients were evicted while writing a huge
> > number of files. The Lustre mount is not accessible on the evicted
> > clients. We need to unmount and remount to make Lustre accessible on
> > the affected clients.
> >
> > server side errors noticed
> > -----------------------------------------
> > Nov 26 01:03:48 kernel: LustreError:
> > 29774:0:(handler.c:1515:mds_handle()) operation 41 on unconnected MDS
> > from 12345-[CLIENT IP HERE]@tcp
>
> > Nov 26 01:07:46 kernel: Lustre: farmres-MDT0000: haven't heard from
> > client 2379a0f4-f298-9c78-fce6-3d8db74f912b (at [CLIENT IP HERE]@tcp)
> > in 227 seconds. I think it's dead, and I am evicting it.
> > Nov 26 01:43:58 kernel: Lustre: MGS: haven't heard from client
> > 0c239c47-e1f7-47de-0b43-19d5819081e1 (at [CLIENT IP HERE]@tcp) in 227
> > seconds. I think it's dead, and I am evicting it.
> Both the MDS and MGS are evicting the client - is the network link OK?
>
>
> > Nov 26 01:54:37 kernel: LustreError:
> > 29766:0:(handler.c:1515:mds_handle()) operation 101 on unconnected MDS
> > from 12345-[CLIENT IP HERE]@tcp
> > Nov 26 02:09:49 kernel: LustreError:
> > 29760:0:(ldlm_lib.c:1536:target_send_reply_msg()) @@@ processing error
> > (-107) req at 000001080ba29400 x260230/t0 o101-><?>@<?>:0/0 lens 440/0 e
> > 0 to 0 dl 1227665489 ref 1 fl Interpret:/0/0 rc -107/0
> > Nov 27 01:06:07 kernel: LustreError:
> > 30478:0:(mgs_handler.c:538:mgs_handle()) lustre_mgs: operation 101 on
> > unconnected MGS
> > Nov 27 02:21:39 kernel: Lustre:
> > 18420:0:(ldlm_lib.c:525:target_handle_reconnect()) farmres-MDT0000:
> > 180cf598-1e43-3ea4-6cf6-0ee40e5a2d5e reconnecting
> > Nov 27 02:22:16 kernel: Lustre: Request x2282604 sent from
> > farmres-MDT0000 to NID [CLIENT IP HERE]@tcp 6s ago has timed out
> > (limit 6s).
>
> > Nov 27 02:22:16 kernel: LustreError: 138-a: farmres-MDT0000: A client
> > on nid [CLIENT IP HERE]@tcp was evicted due to a lock blocking
> > callback to [CLIENT IP HERE]@tcp timed out: rc -107
>
>
> > Nov 27 08:58:46 kernel: LustreError:
> > 29755:0:(upcall_cache.c:325:upcall_cache_get_entry()) acquire timeout
> > exceeded for key 0
> > Nov 27 08:59:11 kernel: LustreError:
> > 18473:0:(upcall_cache.c:325:upcall_cache_get_entry()) acquire timeout
> > exceeded for key 0
> Hm... as far as I know, this is a bug in the FS configuration. Can you
> reset mdt.group_upcall to 'NONE'?
>
>
> > Nov 27 13:23:25 kernel: Lustre:
> > 29752:0:(ldlm_lib.c:525:target_handle_reconnect()) farmres-MDT0000:
> > 3d5efff1-1652-6669-94de-c93ee73a4bc7 reconnecting
> > Nov 27 02:17:16 kernel: nfs_statfs: statfs error = 116
> > ------------------------
> >
> > client errors
> > ------------------------
> >
> > cp: cannot stat
> > `/master/jdk16/sample/jnlp/webpad/src/version1/JLFAbstractAction.java': Cannot send after transport endpoint shutdown
> > -------------------------
> >
> > Does Lustre support the Xen kernel 2.6.9-78.0.0.0.1.ELxenU as patchless?
> With some limitations. I suggest using 2.6.15 and up for a patchless
> client. For 2.6.16 I know about one limitation - the FMODE_EXEC patch
> is absent.
>
> What is in the clients' /var/log/messages at the same time?
> >
>
>
>
> ------------------------------
>
> Message: 5
> Date: Thu, 04 Dec 2008 07:40:16 -0700
> From: Denise Hummel <denise_hummel at nrel.gov>
> Subject: Re: [Lustre-discuss] Lustre-discuss Digest, Vol 35, Issue 5
> To: lustre-discuss at lists.lustre.org
> Message-ID: <1228401616.11163.152.camel at dhummel.nrel.gov>
> Content-Type: text/plain
>
> Hi Brian;
>
> Thanks for the advice. The messages you saw were immediately prior to
> the kernel panic - the console showed the kernel panic and the messages
> on the console were about brw_writes and OST timeouts.
> I did do a baseline, so will try to determine the appropriate number of
> threads. You are right that we were probably oversubscribing the
> storage and just recently became overloaded with the number of Gaussian
> jobs running.
> Is a kernel panic typical in this situation?
>
> Thanks,
> Denise
>
>
> On Thu, 2008-12-04 at 01:27 -0800,
> lustre-discuss-request at lists.lustre.org wrote:
>
>
>
> ------------------------------
>
> Message: 6
> Date: Thu, 04 Dec 2008 09:51:19 -0500
> From: "Brian J. Murrell" <Brian.Murrell at Sun.COM>
> Subject: Re: [Lustre-discuss] Lustre-discuss Digest, Vol 35, Issue 5
> To: lustre-discuss at lists.lustre.org
> Message-ID: <1228402279.30988.2262.camel at pc.interlinx.bc.ca>
> Content-Type: text/plain; charset="us-ascii"
>
> On Thu, 2008-12-04 at 07:40 -0700, Denise Hummel wrote:
> > Hi Brian;
>
> Hi.
>
> > Thanks for the advice.
>
> NP.
>
> > The messages you saw were immediately prior to
> > the kernel panic
>
> There was no kernel panic in the messages you sent. You need to
> understand that watchdog timeouts are not kernel panics, although they
> do show a stack trace similar to a kernel panic's.
>
> If you do have an actual kernel panic, it was not included in the
> messages you sent.
>
> > I did do a baseline, so will try to determine the appropriate number of
> > threads. You are right that we were probably oversubscribing the
> > storage and just recently became overloaded with the number of Gaussian
> > jobs running.
> > Is it typical for a kernel panic in this situation?
>
> As I have said before, there was no kernel panic.
>
> b.
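For background on the thread-count tuning discussed above: in Lustre 1.6 the OSS service thread count can be capped with a module option, sketched below (the value 128 is illustrative, not a recommendation, and /proc paths may vary by version):

```shell
# Sketch only - 128 is an illustrative value, not a recommendation.
# On each OSS, cap the number of OST service threads at module load:
echo "options ost oss_num_threads=128" >> /etc/modprobe.conf
# After reloading the modules, the counts can be inspected under /proc,
# e.g. (exact path may vary by version):
cat /proc/fs/lustre/ost/OSS/ost_io/threads_started
```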
>
> -------------- next part --------------
> A non-text attachment was scrubbed...
> Name: not available
> Type: application/pgp-signature
> Size: 197 bytes
> Desc: This is a digitally signed message part
> Url : http://lists.lustre.org/pipermail/lustre-discuss/attachments/20081204/5e4a1ca4/attachment-0001.bin
>
> ------------------------------
>
> Message: 7
> Date: Thu, 04 Dec 2008 10:14:00 -0500
> From: "Brian J. Murrell" <Brian.Murrell at Sun.COM>
> Subject: Re: [Lustre-discuss] Low performance of 64bit OSS
> To: lustre-discuss <lustre-discuss at lists.lustre.org>
> Message-ID: <1228403640.30988.2290.camel at pc.interlinx.bc.ca>
> Content-Type: text/plain; charset="us-ascii"
>
> On Thu, 2008-12-04 at 18:36 +0800, Lu Wang wrote:
> > Dear list,
> >
> > After upgrade our OSSs to lustre-1.6.6-2.6.9_67.0.22.EL_lustre.1.6.6smp.x86_64, the phenomenon of frequent crash disappears.
>
> Good. What version did you upgrade from?
>
> > However, the two OSS provide low performance: 100MB/s read, and 200MB/s write.
>
> What was your read/write performance before the upgrade? Are these 100
> and 200 MB/s figures measured at the clients, or are they measurements
> of the disk speed at the OSS?
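One way to separate those two measurements, sketched under the assumption that a scratch area is available on the OSS (all paths are illustrative):

```shell
# Sketch only - paths are illustrative.
# Raw sequential write speed on the OSS, bypassing the Lustre client
# stack and the page cache:
dd if=/dev/zero of=/mnt/ost_scratch/ddtest bs=1M count=1024 oflag=direct
# Client-side throughput to the Lustre mount, flushed to disk for a
# fair comparison:
dd if=/dev/zero of=/lustre/ddtest bs=1M count=1024 conv=fsync
```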
>
> > Is it a problem caused by 32bit client and 64bit OSS server?
>
> No. Mixed 32/64 bit clusters should not cause a slow-down.
>
> b.
>
> -------------- next part --------------
> A non-text attachment was scrubbed...
> Name: not available
> Type: application/pgp-signature
> Size: 197 bytes
> Desc: This is a digitally signed message part
> Url : http://lists.lustre.org/pipermail/lustre-discuss/attachments/20081204/f95805fd/attachment.bin
>
> ------------------------------
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>
>
> End of Lustre-discuss Digest, Vol 35, Issue 6
> *********************************************