[Lustre-discuss] Thread might be hung, Heavy IO Load messages
cthomaz at ddn.com
Wed Feb 1 16:33:03 PST 2012
The number of OSS service threads is a function of your RAM size and CPUs. It's
difficult to say what would be a good upper limit without knowing the size
of your OSS, the number of clients, the storage back-end and the workload. But
the good thing is you can try new values on the fly via the lctl set_param
command (quick example below). Assuming you are running Lustre 1.8, the ops
manual has a good explanation of how to tune this. A couple of remarks:
- reducing the number of OSS threads may impact performance, depending on
what your workload is.
- unfortunately, I guess you will need to try and see what happens. I would
go for 128 and analyze the behavior of your OSSs (via log files) while also
keeping an eye on your workload. It seems to me that 300 is a bit too high
(but again, I don't know what you have on your storage back-end or OSS nodes).
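Just to illustrate the on-the-fly part, something along these lines on the
OSS should do it (a rough sketch assuming the usual 1.8 parameter names,
which mirror the /proc path you mention below; double-check them on your
system):

  # how many ost_io threads are currently running, and the current cap
  lctl get_param ost.OSS.ost_io.threads_started
  lctl get_param ost.OSS.ost_io.threads_max

  # lower the cap without a reboot
  lctl set_param ost.OSS.ost_io.threads_max=128

Keep in mind that set_param changes are not persistent across a reboot; to
make a thread limit permanent you would typically set the oss_num_threads
module option for the ost module in modprobe.conf (check the manual for the
exact syntax on your version).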
I can't tell you much about lru_size, but as far as I understand the values
are dynamic and there's not much to do other than clear the least recently
used queue or disable the dynamic LRU sizing. I can't help much on this
other than pointing you to the explanation for it in the ops manual (see 31.2.11).
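For what it's worth, those two operations would look roughly like this on a
client (again a sketch based on my reading of the manual; verify the
namespace wildcard on your setup):

  # drop the unused locks on all OSC namespaces (clears the LRU)
  lctl set_param ldlm.namespaces.*osc*.lru_size=clear

  # or pin lru_size to a fixed value, which disables the dynamic sizing
  lctl set_param ldlm.namespaces.*osc*.lru_size=600

  # setting it back to 0 re-enables the dynamic LRU sizing
  lctl set_param ldlm.namespaces.*osc*.lru_size=0

As far as I understand, a fixed value acts as a static limit on the number
of locks kept per namespace, while 0 turns the automatic sizing back on.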
Carlos Thomaz | HPC Systems Architect
Mobile: +1 (303) 519-0578
cthomaz at ddn.com | Skype ID: carlosthomaz
DataDirect Networks, Inc.
9960 Federal Dr., Ste 100 Colorado Springs, CO 80921
ddn.com <http://www.ddn.com/> | Twitter: @ddn_limitless
<http://twitter.com/ddn_limitless> | 1.800.TERABYTE
On 2/1/12 2:11 PM, "David Noriega" <tsk133 at my.utsa.edu> wrote:
>zone_reclaim_mode is 0 on all clients/servers
>When changing the number of service threads or the lru_size, can these be
>done on the fly or do they require a reboot of either the client or the
>servers?
>For my two OSSs, cat /proc/fs/lustre/ost/OSS/ost_io/threads_started
>gives about 300 (300, 359), so I'm thinking of trying half of that and
>seeing how it goes.
>Also, checking lru_size (via cat on the proc files), I get different
>numbers from the clients:
>Client: MDT0 OST0 OST1 OST2 OST3 MGC
>head node: 0 22 22 22 22 400 (only a few users logged in)
>busy node: 1 501 504 503 505 400 (Fully loaded with jobs)
>samba/nfs server: 4 440070 44370 44348 26282 1600
>So my understanding is that lru_size is set to auto by default, hence the
>varying values, but setting it manually effectively sets a max value?
>Also, what does it mean to have a lower value (especially in the case of
>the samba/nfs server)?
>On Wed, Feb 1, 2012 at 1:27 PM, Charles Taylor <taylor at hpc.ufl.edu> wrote:
>> You may also want to check and, if necessary, limit the lru_size on
>>your clients. I believe there are guidelines in the ops manual.
>>We have ~750 clients and limit ours to 600 per OST. That, combined
>>with setting zone_reclaim_mode=0, should make a big difference.
>> Charlie Taylor
>> UF HPC Center
>> On Feb 1, 2012, at 2:04 PM, Carlos Thomaz wrote:
>>> Hi David,
>>> You may be facing the same issue discussed on previous threads, which
>>> is the issue regarding zone_reclaim_mode.
>>> Take a look at the previous thread where Kevin and I replied to
>>> Vijesh Ek.
>>> If you don't have access to the previous emails, look at your kernel
>>> settings for the zone reclaim:
>>> cat /proc/sys/vm/zone_reclaim_mode
>>> It should be set to 0.
>>> Also, look at the number of Lustre OSS service threads. It may be set too high.
>>> Carlos Thomaz | HPC Systems Architect
>>> Mobile: +1 (303) 519-0578
>>> cthomaz at ddn.com | Skype ID: carlosthomaz
>>> DataDirect Networks, Inc.
>>> 9960 Federal Dr., Ste 100 Colorado Springs, CO 80921
>>> ddn.com <http://www.ddn.com/> | Twitter: @ddn_limitless
>>> <http://twitter.com/ddn_limitless> | 1.800.TERABYTE
>>> On 2/1/12 11:57 AM, "David Noriega" <tsk133 at my.utsa.edu> wrote:
>>>> indicates the system was overloaded (too many service threads, or
>> Charles A. Taylor, Ph.D.
>> Associate Director,
>> UF HPC Center
>> (352) 392-4036
>Computational Biology Initiative
>High Performance Computing Center
>University of Texas at San Antonio
>One UTSA Circle
>San Antonio, TX 78249
>Office: BSE 3.112