[Lustre-discuss] socknal_sd00 100% lower?

Brock Palen brockp at umich.edu
Fri Mar 7 10:27:42 PST 2008


On Mar 7, 2008, at 1:23 PM, Maxim V. Patlasov wrote:

> Brock,
>
>> Notice the amount of cpu time given to sd00  and how sd01 has done  
>> nothing.  What could cause this?
> Please try Isaac's recommendation:
>> So if you have multiple CPUs and a single NIC (or more precisely  
>> Lustre only uses a single NIC) I'd suggest to try:
>> options ksocklnd enable_irq_affinity=0

How do you do this on a live system?  Taking the filesystem offline is  
'bad'.

>
> Also, please note that you need several heavily loaded TCP  
> connections to get fair load balancing. In the case of a  
> point-to-point test, the recommendation above may not help.

It was seen only when I had 150+ serial gromacs jobs running trrjecov  
every 60 seconds.  We have gotten around the 60-second problem.
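
For anyone wanting to check the same sd00/sd01 imbalance on their own nodes, here is a rough sketch that sums user and system CPU time per thread from /proc/<pid>/stat. It assumes a Linux /proc and that the threads show up with names starting with "socknal_sd", as in the subject line. The function names (parse_stat, socknal_cpu_seconds) are made up for this example and are not part of Lustre.

```python
import os


def parse_stat(line):
    """Return (comm, utime + stime in clock ticks) for one /proc/<pid>/stat line."""
    # comm sits in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    start = line.index("(")
    end = line.rindex(")")
    comm = line[start + 1:end]
    fields = line[end + 2:].split()
    # fields[0] is the state (field 3 in proc(5)); utime and stime
    # are fields 14 and 15, i.e. indexes 11 and 12 here.
    return comm, int(fields[11]) + int(fields[12])


def socknal_cpu_seconds(proc="/proc", prefix="socknal_sd"):
    """Map thread name -> accumulated CPU seconds for matching threads."""
    ticks_per_sec = os.sysconf("SC_CLK_TCK")
    result = {}
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                comm, ticks = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # thread went away or the line was not parseable
        if comm.startswith(prefix):  # prefix is an assumption, see above
            result[comm] = ticks / ticks_per_sec
    return result


if __name__ == "__main__":
    for name, secs in sorted(socknal_cpu_seconds().items()):
        print(f"{name}: {secs:.2f} s")
```

Running it twice a minute or so apart and comparing the numbers should show whether only one of the threads is accumulating time.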

>
> Sincerely,
> Maxim
>
>



