[Lustre-discuss] Lustre clients getting evicted
Andreas Dilger
adilger at sun.com
Mon Feb 4 11:17:34 PST 2008
On Feb 04, 2008 13:17 -0500, Brock Palen wrote:
>> On Monday 04 February 2008 07:11:11 am Brock Palen wrote:
>>> on our cluster that has been running Lustre for about 1 month. I have
>>> 1 MDT/MGS and 1 OSS with 2 OSTs.
>>>
>>> Our cluster uses all GigE and has about 608 nodes / 1854 cores.
>>
>> This seems to be a lot of clients for only one OSS (and thus for only
>> one GigE link to the OSS).
>
> It's more for evaluation; the 'real' file system is an NFS file system
> provided by an OnStor Bobcat, so anything is an improvement. The cluster IS
> too big, but there isn't a person at the university who is willing to pay
> for anything other than more cluster nodes. Enough with politics.
I'd suggest increasing the Lustre timeout to avoid evictions when the system
is overloaded.
Temporarily (on the MDS, OSS, and all client nodes):
[root at mds]# sysctl -w lustre.timeout=300
If this helps you can set it permanently on the MGS (MDS) node:
mgs> lctl conf_param testfs-MDT0000.sys.timeout=300
replacing "testfs" with the actual name of your filesystem.
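Since the temporary sysctl setting has to be applied on the MDS, the OSS,
and every client, a fan-out loop over ssh is one way to do it. The sketch
below uses hypothetical hostnames and runs in dry-run mode (it only prints
the commands); clear DRYRUN to actually execute them:

```shell
#!/bin/sh
# Dry-run sketch: print the commands that would raise the Lustre RPC
# timeout to 300 seconds on the servers and all clients.
# Hostnames here (mds1, oss1, node001...) are placeholders for your site.
DRYRUN=echo   # set DRYRUN= (empty) to really run the commands

for host in mds1 oss1 node001 node002; do
    $DRYRUN ssh "$host" sysctl -w lustre.timeout=300
done
```

You can check the current value on any node with `sysctl lustre.timeout`
before and after the change.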
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.