[Lustre-discuss] page allocation failure

Andreas Dilger adilger at sun.com
Fri Nov 28 13:56:55 PST 2008


On Nov 28, 2008  11:14 +0200, Alex wrote:
> On Thursday 27 November 2008 20:30, Andreas Dilger wrote:
> > The problem is with the 32-bit kernel.  Linux doesn't allow a 32-bit
> > kernel to use more than 900MB of memory on a 32-bit system, no matter
> > how much RAM is installed.  900MB/8192MB ~= 10% of RAM.  
> 
> I want to clarify this, because I don't understand why you are saying that
> at most 900MB of our RAM can be used! Afaik, on a 32-bit system, we have the
> following limits:
> - max 4GiB RAM using a kernel without PAE (Physical Address Extension)
> - max 64GiB RAM using a kernel with PAE (extends the physical address size
>   from 32 to 36 bits)

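This is a Linux kernel limitation.  The 32-bit virtual address space is
split into 3GB for user space applications and 1GB for the kernel.  The
kernel can directly address only the RAM that fits into its 1GB (the
"Normal" memory, about 900MB in practice); everything above that is
"High" memory, which is mainly useful to user space applications.  As a
result, the Lustre OST threads (which run in the kernel) can use at most
about 1GB of RAM on a 32-bit system, no matter how much is installed.
Even filesystems like NFS or ext3 can cache only about 1GB of metadata.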
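
One way to see this split on a running 32-bit node is to compare the
LowTotal and HighTotal fields that a highmem-enabled kernel reports in
/proc/meminfo.  The following is only a rough sketch: the helper names
(parse_meminfo, low_memory_fraction) are hypothetical, and the SAMPLE
values are made up to match an 8GB machine with 896MB of low memory,
not taken from any real system.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def low_memory_fraction(fields):
    """Return LowTotal / MemTotal, or None if the kernel does not report it.

    LowTotal is only present on kernels built with highmem support
    (typically 32-bit); 64-bit kernels do not have this split.
    """
    if "LowTotal" not in fields or not fields.get("MemTotal"):
        return None
    return fields["LowTotal"] / fields["MemTotal"]


# Illustrative values only (assumption): 8192MB total, 896MB low memory.
SAMPLE = """MemTotal:        8388608 kB
HighTotal:       7471104 kB
LowTotal:         917504 kB
"""

fraction = low_memory_fraction(parse_meminfo(SAMPLE))
print("kernel-addressable share of RAM: %.1f%%" % (fraction * 100))
```

On a real node the same functions can be fed the contents of
/proc/meminfo instead of SAMPLE.  With the sample numbers the share
comes out at roughly 10%, the same order as the 900MB/8192MB figure
quoted above.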

There is no reason to use a 32-bit OSS node for systems that need to
have good performance these days.  Even the least expensive x86 CPU
is 64-bit.

> > Swap is not useful for the kernel.
> 
> Why?

Because that just isn't the way the Linux kernel works.  It is not
possible to swap memory allocated by the kernel.

Even if the Linux kernel allowed swapping kernel memory to disk, it
would be a foolish thing to do.  Lustre IO data that could be going at
1GB/s to a fast storage system might first be swapped out to a slow
single disk at 40MB/s (at best!), then read back (< 40MB/s), and only
then written to the fast storage.
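
To put a rough number on that detour, here is a back-of-the-envelope
sketch.  It assumes the three steps (swap out, swap in, final write)
happen strictly one after another with no overlap, and it treats 1GB/s
as 1000MB/s; the function name effective_throughput is hypothetical,
and real behaviour would depend on the workload.

```python
def effective_throughput(stage_rates_mb_s):
    """Effective MB/s when every byte passes through each stage in turn.

    With no overlap between stages, the time to move one MB is the sum
    of the per-stage times, so the combined rate is 1 / sum(1 / rate).
    """
    seconds_per_mb = sum(1.0 / rate for rate in stage_rates_mb_s)
    return 1.0 / seconds_per_mb


# Direct path: data goes straight to the fast storage.
direct = effective_throughput([1000.0])

# Swap detour: out to the swap disk, back in, then to the fast storage.
detour = effective_throughput([40.0, 40.0, 1000.0])

print("direct: %.0f MB/s, via swap: %.1f MB/s" % (direct, detour))
```

Under these assumptions the swap detour ends up at a little under
20MB/s, about a fiftieth of what the storage could sustain directly.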

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



