[Lustre-discuss] page allocation failure

Andreas Dilger adilger at sun.com
Thu Nov 27 10:30:45 PST 2008


On Nov 26, 2008  19:04 +0800, Wang lu wrote:
> The %util of memory on OSS was always around 10% ,even when OSS was going to
> die. 
>  
> The OSS kernel is: 
>  2.6.9-67.0.7.EL_lustre.1.6.5smp(32bit)
> 
> Lustre version is 1.6.5.1
> 
> We have 8GB physical memory and 16GB(never been used) swap total. 
> 
> Is there a problem with memory management?

The problem is with the 32-bit kernel.  A 32-bit Linux kernel can't use
more than about 900MB of memory ("low memory") for its own allocations,
no matter how much RAM is installed.  900MB/8192MB ~= 10% of RAM, which
matches the utilization you are seeing.  Swap is not useful for the
kernel.
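
As a quick check of that ratio, here is a minimal sketch.  It assumes the
usual 896MB low-memory limit of a 32-bit x86 kernel with the default
3G/1G address split (rounded to 900MB above); the variable names are
only illustrative.

```python
# Assumption: default 3G/1G split on 32-bit x86, which leaves roughly
# 896 MB of directly mapped "low memory" for kernel allocations.
LOWMEM_MB = 896
RAM_MB = 8 * 1024  # 8 GB of physical memory, as reported for this OSS

lowmem_fraction = LOWMEM_MB / RAM_MB
print(f"kernel-usable share of RAM: {lowmem_fraction:.1%}")
```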

> Nov 20 19:40:45 boss02 kernel: Normal: 640*4kB 109*8kB 127*16kB 60*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7384kB
> Nov 20 19:40:45 boss02 kernel: HighMem: 376*4kB 1162*8kB 815*16kB 299*32kB 160*64kB 61*128kB 34*256kB 25*512kB 8*1024kB 1*2048kB 1786*4096kB = 7398656kB

As you can see, nearly all of the free memory is in "highmem" and not in
the "normal" memory region that the kernel uses for its own allocations.

> Nov 21 05:48:44 boss02 kernel: ll_ost_io_114: page allocation failure.  order:4, mode:0x50

These are "order 4" allocations (16 contiguous 4kB pages, i.e. 64kB),
which the kernel is bad at handling under memory pressure in any case.
You can see in the "Normal" zone above that there are no free chunks of
64kB or larger left to allocate.
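
To make that reading of the free lists concrete, here is a small sketch
that parses the two zone lines quoted above and checks whether an
allocation of a given order could be satisfied.  The helper names
(parse_free_lists, can_satisfy) are made up for illustration, and the
check is a simplification that assumes 4kB pages and ignores zone
watermarks.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages, as on 32-bit x86


def parse_free_lists(line):
    """Return {block_size_kB: free_block_count} from a zone free-list line."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


def can_satisfy(free, order):
    """True if at least one free block holds 2**order contiguous pages."""
    need_kb = PAGE_KB << order
    return any(count > 0 for size, count in free.items() if size >= need_kb)


normal_line = ("Normal: 640*4kB 109*8kB 127*16kB 60*32kB 0*64kB 0*128kB "
               "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7384kB")
highmem_line = ("HighMem: 376*4kB 1162*8kB 815*16kB 299*32kB 160*64kB "
                "61*128kB 34*256kB 25*512kB 8*1024kB 1*2048kB "
                "1786*4096kB = 7398656kB")

normal = parse_free_lists(normal_line)
highmem = parse_free_lists(highmem_line)

# Totals should match the "= NNNNkB" figure at the end of each log line.
normal_total_kb = sum(size * count for size, count in normal.items())
highmem_total_kb = sum(size * count for size, count in highmem.items())

print("Normal free kB:", normal_total_kb)
print("HighMem free kB:", highmem_total_kb)
print("Normal can satisfy order 4:", can_satisfy(normal, 4))
print("HighMem can satisfy order 4:", can_satisfy(highmem, 4))
```

The Normal zone still has about 7MB free, but only in chunks of 32kB or
smaller, so an order 4 request fails there even though HighMem has
plenty of large chunks.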

> Nov 20 19:40:46 boss02 kernel:  [<c02b162a>] tcp_v4_do_rcv+0x1b/0xe9
> Nov 20 19:40:46 boss02 kernel:  [<fb18fd06>] ost_handle+0xe56/0x5790 

It appears that the memory allocation failures are coming from the TCP
stack.  I suspect that you are using TCP with jumbo frames.

The easiest solution is to run a 64-bit kernel, which I suspect should
be possible given that hardly any 32-bit machines allow more than 4GB
of RAM.  Failing that, you could switch to regular ethernet frames, which
may help somewhat, but it won't let the kernel use the other 7GB of RAM
in the system.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



