[Lustre-discuss] Lustre on an Altix4700

Thu Jul 24 23:00:58 PDT 2008

On Jul 24, 2008  11:52 -0600, Craig Tierney wrote:
> Is anyone running, or does anyone know of someone
> running Lustre on an Altix 4700 (or other large
> Itanium SMP system)?  I was wondering if there
> are any quirks to getting very large aggregate
> performance to a single node (1024+ cores).

I believe there were some patches added to CVS (not sure if they are in
1.6.5 or not) that addressed allocation problems with per-CPU data
structures that were hit on a 128-node system.

There are also patches in bug 11817 that are addressing issues in
many-core SMP clients, but there is likely still work to be done in
this area.

What kind of network do you have on such a system?  Do all of the
cores have equal access to the external network?  If not, it would
be good to e.g. bind the ptlrpcd thread to one of the IO nodes for
better performance.

There hasn't been any effort yet to e.g. have multiple ptlrpcd threads
(1 per IO node) to handle RPC requests from a thousand other cores.
If that became a bottleneck I suspect it wouldn't be too hard to bind
multiple ptlrpcd threads to multiple IO nodes, each with its own
ptlrpcd_pc list.  ptlrpc_add_set() could then get some kind of smarts
about locality to decide which ptlrpcd_pc to add the outgoing request to.
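
To make the locality idea a bit more concrete, here is a rough standalone
sketch of what the selection step might look like.  None of this is
existing Lustre code: struct pc_sketch, node_distance_sketch() and
pick_pc_sketch() are hypothetical names made up for illustration, and the
linear "distance" between node numbers is only a placeholder for whatever
topology information the platform really provides.

```c
#include <stddef.h>

/* Hypothetical stand-in for one per-IO-node ptlrpcd control block.
 * This is NOT the real Lustre ptlrpcd_pc structure. */
struct pc_sketch {
    int io_node;   /* node the ptlrpcd thread would be bound to */
    int queued;    /* requests currently queued (illustration only) */
};

/* Placeholder distance: assumes nodes are numbered so that nearby
 * numbers are nearby in the machine.  Real code would need to consult
 * the platform topology instead. */
static int node_distance_sketch(int a, int b)
{
    return a > b ? a - b : b - a;
}

/* Return the index of the entry whose IO node is closest to req_node,
 * preferring the less loaded entry on a tie.  Returns -1 if n == 0. */
static int pick_pc_sketch(const struct pc_sketch *pcs, size_t n, int req_node)
{
    int best = -1;
    int best_dist = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        int d = node_distance_sketch(pcs[i].io_node, req_node);

        if (best < 0 || d < best_dist ||
            (d == best_dist && pcs[i].queued < pcs[best].queued)) {
            best = (int)i;
            best_dist = d;
        }
    }
    return best;
}
```

In this sketch the caller would pass in the node of the CPU submitting the
request and queue the request on the entry that comes back.  Breaking ties
on queue length is just one possible policy, picked to keep the example
short.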

There have been tests in the past to get 2GB/s+ from clients with
good networks and 32 IA64 CPUs, but depending on what kind of throughput
you are looking at there may still be a bunch of work to be done.

We'd be very interested to get feedback about any issues you hit on
such a large system, because we don't get much chance to test on a
single system with so many cores.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.