[Lustre-discuss] Is it safe to run MDS, MGS & OSS on the same machine ?

Rafal Maszkowski rzm at icm.edu.pl
Wed Mar 5 01:24:16 PST 2014


On Tue, Mar 04, 2014 at 10:55:05PM +0000, Dilger, Andreas wrote:
> On 2014/03/04, 2:38 AM, "邓尧" <torshie at gmail.com> wrote:
> > We're running low on physical machines and want to deploy MGS, MDS and OSS on the same machine. Is it officially supported?
> > I know that MGS and MDS can be put on the same machine, but I'm not sure about OSS and MDS.
>
> This will work, but if the node fails then there is no recovery for operations in progress, and the clients can get an IO error for those operations.

We mostly use this mode of operation, and our experience is that after a
machine crash* the nodes and the heavy computing programs running on them
survive an outage of several hours.

R.
*The machines that crash are our aging Thumpers. We replace memory
chips, but we still do not know how to interpret ILOM messages such as:
ID =  60c : 11/28/2013 : 16:39:08 : Memory : BIOS : Uncorrectable ECC Node 7 DIMM 1
ID =  60b : 11/28/2013 : 16:39:08 : Memory : BIOS : Uncorrectable ECC Node 7 DIMM 0
Thumpers have only two nodes with four memory chips in each, so it is
unclear what "Node 7" refers to. The crashes are rare, though, so we
cannot easily test our hypotheses.
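
Since the crashes are rare, it may help to tally the events per Node/DIMM
pair over a long period. Below is a minimal sketch, assuming every relevant
line follows the colon-separated layout of the two samples above. The field
names (category, source, message) and the function names are illustrative
labels of my own, not ILOM terminology, and other ILOM event formats are
not handled.

```python
import re
from collections import Counter

# Layout inferred from the two sample lines only (an assumption).
# "category", "source" and "message" are hypothetical field labels.
ILOM_LINE = re.compile(
    r"ID\s*=\s*(?P<id>\w+)\s*:\s*"
    r"(?P<date>\d{2}/\d{2}/\d{4})\s*:\s*"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s*:\s*"
    r"(?P<category>[^:]+?)\s*:\s*"
    r"(?P<source>[^:]+?)\s*:\s*"
    r"(?P<message>.*?)\s+Node\s+(?P<node>\d+)\s+DIMM\s+(?P<dimm>\d+)\s*$"
)


def parse_ilom_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = ILOM_LINE.search(line)
    if match is None:
        return None
    record = match.groupdict()
    record["node"] = int(record["node"])
    record["dimm"] = int(record["dimm"])
    return record


def tally_dimm_errors(lines):
    """Count matching events per (node, dimm) pair."""
    counts = Counter()
    for line in lines:
        record = parse_ilom_line(line)
        if record is not None:
            counts[(record["node"], record["dimm"])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "ID =  60c : 11/28/2013 : 16:39:08 : Memory : BIOS : Uncorrectable ECC Node 7 DIMM 1",
        "ID =  60b : 11/28/2013 : 16:39:08 : Memory : BIOS : Uncorrectable ECC Node 7 DIMM 0",
    ]
    for (node, dimm), n in sorted(tally_dimm_errors(sample).items()):
        print(f"Node {node} DIMM {dimm}: {n} event(s)")
```

Feeding it the saved event log from each crash would show whether the same
Node/DIMM numbers recur, which is one way to narrow down the hypotheses
without waiting for a reproducible failure.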
-- 
"He fights against the intellect with complete abandon" - from the personnel files of Prof. A. Baumler


