[Lustre-discuss] Need help on lustre filesystem setup..

linux freaker linuxfreaker at gmail.com
Mon Mar 18 00:52:28 PDT 2013


Hi,

I am trying to run Apache Hadoop on a parallel filesystem, Lustre.
I have 1 MDS, 2 OSS/OST nodes and 1 Lustre client.

My lustre client shows:
Code:
[root@lustreclient1 ~]# lfs df -h
UUID                       bytes        Used   Available Use% Mounted on
lustre-MDT0000_UUID         4.5G      274.3M        3.9G   6% /mnt/lustre[MDT:0]
lustre-OST0000_UUID         5.9G      276.1M        5.3G   5% /mnt/lustre[OST:0]
lustre-OST0001_UUID         5.9G      276.1M        5.3G   5% /mnt/lustre[OST:1]
lustre-OST0002_UUID         5.9G      276.1M        5.3G   5% /mnt/lustre[OST:2]
lustre-OST0003_UUID         5.9G      276.1M        5.3G   5% /mnt/lustre[OST:3]
lustre-OST0004_UUID         5.9G      276.1M        5.3G   5% /mnt/lustre[OST:4]
lustre-OST0005_UUID         5.9G      276.1M        5.3G   5% /mnt/lustre[OST:5]
lustre-OST0006_UUID         5.9G      276.1M        5.3G   5% /mnt/lustre[OST:6]
lustre-OST0007_UUID         5.9G      276.1M        5.3G   5% /mnt/lustre[OST:7]
lustre-OST0008_UUID         5.9G      276.1M        5.3G   5% /mnt/lustre[OST:8]
lustre-OST0009_UUID         5.9G      276.1M        5.3G   5% /mnt/lustre[OST:9]
lustre-OST000a_UUID         5.9G      276.1M        5.3G   5% /mnt/lustre[OST:10]
lustre-OST000b_UUID         5.9G      276.1M        5.3G   5% /mnt/lustre[OST:11]

filesystem summary:        70.9G        3.2G       64.0G   5% /mnt/lustre
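
The summary line in the output above can be cross-checked by summing the
OST rows: 12 x 276.1M used is 3313.2M, roughly 3.2G, which agrees with
the summary. A minimal Python sketch of that check follows. The helper
names to_bytes and sum_osts are made up for illustration, the size
suffixes are assumed to be powers of 1024, and the sample holds only two
of the twelve OST rows.

```python
import re

# Assumption: "lfs df -h" suffixes are binary multiples (1K = 1024 bytes).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

SAMPLE = """\
UUID                       bytes        Used   Available Use% Mounted on
lustre-MDT0000_UUID         4.5G      274.3M        3.9G   6% /mnt/lustre[MDT:0]
lustre-OST0000_UUID         5.9G      276.1M        5.3G   5% /mnt/lustre[OST:0]
lustre-OST000a_UUID         5.9G      276.1M        5.3G   5%
/mnt/lustre[OST:10]
"""


def to_bytes(text):
    """Convert a human-readable size such as '276.1M' to bytes."""
    return float(text[:-1]) * UNITS[text[-1]]


def sum_osts(lfs_df_output):
    """Return (number of OST rows, [total bytes, used, available])."""
    totals = [0.0, 0.0, 0.0]
    count = 0
    for line in lfs_df_output.splitlines():
        fields = line.split()
        # OST rows start with a UUID such as lustre-OST000a_UUID; the MDT
        # row, the header and wrapped mount-point lines are skipped.
        if len(fields) >= 4 and re.match(r".+-OST[0-9a-f]{4}_UUID$", fields[0]):
            for i in range(3):
                totals[i] += to_bytes(fields[i + 1])
            count += 1
    return count, totals


if __name__ == "__main__":
    count, totals = sum_osts(SAMPLE)
    print(count, "OSTs, used MiB:", round(totals[1] / UNITS["M"], 1))
```

The totals will only match the summary approximately, because every
column in the -h output is already rounded.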

As I was unsure which machine I should install the Hadoop software on,
I decided to go ahead and install Hadoop on LustreClient1.

I configured LustreClient1 with JAVA_HOME and the Hadoop parameters,
using the following entries:
File: conf/core-site.xml
Code:
<property>
<name>fs.default.name</name>
<value>file:///mnt/lustre</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>${fs.default.name}/hadoop_tmp/mapred/system</value>
<description>The shared directory where MapReduce stores control
files.
</description>
</property>
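
To see what mapred.system.dir ends up as, here is a small Python sketch
that mimics, in a simplified way, how Hadoop substitutes ${...}
references to other properties. It is not Hadoop's own code: the helper
names load_properties and expand are hypothetical, the <configuration>
wrapper is added so the fragment parses as XML, and lookups in Java
system properties are left out.

```python
import re
import xml.etree.ElementTree as ET

# The two properties from conf/core-site.xml, wrapped in <configuration>
# (the wrapper is an assumption; the fragment above does not show it).
CORE_SITE = """<configuration>
<property>
<name>fs.default.name</name>
<value>file:///mnt/lustre</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>${fs.default.name}/hadoop_tmp/mapred/system</value>
</property>
</configuration>"""

VAR = re.compile(r"\$\{([^}]+)\}")


def load_properties(xml_text):
    """Return a dict mapping property names to their raw values."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value")
            for p in root.iter("property")}


def expand(value, props, max_depth=20):
    """Replace ${name} references with values from props.

    Stops at the first reference that is not defined in props and
    gives up after max_depth substitutions to avoid endless loops.
    """
    for _ in range(max_depth):
        match = VAR.search(value)
        if match is None or match.group(1) not in props:
            return value
        value = (value[:match.start()] + props[match.group(1)]
                 + value[match.end():])
    return value


if __name__ == "__main__":
    props = load_properties(CORE_SITE)
    print(expand(props["mapred.system.dir"], props))
```

With the values above, the expanded path is
file:///mnt/lustre/hadoop_tmp/mapred/system, that is, a directory on the
Lustre mount.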

I didn't make any changes to mapred-site.xml.

Now when I run 'bin/start-mapred.sh', it tries to ssh to my own
local machine. I am not sure if I am doing this right.

Doubt> Do I need to have two Lustre clients for this to work?

Then I tried running the wordcount program as shown below:

Code:
 bin/hadoop jar hadoop-examples-1.1.1.jar wordcount /tmp/rahul /tmp/rahul/rahul-output

ied 0 time(s); retry policy is
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1
SECONDS)
13/03/14 18:12:29 INFO ipc.Client: Retrying connect to server:
10.94.214.188/10.94.214.188:54311. Already tried 1 time(s); retry
policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
13/03/14 18:12:30 INFO ipc.Client: Retrying connect to server:
10.94.214.188/10.94.214.188:54311. Already tried 2 time(s); retry
policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
13/03/14 18:12:31 INFO ipc.Client: Retrying connect to server:
10.94.214.188/10.94.214.188:54311. Already tried 3 time(s); retry
policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
13/03/14 18:12:32 INFO ipc.Client: Retrying connect to server:
10.94.214.188/10.94.214.188:54311. Already tried 4 time(s); retry
policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
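
The log above shows the client repeatedly failing to connect to
10.94.214.188:54311. One quick way to tell whether anything is listening
on that address is a plain TCP connect, sketched here in Python. The
helper name port_open is made up for illustration, and the sketch only
tests reachability; it says nothing about which Hadoop daemon, if any,
owns the port.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Address and port taken from the retry messages in the log.
    print(port_open("10.94.214.188", 54311))
```

A False result would be consistent with the retries in the log: no
process is accepting connections on that port, or something in between
is blocking it.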

Question 1: I have been comparing HDFS and Lustre for Hadoop. What
would be the right number of hardware nodes to compare? Say I have 1
MDS, 2 OSS and 1 Lustre client on one side, and 1 NameNode and 3
DataNodes on the other. How can I compare the two filesystems?

Question 2: Do I really need 2 Lustre clients to set up Hadoop over
Lustre? If it is possible, how can I use the OSS and MDS nodes for the
Hadoop setup too?

Question 3: From what I read about the wordcount example, the input
data has to be copied into the HDFS filesystem first. Do we need to do
the same for Lustre? If so, what is the command?

Question 4: What are the steps to confirm that Hadoop is actually
using the Lustre filesystem?


