[Lustre-discuss] Unable to get lmt to work properly
Peter Kjellstrom
cap at nsc.liu.se
Fri Jan 15 05:41:53 PST 2010
I first posted this on lmt-discuss, but I'm not sure that list is still alive.
Hopefully someone here can shed some light on it.
The setup is fairly simple: 2 OSTs per OSS, running Lustre 1.6.7.1 (more
details below).
Tia,
Peter
--------------------------------------------------------------
I have two problems:
1) I only see one (out of two) OSTs per OSS when I run "cerebro-stat -m lmt_ost":
[root@oss44 ~]# cerebro-stat -m lmt_ost | grep oss44
oss44: 1.0;oss44;test1-OST0001;303628200;303628288;4779243244;4781815224;2147483648;2147483648
[root@oss44 ~]# ls /proc/fs/lustre/obdfilter/
num_refs  test1-OST0000  test1-OST0001
[root@oss44 ~]# df -t lustre
Filesystem           1K-blocks      Used  Available Use% Mounted on
/dev/oss44_vg/ost1  4781819320   3620560 4729618276   1% /lustre/ost1
/dev/oss44_vg/ost2  4781815224   2571980 4730662800   1% /lustre/ost2
[root@oss44 ~]# uname -r
2.6.18-92.1.17.el5_lustre.1.6.7.1smp
[root@oss44 ~]# rpm -qa | grep lmt
lmt-client-2.6.3-1.ch4.2
lmt-server-agent-2.6.3-1.ch4.2
lmt-server-2.6.3-1.ch4.2
[root@oss44 ~]# rpm -qa | grep cerebro
cerebro-1.9-1
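
The mismatch above (two entries under obdfilter, one in the cerebro-stat output) can be checked mechanically. Below is a minimal Python sketch, not part of LMT or cerebro. The field names in OstMetric are guesses inferred from the values in this post (the fourth field equals INODES_FREE in OST_DATA, and the seventh equals the 1K-blocks figure that df reports for ost2); the last two are assumed to be read/write byte counters. The helper names are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class OstMetric(NamedTuple):
    # Field names are assumptions inferred from the sample values,
    # not taken from LMT documentation.
    version: str
    host: str
    ost: str
    inodes_free: int
    inodes_total: int
    kbytes_free: int
    kbytes_total: int
    read_bytes: int   # assumed meaning
    write_bytes: int  # assumed meaning


def parse_lmt_ost(line: str) -> OstMetric:
    """Parse one 'node: v;host;ost;...' line as printed by cerebro-stat."""
    # Drop the "node: " prefix that precedes the metric payload.
    _, _, payload = line.partition(": ")
    fields = payload.strip().split(";")
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}: {line!r}")
    return OstMetric(fields[0], fields[1], fields[2], *map(int, fields[3:]))


def missing_osts(stat_lines: Iterable[str],
                 obdfilter_entries: Iterable[str]) -> List[str]:
    """Return obdfilter entries that no cerebro-stat line reports."""
    reported = {parse_lmt_ost(line).ost for line in stat_lines}
    # num_refs is a plain file under obdfilter, not an OST.
    local = {name for name in obdfilter_entries if name != "num_refs"}
    return sorted(local - reported)


if __name__ == "__main__":
    stat = ["oss44: 1.0;oss44;test1-OST0001;303628200;303628288;"
            "4779243244;4781815224;2147483648;2147483648"]
    print(missing_osts(stat, ["num_refs", "test1-OST0000", "test1-OST0001"]))
```

With the data from this post, the sketch reports test1-OST0000 as the OST that never shows up in the metric.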
2) Something is probably wrong with the database, but I can't tell what
(originally no OST_INFO was generated).
Symptoms include "empty" ltop output: ltop -X shows all OSSes, but only the
CPU usage is correct; every other field shows "***".
The cron script also writes log files like this:
#####################
# OST - test1
#####################
Updating hourly ost agg table for test1
Updating OST_AGGREGATE_HOUR for test1...
[LMT::connect] - Unknown filesystem: test1
Updating other ost agg tables for test1
[LMT::connect] - Unknown filesystem: test1
Updating filesys-level ost tables for test1
[LMT::connect] - Unknown filesystem: test1
#####################
# ROUTER - test1
#####################
Updating hourly router agg table for test1
Updating ROUTER_AGGREGATE_HOUR for test1...
[LMT::connect] - Unknown filesystem: test1
Updating other router agg tables for test1
[LMT::connect] - Unknown filesystem: test1
#####################
# MDS - test1
#####################
Updating hourly mds agg table for test1
Updating MDS_AGGREGATE_HOUR for test1...
[LMT::connect] - Unknown filesystem: test1
Updating other mds agg tables for test1
[LMT::connect] - Unknown filesystem: test1
*** Aggregate Table Update Complete ***
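
Every step in the log above fails with the same [LMT::connect] message, which suggests the failure happens when the filesystem is looked up rather than in any one aggregate table. A small Python sketch that tallies these errors per section is shown below; the regular expressions are written against the log excerpt in this post only, and summarize_errors is a hypothetical helper name.

```python
import re
from collections import Counter

# Patterns match the excerpt above; other LMT log formats are not covered.
SECTION = re.compile(r"^# (\w+) - (\S+)$")
ERROR = re.compile(r"^\[LMT::connect\] - (.+)$")


def summarize_errors(log_text: str) -> Counter:
    """Count [LMT::connect] errors per (section, message) pair."""
    counts: Counter = Counter()
    section = None
    for raw in log_text.splitlines():
        line = raw.strip()
        header = SECTION.match(line)
        if header:
            section = header.group(1)  # e.g. OST, ROUTER, MDS
            continue
        error = ERROR.match(line)
        if error:
            counts[(section, error.group(1))] += 1
    return counts


if __name__ == "__main__":
    sample = """\
#####################
# OST - test1
#####################
Updating hourly ost agg table for test1
[LMT::connect] - Unknown filesystem: test1
"""
    for (section, message), n in summarize_errors(sample).items():
        print(section, message, n)
```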
Here's the database config (I think it includes the important bits).
mysql> show tables;
+----------------------------+
| Tables_in_filesystem_test1 |
+----------------------------+
| EVENT_DATA |
| EVENT_INFO |
| FILESYSTEM_AGGREGATE_DAY |
| FILESYSTEM_AGGREGATE_HOUR |
| FILESYSTEM_AGGREGATE_MONTH |
| FILESYSTEM_AGGREGATE_WEEK |
| FILESYSTEM_AGGREGATE_YEAR |
| FILESYSTEM_INFO |
| MDS_AGGREGATE_DAY |
| MDS_AGGREGATE_HOUR |
| MDS_AGGREGATE_MONTH |
| MDS_AGGREGATE_WEEK |
| MDS_AGGREGATE_YEAR |
| MDS_DATA |
| MDS_INFO |
| MDS_OPS_DATA |
| MDS_VARIABLE_INFO |
| OPERATION_INFO |
| OSS_DATA |
| OSS_INFO |
| OSS_INTERFACE_DATA |
| OSS_INTERFACE_INFO |
| OSS_VARIABLE_INFO |
| OST_AGGREGATE_DAY |
| OST_AGGREGATE_HOUR |
| OST_AGGREGATE_MONTH |
| OST_AGGREGATE_WEEK |
| OST_AGGREGATE_YEAR |
| OST_DATA |
| OST_INFO |
| OST_OPS_DATA |
| OST_VARIABLE_INFO |
| ROUTER_AGGREGATE_DAY |
| ROUTER_AGGREGATE_HOUR |
| ROUTER_AGGREGATE_MONTH |
| ROUTER_AGGREGATE_WEEK |
| ROUTER_AGGREGATE_YEAR |
| ROUTER_DATA |
| ROUTER_INFO |
| ROUTER_VARIABLE_INFO |
| TIMESTAMP_INFO |
| VERSION |
+----------------------------+
42 rows in set (0.00 sec)
mysql> select * from FILESYSTEM_INFO;
+---------------+-----------------+-----------------------+----------------+
| FILESYSTEM_ID | FILESYSTEM_NAME | FILESYSTEM_MOUNT_NAME | SCHEMA_VERSION |
+---------------+-----------------+-----------------------+----------------+
|             1 | test1           |                       | 1.1            |
+---------------+-----------------+-----------------------+----------------+
1 row in set (0.00 sec)
mysql> select * from MDS_INFO;
+--------+---------------+---------------+----------+-------------+
| MDS_ID | FILESYSTEM_ID | MDS_NAME | HOSTNAME | DEVICE_NAME |
+--------+---------------+---------------+----------+-------------+
|      1 |             1 | test1-MDT0000 | mds4     |             |
+--------+---------------+---------------+----------+-------------+
1 row in set (0.00 sec)
mysql> select * from OSS_INFO where HOSTNAME like "oss4%" LIMIT 4;
+--------+---------------+----------+--------------+
| OSS_ID | FILESYSTEM_ID | HOSTNAME | FAILOVERHOST |
+--------+---------------+----------+--------------+
|     18 |             1 | oss40    | NULL         |
|     19 |             1 | oss41    | NULL         |
|     20 |             1 | oss42    | NULL         |
|     21 |             1 | oss43    | NULL         |
+--------+---------------+----------+--------------+
4 rows in set (0.00 sec)
mysql> select * from OST_INFO where HOSTNAME like "oss44" LIMIT 4;
+--------+--------+---------------+----------+---------+-------------+
| OST_ID | OSS_ID | OST_NAME | HOSTNAME | OFFLINE | DEVICE_NAME |
+--------+--------+---------------+----------+---------+-------------+
|     43 |     22 | test1-OST0000 | oss44    | NULL    | NULL        |
|     44 |     22 | test1-OST0001 | oss44    | NULL    | NULL        |
+--------+--------+---------------+----------+---------+-------------+
2 rows in set (0.01 sec)
mysql> select * from OST_DATA where OST_ID like "43" or OST_ID like "44" LIMIT 4;
+--------+-------+------------+-------------+---------+-------------+-------------+-------------+-------------+
| OST_ID | TS_ID | READ_BYTES | WRITE_BYTES | PCT_CPU | KBYTES_FREE | KBYTES_USED | INODES_FREE | INODES_USED |
+--------+-------+------------+-------------+---------+-------------+-------------+-------------+-------------+
|     44 | 51610 |          0 |           0 | NULL    |  4781340404 |      474820 |   303628200 |          88 |
|     43 | 51610 |          0 |           0 | NULL    |  4781344500 |      474820 |   303628200 |          88 |
|     44 | 51611 |          0 |           0 | NULL    |  4781340404 |      474820 |   303628200 |          88 |
|     43 | 51611 |          0 |           0 | NULL    |  4781344500 |      474820 |   303628200 |          88 |
+--------+-------+------------+-------------+---------+-------------+-------------+-------------+-------------+
4 rows in set (0.00 sec)
mysql> select * from MDS_DATA LIMIT 4;
+--------+-------+----------+-------------+-------------+-------------+-------------+
| MDS_ID | TS_ID | PCT_CPU | KBYTES_FREE | KBYTES_USED | INODES_FREE | INODES_USED |
+--------+-------+----------+-------------+-------------+-------------+-------------+
|      1 |     2 |  0.24975 |    71211604 |      463084 |    17802901 |          86 |
|      1 |     3 |  0.29985 |    71211604 |      463084 |    17802901 |          86 |
|      1 |     4 |  0.29985 |    71211604 |      463084 |    17802901 |          86 |
|      1 |     5 | 0.549176 |    71211604 |      463084 |    17802901 |          86 |
+--------+-------+----------+-------------+-------------+-------------+-------------+
4 rows in set (0.02 sec)
mysql> select * from OSS_DATA where OSS_ID like "22" LIMIT 4;
+--------+-------+----------+------------+
| OSS_ID | TS_ID | PCT_CPU | PCT_MEMORY |
+--------+-------+----------+------------+
|     22 |     2 |  0.62267 |    31.5018 |
|     22 |     3 |      0.1 |    31.5017 |
|     22 |     4 | 0.049975 |    31.5017 |
|     22 |     5 |     0.05 |    31.5017 |
+--------+-------+----------+------------+
4 rows in set (0.00 sec)
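
One way to confirm that both OSTs on oss44 have samples in the database is to join OST_INFO to OST_DATA and count rows per OST. The Python sketch below does this against an in-memory SQLite copy of the rows shown above; SQLite is only a stand-in so the example is self-contained, the column subset is trimmed for brevity, and the function names are hypothetical. The query sticks to plain SQL (LEFT JOIN, GROUP BY, COUNT, MAX), so it should carry over to the MySQL database with little or no change, though that has not been verified here.

```python
import sqlite3


def build_sample_db() -> sqlite3.Connection:
    """Create an in-memory database holding the rows shown in this post."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE OST_INFO (OST_ID INTEGER PRIMARY KEY, OSS_ID INTEGER,
                               OST_NAME TEXT, HOSTNAME TEXT);
        CREATE TABLE OST_DATA (OST_ID INTEGER, TS_ID INTEGER,
                               KBYTES_FREE INTEGER, KBYTES_USED INTEGER);
    """)
    conn.executemany("INSERT INTO OST_INFO VALUES (?, ?, ?, ?)", [
        (43, 22, "test1-OST0000", "oss44"),
        (44, 22, "test1-OST0001", "oss44"),
    ])
    conn.executemany("INSERT INTO OST_DATA VALUES (?, ?, ?, ?)", [
        (44, 51610, 4781340404, 474820),
        (43, 51610, 4781344500, 474820),
        (44, 51611, 4781340404, 474820),
        (43, 51611, 4781344500, 474820),
    ])
    return conn


def samples_per_ost(conn: sqlite3.Connection, hostname: str) -> list:
    """Return (OST_NAME, sample count, newest TS_ID) for each OST on a host."""
    # LEFT JOIN keeps OSTs that have no OST_DATA rows (count 0, TS_ID NULL).
    return conn.execute("""
        SELECT i.OST_NAME, COUNT(d.TS_ID), MAX(d.TS_ID)
        FROM OST_INFO AS i
        LEFT JOIN OST_DATA AS d ON d.OST_ID = i.OST_ID
        WHERE i.HOSTNAME = ?
        GROUP BY i.OST_ID, i.OST_NAME
        ORDER BY i.OST_NAME
    """, (hostname,)).fetchall()


if __name__ == "__main__":
    for row in samples_per_ost(build_sample_db(), "oss44"):
        print(row)
```

On the four sample rows, both test1-OST0000 and test1-OST0001 come back with two samples each, so the database side appears to receive data for both OSTs even though cerebro-stat lists only one.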
Any help/comments/pointers appreciated, tia,
Peter