[Lustre-discuss] Unable to get lmt to work properly

Peter Kjellstrom cap at nsc.liu.se
Fri Jan 15 05:41:53 PST 2010


I first posted this on lmt-discuss but I'm not sure it's alive. Hopefully
someone here can shed some light on it.

The setup is fairly simple: two OSTs per OSS, running Lustre 1.6.7.1 (more
details below).

Tia,
 Peter

--------------------------------------------------------------

I have two problems:

1) I only see one (out of two) OSTs per OSS when I run "cerebro-stat -m lmt_ost":

 [root@oss44 ~]# cerebro-stat -m lmt_ost | grep oss44
 oss44: 1.0;oss44;test1-OST0001;303628200;303628288;4779243244;4781815224;2147483648;2147483648

 [root@oss44 ~]# ls /proc/fs/lustre/obdfilter/
 num_refs  test1-OST0000  test1-OST0001

 [root@oss44 ~]# df -t lustre
 Filesystem           1K-blocks      Used Available Use% Mounted on
 /dev/oss44_vg/ost1   4781819320   3620560 4729618276   1% /lustre/ost1
 /dev/oss44_vg/ost2   4781815224   2571980 4730662800   1% /lustre/ost2

 [root@oss44 ~]# uname -r
 2.6.18-92.1.17.el5_lustre.1.6.7.1smp

 [root@oss44 ~]# rpm -qa | grep lmt
 lmt-client-2.6.3-1.ch4.2
 lmt-server-agent-2.6.3-1.ch4.2
 lmt-server-2.6.3-1.ch4.2
 [root@oss44 ~]# rpm -qa | grep cerebro
 cerebro-1.9-1
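
To check the same thing on every OSS rather than one host at a time, the
cerebro-stat output can be compared against the OSTs each host is expected to
serve. The following is only a rough sketch: it assumes each metric line has
the form "node: version;hostname;ostname;..." (inferred from the single line
above, not from any LMT documentation), and the function names and the
expected-OST mapping are made up for illustration.

```python
from collections import defaultdict

# One line of `cerebro-stat -m lmt_ost` output, copied from the oss44 session.
SAMPLE = (
    "oss44: 1.0;oss44;test1-OST0001;303628200;303628288;"
    "4779243244;4781815224;2147483648;2147483648"
)


def reported_osts(cerebro_output):
    """Map each hostname to the set of OST names seen in the output.

    Assumption: the second and third ';'-separated fields of the metric
    value are the hostname and the OST name.
    """
    seen = defaultdict(set)
    for line in cerebro_output.splitlines():
        if ": " not in line:
            continue  # skip anything that is not "node: value"
        _node, value = line.split(": ", 1)
        fields = value.split(";")
        if len(fields) < 3:
            continue  # too short to carry a hostname and an OST name
        seen[fields[1]].add(fields[2])
    return dict(seen)


def missing_osts(expected, cerebro_output):
    """Return {host: [OST names expected but not reported]}."""
    seen = reported_osts(cerebro_output)
    missing = {}
    for host, osts in expected.items():
        absent = sorted(set(osts) - seen.get(host, set()))
        if absent:
            missing[host] = absent
    return missing


if __name__ == "__main__":
    # Hypothetical expectation, taken from /proc/fs/lustre/obdfilter on oss44.
    expected = {"oss44": ["test1-OST0000", "test1-OST0001"]}
    print(missing_osts(expected, SAMPLE))
```

With the line shown above, this reports test1-OST0000 as missing for oss44.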

2) Something is probably wrong with the database, but I can't tell what
(originally no OST_INFO was generated):

Symptoms include "empty" ltop output (ltop -X shows all OSSes, but only the
CPU usage is correct; every other field shows "***").

The cron script also writes log files like this:
#####################
#    OST - test1
#####################

Updating hourly ost agg table for test1
Updating OST_AGGREGATE_HOUR for test1...
[LMT::connect] - Unknown filesystem: test1
Updating other ost agg tables for test1
[LMT::connect] - Unknown filesystem: test1
Updating filesys-level ost tables for test1
[LMT::connect] - Unknown filesystem: test1
#####################
#   ROUTER - test1
#####################

Updating hourly router agg table for test1
Updating ROUTER_AGGREGATE_HOUR for test1...
[LMT::connect] - Unknown filesystem: test1
Updating other router agg tables for test1
[LMT::connect] - Unknown filesystem: test1
#####################
#    MDS - test1
#####################

Updating hourly mds agg table for test1
Updating MDS_AGGREGATE_HOUR for test1...
[LMT::connect] - Unknown filesystem: test1
Updating other mds agg tables for test1
[LMT::connect] - Unknown filesystem: test1

*** Aggregate Table Update Complete ***

Here's the database config (I think it includes the important bits).

mysql> show tables;
+----------------------------+
| Tables_in_filesystem_test1 |
+----------------------------+
| EVENT_DATA                 |
| EVENT_INFO                 |
| FILESYSTEM_AGGREGATE_DAY   |
| FILESYSTEM_AGGREGATE_HOUR  |
| FILESYSTEM_AGGREGATE_MONTH |
| FILESYSTEM_AGGREGATE_WEEK  |
| FILESYSTEM_AGGREGATE_YEAR  |
| FILESYSTEM_INFO            |
| MDS_AGGREGATE_DAY          |
| MDS_AGGREGATE_HOUR         |
| MDS_AGGREGATE_MONTH        |
| MDS_AGGREGATE_WEEK         |
| MDS_AGGREGATE_YEAR         |
| MDS_DATA                   |
| MDS_INFO                   |
| MDS_OPS_DATA               |
| MDS_VARIABLE_INFO          |
| OPERATION_INFO             |
| OSS_DATA                   |
| OSS_INFO                   |
| OSS_INTERFACE_DATA         |
| OSS_INTERFACE_INFO         |
| OSS_VARIABLE_INFO          |
| OST_AGGREGATE_DAY          |
| OST_AGGREGATE_HOUR         |
| OST_AGGREGATE_MONTH        |
| OST_AGGREGATE_WEEK         |
| OST_AGGREGATE_YEAR         |
| OST_DATA                   |
| OST_INFO                   |
| OST_OPS_DATA               |
| OST_VARIABLE_INFO          |
| ROUTER_AGGREGATE_DAY       |
| ROUTER_AGGREGATE_HOUR      |
| ROUTER_AGGREGATE_MONTH     |
| ROUTER_AGGREGATE_WEEK      |
| ROUTER_AGGREGATE_YEAR      |
| ROUTER_DATA                |
| ROUTER_INFO                |
| ROUTER_VARIABLE_INFO       |
| TIMESTAMP_INFO             |
| VERSION                    |
+----------------------------+
42 rows in set (0.00 sec)

mysql> select * from FILESYSTEM_INFO;
+---------------+-----------------+-----------------------+----------------+
| FILESYSTEM_ID | FILESYSTEM_NAME | FILESYSTEM_MOUNT_NAME | SCHEMA_VERSION |
+---------------+-----------------+-----------------------+----------------+
|             1 | test1           |                       | 1.1            |
+---------------+-----------------+-----------------------+----------------+
1 row in set (0.00 sec)

mysql> select * from MDS_INFO;
+--------+---------------+---------------+----------+-------------+
| MDS_ID | FILESYSTEM_ID | MDS_NAME      | HOSTNAME | DEVICE_NAME |
+--------+---------------+---------------+----------+-------------+
|      1 |             1 | test1-MDT0000 | mds4     |             |
+--------+---------------+---------------+----------+-------------+
1 row in set (0.00 sec)

mysql> select * from OSS_INFO where HOSTNAME like "oss4%" LIMIT 4;
+--------+---------------+----------+--------------+
| OSS_ID | FILESYSTEM_ID | HOSTNAME | FAILOVERHOST |
+--------+---------------+----------+--------------+
|     18 |             1 | oss40    | NULL         |
|     19 |             1 | oss41    | NULL         |
|     20 |             1 | oss42    | NULL         |
|     21 |             1 | oss43    | NULL         |
+--------+---------------+----------+--------------+
4 rows in set (0.00 sec)

mysql> select * from OST_INFO where HOSTNAME like "oss44" LIMIT 4;
+--------+--------+---------------+----------+---------+-------------+
| OST_ID | OSS_ID | OST_NAME      | HOSTNAME | OFFLINE | DEVICE_NAME |
+--------+--------+---------------+----------+---------+-------------+
|     43 |     22 | test1-OST0000 | oss44    |    NULL | NULL        |
|     44 |     22 | test1-OST0001 | oss44    |    NULL | NULL        |
+--------+--------+---------------+----------+---------+-------------+
2 rows in set (0.01 sec)

mysql> select * from OST_DATA where OST_ID like "43" or OST_ID like "44" LIMIT 4;
+--------+-------+------------+-------------+---------+-------------+-------------+-------------+-------------+
| OST_ID | TS_ID | READ_BYTES | WRITE_BYTES | PCT_CPU | KBYTES_FREE | KBYTES_USED | INODES_FREE | INODES_USED |
+--------+-------+------------+-------------+---------+-------------+-------------+-------------+-------------+
|     44 | 51610 |          0 |           0 |    NULL |  4781340404 |      474820 |   303628200 |          88 |
|     43 | 51610 |          0 |           0 |    NULL |  4781344500 |      474820 |   303628200 |          88 |
|     44 | 51611 |          0 |           0 |    NULL |  4781340404 |      474820 |   303628200 |          88 |
|     43 | 51611 |          0 |           0 |    NULL |  4781344500 |      474820 |   303628200 |          88 |
+--------+-------+------------+-------------+---------+-------------+-------------+-------------+-------------+
4 rows in set (0.00 sec)
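
As a sanity check on these rows: KBYTES_FREE + KBYTES_USED for each OST_ID can
be compared with the 1K-blocks column from "df -t lustre" on oss44. The sketch
below does that arithmetic with the numbers copied from this mail. It assumes
free + used is the total size df reports, and the function name is made up for
illustration.

```python
# (KBYTES_FREE, KBYTES_USED) per OST_ID, from the OST_DATA rows at TS_ID 51610.
OST_DATA = {
    43: (4781344500, 474820),
    44: (4781340404, 474820),
}

# 1K-blocks per mount point, from `df -t lustre` on oss44.
DF_BLOCKS = {
    "/lustre/ost1": 4781819320,
    "/lustre/ost2": 4781815224,
}


def match_by_size(ost_data, df_blocks):
    """Return {ost_id: [mount points whose 1K-blocks equals free + used]}.

    Assumption: KBYTES_FREE + KBYTES_USED is the total size of the OST.
    """
    matches = {}
    for ost_id, (free, used) in ost_data.items():
        total = free + used
        matches[ost_id] = [m for m, blocks in df_blocks.items() if blocks == total]
    return matches


if __name__ == "__main__":
    print(match_by_size(OST_DATA, DF_BLOCKS))
```

The totals line up (OST_ID 43 with /lustre/ost1, OST_ID 44 with /lustre/ost2),
so rows for both OSTs on oss44 did reach OST_DATA at those timestamps.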

mysql> select * from MDS_DATA  LIMIT 4;
+--------+-------+----------+-------------+-------------+-------------+-------------+
| MDS_ID | TS_ID | PCT_CPU  | KBYTES_FREE | KBYTES_USED | INODES_FREE | INODES_USED |
+--------+-------+----------+-------------+-------------+-------------+-------------+
|      1 |     2 |  0.24975 |    71211604 |      463084 |    17802901 |          86 |
|      1 |     3 |  0.29985 |    71211604 |      463084 |    17802901 |          86 |
|      1 |     4 |  0.29985 |    71211604 |      463084 |    17802901 |          86 |
|      1 |     5 | 0.549176 |    71211604 |      463084 |    17802901 |          86 |
+--------+-------+----------+-------------+-------------+-------------+-------------+
4 rows in set (0.02 sec)

mysql> select * from OSS_DATA where OSS_ID like "22" LIMIT 4;
+--------+-------+----------+------------+
| OSS_ID | TS_ID | PCT_CPU  | PCT_MEMORY |
+--------+-------+----------+------------+
|     22 |     2 |  0.62267 |    31.5018 |
|     22 |     3 |      0.1 |    31.5017 |
|     22 |     4 | 0.049975 |    31.5017 |
|     22 |     5 |     0.05 |    31.5017 |
+--------+-------+----------+------------+
4 rows in set (0.00 sec)

Any help/comments/pointers appreciated, tia,
 Peter