[Lustre-discuss] high latency and hanging reads

robert spam.robert at risefx.com
Tue Mar 30 11:14:55 PDT 2010


Hi,

We are running a small Lustre test system with 3 OSSs, 9 OSTs and a
combined MDS & client. All systems are running CentOS 5.2 and Lustre
1.8.1.1 on a DDR IB network. All machines are single quad-core Xeons with
2GB (OSS) or 8GB (MDS) of RAM. The OSTs are 8x1.5TB SATA disks in a
RAID 5 configuration. The MDS uses an Intel SSD as its MDT.

After the first tests in the production environment, we are experiencing
two problems:

1. All read access shows high latency, even under low load. Browsing
directories with several hundred files of 1 to 10MB each takes up to
20 seconds. Running rsync on a directory from the Lustre mount processes
as few as 100 files per second. I know that handling a huge number of
files is not a typical strength of Lustre, but this feels a bit extreme.
However, once reading, the system delivers several hundred MB/s if the
files are big enough (a few GB).

2. Copying or rsyncing between a local directory and the Lustre mount
(in either direction) sometimes hangs for a short time and sometimes
hangs until stopped manually.
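
To put a number on the listing latency in point 1 independently of rsync
or a file browser, something like the following could be run against a
directory on the Lustre mount and against a local directory for
comparison. This is only a minimal sketch, assuming Python 3 is available
on the client; the function name time_stats is hypothetical and not part
of any Lustre tooling.

```python
import os
import time


def time_stats(path):
    """Stat every entry directly under `path`.

    Returns (number of entries, elapsed seconds). Hypothetical helper,
    meant only to expose per-file metadata latency.
    """
    start = time.perf_counter()
    count = 0
    for name in os.listdir(path):
        # One lstat() per entry, roughly what "ls -l" or rsync has to do.
        os.lstat(os.path.join(path, name))
        count += 1
    elapsed = time.perf_counter() - start
    return count, elapsed
```

Calling time_stats() on the directory in question and dividing the count
by the elapsed time gives a stats-per-second figure that can be compared
with the roughly 100 files per second seen with rsync.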

The client also exports the Lustre mount via Samba for our Windows
clients. Actually, this will be the main usage for a production system.
The messages on failed writes on the Windows side show either "invalid
parameter" or "permission denied".

On the MDS/client, dmesg shows something like this:
-----------------------
BUG: warning at fs/inotify.c:202/set_dentry_child_flags() (Tainted: G     )

Call Trace:
 [<ffffffff800f1788>] set_dentry_child_flags+0xef/0x14d
 [<ffffffff800f181e>] remove_watch_no_event+0x38/0x47
 [<ffffffff800f1845>] inotify_remove_watch_locked+0x18/0x3b
 [<ffffffff800f1980>] inotify_rm_wd+0x8d/0xb6
 [<ffffffff800f1ef6>] sys_inotify_rm_watch+0x46/0x63
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0
-----------------------
Does anyone have an idea?

Thank you!

Robert




