[Lustre-discuss] LBUGS on metadata servers

Nirmal Seenu nirmal at fnal.gov
Tue Apr 20 13:26:07 PDT 2010


We have been seeing the following LBUGS/errors on the Lustre metadata 
servers for a while now, and they were triggered twice within the 
last week.

Apr 20 09:10:25 lustre1 kernel: LustreError: 
5973:0:(ldlm_lock.c:165:ldlm_lock_put()) 
ASSERTION(atomic_read(&(lock->l_export)->exp_refcount) < 0x5a5a5a) failed

Apr 20 09:10:26 lustre1 kernel: LustreError: 
6196:0:(client.c:178:ptlrpc_free_bulk()) 
ASSERTION(atomic_read(&(desc->bd_export)->exp_refcount) < 0x5a5a5a) failed

Apr 20 09:10:26 lustre1 kernel: LustreError: 
6174:0:(service.c:843:ptlrpc_at_send_early_reply()) 
ASSERTION(atomic_read(&(reqcopy->rq_export)->exp_refcount) < 0x5a5a5a) 
failed

Apr 20 09:10:32 lustre1 kernel: LustreError: 
6154:0:(ldlm_lib.c:812:target_handle_connect()) 
ASSERTION(atomic_read(&(export)->exp_refcount) < 0x5a5a5a) failed

Apr 20 09:14:31 lustre1 kernel: LustreError: 
5956:0:(obd_config.c:1491:nid_export_put()) 
ASSERTION(atomic_read(&(exp)->exp_refcount) < 0x5a5a5a) failed

Apr 20 09:15:57 lustre1 kernel: LustreError: 
5992:0:(service.c:1361:ptlrpc_server_handle_request()) 
ASSERTION(atomic_read(&(export)->exp_refcount) < 0x5a5a5a) failed
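
All six entries above assert the same condition (an export's exp_refcount 
being below 0x5a5a5a) but from different call sites. When collecting logs 
for a bug report, it can help to tally such entries by call site. The 
following is a minimal sketch, assuming the log text has the format shown 
above (possibly wrapped across lines by the mail client). The names 
ASSERT_RE, parse_assertions and SAMPLE are illustrative and not part of 
Lustre.

```python
import re
from collections import Counter

# Matches one LustreError assertion entry once whitespace is normalised.
# Captured fields: pid, source file, line number, function, asserted expression.
ASSERT_RE = re.compile(
    r"LustreError:\s*(?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s*"
    r"ASSERTION\((?P<expr>.*?)\)\s*failed"
)


def parse_assertions(text):
    """Return a list of dicts, one per failed assertion found in text."""
    # Collapse newlines and repeated spaces so wrapped entries match as one line.
    flat = " ".join(text.split())
    return [m.groupdict() for m in ASSERT_RE.finditer(flat)]


# Two of the entries from the log excerpt, wrapped as they appear in the mail.
SAMPLE = """
Apr 20 09:10:25 lustre1 kernel: LustreError:
5973:0:(ldlm_lock.c:165:ldlm_lock_put())
ASSERTION(atomic_read(&(lock->l_export)->exp_refcount) < 0x5a5a5a) failed

Apr 20 09:10:26 lustre1 kernel: LustreError:
6174:0:(service.c:843:ptlrpc_at_send_early_reply())
ASSERTION(atomic_read(&(reqcopy->rq_export)->exp_refcount) < 0x5a5a5a)
failed
"""

if __name__ == "__main__":
    entries = parse_assertions(SAMPLE)
    # Count how often each call site (file:function) tripped the assertion.
    by_site = Counter(f"{e['file']}:{e['func']}" for e in entries)
    for site, count in by_site.most_common():
        print(site, count)
```

Normalising whitespace before matching means the same function works on 
the raw syslog file, where each entry sits on a single line.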


This cluster uses a GigE network with bonding enabled on the Lustre servers.
The Lustre servers are currently running: 2.6.18-128.7.1.el5_lustre.1.8.1.1
The clients run the RHEL5 kernel 2.6.18-128.7.1.el5 with the Lustre 1.8.1.1 
patchless client.

Could you please let me know if this is a known problem? If not, I 
would be more than happy to open a Bugzilla entry with the relevant logs 
from the metadata server and one of the clients.

Thanks in advance.
Nirmal


