[lustre-discuss] Lustre-2.10.6 frequently hangs during OST data migration

Stephane Thiell sthiell at stanford.edu
Sat Mar 30 10:18:27 PDT 2019


Hi Tung-Han,

Your stack trace looks similar to one we saw just yesterday on our 2.10.6 system.

I’ve opened https://jira.whamcloud.com/browse/LU-12136 to track the issue.

Best,

Stephane


> On Mar 29, 2019, at 8:47 PM, Tung-Han Hsieh <thhsieh at twcp1.phys.ntu.edu.tw> wrote:
> 
> Dear All,
> 
> Our system was recently upgraded to lustre-2.10.6. We are migrating data
> from some almost-full OSTs to a newly installed file server, but we often
> see the file system freeze for about 30 seconds and then return to normal
> (this can happen several times within 5 minutes).
> 
> Our procedure is as follows.
> 
> 1. On the MDS, we disabled new object creation on the OSTs that are almost full:
>   echo 0 > /proc/fs/lustre/osc/chome-OST0000-osc-MDT0000/max_create_count
>   echo 0 > /proc/fs/lustre/osc/chome-OST0001-osc-MDT0000/max_create_count
>   echo 0 > /proc/fs/lustre/osc/chome-OST0002-osc-MDT0000/max_create_count
>   ....
> 
> 2. Our system has 40 OSTs, of which 36 are almost full, so they are all
>   marked with the above command. Our total OST capacity is 286 TB. We are
>   moving part of their data to the remaining 4 new OSTs in the following
>   standard way (a roughly equivalent sketch follows this procedure):
> 
>   cp -a /path/to/data /path/to/data.tmp
>   mv /path/to/data.tmp /path/to/data
> 
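> For reference, a minimal sketch of roughly equivalent commands; the client
> mount point /chome below is an assumption, and lfs_migrate must be available
> in the client tools:
> 
>   # On the MDS: same effect as the echo above, via lctl
>   lctl set_param osc.chome-OST0000-osc-MDT0000.max_create_count=0
> 
>   # On a client: migrate files off a full OST in place (preserving the inode),
>   # as an alternative to the cp/mv approach
>   lfs find /chome --obd chome-OST0000_UUID | lfs_migrate -y
> 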
> In the beginning, everything looked fine. But after one week of running,
> the progress became slower and slower. Then we found that the file system
> often froze for a while when the data migration was running, even though
> there was almost no load on the whole system.
> 
> It is also strange that during the past week we did not see any messages
> in 'dmesg' on the MDT, the OSTs, or the client. Then last night, only the
> MDT printed these 'dmesg' messages:
> 
> ==============================================================================
> [410649.811086] LNet: Service thread pid 3516 was inactive for 200.27s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
> [410649.811175] Pid: 3516, comm: mdt00_003
> [410649.811201]
> [410649.811202] Call Trace:
> [410649.811250]  [<ffffffff81033625>] ? check_preempt_curr+0x75/0xa0
> [410649.811278]  [<ffffffff8103366b>] ? ttwu_do_wakeup+0x1b/0xa0
> [410649.811306]  [<ffffffff810376c1>] ? ttwu_do_activate.constprop.160+0x61/0x70
> [410649.811336]  [<ffffffff8103c40a>] ? try_to_wake_up+0x1da/0x280
> [410649.811367]  [<ffffffff814106ba>] schedule+0x3a/0x50
> [410649.811393]  [<ffffffff81410a95>] schedule_timeout+0x145/0x210
> [410649.811421]  [<ffffffff8104d090>] ? process_timeout+0x0/0x10
> [410649.811451]  [<ffffffffa0b6dc88>] osp_precreate_reserve+0x328/0x8b0 [osp]
> [410649.811484]  [<ffffffffa014f026>] ? do_get_write_access+0x396/0x4d0 [jbd2]
> [410649.811515]  [<ffffffff81112a10>] ? __getblk+0x20/0x2e0
> [410649.811542]  [<ffffffff8103c4b0>] ? default_wake_function+0x0/0x10
> [410649.811571]  [<ffffffffa0b64759>] osp_declare_create+0x1a9/0x680 [osp]
> [410649.811603]  [<ffffffffa0ab2a10>] lod_sub_declare_create+0xe0/0x270 [lod]
> [410649.811633]  [<ffffffffa0aabdc7>] lod_qos_declare_object_on+0xc7/0x3d0 [lod]
> [410649.811664]  [<ffffffffa0aab7fe>] ? lod_statfs_and_check+0xae/0x5b0 [lod]
> [410649.811694]  [<ffffffffa0aacfd4>] lod_alloc_qos.constprop.10+0xe64/0x17b0 [lod]
> [410649.811741]  [<ffffffffa01741b0>] ? ldiskfs_map_blocks+0x180/0x1e0 [ldiskfs]
> [410649.811772]  [<ffffffffa0ab07ea>] lod_qos_prep_create+0x12ea/0x2910 [lod]
> [410649.811803]  [<ffffffffa07c8c94>] ? qsd_op_begin+0x114/0x4d0 [lquota]
> [410649.811833]  [<ffffffffa0ab23f0>] lod_prepare_create+0x2c0/0x410 [lod]
> [410649.811863]  [<ffffffffa0aa7ccd>] lod_declare_striped_create+0x10d/0xa50 [lod]
> [410649.811908]  [<ffffffffa0aaa9b9>] lod_declare_create+0x1e9/0x5a0 [lod]
> [410649.811938]  [<ffffffffa0b1af26>] mdd_declare_create_object_internal+0x116/0x320 [mdd]
> [410649.811983]  [<ffffffffa0b00c9c>] mdd_declare_create_object.isra.19+0x3c/0xbb0 [mdd]
> [410649.812028]  [<ffffffffa0b00044>] ? mdd_linkea_prepare+0x294/0x590 [mdd]
> [410649.812058]  [<ffffffffa0b0f90e>] mdd_create+0x88e/0x27d0 [mdd]
> [410649.812088]  [<ffffffffa08173e0>] ? osd_xattr_get+0x80/0x890 [osd_ldiskfs]
> [410649.812120]  [<ffffffffa09fd3ff>] mdt_reint_open+0x225f/0x3890 [mdt]
> [410649.812158]  [<ffffffffa0431276>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
> [410649.812191]  [<ffffffffa02ea3fa>] ? upcall_cache_get_entry+0x29a/0x890 [obdclass]
> [410649.812237]  [<ffffffffa02ef409>] ? lu_ucred+0x19/0x30 [obdclass]
> [410649.812267]  [<ffffffffa09df7db>] ? ucred_set_jobid+0x5b/0x70 [mdt]
> [410649.812297]  [<ffffffffa09f1810>] mdt_reint_rec+0xa0/0x210 [mdt]
> [410649.812326]  [<ffffffffa09de91d>] mdt_reint_internal+0x63d/0xa50 [mdt]
> [410649.812356]  [<ffffffffa09df07a>] mdt_intent_reint+0x21a/0x430 [mdt]
> [410649.812385]  [<ffffffffa09da7ed>] mdt_intent_policy+0x5bd/0xde0 [mdt]
> [410649.812418]  [<ffffffffa03aa257>] ldlm_lock_enqueue+0x3a7/0x9c0 [ptlrpc]
> [410649.812453]  [<ffffffffa03d28e3>] ldlm_handle_enqueue0+0x9c3/0x1790 [ptlrpc]
> [410649.812490]  [<ffffffffa04211c0>] ? req_capsule_client_get+0x10/0x20 [ptlrpc]
> [410649.812541]  [<ffffffffa045d16c>] ? tgt_request_preprocess.isra.17+0x25c/0x1250 [ptlrpc]
> [410649.812593]  [<ffffffffa044ee95>] ? tgt_lookup_reply+0x35/0x1c0 [ptlrpc]
> [410649.812629]  [<ffffffffa045b74d>] tgt_enqueue+0x5d/0x250 [ptlrpc]
> [410649.812664]  [<ffffffffa045ed1d>] tgt_request_handle+0x8ad/0x15a0 [ptlrpc]
> [410649.812701]  [<ffffffffa03f84a4>] ? lustre_msg_get_transno+0x84/0x100 [ptlrpc]
> [410649.812752]  [<ffffffffa04084e1>] ptlrpc_main+0x1051/0x2a40 [ptlrpc]
> [410649.812780]  [<ffffffff8140ff44>] ? __schedule+0x294/0x940
> [410649.812815]  [<ffffffffa0407490>] ? ptlrpc_main+0x0/0x2a40 [ptlrpc]
> [410649.812844]  [<ffffffff8105c817>] kthread+0x87/0x90
> [410649.812870]  [<ffffffff81413c34>] kernel_thread_helper+0x4/0x10
> [410649.812898]  [<ffffffff8105c790>] ? kthread+0x0/0x90
> [410649.812923]  [<ffffffff81413c30>] ? kernel_thread_helper+0x0/0x10
> [410649.812950]
> [410649.812970] LustreError: dumping log to /tmp/lustre-log.1553888141.3516
> [410649.818730] wanted to write 3985 but wrote 3518
> [410749.275270] LNet: Service thread pid 3516 completed after 299.99s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
> ==============================================================================
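> 
> (As an aside, the binary debug dump referenced above can be converted to
> readable text on the MDS with lctl debug_file; the output path below is
> just an example:
> 
>   lctl debug_file /tmp/lustre-log.1553888141.3516 /tmp/lustre-log.txt
> )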
> 
> Our lustre-2.10.6 was compiled against the SLES11 SP3 kernel
> 
> 	linux-3.0.101-138.gcdbe806
> 
> using the ldiskfs backend. The hardware spec of our MDS is:
> 
> CPU: Intel Xeon E5640 @ 2.67GHz (single CPU)
> RAM: 8 GB
> MGS: 1GB (under RAID 1)
> MDT: 230GB (under RAID 1)
> RAID controller: LSI ServeRAID M1015 SAS/SATA Controller
> 
> 
> Is there any suggestion for fixing this problem?
> 
> Thank you very much.
> 
> 
> T.H.Hsieh
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


