[Lustre-discuss] WARNING: data corruption issue found in 1.8.x releases

Johann Lombardi johann at sun.com
Wed Sep 9 08:00:21 PDT 2009


A bug has been identified in the 1.8 releases (1.8.0, 1.8.0.1 & 1.8.1 are
impacted) that can cause data corruption on the OSTs. This problem is
related to the OSS read cache feature that was introduced in 1.8.0.
This can happen when a bulk read or write request is aborted due to the
client being evicted or because the data transfer over the network has
timed out. More details are available in bug 20560:
https://bugzilla.lustre.org/show_bug.cgi?id=20560

A patch is under testing and will be included in 1.8.1.1.
Until 1.8.1.1 is available, we recommend disabling the OSS read cache
feature. It can be disabled by running the following two commands on
the OSSs:
# lctl set_param obdfilter.*.writethrough_cache_enable=0
# lctl set_param obdfilter.*.read_cache_enable=0

These settings do not persist, so the commands have to be run again each
time an OST is restarted.
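
Since the commands need to be reapplied after every OST restart, it may be
convenient to wrap them in a small script. The following is only a sketch,
not an official tool: the function names and the LCTL override variable are
hypothetical and exist purely for illustration. It assumes lctl is in the
PATH on the OSS and that the OSTs are already mounted when it runs.

```shell
#!/bin/sh
# Sketch of a helper to disable the OSS read cache on all local OSTs.
# LCTL is a hypothetical override (e.g. LCTL=echo for a dry run);
# by default the real lctl binary is used.
LCTL="${LCTL:-lctl}"

# Turn off both cache tunables; stop at the first failure.
disable_oss_cache() {
    for param in writethrough_cache_enable read_cache_enable; do
        "$LCTL" set_param "obdfilter.*.${param}=0" || return 1
    done
}

# Print the current value of both tunables so the change can be checked.
show_oss_cache() {
    for param in writethrough_cache_enable read_cache_enable; do
        "$LCTL" get_param "obdfilter.*.${param}" || return 1
    done
}
```

After sourcing the file on an OSS, running "disable_oss_cache && show_oss_cache"
as root should report both parameters as 0 for every OST. Calling it from
whatever mechanism mounts the OSTs at boot is left to the administrator.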

Best regards,
Johann, for the Lustre team


