[lustre-discuss] Error on a zpool underlying an OST
Bob Ball
ball at umich.edu
Fri Mar 11 17:19:10 PST 2016
Hi, we have Lustre 2.7.58 in place on our OST and MDT/MGS (combined).
Underlying the Lustre file system is a raidz2 ZFS pool.
A few days ago we lost two disks at once from the raidz2. I replaced
one, and the resilver that started seemed to choke, so I put replacements
in for both failed disks (started roughly as sketched next), and the new
resilver now shows the status output below.
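The replace commands were along these lines; the device arguments are
reconstructed from the by-path names in the status output, so treat them
as approximate:

zpool replace ost-007 18280868502819750645 /dev/disk/by-path/pci-0000:0c:00.0-scsi-0:2:20:0
zpool replace ost-007 14369532488179106769 /dev/disk/by-path/pci-0000:0c:00.0-scsi-0:2:39:0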
[root@umdist03 ~]# zpool status -v ost-007
  pool: ost-007
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: resilvered 972G in 9h25m with 1 errors on Fri Mar 11 19:12:37 2016
config:

        NAME                                  STATE     READ WRITE CKSUM
        ost-007                               DEGRADED     0     0     1
          raidz2-0                            DEGRADED     0     0     4
            replacing-0                       DEGRADED     0     0     0
              18280868502819750645            UNAVAIL      0     0     0  was /dev/disk/by-path/pci-0000:0c:00.0-scsi-0:2:20:0-part1/old
              pci-0000:0c:00.0-scsi-0:2:20:0  ONLINE       0     0     0
            pci-0000:0c:00.0-scsi-0:2:21:0    ONLINE       0     0     0
            pci-0000:0c:00.0-scsi-0:2:22:0    ONLINE       0     0     0
            pci-0000:0c:00.0-scsi-0:2:23:0    ONLINE       0     0     0
            pci-0000:0c:00.0-scsi-0:2:24:0    ONLINE       0     0     0
            pci-0000:0c:00.0-scsi-0:2:35:0    ONLINE       0     0     0
            pci-0000:0c:00.0-scsi-0:2:36:0    ONLINE       1     0     0
            pci-0000:0c:00.0-scsi-0:2:37:0    ONLINE       0     0     0
            pci-0000:0c:00.0-scsi-0:2:38:0    ONLINE       0     0     0
            replacing-9                       UNAVAIL      0     0     0
              14369532488179106769            UNAVAIL      0     0     0  was /dev/disk/by-path/pci-0000:0c:00.0-scsi-0:2:39:0-part1/old
              pci-0000:0c:00.0-scsi-0:2:39:0  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        ost-007/ost0030:<0x2c90f>
What are my options here? If I don't care about the file, can I
identify it and then just delete it? Or is my only real option to drain
the pool and rebuild it cleanly?
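In case it matters for the suggestions, the rough plan I had for
identifying the file was something like the following. This is untested,
and it assumes that <0x2c90f> (182543 decimal) is the object number
inside the ost0030 dataset, that the object still carries its Lustre
parent FID in the trusted.fid xattr where zdb can show it, and that our
clients mount the filesystem at /lustre:

# dump the suspect object from the OST dataset and look for its Lustre parent FID
zdb -dddd ost-007/ost0030 182543

# on a client, resolve that FID to a pathname, then remove the file
lfs fid2path /lustre <FID from the zdb output>
rm <path returned by fid2path>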
Thanks for any help/advice.
bob