[Lustre-discuss] md and mdadm in 1.8.7-wc1

Samuel Aparicio saparicio at bccrc.ca
Mon Mar 19 22:05:42 PDT 2012


I am wondering whether anyone has experienced issues with md / mdadm in the 1.8.7-wc1 patched server kernels.
We have historically used software RAID on our OSS machines because, in our hands, it provided 20-30% better throughput than
the RAID provided by our storage arrays (Coraid ATA over Ethernet shelves). In 1.8.5 this worked more or less flawlessly,
but we now have new storage, with 3 TB rather than 2 TB disks, and new servers with 1.8.7-wc1 patched kernels.

md is unable to reliably shut down and restart arrays after the machines have been rebooted (cleanly): the disks are no
longer recognized as part of the arrays they were created in. The kernel log shows the messages appended below,
which include the following:

 md: bug in file drivers/md/md.c, line 1677

Looking through the mdadm changelogs, it seems there are some possible patches for md in 2.6.18 kernels, but I cannot tell
whether they are applied here, or whether this is even relevant.

I am not clear whether this is an issue with the 3 TB disks, or something else related to mdadm and the patched server kernel. My suspicion
is that something has broken with > 2.2 TB disks.
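
As a rough check on the 2.2 TB figure, here is a small Python sketch of the arithmetic. It assumes 512-byte sectors and
assumes that the SZ: value in the "md: rdev" log line below is a size in 1 KiB blocks (which is consistent with a 3 TB
disk, but is an assumption, not something confirmed from the kernel source here). The function name is just illustrative.

```python
# Sketch: does a device's sector count still fit in an unsigned 32-bit value?
# Assumptions (not confirmed by the log itself): 512-byte sectors, and the
# SZ: field of the "md: rdev" line is expressed in 1 KiB blocks.

SECTOR_BYTES = 512
MAX_U32 = 2**32 - 1  # largest value an unsigned 32-bit counter can hold

# Largest capacity addressable with 32-bit sector numbers: 2^32 * 512 bytes.
LIMIT_BYTES = 2**32 * SECTOR_BYTES  # 2199023255552 bytes, about 2.2 TB


def sectors_from_kib(size_kib):
    """Convert a size in 1 KiB blocks to a count of 512-byte sectors."""
    return size_kib * 1024 // SECTOR_BYTES


def overflows_u32_sectors(size_kib):
    """True if the sector count cannot be held in an unsigned 32-bit value."""
    return sectors_from_kib(size_kib) > MAX_U32


# SZ: value taken from the log line for etherd/e14.16.
SZ_KIB = 2930265344

if __name__ == "__main__":
    print("32-bit sector limit (bytes):", LIMIT_BYTES)
    print("device size (bytes):", SZ_KIB * 1024)
    print("device size (sectors):", sectors_from_kib(SZ_KIB))
    print("overflows 32-bit sector count:", overflows_u32_sectors(SZ_KIB))
```

Under those assumptions the new disks come out at about 5.86 billion sectors, which is past the 32-bit limit, whereas a 2 TB
disk stays under it. That fits the suspicion but does not prove that a 32-bit sector count is the actual cause.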

Does anyone have any ideas about this?

thanks
sam aparicio

---------------
Mar 19 21:34:48 OST3 kernel: md:        **********************************
Mar 19 21:34:48 OST3 kernel: 
Mar 19 21:35:20 OST3 kernel: md: bug in file drivers/md/md.c, line 1677
Mar 19 21:35:20 OST3 kernel: 
Mar 19 21:35:20 OST3 kernel: md:        **********************************
Mar 19 21:35:20 OST3 kernel: md:        * <COMPLETE RAID STATE PRINTOUT> *
Mar 19 21:35:20 OST3 kernel: md:        **********************************
Mar 19 21:35:20 OST3 kernel: md142: 
Mar 19 21:35:20 OST3 kernel: md141: 
Mar 19 21:35:20 OST3 kernel: md140: <etherd/e14.16><etherd/e14.15><etherd/e14.14><etherd/e14.13><etherd/e14.12><etherd/e14.11><etherd/e14.10><etherd/e14.9><etherd/e14.8><etherd/e14.7><etherd/e14.6><etherd/e14.5><etherd/e14.4><etherd/e14.3><etherd/e14.2><etherd/e14.1><etherd/e14.0>
Mar 19 21:35:20 OST3 kernel: md: rdev etherd/e14.16, SZ:2930265344 F:0 S:0 DN:16
Mar 19 21:35:20 OST3 kernel: md: rdev superblock:
Mar 19 21:35:20 OST3 kernel: md:  SB: (V:1.0.0) ID:<9859f274.34313a61.00000030.00000000> CT:5d3314af
Mar 19 21:35:20 OST3 kernel: md:     L234772919 S861164367 ND:1970037550 RD:1919251571 md1667457582 LO:65536 CS:196610
Mar 19 21:35:20 OST3 kernel: md:     UT:00000800 ST:0 AD:1565563648 WD:1 FD:8 SD:0 CSUM:00000000 E:00000000
Mar 19 21:35:20 OST3 kernel:      D  0:  DISK<N:-1,(-1,-1),R:-1,S:-1>
Mar 19 21:35:20 OST3 kernel:      D  1:  DISK<N:-1,(-1,-1),R:-1,S:-1>
Mar 19 21:35:20 OST3 kernel:      D  2:  DISK<N:-1,(-1,-1),R:-1,S:-1>
Mar 19 21:35:20 OST3 kernel:      D  3:  DISK<N:-1,(-1,-1),R:-1,S:-1>
Mar 19 21:35:20 OST3 kernel: md:     THIS:  DISK<N:0,(0,0),R:0,S:0>
< output truncated >

Professor Samuel Aparicio BM BCh PhD FRCPath
Nan and Lorraine Robertson Chair UBC/BC Cancer Agency
675 West 10th, Vancouver V5Z 1L3, Canada.
office: +1 604 675 8200 lab website http://molonc.bccrc.ca