[lustre-discuss] free space on ldiskfs vs. zfs
Alexander I Kulyavtsev
aik at fnal.gov
Mon Aug 24 14:12:05 PDT 2015
Same question here.
6 TB out of 65 TB is about 9%. In our case about the same fraction was "missing."
My speculation was that somewhere between zpool and Linux a value reported in TB gets interpreted as TiB and then converted back to TB, or an unneeded MB-to-MiB conversion is applied twice, etc.
Here are my numbers:
We have 12 * 4 TB drives per pool, which is 48 TB (decimal).
The zpool is created as raidz2 (10+2).
zpool reports 43.5T.
The pool size should be either 48 T = 4 T * 12 or 40 T = 4 T * 10, depending on whether zpool shows raw space before parity or usable space after parity.
From the Oracle ZFS documentation, "zpool list" returns the total space without overheads, so zpool should report 48 TB rather than 43.5 TB.
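A quick way to see the two views side by side (just a sketch, using the pool name from the listing further down; "zpool list" counts raw space including parity, while "zfs list" shows usable space after the raidz2 parity is taken out):

zpool list -o name,size,allocated,free zpla-0001
zfs list -o name,used,available zpla-0001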
In my case it looked like a conversion error / interpretation issue between TB and TiB:
48 * 1000^4 / 1024^4 = 43.656
i.e. 48 TB expressed in TiB is almost exactly the 43.5T that zpool reports.
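If your zpool build supports the -p (parsable) flag for "zpool get" (I believe recent zfsonlinux releases do, but check your version), you can see the exact byte counts behind the rounded 43.5T:

zpool get -p size,free,allocated zpla-0001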
At the disk level:
~/sas2ircu 0 display
Device is a Hard disk
Enclosure # : 2
Slot # : 12
SAS Address : 5003048-0-015a-a918
State : Ready (RDY)
Size (in MB)/(in sectors) : 3815447/7814037167
Manufacturer : ATA
Model Number : HGST HUS724040AL
Firmware Revision : AA70
Serial No : PN2334PBJPW14T
GUID : 5000cca23de6204b
Protocol : SATA
Drive Type : SATA_HDD
A single disk is about 4 TB (decimal):
3815447*1024*1024 = 4000786153472
7814037167*512 = 4000787029504
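If the pool size is being counted in binary units, twelve such disks land very close to what zpool reports (rough arithmetic based on the sector count above):

7814037167 * 512 / 1024^4 = 3.6387 TiB per disk
3.6387 TiB * 12 = 43.66 TiB

which is essentially the 43.5T shown by "zpool list" below; the small remainder goes to ZFS labels, the reserved partition 9, and internal rounding.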
The vdev presents the whole disk to the zpool. There is some overhead: a little space is left in partition 9 (sdq9); see the lsblk check after the listings below.
[root@lfs1 scripts]# head -4 /etc/zfs/vdev_id.conf
alias s0 /dev/disk/by-path/pci-0000:03:00.0-sas-0x50030480015aa90c-lun-0
alias s1 /dev/disk/by-path/pci-0000:03:00.0-sas-0x50030480015aa90d-lun-0
alias s2 /dev/disk/by-path/pci-0000:03:00.0-sas-0x50030480015aa90e-lun-0
alias s3 /dev/disk/by-path/pci-0000:03:00.0-sas-0x50030480015aa90f-lun-0
...
alias s12 /dev/disk/by-path/pci-0000:03:00.0-sas-0x50030480015aa918-lun-0
...
[root@lfs1 scripts]# ls -l /dev/disk/by-path/
...
lrwxrwxrwx 1 root root 9 Jul 23 16:27 pci-0000:03:00.0-sas-0x50030480015aa918-lun-0 -> ../../sdq
lrwxrwxrwx 1 root root 10 Jul 23 16:27 pci-0000:03:00.0-sas-0x50030480015aa918-lun-0-part1 -> ../../sdq1
lrwxrwxrwx 1 root root 10 Jul 23 16:27 pci-0000:03:00.0-sas-0x50030480015aa918-lun-0-part9 -> ../../sdq9
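A quick way to see how ZFS split the disk (a sketch; sdq is the disk from the listing above, exact sizes will differ):

lsblk -b /dev/sdq
# sdq1 carries the pool data; sdq9 is the small reserved partition ZFS
# creates on whole disks (typically 8 MiB), so the per-disk loss there is negligible.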
Pool report:
[root@lfs1 scripts]# zpool list
NAME SIZE ALLOC FREE EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
zpla-0000 43.5T 10.9T 32.6T - 16% 24% 1.00x ONLINE -
zpla-0001 43.5T 11.0T 32.5T - 17% 25% 1.00x ONLINE -
zpla-0002 43.5T 10.8T 32.7T - 17% 24% 1.00x ONLINE -
[root@lfs1 scripts]#
[root@lfs1 ~]# zpool list -v zpla-0001
NAME SIZE ALLOC FREE EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
zpla-0001 43.5T 11.0T 32.5T - 17% 25% 1.00x ONLINE -
raidz2 43.5T 11.0T 32.5T - 17% 25%
s12 - - - - - -
s13 - - - - - -
s14 - - - - - -
s15 - - - - - -
s16 - - - - - -
s17 - - - - - -
s18 - - - - - -
s19 - - - - - -
s20 - - - - - -
s21 - - - - - -
s22 - - - - - -
s23 - - - - - -
[root@lfs1 ~]#
[root@lfs1 ~]# zpool get all zpla-0001
NAME PROPERTY VALUE SOURCE
zpla-0001 size 43.5T -
zpla-0001 capacity 25% -
zpla-0001 altroot - default
zpla-0001 health ONLINE -
zpla-0001 guid 5472902975201420000 default
zpla-0001 version - default
zpla-0001 bootfs - default
zpla-0001 delegation on default
zpla-0001 autoreplace off default
zpla-0001 cachefile - default
zpla-0001 failmode wait default
zpla-0001 listsnapshots off default
zpla-0001 autoexpand off default
zpla-0001 dedupditto 0 default
zpla-0001 dedupratio 1.00x -
zpla-0001 free 32.5T -
zpla-0001 allocated 11.0T -
zpla-0001 readonly off -
zpla-0001 ashift 12 local
zpla-0001 comment - default
zpla-0001 expandsize - -
zpla-0001 freeing 0 default
zpla-0001 fragmentation 17% -
zpla-0001 leaked 0 default
zpla-0001 feature@async_destroy enabled local
zpla-0001 feature@empty_bpobj active local
zpla-0001 feature@lz4_compress active local
zpla-0001 feature@spacemap_histogram active local
zpla-0001 feature@enabled_txg active local
zpla-0001 feature@hole_birth active local
zpla-0001 feature@extensible_dataset enabled local
zpla-0001 feature@embedded_data active local
zpla-0001 feature@bookmarks enabled local
Alex.
On Aug 19, 2015, at 8:18 AM, Götz Waschk <goetz.waschk at gmail.com> wrote:
> Dear Lustre experts,
>
> I have configured two different Lustre instances, both using Lustre
> 2.5.3, one with ldiskfs on RAID-6 hardware RAID and one using ZFS and
> RAID-Z2, using the same type of hardware. I was wondering why I have
> 24 TB less space available, when I should have the same amount of
> parity used:
>
> # lfs df
> UUID 1K-blocks Used Available Use% Mounted on
> fs19-MDT0000_UUID 50322916 472696 46494784 1%
> /testlustre/fs19[MDT:0]
> fs19-OST0000_UUID 51923288320 12672 51923273600 0%
> /testlustre/fs19[OST:0]
> fs19-OST0001_UUID 51923288320 12672 51923273600 0%
> /testlustre/fs19[OST:1]
> fs19-OST0002_UUID 51923288320 12672 51923273600 0%
> /testlustre/fs19[OST:2]
> fs19-OST0003_UUID 51923288320 12672 51923273600 0%
> /testlustre/fs19[OST:3]
> filesystem summary: 207693153280 50688 207693094400 0% /testlustre/fs19
> UUID 1K-blocks Used Available Use% Mounted on
> fs18-MDT0000_UUID 47177700 482152 43550028 1%
> /lustre/fs18[MDT:0]
> fs18-OST0000_UUID 58387106064 6014088200 49452733560 11%
> /lustre/fs18[OST:0]
> fs18-OST0001_UUID 58387106064 5919753028 49547068928 11%
> /lustre/fs18[OST:1]
> fs18-OST0002_UUID 58387106064 5944542316 49522279640 11%
> /lustre/fs18[OST:2]
> fs18-OST0003_UUID 58387106064 5906712004 49560109952 11%
> /lustre/fs18[OST:3]
> filesystem summary: 233548424256 23785095548 198082192080 11% /lustre/fs18
>
> fs18 is using ldiskfs, while fs19 is ZFS:
> # zpool list
> NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
> lustre-ost1 65T 18,1M 65,0T 0% 1.00x ONLINE -
> # zfs list
> NAME USED AVAIL REFER MOUNTPOINT
> lustre-ost1 13,6M 48,7T 311K /lustre-ost1
> lustre-ost1/ost1 12,4M 48,7T 12,4M /lustre-ost1/ost1
>
>
> Any idea where my 6 TB per OST went?
>
> Regards, Götz Waschk
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org