[lustre-discuss] free space on ldiskfs vs. zfs

Alexander I Kulyavtsev aik at fnal.gov
Mon Aug 24 14:12:05 PDT 2015


Same question here.

6 TB out of 65 TB is roughly 9%. In our case about the same fraction was "missing."

My speculation was that somewhere between zpool and Linux a value reported in TB is interpreted as TiB and then converted back to TB, or an unneeded MB-to-MiB conversion is applied twice, etc.

Here are my numbers:
We have 12 * 4 TB drives per pool, which is 48 TB (decimal).
The zpool is created as raidz2, 10+2.
zpool reports 43.5T.
The pool size should be 48 T = 4 T * 12, or 40 T = 4 T * 10 (depending on whether zpool shows the space before or after parity).
From the Oracle ZFS documentation, "zpool list" returns the total space without overheads, so zpool should report 48 TB rather than 43.5 TB.

In my case it looked like a conversion/interpretation issue between TB and TiB:

48*1000*1000*1000*1000/1024/1024/1024/1024 = 43.65574568510055541992
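
(For reference, the same arithmetic can be redone with bc; this is just a sketch, and any sufficiently large scale works. 48 TB raw lands almost exactly on the reported figure when printed in binary "T" units, while the post-parity 40 TB does not:)

$ echo 'scale=20; 48*1000^4/1024^4' | bc
43.65574568510055541992
$ echo 'scale=20; 40*1000^4/1024^4' | bc
36.37978807091712951660

So whatever zpool prints, it looks much more like 48 TB raw re-expressed in TiB than like 40 TB of post-parity space.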


At the disk level:

~/sas2ircu 0 display

Device is a Hard disk
  Enclosure #                             : 2
  Slot #                                  : 12
  SAS Address                             : 5003048-0-015a-a918
  State                                   : Ready (RDY)
  Size (in MB)/(in sectors)               : 3815447/7814037167
  Manufacturer                            : ATA     
  Model Number                            : HGST HUS724040AL
  Firmware Revision                       : AA70
  Serial No                               : PN2334PBJPW14T
  GUID                                    : 5000cca23de6204b
  Protocol                                : SATA
  Drive Type                              : SATA_HDD

The size of one disk is about 4 TB (decimal):

3815447*1024*1024 = 4000786153472
7814037167*512  = 4000787029504
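
(If one wants to cross-check the sas2ircu figures straight from the block device, something like the following should do; /dev/sdq is the device name taken from the by-path listing further down, and the expected byte count is simply 7814037167 sectors * 512:)

blockdev --getsize64 /dev/sdq      # should print 4000787029504 bytes
lsblk -b -o NAME,SIZE /dev/sdq     # same total, plus per-partition sizes in bytes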

The vdev presents the whole disk to the zpool. There is some overhead: a little space is left in partition 9 (sdq9), as seen in the device listing below.

[root@lfs1 scripts]# head -4 /etc/zfs/vdev_id.conf
alias s0  /dev/disk/by-path/pci-0000:03:00.0-sas-0x50030480015aa90c-lun-0
alias s1  /dev/disk/by-path/pci-0000:03:00.0-sas-0x50030480015aa90d-lun-0
alias s2  /dev/disk/by-path/pci-0000:03:00.0-sas-0x50030480015aa90e-lun-0
alias s3  /dev/disk/by-path/pci-0000:03:00.0-sas-0x50030480015aa90f-lun-0
...
alias s12  /dev/disk/by-path/pci-0000:03:00.0-sas-0x50030480015aa918-lun-0
...

[root@lfs1 scripts]# ls -l  /dev/disk/by-path/
...
lrwxrwxrwx 1 root root  9 Jul 23 16:27 pci-0000:03:00.0-sas-0x50030480015aa918-lun-0 -> ../../sdq
lrwxrwxrwx 1 root root 10 Jul 23 16:27 pci-0000:03:00.0-sas-0x50030480015aa918-lun-0-part1 -> ../../sdq1
lrwxrwxrwx 1 root root 10 Jul 23 16:27 pci-0000:03:00.0-sas-0x50030480015aa918-lun-0-part9 -> ../../sdq9
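
(To see how much actually sits in that ninth partition, a quick sketch; on ZFS on Linux the reserved partition 9 is normally only about 8 MiB, so it cannot account for a multi-TB difference:)

lsblk -b -o NAME,SIZE,TYPE /dev/sdq    # sizes in bytes for sdq, sdq1, sdq9
parted /dev/sdq unit MiB print         # same information from the partition table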

Pool report:

[root@lfs1 scripts]# zpool list
NAME        SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
zpla-0000  43.5T  10.9T  32.6T         -    16%    24%  1.00x  ONLINE  -
zpla-0001  43.5T  11.0T  32.5T         -    17%    25%  1.00x  ONLINE  -
zpla-0002  43.5T  10.8T  32.7T         -    17%    24%  1.00x  ONLINE  -
[root@lfs1 scripts]# 

[root@lfs1 ~]# zpool list -v zpla-0001
NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
zpla-0001  43.5T  11.0T  32.5T         -    17%    25%  1.00x  ONLINE  -
  raidz2  43.5T  11.0T  32.5T         -    17%    25%
    s12      -      -      -         -      -      -
    s13      -      -      -         -      -      -
    s14      -      -      -         -      -      -
    s15      -      -      -         -      -      -
    s16      -      -      -         -      -      -
    s17      -      -      -         -      -      -
    s18      -      -      -         -      -      -
    s19      -      -      -         -      -      -
    s20      -      -      -         -      -      -
    s21      -      -      -         -      -      -
    s22      -      -      -         -      -      -
    s23      -      -      -         -      -      -
[root@lfs1 ~]# 

[root@lfs1 ~]# zpool get all zpla-0001
NAME       PROPERTY                    VALUE                       SOURCE
zpla-0001  size                        43.5T                       -
zpla-0001  capacity                    25%                         -
zpla-0001  altroot                     -                           default
zpla-0001  health                      ONLINE                      -
zpla-0001  guid                        5472902975201420000         default
zpla-0001  version                     -                           default
zpla-0001  bootfs                      -                           default
zpla-0001  delegation                  on                          default
zpla-0001  autoreplace                 off                         default
zpla-0001  cachefile                   -                           default
zpla-0001  failmode                    wait                        default
zpla-0001  listsnapshots               off                         default
zpla-0001  autoexpand                  off                         default
zpla-0001  dedupditto                  0                           default
zpla-0001  dedupratio                  1.00x                       -
zpla-0001  free                        32.5T                       -
zpla-0001  allocated                   11.0T                       -
zpla-0001  readonly                    off                         -
zpla-0001  ashift                      12                          local
zpla-0001  comment                     -                           default
zpla-0001  expandsize                  -                           -
zpla-0001  freeing                     0                           default
zpla-0001  fragmentation               17%                         -
zpla-0001  leaked                      0                           default
zpla-0001  feature@async_destroy       enabled                     local
zpla-0001  feature@empty_bpobj         active                      local
zpla-0001  feature@lz4_compress        active                      local
zpla-0001  feature@spacemap_histogram  active                      local
zpla-0001  feature@enabled_txg         active                      local
zpla-0001  feature@hole_birth          active                      local
zpla-0001  feature@extensible_dataset  enabled                     local
zpla-0001  feature@embedded_data       active                      local
zpla-0001  feature@bookmarks           enabled                     local
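
(For completeness: as far as I understand it, for raidz vdevs "zpool list" reports the raw space including parity, while the post-parity estimate shows up as AVAIL in "zfs list". Comparing the two on the same pool makes the accounting explicit; commands only, output omitted here:)

zpool list zpla-0001                     # raw space, parity included, binary units
zfs list -o name,used,avail zpla-0001    # usable space after parity and reservations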

Alex.

On Aug 19, 2015, at 8:18 AM, Götz Waschk <goetz.waschk at gmail.com> wrote:

> Dear Lustre experts,
> 
> I have configured two different Lustre instances, both using Lustre
> 2.5.3, one with ldiskfs on RAID-6 hardware RAID and one using ZFS and
> RAID-Z2, using the same type of hardware. I was wondering why I have
> 24 TB less space available, when I should have the same amount of
> parity used:
> 
> # lfs df
> UUID                   1K-blocks        Used   Available Use% Mounted on
> fs19-MDT0000_UUID       50322916      472696    46494784   1%
> /testlustre/fs19[MDT:0]
> fs19-OST0000_UUID    51923288320       12672 51923273600   0%
> /testlustre/fs19[OST:0]
> fs19-OST0001_UUID    51923288320       12672 51923273600   0%
> /testlustre/fs19[OST:1]
> fs19-OST0002_UUID    51923288320       12672 51923273600   0%
> /testlustre/fs19[OST:2]
> fs19-OST0003_UUID    51923288320       12672 51923273600   0%
> /testlustre/fs19[OST:3]
> filesystem summary:  207693153280       50688 207693094400   0% /testlustre/fs19
> UUID                   1K-blocks        Used   Available Use% Mounted on
> fs18-MDT0000_UUID       47177700      482152    43550028   1%
> /lustre/fs18[MDT:0]
> fs18-OST0000_UUID    58387106064  6014088200 49452733560  11%
> /lustre/fs18[OST:0]
> fs18-OST0001_UUID    58387106064  5919753028 49547068928  11%
> /lustre/fs18[OST:1]
> fs18-OST0002_UUID    58387106064  5944542316 49522279640  11%
> /lustre/fs18[OST:2]
> fs18-OST0003_UUID    58387106064  5906712004 49560109952  11%
> /lustre/fs18[OST:3]
> filesystem summary:  233548424256 23785095548 198082192080  11% /lustre/fs18
> 
> fs18 is using ldiskfs, while fs19 is ZFS:
> # zpool list
> NAME          SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
> lustre-ost1    65T  18,1M  65,0T     0%  1.00x  ONLINE  -
> # zfs list
> NAME               USED  AVAIL  REFER  MOUNTPOINT
> lustre-ost1       13,6M  48,7T   311K  /lustre-ost1
> lustre-ost1/ost1  12,4M  48,7T  12,4M  /lustre-ost1/ost1
> 
> 
> Any idea where my 6 TB per OST went?
> 
> Regards, Götz Waschk


