ZFS

ZFS is a combined file system and volume manager developed by Sun Microsystems Inc. for enterprise computing systems. It runs on a single server with multiple storage drives, pooling them and managing them as a single entity; capacity grows by adding drives to the pool. ZFS is highly scalable and supports very large files. With each data write it stores at least two copies of the metadata, which records details such as disk sectors, block sizes, and checksums. The checksums let ZFS verify data consistency and, in mirrored or RAID setups, repair a damaged copy from an intact one.
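The verify-and-repair behaviour can be illustrated with a toy model. The sketch below is not ZFS code: the class name MirroredBlock is hypothetical, a two-way mirror is assumed, and SHA-256 stands in for whichever checksum algorithm a pool is configured with. It only shows the idea of checking every copy against a separately stored checksum and rewriting a bad copy from a good one.

```python
import hashlib


class MirroredBlock:
    """Toy two-way mirror (hypothetical): one block stored as two copies."""

    def __init__(self, data: bytes):
        # Each bytearray stands in for the block as stored on one drive.
        self.copies = [bytearray(data), bytearray(data)]
        # The checksum is kept apart from the data it protects, so a
        # corrupted copy cannot silently carry a matching checksum.
        self.checksum = hashlib.sha256(data).digest()

    def read(self) -> bytes:
        good = None
        bad_indexes = []
        for index, copy in enumerate(self.copies):
            if hashlib.sha256(copy).digest() == self.checksum:
                if good is None:
                    good = bytes(copy)
            else:
                bad_indexes.append(index)
        if good is None:
            raise IOError("every copy failed checksum verification")
        # Self-heal: overwrite each damaged copy with the verified data.
        for index in bad_indexes:
            self.copies[index] = bytearray(good)
        return good


block = MirroredBlock(b"payload written by the application")
block.copies[0][0] ^= 0xFF          # simulate silent corruption on one drive
recovered = block.read()            # returns verified data and repairs copy 0
```

Without redundancy the same check can still detect the corruption, but there is no intact copy to repair from, which is why the repair step is tied to mirrored or RAID layouts.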

Challenge

RAIDZ combines storage devices into a single storage unit, much like regular RAID, but it is designed specifically for use with the ZFS file system. Its most notable difference from regular RAID is that block sizes can vary in each striped row. Achieving optimal file system performance on modern hardware can nonetheless be challenging, because RAIDZ needs fine-tuning.
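To make the variable-size rows concrete, here is a rough sketch of how much raw space a single logical block could occupy on a RAIDZ group. The arithmetic is modeled on the allocation-size calculation in OpenZFS as I understand it, so treat it as an illustration rather than a reference. The function name raidz_allocated_bytes is hypothetical, and the 16-drive RAIDZ2 layout with 4 KiB sectors (ashift 12) is an assumption, not a configuration stated in this article.

```python
def raidz_allocated_bytes(block_bytes: int, drives: int, parity: int,
                          ashift: int = 12) -> int:
    """Approximate raw bytes consumed by one logical block on RAIDZ.

    Hypothetical helper: drives is the group width, parity is 1, 2 or 3,
    and ashift is log2 of the sector size (12 means 4 KiB sectors).
    """
    sector = 1 << ashift
    data_sectors = -(-block_bytes // sector)          # ceiling division
    data_per_row = drives - parity                    # data columns per row
    rows = -(-data_sectors // data_per_row)           # rows the block spans
    total = data_sectors + parity * rows              # add parity per row
    multiple = parity + 1
    total = -(-total // multiple) * multiple          # pad to parity + 1
    return total * sector


# Assumed layout: 16 drives, double parity, 4 KiB sectors.
for size in (4096, 8192, 131072):
    allocated = raidz_allocated_bytes(size, drives=16, parity=2)
    print(f"{size:>7} B logical -> {allocated:>7} B allocated "
          f"({size / allocated:.0%} space efficiency)")
```

Under these assumptions small blocks occupy short rows that are mostly parity and padding, while large blocks fill full-width rows, which is one reason record size and sector size matter so much when tuning RAIDZ.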

Tests indicate that RAIDZ and RAIDZ2 deliver only about half of the maximum performance for writes, and that reads on striped volumes are, surprisingly, even slower than writes.

xiRAID, a software RAID engine by Xinnor, can be used to enhance ZFS performance.

Potential Applications

  • Lustre parallel file system clusters that use ZFS on object storage devices (OSD) and need a high-performance back end.
  • All-flash backup targets. Backing up modern all-flash arrays (AFAs) requires backup storage with a performance level in line with the primary storage.
  • Data capture solutions. Such solutions can be relevant in the cybersecurity space, 5G base station development and testing, and beyond.
  • Storage for video post-production and other sequential Media and Entertainment workloads.

Test Setup

CPU: AMD EPYC 7702P (64 cores @2.0 GHz base)

RAM: 128 GB

NVMe: 16 x Western Digital Ultrastar SN840 3.2 TB

OS: Oracle Linux with 5.4.17-2102.203.6.el8uek.x86_64 kernel

xiRAID and RAIDZ Comparison

Sequential writing on RAID6

Workload           Performance, GB/s   Avg CPU load, %     Most consumed cores
                   xiRAID    RAIDZ2    xiRAID    RAIDZ2    xiRAID     RAIDZ2
1 queue, 32 deep   7.7       0.9       1.8       3.2       1x4%       1x60%
2 queues, 32 deep  14.8      1.6       3.5       5.6       2x5%       2x56%
4 queues, 32 deep  28.5      3.5       6.8       7.8       4x7%       1x70%
8 queues, 32 deep  37.3      3.6       10.3      8.1       8x11%      1x100%

Sequential reading with 2 failed drives

Workload           Performance, GB/s   Avg CPU load, %     Most consumed cores
                   xiRAID    RAIDZ2    xiRAID    RAIDZ2    xiRAID     RAIDZ2
1 queue, 32 deep   8.0       1.0       3.3       1         1x6%       1x45%
2 queues, 32 deep  12.0      1.1       4.2       2.7       No spikes  3x30%
4 queues, 32 deep  21.0      2.5       8.4       9         No spikes  4x26%
8 queues, 32 deep  37.5      3.0       14.3      17        No spikes  8x20%
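
The throughput gap in the two tables is easier to compare as a ratio. The short sketch below only restates the GB/s figures reported above and divides the xiRAID result by the RAIDZ2 result for each workload; the variable and function names are illustrative and no new measurements are introduced.

```python
# (queues, xiRAID GB/s, RAIDZ2 GB/s), copied from the tables above.
sequential_write = [(1, 7.7, 0.9), (2, 14.8, 1.6), (4, 28.5, 3.5), (8, 37.3, 3.6)]
degraded_read = [(1, 8.0, 1.0), (2, 12.0, 1.1), (4, 21.0, 2.5), (8, 37.5, 3.0)]


def throughput_ratios(rows):
    """Return (queues, xiRAID / RAIDZ2) for each measured workload."""
    return [(queues, xiraid / raidz2) for queues, xiraid, raidz2 in rows]


for title, rows in (("Sequential write", sequential_write),
                    ("Sequential read, 2 failed drives", degraded_read)):
    print(title)
    for queues, ratio in throughput_ratios(rows):
        print(f"  {queues} queue(s), 32 deep: xiRAID is {ratio:.1f}x RAIDZ2")
```

On these numbers xiRAID delivers roughly 8 to 12.5 times the RAIDZ2 throughput, with the largest gap at 8 queues in the degraded-read test.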