Disaggregated storage on DPU

xiRAID on the NVIDIA BlueField-3 DPU is an efficient and flexible RAID solution for data-intensive applications. By offloading RAID calculations from the host CPU to the DPU, xiRAID delivers high performance while reducing power consumption and operational complexity. This makes it well suited to organizations that want to optimize their storage infrastructure for AI and other demanding applications.

Why use Disaggregated Storage?

Enhanced Security

Ensure robust security without needing specialized software or hardware installations, reducing user complexity.

Cost Savings

Minimize CPU consumption and eliminate the need for third-party storage solutions, resulting in significant cost savings.

Flexible Storage Management

Leverage disaggregation with SNAP technology for dynamic storage capacity changes, simplifying management and eliminating NVMe-oF complexities.

Offloading CPU to DPU

Implementation

Network-attached drives are visible to the DPU through the BlueField-3 200 Gb/s network ports. xiRAID Opus builds RAID arrays from these drives on the BlueField-3 DPU and exposes the resulting volumes to the host via SNAP.

Advantages:

  • Serverless Storage: Zero host CPU consumption with serverless storage implementation.
  • Dynamic Capacity: Change storage capacity on the fly via SNAP.
  • Enhanced Security: No need for specialized software or hardware.
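To make concrete what "RAID calculations" means in the implementation described above, here is a minimal sketch of generic RAID5-style XOR parity. It illustrates the textbook technique only and is not xiRAID's actual code; the function names `xor_blocks`, `make_stripe`, and `rebuild` are hypothetical, and the five-data-block layout is an assumption chosen to match a six-drive array.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_stripe(data_blocks):
    """Append one XOR parity block to the data blocks (RAID5-style stripe)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def rebuild(stripe, lost_index):
    """Recover the block at lost_index by XOR-ing all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)

# Assumed layout: 5 data blocks + 1 parity block, one block per drive.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD", b"EEEE"]
stripe = make_stripe(data)
recovered = rebuild(stripe, 2)  # simulate losing the third drive
```

This per-stripe parity work is the kind of computation that, per the description above, runs on the DPU's Arm cores rather than on the host CPU.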

xiRAID Performance on BlueField-3 DPU

Challenge:

Data-intensive applications such as AI, machine learning, big data, and relational databases demand high-speed data access. NVMe SSDs offer high bandwidth, but traditional RAID implementations cannot keep up with them. xiRAID is designed to provide the fastest possible RAID protection for NVMe SSDs, while also reducing power consumption by offloading RAID calculations to the Arm-based DPU.

Performance Results:

The test setup used six Samsung PM9A3 3.84 TB NVMe drives connected via the nvme-rdma driver over a 200 Gbit/s InfiniBand port. In this configuration, xiRAID reaches 60-100% of the theoretical maximum performance in both RAID5 and RAID6 configurations.

                  Sequential Write   Sequential Read   Random Write   Random Read
                  (GB/s)             (GB/s)            (K IOPS)       (K IOPS)
Raw drives        16                 24                2,064          4,080
xiRAID (RAID5)    11                 24                447            2,351
xiRAID (RAID6)    8.2                24                328            2,352

xiRAID offloaded to the BlueField-3 DPU achieves 60-100% of theoretical performance in both RAID5 and RAID6.
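The source does not say how the "theoretical maximum" is defined. The sketch below shows one plausible way to arrive at the 60-100% figure from the published numbers. The model in `theoretical` is an assumption, not the vendor's stated method: sequential writes are scaled by the fraction of data drives, random writes are divided by the classic read-modify-write penalty (4 I/Os for RAID5, 6 for RAID6), and reads are compared directly with the raw drives.

```python
# Figures copied from the results table (GB/s for sequential, K IOPS for random).
RAW = {"seq_write": 16.0, "seq_read": 24.0, "rand_write": 2064.0, "rand_read": 4080.0}
MEASURED = {
    "RAID5": {"seq_write": 11.0, "seq_read": 24.0, "rand_write": 447.0, "rand_read": 2351.0},
    "RAID6": {"seq_write": 8.2, "seq_read": 24.0, "rand_write": 328.0, "rand_read": 2352.0},
}
DRIVES = 6
PARITY_DRIVES = {"RAID5": 1, "RAID6": 2}

# The 200 Gbit/s link caps throughput at 200 / 8 = 25 GB/s, close to the 24 GB/s reads.
LINK_LIMIT_GBPS = 200 / 8

def theoretical(level, workload):
    """Assumed ceiling for a workload, derived from the raw-drive figures."""
    p = PARITY_DRIVES[level]
    if workload == "seq_write":
        # Full-stripe writes: only the data drives carry payload.
        return RAW[workload] * (DRIVES - p) / DRIVES
    if workload == "rand_write":
        # Read-modify-write: read old data and p parity blocks, then write them back.
        return RAW[workload] / (2 * (p + 1))
    # Reads involve no parity work, so the raw figure is the ceiling.
    return RAW[workload]

def efficiency(level, workload):
    """Measured result as a fraction of the assumed theoretical ceiling."""
    return MEASURED[level][workload] / theoretical(level, workload)

for level in MEASURED:
    for workload in RAW:
        print(f"{level} {workload}: {efficiency(level, workload):.0%}")
```

Under these assumptions the ratios range from roughly 58% (random read) to 100% (sequential read), which is broadly in line with the stated 60-100%. A different definition of the theoretical maximum would give different percentages.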

Use Case: Serverless Disaggregated Storage for AI

xiRAID running on a DPU changes how storage is deployed at scale. Cloud customers can now connect NVIDIA SuperPod GPU systems to drives over the network without dedicated storage servers. This significantly reduces the cost of deploying fast storage and the associated power requirements. xiRAID works with any host operating system and hypervisor, which simplifies deployment across diverse environments.

Benefits

Disaggregated Storage

Optimize server resources and reduce power consumption

Server Optimization

No need for dedicated storage servers

Network Card Optimization

Leverage high-speed network connectivity

No Client-Side Software

Simplify deployments with no need for client-side software

Compatibility

Works with any host OS and hypervisor