NVMe-oF RDMA

The RDMA transport requires an RDMA-capable NIC and the corresponding OFED (OpenFabrics Enterprise Distribution) software package. Before starting the NVMe-oF target with the RDMA transport, load the InfiniBand and RDMA kernel modules that allow user-space processes to use InfiniBand/RDMA directly. The following modules may be required: ib_cm, ib_core, ib_ucm (on older kernels), ib_umad, ib_uverbs, iw_cm, rdma_cm, rdma_ucm. You also need to detect the RDMA-capable NICs and assign them IP addresses, and you may need to install NIC-specific kernel modules such as mlx5_core and mlx5_ib.
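
A minimal sketch of this preparation, assuming a Mellanox/NVIDIA ConnectX adapter whose RDMA interface appears as ens2f0 (the interface name and IP address are placeholders to adjust for your system):

# Load the generic InfiniBand/RDMA modules (ib_ucm exists only on older kernels)
sudo modprobe -a ib_core ib_cm ib_umad ib_uverbs iw_cm rdma_cm rdma_ucm
# Load the NIC-specific driver, here for Mellanox ConnectX adapters
sudo modprobe -a mlx5_core mlx5_ib
# List the detected RDMA devices and their network interfaces
rdma link show
ibv_devices
# Assign an IP address to the RDMA-capable interface and bring it up
sudo ip addr add 10.10.10.10/24 dev ens2f0
sudo ip link set ens2f0 up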

RDMA NICs impose a limit on the number of memory regions that can be registered. When this limit is reached, the xiRAID Opus engine may start failing to allocate additional memory for direct memory access (DMA). This issue is most likely to arise when too many 2MB hugepages are reserved at runtime. A common bottleneck is the NIC's memory-region count: some NICs support at most 2048 memory regions, which with 2MB hugepages caps the total registered memory at 4GB (2048 × 2MB). Using pre-reserved 1GB hugepages can help overcome this limitation.
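
Since 1GB hugepages generally must be reserved at boot, one approach (a sketch; the page count of 8 is an assumption to adjust for your workload) is to add the following to the kernel command line, for example via GRUB_CMDLINE_LINUX in /etc/default/grub:

default_hugepagesz=1G hugepagesz=1G hugepages=8

After regenerating the GRUB configuration and rebooting, the reservation can be verified with:

grep Huge /proc/meminfo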

Another known issue arises when using E810 NICs in RoCE mode: the NVMe-oF target may fail to destroy a queue pair (qpair) because of unflushed posted work requests, which can prevent the target from terminating cleanly.

Similar to the TCP transport, the RDMA target can be configured with the following commands:

xnr_cli nvmf transport create --trtype RDMA 
xnr_cli nvmf subsystem create --nqn nqn.2018-09.io.xinnor:node1 --serial-number XNR00001 --model-number 'Xinnor Raid' --allow-any-host 
xnr_cli nvmf listener add --nqn nqn.2018-09.io.xinnor:node1 --trtype rdma --adrfam ipv4 --traddr 10.10.10.10 --trsvcid 4420 
xnr_cli nvmf namespace add --nqn nqn.2018-09.io.xinnor:node1 --bdev xnraid

Then the RAID block device can be connected from an initiator:

sudo modprobe -a nvme-fabrics nvme-rdma 
sudo nvme discover -t rdma -a 10.10.10.10 -s 4420 
sudo nvme connect -t rdma -a 10.10.10.10 -s 4420 -n nqn.2018-09.io.xinnor:node1
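
If the connection succeeds, the RAID device appears as a local NVMe block device and can be verified as follows (the device name nvme1n1 is illustrative and depends on the initiator's existing NVMe devices):

# Confirm the fabric-attached controller and its namespace are visible
sudo nvme list
# The device can now be used like any local block device
lsblk /dev/nvme1n1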

Depending on the fabric, the RDMA connection may require additional configuration on the target or initiator side, or on the switch; RoCE deployments in particular typically rely on flow control (global pause or PFC) and/or ECN configured consistently on the NICs and switches.
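
As a simple illustration, global pause-frame flow control can be toggled per interface with ethtool; production RoCE fabrics more commonly use per-priority flow control (PFC), which is configured with DCB or vendor-specific tooling (the interface name ens2f0 is a placeholder):

# Enable IEEE 802.3x global pause frames on the RDMA interface
sudo ethtool -A ens2f0 rx on tx on
# Verify the current pause settings
ethtool -a ens2f0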