Merge function

Data can be written to or read from the drives in a RAID in sequential or random order. When incoming requests are sequential and their block sizes are small, it is beneficial to wait for them to accumulate, merge them, and then write or read the data in large blocks. This access pattern can improve performance because it reduces the number of read-modify-write operations (see footnote 1) on syndromic RAIDs.

Tip:

Enable the Merge function when the access pattern is sequential and highly threaded and the block sizes are small.
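The idea of accumulating small sequential requests into one large block can be illustrated with a minimal sketch. It is not xiRAID's implementation: the (offset, length) request representation and the merge_sequential name are assumptions made only for this example, and waiting times are left out entirely.

```python
from typing import List, Tuple

# Hypothetical request representation: (offset_kib, length_kib).
Request = Tuple[int, int]


def merge_sequential(requests: List[Request]) -> List[Request]:
    """Coalesce requests whose address ranges are back-to-back."""
    merged: List[Request] = []
    for offset, length in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == offset:
            # The request starts exactly where the previous one ends: extend it.
            prev_offset, prev_length = merged[-1]
            merged[-1] = (prev_offset, prev_length + length)
        else:
            # A gap (random access) starts a new block.
            merged.append((offset, length))
    return merged


# Four sequential 4 KiB requests collapse into one 16 KiB request,
# while a request at a distant offset stays separate.
sequential = [(0, 4), (4, 4), (8, 4), (12, 4)]
print(merge_sequential(sequential))               # [(0, 16)]
print(merge_sequential(sequential + [(512, 4)]))  # [(0, 16), (512, 4)]
```

With a mainly random pattern, almost every request falls into the second branch, so nothing is gained by waiting.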

Manual Merge configuration

The merge parameters for request accumulation can be configured manually for a RAID using the xicli raid create and xicli raid modify commands.

The --merge_read_enabled parameter activates the Merge function for incoming read requests, allowing their accumulation. You can specify a waiting time between incoming read requests in sequential areas using the --merge_read_wait parameter. At the end of the waiting time, the requests are merged together if possible. You can use the --merge_read_max parameter to specify the maximum waiting time for request accumulation. Usually, large I/O sizes require large values for these parameters.

Example: Create a RAID 5 named "media5" over 4 NVMe drives ("nvme0n1", "nvme1n1", "nvme2n1", "nvme3n1") with a strip size of 64 KiB and the Merge function enabled for read operations.
# xicli raid create -n media5 -l 5 -d /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 -ss 64 -mre 1

The --merge_write_enabled parameter activates the Merge function for incoming write requests, allowing their accumulation. You can specify a waiting time between incoming write requests in sequential areas using the --merge_write_wait parameter. At the end of the waiting time, the requests are merged together if possible. You can use the --merge_write_max parameter to specify the maximum waiting time for request accumulation. Usually, large I/O sizes require large values for these parameters.

Example: Change the waiting time between write requests for the RAID 5 named "media5" from 300 to 500 ms.
# xicli raid modify -n media5 -mww 500

If the access pattern is mainly random or the request queue depth is small, too few requests accumulate during the waiting time to be merged.

The Merge function works only when the following condition is met:

data_drives * strip_size ≤ 1024

where

  • “data_drives” is the number of drives in the RAID (for RAID 5, 6, 7.3, or N+M) or in one RAID group (for RAID 50, 60, or 70) that are dedicated to data;

  • “strip_size” is the strip size selected for the RAID (the strip_size value) in KiB.

The “data_drives” value depends on the RAID level:

RAID level    Value of data_drives
RAID 5        Number of RAID drives minus 1
RAID 6        Number of RAID drives minus 2
RAID 7.3      Number of RAID drives minus 3
RAID N+M      Number of RAID drives minus M
RAID 50       Number of drives in one RAID group minus 1
RAID 60       Number of drives in one RAID group minus 2
RAID 70       Number of drives in one RAID group minus 3
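As a worked example of the condition, the following sketch computes data_drives for a given RAID level and checks data_drives * strip_size ≤ 1024. The function and dictionary names are hypothetical helpers and not part of xicli. The RAID 6 drive counts and strip sizes are illustrative values only.

```python
# Drives not dedicated to data, per RAID level (per RAID group for 50/60/70).
SYNDROME_DRIVES = {"5": 1, "6": 2, "7.3": 3, "50": 1, "60": 2, "70": 3}


def data_drives(level: str, drives: int, m: int = 0) -> int:
    """Return the number of data drives.

    drives is the number of drives in the RAID, or in one RAID group
    for levels 50, 60, and 70. m is used only for level "N+M".
    """
    syndromes = m if level == "N+M" else SYNDROME_DRIVES[level]
    return drives - syndromes


def merge_condition_met(level: str, drives: int, strip_size_kib: int, m: int = 0) -> bool:
    """Check data_drives * strip_size <= 1024 (strip_size in KiB)."""
    return data_drives(level, drives, m) * strip_size_kib <= 1024


# "media5" from the example: RAID 5, 4 drives, 64 KiB strip -> 3 * 64 = 192.
print(merge_condition_met("5", 4, 64))     # True
# RAID 6, 10 drives, 128 KiB strip -> 8 * 128 = 1024 (boundary case).
print(merge_condition_met("6", 10, 128))   # True
# RAID 6, 12 drives, 128 KiB strip -> 10 * 128 = 1280.
print(merge_condition_met("6", 12, 128))   # False
```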

Deactivate Merge when the request queue depth of the user's workload is not sufficient to fill a full stripe. Activate Merge if

iodepth * block_size >= data_drives * strip_size

where “iodepth” is the request queue depth of the workload and “block_size” is the block size of the RAID (the block_size value in the RAID parameters) in KiB.
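The activation rule can be turned into a quick check of whether a workload's queue depth is enough to fill a full stripe. This is a sketch with hypothetical function names. The 4 KiB block size is an assumed value chosen for illustration, and the data_drives and strip_size values are those of the "media5" example (3 data drives, 64 KiB strip).

```python
def should_enable_merge(iodepth: int, block_size_kib: int,
                        data_drives: int, strip_size_kib: int) -> bool:
    """Check iodepth * block_size >= data_drives * strip_size (sizes in KiB)."""
    return iodepth * block_size_kib >= data_drives * strip_size_kib


def min_iodepth(block_size_kib: int, data_drives: int, strip_size_kib: int) -> int:
    """Smallest queue depth that satisfies the rule (integer ceiling division)."""
    stripe_kib = data_drives * strip_size_kib
    return -(-stripe_kib // block_size_kib)


# Full stripe for "media5": 3 * 64 = 192 KiB.
print(should_enable_merge(32, 4, 3, 64))   # False: 32 * 4 = 128 < 192
print(should_enable_merge(64, 4, 3, 64))   # True:  64 * 4 = 256 >= 192
print(min_iodepth(4, 3, 64))               # 48
```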

Automatic Merge configuration

Adaptive merge in xiRAID 4.1.0 is considered an experimental feature. We continue to test it on a variety of configurations and environments and, based on the results and your feedback, will plan and implement further improvements to the algorithm.

We recommend using it when your business applications generate a significant continuous sequential write load:

  • For the case when application logic generates a changing sequential load or there are several sources of concurrent sequential load, we recommend using the adaptive merge without the --single_run option, which is also the default mode.

  • The --single_run option is recommended for those special cases where you want to adapt merge parameters for only one specific scenario that reproduces inconsistently or can be interrupted/changed by sporadic application-level activities. In this case, enable the adaptive merge with the --single_run option during the desired scenario to permanently set the merge parameters.

In real business configurations, there could be additional factors unique to each particular setup. We highly recommend observing the performance of xiRAID with the adaptive merge option prior to enabling it in critical production environments to confirm that it brings a noticeable positive impact in each particular situation.

xiRAID offers an algorithm that automatically selects the waiting time for accumulating write requests. The algorithm determines whether merging of write requests is enabled (--merge_write_enabled parameter), the wait time between write requests (--merge_write_wait parameter), and the maximum wait time for stripe accumulation (--merge_write_max parameter). The time between incoming requests is set within the range 0-3000 ms, and the maximum wait time for stripe accumulation is set to 20000 ms.

You can enable the Automatic Merge function during RAID creation and modify it later using the --adaptive_merge parameter. Once the Automatic Merge algorithm has determined the waiting time, you can prevent it from changing these settings later by using the --single_run parameter.

Example: Enable the Adaptive Merge function for the RAID 5 named "media5" and prevent it from changing the determined settings later on. After the settings have been determined, the Adaptive Merge is automatically deactivated and displayed as 'False' in the xicli raid show output.

# xicli raid modify -n media5 --adaptive_merge 1 --single_run
Warning:

Do not use the Automatic Merge parameters when Manual Merge is enabled, and vice versa.

1. Writing data to a RAID in small blocks requires reading data from the drives, calculating new syndrome values, and writing them to the drives.