Drive I/O Error Counter
You can keep track of drives where I/O errors (faults) have started to appear so that you can replace such drives with healthy ones in a timely manner.
We recommend setting up email notifications to trace drives with I/O errors (to learn more, see the Setting up Email Notifications chapter).
The fault threshold is the number of faults, the same for every drive, above which a drive is removed from the RAID (marked as 'missing') or replaced with a suitable drive from the spare pool. You can set the fault threshold to a value from 1 to 1000 with the command xicli settings faulty-count modify -t. Changing the fault threshold resets the current number of faults on all drives.
When a drive is removed from a RAID because the fault threshold is exceeded:
- if the RAID has a SparePool with a suitable drive, the removed drive is replaced and RAID reconstruction starts;
- if the removed drive has not been replaced in the RAID (automatically or manually), the drive returns to the RAID after the current number of faults on that drive is reset;
- the drive clean command applied to the removed drive resets the current number of faults and does not remove metadata from the drive.
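The rule described above can be sketched as a small shell function. This is only a hypothetical model for illustration, not xiRAID code: the function name drive_action and its three arguments (current fault count, threshold, and whether a suitable spare exists) are assumptions introduced here.

```shell
#!/bin/sh
# Hypothetical model of the fault threshold rule; not part of xicli.
# Usage: drive_action FAULTS THRESHOLD HAS_SPARE   (HAS_SPARE is yes or no)
drive_action() {
    faults=$1
    threshold=$2
    has_spare=$3
    if [ "$faults" -le "$threshold" ]; then
        echo "keep"       # threshold not exceeded, the drive stays in the RAID
    elif [ "$has_spare" = "yes" ]; then
        echo "replace"    # a suitable spare takes over, reconstruction starts
    else
        echo "missing"    # the drive is removed and marked as 'missing'
    fi
}

drive_action 3 3 no     # prints: keep
drive_action 4 3 yes    # prints: replace
drive_action 4 3 no     # prints: missing
```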
To manage the threshold value of I/O errors for all drives, run
# xicli settings faulty-count modify <arg>
When you change any parameter of the xicli settings faulty-count modify command, the xiraid-scanner.service restarts.
Required argument
-t | --threshold | The threshold value for all drives. If you set a new fault threshold value, the current numbers of faults are reset for all the drives. Possible values: integers from 1 to 1000. The default: 3.
Example: Set the drive fault threshold value to 10:
# xicli settings faulty-count modify -t 10
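Because the threshold must be an integer from 1 to 1000, a script that sets it may want to check the value first. The following is a minimal sketch, assuming a hypothetical helper named threshold_cmd; it only prints the command for a valid value and does not call xicli itself.

```shell
#!/bin/sh
# Hypothetical helper: prints the modify command if VALUE is an integer in 1..1000.
# Usage: threshold_cmd VALUE
threshold_cmd() {
    case $1 in
        ''|*[!0-9]*)
            echo "threshold must be a positive integer" >&2
            return 1 ;;
    esac
    if [ "$1" -lt 1 ] || [ "$1" -gt 1000 ]; then
        echo "threshold must be between 1 and 1000" >&2
        return 1
    fi
    echo "xicli settings faulty-count modify -t $1"
}

threshold_cmd 10    # prints: xicli settings faulty-count modify -t 10
```

To apply the value instead of printing it, the printed command can be run as root once it has been reviewed. Keep in mind that changing the threshold resets the current fault counts on all drives.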
To show the threshold value of I/O errors, run
# xicli settings faulty-count show
Optional argument
-f | --format | Output format. The default: table.
To reset the current number of faults for drives, run
# xicli drive faulty-count reset <arg>
The RAID that contains the drive must be loaded.
When you run the xicli drive faulty-count reset command, the xiraid-scanner.service restarts.
Required argument
-d | --drives | The list of block devices (/dev/sd*, /dev/mapper/mpath*, /dev/nvme*, /dev/dm-*), separated by spaces, whose current numbers of faults are reset.
Example: Reset the current fault counts for the drives /dev/sda, /dev/sdb, and /dev/sdd:
# xicli drive faulty-count reset -d /dev/sd[a-b] /dev/sdd
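In the example above, /dev/sd[a-b] is a shell pattern: the shell expands it to the matching paths before the command starts, so the command receives a plain space-separated list. The sketch below illustrates this in a scratch directory; the function name expand_demo is hypothetical, and the empty files only stand in for device nodes.

```shell
#!/bin/sh
# Demonstrates pathname expansion of sd[a-b]; nothing here touches real devices.
expand_demo() {
    dir=$(mktemp -d) || return 1
    touch "$dir/sda" "$dir/sdb" "$dir/sdc" "$dir/sdd"
    # The shell expands the bracket pattern to sda and sdb; sdc is not matched.
    for path in "$dir"/sd[a-b] "$dir"/sdd; do
        printf '%s\n' "${path##*/}"    # print the file name without the directory
    done
    rm -rf "$dir"
}

expand_demo
# prints:
# sda
# sdb
# sdd
```

In most shells, a pattern that matches nothing is passed to the command unchanged by default, so it is worth checking that the listed devices exist.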
To show the current numbers of faults for drives, run
# xicli drive faulty-count show [optional_args]
Mutually exclusive optional arguments
-n | --name | The name of the RAID whose drives' current numbers of faults are shown. If neither of the two arguments is specified, the values for all drives are shown.
-d | --drives | The list of block devices (/dev/sd*, /dev/mapper/mpath*, /dev/nvme*, /dev/dm-*), separated by spaces, whose current numbers of faults are shown. If neither of the two arguments is specified, the values for all drives are shown.
Optional argument
-f | --format | Output format. The default: table.
Example: Show the current fault counts for the drives /dev/sda, /dev/sdb, and /dev/sdd:
# xicli drive faulty-count show -d /dev/sd[a-b] /dev/sdd
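Since -n and -d are mutually exclusive, a wrapper script can reject a call that mixes them before it reaches xicli. The following is a minimal sketch under stated assumptions: show_cmd is a hypothetical helper, and the RAID name media is made up for illustration. The function only prints the command and does not run it.

```shell
#!/bin/sh
# Hypothetical helper: prints the show command, refusing -n combined with -d.
# Usage: show_cmd [-n RAID_NAME | -d DRIVE...]
show_cmd() {
    has_n=0
    has_d=0
    for arg in "$@"; do
        case $arg in
            -n|--name)   has_n=1 ;;
            -d|--drives) has_d=1 ;;
        esac
    done
    if [ "$has_n" -eq 1 ] && [ "$has_d" -eq 1 ]; then
        echo "use either -n or -d, not both" >&2
        return 1
    fi
    # With no arguments the command shows the values for all drives.
    echo "xicli drive faulty-count show${*:+ $*}"
}

show_cmd -n media             # prints: xicli drive faulty-count show -n media
show_cmd -d /dev/sda /dev/sdd # prints: xicli drive faulty-count show -d /dev/sda /dev/sdd
show_cmd                      # prints: xicli drive faulty-count show
```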