When a drive fails, customers with managed servers or active monitoring are automatically contacted by our support team. For managed servers, we handle the replacement and system recovery independently.
This guide is intended for customers without managed or monitoring services (unmanaged customers) who want to perform the replacement and restore the ZFS mirror themselves.
The instructions assume NVMe drives but apply similarly to SSDs.
1. Identify the failed NVMe drive
List all drives to find the failed device:
lsblk
Note: If, for example, /dev/nvme0n1 is missing from the output, that drive is likely faulty.
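If the pool is still imported, ZFS itself can confirm which mirror member has dropped out. The following is a minimal sketch rather than part of the official procedure: list_unhealthy_vdevs is a hypothetical helper name, and it assumes the usual zpool status table layout, in which the second column of the NAME/STATE table holds the device state.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZFS or Proxmox): reads `zpool status`
# output on stdin and prints the name and state of every pool, vdev, or
# device whose state indicates a problem.
list_unhealthy_vdevs() {
    awk '
        # The device table starts after the NAME/STATE header row.
        $1 == "NAME" && $2 == "STATE" { in_table = 1; next }
        # The "errors:" line follows the table and ends it.
        $1 == "errors:" { in_table = 0 }
        # Report only rows whose state column signals a fault.
        in_table && $2 ~ /^(DEGRADED|FAULTED|UNAVAIL|REMOVED|OFFLINE)$/ {
            print $1, $2
        }
    '
}
```

It could be used as zpool status rpool | list_unhealthy_vdevs, where rpool is the pool name used later in this guide. An empty result suggests that no device is reported as failed.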
2. Install the new NVMe drive
Install the new NVMe drive, then list the drives again to confirm that the system detects it:
lsblk
3. Copy partition layout to the new NVMe
Copy the partition layout from the healthy drive (e.g., /dev/nvme1n1) to the new one (/dev/nvme0n1):
sgdisk /dev/nvme1n1 -R /dev/nvme0n1   # Copy layout from healthy drive to new drive
sgdisk -G /dev/nvme0n1                # Generate new GUIDs for the new partitions
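Because swapping the two device names would overwrite the healthy drive's partition table, it can help to print the commands for review before running them. The sketch below does only that; plan_layout_copy is a hypothetical helper name, and it executes nothing itself.

```shell
#!/bin/sh
# Hypothetical helper: prints the sgdisk commands for review instead of
# running them. Arguments: <healthy source disk> <new target disk>.
plan_layout_copy() {
    src=$1
    dst=$2
    if [ -z "$src" ] || [ -z "$dst" ]; then
        echo "usage: plan_layout_copy <healthy-disk> <new-disk>" >&2
        return 2
    fi
    if [ "$src" = "$dst" ]; then
        echo "error: source and target are the same device: $src" >&2
        return 1
    fi
    # sgdisk copies the table of the positional device onto the -R device.
    printf 'sgdisk %s -R %s\n' "$src" "$dst"
    # New GUIDs keep the copy from clashing with the source drive.
    printf 'sgdisk -G %s\n' "$dst"
}
```

For example, plan_layout_copy /dev/nvme1n1 /dev/nvme0n1 prints the two commands from this step, with the healthy drive first and the new drive as the -R target.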
4. Find the device IDs of the new partitions
List the device IDs to find the correct identifiers:
ls -l /dev/disk/by-id/ | grep nvme0n1
Important:
- Identify the ID of the boot partition (usually part2, the EFI system partition).
- Also note the ID of the ZFS data partition (part3).
Example output:
/dev/disk/by-id/nvme-SSD123456-part2
/dev/disk/by-id/nvme-SSD123456-part3
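The lookup can also be done by resolving each symbolic link, which avoids matching on the text of the ls output. This is a minimal sketch under two assumptions: list_partition_ids is a hypothetical helper name, and partitions follow the NVMe naming scheme (disk name, then p, then the partition number).

```shell
#!/bin/sh
# Hypothetical helper: prints every by-id link that resolves to a partition
# of the given NVMe disk.
# Arguments: <disk name, e.g. nvme0n1> [link directory, default /dev/disk/by-id]
list_partition_ids() {
    disk=$1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Skip anything that is not a symbolic link.
        [ -L "$link" ] || continue
        target=$(readlink -f "$link") || continue
        # Keep only links whose target is <disk>p<number>.
        case "${target##*/}" in
            "$disk"p[0-9]*) printf '%s -> %s\n' "$link" "${target##*/}" ;;
        esac
    done
}
```

Calling list_partition_ids nvme0n1 lists each matching link together with the partition it points to, so the part2 and part3 IDs can be read off directly.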
5. Repair the ZFS mirror
Replace the failed ZFS partition (replace <OLD_PART3_ID> and <NEW_PART3_ID> with the actual IDs). The old ID is the one that zpool status lists for the failed mirror member:
zpool replace -f rpool /dev/disk/by-id/<OLD_PART3_ID> /dev/disk/by-id/<NEW_PART3_ID>
Check progress:
zpool status
Wait until resilvering is complete (both drives show ONLINE status).
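Instead of re-running zpool status by hand, the wait can be scripted. The sketch below assumes that zpool status prints the phrase "resilver in progress" while the rebuild is running; both function names are hypothetical. Where the installed OpenZFS version supports it, zpool wait -t resilver rpool may serve the same purpose.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the `zpool status` text on
# stdin reports a running resilver.
resilver_in_progress() {
    grep -q 'resilver in progress'
}

# Hypothetical helper: polls the pool until no resilver is reported, then
# prints the final status for review.
# Arguments: <pool name> [poll interval in seconds, default 30]
wait_for_resilver() {
    pool=$1
    interval=${2:-30}
    while zpool status "$pool" | resilver_in_progress; do
        sleep "$interval"
    done
    zpool status "$pool"
}
```

Running wait_for_resilver rpool blocks until the resilver is no longer reported. The final status output should still be checked to confirm that both drives are ONLINE.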
6. Prepare the EFI partition
Format and initialize the EFI partition on the new drive (replace <NEW_PART2_ID> with the actual ID):
proxmox-boot-tool format /dev/disk/by-id/<NEW_PART2_ID>
Explanation: Formats the EFI partition as FAT32 (required for the bootloader).
proxmox-boot-tool init /dev/disk/by-id/<NEW_PART2_ID>
Explanation: Initializes the EFI partition with systemd-boot and prepares it as a boot partition.
7. Update and clean the boot configuration
Run these commands to update the bootloader configuration and remove outdated entries:
proxmox-boot-tool refresh
Explanation: Updates kernel and bootloader configuration on the EFI partition.
proxmox-boot-tool clean
Explanation: Removes orphaned or non-existent EFI partitions from the configuration.
proxmox-boot-tool status
Explanation: Shows current boot system status, active partitions, and bootloader setup.
Completion:
Once resilvering is finished and both drives show ONLINE status, your system is redundant and bootable again. Reboot only after resilvering has completed.