Replacing a ZFS Boot Drive on Proxmox (ZFS Mirror with NVMe)

When a drive fails, customers with managed servers or active monitoring are automatically contacted by our support team. For managed servers, we handle the replacement and system recovery independently.

This guide is intended for customers without managed or monitoring services (unmanaged customers) who want to perform the replacement and restore the ZFS mirror themselves.

The instructions assume NVMe drives but apply similarly to SSDs.



1. Identify the failed NVMe drive

List all drives to find the failed device:

lsblk

Note: If, for example, /dev/nvme0n1 is missing, that drive is likely faulty.
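A drive can still appear in lsblk while ZFS already treats it as failed, so it helps to cross-check the pool state. The following is a minimal sketch, not part of the official procedure: the helper name list_unhealthy_vdevs is hypothetical, and it assumes the usual column layout of zpool status (device name first, state second).

```shell
# Hypothetical helper: reads `zpool status` output on stdin and prints
# "NAME STATE" for every device row whose state is FAULTED, UNAVAIL,
# OFFLINE, or REMOVED. Rows with fewer than 5 fields (such as the
# "state:" summary line) are skipped.
list_unhealthy_vdevs() {
    awk 'NF >= 5 && $2 ~ /^(FAULTED|UNAVAIL|OFFLINE|REMOVED)$/ { print $1, $2 }'
}

# Usage on the Proxmox host (pool name rpool assumed, as in step 5):
#   zpool status rpool | list_unhealthy_vdevs
```

The name printed for a missing drive is the one ZFS expects as the old device in step 5.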


2. Install the new NVMe drive

Install the new NVMe drive, then list the drives again and confirm that the new device appears without partitions:

lsblk

3. Copy partition layout to the new NVMe

Copy the partition layout from the healthy drive (e.g., /dev/nvme1n1) to the new one (/dev/nvme0n1):

sgdisk /dev/nvme1n1 -R /dev/nvme0n1   # Copy layout from healthy drive to new drive
sgdisk -G /dev/nvme0n1                # Generate new GUIDs for the new partitions
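Swapping the two device arguments in the first command would overwrite the healthy drive's partition table with the empty one from the new drive. As a small safeguard, the sketch below only prints the two commands for review instead of running them. The helper name print_layout_commands is hypothetical and not part of sgdisk or Proxmox.

```shell
# Hypothetical helper: prints the sgdisk commands for a given
# healthy (source) disk and new (target) disk so the argument order
# can be checked before anything is written. It executes nothing.
print_layout_commands() {
    src=$1
    dst=$2
    if [ -z "$src" ] || [ -z "$dst" ] || [ "$src" = "$dst" ]; then
        echo "usage: print_layout_commands <healthy-disk> <new-disk> (must differ)" >&2
        return 1
    fi
    echo "sgdisk $src -R $dst"
    echo "sgdisk -G $dst"
}

# Usage:
#   print_layout_commands /dev/nvme1n1 /dev/nvme0n1
```

After reviewing the printed lines, run them manually as shown in this step.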

4. Find the device IDs of the new partitions

List the device IDs to find the correct identifiers:

ls -l /dev/disk/by-id/ | grep nvme0n1

Important:

  • Identify especially the boot partition ID (usually part2 – EFI system partition).

  • Also note the ID of the ZFS data partition (part3).

Example output:

/dev/disk/by-id/nvme-SSD123456-part2
/dev/disk/by-id/nvme-SSD123456-part3
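When several drives are installed, the listing can be long. The sketch below picks out the by-id names that point to one specific partition. The helper name find_part_ids is hypothetical, and it assumes NVMe-style partition names (for example nvme0n1p3) and the usual "name -> target" symlink format of ls -l.

```shell
# Hypothetical helper: reads `ls -l /dev/disk/by-id/` output on stdin
# and prints every by-id name whose symlink target is partition <part>
# of NVMe disk <disk> (e.g. nvme0n1 and 3 match ../../nvme0n1p3).
find_part_ids() {
    disk=$1
    part=$2
    awk -v target="/${disk}p${part}" '
        NF >= 3 && $(NF-1) == "->" &&
        substr($NF, length($NF) - length(target) + 1) == target { print $(NF-2) }
    '
}

# Usage on the Proxmox host:
#   ls -l /dev/disk/by-id/ | find_part_ids nvme0n1 2   # EFI partition
#   ls -l /dev/disk/by-id/ | find_part_ids nvme0n1 3   # ZFS partition
```

A partition may have more than one by-id name; any of the printed names refers to the same partition.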

5. Repair the ZFS mirror

Replace the failed ZFS partition (replace <OLD_PART3_ID> and <NEW_PART3_ID> with the actual IDs; the old identifier is the name that zpool status shows for the missing device):

zpool replace -f rpool /dev/disk/by-id/<OLD_PART3_ID> /dev/disk/by-id/<NEW_PART3_ID>

Check progress:

zpool status

Wait until resilvering is complete (both drives show ONLINE status).
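Instead of re-running zpool status by hand, the wait can be scripted. This is a minimal sketch under two assumptions: the pool is named rpool, and the scan line of zpool status contains the text "resilver in progress" while the rebuild is running. Both function names are hypothetical.

```shell
# Hypothetical helper: reads `zpool status` output on stdin and
# succeeds (exit 0) while a resilver is still running.
resilver_running() {
    grep -q 'resilver in progress'
}

# Hypothetical helper: polls the pool once a minute until the resilver
# has finished, then prints the final status for review.
wait_for_resilver() {
    pool=${1:-rpool}
    while zpool status "$pool" | resilver_running; do
        sleep 60
    done
    zpool status "$pool"
}

# Usage on the Proxmox host:
#   wait_for_resilver rpool
```

Recent OpenZFS releases also offer zpool wait -t resilver rpool, which blocks until the resilver is done; whether it is available depends on the installed ZFS version.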


6. Prepare the EFI partition

Format and initialize the EFI partition on the new drive (replace <NEW_PART2_ID>):

proxmox-boot-tool format /dev/disk/by-id/<NEW_PART2_ID>

Explanation: Formats the EFI partition as FAT32 (required for the bootloader).

proxmox-boot-tool init /dev/disk/by-id/<NEW_PART2_ID>

Explanation: Initializes the EFI partition with systemd-boot and prepares it as a boot partition.


7. Update and clean the boot configuration

Run these commands to update the bootloader configuration and remove outdated entries:

proxmox-boot-tool refresh

Explanation: Updates kernel and bootloader configuration on the EFI partition.

proxmox-boot-tool clean

Explanation: Removes orphaned EFI partitions, or ones that no longer exist, from the configuration.

proxmox-boot-tool status

Explanation: Shows current boot system status, active partitions, and bootloader setup.
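As an additional check, the number of registered EFI partitions can be counted. proxmox-boot-tool keeps one UUID per line in /etc/kernel/proxmox-boot-uuids, so a two-drive mirror should normally list two entries. The helper name count_boot_uuids below is hypothetical; treat this as a sketch rather than an official verification step.

```shell
# Hypothetical helper: counts the non-empty lines in the file that
# lists the EFI partitions managed by proxmox-boot-tool.
# An alternative path can be passed as the first argument.
count_boot_uuids() {
    file=${1:-/etc/kernel/proxmox-boot-uuids}
    grep -c . "$file"
}

# Usage on the Proxmox host:
#   count_boot_uuids
```

If the count is lower than the number of drives in the mirror, re-check the format and init commands from step 6 for the missing drive.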


Completion:
Once resilvering is finished and both drives show ONLINE status, your system is redundant and bootable again. Reboot only after resilvering has completed.
