
btrfs raid1 to combine two separate partitions

A btrfs raid1 means mirroring all data and metadata on the file system level. If you have ever found some of your data lost after a btrfs scrub, you might recognize output like this:
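Here is an illustrative example of what a scrub with unrecoverable errors can report on a single-device file system (placeholder UUID, and the exact wording depends on your btrfs-progs version):

    sudo btrfs scrub status /mnt/data
    scrub status for <UUID>
            scrub started at Sun Mar  5 10:00:00 2017 and finished after 00:43:12
            total bytes scrubbed: 120.00GiB with 2 errors
            error details: csum=2
            corrected errors: 0, uncorrectable errors: 2, unverified errors: 0

With only a single copy of each block, btrfs can detect the corruption but has nothing to repair it from.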

If you happen to have free space left on a spare disk, you can avoid these problems with a btrfs raid1 on the file system level. The following article will help you migrate.

Why use btrfs instead of mdadm for raid1?

Well, there are some advantages (and disadvantages).

First, you gain a check on the file system level, which is easy to do because metadata and data are mirrored. It might take some more CPU to run these checks, but you gain the ability to tell which of the copies is still healthy.

Second, btrfs raid1 volumes are detected by your initrd automatically. Both partitions (or disks) will be mounted as one single volume without any additional configuration.

Last, it is really easy to set up.

Some of the disadvantages are higher CPU usage for checking (scrubbing) and the need to set up the scrub jobs manually (btrfs scrub).

Prepare your disk or btrfs partition for raid1

If you are going to use a btrfs raid1, both partitions should have the same size. This means you need to align the new partition to the smallest of your disks. The easiest way to find out the correct number of sectors for this is using fdisk:
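An illustrative example (the device name and all numbers are placeholders; run fdisk -l against your existing btrfs disk):

    sudo fdisk -l /dev/sdb
    Disk /dev/sdb: 698.7 GiB, 750156374016 bytes, 1465149168 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    Disklabel type: gpt

    Device      Start        End    Sectors   Size Type
    /dev/sdb1    2048 1465147391 1465145344 698.6G Linux filesystem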

In the partition row you can see the number of sectors (the Sectors column). This is the target size. Write it down and create a new partition, /dev/sdc1 in my case. Using the command sudo cfdisk /dev/sdc you can create the new partition via a text user interface. As the new partition size I entered 1465145344S. Please note the trailing S! You don't need to format this partition using mkfs.btrfs. You can use sudo partprobe to make your kernel re-read the partition table.

Mount the partitions

To mount the existing partition (containing the data), use this command.
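Assuming the existing btrfs partition is /dev/sdb1 and the mount point is /mnt/data (both placeholders, adjust to your setup):

    sudo mount /dev/sdb1 /mnt/data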

The next step is to add the newly created partition to the existing mountpoint. The corresponding btrfs command is:
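With the placeholders from above, /dev/sdc1 being the newly created partition:

    sudo btrfs device add /dev/sdc1 /mnt/data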

Creating the btrfs raid1

But we are not done yet! At this point we have created a JBOD (»Just a bunch of disks«), which means one single big file system, in my case about 1200 GiB. To convert this volume into a functional raid1 you can use two super easy commands. You could combine them into a single command, but I wouldn't recommend doing so, because copying both data and metadata at once is slower.

Convert metadata to raid1

The first step converting the disk array is to convert the metadata to raid1.
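The command looks roughly like this (the mount point /mnt/data is again a placeholder):

    sudo btrfs fi balance start -mconvert=raid1,soft /mnt/data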

Explanation:

  • fi balance  (short for  filesystem balance ) is a btrfs command which lets you select a new profile for your data.
  • -m  is the argument to work with metadata.
  • -mconvert  is the argument to convert the metadata to a new profile. This profile will be applied to all the disks in the array.
  • raid1,soft  means we choose the btrfs raid1 profile. soft  selects only chunks which have not been converted to the new profile already. If this is your first time running this command, it will catch everything anyway.

Convert data to raid1

In the second step all data is mirrored to the other disk.
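Again with /mnt/data as a placeholder for your mount point:

    sudo btrfs fi balance start -dconvert=raid1,soft /mnt/data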

This time  -dconvert=  means the new profile is applied to all data on the disk array. This will take a little longer, depending on how much data you already have on your disk.

If you want to watch the progress in another shell, you can type this command:
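For example, btrfs balance status reports how much is left to process:

    sudo btrfs balance status /mnt/data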

Final test

Disk usage and single chunks

After applying the new profile on your btrfs raid1 file system, there may still be single chunks left which are located on only one physical disk. Usually these consume zero bytes internally, because all the data has been converted. You can find out using this command:
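btrfs fi df lists every block group profile in use; leftover single chunks with used=0.00B would show up roughly like this (illustrative numbers):

    sudo btrfs fi df /mnt/data
    Data, RAID1: total=120.00GiB, used=118.51GiB
    Data, single: total=1.00GiB, used=0.00B
    System, RAID1: total=32.00MiB, used=48.00KiB
    Metadata, RAID1: total=3.00GiB, used=1.75GiB
    Metadata, single: total=16.00MiB, used=0.00B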

You can get rid of those empty (but space allocating) chunks by using yet another btrfs command.
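One way to do this is a filtered balance that only touches chunks with zero usage (mount point is a placeholder):

    sudo btrfs fi balance start -dusage=0 -musage=0 /mnt/data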

The above command will run a balancing on the volume and clean up any chunks that have no data allocated.

Mount options

Both disks are mounted to the btrfs volume automatically. If you want to make really sure this happens properly, or your initrd does not support btrfs detection, you can still use mount options for this. You need to list all devices which are part of the array. In this example I will use device names in my  /etc/fstab; the better option would be to use UUIDs.
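A sketch of such an fstab entry, with my placeholder device names and mount point:

    # /etc/fstab
    /dev/sdb1  /mnt/data  btrfs  defaults,device=/dev/sdb1,device=/dev/sdc1  0  0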

You can list the needed UUIDs (optional) using this command:
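blkid prints the file system UUID of each partition; note that both members of a btrfs raid1 share the same file system UUID:

    sudo blkid /dev/sdb1 /dev/sdc1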

Check your new volume

If you want to make really sure everything went well, you can check which devices belong to a volume:
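With my placeholder devices, the output of btrfs fi show looks roughly like this (label, UUID and exact numbers are illustrative):

    sudo btrfs fi show /mnt/data
    Label: 'data'  uuid: <UUID>
            Total devices 2 FS bytes used 120.31GiB
            devid    1 size 698.64GiB used 123.03GiB path /dev/sdb1
            devid    2 size 698.64GiB used 123.03GiB path /dev/sdc1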

You can see two devices (or partitions) of about 700 GiB each. Both mirror about 120 GiB of data; including metadata, about 123 GiB is actually used on each physical device.

You can see the exact sizes using btrfs fi usage :
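Trimmed, illustrative output (the Overall section is left out here):

    sudo btrfs fi usage /mnt/data
    Data,RAID1: Size:120.00GiB, Used:118.51GiB
       /dev/sdb1     120.00GiB
       /dev/sdc1     120.00GiB

    Metadata,RAID1: Size:3.00GiB, Used:1.75GiB
       /dev/sdb1       3.00GiB
       /dev/sdc1       3.00GiB

    System,RAID1: Size:32.00MiB, Used:48.00KiB
       /dev/sdb1      32.00MiB
       /dev/sdc1      32.00MiB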

As you can see, the 123 GiB split up into data, metadata and system.

Maintenance using btrfs scrub

Now to activating the real advantages:  btrfs scrub . If you happen to have incorrect data on one of the disks, you want btrfs to notice. For this you use  btrfs scrub, which of course works properly on a btrfs raid1. It will detect faulty data and/or metadata and overwrite it with the correct data from the still healthy raid member. You can run the command via cron or a systemd timer. I chose the systemd timer, but did not have a specific reason to choose it over cron.
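You can also start a scrub by hand and check its result, for example:

    sudo btrfs scrub start /mnt/data
    sudo btrfs scrub status /mnt/data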

Install systemd-unit and timer

You can find the original sources I used in this gist on github: https://gist.github.com/gbrks/11b9d68d19394a265d70. But I needed to change some parts, just read on.

You should place the files at these locations:

  • /usr/local/bin/btrfs-scrub
  • /etc/systemd/system/btrfs-scrub.service
  • /etc/systemd/system/btrfs-scrub.timer

Please do replace one thing: inside the service file replace Type=simple with  Type=oneshot , because the process will terminate (exit) after scrubbing.
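The relevant part of the service file would then look roughly like this (with ExecStart pointing at the script location above):

    [Service]
    Type=oneshot
    ExecStart=/usr/local/bin/btrfs-scrub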

Activate the btrfs raid1 scrub service

The timer and service need to be enabled to actually execute. The commands for systemd are:
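Assuming the unit files are named as in the list above:

    sudo systemctl daemon-reload
    sudo systemctl enable btrfs-scrub.service
    sudo systemctl enable btrfs-scrub.timer
    sudo systemctl start btrfs-scrub.timer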

The first command tells systemd to re-read its configuration files. After that, the actual units are enabled and the timer is activated. The timer has to be started at least once.

Configure email notifications

You can configure the email notification for your needs. I did not have a Mailgun account and wanted to use my existing Gmail account instead. This is possible using curl, a .netrc file and only a few changes.

The first thing you need to do is to create the file /root/.netrc :
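It should contain your Gmail SMTP credentials in netrc format (replace the login and the password with your own address and the app password from the next step):

    machine smtp.gmail.com
    login yourname@gmail.com
    password your-app-password

Make sure the file is only readable by root, for example with chmod 600 /root/.netrc.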

If you don’t want to use your primary password (and you really shouldn’t!), you should create an app password over here. If you use two-factor authentication, this is a mandatory step.

Now you can replace the gist from above with my fork containing the documentation and changes to the original scrub job.

Error corrections sent by mail

In case actual errors are found on your btrfs raid1 volume, they will be corrected and a report will be mailed to you. If there were errors, the output might look as documented in Oracle’s blog over here.

As you can see, the errors are shown as corrected. Yay! \o/

Conclusion

Using btrfs you get a pretty nice raid1 implementation on the file system layer, and it is set up quite fast. Disadvantages are a slightly higher CPU load and having to install the maintenance scripts for running btrfs scrub manually.

Published in How Tos

One Comment

  1. Anonymous

    What about SSD errors (“non readable sectors”)? How will BTRFS treat them?

    All forums talk about errors where the data could be read but its checksum is not correct (corrupt data, bit rot, etc.), but what about when the data cannot even be read? How does BTRFS deal with that? Will it mark the whole device as BAD and work in degraded mode? Or will it try to re-write the data, re-check that it can be read back, and continue?

    In my case, when the SSD reports an unreadable sector, re-writing it makes it readable again.

    The weird thing is that after a long period without power the SSD says some random sectors are not readable, and after another long period without power it says different sectors are not readable (while the previously unreadable ones are readable again)… all without any write to the SSD and without mounting any partition on it.

    It happens to all SSDs I tested from the brand KingDian… after more than a week (stored outside the PC) random sectors become unreadable… after some more days the unreadable sectors are different ones, and the old failing sectors are fine again, they can be read and the data is correct.

    If I set up two partitions (or more, to be safer) on the SSD and use BTRFS raid1 on them (even though speed will be very poor), will it still let me read (assuming the random unreadable sectors are not the same on all partitions at the same time)? Will BTRFS fix that?

    The weird thing is that after writing to any non-readable sector, it is readable again.

    The problem only shows up when I leave the SSD without power for more than eight days in a row (like on holidays).

    Note: this happens to three SSDs I have from the brand KingDian; one died (not seen by the BIOS anymore, and replaced by KingDian with a new one) because I did not re-write to it, since I did not know that re-writing brings it back to life.

    The SMART data says the SSD has no write errors and no read errors (while some sectors are not readable), and if I run a tool on Windows to re-map the failing sectors, the SMART data counts the re-mapped sectors correctly, but still reports no read and no write errors. Really weird.

    I cannot afford more SSDs for such an old 512 MiB RAM, 32-bit processor PC, but if BTRFS can deal with such failing SSDs I can use them to speed up boot a lot (SATA I, 150 MiB/s) vs PATA (60 MiB/s). Yes, that old PC does not want any SATA II/III HDD, only SATA I drives; any drive that is SATA II or SATA III is not seen by the BIOS, it only wants pure old SATA I HDDs, but it can see a SATA III SSD at SATA I speed of course… I have no SATA I HDD able to reach 150 MiB/s, the fastest is 40 MiB/s on a SATA I HDD (I have one PATA HDD able to do 60 MiB/s).
