- This article has also been published at Deb-a-day.
Suppose you have three hard drives, sized 80, 40 and 60 GB, and 150 GB of music files which you need to store on them. How would you do it?
The two solutions I knew of were:
- either to simply have three separate “Music” folders, one per drive;
- or to create some sort of RAID, joining all the drives into an array.
However, the first method is quite tiresome, as one needs to decide how to split the data between the drives and keep track of what is stored where. For example, I might decide to store all “Classical” music on the first disk, and “Rock” music on the second. Then, suddenly, the first drive fills up and the second one still has plenty of space. Now I need to move the files between the disks, or jump around with symlinks.
The RAID method, while solving this problem, always incurs significant loss of either storage reliability or usable disk space.
But recently I found a better solution to this problem and similar ones: mhddfs. It is a FUSE filesystem module which allows you to combine several smaller filesystems into one big “virtual” one, which contains all the files from all its members, and all their free space. Even better, unlike other similar modules (unionfs?), this one does not limit the ability to add new files to the combined filesystem, and it intelligently manages where those files are placed.
The package is called «mhddfs» and is readily available in Debian and probably in many other distros as well.
Let's say the three hard drives you have are mounted at /mnt/hdd1, /mnt/hdd2 and /mnt/hdd3. Then, you might have something akin to the following:
$ df -h
Filesystem  Size  Used Avail Use% Mounted on
...
/dev/sda1    80G   50G   30G  63% /mnt/hdd1
/dev/sdb1    40G   35G    5G  88% /mnt/hdd2
/dev/sdc1    60G   10G   50G  17% /mnt/hdd3
After you have installed the mhddfs package using your favourite package manager, you can create a new mount point, let's call it /mnt/virtual, which will join all these drives together for you. The beauty of FUSE means you don't really have to be root for this (you can be just a member of the fuse group), but to keep the examples simple, let's suppose we are logged in as root:
# mkdir /mnt/virtual
# mhddfs /mnt/hdd1,/mnt/hdd2,/mnt/hdd3 /mnt/virtual -o allow_other
option: allow_other (1)
mhddfs: directory '/mnt/hdd1' added to list
mhddfs: directory '/mnt/hdd2' added to list
mhddfs: directory '/mnt/hdd3' added to list
mhddfs: move size limit 4294967296 bytes
mhddfs: mount point '/mnt/virtual'
The «-o allow_other» option here means that the resulting filesystem should be visible to all users, not just to the one who created it.
The result will look like this:
$ df -h
Filesystem  Size  Used Avail Use% Mounted on
...
/dev/sda1    80G   50G   30G  63% /mnt/hdd1
/dev/sdb1    40G   35G    5G  88% /mnt/hdd2
/dev/sdc1    60G   10G   50G  17% /mnt/hdd3
mhddfs      180G   95G   85G  53% /mnt/virtual
As you can see, the new filesystem has been created. It joined the total size of all drives together (180G), added together the space used by all files there (95G) and summed up the free space (85G). If you look at the files in /mnt/virtual, you'll notice that it has files from all three drives, with all three directory structures “overlaid” onto each other.
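To get a feel for what “overlaid” means here, the following is a small sketch that prints the merged listing of several directories without mhddfs at all. It only imitates the view you get in /mnt/virtual; the function name merged_listing and the demo file names are made up for illustration, and it assumes a find that supports -mindepth (GNU find on Debian does).

```shell
#!/bin/sh
# Print every relative path that exists in at least one member directory,
# which is roughly what a listing of the combined mount point shows.
merged_listing() {
    for member in "$@"; do
        # List paths relative to the member root, without the leading "./".
        (cd "$member" && find . -mindepth 1 | sed 's|^\./||')
    done | sort -u
}

# Demo on throwaway directories standing in for /mnt/hdd1, /mnt/hdd2, /mnt/hdd3.
# The folder and file names below are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/hdd1/Classical" "$demo/hdd2/Rock" "$demo/hdd3/Classical"
touch "$demo/hdd1/Classical/bach.ogg" \
      "$demo/hdd2/Rock/queen.ogg" \
      "$demo/hdd3/Classical/mozart.ogg"
merged_listing "$demo/hdd1" "$demo/hdd2" "$demo/hdd3"
rm -rf "$demo"
```

Note how the “Classical” folder appears only once in the output, even though it exists on two of the stand-in drives.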
But what if you try to add new files somewhere inside that /mnt/virtual? Well, that is quite a tricky issue, and I must say the author of mhddfs solved it very well. When you create a new file in the virtual filesystem, mhddfs looks at the free space remaining on each of the drives. If the first drive has enough free space, the file is created on that first drive. Otherwise, if that drive is low on space (has less than specified by the “mlimit” option of mhddfs, which defaults to 4 GB), the second drive is used instead. If that drive is low on space too, the third drive is used. If each drive individually has less than mlimit free space, the drive with the most free space is chosen for new files.
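The placement rule is easier to follow when written down as code. This is only my sketch of the rule as described above, not the actual mhddfs source; the function name pick_drive and the NAME:FREE argument format are invented for the example, and free space is given in whole GB to keep the numbers readable.

```shell
#!/bin/sh
# pick_drive MLIMIT NAME:FREE [NAME:FREE ...]
# Prints the name of the drive that would receive a new file.
# Drives must be listed in the same order as on the mhddfs command line.
pick_drive() {
    mlimit=$1; shift
    best_name=""; best_free=-1
    for entry in "$@"; do
        name=${entry%%:*}
        free=${entry##*:}
        # The first drive with at least mlimit free space wins.
        if [ "$free" -ge "$mlimit" ]; then
            echo "$name"
            return 0
        fi
        # Otherwise remember the drive with the most free space as a fallback.
        if [ "$free" -gt "$best_free" ]; then
            best_name=$name; best_free=$free
        fi
    done
    echo "$best_name"
}

pick_drive 4 hdd1:30 hdd2:5 hdd3:50   # prints hdd1 (first drive has enough space)
pick_drive 4 hdd1:3 hdd2:5 hdd3:50    # prints hdd2 (hdd1 is below the limit)
pick_drive 4 hdd1:3 hdd2:1 hdd3:2     # prints hdd1 (all below the limit, hdd1 has the most)
```

The first call uses the free space figures from the df output earlier; the other two are hypothetical situations showing the fallbacks.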
It's even more than that: if a certain drive runs out of free space in the middle of a write (suppose you tried to create a very large file on it), the write will not fail; mhddfs will simply transfer the already written data to another drive (which has more space available) and continue the write there. All this happens completely transparently to the application which writes the file (it will not even know that anything happened).
Now you can simply work with files in /mnt/virtual, not caring about what is being read from which disk, etc. Also, the convenience of having large “contiguous” free space means you can simply drop any new files into that folder and (as long as there's space on at least one member of the virtual FS) not care about which file gets stored where.
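If you ever do get curious about where a particular file physically ended up, you can simply check the member directories in turn, since each file keeps its relative path on whichever drive holds it. A minimal sketch, with which_member being a name I made up for this helper:

```shell
#!/bin/sh
# which_member RELPATH MEMBER_DIR [MEMBER_DIR ...]
# Prints the first member directory that contains RELPATH;
# returns non-zero if none of them does.
which_member() {
    relpath=$1; shift
    for member in "$@"; do
        if [ -e "$member/$relpath" ]; then
            echo "$member"
            return 0
        fi
    done
    return 1
}
```

For example, `which_member Rock/queen.ogg /mnt/hdd1 /mnt/hdd2 /mnt/hdd3` would print the mount point of the drive holding that (hypothetical) file.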
If you want that mount point to be created automatically for you on each boot, you can add the following line to /etc/fstab:
mhddfs#/mnt/hdd1,/mnt/hdd2,/mnt/hdd3 /mnt/virtual fuse defaults,allow_other 0 0
For more details, see
The last, but not the least important, thing to mention is that it's very simple to stop using mhddfs if you later decide to do so, without losing any file data or directory structure. Let's say that at some point you purchase a new 500 GB hard disk and want to sell the smaller disks on eBay. You can just plug in the new drive, copy everything from /mnt/virtual onto it, and then remove the mhddfs mount point and disconnect the old drives. All your folders, which were previously merged in a “virtual” way by mhddfs, will now be merged in reality, on the new disk. And because the files themselves are not split into pieces stored on different drives, even in the unlikely event that mhddfs suddenly no longer works for you (or disappears from existence), you can still copy all your data from all three drives into one single folder, and have the same structure you previously had in that /mnt/virtual.
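That last fallback, merging the drives by hand without mhddfs, could look roughly like the sketch below. The function name merge_members is my own, and it assumes a cp that supports the -a option (as GNU cp on Debian does). If the same relative path exists on more than one drive, the copy from the later drive overwrites the earlier one here, so adjust that if it matters to you.

```shell
#!/bin/sh
# merge_members TARGET MEMBER_DIR [MEMBER_DIR ...]
# Copies the contents of every member directory into TARGET,
# merging directories that have the same relative path.
merge_members() {
    target=$1; shift
    mkdir -p "$target"
    for member in "$@"; do
        # The trailing "/." copies the contents of the member,
        # not the member directory itself.
        cp -a "$member/." "$target/"
    done
}
```

With the drives from this article, that would be something like `merge_members /mnt/newdisk/Music /mnt/hdd1 /mnt/hdd2 /mnt/hdd3`, where /mnt/newdisk is a hypothetical mount point for the new disk.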