My 71 TiB ZFS NAS has been running for 10 years without a single disk failure

My 4U 71 TiB ZFS server, built from twenty-four 4-terabyte disks, is more than ten years old and still works great. Although it is now on its second motherboard and power supply, the system has not encountered a single disk failure so far (knock on wood). How did I manage to go ten years without a disk failure?

Let's first talk about the disks themselves

The HGST 4-terabyte disks have accumulated only about 6,000 power-on hours over ten years. You might immediately suspect that something does not add up, and you would be right: that is only about 250 days of continuous operation. And this, I think, is the secret to the disks' longevity.
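
The figure is easy to verify, since power-on hours are reported by SMART. Below is a small sketch (not from the article) that prints the Power_On_Hours attribute of every drive with smartctl; the parsing assumes the usual ATA attribute table, and the /dev/sd? naming is system-dependent.

    #!/usr/bin/env python3
    """Sketch: print the Power_On_Hours SMART attribute for each drive.
    Assumes smartctl is installed and the drives report the common ATA
    attribute table; /dev/sd? globbing is system-dependent."""
    import glob
    import subprocess

    for dev in sorted(glob.glob("/dev/sd?")):
        out = subprocess.run(["smartctl", "-A", dev],
                             capture_output=True, text=True).stdout
        for line in out.splitlines():
            if "Power_On_Hours" in line:
                # The raw value is the 10th column of the attribute row.
                print(f"{dev}: {line.split()[9]} hours")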

Turn off the server when you are not using it

My NAS is off by default. I turn it on remotely only when I need to use it. A script switches on the server's power through a "smart" IoT socket, and once the BMC (Baseboard Management Controller) has finished booting, I use IPMI to power on the server itself. Wake-on-LAN would have worked as an alternative.

As soon as I finish using the server, I run a small script that shuts it down, waits a few seconds, and then switches off the socket.

Simply shutting the server down was not enough for me, because even when powered off the motherboard still draws about 7 watts in standby (roughly as much as two Raspberry Pis). With my usage schedule, that would have meant wasted energy, which is why the smart socket cuts the power completely.
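
To give an idea of what such a sequence can look like, here is a rough sketch, not the author's actual scripts: the smart-socket HTTP endpoint, host names, and credentials are made up, the BMC is driven with ipmitool, and a soft IPMI power-off stands in for whatever shutdown mechanism the real script uses.

    #!/usr/bin/env python3
    """Sketch of a power-on/power-off sequence for a NAS behind a smart
    socket and a BMC. Endpoints, hosts, and credentials are placeholders."""
    import subprocess
    import sys
    import time
    import urllib.request

    PLUG_URL = "http://plug.example.lan"    # hypothetical smart-socket API
    BMC_HOST = "nas-bmc.example.lan"        # hypothetical BMC address
    BMC_USER, BMC_PASS = "admin", "secret"  # placeholders

    def plug(state: str) -> None:
        # Many IoT sockets expose a simple HTTP relay endpoint like this.
        urllib.request.urlopen(f"{PLUG_URL}/relay?state={state}", timeout=5)

    def ipmi(*args: str) -> None:
        subprocess.run(["ipmitool", "-I", "lanplus", "-H", BMC_HOST,
                        "-U", BMC_USER, "-P", BMC_PASS, *args], check=True)

    def power_on() -> None:
        plug("on")                        # give the PSU standby power
        time.sleep(120)                   # wait for the BMC to boot
        ipmi("chassis", "power", "on")    # then start the server itself

    def power_off() -> None:
        ipmi("chassis", "power", "soft")  # ask the OS to shut down cleanly
        time.sleep(180)                   # wait for the OS to halt
        plug("off")                       # cut power at the socket

    if __name__ == "__main__":
        power_on() if sys.argv[1:] == ["on"] else power_off()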

This mode of operation suits me because I run my other services on low-power devices such as a Raspberry Pi 4, or on servers that idle at much lower power than my "big" NAS.

The motivation was a significant reduction in my electricity bill; the longevity of the hard drives turned out to be a side effect.

You may argue that my case is not representative, that I was simply lucky, and that the large number of disks played a role. But the predecessor of this NAS, built from twenty 1 TB Samsung Spinpoint F1 disks, tells the same story: not a single disk failure during its entire five years of operation.

Motherboard (failed once)

Although the disks in the NAS are still fine, a few years ago I had to replace its motherboard. Access to the BIOS was lost, and sometimes the board would not boot. I tried the obvious things (resetting the BIOS, reflashing it, replacing the CMOS battery), but to no avail.

Fortunately, the same motherboard was still available on eBay at a good price, so it turned out to be easier to replace it than to repair the old one. I needed exactly the same board because the server uses four PCIe slots: 3 x HBA and 1 x 10Gb NIC.

ZFS

The ZFS file system has worked flawlessly all these years. I have changed operating systems several times and never had a problem importing the pool into a new OS installation. If I were building a new storage server, I would definitely use ZFS again.

I run a zpool scrub integrity check on the disks several times a year. It has never found a single checksum error, even though more than a petabyte of data has been read from the disks over all these checks. Since a scrub takes about 20 hours and consumes a lot of electricity, I run it on "cheap" days when the price of electricity is lowest.
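
As an illustration of what such a check can look like, here is a small sketch (the pool name "tank" is a placeholder, and in practice it would be started by cron or a timer on a low-tariff day): it kicks off a scrub, polls zpool status until the scrub finishes, and prints the final status so any checksum errors would be visible.

    #!/usr/bin/env python3
    """Sketch: run a zpool scrub and wait for it to complete.
    The pool name is a placeholder; scheduling (e.g. via cron on
    low-tariff days) is left outside the script."""
    import subprocess
    import time

    POOL = "tank"  # placeholder pool name

    def scrub_in_progress(pool: str) -> bool:
        out = subprocess.run(["zpool", "status", pool],
                             capture_output=True, text=True).stdout
        return "scrub in progress" in out

    subprocess.run(["zpool", "scrub", POOL], check=True)
    while scrub_in_progress(POOL):
        time.sleep(600)  # poll every 10 minutes; a scrub here takes ~20 hours
    # Show the final result so any checksum errors are easy to spot.
    print(subprocess.run(["zpool", "status", "-v", POOL],
                         capture_output=True, text=True).stdout)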

And I am not at all surprised by this result. Disk failures usually come down to one of the following situations:

  1. Complete failure, when the disk is not even recognized

  2. Bad sectors (problems with reading or writing)

There is also a third, much rarer type of failure: silent data corruption. It is silent because either the disk itself does not realize it is returning corrupted data, or the errors slip past the checksums on the SATA link.

However, thanks to all the low-level checksums, this risk is extremely small. Yes, it is a real risk that should not be underestimated, but it is small. In my opinion, it is something to worry about at data-center scale; for home use, the probability can be neglected.

Listening to ZFS enthusiasts, you might get the impression that if you are not using ZFS, you are bound to lose all your data one fine day. I do not agree: it all depends on the context and circumstances. That said, ZFS is not that difficult to master, and if you are well acquainted with Linux or FreeBSD, this file system is definitely worth trying.



Noise level (very quiet)

My NAS is very quiet for a NAS. But to achieve this, I had to work hard.

The case has three powerful 12V fans that cool the 24 drive bays, and at their standard speed they are very noisy. I decided that running them at the lowest, almost silent speed would be enough for me; I only had to add a separate fan for the four PCIe cards (the HBAs and the NIC), which would otherwise overheat. This provided sufficient airflow most of the time, but the drives would still heat up during sustained reads and writes.

Fortunately, the Supermicro motherboard I bought for the NAS allowed me to control the fans from Linux. So I decided to create a script that would set the fan speed based on the temperature of the hottest drive in the case.

I even visited a math forum and asked its regulars for an algorithm that would best balance disk cooling and quietness. Someone suggested using a "PID controller," which I knew nothing about.

As a result, I had to learn Python, "borrow" some sample PID controller code, and use trial and error to find parameters that balance fan noise against cooling performance.
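
For readers unfamiliar with the idea, here is a minimal sketch of such a controller. It is not the author's script: the gains are illustrative placeholders, the PWM sysfs path is hypothetical, and drive temperatures are read by naively parsing smartctl output.

    #!/usr/bin/env python3
    """Minimal PID fan-controller sketch. Gains, paths, and the polling
    interval are illustrative placeholders, not tuned values."""
    import glob
    import subprocess
    import time

    TARGET = 40.0                     # target temperature of the hottest drive, C
    KP, KI, KD = 4.0, 0.05, 1.0       # placeholder gains, found by trial and error
    INTERVAL = 30                     # seconds between measurements
    PWM_PATH = "/sys/class/hwmon/hwmon0/pwm1"  # hypothetical PWM control file

    def hottest_drive_temp() -> float:
        """Best-effort parse of `smartctl -A` for every /dev/sd? device."""
        temps = []
        for dev in glob.glob("/dev/sd?"):
            out = subprocess.run(["smartctl", "-A", dev],
                                 capture_output=True, text=True).stdout
            for line in out.splitlines():
                if "Temperature_Celsius" in line:
                    temps.append(float(line.split()[9]))
        return max(temps) if temps else TARGET

    def set_fan_duty(duty: float) -> None:
        """Write a 0-255 PWM value; the real interface depends on the board."""
        with open(PWM_PATH, "w") as f:
            f.write(str(int(duty)))

    integral, prev_error = 0.0, 0.0
    while True:
        error = hottest_drive_temp() - TARGET
        integral = max(-100.0, min(100.0, integral + error * INTERVAL))  # anti-windup
        derivative = (error - prev_error) / INTERVAL
        output = KP * error + KI * integral + KD * derivative
        set_fan_duty(max(60.0, min(255.0, output)))  # keep fans above a minimum duty
        prev_error = error
        time.sleep(INTERVAL)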

This script has been working perfectly for many years and keeps the disk temperature at or below 40 degrees Celsius. As it turned out, PID controllers are a great solution, and I think they should be used in most equipment that controls fans or temperature, instead of "dumb" on/off switching at a threshold or the slightly less "dumb" temperature-to-speed lookup table.

Network

I started with quad-port gigabit network cards and used interface bonding to reach transfer rates of about 450 MB/s between systems. This setup required a huge number of UTP cables, so I eventually got tired of it and bought a few cheap but perfectly serviceable InfiniBand cards, which pushed transfers to about 700 MB/s. When I decided to drop Ubuntu and return to Debian, I hit a problem: the InfiniBand cards did not work there, and I could not find a way to fix it. So I bought a few used 10-gigabit Ethernet cards, which still work fine.

PSU Problems (died)

When the NAS powers on, all the disks spin up simultaneously (there is no staggered spin-up), which draws about 600 W for a few seconds. My power supply is rated at 750 W and its 12-volt rail should in theory have been sufficient, but the unit would sometimes shut down during boot; eventually it could not handle this at all and had to be replaced.
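
The 600 W figure is consistent with typical drive datasheets: a 3.5-inch drive commonly pulls on the order of 1.5-2 A from the 12 V rail while spinning up, so twenty-four of them starting at once add up quickly. A back-of-the-envelope check (the per-drive spin-up current is a typical datasheet value, not measured on this system):

    # Rough spin-up power estimate; 2.0 A at 12 V is a typical datasheet
    # figure for a 3.5" drive during spin-up, not a measured value.
    drives = 24
    spinup_current_a = 2.0
    rail_voltage = 12.0
    print(drives * spinup_current_a * rail_voltage, "W")  # -> 576.0 W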

UPS (thrown away)

For many years I used a powerful UPS to protect the NAS from power outages and to shut it down cleanly during an emergency. Everything worked great, but I noticed that the UPS added another 10 W to the power consumption, and I decided it was time to get rid of it.

I just took it for granted that I could lose some data due to power outages.

Backup (there is none)

My most important data is backed up in three copies. Much of what is stored on this NAS, however, is not important enough to back up. I rely on hardware replacement and ZFS redundancy to protect against data loss from disk failure; if that is not enough, I am out of luck. I have been living with this risk for 10 years. Maybe my luck will run out someday, but for now I enjoy not having to worry about backups.

Future storage plans (or lack thereof)

Do I have plans for what to do next? Honestly, no. I originally built this NAS of such a size because I didn't want to move data if I ran out of storage space. As a result, I still have enough free space.

I have a spare motherboard, processor, memory, and spare HBA cards, so I will most likely be able to restore the system if something breaks.

Since hard drive capacities have grown significantly, I may switch from a 24-bay case to a smaller form factor: the same amount of storage can now be built from just 6-8 drives with RAIDZ2 (comparable to RAID 6) redundancy. But it would be just as expensive a project.
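
As a rough illustration of that claim (the drive sizes below are arbitrary examples, and ZFS metadata and padding overhead are ignored), a RAIDZ2 vdev gives roughly the capacity of all but two of its drives:

    # Back-of-the-envelope usable capacity of a RAIDZ2 vdev: two drives'
    # worth of space goes to parity; ZFS overhead is ignored.
    def raidz2_usable_tib(num_drives: int, drive_tb: float) -> float:
        tib_per_tb = 1e12 / 2**40           # convert vendor TB to TiB
        return (num_drives - 2) * drive_tb * tib_per_tb

    print(f"{raidz2_usable_tib(8, 16):.0f} TiB")  # 8 x 16 TB -> ~87 TiB
    print(f"{raidz2_usable_tib(6, 20):.0f} TiB")  # 6 x 20 TB -> ~73 TiB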

Another likely scenario is that in the coming years my NAS will eventually break down, and I will decide not to replace it at all, and my "data storage hobby" will come to an end.


And how do you extend the life of your disks?
