Using a NAS is very convenient: data is accessible from anywhere, multiple users can work simultaneously, and with RAID 5 you have the reassuring feeling that you will not lose your data if a single drive fails. RAID 5 distributes data across the drives together with calculated parity, which allows it to reconstruct the contents of one missing drive. In practice, however, this is no guarantee. Sometimes events do not follow the ideal scenario, and the entire array collapses.
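The parity idea can be illustrated in a few lines of Python. This is a minimal sketch, not how any particular NAS implements RAID 5: the block contents and the helper name `xor_blocks` are made up for illustration, and a real array rotates parity across the drives and works on much larger stripes.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equally sized byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks of one stripe (hypothetical contents) and their parity.
data = [b"NAS1", b"NAS2", b"NAS3"]
parity = xor_blocks(data)

# Simulate losing the drive that held data[1]: XOR-ing the surviving blocks
# with the parity block yields the missing block again.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

With two blocks missing from the same stripe there is only one parity equation for two unknowns, which is why a second failed or ejected drive makes the array unreadable.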
Common Causes of RAID Array Failure in NAS
Physical failure of one or more drives: the most common cause of a RAID array outage. Drives in a NAS run 24/7 and are subject to higher thermal and mechanical stress. Over time, bad sectors appear, bearings become noisy or the electronics fail. If one drive fails in a RAID 5, the array should continue to function in degraded mode, but if a second drive fails, the data is usually no longer accessible.
Sudden ATA or SMART errors: even a drive that appears healthy at first glance can start reporting communication errors, known as ATA errors. Similarly, SMART attributes can reveal bad sectors or an unstable platter surface. These errors cause the NAS to mark the drive as faulty and eject it from the array, even though it may still be physically functional.
Failed rebuild after drive replacement: when a drive is replaced, the NAS initiates a rebuild, recalculating parity and recreating the missing data on the new drive. If an error occurs during this process (a faulty new drive, a power interruption, a write error), the rebuild fails to complete and the entire array can collapse. This is what happened in the Netgear ReadyNAS case described below.
NAS controller or power supply failure: drives are not the only components that can fail. A faulty RAID controller, power supply or NAS firmware can render the array inaccessible. In such cases, the drives are physically fine, but the NAS cannot read them correctly.
File system corruption (e.g. Btrfs or EXT4): even when the RAID is assembled correctly, the file system can still be the problem. A power outage at the wrong moment, a write failure or a faulty drive is enough to leave Btrfs or EXT4 in an inconsistent state. The user then sees no data, even though it is physically present on the drives.
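The SMART warning signs mentioned in the list above can be watched with a small script. The sketch below assumes the raw attribute values have already been collected into a dictionary (for example from `smartctl -A` output). The function name, the sample readings and the "anything above zero is suspicious" rule are illustrative assumptions, not a vendor threshold.

```python
# Attribute names as printed by smartctl for typical SATA drives.
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
)

def suspicious_attributes(raw_values, limit=0):
    """Return the watched attributes whose raw value exceeds `limit`."""
    return {
        name: raw_values[name]
        for name in WATCHED
        if raw_values.get(name, 0) > limit
    }

# Hypothetical readings for one drive.
drive = {
    "Reallocated_Sector_Ct": 0,
    "Current_Pending_Sector": 8,
    "UDMA_CRC_Error_Count": 0,
    "Power_On_Hours": 31000,
}
print(suspicious_attributes(drive))  # {'Current_Pending_Sector': 8}
```

A non-empty result is a reason to back up and test the drive before starting any rebuild. It does not prove that the drive is about to fail.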
Detailed information about all types of NAS and network storage failures, including indicative pricing, can be found on the NAS Data Recovery page. Information specific to RAID arrays is available on the RAID Data Recovery page.
An Attempt to Expand Array Capacity Led to Its Failure
The user of this Netgear NAS decided to gradually replace the original 3 TB drives with new 6 TB models. They replaced one drive, let the array rebuild, replaced the next drive, let it rebuild, and so on. The problem occurred when a newly inserted 6 TB drive failed. The NAS began reporting drive errors (ATA errors) and the array collapsed.
RAID 5 should, however, survive the failure of a single drive, so one bad drive alone does not explain the collapse. During the later reconstruction, it became apparent that a second drive had also been ejected from the array. The reason is not entirely clear, as that drive passed testing without any errors.
Our analysis revealed that each drive contained multiple partitions — system, swap and two large RAID areas. The original data was still on the first RAID partition (~2.7 TB), while the newly created area on the 6 TB drives was incomplete.
ReadyNAS does not handle RAID in a completely standard way. Instead of a single conventional RAID 5 array, it uses its own X-RAID technology, which divides drives into several parts called zones. Each zone is always sized according to the smallest drive in the array. When a larger drive is inserted, ReadyNAS maintains compatibility by using only the portion matching the smaller drives. The remaining capacity is separated into a new zone, from which an additional RAID 5 is created and logically joined to the original one. This is why two RAID partitions appeared on the 6 TB drives — the first, matching the original 3 TB drives (~2.7 TB), and the second, prepared for capacity expansion. In this particular case, however, the new area was never fully assembled because additional drives of the same size were needed to complete it. Extra members appeared in the metadata, and the entire array became inaccessible.
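The zone layout described above can be modelled roughly as follows. This is a simplified sketch under stated assumptions, not Netgear's actual algorithm: the function name is hypothetical, sizes are nominal terabytes, a four-bay unit is assumed for the examples, and a zone is taken to need at least two members, with usable space of (members - 1) times the slice size, as in RAID 5 (two members then behave like a mirror).

```python
def xraid_style_layers(drive_sizes_tb):
    """Split drives into zones sized by the smallest remaining capacity.

    Returns a list of (slice_size_tb, member_count, usable_tb) tuples.
    """
    layers = []
    used = 0  # capacity per drive already assigned to earlier zones
    for size in sorted(set(drive_sizes_tb)):
        members = sum(1 for d in drive_sizes_tb if d >= size)
        if members < 2:
            break  # leftover space on a single drive cannot be made redundant
        slice_size = size - used
        layers.append((slice_size, members, (members - 1) * slice_size))
        used = size
    return layers

print(xraid_style_layers([3, 3, 3, 3]))  # original array
print(xraid_style_layers([6, 3, 3, 3]))  # one drive upgraded
print(xraid_style_layers([6, 6, 3, 3]))  # two drives upgraded
print(xraid_style_layers([6, 6, 6, 6]))  # upgrade finished
```

In this model, a single 6 TB drive among 3 TB drives adds no usable capacity: its extra space sits in a second zone that has no partner yet. That is consistent with the incomplete second RAID area found on the 6 TB drives in this case.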
The extra members in the metadata caused minor complications during array reconstruction. The Btrfs file system was also damaged. After excluding the "faulty" drive and locating the correct Btrfs superblocks, we were able to restore the complete directory structure and recover the data.
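To give an idea of what locating Btrfs superblocks involves, here is a minimal sketch. It is not the procedure used in this recovery. It relies only on the documented Btrfs layout: superblock copies at 64 KiB, 64 MiB and 256 GiB on the file system device, the magic string `_BHRfS_M` at byte 0x40 of each copy, and the generation counter in the following eight bytes. The function name and the in-memory test image are hypothetical, and checksums are not verified here.

```python
import io
import struct

SUPERBLOCK_OFFSETS = (0x10000, 0x4000000, 0x4000000000)  # 64 KiB, 64 MiB, 256 GiB
MAGIC = b"_BHRfS_M"
MAGIC_POS = 0x40  # position of the magic inside a superblock copy

def find_superblocks(dev):
    """Return (offset, generation) for every copy that carries the Btrfs magic."""
    found = []
    for offset in SUPERBLOCK_OFFSETS:
        dev.seek(offset + MAGIC_POS)
        raw = dev.read(16)  # 8 bytes magic + 8 bytes generation (little-endian)
        if len(raw) == 16 and raw[:8] == MAGIC:
            (generation,) = struct.unpack("<Q", raw[8:])
            found.append((offset, generation))
    return found

# Hypothetical in-memory image that contains only the primary copy.
image = io.BytesIO(bytes(0x10000 + 4096))
image.seek(0x10000 + MAGIC_POS)
image.write(MAGIC + struct.pack("<Q", 42))
print(find_superblocks(image))  # [(65536, 42)]
```

On a real system, the offsets apply to the assembled RAID volume rather than to the individual member drives, and the copy with the highest generation is usually the most recent one. Any such scan should run read-only on an image, never on the original drives.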
How to Prevent Data Loss from NAS / RAID
RAID is not a backup. Replacing drives and expanding an array always carries risk. Our article about RAID explains in more detail why RAID should not be considered a backup.
Do not attempt repeated rebuilds with a faulty or unverified drive — the risk of data damage increases with each attempt.
Use quality, matching drives, ideally those recommended by the NAS manufacturer.
Regularly back up important data outside the NAS — to another drive, cloud storage or an external device. Never trust a NAS as your only backup.
If your NAS fails, contact data recovery specialists — DIY attempts at home can lead to further damage to the drives or the entire array.
RAID 5 in a NAS may seem safe, but even a single error during a drive replacement can make all data inaccessible. In this case, the problem was caused by a failing new 6 TB drive that halted the rebuild and left the array in an inconsistent state. Thanks to specialised procedures, we were able to recover the data from the Netgear ReadyNAS.
- By Frantisek Fridrich