Last update: July 05, 2021
Today I would like to discuss RAID, and more specifically how it relates to backup. The title of this blog post proposes a daring thesis, which I would like to develop and, as far as I am able, prove or at least defend. First, a few definitions and terms related to RAID have to be introduced, so that readers who aren’t familiar with RAID won’t feel confused.
RAID technology extends the capabilities of drives by combining individual disks into a group (or groups). Such groups are called arrays. Combining disks into arrays makes extra capabilities available: for example, tolerance of the failure of one or more disks, higher read or write throughput (or both) than a single disk can deliver, or the possibility of expanding an array with additional drives.
But let’s take it one step at a time. There are different types of arrays, and their features depend on which type is used. In this article, I would like to focus on just one of the properties of RAID, namely redundancy, or the array’s resistance to hard disk failures. Now let the fun with backup begin.
Imagine, dear reader, that you are the administrator of a data server whose drives work in RAID 1. To make matters clear, this type of array is known as a ‘mirror’, as it saves the same data on every disk in the array. Saving SOMETHING on your array means that you will find SOMETHING on each of the array’s disks.
Suppose that one beautiful day you find a corrupted file on your data server and, to add a dash of drama, assume that this file is a document that belongs to your boss. Because of how RAID 1 works, the corrupted version was instantly written to every disk in the array, and there is no way to retrieve the original. In other words, even though copies of the data exist on all disks, when a file is corrupted, every one of those copies is damaged. That is the first difference between a copy on a RAID array and a backup copy. With a backup, the copy taken before the corruption is still safe on the backup server. Thus, not all copies are created equal.
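To make the difference concrete, here is a minimal sketch in Python. It is a toy model and not how a real RAID controller works: the `Raid1Array` class, the file name and the file contents are all made up for illustration, and the ‘backup’ is simply a copy taken before the corruption happens.

```python
class Raid1Array:
    """Toy model of a RAID 1 mirror: every write goes to every member disk."""

    def __init__(self, disk_count: int) -> None:
        # Each "disk" is just a dictionary mapping file names to contents.
        self.disks = [dict() for _ in range(disk_count)]

    def write(self, name: str, data: bytes) -> None:
        # Mirroring: the same data lands on all disks, good or bad.
        for disk in self.disks:
            disk[name] = data

    def read(self, name: str, disk_index: int = 0) -> bytes:
        return self.disks[disk_index][name]


array = Raid1Array(disk_count=2)
array.write("report.doc", b"quarterly figures")

# Point-in-time backup, taken while the file is still healthy.
backup = dict(array.disks[0])

# Later, a corrupted version of the file is written to the array.
array.write("report.doc", b"\x00\x00garbage")

# Every mirror copy now holds the corrupted data...
copies_on_array = [array.read("report.doc", i) for i in range(2)]
# ...while the backup still holds the earlier, healthy version.
restored = backup["report.doc"]

print(copies_on_array)
print(restored)
```

The mirror did exactly what it was asked to do: it kept all disks identical. That is why it cannot help here, and why the older copy in the backup can.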
Let’s go back in time about two paragraphs, to the moment before your boss’s file was damaged (same file server, same array). This time, your boss downloaded something from a torrent site. It was supposed to be a free program for VAT calculations, but it turned out to be a virus that deleted all the data. Just as in the first example (see proof No. 1), the deletion was applied to every disk in the array, so your boss’s data is gone from all of them. A backup system should protect the data against deletion and give you a chance to recover it.
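The deletion scenario can be sketched in the same spirit. The `VersionedBackup` class below is hypothetical and only illustrates one assumption: that the backup system keeps earlier snapshots instead of overwriting them, so a backup run that happens after the deletion does not destroy the last good copy.

```python
from typing import Dict, List


class VersionedBackup:
    """Toy backup store that keeps every snapshot instead of overwriting."""

    def __init__(self) -> None:
        self.snapshots: List[Dict[str, bytes]] = []

    def take_snapshot(self, live_data: Dict[str, bytes]) -> int:
        # Store an independent copy and return its version number.
        self.snapshots.append(dict(live_data))
        return len(self.snapshots) - 1

    def restore(self, version: int) -> Dict[str, bytes]:
        return dict(self.snapshots[version])


# Data as seen on the (mirrored) file server; names are illustrative.
live = {"invoices.xls": b"2021 invoices", "notes.txt": b"meeting notes"}

backup = VersionedBackup()
before_virus = backup.take_snapshot(live)

# The virus deletes everything; the mirror faithfully replicates the deletion.
live.clear()

# The next scheduled backup captures the now-empty state as a new version.
after_virus = backup.take_snapshot(live)

# The earlier version is untouched and can still be restored.
recovered = backup.restore(before_virus)
print(sorted(recovered))
```

Without versioning, the second backup run would have replaced the good copy with an empty one, which is the same trap as the mirror, only slower.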
File systems also tend to fail. If the file system on a mirrored RAID array is damaged, the damage is replicated X times, where X is the number of disks in the array. But there’s also the other side of the coin: if you manage to fix the file system, the repair is applied to all of the array’s disks as well. However, if some data is lost during the file system repair and it was not protected by a backup system, you can write that data off.
I would like to devote my last proof to a substantially unpleasant topic: the complete destruction of the data server. A fire in the server room, or any other event that destroys the whole server, means data loss. Yes, yes, I understand that a backup server can (but doesn’t have to) sit in the same burning server room. However, there are techniques that deal with this problem, even in case of fire: a fireproof cabinet for storing magnetic tapes with data, or storing the data outside the building that houses the data server, or even outside the company altogether (known as off-site backup).
I have outlined four arguments that show why RAID is not the same as backup. I have focused on the weak points of RAID as a means of protection against data loss. Luckily, there are also many positive aspects of using RAID. Unfortunately, apart from those mentioned in the introduction, I’m not going to discuss them, as this is a very broad topic that is not directly connected with the subject of my post.
If you’re interested, plenty of material on RAID is available on the Internet. The ‘data protection’ feature of certain RAID types can protect the data in case of the failure of one or more disks, depending on which RAID type is used and the number of disks in the array. And in my view, that is the strong point of some types of RAID: fault tolerance.
RAID technology is often used on servers that store data backups. This is usually an environment supervised by a backup server application that uses RAID arrays as its storage. Personally, I consider this good practice, provided that a suitable RAID type was chosen.
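As a rough illustration of how fault tolerance depends on the RAID type and the number of disks, here is a small Python sketch. The function name `guaranteed_fault_tolerance` is my own, it covers only the common standard levels, and it assumes that RAID 10 is built from two-disk mirrors. It returns the worst-case number of disk failures the array survives; RAID 10 can survive more if the failed disks happen to sit in different mirror pairs.

```python
def guaranteed_fault_tolerance(level: str, disk_count: int) -> int:
    """Return how many disk failures the array survives in the worst case."""
    minimum_disks = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}
    if level not in minimum_disks:
        raise ValueError(f"unsupported RAID level: {level}")
    if disk_count < minimum_disks[level]:
        raise ValueError(
            f"RAID {level} needs at least {minimum_disks[level]} disks"
        )
    if level == "10" and disk_count % 2 != 0:
        raise ValueError("RAID 10 with two-disk mirrors needs an even disk count")

    if level == "0":
        return 0               # striping only, no redundancy
    if level == "1":
        return disk_count - 1  # one surviving mirror copy is enough
    if level == "5":
        return 1               # single parity
    if level == "6":
        return 2               # double parity
    return 1                   # RAID 10: losing both disks of one pair is fatal


for level, disks in [("0", 4), ("1", 2), ("5", 4), ("6", 6), ("10", 4)]:
    survives = guaranteed_fault_tolerance(level, disks)
    print(f"RAID {level} with {disks} disks survives {survives} disk failure(s)")
```

Note that none of these numbers say anything about corrupted or deleted files: they only describe how many drives can die before the array itself is lost.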
Michael Bateman, March 27, 2015 04:13:23
Thanks for that last paragraph about using a RAID as part of a backup system. The phrase “RAID is NOT BACKUP” kept appearing in certain posts and searches and so I made a point of clicking this and a few others to check myself. You are entirely correct, using a mounted volume from a NAS with a RAID as THE backup with no versioning and no offsite backup is not a backup. But I agree that using a NAS with a RAID as part of an overall backup strategy is fine. Just remember you need some type of Versioning system for “going back in time,” such as using part of said NAS as a Time Machine backup volume for Mac OSX computers.
I use a Synology 1513+ with one expansion unit (total of 10 SATA bays). It hosts three volumes, one for active use, one for periodic local backups of the active volume, and a third for Time Machine Backups, Carbon Copy Clones of my main system disk and other backup utilities. Lastly everything in the main, active drive gets backed up off-site to Amazon Glacier.
Thanks for the great post. -Michael
John, April 26, 2016 10:17:05
Great post, waiting for more:-)!
rs, February 20, 2017 01:36:25
Except the reasoning is wrong. What you’re looking for is a versioning file system on a RAID system. That solves problems 1-3 with RAID as backup. Then you also need a reliable hardware vendor who is around forever (like IBM and AS400). Case 4 can be solved by using a chassis that is fireproof and chaining the hardware to the office.
Cloud may seem like an alternative but has price, security and legal issues which are complex to deal with.
Xavier, March 03, 2017 10:55:23
I disagree with this new trend of saying that RAID is not a backup (raid1). You’ve got your data on 2 different disks, how is that not a backup? Sure, you’ll lose everything if the partition gets corrupted or if you accidentally delete a file. But that just means that this particular backup solution, like others, has a vulnerability. Copying your data on a separate computer is considered a backup but if a surge busts both computers then you’re out of luck. Does it mean it’s not a backup? RAID is a backup but just like every other backup solution, it shouldn’t be used alone. It also depends on your level of comfort. I have some sandbox VMs that I run off the raid. I do that because I wanna protect myself against HDD failures. But I don’t find that these images are important enough to copy on a second computer. I would even consider copying data in 2 different folders on the same partition to be a backup. A risky backup, but depending on the importance of that data, it might be just enough. Copying on RAID1 is still better than nothing at all anyway. And the day that one drive will fail and you will recover the data from the remaining drive, you will say: “Damn I’m happy to have a backup!”
Anon, January 08, 2020 01:46:57
It depends on what you want a backup for. RAID is a backup for a drive failing. Saying that it isn’t a backup would be like saying a hard drive which keeps a copy of all your data is not a backup because a hurricane could take both of them out at once. Before planning your backup solution you should first figure out what you are trying to defend against.