The Fundamentals of RAID

    This is a Viking. He has a helmet. The helmet has no horns. Viking helmets didn’t have horns. He has full armor. Vikings wore full armor just like everyone else in that time period. So, why is there a picture of a Viking on our technology blog, some of our more intrepid readers might be thinking? Well, the answer is simple enough, it’s because today we talk about raiding… or rather, RAID, which admittedly might not be as exciting to some as raiding but is fundamental to the way data storage functions nonetheless.

    It’s also an excellent way to show how the word evolved from one that inspired terror and chaos to one that now signifies security and protection from what, for a lot of people, would certainly be terror and chaos. So what is a Redundant Array of Independent Disks and how do we use it in the development world?

    The Softer Side of RAID

    The first thing you, our intrepid reader, should know is that there are two main types of RAID (along with a plethora of customizable options). The first type is “hardware RAID” where the RAID engine is actually built into the server hardware. This is the kind of RAID you can find recommended for one of our earlier products, Open-E DSS V7, albeit that product also supported software RAID. The second type is “software RAID” where the RAID support consists of software running on the server.

    Software RAID can be configured and used in many different ways. For instance, it can be configured to abstract many disks into one virtual device like in OpenBSD’s softraid, act as part of a more generic volume manager like LVM, or become a component of the file system like in BTRFS or ZFS (which is actually how Open-E JovianDSS uses it). This multitude of setups and configurations is one of the reasons that RAID has become as popular as it is today.

    Hardware RAID: Vendors, Controllers et al.

    So what is hardware RAID? Well, like its name implies, hardware RAID consists of a RAID controller, a piece of hardware that acts as a go-between for the operating system and the physical disks, that is actually added to the system in question. It’s typically configured via the BIOS or some other utility before the OS launches.

    There are several issues that lead people to opt for software RAID instead of hardware RAID, such as the fact that hardware RAID requires proprietary software. This, in turn, creates a vendor lock-in where the customer is stuck with the vendor they purchased the hardware RAID from if they want to continue to use said hardware. This isn’t the case with software RAID, which runs on the common resources of the system and shares them with any other applications or services. While this sharing can be good or bad, it does give users a level of flexibility that the vendor lock-in of hardware RAID can’t match.

    The Various Levels of RAID 

    RAID has many levels. The most popular ones run from 0 up to 6, but there are also ways to hybridize or nest them (nested RAID), which creates concepts like RAID 10 or RAID 100 amongst others. Given that discussing the differences between all of the various ways to hybridize RAID could easily fill several articles, it’s not something that will be discussed here. What will be discussed here are the most common levels of RAID, starting with level 0 and including 1, 5 and 6.

    RAID 0, also in some circles passionately known as “not real RAID”, is arguably the simplest RAID configuration a company or individual could have. It consists of at least two disks, although it could potentially use more than that, and doesn’t really offer much in terms of protection. It just stripes (splits) data evenly across the disks with no redundancy, parity information or fault tolerance.

    Sounds pretty awful, right? So why use it then? The answer is simple enough, speed! RAID 0 is considered the fastest of the RAIDs and if you’re more concerned with performance than security then it could very well be the best option for you or your company. It’s generally used in things like gaming and scientific computation.
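
    To make the striping idea concrete, here’s a minimal Python sketch of how data could be split across disks and read back. The function names, the in-memory “disks” and the tiny chunk size are all assumptions made for illustration, not how any particular RAID engine does it.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int) -> list[bytearray]:
    """Distribute data round-robin across disks in fixed-size chunks (RAID 0 style)."""
    disks = [bytearray() for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        disks[chunk_index % num_disks] += data[offset:offset + chunk_size]
    return disks


def unstripe(disks: list[bytearray], chunk_size: int, total_len: int) -> bytes:
    """Reassemble the original data by reading the chunks back in the same order."""
    out = bytearray()
    num_chunks = -(-total_len // chunk_size)  # ceiling division
    for chunk_index in range(num_chunks):
        disk = chunk_index % len(disks)
        start = (chunk_index // len(disks)) * chunk_size
        out += disks[disk][start:start + chunk_size]
    return bytes(out)


# Hypothetical two-disk array with 2-byte chunks.
disks = stripe(b"ABCDEFGHIJ", num_disks=2, chunk_size=2)
restored = unstripe(disks, chunk_size=2, total_len=10)
```

    Lose either of those two byte arrays and half of the chunks are gone for good, which is exactly the lack of fault tolerance described above.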

    Let’s move on to RAID 1, the first level that many would consider to be “true” RAID. This RAID level consists of two or more disks, although traditionally it has been just two disks, and does something called mirroring, which puts the same data on each disk. This differs from RAID 0 as it doesn’t stripe the data across two disks but rather adds it to one and then adds the same exact data to the second.

    This makes RAID 1 slower than RAID 0, particularly for writes since every write has to land on each disk, but a lot more secure: if anything were to happen to one of the disks, the other would still be there with all the data. The usable space is equal to that of the smallest disk available, i.e. if a 250 GB disk and a 500 GB disk are used then the space available would be 250 GB. This configuration offers no striping, parity or spanning of disk space across multiple disks and is mainly used as a relatively lightweight but secure system for home and small-scale use.
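
    The mirroring and capacity rules above can be sketched in a few lines of Python. The `Mirror` class and `mirror_usable_gb` helper are hypothetical names invented for this example, and the “disks” are just dictionaries in memory.

```python
def mirror_usable_gb(disk_sizes_gb: list[int]) -> int:
    """A mirror can only hold as much as its smallest member."""
    return min(disk_sizes_gb)


class Mirror:
    """Toy RAID 1: every write goes to all disks, a read needs just one healthy disk."""

    def __init__(self, num_disks: int = 2) -> None:
        # Each "disk" maps a block number to the bytes stored there.
        self.disks: list[dict[int, bytes]] = [{} for _ in range(num_disks)]

    def write(self, block: int, data: bytes) -> None:
        for disk in self.disks:
            disk[block] = data

    def read(self, block: int, failed: frozenset[int] = frozenset()) -> bytes:
        for index, disk in enumerate(self.disks):
            if index not in failed:
                return disk[block]
        raise IOError("all mirror members have failed")


usable = mirror_usable_gb([250, 500])   # the 250 GB / 500 GB pair from the text

mirror = Mirror()
mirror.write(7, b"payload")
survivor_copy = mirror.read(7, failed=frozenset({0}))  # disk 0 is gone, disk 1 answers
```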

    Now we move on to RAID 5, or what most people probably think of when they think of RAID (the type that doesn’t include horned helmets). This type of RAID needs at least 3 disks, typically uses block-level striping and distributes its parity across all of the disks.

    RAID 5 provides a proven method of data protection that is trusted across the world. If you’re a bigger business or enterprise then someone at your place of work has probably heard of or might even be using this system right now. That being said, as with everything in life, it’s not without its problems. One example is the notorious RAID 5 write hole, where a crash or power loss between writing the data and writing the matching parity leaves a stripe whose parity no longer agrees with its data.
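
    Single parity of the RAID 5 kind boils down to XOR, and a short Python sketch shows why one lost block can always be rebuilt. The block contents, the four-disk layout and the `parity_disk` rotation are assumptions picked for illustration; real implementations offer several different parity layouts.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk(stripe: int, num_disks: int) -> int:
    """Rotate the parity block across the disks (one simple rotation, for illustration)."""
    return (num_disks - 1 - stripe) % num_disks


# One stripe on a hypothetical 4-disk array: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The disk holding data[1] fails: XOR the surviving blocks with the parity to rebuild it.
rebuilt = xor_blocks([data[0], data[2], parity])

# Updating a single block: new parity = old parity XOR old data XOR new data.
new_block = b"ZZZZ"
new_parity = xor_blocks([parity, data[1], new_block])
```

    If `new_block` reaches its disk but `new_parity` doesn’t (a power cut at the wrong moment, say), the stripe is left inconsistent, which is the write hole in a nutshell.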

    The last RAID level we’ll be discussing is RAID 6. RAID 6 builds on the security and functionality features of RAID 5 by adding a second distributed parity block while looking to keep performance close to that of RAID 5. That being said, RAID 6 is typically slower than RAID 5 in a lot of circumstances, so it’s mostly used in situations where the performance hit isn’t really an issue while security is. This is further highlighted by the fact that RAID 5 can survive only one failed disk, with a second failure meaning the data is lost, while a RAID 6 configuration can survive up to two failed disks.
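
    To compare the four levels side by side, here’s a small Python sketch that works out usable capacity and the number of disk failures each level tolerates. It assumes equally sized disks, and `raid_summary` is a made-up helper for this article rather than part of any tool.

```python
def raid_summary(level: int, num_disks: int, disk_size_gb: int) -> tuple[int, int]:
    """Return (usable capacity in GB, disk failures tolerated) for equally sized disks."""
    minimum_disks = {0: 2, 1: 2, 5: 3, 6: 4}
    if level not in minimum_disks:
        raise ValueError(f"RAID {level} is not covered by this sketch")
    if num_disks < minimum_disks[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_disks[level]} disks")

    if level == 0:   # striping only: all the space, no protection
        return num_disks * disk_size_gb, 0
    if level == 1:   # mirroring: one disk's worth of space, all but one disk may fail
        return disk_size_gb, num_disks - 1
    if level == 5:   # single parity: one disk's worth of space goes to parity
        return (num_disks - 1) * disk_size_gb, 1
    return (num_disks - 2) * disk_size_gb, 2   # RAID 6: two disks' worth of parity


# Four hypothetical 500 GB disks under each level.
for level in (0, 1, 5, 6):
    usable_gb, failures = raid_summary(level, num_disks=4, disk_size_gb=500)
    print(f"RAID {level}: {usable_gb} GB usable, survives {failures} failed disk(s)")
```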

    The Wave of Sophisticated RAIDZ

    So how does all this affect Open-E JovianDSS and the ZFS system it’s based on? Well, ZFS and Open-E JovianDSS don’t use traditional RAID but they do use a somewhat similar configuration called RAIDZ. So what is RAIDZ and how does it differ from traditional RAID?  

    RAIDZ was designed specifically for use with ZFS and differs from RAID in several ways. First of all, there’s a fundamental difference in RAIDZ’s technical design compared to RAID. RAIDZ is designed in a way that allows it to not only know what disks are being used for storage but also what the data blocks on those disks are. Regular RAID, on the other hand, has to use some other file system on top of it in order to manage the files. 

    So what does this mean for RAIDZ in terms of performance? Well, for one, restoring data is generally quicker with RAIDZ whenever there’s a large amount of free space on a disk as RAIDZ can check which data needs to be restored and from which disks whereas with RAID, the entirety of the disk would have to be restored including all the free space. Going further in-depth here, with RAID, the block and parity patterns remain regular throughout.

    In stark contrast to this, the block size and parity patterns in RAIDZ are not regular. The blocks don’t have a set size and the patterns can change depending on how data is written onto the disk. As such, RAIDZ generally has to do more lookups than typical RAID in order to determine what information is where. This sophistication increases computational requirements but also allows RAIDZ to manage data in a way that’s more efficient than traditional RAID which, in turn, allows processes like data restoration to really shine.

    There are some other key technical differences between RAIDZ and RAID as well. One of these is the way the data is actually striped, with RAIDZ typically trying to stripe the data across as many disks as possible and limiting the amount of information that’s ever only stored on one disk to the size of a sector, whereas in regular RAID it could be a whole block.

    Another difference is the amount of overhead that RAIDZ generally uses compared to RAID, with traditional RAID usually taking a disk for parity while RAIDZ typically requires slightly more. In pure numbers, RAID usually takes about 33 percent overhead while RAIDZ could take anywhere from 33 percent to, in worst case scenarios, 100 percent overhead. This is done for various reasons, greater performance and security being two typical ones.

    The last significant way that RAIDZ differs technically from RAID is the way that it goes about its read-modify-write cycles. With RAIDZ, every write is a complete stripe of its own, so the parity is computed from the new data alone and nothing has to be read back first. With regular RAID, on the other hand, a write that covers only part of a stripe forces the RAID controller to read existing blocks in that stripe before it can accurately compute the new parity block.
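
    The overhead range quoted above is easier to see with some arithmetic. The Python sketch below is a simplified model of how much space a RAIDZ block occupies, patterned on the allocation-size calculation in OpenZFS: parity sectors are added per row of data, and the total is rounded up to a multiple of (parity + 1). The function names and the four-disk, single-parity example are assumptions for illustration, with overhead measured against the data stored, which is one reading that reproduces the 33 and 100 percent figures.

```python
def raidz_allocated_sectors(data_sectors: int, num_disks: int, parity: int) -> int:
    """Sectors a block occupies on a RAIDZ vdev, parity and padding included."""
    data_columns = num_disks - parity
    rows = -(-data_sectors // data_columns)        # ceiling division
    total = data_sectors + parity * rows           # each row carries its own parity
    multiple = parity + 1                          # padding avoids unusably small gaps
    return -(-total // multiple) * multiple


def overhead_percent(data_sectors: int, num_disks: int, parity: int) -> float:
    """Extra space used, as a percentage of the data actually stored."""
    allocated = raidz_allocated_sectors(data_sectors, num_disks, parity)
    return 100 * (allocated - data_sectors) / data_sectors


# Hypothetical 4-disk, single-parity (RAIDZ1) vdev.
for sectors in (1, 2, 3, 32):
    pct = overhead_percent(sectors, num_disks=4, parity=1)
    print(f"{sectors:>2} data sector(s): {pct:.1f}% overhead")
```

    In this model, a block that fills exactly one row (three data sectors plus one parity sector) costs the same as a traditional four-disk RAID 5 stripe, while a one-sector block needs a full parity sector of its own and so doubles in size.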

    In regards to RAIDZ levels, there are currently three common ones. RAIDZ, or RAIDZ 1 as it’s also known, is essentially RAID 5 and offers single parity like RAID 5. RAIDZ 2 offers dual parity and is most similar to RAID 6. RAIDZ 3 offers triple parity and doesn’t really have an equivalent in the regular RAID configurations that are available today.

    So why use RAIDZ? Well RAIDZ, or rather, some of the sophisticated mechanisms it’s intertwined with, like self-healing, deduplication, and compression, helps solve some pretty common problems that RAID has like the previously mentioned notorious write hole. It’s also just a generally faster, safer system than most traditional RAID setups if you’re using modern hardware and have certain options like compression enabled. 

    The downside of RAIDZ in comparison to traditional RAID is also its upside in a different sense, and that’s the aforementioned sophistication. While its internal sophistication allows it to do some things faster and more securely than traditional RAID, it also sometimes requires much more skill to determine what’s happening with the system and to repair it if something, albeit rarely, does actually go wrong. 

    This is a problem that Open-E JovianDSS addresses by adding a GUI as well as a host of additional mechanisms to make using RAIDZ just as simple as traditional RAID. At the moment, the Open-E JovianDSS GUI and underlying processes allow a user to create a RAIDZ pool in moments just by clicking a few options, thereby minimizing many of the downsides of the underlying sophistication.

    dRAID: A Very Different Type of RAIDZ

    And then there’s dRAID, which we’re currently working on implementing into Open-E JovianDSS. So, what is dRAID and why is it so special? According to OpenZFS, “dRAID is a variant of raidz that provides integrated distributed hot spares which allows for faster resilvering while retaining the benefits of raidz”. It’s special in that it introduces a concept called “parity declustering” by constructing vdevs from multiple internal RAIDZ groups that all have their own internal P parity devices and D data devices. It also uses a fixed stripe width, which allows it to resilver sequentially, along with carefully chosen precomputed permutation maps. The latter allow the rebuild I/O to be distributed evenly among the surviving drives should a disk fail.
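
    To get a feel for why shuffling the layout helps, here’s a toy Python model that counts how many reads each surviving disk has to serve when one disk fails. It is only a sketch: it uses a seeded random shuffle per row instead of the precomputed permutation maps that dRAID actually uses, and the 11-disk layout (two groups of 4 data + 1 parity plus one spare slot) and the `rebuild_reads` name are assumptions made for this example.

```python
import random
from collections import Counter


def rebuild_reads(num_disks: int, group_width: int, rows: int,
                  failed: int, permute: bool, seed: int = 0) -> Counter:
    """Count the reads each surviving disk serves while rebuilding the failed disk."""
    rng = random.Random(seed)
    usable = num_disks - num_disks % group_width   # leftover slots act as spare space
    reads: Counter = Counter()
    for _ in range(rows):
        order = list(range(num_disks))
        if permute:
            rng.shuffle(order)                     # stand-in for a permutation map
        for start in range(0, usable, group_width):
            group = order[start:start + group_width]
            if failed in group:
                # Rebuilding the lost block needs every other member of its group.
                reads.update(disk for disk in group if disk != failed)
    return reads


fixed = rebuild_reads(11, 5, rows=1000, failed=0, permute=False)
spread = rebuild_reads(11, 5, rows=1000, failed=0, permute=True)

print("fixed groups: ", sorted(fixed.items()))
print("permuted rows:", sorted(spread.items()))
```

    With fixed groups, the same four disks absorb every rebuild read, whereas with a different permutation per row the work is shared among all ten survivors, so no single disk becomes the bottleneck.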

    So what are the benefits of using dRAID? There are several, including having access to an integrated spare that’s fast as well as distributed (which in turn allows it to use both sequential and traditional healing resilvers), and having a fast pool creation process that also includes mapping that can’t be lost or damaged due to the way the pool is created using the aforementioned precomputed permutation maps. The fact that dRAID has access to both sequential and traditional healing resilvers means that any time spent resilvering is significantly shorter than it would be using just a traditional hot spare. The one trade-off to keep in mind is that the relatively large allocation size that comes with the fixed stripe width can reduce the effective compression ratio.

    The Wrap Up

    In closing, this is generally what RAID is and how the different levels of RAID function. If you’d like to know more, the links generally go to our RAID guide section which is a bit more technically detailed so feel free to check it out.

    Now it’s your turn. What do you think: should RAID 0 be considered RAID? Is hardware RAID better than software RAID? Is RAIDZ more sophisticated than regular RAID? Did Viking helmets have horns? Let us know in the comments below!
