Were you ever in the middle of saving some changes when the power went out? Did you lose both the changes and the original version? This typically happens when the data is being saved via a read-modify-write (RMW) operation and something goes wrong. Before getting into why this happens, let’s quickly review what a read-modify-write operation actually is.
A read-modify-write operation typically reads the data, modifies it by making the requested changes and then writes over the original, applying said changes. This operation is great as it allows changes to be made to all sorts of data, big or small. That being said, it’s not without its downsides.
The major downside is that if the operation is interrupted for any reason while the changes are being written, both the new data and the original data can be corrupted or lost.
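To make the failure mode concrete, here is a minimal Python sketch of an in-place read-modify-write that gets interrupted. It is an illustration only: the `PowerLoss` exception, the `rmw_in_place` function and the `fail_after` parameter are hypothetical names invented for this example, and a `bytearray` stands in for a block on disk.

```python
from typing import Optional


class PowerLoss(Exception):
    """Hypothetical stand-in for an interruption such as a power cut."""


def rmw_in_place(block: bytearray, new_data: bytes,
                 fail_after: Optional[int] = None) -> None:
    """Overwrite the block byte by byte, optionally failing part-way through."""
    for i, byte in enumerate(new_data):
        if fail_after is not None and i == fail_after:
            raise PowerLoss("interrupted mid-write")
        block[i] = byte  # overwrites the only copy of the original


block = bytearray(b"ORIGINAL")
try:
    # Simulate losing power after four bytes have been written.
    rmw_in_place(block, b"MODIFIED", fail_after=4)
except PowerLoss:
    pass

# The block is now a mix of new and old bytes: neither version survives.
print(bytes(block))
```

Because the write happens at the exact place where the original is stored, the interrupted block matches neither the old nor the new version.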
Open-E JovianDSS has several features in place that prevent data loss like this from occurring. These features include things like atomic transactions and advanced ways of performing write operations, which we’ll be discussing in this article.
Solving the RMW Problem: Atomic Transactions
Atomic transactions are a relatively straightforward way that Open-E JovianDSS helps ensure your company’s data integrity. An atomic transaction either completes in full or doesn’t complete at all, in which case the system keeps using the data from its previous state as though no transaction ever took place. Atomicity means the system verifies that every transaction was fully completed before allowing its changes to take effect.
This is important as it eliminates a plethora of problems that could otherwise occur if transactions weren’t atomic. These problems include lost or uncommitted updates, inconsistent analyses and transactions that prevent other transactions from functioning properly.
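The all-or-nothing behavior can be sketched in a few lines of Python. This is a simplified illustration, not how Open-E JovianDSS or ZFS implements transactions: the `Store` class and its `transaction` method are hypothetical, and an in-memory dictionary stands in for stored data.

```python
class Store:
    """Hypothetical in-memory store with all-or-nothing updates."""

    def __init__(self, data: dict):
        self.committed = dict(data)

    def transaction(self, updates: list) -> bool:
        """Apply every (key, function) update, or none of them."""
        staged = dict(self.committed)  # work on a private copy
        try:
            for key, fn in updates:
                staged[key] = fn(staged[key])
        except Exception:
            # Any failure discards the staged copy; committed data is untouched.
            return False
        self.committed = staged  # single switch to the new state
        return True


store = Store({"a": 1, "b": 2})

# The second step divides by zero, so the whole transaction is abandoned.
failed = store.transaction([("a", lambda v: v + 10), ("b", lambda v: v / 0)])

# A transaction whose steps all succeed is committed in full.
succeeded = store.transaction([("a", lambda v: v + 10)])
```

After the failed attempt the store still holds its previous state, including the value of "a" that had already been changed in the staged copy. Only the second, fully successful transaction becomes visible.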
So how can we ensure that all the transactions are atomic? Well, in our case it’s done by using redirect-on-write, a variant of copy-on-write that both Open-E JovianDSS and ZFS use. Let’s start by explaining what copy-on-write is generally.
Solving the RMW Problem: Copy-on-Write
So what is an advanced write technology like copy-on-write, generally speaking? Traditionally, copy-on-write worked, and in a lot of systems still works, much like read-modify-write during the first few stages. Both operations read the data, but when a request that uses copy-on-write occurs, the system first copies the original data to a different data block before proceeding to the modify and write portions of the operation.
Therefore, copy-on-write reads the data and performs two writes: one to copy the original data to a different block, and a second to change the original to the new version. If programmed that way, the writes can also use the aforementioned atomic transaction process to limit data corruption and ensure data integrity.
This differentiates it from the original read-modify-write operation where no copy of the original data is made and the original is modified directly at the exact place it’s stored. This is why both the original and new data are lost should a disruption occur while the read-modify-write operation is applying changes to the original copy. The same isn’t true if using copy-on-write due to the original data already being safe and secure in another block before the changes start to take effect. That is copy-on-write in its “purest”, or some would say, most generally understood, form.
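The two-write sequence described above can be sketched as follows. This is a toy model under stated assumptions, not LVM’s actual implementation: the `cow_update` and `recover` functions, the `PowerLoss` exception and the block numbering are all hypothetical, and each `bytearray` stands in for one block on disk.

```python
from typing import Optional


class PowerLoss(Exception):
    """Hypothetical stand-in for an interruption such as a power cut."""


def cow_update(blocks: dict, src: int, spare: int, new_data: bytes,
               fail_after: Optional[int] = None) -> None:
    """Traditional copy-on-write: preserve the original, then overwrite it."""
    blocks[spare][:] = blocks[src]  # write 1: copy the original elsewhere
    for i, byte in enumerate(new_data):  # write 2: modify the original block
        if fail_after is not None and i == fail_after:
            raise PowerLoss("interrupted mid-write")
        blocks[src][i] = byte


def recover(blocks: dict, src: int, spare: int) -> None:
    """Restore the original from the preserved copy."""
    blocks[src][:] = blocks[spare]


blocks = {0: bytearray(b"ORIGINAL"), 1: bytearray(8)}
try:
    cow_update(blocks, src=0, spare=1, new_data=b"MODIFIED", fail_after=4)
except PowerLoss:
    damaged = bytes(blocks[0])  # block 0 is half-written at this point
    recover(blocks, src=0, spare=1)

print(bytes(blocks[0]))
```

Even though the second write was interrupted, the copy made by the first write lets the original be restored.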
Traditional copy-on-write is still prevalent in the Logical Volume Manager (LVM), a volume manager found in a lot of Linux and Unix systems in use today. It should be noted, though, that not all LVM versions still use the traditional copy-on-write technique, as even LVM has started to transition to redirect-on-write.
The Rise of Redirect-on-Write
So at this point you might be thinking, “Well, if that is copy-on-write, then what’s this redirect-on-write that both Open-E JovianDSS and ZFS use?”
Well, redirect-on-write happens to be another prominent variant of copy-on-write that has even occasionally eclipsed the most common one. So how do the two differ? Instead of copying the original and sending the copy to a different block, redirect-on-write leaves the original untouched in its block. The changes are stored in a different block, and a pointer system records exactly where the changes and the original are so that the correct data can be shown upon request. The untouched original is also what the system falls back on if an atomic transaction fails verification.
This, in effect, means that the operation is even further streamlined, requiring only one write operation to traditional CoW’s two, and that the original copy is potentially even more secure than with traditional copy-on-write. This is one of the reasons why redirect-on-write is gaining ground on traditional copy-on-write, or being used alongside it in hybrid configurations that aim to combine the strengths of both technologies.
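A rough Python sketch of the pointer-based approach is shown below. It is a deliberately simplified model and should not be read as how ZFS or Open-E JovianDSS lays out data: the `RowVolume` class, its single `pointer` field and the `fail_before_commit` flag are hypothetical, introduced only to show the single data write and the pointer switch.

```python
class PowerLoss(Exception):
    """Hypothetical stand-in for an interruption such as a power cut."""


class RowVolume:
    """Toy redirect-on-write volume: new data goes to a new block."""

    def __init__(self, data: bytes):
        self.blocks = {0: data}
        self.pointer = 0    # which block holds the live version
        self.next_free = 1  # next unused block number

    def update(self, new_data: bytes, fail_before_commit: bool = False) -> None:
        target = self.next_free
        self.next_free += 1
        self.blocks[target] = new_data  # the single data write
        if fail_before_commit:
            raise PowerLoss("interrupted before the pointer was switched")
        self.pointer = target  # redirect reads to the new block

    def read(self) -> bytes:
        return self.blocks[self.pointer]


vol = RowVolume(b"ORIGINAL")
try:
    vol.update(b"MODIFIED", fail_before_commit=True)
except PowerLoss:
    pass
after_failure = vol.read()  # the pointer still references the original

vol.update(b"MODIFIED")
after_success = vol.read()  # the pointer now references the new block
```

The original block is never overwritten in this model, so an interrupted update simply leaves the pointer where it was. The sketch also hints at the fragmentation issue: every update lands in a new block rather than next to the data it replaces.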
We experienced this ourselves at Open-E when we switched from the more traditional CoW method used in Open-E DSS V7, which uses an LVM-based volume manager, to a more RoW-based method in our newest product, Open-E JovianDSS, which is based on ZFS. That being said, redirect-on-write isn’t perfect, and it does have issues with things like data fragmentation. These issues are currently being addressed by the ZFS file system that Open-E JovianDSS is based on. In the end, it really is up to the individual to decide which CoW variant they’d prefer.
The Wrap Up
In any case, by combining atomicity with software that maximizes the benefits of copy-on-write whilst minimizing its downsides, Open-E JovianDSS can ensure that your company’s data will be as safe as can be should something unexpected happen while it’s being updated. So what do you think? Are atomic transactions the way of the future? Can RoW and CoW be used interchangeably? What’s the most important document you’ve ever lost due to a failed RMW operation? Let us know in the comments below!