
    How Do ZFS Snapshots Really Work?

    Before discussing how a snapshot works in Open-E JovianDSS, we should start with its definition.

    According to Wikipedia: “In computer systems, a snapshot is the state of a system at a particular point in time. The term was coined as an analogy to that in photography. It can refer to an actual copy of the state of a system or to a capability provided by certain systems.”

    Put simply, a snapshot is a copy of the state of a system at a particular moment. Don’t confuse a snapshot with a screenshot. 🙂 Snapshots are read-only and, at least initially, occupy very little disk space: each one takes up only a fraction of the size of the original data set. A snapshot consists mostly of metadata and pointers that refer to data already existing on the volume, and more of them accumulate with every new snapshot taken.

    With that said, let’s highlight the main questions about this functionality from the description above and discuss them one by one:

    • When should you take a snapshot?
    • Why does it contain metadata?
    • How do we restore data from a snapshot?

    Creating a Snapshot

    The answer to the question of when to create a snapshot is quite simple: at any moment you want. With Open-E JovianDSS, you can choose between a retention plan (which automatically creates and removes snapshots) and manual snapshot creation. The latter may appear to be the more straightforward option, given that it’s just a click of a button, but you still have to remember to actually do it. With a retention plan, the system automatically creates snapshots as often as your business requires (as frequently as once per minute). You can also set how long these snapshots are stored. After this period elapses, they are automatically removed, making space for new ones.
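
    The logic of a retention plan can be sketched in a few lines of Python. This is only an illustration of the idea described above, not how Open-E JovianDSS implements it: the function name `apply_retention` and its parameters are hypothetical, and snapshots are modeled simply as creation timestamps.

```python
from datetime import datetime, timedelta


def apply_retention(snapshots, now, interval, keep_for):
    """One scheduler tick of a hypothetical retention plan.

    snapshots: creation times of the existing snapshots.
    interval:  how often a new snapshot should be taken.
    keep_for:  how long a snapshot is stored before removal.
    Returns (kept, expired, took_new).
    """
    # Snapshots older than the retention period are removed.
    expired = [t for t in snapshots if now - t > keep_for]
    kept = [t for t in snapshots if now - t <= keep_for]

    # A new snapshot is due if none exists or the newest one is old enough.
    newest = max(snapshots) if snapshots else None
    took_new = newest is None or now - newest >= interval
    if took_new:
        kept.append(now)
    return sorted(kept), sorted(expired), took_new


now = datetime(2024, 1, 1, 12, 0)
existing = [now - timedelta(hours=h) for h in (1, 5, 30)]
kept, expired, took_new = apply_retention(
    existing, now, interval=timedelta(hours=1), keep_for=timedelta(hours=24)
)
print(len(kept), len(expired), took_new)  # 3 1 True
```

    In this example, the 30-hour-old snapshot exceeds the 24-hour retention period and is dropped, while a new snapshot is taken because the newest one is already an hour old.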

    The Importance of Metadata 

    As we’ve already mentioned, all snapshots contain metadata. The snapshot records the current locations of all the relevant pieces of data on a disk and keeps those blocks reserved for potential rollbacks. How does this work?

    Well, first of all, the pieces of data are scattered across the disk in an unpredictable order. When a snapshot is created, it records the locations of the data and freezes them, preventing any further modifications to those blocks. This is achieved through Copy-on-Write (CoW), one of the ZFS features that Open-E JovianDSS utilizes. With CoW, instead of overwriting the original data in place, the system writes the modified version to a different block and leaves the original block untouched. This way, the live data can be modified without changing the data referenced by the snapshot.
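
    To make the mechanism more tangible, here is a toy Python model of a copy-on-write volume. It is a deliberately simplified sketch and not ZFS code: the `Volume` class and its methods are hypothetical, blocks are plain dictionary entries, and unreferenced blocks are never freed.

```python
class Volume:
    """Toy copy-on-write volume (illustrative only, not how ZFS is implemented)."""

    def __init__(self):
        self.physical = {}    # physical block id -> data
        self.pointers = {}    # logical block number -> physical block id
        self.snapshots = {}   # snapshot name -> frozen copy of the pointer table
        self._next_id = 0

    def write(self, logical, data):
        # Copy-on-write: never overwrite in place, always use a new physical block.
        block_id = self._next_id
        self._next_id += 1
        self.physical[block_id] = data
        self.pointers[logical] = block_id

    def read(self, logical):
        return self.physical[self.pointers[logical]]

    def snapshot(self, name):
        # Only the pointers are copied; no data block is duplicated.
        self.snapshots[name] = dict(self.pointers)

    def read_snapshot(self, name, logical):
        return self.physical[self.snapshots[name][logical]]


vol = Volume()
vol.write(0, "invoice v1")
vol.write(1, "report v1")
vol.snapshot("monday")
vol.write(0, "invoice v2")  # lands in a new physical block

print(vol.read(0))                     # invoice v2
print(vol.read_snapshot("monday", 0))  # invoice v1
print(len(vol.physical))               # 3
```

    The snapshot costs only a copy of the pointer table, yet the old version of block 0 stays readable because the later write went to a different physical block.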

    This is why you can restore the data to a previous state without significant consequences. This can even be done instantly, no matter which modifications were made, whether it was a simple data removal or something more complex, such as encryption by ransomware. Storing snapshots also takes less capacity on the server than a regular data backup. However, please do remember that a snapshot itself is not a backup!

    With all that said, it is worth keeping in mind that all the snapshots are interconnected. The metadata in each snapshot is linked to older states of the data, which were recorded in preceding snapshots. Thus, when the retention period ends and it comes time to erase the oldest snapshot, the subsequent ones have to take over the data they still share with it; only the blocks that no other snapshot refers to can actually be freed.

    It is essential to remember that even a snapshot taken without any changes to the data requires space on the disk for its metadata. It’s true that a snapshot is light and does not occupy much capacity, but what if you take tens of thousands (or more) of those lightweight snapshots? If your retention plan takes snapshots very frequently, they can start to clog up your server’s capacity. And if there are any changes to the data, the volume must keep both the old version (preserved by a snapshot) and the new version on the disk.
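
    A rough back-of-the-envelope estimate shows how this adds up. The sketch below is only arithmetic: the function `snapshot_overhead_bytes` is hypothetical, and the figures of 32 KiB of metadata per snapshot and 5 MiB of changed data per interval are made-up assumptions, not measurements of Open-E JovianDSS or ZFS.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def snapshot_overhead_bytes(count, metadata_bytes, changed_bytes_per_interval):
    """Rough estimate of the space held by `count` snapshots (assumed figures)."""
    # Every snapshot carries its own metadata, even if nothing changed.
    metadata_total = count * metadata_bytes
    # Data overwritten between two snapshots stays on disk in its old version.
    retained_old_data = count * changed_bytes_per_interval
    return metadata_total + retained_old_data


# 20,000 snapshots of an idle volume vs. a volume changing 5 MiB per interval.
idle = snapshot_overhead_bytes(20_000, 32 * KIB, 0)
busy = snapshot_overhead_bytes(20_000, 32 * KIB, 5 * MIB)
print(f"{idle / GIB:.2f} GiB")  # 0.61 GiB
print(f"{busy / GIB:.2f} GiB")  # 98.27 GiB
```

    Under these assumed numbers, metadata alone stays well under a gigabyte, while the old versions of changed data quickly dominate the total.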

    Data Restoration From a Snapshot (Clones & Rollbacks)

    There are two main questions users typically have about recording data using a snapshot. The first is, how do we reinstate the data? The second is, how do we modify the information afterwards? These are especially pressing if the user knows that the snapshot, as claimed, is read-only and cannot be modified.

    There are two useful functionalities that let you restore a specific state of the data and make the data from the snapshot writable again: doing a rollback and creating a clone.

    The rollback functionality, which we’ve mentioned above, restores the original volume to the data state recorded in a particular snapshot, undoing the changes made through unwanted updates or other unfortunate events. Those events could be anything from a minor data removal to a major ransomware attack (which is also just a modification of the data and is therefore reversible).

    The clone is a writable version of the volume state preserved in the metadata of the snapshot. When you choose to make a clone from a snapshot, the process creates a writable copy of the snapshot’s metadata that then allows for further modifications to be made to the data. You can make updates to the cloned data and the new data will start being recorded on a new data block allocated specifically to the clone (without any modifications to the original snapshot!). Should you then opt to delete something, the clone will just remove its reference to the original snapshot’s block (without any changes to the snapshot or original volume).

    The clone is dependent on the snapshot that was used to create it, which is why you cannot delete the original snapshot without first deleting the clone. To break the connection between the snapshot, the clone, and the data itself in Open-E JovianDSS, you can manually copy the data from the clone to a new volume. After this is done, the copied data will not be affected by the removal of the original snapshot or the clone.
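
    The relationship between rollbacks, clones, and their origin snapshot can be sketched with another toy Python model. Again, this is a simplified illustration under our own assumptions rather than ZFS or Open-E JovianDSS code: the `Dataset` class and its methods are hypothetical, and a block is represented by its content.

```python
class Dataset:
    """Toy dataset mapping logical block numbers to shared data blocks."""

    def __init__(self, pointers=None, origin=None):
        self.pointers = dict(pointers or {})
        self.origin = origin   # snapshot name a clone was created from
        self.snapshots = {}    # snapshot name -> frozen pointer table
        self.clones = {}       # snapshot name -> list of dependent clones

    def write(self, logical, data):
        self.pointers[logical] = data     # snapshots keep their old pointer

    def delete(self, logical):
        self.pointers.pop(logical, None)  # drops only this dataset's reference

    def snapshot(self, name):
        self.snapshots[name] = dict(self.pointers)
        self.clones[name] = []

    def rollback(self, name):
        self.pointers = dict(self.snapshots[name])

    def clone(self, name):
        child = Dataset(self.snapshots[name], origin=name)
        self.clones[name].append(child)
        return child

    def destroy_snapshot(self, name):
        if self.clones[name]:
            raise RuntimeError(f"snapshot {name!r} has dependent clones")
        del self.snapshots[name]
        del self.clones[name]


vol = Dataset()
vol.write(0, "payroll.xlsx")
vol.snapshot("before-attack")
vol.write(0, "payroll.xlsx.encrypted")  # e.g. a ransomware modification
vol.rollback("before-attack")
print(vol.pointers[0])                  # payroll.xlsx

clone = vol.clone("before-attack")
clone.write(1, "notes.txt")             # new block belongs to the clone only
clone.delete(0)                         # snapshot and volume are unaffected
print(vol.snapshots["before-attack"])   # {0: 'payroll.xlsx'}

try:
    vol.destroy_snapshot("before-attack")
except RuntimeError as err:
    print(err)                          # snapshot 'before-attack' has dependent clones

independent = Dataset(clone.pointers)      # manual copy to a new volume
vol.clones["before-attack"].remove(clone)  # delete the clone ...
vol.destroy_snapshot("before-attack")      # ... then the snapshot can go
print(independent.pointers)                # {1: 'notes.txt'}
```

    The last four lines mirror the manual copy described above: once the data lives in a volume of its own, removing the clone and the snapshot no longer affects it.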

    The snapshot feature of Open-E JovianDSS is an essential piece of any data protection plan. Snapshots also support data integrity by keeping a regular and consistent record of all the changes made to the data set.

    Ensure the safety of your business continuity and download the free trial today!

    Fun Fact

    The definition of a snapshot appeared as early as 30 years ago, and it looked like this:

    Snapshot – a dump usually of a selected area of storage taken at specified times during the execution of a routine, thereby providing a time history of this section of storage for debugging purposes (Macmillan International, 1991).

    Can you spot the differences? Share your thoughts in the comments!

    Macmillan International. (1991). Macmillan Dictionary. In Macmillan Dictionary of Data Communications. (2d ed., p. 457). Charles J. Sippl
