    The Magic of Failover, or How to Avoid System Downtime

    While every data storage provider is concerned with protecting data through encryption and vividly describes ransomware, malware, and other kinds of hacker attacks, we tend to forget about the disasters that have nothing to do with attackers. By these, we mean downtime caused by fires, floods, and ordinary system failures, which can happen regardless of how secure your solution is. Let’s look at some statistics:

    • The AFR – annualized failure rate – for hard drives has risen in the past three years:

    So what can your company do to ensure the constant availability of the data stored on a server? The failover functionality of Open-E JovianDSS, available in cluster configurations, gives you a way to avoid downtime in your workloads.

    In this article, we will explain failover in simple terms by answering two essential questions:

    • What is failover?
    • When is failover possible?

    Let’s get deeper into the topic.

    What is failover?

    Before illustrating it with a real-life example, let’s look at a more technical definition of failover. Here’s one from TechTarget:

    • Failover is a backup operational mode in which the functions of a system component are assumed by a secondary component when the primary becomes unavailable. An organization can fail over either after failure or during scheduled down time.
    • Failover is an integral part of mission-critical systems and often a key component of disaster recovery.

    Well, it doesn’t sound so simple, right? Let us explain it with an example that is easy to imagine. Assume you keep the memorable moments of your life in the camera roll of your cellphone, so you can browse the photos at any time. In addition to that, you have a backup copy of those pictures and videos on your computer. This way, if your cellphone breaks, falls into water, or suffers anything else you would not wish on your device, you will still have a copy of the camera roll and can access it on the computer. Moreover, when you repair the phone or buy a new one, you can copy the camera roll back onto it.

    Failover works similarly. It allows you to switch the activity from one node to the other in case of a failure, and back again once the failure is fixed. To bring the failed node up to date with the data stored on the other one, the system initiates a process called resilvering, which compares the state of both systems and adjusts the outdated one to the latest version. To be more precise: all data is ultimately binary, so your data storage solution can compute checksums over it, compare those checksums on both nodes (checksumming), and run the resilvering process to make the data stored on both systems identical.
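
    The checksum comparison described above can be sketched in a few lines. This is only a toy model under stated assumptions, not the way Open-E JovianDSS implements resilvering: the names `block_checksums` and `resync`, the 4-byte block size, and the choice of SHA-256 are all hypothetical and picked for readability.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical block size, tiny on purpose so the example is easy to follow


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one SHA-256 hex digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def resync(source: bytes, target: bytes,
           block_size: int = BLOCK_SIZE) -> tuple[bytes, list[int]]:
    """Bring target up to date with source, copying only blocks whose checksums differ.

    Returns the repaired target and the indices of the blocks that were copied.
    """
    src_sums = block_checksums(source, block_size)
    dst_sums = block_checksums(target, block_size)
    repaired = bytearray()
    copied = []
    for idx, src_sum in enumerate(src_sums):
        start = idx * block_size
        if idx < len(dst_sums) and dst_sums[idx] == src_sum:
            # Checksums match: the block on the returning node is already current.
            repaired += target[start:start + block_size]
        else:
            # Checksums differ or the block is missing: copy it from the healthy node.
            repaired += source[start:start + block_size]
            copied.append(idx)
    return bytes(repaired), copied


healthy_node = b"AAAABBBBCCCCDDDD"   # node that stayed online
returning_node = b"AAAAxxxxCCCC"     # node that missed some writes while it was down
repaired, copied_blocks = resync(healthy_node, returning_node)
print(repaired == healthy_node, copied_blocks)
```

    In this sketch only the stale second block and the missing fourth block are copied, which is the point of comparing checksums first: blocks that already match do not have to be transferred again.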

    When is failover possible?

    As you have probably guessed from the description above, failover takes two nodes or, as a data storage specialist would say, a cluster. However, if you are considering failover in a “cluster-in-a-box” configuration, remember that both nodes are housed within the same system, so it will not save you from a power failure of the server, a flood, or a fire (although if the disks fail, failover will be your best shot).

    In Open-E JovianDSS, there are several cluster options that can ensure the constant availability of data: shared storage clusters that use the Fibre Channel protocol (the link can be up to 300 km long, provided the low-latency requirement is met) and non-shared storage clusters that use Ethernet connections. Thanks to such long distances, the second node in the cluster can be located in a different building or even a different city (in general terms, an off-site location). That is why Open-E JovianDSS On- and Off-site Data Protection can keep your data protected and available no matter what kind of hardware failure or natural disaster takes place.

    As for data replication, there are alternative cluster configurations. In Open-E JovianDSS, you can choose between two options: active-active and active-passive clusters. The terms speak for themselves. In the active-active configuration, both nodes work as production nodes and balance the load of the data being written, which speeds up reads and writes because both parts of the cluster do active work. In the active-passive configuration, one node serves as a “redundant medium” for the other, simply copying the state of the active node.
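
    The difference between the two modes can be illustrated with a toy request router. This is a simplified sketch, not a description of how Open-E JovianDSS distributes load: the function `route_requests`, the node names, and the round-robin policy are assumptions made purely for illustration.

```python
from itertools import cycle


def route_requests(mode: str, healthy: dict[str, bool], n_requests: int) -> list[str]:
    """Return the node that would serve each of n_requests in the given cluster mode.

    healthy maps node names to their health state; the first healthy node in the
    mapping is treated as the active one in active-passive mode.
    """
    nodes = [name for name, ok in healthy.items() if ok]
    if not nodes:
        raise RuntimeError("no healthy node available")
    if mode == "active-passive":
        # A single node serves everything; the other one only mirrors its state.
        return [nodes[0]] * n_requests
    if mode == "active-active":
        # Every healthy node serves traffic; round-robin is a hypothetical policy.
        picker = cycle(nodes)
        return [next(picker) for _ in range(n_requests)]
    raise ValueError(f"unknown mode: {mode}")


both_up = {"node-a": True, "node-b": True}
a_failed = {"node-a": False, "node-b": True}

print(route_requests("active-passive", both_up, 4))
print(route_requests("active-active", both_up, 4))
print(route_requests("active-active", a_failed, 4))
```

    With both nodes healthy, the active-passive call sends every request to `node-a`, while the active-active call alternates between the two nodes. Once `node-a` is marked as failed, both modes fall back to `node-b`, which is the failover case.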

    Other key features of the Open-E JovianDSS High-Availability Cluster are:

    • Multiple Protocols – the cluster supports SMB, NFS, iSCSI, and FC protocols.
    • Multiple Storage Types – it supports two server nodes connected to a single storage, and one or more JBODs. It also supports SAS and FC SSD/HDD disks.
    • Advanced Cluster Management Software that enables quick access to all features related to the cluster setup.
    • The Independent Virtual IP (VIP) Feature creates a connection to the data that is available regardless of which node is active at a given time.
    • Automatic Failover – high availability is achieved by detecting hardware failures and automatically moving the VIP from the primary to the secondary node without the client servers noticing a timeout.
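
    The combination of a virtual IP and automatic failover from the list above can be sketched as a small heartbeat monitor. This is a minimal model under stated assumptions, not the Open-E JovianDSS implementation: the class `FailoverMonitor`, the `max_missed` threshold, and the node names are hypothetical, and moving the VIP back after a repair is deliberately left out.

```python
class FailoverMonitor:
    """Toy model: move a virtual IP to the secondary node after missed heartbeats."""

    def __init__(self, primary: str, secondary: str, max_missed: int = 3) -> None:
        self.primary = primary
        self.secondary = secondary
        self.max_missed = max_missed   # hypothetical threshold before failing over
        self.missed = 0                # consecutive heartbeats missed by the primary
        self.vip_owner = primary       # node that currently answers on the VIP

    def heartbeat(self, primary_alive: bool) -> str:
        """Record one heartbeat interval and return the node that owns the VIP."""
        if primary_alive:
            self.missed = 0
        else:
            self.missed += 1
            if self.missed >= self.max_missed and self.vip_owner == self.primary:
                # Clients keep using the same VIP; only the node behind it changes.
                self.vip_owner = self.secondary
        return self.vip_owner


monitor = FailoverMonitor("node-a", "node-b", max_missed=3)
# The primary answers once, misses three heartbeats in a row, then comes back.
owners = [monitor.heartbeat(alive) for alive in (True, False, False, False, True)]
print(owners)
```

    Because clients address the VIP rather than a specific node, the switch in this sketch is invisible to them. The threshold trades detection speed against false alarms: a lower value fails over faster but reacts to a single dropped heartbeat.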

    In order to see how the failover functionality in Open-E JovianDSS works, check out the video below:
