{"id":49243,"date":"2018-07-10T06:00:02","date_gmt":"2018-07-10T04:00:02","guid":{"rendered":"http:\/\/blog.open-e.com\/?p=49243"},"modified":"2025-04-04T06:53:00","modified_gmt":"2025-04-04T06:53:00","slug":"cluster-split-brain-explained-part-2","status":"publish","type":"post","link":"https:\/\/www.open-e.com\/blog\/cluster-split-brain-explained-part-2\/","title":{"rendered":"Cluster Split-Brain explained [part 2]"},"content":{"rendered":"<p>\t\t\t\t<span id=\"result_box\" class=\"\" lang=\"en\"><span title=\"zatem cz\u0119\u015b\u0107 druga jako kontunuacja stanu ko\u0144cz\u0105cego cz\u0119\u015b\u0107 pierwsz\u0105 \">Here it is, the second part of the cluster split-brain article series. In case you have missed the first part, you can read it <a href=\"http:\/\/blog.open-e.com\/cluster-split-brain-explained-part-1\/\">here.<\/a><br \/>\n<\/span><\/span><\/p>\n<p>The previous part finished with a <strong>Cluster Split-Brain <\/strong>event, so let&#8217;s start the second part from the moment when both nodes are serving the same cluster resources.<span id=\"result_box\" class=\"\" lang=\"en\"><\/span><\/p>\n<p><span title=\"Mamy aktualnie dwa serwery (w\u0119z\u0142y) b\u0119d\u0105ce cz\u0119\u015bci\u0105 klastra, kt\u00f3ry uleg\u0142 rozszczepieniu. \">So, we currently have two servers (nodes) that are part of a cluster that has split. The n<\/span><span title=\"W\u0119z\u0142y nie maj\u0105 ze sob\u0105 komunikacji i oba przej\u0119\u0142y i udost\u0119pniaj\u0105 te same zasoby klastwowe - Udzia\u0142X oraz VIP1.\">odes cannot communicate with each other, yet both have taken over and are now serving the same cluster resources &#8211; ShareX and VIP1. <\/span><span title=\"Klienci zapisali r\u00f3\u017cne dane na te zasoby niezale\u017cnie zatem nie mo\u017cemy ich ju\u017c razem po\u0142\u0105czy\u0107. 
\">Clients have written different data to each node independently, so the two copies can no longer be merged.<\/span><\/p>\n<p><span title=\"I tu pojawia si\u0119 mechanizm zabezpieczaj\u0105cy przed utrat\u0105 danych po wyst\u0105pieniu rozszczepienia klastra czyli po split-brain. \">And <strong>now<\/strong> the mechanism that protects your whole environment against data loss after a cluster split-brain comes into play. <\/span><span title=\"W sytuacji przej\u0119cia zasob\u00f3w i utrarty komunikacji ze zdalnym w\u0119z\u0142em, ka\u017cdy z w\u0119z\u0142\u00f3w oznacza siebie jako odseparowanego - separated. \">When a node has taken over resources and lost communication with the remote node, it marks itself as separated. <\/span><span title=\"To oznaczenie jest wykorzystywane przy powrocie komunikacji mi\u0119dzy w\u0119z\u0142ami. \">This mark is used when communication between the nodes is restored. First, though, w<\/span><span title=\"Rozwa\u017cymy natomiast co by si\u0119 sta\u0142o gdyby tego mechanizmu nie by\u0142o, a zatem: \">e will consider what would happen if this mechanism did not exist.<br \/>\n<\/span><\/p>\n<p><span title=\"Wraca po\u0142\u0105czenie mi\u0119dzy w\u0119z\u0142ami cluster path\/mirror path.\">Imagine the cluster path \/ mirror path connection between the nodes comes back. 
<\/span><span title=\"Oba w\u0119z\u0142y maj\u0105 zaimportowanego poola Pool0 wi\u0119c oba zaczynaj\u0105 synchronizowa\u0107 JEDNOCZE\u015aNIE!\">Both nodes have Pool0 <\/span><span title=\"Oba w\u0119z\u0142y maj\u0105 zaimportowanego poola Pool0 wi\u0119c oba zaczynaj\u0105 synchronizowa\u0107 JEDNOCZE\u015aNIE!\">imported,\u00a0<\/span><span title=\"Oba w\u0119z\u0142y maj\u0105 zaimportowanego poola Pool0 wi\u0119c oba zaczynaj\u0105 synchronizowa\u0107 JEDNOCZE\u015aNIE!\">so both of them start synchronizing <\/span><span title=\"swoje dane na drugi w\u0119ze\u0142!\">their data to the other node <\/span><span title=\"nadpisuj\u0105c jednocze\u015bnie dane, kt\u00f3re ju\u017c si\u0119 na nim znajduj\u0105!\">while overwriting the data that is already on it <\/span><span title=\"swoje dane na drugi w\u0119ze\u0142!\">AT THE SAME TIME!<\/span><span title=\"nadpisuj\u0105c jednocze\u015bnie dane, kt\u00f3re ju\u017c si\u0119 na nim znajduj\u0105!\">\u00a0<\/span><span title=\"\u017cadnych danych praktycznie nie da si\u0119 odzysaka\u0107. \">It&#8217;s a recipe for disaster, as practically none of the data can be recovered. Even i<\/span><span title=\"Nawet je\u015bli przed napraw\u0105 po\u0142\u0105czenia mi\u0119dzy w\u0119z\u0142ami wy\u0142\u0105czymy np w\u0119ze\u0142 A, naprawimy lini\u0119 cluster path\/mirror path i dopiero uruchomimy w\u0119ze\u0142 A, to wstaj\u0105c do\u0142\u0105\u0107zy on do klastra i odda swoje dyski pod zarz\u0105dzanie w\u0119z\u0142owi B, jako, \u017ce to on zarz\u0105dza aktualnie Poolem0.\">f we turn off node A before fixing the connection between the nodes, repair the cluster path \/ mirror path, and only then start node A, it will join the cluster on startup and put its disks under the control of node B, because node B currently manages Pool0. 
<\/span><span title=\"Dane w\u0119z\u0142a A zostan\u0105 nadpisane przez dane z w\u0119z\u0142a B - jest katastrofa ale w mniejszym rozmiarze bo stracili\u015bmy &quot;tylko&quot; dane w\u0119z\u0142a A i oba w\u0119z\u0142y maj\u0105 teraz dane w\u0119z\u0142a B. Analogicznie b\u0119dzie je\u015bli to w\u0119ze\u0142 B zostanie wy\u0142\u0105czony, linia naprawiona itd. \">The data on node A will be overwritten by the data from node B. This is also a disaster, but a smaller one, because we &#8220;only&#8221; lose the data from node A and both nodes now hold node B&#8217;s data. The same happens, with the roles reversed, if node B is the one turned off.<\/span><\/p>\n<p><span title=\"No wi\u0119c co spowoduje nasz mechanizm separacji? \">So what does our separation mechanism do?<\/span><\/p>\n<p><span title=\"Jak ju\u017c wiemy, dzi\u0119ki niemu ka\u017cdy w\u0119ze\u0142 oznaczy\u0142 si\u0119 po swojej stronie jako separated.\">As we already know, thanks to this mechanism each node has marked itself as separated. <\/span><span title=\"Po naprawie linii cluster path\/mirror path pomi\u0119dzy w\u0119za\u0142mi, sprawdz\u0105 one w pierwszej kolejno\u015bci jaki stan ma drugi w\u0119ze\u0142.\">Once the cluster path \/ mirror path link between the nodes is repaired, each node first checks the state of the other node. <\/span><span title=\"Je\u015bli oba w\u0119z\u0142y maj\u0105 stan separated to \u017caden z nich nie udost\u0119pni swoich dysk\u00f3w drugiemy w\u0119z\u0142owi, aby nie dosz\u0142o do nadpisania, i tym samym, utraty danych.\">If both nodes are in the separated state, neither will share its disks with the other, which prevents overwriting and thus data loss. 
<\/span><span title=\"Nawet je\u015bli przed napraw\u0105 linii komunikacyjnej kt\u00f3ry\u015b z w\u0119z\u0142\u00f3w zostanie wy\u0142\u0105czony, linia naprawiona, i w\u0142\u0105czony ponownie to ma on ju\u017c po swojej stronie stan separated.\">Even if one of the nodes is switched off before the communication line is repaired and switched on again only after the repair, it still holds the separated state on its side. W<\/span><span title=\"Oznacza to, \u017ce przy pod\u0142\u0105czaniu si\u0119 do klastra wykryje r\u00f3wnie\u017c stan separated na zdalnym w\u0119\u017ale i nie udost\u0119pni mu swoich dysk\u00f3w dzi\u0119ki czemu dane s\u0105 bezpieczne. \">hen it connects to the cluster, it will also detect the separated state on the remote node and will not share its disks, so the data stays safe.<\/span><\/p>\n<p><span title=\"No dobra ale co klient mo\u017ce zrobi\u0107?\">Okay, but what can the user do?<\/span><\/p>\n<p><span title=\"Ano du\u017co i to nie mo\u017ce tylko musi.\">Well, quite a lot, and it is a must rather than an option. <\/span><span title=\"Powinien sprawdzi\u0107 we w\u0142asnym zakresie dane na obu w\u0119z\u0142ach i np. wykona\u0107 kopi\u0119 danych z w\u0119z\u0142a A, kt\u00f3re nie znajduj\u0105 si\u0119 na w\u0119\u017ale B. I wymusi\u0107 udost\u0119pnianie dysk\u00f3w z w\u0119z\u0142a A do w\u0119z\u0142a B i na odwr\u00f3t - umo\u017cliwia to specjalna funkcja na interfejsie przegl\u0105darkowym Joviana.\">The user should check the data on both nodes independently and, for example, make a copy of the data on node A that is not present on node B. Next, the user should force the sharing of disks from node A to node B and vice versa &#8211; a special function in the Open-E JovianDSS webGUI makes this possible. <\/span><span title=\"Po tej operacji weze\u0142 B nadpisze dane na w\u0119\u017ale A i klient mo\u017ce teraz spokojnie dogra\u0107 dane, kt\u00f3rych w\u0119ze\u0142 B nie posiada\u0142 z wykonanej wcze\u015bniej kopii zapasowej. 
\">After this operation, node B will overwrite the data on node A, and the user can then restore the data that node B was missing from the copy made earlier.<\/span><\/p>\n<p><span title=\"Ok ale gdzie w tym wszystkim mechanizm zabezpieczaj\u0105cy przed split brainem?\">But where in all this is the mechanism that prevents split-brain? <\/span><span title=\"I dlaczego nie dzia\u0142a w tym przypadku?\">And why does it not work in this case? <\/span><span title=\"Nie da si\u0119 tego jako\u015b bardziej zabezpieczy\u0107? \">Can the cluster be protected any better?<\/span><\/p>\n<h3><span title=\"Na te wszystkie pytania odpowiem w cz\u0119\u015bci 3 moich wypocin\">The answers to all these questions will be revealed in part 3 of this series!<br \/>\n<\/span><\/h3>\n","protected":false},"excerpt":{"rendered":"<p>Here it is, the second part of the cluster split-brain article series. In case you have missed the first part, you can read it here. The previous part finished with&nbsp;&#8230;<\/p>\n","protected":false},"author":2,"featured_media":49283,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[14,29,18,27],"tags":[146,150,154,191,198,205,428,469,621],"class_list":["post-49243","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-backup-storage-technology","category-data-protection","category-hardware","category-open-e-joviandss","tag-cluster","tag-cluster-split-brain","tag-clustering","tag-data-protection","tag-data-security","tag-data-synchronization","tag-node","tag-open-e-joviandss","tag-split-brain"],"acf":[],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.open-e.com\/blog\/wp-json\/wp\/v2\/posts\/49243","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.open-e.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.open-e.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"h
ref":"https:\/\/www.open-e.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.open-e.com\/blog\/wp-json\/wp\/v2\/comments?post=49243"}],"version-history":[{"count":1,"href":"https:\/\/www.open-e.com\/blog\/wp-json\/wp\/v2\/posts\/49243\/revisions"}],"predecessor-version":[{"id":55100,"href":"https:\/\/www.open-e.com\/blog\/wp-json\/wp\/v2\/posts\/49243\/revisions\/55100"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.open-e.com\/blog\/wp-json\/wp\/v2\/media\/49283"}],"wp:attachment":[{"href":"https:\/\/www.open-e.com\/blog\/wp-json\/wp\/v2\/media?parent=49243"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.open-e.com\/blog\/wp-json\/wp\/v2\/categories?post=49243"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.open-e.com\/blog\/wp-json\/wp\/v2\/tags?post=49243"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}