Welcome to Open-Experts — The Data Storage Podcast!
Our charismatic host, Todd Maxwell, with almost 20 years of experience in the data storage…
Welcome to Open-Experts – The Data Storage Podcast, episode #4!
Picture this: your company’s data hits 100% of your system’s capacity. Chaos, downtime, and costly fixes follow. Sound familiar? Don’t let it happen to you! In this episode of Open-Experts – The Data Storage Podcast, we dive into the looming issue of data storage capacity. Join us as we explore practical, ready-to-implement strategies to keep your systems running smoothly, even as your data storage demands grow. We share expert insights into optimizing storage and highlight cutting-edge solutions like Toshiba’s MG Series 22TB disks — designed for those who need serious space.
Whether you’re a small business or managing enterprise-level data, this episode is packed with actionable tips to help you avoid the dreaded “disk full” scenario. Tune in and future-proof your data storage strategy today!
or read the transcript:
I’m back with another message to all you capacity, storage, synchro, and asynchro engineers! Okay, here is one for you all. Have you ever heard the phrase “don’t paint yourself into a corner”? Well, in storage terms it just means you no longer have enough capacity to store your data.
So I had a case recently where one of our customers ran out of capacity. I know many of you have had or will have this issue, even with your cell phone filling up, but for production – not good, right? This customer had filled up their zpool to about 99.5% of the 1.2 PB they had. There are a couple of options for them to overcome this issue that I will go over in a minute.
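To put that 99.5% in perspective, here is a quick back-of-the-envelope calculation in Python (using decimal units, as drive vendors quote them; ZFS overhead is ignored):

```python
PB = 1000**5  # petabyte in bytes, decimal units
TB = 1000**4  # terabyte in bytes

pool_size = 1.2 * PB
used_fraction = 0.995

free_bytes = pool_size * (1 - used_fraction)
print(f"Free space left: {free_bytes / TB:.1f} TB of {pool_size / TB:.0f} TB")
```

Roughly 6 TB free out of 1200 TB — and on a copy-on-write filesystem like ZFS, that last sliver is exactly where performance falls off a cliff.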
Now, we do have a default threshold in the system set to 80 percent, and you can change it to a higher or lower value, whatever suits you. When usage reaches that value, you will receive emails. That is, if you set up the email alerts, right? Though most don’t, from what I see in cases. Crazy, I know. But I’ll get into that another day…
Not all OSs have the same recommended thresholds. For example, with Windows VSS it is recommended to keep 15 to 20% of free space available. Yes, you can fill it up, but applications will start to fail due to, what? Insufficient drive space to write. So keep in mind – when the system is full, you will feel it, and everything will slow to a crawl. And the OS will go sh…, I’m not going to say it, crap, not to mention the hate mail you’re going to get from your boss!
So, we provided some options for this customer to think about to get out of that painted corner.
Option one: get a new server with bigger drives and copy the data from the old server to the new one. Many can do that, since most companies and engineers view the lifespan of a server as, what, 3 to 5 years? These are the rich kids on the block for sure, and in many ways they are right, as their company policies require it.
Option two is to keep the server and do what’s called an in-place upgrade of the drives. This is done by failing one drive in the disk group and replacing it with a new drive that is, let’s say, two times bigger. So you have an older 8TB drive, and you replace it with a new 22TB one. Now, this works fine, but it does take time, and you can only do one drive in a disk group at a time until the resilver is completed. You could do two at once in a Z-2, but that’s going to cripple your system. You repeat this procedure until all the drives have been replaced. Then you go into our console tools and open the extended tools.
And there’s an option there to expand the pool size after replacing all the disks.
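The capacity math behind that in-place upgrade is worth seeing. A rough sketch, assuming a hypothetical twelve-drive RAID-Z2 group and ignoring ZFS metadata overhead and TB-vs-TiB rounding:

```python
def raidz2_usable_tb(drive_count: int, drive_size_tb: int) -> int:
    """Rough usable capacity of one RAID-Z2 group: two drives' worth goes to parity."""
    return (drive_count - 2) * drive_size_tb

before = raidz2_usable_tb(12, 8)    # twelve 8 TB drives
after = raidz2_usable_tb(12, 22)    # same group after swapping in 22 TB drives
print(f"before: {before} TB usable, after: {after} TB usable")
```

Going from 8TB to 22TB drives in this hypothetical group takes you from 80 TB to 220 TB usable — but only after the last drive is replaced and the pool is expanded; until then, the pool still reports the old size.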
Option three: start deleting files that are no longer needed. Now, this is a bigger topic in itself. I’ve seen many cases where companies go through several admins over the course of many years, and they forget, or don’t know, or other reasons, or maybe they’re at the bar too much, which in many cases I would be there as well, but I’m not going to get into that one for sure. But deleting has consequences if that data is needed for a legal battle. Many times we’ve seen this, and they wish they had saved it. And the time to sift through the files, my God, would just be enormous, right? It’s very painful asking every person from C-level to HR to the janitor: hey, do you need these files? Can you look them over? Can you decide which ones you can delete? And then someone says, “Hey, I’ll be back from vacation!”. You get the idea – next thing you know, you’re two to three months into getting that resolved, and the data continues to grow.
So, the customer opted for option two: keep the server and replace the drives over time. They were also able to move some of the data to another server. And by the time they did this, it worked, which in their case made sense, as the CPU was good enough to perform for the next four years, and it was already 3 years old. The HBA was okay, the firmware was okay, and the 10 gig-ees were fine for their bandwidth usage. So that all worked!
So, with that, let’s talk about one of the big capacity drives our QA team tested and certified: the massive 22TB Toshiba, model MG10SFA22TE.
And by the way, you can download the PDF report from our site. Just go to the Technical Support tab, and on the left you’ll see a link to the Open-E certified hardware components, where you can get the full report. The short summary for this Toshiba drive: it’s a 3.5-inch SAS drive, 7200 RPM, with a 512MB buffer, all tested in single node, HA shared cluster, and metro cluster setups.
All tests were performed on a zpool configured with RAID-Z2. For those of you who don’t know, Z2 is like RAID 6 for the disk groups, with each group having two disks for parity.
Now, our tests were performed with 4k random workloads and 1MB sequential workloads. In the reports, we show the following performance results: mixed random IO performance, random read IO performance, random write IO performance, and, of course, sequential read megabyte performance. These are very important. Oh, and one more – sequential write megabyte performance as well. Together, these give you a well-rounded view of what the drive will do.
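Why report both IOPS and megabytes per second? Because block size links the two, and a drive that looks slow in one metric can look fast in the other. A quick sketch with made-up numbers (not figures from the Open-E report):

```python
def throughput_mb_s(iops: float, block_size_kb: float) -> float:
    """Convert an IOPS figure at a given block size into MB/s (decimal units)."""
    return iops * block_size_kb / 1000

# Hypothetical HDD-ish numbers, just to show why both metrics matter:
random_4k = throughput_mb_s(200, 4)        # 200 IOPS of 4k random I/O
sequential_1m = throughput_mb_s(250, 1024) # 250 IOPS of 1MB sequential I/O
print(f"4k random: {random_4k} MB/s, 1MB sequential: {sequential_1m} MB/s")
```

Same drive, wildly different MB/s depending on the workload — which is exactly why the reports break the results out by access pattern and block size.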
Again, to get these PDF reports, just go to our Open-E website, click on the Technical Support tab, and on the left you will see the link to the Open-E certified hardware components. There’s a plethora of other reports there as well.
Okay, time for me to end this, as I have cases to work on – we’re as busy as a three-legged hamster running from two damn cats. Remember to back up, no matter what! And ask yourselves: how fast can you restore your data? Now go out and fix something like a real engineer. See ya, guys!