I know I haven’t exactly been filling the blog-space with useful tidbits, random ramblings, or musings on clown psychology lately. To be fair, however, I have been pretty heads-down in two different areas: one is my CCIE Routing and Switching lab studies; the other is a large FlexPod (UCS, NetApp, VMware, Nexus) implementation project at the office.
We are a longtime VMware shop, and when I came on board back in 2006 I made a push to virtualize even more. We started expanding heavily and chose EqualLogic iSCSI SANs for our backend storage at the time. This worked well for a number of years, but since Dell purchased EqualLogic we have seen our support quality slipping, resolution times stretching, and parts deliveries slowing significantly. So we moved to NetApp.
What makes NetApp an intriguing solution is the software, particularly the cloning of virtual machines, snapshots, and de-duplication. While these are powerful technologies to be sure, sometimes getting your head around the details can be tricky. Even trickier? Squaring what you thought you knew, or what you’ve grown used to, with the way things work now.
What am I talking about in particular? Defragmentation and Microsoft.
Everyone who uses a SAN and VMware probably knows by now that defragmentation, especially where VMDK files are concerned, is a giant waste of time. After all, you’re trying to rearrange data blocks on a hard drive to make access more efficient, but the “hard drive” the guest sees is really just a file, and the blocks of that file are in turn spread across blocks on multiple physical drives. Simple, right?
Well, with NetApp snapshots the case against defragmenting gets even more ammunition: you’ll actually use more space. Why? Because snapshots track changed blocks (deltas), so they take up little or no space when first created; you’re just duplicating the root inode. Every time you defragment, you’re potentially rearranging all of the “blocks” of the file system, and every block that moves counts as a change, which means a bigger delta for the snapshot to hold on to. It’s the same reason you sometimes delete data on a volume and don’t see the free space come back: the snapshot grows by the same amount as the change, and since the snapshots live on the same volume as the data… well, there you go.
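To make the space math concrete, here is a toy model of a copy-on-write volume. It is a minimal sketch, not how NetApp actually implements snapshots: the `Volume` class, its methods, and the `defragment` helper are all made up for illustration, and a guest defrag is simplified to "rewrite every block once."

```python
from itertools import count


class Volume:
    """Toy copy-on-write volume (hypothetical model, not a real NetApp API)."""

    def __init__(self):
        self._next_id = count()   # source of new physical block IDs
        self.active = {}          # logical block -> physical block ID
        self.snapshots = []       # each snapshot is a frozen copy of the block map

    def write(self, logical):
        # A write always lands in a fresh physical block; nothing is overwritten in place.
        self.active[logical] = next(self._next_id)

    def snapshot(self):
        # A snapshot copies pointers only, so it costs no data blocks when taken.
        self.snapshots.append(dict(self.active))

    def blocks_in_use(self):
        # A physical block stays allocated while the live map or any snapshot points at it.
        used = set(self.active.values())
        for snap in self.snapshots:
            used.update(snap.values())
        return len(used)


def defragment(vol):
    # Simplification: to the array, a guest defrag is just a rewrite of every block it moves.
    for logical in list(vol.active):
        vol.write(logical)


vol = Volume()
for lb in range(100):
    vol.write(lb)
before_snap = vol.blocks_in_use()

vol.snapshot()
after_snap = vol.blocks_in_use()

defragment(vol)
after_defrag = vol.blocks_in_use()

vol.active.clear()                 # delete everything from the live file system
after_delete = vol.blocks_in_use()

print(before_snap, after_snap, after_defrag, after_delete)  # 100 100 200 100
```

The snapshot itself is free, the defrag doubles the space in use, and deleting all of the live data only gives back the half the snapshot isn’t holding.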
So, all of this is great. Don’t run defrag and life is good, right? Sure. But Microsoft is trying ever harder to be helpful and they’re actually crossing more and more into that territory occupied by Apple that I don’t care for: the one where they obscure all of the details to “just make it work” and you have to hunt to find even the most basic of features.
Turns out that in Windows Server 2008 and 2008 R2, defragmentation is set up as a scheduled task by default. Every Wednesday, as a matter of fact. The good news is that the task is disabled out of the box. The bad news? Most server admins have an almost Pavlovian need to defragment Windows boxes and have probably turned this (or some variation of it) on for many machines in your environment. Possibly even through a GPO that someone set and forgot many cycles ago.
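If you want to check a box rather than take anyone’s word for it, something like the following could work. Treat it as a rough sketch under a few assumptions: that the built-in task lives at `\Microsoft\Windows\Defrag\ScheduledDefrag` on your build, that `schtasks` reports a disabled task with the English word "Disabled" in its CSV output, and that you run it from an elevated prompt. The helper names are mine, and it won’t catch a custom defrag job that someone created under a different name.

```python
import csv
import subprocess

# Assumed location of the built-in defrag task; verify it on your own servers.
TASK_NAME = r"\Microsoft\Windows\Defrag\ScheduledDefrag"


def query_command(task=TASK_NAME):
    # Ask schtasks for one task, as CSV, with no header row.
    return ["schtasks", "/Query", "/TN", task, "/FO", "CSV", "/NH"]


def disable_command(task=TASK_NAME):
    # Disable the task without deleting it.
    return ["schtasks", "/Change", "/TN", task, "/DISABLE"]


def is_disabled(csv_output):
    # Assumption: a disabled task shows the literal field "Disabled" (English locale).
    rows = csv.reader(csv_output.splitlines())
    return any(field.strip().lower() == "disabled" for row in rows for field in row)


def check_and_disable():
    # Windows only. Queries the task and disables it if it is currently enabled.
    result = subprocess.run(query_command(), capture_output=True, text=True, check=True)
    if is_disabled(result.stdout):
        return "already disabled"
    subprocess.run(disable_command(), check=True)
    return "disabled now"
```

Calling `check_and_disable()` on a guest handles the one box. If a GPO is the thing turning the task on, though, fix it there, or it will simply come back.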
At the end of the day, defragmenting a virtualized Windows box might make the OS think it’s happy, and it might even make the server admins happy. Storage folks, however? The shrieking from the unfashionable wing of the IT area will probably be all the indication you need that bad things are afoot. Hell, if you’re really looking for some of the old ultra-violence, run defrag on all your machines at night… at the same time. If you’re lucky, and everyone’s asleep, you might even manage to offline a LUN. And that’s just good fun for everyone.