Backup Exec 2010 R3 SP3 on Windows Server 2012 Core using Windows deduplication

Hmmm… it’s looking like Symantec have a problem with Server 2012 Core when Windows deduplication is turned on.

We’d been advised that the above release of BE 2010 would at least let us back up our Server 2012 boxes, so after installation we quickly pushed the agent out to our two 2012 boxes: one built with the GUI, one built as Core.

The GUI one seems fine: backups take normal-ish amounts of time and restores work. The Core box, however, first implies that a full backup of our data (around 4TB) would basically never finish, but then the job completes really quickly. And the restores don’t work, or at least not fully, which is rather the point of backing things up in the first place.

Three calls to Symantec later and there’s still not much to report from them, but doing some testing I found the following:

  • Built a Hyper-V test server with deduplication against a 40GB disk, put some test data on it (90MB image files) and kicked deduplication off. Brilliant: 94% deduplication;
  • Installed the BE agent, ran a backup, ran a restore: nothing, just corrupt files;
  • Installed the full GUI role, re-ran backups and restores: nothing again;
  • Uninstalled the BE agent, rebooted, reinstalled the BE agent: nothing again;
  • Rebuilt the server as Server 2012 GUI (from scratch), installed the BE agent;
  • Ran backups/restores: worked, despite some strange VSS errors.
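The pass/fail verdicts above came from comparing restored files against the originals. A minimal sketch of that kind of check, assuming the original and restored trees are both accessible from one machine (the function names and layout here are hypothetical, not part of Backup Exec):

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files don't exhaust memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_restore(original_root: Path, restored_root: Path) -> list[str]:
    """Return relative paths of files that are missing or corrupt in the
    restored tree compared with the original tree."""
    bad = []
    for src in original_root.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(original_root)
        dst = restored_root / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            bad.append(str(rel))
    return bad
```

Anything returned in the list is a failed restore; an empty list means every file came back byte-identical.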

I can only think that BE just really doesn’t like Core. I tried a small (2.7GB) backup of some of our production data and the results were really random: some of the restored Word documents worked, some didn’t. Same for the Excel files. I couldn’t get any PDFs to open, whereas JPGs only seemed to fail if they were user-generated content rather than files shipped as part of software.

Currently no idea, although I suppose it’d be advisable not to use the above BE version to back up any production data that sits on a Windows Server Core DeDuplication volume for now.

UPDATE 15NOV2013: it looks like Backup Exec (or at least our BE environment) has a serious problem with Server Core, but at least it’s easy to fix. The first time the BE agent gets pushed out, it DOESN’T install the BE 2010 R3 SP3 update, which is why the backups are corrupt: support for Server 2012 only came in with SP3, so the agent remains on 13.0.5204.0. Re-installing the agent, however, does bring it up to 13.0.5204.1225, and magically the backups start to work. Symantec tech support are trying to recreate this in a lab environment, but maybe it’s just our environment that’s affected. As before, Server 2012 GUI seems to have no issues. This is tediously repeatable: I’ve just done it again with another brand-new, unpatched Core machine and the same thing has happened.
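Since the symptom is an agent stuck on the pre-SP3 build, one sanity check before trusting any Core backups is to compare the installed agent version against the SP3 build number. A sketch of that comparison; the two build numbers come from our environment as above, but the function itself is hypothetical:

```python
# Build numbers observed in this environment: the freshly pushed agent
# stays on 13.0.5204.0; a correctly updated SP3 agent reports
# 13.0.5204.1225.
SP3_AGENT_BUILD = (13, 0, 5204, 1225)


def parse_version(v: str) -> tuple[int, ...]:
    """Turn a dotted version string into a tuple that compares numerically."""
    return tuple(int(part) for part in v.split("."))


def agent_needs_reinstall(installed: str) -> bool:
    """True if the installed agent is older than the SP3 build, i.e.
    backups from this box should not be trusted yet."""
    return parse_version(installed) < SP3_AGENT_BUILD
```

Comparing tuples rather than raw strings matters here: as strings, "13.0.5204.1225" sorts before "13.0.5204.9" even though it is the newer build.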

When Server 2012 DeDuplication goes bad

This is a really weird one.

Our old SAN was really struggling for space despite having EMC DeDuplication switched on, so I commissioned a Server 2012 guest on some new SAN storage, and started serving files through this guest off a dynamic VHDX file (sat on another chunk of the same storage). Because our old SAN used NDMP backups, I had to use RoboCopy to migrate the data, which was slow but did the job. The reason for the intermediate VHDX is simply portability: we could pick it up off the SAN and dump it on any Windows server (even stand-alone), whereas obviously anything stored directly on the SAN would need to be connected up to servers on the iSCSI network. The VHDX file is connected to the guest on a SCSI bus, as this enables hot-removal, unlike the IDE buses.

We’d also been waiting for Backup Exec 2010 R3 SP3, as this was the first release of 2010 that at least enabled the agent to work on Server 2012.

This was all good so far; RoboCopy had, as expected, kept all the NTFS permissions correct, so it was just a case of sharing the folders out. As both shares now also sat on the same volume, I thought Windows Server 2012 DeDuplication could really get to work by DeDuping across the shares, which previously hadn’t been possible due to the configuration. The DeDupe process started slowly but I didn’t think anything of this, because the Celerra took ages to fully DeDupe all its volumes. I reasoned that this should be fine as it was the Hyper-V guest doing the DeDuplication; I presumed that the guest should see the VHDX as just another block of storage, rather than a host machine trying to DeDuplicate the VHDX file itself.

This is where it all went a bit strange. Nobody had complained about access speeds or anything, yet Backup Exec was taking absolutely ages to back up this volume (I did some rough maths and figured out it would actually never complete a full backup inside a week, compared to “just” taking 48 hours or so previously). Then it turned out that actually, it was doing full backups in less than an hour because it wasn’t actually doing full backups.
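The rough maths behind that conclusion is just volume size divided by sustained throughput. A sketch; only the roughly-4TB volume size and the 48-hour/one-week figures come from the post, while the throughput numbers are hypothetical back-calculations for illustration:

```python
def full_backup_hours(volume_bytes: float, throughput_mb_s: float) -> float:
    """Lower bound on wall-clock hours for a full backup at a sustained
    throughput (MB/s here meaning 10**6 bytes per second)."""
    return volume_bytes / (throughput_mb_s * 1e6) / 3600


FOUR_TB = 4 * 10**12  # roughly the size of the volume in question

# At ~23 MB/s sustained (hypothetical), 4TB completes in about 48 hours,
# which was the previous norm. At ~6 MB/s (also hypothetical), the same
# volume needs more than a week (168 hours), i.e. it effectively never
# finishes inside a normal backup window.
```

So a job that reports "complete" after under an hour against this volume cannot have read anywhere near the full 4TB, whatever its status says.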

This is when it became apparent that there is a clash between the presentation layer of Server 2012 (what I would normally refer to as the Explorer shell, but this is Core), DeDuplication and dynamic VHDX files. The storage system within Windows knows how much actual data is on the volume, because it correctly reports x TB used. DeDuplication, though, seems to go mad and actually un-DeDuplicates everything, so you end up with a space saving of 0 bytes (down from 11GB, so it’s gone backwards). And the… “Explorer” shell goes further and loses all reason, reporting insanely small “size on disk” numbers for vast amounts of data (real-world example: 5.1TB of actual data supposedly takes up just 37GB on disk. Yeah, right). To be fair, this is why Backup Exec is being so erratic: it asks Windows for the amount of data, and Windows replies that although there’s supposedly 5TB of data, it’s only taking up 37GB on disk.
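A crude sanity check for this failure mode is to compare a volume’s logical data size against its reported size on disk: genuine dedup savings are big but not absurd, so a vanishingly small on-disk ratio is a red flag. A sketch using the figures from the post; the 2% threshold is purely an assumption, not anything Windows or Backup Exec exposes:

```python
TB = 1024**4
GB = 1024**3


def on_disk_ratio(logical_bytes: float, size_on_disk_bytes: float) -> float:
    """Fraction of the logical data that Windows claims is physically
    allocated. Healthy dedup in this environment landed around 6-100%."""
    return size_on_disk_bytes / logical_bytes


def looks_broken(logical_bytes: float, size_on_disk_bytes: float,
                 threshold: float = 0.02) -> bool:
    """Flag implausibly tiny on-disk figures, like 5.1TB of data
    supposedly occupying 37GB. The 2% cut-off is a guess."""
    return on_disk_ratio(logical_bytes, size_on_disk_bytes) < threshold
```

The 94% saving on the 40GB test server (about 6% left on disk) passes such a check comfortably; 37GB out of 5.1TB (about 0.7%) does not, which matches the "Yeah, right" reaction above.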

The fix is currently unknown, as apparently even Microsoft haven’t encountered this one. I’ve stopped the actual DeDupe jobs (even though DeDupe is still enabled against the disk) and set a RoboCopy job off to rehydrate (fingers crossed) everything onto yet another volume (this time a pass-through disk on the new SAN) so that at least we can start getting valid backups. We’re passing information on to Microsoft to see if they can come up with anything, and have discovered that this is perfectly repeatable: enable DeDupe against another (live but not backed-up) dynamic VHDX and the same thing happens, with Windows reporting the actual amount of data as X but the size on disk as X minus a lot of space.

Update 02-NOV-2013: still unresolved. Found a few discrepancies in NTFS permissions but this doesn’t explain much. The data seems to rehydrate onto another volume, but that’s maybe the wrong word as “rehydrate” implies it was deduplicated in the first place, which it wasn’t really. Still, at least we can get some form of backup for now.