Fixing Corrupted Time Machine Backups

Corrupted Time Machine backups are annoying. Frequently there isn’t any actual problem with the data; Time Machine just doesn’t know how to access it.

Disclaimer

This is not my own work; I am just keeping a copy of the work I found here [1].

I’ve used this a couple of times on network Time Machine backups (shared over both AFP and SMB) and it has helped. I don’t always need to do all of the steps, and in the most recent case, running a simple fsck solved the problem. YMMV.

Uncorrupting TM Backups

These past couple of months have been a time of experimenting and cleaning up household ‘tech debt’. I had a couple of old laptops that had met untimely deaths and still needed everything copied from their hard drives, and a bunch of different photo libraries and Outlook PST files sitting around that needed to be consolidated. All of that was more tedious than difficult, but when my Mac’s Time Machine backup crashed, that was a pain to deal with.

Apple stays true to its delusion that ‘things just work’, which really means that when things do fail, it is that much more difficult to determine how to fix them.

A couple of weeks ago, my Time Machine backup to my Synology DS 412+ NAS started giving me errors. It took me a bit of effort to dig through various websites to figure it all out.

The sad part is that the error I received has been reported as far back as 2010 [2], and Apple still hasn’t fixed it or explained the cause. Some sites claim that it has to do with an error introduced with OS X Snow Leopard. I’m running OS X Yosemite, which is 4 versions later. I’m documenting everything I went through to get my backups working again so that someone else can save time. If you find these instructions valuable, please add comments or likes.

The first sign of a problem was a popup on my machine about ‘Time Machine couldn’t complete backup…. already in use’

I had received this error before, and usually I just shut my NAS off, and after rebooting it just works. I know, that isn’t a good habit, and behold, this time it didn’t work! I received a subsequent error about ‘Time Machine completed a verification of your backup… Time Machine must create a new backup for you’. It is the most absurd resolution: if it creates a new backup, it says it will delete the old history. Why would it not at least keep the history for you in a different directory or something?

Before explaining how to fix the second error, I’ll step back a bit and explain how I should have attempted to fix the original problem. Skip ahead to the ‘corrupt’ backup error if you don’t need this info. There are many sites that discuss the ‘already in use’ error (listed at the bottom). The error does not seem to be specific to Synology, but to fix it on the Synology, you simply go into ‘Widgets’ (top right corner on Synology DSM 5.2), open ‘Connected Users’ (use the plus sign to add the widget if it is not already there) and kill the appropriate AFP connection. More than likely there will only be one; I had 2, so I killed them both.

Then just reconnect Time Machine and it should work. Thank you Clement for the info! [3]

Now for the ‘corrupt’ backup error. After digging around, I learned some new things about Time Machine, for example that a lot of people don’t trust it! There are a number of sites that give some details on a fix, but none of them worked exactly for me. Some omitted steps (perhaps because they didn’t need those steps), and others didn’t fully explain steps, so it was difficult to understand what was risky and what was not. Here is my run through of it. The two people whose sites gave the most help were Tony and Garth. Overall this took about 6 hours including research and some of the reboots, though most of that time was just waiting.

As a first step, stop Time Machine. You don’t want it to kick in midway while you are doing something. Open Time Machine Preferences and toggle Time Machine to ‘OFF’.

Next open Terminal and switch to a root shell (in general you want to do this sparingly) by typing:

$ sudo su -

Note: you SHOULD try to be connected to your NAS via Gigabit Ethernet, as WiFi will take a long time. It is not impossible (I actually did it on WiFi), but if you need your machine or have time constraints, make sure you have a strong connection.

First we need to deal with the permissions on the sparsebundle. A sparsebundle is the disk image package in which TM backups are stored. You do not really need to know the details of this other than what is provided in these instructions. Sometimes the sparsebundle is marked bad when the error above has occurred, and this would prevent you from proceeding. We need to reset the permissions (strictly speaking, clear the ‘user immutable’ flag on every file in the bundle).

Type the following (this is just changing permissions and will complete immediately):

$ chflags -R nouchg "/Volumes/<TM backup name>/<mybackupname>.sparsebundle"

If you do not know the name of your Time Machine volume or sparsebundle, go to Time Machine Preferences and the volume should be listed there. To find the name of the sparsebundle, type:

$ ls "/Volumes/<TM backup name>/"

which will show you an entry with the .sparsebundle extension. You now want to attach that sparsebundle as a disk device without mounting it.

You do so by typing the following (low risk, and it should respond almost immediately):

$ hdiutil attach -nomount -readwrite -noverify -noautofsck "/Volumes/<TM backup name>/<mybackupname>.sparsebundle"

The result should be something to the effect of:

/dev/disk5    GUID_partition_scheme
/dev/disk5s1  EFI
/dev/disk5s2  Apple_HFS     <Your Volume name>

Take note of the disk for Apple_HFS (or Apple_HFSX). It will be /dev/diskXs2, where X is a number assigned by your machine when the image is attached.
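Picking the device node out of that listing by eye is easy to get wrong when several disk images are attached. As a minimal sketch, the helper below reads the hdiutil attach output on standard input and prints the first node whose type is Apple_HFS or Apple_HFSX. The name hfs_device is made up for this example (it is not part of hdiutil), and the code assumes the whitespace-separated layout shown above.

```shell
# Hypothetical helper: print the device node of the first Apple_HFS or
# Apple_HFSX partition found in `hdiutil attach` output read from stdin.
# Assumes whitespace-separated columns: device, partition type, volume name.
hfs_device() {
  awk '$2 == "Apple_HFS" || $2 == "Apple_HFSX" { print $1; exit }'
}

# Intended use on the Mac (shown as a comment, not run here):
#   dev=$(hdiutil attach -nomount -readwrite -noverify -noautofsck \
#           "/Volumes/<TM backup name>/<mybackupname>.sparsebundle" | hfs_device)
#   echo "$dev"
```

If nothing is printed, no HFS partition was found in the listing, and it is safer to stop and read the raw output than to guess the disk number.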

This next step is where it starts to get difficult and may take some time (seriously, don’t do this on WiFi unless you really have no other choice!!). It is also where I started to see a lot of differences on sites.

Basically, you are going to run a check on the sparsebundle (which you just attached so that it will work like a drive) and attempt to repair it.

Some sites said to use diskutil, and other sites were adamant that you should not use diskutil, because if it fails you are basically toast. Instead, the recommendation is to use fsck.

Start by running the following, which will attempt to rebuild the catalog B-tree in the sparsebundle.

$ fsck_hfs -drfy /dev/diskXs2

After it completes (it may appear to hang sometimes), hopefully the response will be:

‘The Volume was repaired successfully’

but more likely you will receive:

‘The Volume could not be repaired’

Looking more closely, I actually received this message first: RebuildBTree – record 25 in node 10000 is not recoverable.

Thus I had to run fsck again, but with the -p flag to try to fix inconsistencies first.

$ fsck_hfs -p /dev/diskXs2

This should complete with a repair successful message. This is one point at which things differed for me, though. When I initially ran this, it still failed at the rebuild. I had to disconnect from the NAS and power it down. I then had to restart it and repeat all the steps above, up to and including the command to attach the volume again. I then had to run the rebuild again. Note: the value of X below might be different now, so take note of that as well.

$ fsck_hfs -drfy /dev/diskXs2

This rebuild worked perfectly. I was getting pretty excited at this point that I may actually be winning this…
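The order of the fsck runs above can be sketched as a small wrapper: try the rebuild, and only if it fails, run the -p pass and then the rebuild once more. This is an illustration of the sequence described here, not a tested repair tool. The function name repair_hfs and the FSCK variable are inventions for this example. The NAS restart and re-attach that I needed stay manual.

```shell
# Sketch of the repair order described above. FSCK can be overridden so the
# logic can be tried against a stand-in; on the Mac it defaults to fsck_hfs.
FSCK=${FSCK:-fsck_hfs}

# repair_hfs DEVICE: hypothetical wrapper, returns 0 once a rebuild succeeds.
repair_hfs() {
  dev=$1
  # 1. Try to rebuild the catalog B-tree directly.
  if "$FSCK" -drfy "$dev"; then
    return 0
  fi
  # 2. The rebuild failed: try to fix inconsistencies first, then rebuild again.
  "$FSCK" -p "$dev" || echo "fsck -p reported problems on $dev" >&2
  "$FSCK" -drfy "$dev"
}

# Intended use as root on the Mac (not run here):
#   repair_hfs /dev/diskXs2 || echo "still failing: restart the NAS, re-attach, retry"
```

A non-zero exit status from the last rebuild is the cue to power-cycle the NAS, attach the sparsebundle again and re-check the disk number before retrying.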

Lastly, there were a couple of edits required so that TM knows the sparsebundle is no longer corrupted. Info out there was also a little inconsistent, but here is what I did.

Edit the plist file. You can do this in Finder by right-clicking on the specific sparsebundle, selecting Show Package Contents, and opening the file in TextEdit; however, I did this in Terminal using vi. Use whichever you prefer.

$ vi "/Volumes/<TM backup name>/<mybackupname>.sparsebundle/com.apple.TimeMachine.MachineID.plist"

Within that file, edit the nodes:

<key>VerificationState</key>
<integer>0</integer>

changing the value of integer from 2 to 0.

Some docs tell you to remove the node RecoveryBackupDeclinedDate and its corresponding value node, but I could not find that in any of the files.
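For anyone who prefers not to edit the file by hand, the same change can be sketched with awk. This is only an illustration: the function name reset_verification_state is made up, and the code assumes an XML plist in which the integer sits on the line after the VerificationState key, as in the excerpt above. It writes to standard output so that the original file is not touched until the result has been checked.

```shell
# Hypothetical helper: print FILE (an XML plist) with the integer that
# follows <key>VerificationState</key> set to 0. Other integers are untouched.
# Assumes the key and its value are on separate lines.
reset_verification_state() {
  awk '
    /<key>VerificationState<\/key>/ { pending = 1; print; next }
    pending && /<integer>[0-9]+<\/integer>/ {
      sub(/<integer>[0-9]+<\/integer>/, "<integer>0</integer>")
      pending = 0
    }
    { print }
  ' "$1"
}

# Intended use (not run here); keep a backup copy and write the edited
# version back over the original:
#   plist="/Volumes/<TM backup name>/<mybackupname>.sparsebundle/com.apple.TimeMachine.MachineID.plist"
#   cp "$plist" "$plist.bak"
#   reset_verification_state "$plist.bak" > "$plist"
```

Keeping the .bak copy means the original state can be restored if Time Machine does not accept the edited file.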

Update Jan 4, 2017

Note Graham’s comment below. The above step did not work for him, but he was able to accomplish the same via Finder.

Update Apr 30, 2017

Note the comment from @barbequesteve below about binary vs xml based plist files.
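The comment itself is not reproduced here, but the practical point is presumably that a plist stored in binary form cannot be edited as text. A minimal sketch of a check follows. The function name is_binary_plist is hypothetical; it relies only on the fact that binary plists begin with the bytes "bplist".

```shell
# Hypothetical helper: succeed if FILE starts with the binary plist magic
# ("bplist"), in which case a text editor will not show editable XML.
is_binary_plist() {
  [ "$(head -c 6 "$1" 2>/dev/null)" = "bplist" ]
}

# On the Mac, a binary plist can be converted to XML in place before editing
# (shown as a comment, not run here):
#   is_binary_plist "$plist" && plutil -convert xml1 "$plist"
```

A file that is missing or unreadable is reported as not binary, so it is worth confirming the path first.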

Finally, I powered everything down, restarted the NAS, went back to TM and toggled it to ‘ON’. It had to verify the backup, which took some time. Subsequently, when opening Time Machine and navigating to a previous backup from 2 months back, it looked like it may have been stuck at ‘Waiting…’, but it just took time to refresh. (Again, it would be better to be connected on Ethernet.)

All in all, it took about 6 hours to complete, but most of that time was simply waiting for the disk or the backup to be verified. Good luck, and add comments or questions and I’ll do my best to answer.
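As a recap, the commands used in this write-up can be printed for a given volume and sparsebundle name, so the placeholders are filled in consistently before anything is typed as root. The sketch below only prints text and runs nothing. The function name print_repair_plan is made up, and /dev/diskXs2 stays a placeholder because the disk number is only known after attaching.

```shell
# Hypothetical helper: print (not run) the commands from this write-up for a
# given backup volume name ($1) and sparsebundle name without extension ($2).
print_repair_plan() {
  bundle="/Volumes/$1/$2.sparsebundle"
  printf '%s\n' \
    "chflags -R nouchg \"$bundle\"" \
    "hdiutil attach -nomount -readwrite -noverify -noautofsck \"$bundle\"" \
    "fsck_hfs -drfy /dev/diskXs2" \
    "fsck_hfs -p /dev/diskXs2" \
    "fsck_hfs -drfy /dev/diskXs2" \
    "vi \"$bundle/com.apple.TimeMachine.MachineID.plist\""
}

# Example: print_repair_plan "TimeMachine" "MyMac"
```

The two later fsck_hfs lines are only needed when the first rebuild fails.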

Update June 9, 2019

Note the comment from Marcel@ below.

It seems that Mojave has made this issue more difficult. Note the link he refers to as well [4], which states that you need to add /sbin/fsck_hfs to System Preferences -> Security & Privacy -> Full Disk Access in order to run the fsck command.

I haven’t tried this myself, but Marcel says it worked for him.
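Whether this extra step applies depends on the macOS version. A minimal sketch of a version check follows. The function name is_mojave_or_later is made up for this example, and treating every release after 10.14 the same way as Mojave is an assumption that the sources above do not confirm.

```shell
# Hypothetical helper: succeed if the given macOS version string is 10.14
# (Mojave) or later. That later releases behave like Mojave is an assumption.
is_mojave_or_later() {
  major=${1%%.*}      # text before the first dot, e.g. 10
  rest=${1#*.}        # text after the first dot, e.g. 14.6
  minor=${rest%%.*}   # e.g. 14
  [ "$major" -gt 10 ] || { [ "$major" -eq 10 ] && [ "$minor" -ge 14 ]; }
}

# Intended use on the Mac (not run here):
#   if is_mojave_or_later "$(sw_vers -productVersion)"; then
#     echo "Add /sbin/fsck_hfs to Full Disk Access before running fsck"
#   fi
```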

Footnotes and References
