You can file this under “do as I say, not as I do.”

When I advise my neighbors and friends about their home computing setup, I’m a broken record on the subject of backups. I tell them over and over to back up frequently; to back up to separate media and if possible to separate subsystems; and to inspect their disks frequently for early warning signs of failure.

For example, my wife’s computer backs her data files up to a separate disk every night. The files she uses at work back up every afternoon to a server, which in turn replicates the data to separate disks every night, and I make offsite copies every month or so. Backup is serious business.

So when the primary data drive on my PC went out to lunch with little warning, I should have been prepared, right? Wrong. I hadn’t taken a backup since late 2011. More than 600GB of precious photos and videos were apparently gone.

At some point, the primary data store had passed the capacity of my largest backup drive, and I was too lazy to back up to multiple smaller drives. I had figured, based on prior experience, that the drive would give some warning that it was going bad. And it did: about 15 minutes. That’s hardly enough to copy off 600GB of data. It seemed I was in the soup.

Fortunately, help was at hand. My son had lost a 1TB drive a few months before, and he had researched a procedure for dealing with dying-but-not-quite-dead drives. He pointed out that since Windows could still see the drive, and even read the directory (if not the data), the disk still had some life and could potentially be rescued. He pointed me in the right direction. In brief, the procedure goes like this:

  1. Immediately get the drive OUT of your PC. Windows will keep trying to write to bad sectors no matter what, exhausting the drive’s remaining life. Power down the PC by brute force if you have to.
  2. Get a replacement drive of equal or greater capacity.
  3. Install the dying drive and the new drive in a system with really good cooling. If you don’t have one, leave the dying drive on a table and point a fan directly at it to keep it cool. Put the system on an Uninterruptible Power Supply too, if you have one.
  4. Download and burn a copy of the Trinity Rescue Kit Linux Live CD. There are lots of Linux Live CDs – that is, Linux systems that run from a CD without a hard drive – but this one is specialized for rescuing disks and performing other forms of surgery on Windows systems experiencing seemingly intractable problems.
  5. Copy the entire dying drive to the new drive at the physical level. Don’t bother with partitions. If Windows can see the dying drive, then the Master Boot Record and the partition table are okay. Just copy everything, using the specialized ddrescue command. (ddrescue can also be used to copy individual files.)
  6. Be patient. ddrescue sacrifices every possible speed optimization in favor of getting the data. However, if it can’t succeed in reading a sector in a reasonable amount of time, it will give up on that sector, leave the corresponding spot on the destination unwritten (zeroes, on a blank new drive), and move on.
  7. When the process finishes, put the cloned drive back in your system and run CHKDSK /F to fix up the file system.
  8. Hope for the best.
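
To make steps 5 and 6 a little more concrete, here is a toy Python sketch of the idea behind a sector-by-sector rescue copy. It is not how ddrescue is actually implemented, just an illustration under stated assumptions: the FlakyDisk class, the clone_disk function, the 512-byte sector size, and the retry limit are all hypothetical names and values I made up for the example. The point is simply that readable sectors get copied, and sectors that keep failing are left as zeroes so the copy can move on.

```python
SECTOR_SIZE = 512  # bytes per sector; a common size, assumed for this sketch


class FlakyDisk:
    """Hypothetical stand-in for a dying drive: some sectors refuse to read."""

    def __init__(self, data, bad_sectors):
        self.data = data
        self.bad_sectors = set(bad_sectors)

    def sector_count(self):
        return len(self.data) // SECTOR_SIZE

    def read_sector(self, index):
        if index in self.bad_sectors:
            raise OSError(f"unreadable sector {index}")
        start = index * SECTOR_SIZE
        return self.data[start:start + SECTOR_SIZE]


def clone_disk(source, max_attempts=3):
    """Copy every readable sector; leave zeroes where reads keep failing."""
    total = source.sector_count()
    image = bytearray(total * SECTOR_SIZE)  # destination starts zero-filled
    lost = []
    for index in range(total):
        for _ in range(max_attempts):
            try:
                chunk = source.read_sector(index)
            except OSError:
                continue  # try again, up to max_attempts times
            start = index * SECTOR_SIZE
            image[start:start + SECTOR_SIZE] = chunk
            break
        else:
            lost.append(index)  # gave up; this sector stays zeroed
    return bytes(image), lost


# Simulate an 8-sector "drive" with two sectors that never read successfully.
payload = bytes(range(256)) * 16
disk = FlakyDisk(payload, bad_sectors=[2, 5])
image, lost = clone_disk(disk)
print("lost sectors:", lost, "bytes lost:", len(lost) * SECTOR_SIZE)
```

The real tool is far smarter about the order in which it reads and how long it waits, but the trade-off is the same: get everything that can be read, and accept a few holes rather than stall forever.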

In my case, the cloning procedure ran for 36 hours. (I was on tenterhooks for fear of power outages, which are pretty common in my town.) The copy rate, which was 30MB/second when things were going well, would often slow down to less than 1KB/second. Yet when all was said and done, just 12 sectors – 6KB – proved unrecoverable. The only thing CHKDSK complained about was corruption in the uppercase names file. And the dead drive was still under warranty. Success!
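
For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope calculation in Python. It assumes 512-byte sectors (which is what makes 12 sectors come out to 6KB) and uses the 600GB of data as a stand-in for the amount copied; the clone actually covered the whole drive, so treat the rate figures as a lower bound rather than a measurement.

```python
SECTOR_BYTES = 512              # assumed sector size
lost_sectors = 12
lost_bytes = lost_sectors * SECTOR_BYTES

data_bytes = 600 * 10**9        # assumption: about 600GB copied (decimal GB)
best_rate = 30 * 10**6          # 30MB/second, the rate when things went well
elapsed_seconds = 36 * 3600     # the 36 hours the clone actually took

# How long the copy would have taken at full speed, and the real average rate.
hours_at_best_rate = data_bytes / best_rate / 3600
average_rate_mb = data_bytes / elapsed_seconds / 10**6

print(f"unrecoverable: {lost_bytes} bytes ({lost_bytes / 1024:.0f}KB)")
print(f"time at a steady 30MB/second: {hours_at_best_rate:.1f} hours")
print(f"average rate over 36 hours: {average_rate_mb:.1f}MB/second")
```

Under those assumptions, a healthy drive would have finished in under six hours, which gives a sense of how much of the 36 hours went to coaxing data out of the bad regions.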

So now, of course, I’ve bought a spare 1TB drive to put in my PC for disk-to-disk backup every night, and a 2TB external drive for backups on removable media. I am truly reformed . . . until the next time something goes wrong.