FTP Online
 
 

Plan, Backup, Test, Repeat—
It's Not Just a Good Idea, It's a Necessity
Follow these steps to create a disaster recovery plan that will help you sleep at night.
by Steven J. Vaughan-Nichols

Posted September 15, 2003

I'm a pretty tech-savvy guy, in theory, with decades of experience in running networks, from small office/home office (SOHO) ones up to networks with thousands of computers. But then came the day when I noticed an awful odor of smoking electronics in the office. A quick look around showed that the power supply in Subtle, my Windows 2000 primary domain controller (PDC) server for my 20-station SOHO network, had gone to meet its maker. Guess who didn't have a current backup? While mine was not an enterprise network, the situation made me realize just how important backups are, and that you must make sure not only that you avoid losing your information, but also that you can get it back online in a hurry.

You see, worse than losing my server was realizing how lazy I've been with my backups. I managed to restore most of my data, but I lost two days doing it. No business can afford that kind of downtime.

With security threats like Blaster abounding, the danger of your business being knocked out of service is greater than ever. You must have a backup and disaster recovery (DR) plan to ensure that your backup and restore systems are capable of swinging into action at a moment's notice. And more than that, you must be certain that it really will work.

Good intentions are never enough. The best backup and recovery plan in the world won't do you a lick of good if you don't follow it. You also can't make such a plan on a napkin during lunchtime. (Of course, all the plans, good wishes, and protocols in the world don't matter if they don't work when you need them—if you don't believe me, just ask the North American Electric Reliability Council.)

Here are the key steps you must follow to make sure that you'll have a solid DR plan in place.

Meet With Cross-Division Staff
You, the rest of the senior IT staff, and representatives from your company's other divisions need to sit down and determine exactly what you need in the way of an emergency backup and DR plan. Backup and recovery is not just a problem that affects the IT team.

Determine What You Need to Back Up
Everyone knows you need to save your server data and applications, but what about data on local systems? Does anyone keep mission-critical data on a desktop? Do you want to keep backups of individual desktop programs, or will you keep master copies of the standard desktop for emergency deployments?

Do you want to archive your e-mail separately, or just store it as part of your system backup? It doesn't matter if company policy is to keep important mail messages on the central mail server if your senior-level employees always delete them from the server and keep them only in their e-mail client's local storage.

You need to ask all these questions and get the real-world answers. Find out how your company is really using data so that you can arrange for the appropriate backup methods.

Decide How Often You Need to Perform Backups
The backup part of your plan must also detail exactly who does what, and when. Do you back up every evening? Every weekend? I favor full system backups on the weekend, with incremental backups every night.
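As a rough illustration of that weekend-full, nightly-incremental schedule, here is a minimal Python sketch. The choice of Saturday for the full backup and the function names are assumptions made for the example, not part of any particular backup product. It also shows a consequence worth planning for: a restore needs the most recent full backup plus every incremental made since.

```python
from datetime import date, timedelta

# Assumption: the weekly full backup runs on Saturday (Monday is 0).
FULL_BACKUP_WEEKDAY = 5


def backup_type(day: date) -> str:
    """Return 'full' on the chosen weekend day, 'incremental' otherwise."""
    return "full" if day.weekday() == FULL_BACKUP_WEEKDAY else "incremental"


def restore_chain(day: date) -> list:
    """List the backups needed to restore to the state of `day`:
    the most recent full backup plus every incremental after it."""
    days_since_full = (day.weekday() - FULL_BACKUP_WEEKDAY) % 7
    start = day - timedelta(days=days_since_full)
    chain = []
    for offset in range(days_since_full + 1):
        current = start + timedelta(days=offset)
        chain.append((current, backup_type(current)))
    return chain


if __name__ == "__main__":
    for when, kind in restore_chain(date(2003, 9, 16)):
        print(when.isoformat(), kind)
```

The longer the gap since the last full backup, the more media a restore touches, which is one argument for not stretching the full-backup interval much beyond a week.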

Choose the Right Backup Format
For pure speed, it's hard to beat backing up to CD-WORM and DVD-RAM discs. But even DVD-RAM can only store up to 4.7GB of raw, uncompressed data per disc. That may be fine for storing a workstation's vital data, but it simply isn't good enough for server storage.

There's also a compatibility problem with using recordable DVDs. Besides DVD-RAM, there are DVD+R, DVD+RW, and DVD-R discs and drives. Need I add that they're incompatible with each other? While devices like the Pioneer DVR-A05 make fine workstation backup drives, you must have a strict, company-wide standard on both DVD recordable devices and media, or face the consequences.

For serious network storage, you still only have one serious choice: tape. Tape's input/output speed is much better than it has been, but it's still not close to a disk's speed. On the other hand, you can store over 100GB of data on a single tape cartridge; for example, the well-thought-of Quantum SDLT 320 can hold 150GB per tape.

Some single tape drives, notably Exabyte's VXA tape series, offer random file access speeds close to a disk's, but these drives top out at an uncompressed 80GB per drive.

For serious storage, what you need is an automated tape library box, such as the Quantum ATL SuperLoader, which can handle 17 SDLT tapes for a total of over 2.5 terabytes of uncompressed storage, or the Overland PowerLoader SDLT 320, with slightly less storage.
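The capacity arithmetic behind that figure is easy to check, and the same calculation tells you how many cartridges a given data set will consume. The sketch below uses the numbers quoted above (17 slots, 150GB per tape) and assumes decimal units, with 1TB equal to 1,000GB; the function names are made up for illustration.

```python
import math

# Figures quoted in the article; decimal units (1 TB = 1,000 GB) are an assumption.
TAPE_CAPACITY_GB = 150
LIBRARY_SLOTS = 17


def library_capacity_gb(slots: int = LIBRARY_SLOTS,
                        per_tape_gb: int = TAPE_CAPACITY_GB) -> int:
    """Total uncompressed capacity of a fully loaded tape library."""
    return slots * per_tape_gb


def tapes_needed(data_gb: float, per_tape_gb: int = TAPE_CAPACITY_GB) -> int:
    """Smallest number of cartridges that can hold `data_gb` uncompressed."""
    return math.ceil(data_gb / per_tape_gb)


if __name__ == "__main__":
    total = library_capacity_gb()
    print(f"Library capacity: {total} GB ({total / 1000:.2f} TB)")
    print(f"Tapes for a 400 GB full backup: {tapes_needed(400)}")
```

Sizing against uncompressed capacity is the conservative choice, since compression ratios vary with the data being backed up.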

Decide Where Backups Will Be Stored
Once you have that backup, though, where are you going to keep it? The old rule of thumb was that you kept at least one copy on-site and another at a local safe house, such as a bank deposit box. In the post-9/11 world, though, that isn't likely to be enough.

Some companies have even taken to keeping their backups with outsourced data protection companies like Iron Mountain. If you require the best-possible backup at an external location, such companies offer a valuable service.

Simply keeping complete backups at multiple sites will be sufficient for most companies, but there are several caveats. The first is that the data needs to be physically secure. We tend to focus on hackers stealing away data, but a data thief can do just as well by dropping a 4mm data tape into his pocket. The second problem is that if you're using the Internet to replicate data to a storage facility, you need to make sure that the connection itself is protected by encryption, a Secure Sockets Layer (SSL) connection, a virtual private network (VPN), or all of the above.

Of course, remote backups have one problem: they're extremely difficult to use for restoring. That's one reason why hot backups have become more popular. While hot backups are not an enterprise-wide solution, it makes good sense to keep hot backup servers if you have a critical server or server farm. Normally, these servers are on-site and constantly replicate the work being done by the main servers. It's an expensive solution, but it's cheaper than having your e-commerce site disappear with one lightning strike.

Can't afford that? Well, it's your company, but you can also get some of the benefits of hot backups by mirroring data on redundant arrays of inexpensive disks (RAID) and storage area networks (SANs). That way, such commonplace workplace dangers as the death of a single hard drive or server won't bring your business to a crawl.

Decide How Often to Replace Backup Media
You need to replace your backup media on a regular schedule. For example, I'd never use a backup, whether tape or rewritable optical disc, more than two dozen times, even though the mean time between failures (MTBF) on data recovery media is much greater than that. Call me cynical, but I've seen too many backup tapes go bad—up to and including breaking—when they were needed most.
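A retirement rule like this only works if someone is counting. Here is a minimal sketch of that bookkeeping in Python, using the two-dozen-use limit mentioned above; the tape labels and the idea of keeping the counts in a simple dictionary are hypothetical, and a real shop would more likely pull them from its backup software's media log.

```python
# Limit taken from the article's "two dozen times" rule of thumb.
MAX_USES = 24


def due_for_replacement(use_counts: dict, max_uses: int = MAX_USES) -> list:
    """Return the labels of media that have reached the use limit, sorted."""
    return sorted(label for label, uses in use_counts.items() if uses >= max_uses)


if __name__ == "__main__":
    # Hypothetical labels and counts, for illustration only.
    counts = {"WEEKLY-01": 24, "NIGHTLY-03": 11, "NIGHTLY-01": 30}
    print(due_for_replacement(counts))
```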

Create a Communications Plan
The most overlooked disaster backup and recovery issue is the one no one ever thinks of: communications. You must have a procedure for notifying the appropriate people about any serious system failure. Everyone in the New York office may know that the intranet servers just died, but the people in the San Francisco office may be clueless.

You also need to make sure that the location of the backups is known to the appropriate people. For example, if the Chicago office goes up in flames, you need to plan on who restores the data and where they will be restoring it. For the biggest disasters, everyone in the company should be kept in the loop, starting with the IT personnel in charge of backup and recovery.

This is also true about restoring data. Everyone who will be affected by data recovery needs to know exactly what was restored successfully and what was lost. For example, will transactions be lost? Will the local offices need to resynchronize with master information at the data center? You need to answer these questions as soon as answers are available.

Test Your Entire Plan Regularly
With all your backups in place, are you safe now? Hardly! One of the most common mistakes is failing to test your real-world recovery methods on a regular basis. I've seen one 400-employee company locked down for a couple of days because the old recovery software no longer worked with the new drives. Don't let this happen to you.

Test, and then retest, your recovery methods. If one site is totally lost, can you restore its data at another site? How long will it take? Yes, testing like this will blow away the weekend for IT staff, but you must be sure that your recovery methods will work. If you don't test, you're all too likely to find that all the effort you've put into backing up data was a waste of time.
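One way to make a restore test more than a glance at a directory listing is to compare checksums. The sketch below, in Python, walks a reference copy of the data and reports any file that is missing from the restored tree or whose contents differ. The function names and the choice of SHA-256 are assumptions for the example, and it presumes the reference copy has not changed since the backup was taken.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir, restored_dir) -> list:
    """Compare every file under source_dir with its restored counterpart.

    Returns a list of (relative_path, problem) tuples, where problem is
    'missing' or 'mismatch'. An empty list means the restore matched.
    """
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            problems.append((rel.as_posix(), "missing"))
        elif file_digest(src) != file_digest(restored):
            problems.append((rel.as_posix(), "mismatch"))
    return problems
```

Timing a run of the full restore plus this check also gives you a real answer to "How long will it take?" rather than a guess.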

Along with testing recovery procedures, you must also test your communication procedures to make sure they'll actually work. If your star recovery person has a new, unrecorded cell phone number and is fishing for marlin off Key West, you're dead in the water.

Become an Uncompromising Enforcer of the Plan
Let there be no doubt about it: this is a heck of a lot of work. No one will really want to test the procedures, but you have to enforce them. IT staffers, end-users, CTOs, CIOs… it doesn't matter; no one likes spending time on backup and restore issues. It's like life insurance; we know we need it, but most of us want to spend as little time as possible dealing with it.

You can't do that with corporate backup and recovery, though. You must set up your plan, deploy it, and then constantly test and retest it. It's not just a good idea, it's a necessity.

About the Author
Steven J. Vaughan-Nichols (sjvn@vna1.com) has been writing about technology since 10MB PC hard drives were considered huge and 5.25" floppies were the PC's backup media of choice.