BACKUP SAFETY

by Wayne M. Krakau - Chicago Computer Guide, October 1992

This follow-up to my April column on backups was inspired by my recent encounter with yet another outrageously inadequate backup system. The hardware was perfectly good, a Colorado Memory Systems Jumbo (one of my favorites), but the batch files that automated its use were worse than useless.

The client's initial call (via a referral) was to discuss possible inadequacies in general. This led to a discussion of their worries about their backup system. As a courtesy, I offered to walk them through a verify (also called a compare) using their last backup tape. Before I got to the verify, I had them do a tape status request so I could compare the raw number of bytes saved with the CHKVOL (the Netware command for checking the space on a network volume) results we had just obtained.

The status check showed zero bytes saved! No verify was required. Later research showed that all of their tapes were empty. This meant that they hadn't had a valid backup for almost a year. I immediately talked them through a manual backup of their Netware volume. That backup included an automatic verify, and subsequently was proven valid. They now have a new batch file driving their backup procedure, and it works.
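The comparison that exposed the empty tapes is simple enough to write down. Here is a minimal Python sketch of it, assuming you have copied the two byte counts by hand from the tape status screen and from CHKVOL. The function name and the 10 percent tolerance are my own illustrative choices, not anything provided by Netware or the tape software, and the check assumes both figures are uncompressed byte counts.

```python
def backup_looks_plausible(bytes_on_tape: int, bytes_in_use: int,
                           tolerance: float = 0.10) -> bool:
    """Return True if the tape holds roughly as much data as the volume uses.

    bytes_on_tape -- bytes saved, as reported by the tape status request
    bytes_in_use  -- bytes in use on the volume, as reported by CHKVOL
    tolerance     -- hypothetical allowed shortfall (fraction of bytes_in_use)
    """
    if bytes_in_use <= 0:
        raise ValueError("bytes_in_use must be positive")
    if bytes_on_tape <= 0:
        # An empty tape is never a valid backup.
        return False
    shortfall = (bytes_in_use - bytes_on_tape) / bytes_in_use
    return shortfall <= tolerance
```

A tape showing zero bytes fails this check immediately, which is why no verify was needed in this client's case. A tape that passes still needs the verify, since a matching byte count says nothing about whether the data is readable.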

This episode leads to my first recommendation. Test your backup system. Any time that the hardware or software on the backup computer is changed, the backup system should be tested. If any significant changes are made to the network as a whole, the backup system should be tested. Even if no changes are made to either, tests should be scheduled on a regular basis with once a year being a likely interval.
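The rule for when to test can be sketched as a small Python function. This is only an illustration of the three triggers described above. The parameter names are hypothetical, and the 365-day default is simply the once-a-year interval suggested as likely, not a fixed requirement.

```python
from datetime import date, timedelta


def backup_test_due(last_test: date, today: date,
                    backup_machine_changed: bool,
                    network_changed: bool,
                    interval_days: int = 365) -> bool:
    """Decide whether the backup system should be tested now."""
    # Any hardware or software change on the backup computer, or any
    # significant change to the network, calls for a test right away.
    if backup_machine_changed or network_changed:
        return True
    # Otherwise, test once the regular interval has elapsed.
    return today - last_test >= timedelta(days=interval_days)
```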

These tests should consist of two parts. The first is manually running a verify (compare) operation. Verify results often include perfectly innocuous warning messages about various Netware system files. If someone has used the network after the backup was made but before the test, additional warning messages about changed files also could be generated. This is the perfect time to learn how to tell the difference between genuinely hazardous error messages and these normal warnings. That skill could eliminate a potentially billable but unneeded call for help in the future.

The second part of these tests is a test restore. This part is potentially dangerous, so you have to make an honest appraisal of your skill level before deciding whether to do it yourself or call for assistance. The verify is harmless enough that even a non-computer person can attempt it with little risk, but a restore can conceivably destroy data, so be frank in evaluating your own abilities.

A test restore could be done by creating extra directories and filling them with copies of word processing documents. Then use Netware's SYSCON utility to make some test users. Use the SYSCON or FILER utilities (or the GRANT command for masochists) to make several Trustee Assignments to these directories for your test users. If you have Netware 3.xx (that's the buzzword for 3.0 and above), you can make Trustee Assignments at the file level, too. Use the FILER utility or the FLAG command to assign various combinations of attributes to these test files. Finally, do either a full or partial backup, depending on time constraints.

After the backup is finished, delete all of the test users and the extra directories. Then try to restore them. After the restore, examine the directories, files, and test users to see that all rights and attributes were restored.
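The final examination is easier if you write down what you set up before the backup and compare it against what you find after the restore. Below is a minimal Python sketch of that comparison. It assumes you record the attributes and Trustee Assignments by hand (from FILER or SYSCON) into two dictionaries. The function name and the record layout are hypothetical, and nothing here reads from Netware itself.

```python
def restore_differences(before: dict, after: dict) -> list:
    """List every recorded item that is missing or different after a restore.

    Both arguments map an item name (a directory, file, or test user)
    to whatever was noted about it, such as its attributes or rights.
    """
    problems = []
    for name in sorted(before):
        if name not in after:
            problems.append(f"missing after restore: {name}")
        elif after[name] != before[name]:
            problems.append(
                f"changed after restore: {name}: "
                f"expected {before[name]!r}, found {after[name]!r}"
            )
    return problems
```

An empty list means that everything you recorded came back as it was. Anything else is a specific item to investigate before you trust the backup system with real data.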

Another tool in your backup arsenal is the BINDFIX command. Its normal use is to repair the bindery (the database of all Netware objects and their characteristics), but it also creates valuable backup files of its own. As the first step in attempting to fix the bindery, it creates a backup. The original files are NET$BIND.SYS and NET$BVAL.SYS for Netware 2.xx and NET$OBJ.SYS, NET$PROP.SYS, and NET$VAL.SYS for Netware 3.xx. They are flagged as SYSTEM-HIDDEN. The backup files carry the extension "OLD" in place of "SYS" and are automatically reflagged as READ-WRITE.

These backup files can be copied to a floppy disk and stored for emergencies. In the future, if you get a bindery error that can't be fixed by BINDFIX, rather than going through the time and trouble of running a restore, just copy the BINDFIX backup files back to the SYSTEM subdirectory and run BINDFIX's companion command, BINDREST. BINDREST deletes the existing (presumably bad) bindery files and replaces them with the old ones. The entire process takes about a minute!
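As a checklist of what belongs on that emergency floppy, here is a short Python sketch that lists the BINDFIX backup file names for each Netware version. It assumes that the "OLD" suffix replaces the "SYS" extension of each original file. The table and function names are hypothetical and exist only for illustration.

```python
# Original bindery files, by Netware version, as named in this column.
BINDERY_FILES = {
    "2.xx": ["NET$BIND.SYS", "NET$BVAL.SYS"],
    "3.xx": ["NET$OBJ.SYS", "NET$PROP.SYS", "NET$VAL.SYS"],
}


def bindfix_backup_names(version: str) -> list:
    """Return the names of the backup files BINDFIX leaves behind.

    Assumption: each backup keeps the base name and swaps SYS for OLD.
    """
    originals = BINDERY_FILES[version]
    return [name.rsplit(".", 1)[0] + ".OLD" for name in originals]
```

Calling bindfix_backup_names("3.xx") gives the three names to look for in the SYSTEM subdirectory after running BINDFIX on a Netware 3.xx server.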

Please note that both the BINDFIX and BINDREST commands are usually run while logged in as the SUPERVISOR with everyone else logged out. If someone else inadvertently accesses the bindery while these commands are running, even a perfectly good bindery can be corrupted.

The final backup tool is Netware's NBACKUP utility. While it is designed to be a complete backup utility, it is limited to using only DOS devices such as disk drives, or an obsolete (and quite rare) model of a Wangtek tape drive. This severely reduces its usefulness as a general-purpose backup instrument, but leaves one simple but effective capability. This utility can back up the bindery and the directory structure (along with its attendant Trustee Assignments) to a single floppy disk in just two to three minutes and, more importantly, can restore it in even less time.

The value of that last ability can be seen if we examine the process of repairing a network whose entire hard disk was trashed. This means we are stuck with either a new, empty disk or an old reformatted (and also empty) disk. The first step is reinstalling Netware. The second is to do two restores. Why two restores? Because the Trustee Assignments are dependent upon the bindery as well as the directory and file structure. It takes two passes using a slow (usually) tape drive to get it all straight.

NBACKUP provides an alternative. First, use NBACKUP on a regular basis to create special backup disks. This is done by first selecting a local drive for the log files. Then select "*" (Netware commands understand "*" as a synonym for "*.*") for directories to include and files to include. Leave the setting for directories to exclude at its default of blank (meaning none). Finally (and here's the sneaky part) set the files to exclude to "*". This will force NBACKUP to get the bindery, the directory structure, and the trustee assignments, but skip the files.
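To see why the "sneaky part" works, it helps to model the four settings as include and exclude patterns. The Python sketch below is only a model of that selection logic, not NBACKUP's actual code. The function names and the sample directory tree are hypothetical, and the bindery (which NBACKUP handles separately) is not modeled at all.

```python
from fnmatch import fnmatchcase


def matches(name: str, pattern: str) -> bool:
    """A blank pattern matches nothing; "*" matches every name."""
    return bool(pattern) and fnmatchcase(name, pattern)


def select_for_backup(tree: dict, include_dirs: str, exclude_dirs: str,
                      include_files: str, exclude_files: str):
    """Return (directories, files) that the four settings would select."""
    chosen_dirs, chosen_files = [], []
    for directory, files in tree.items():
        if not matches(directory, include_dirs) or matches(directory, exclude_dirs):
            continue
        chosen_dirs.append(directory)
        for file_name in files:
            if matches(file_name, include_files) and not matches(file_name, exclude_files):
                chosen_files.append(f"{directory}/{file_name}")
    return chosen_dirs, chosen_files


# Hypothetical volume contents, for illustration only.
sample_tree = {
    "DOCS": ["MEMO.DOC"],
    "DOCS/LETTERS": ["SMITH.DOC", "JONES.DOC"],
}

# The settings from the column: include everything, exclude all files.
dirs, files = select_for_backup(sample_tree, "*", "", "*", "*")
```

With these settings every directory is selected while the exclusion of "*" removes every file, so the backup carries the structure without the contents.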

When recovering from the disaster, first run NBACKUP to get back all three at once. Then make a single pass (restore) with your backup tape. With all but the fastest tape drives, you will save a lot of down time with this method. Its only real limitation is its required skill level. NBACKUP is not automatable and its interface is kind of obtuse. I only recommend it to clients who have someone at an appropriate level of computer knowledge (as well as the spare time to do this manual task).
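The arithmetic behind the savings can be sketched in a few lines of Python. The function names are hypothetical. The three-minute default is an upper bound taken from the figure above (a two to three minute backup, with the restore taking even less), and the length of a tape pass is something you would have to measure on your own system. Reinstalling Netware is common to both methods, so it is left out.

```python
def two_pass_minutes(tape_pass_minutes: float) -> float:
    """Recovery time using two full restore passes from tape."""
    return 2 * tape_pass_minutes


def nbackup_method_minutes(tape_pass_minutes: float,
                           nbackup_restore_minutes: float = 3.0) -> float:
    """Recovery time using one NBACKUP restore plus one tape pass."""
    return nbackup_restore_minutes + tape_pass_minutes


def minutes_saved(tape_pass_minutes: float,
                  nbackup_restore_minutes: float = 3.0) -> float:
    """Down time saved by the NBACKUP method over the two-pass method."""
    return (two_pass_minutes(tape_pass_minutes)
            - nbackup_method_minutes(tape_pass_minutes, nbackup_restore_minutes))
```

The saving works out to one tape pass minus the few minutes NBACKUP takes, so the slower the tape drive, the more the method is worth.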

These recommendations can provide a safer environment in which to run your network. The real key to all of them is planning.

©1992, Wayne M. Krakau