Tales from the flying disk doctor - I win one, I lose one

Copyright Dr Alan Solomon, 1986-1995

Some databases have intricate, delicate file structures.  PFS:file is
one of those - the data (which isn't in ASCII) is stored in the same
file as the pointers, and it is stored as a linked-list, which means
that if the slightest thing goes wrong, the whole database is wrecked,
and it is an appalling job to reconstruct.  Fortunately, I've reverse
engineered the structure, but it still isn't easy.  Most databases
store the data in one file, and indexes and pointers in another.
Then, if the file is damaged, it is easier to sort out, as all you
have to do is sort out the data file, and then use the database to
re-index it.  PC-File+ is one of the nice ones, so when Michael phoned
me up and described the symptoms of a damaged database, I thought it
would be a piece of cake.  When he turned up with his Amstrad and his
backup, I thought again.  The wretched database was 8 megabytes long.
All it was, was a list of names and addresses, and when I hexdumped
the file, I could see that 95% of it was spaces.  But big files are a
pain to handle.  He also seemed to think that I should do the same
repair job on last month's backup.  Ugh.

He did his backup onto floppy disks, using PC-File's "compress"
utility.  That has to be the slowest, most inefficient file compressor
I've ever seen.  It took ages to uncompress his backups, and it must
have taken him ages to make each backup.  I introduced him to PKARC,
which is small, quick and far better.

When I looked at his file, I could see the trouble - there were 52
surplus bytes in the first record.  Don't ask me how they got there;
I think it might even have been a bug in his backup system, as one
wrong byte in a compressed file means lots of wrongness in the
expanded file.  So - here's the question.  What program will neatly
slice 52 bytes out of the middle of a file?  Norton certainly won't,
and I don't know any other disk repair tools that will.  So I wrote my
own;  I call it Dbrepair, and it has proved very handy.  It does a few
other things too, like recognise when the records have got out of
synch, and bring them back into line.
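
For anyone who wants to try the same trick, here is a minimal sketch
in Python of the byte-slicing part only.  It is not Dbrepair; the
function name cut_bytes, its parameters and the chunk size are all my
own assumptions for illustration.  It copies the file in chunks, so a
large file never has to fit in memory, and it writes to a new file so
the damaged original is left untouched.

```python
import os
import shutil


def cut_bytes(src_path, dst_path, offset, count, chunk_size=64 * 1024):
    """Copy src_path to dst_path, leaving out `count` bytes starting at `offset`.

    Hypothetical helper for illustration; the names are assumptions.
    """
    size = os.path.getsize(src_path)
    if offset < 0 or count < 0 or offset + count > size:
        raise ValueError("cut range lies outside the file")

    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Copy everything that comes before the surplus bytes.
        remaining = offset
        while remaining > 0:
            chunk = src.read(min(chunk_size, remaining))
            if not chunk:
                break
            dst.write(chunk)
            remaining -= len(chunk)

        # Jump over the surplus bytes, then copy the rest unchanged.
        src.seek(offset + count)
        shutil.copyfileobj(src, dst, chunk_size)
```

You would call it with the offset where the surplus bytes start and
the number of bytes to drop, for example cut_bytes("damaged.dat",
"fixed.dat", offset, 52), where both file names are made up and the
offset is whatever the hexdump shows.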

So I ran Dbrepair on his two files, and it worked a treat.  I love
doing databases full of names and addresses;  some people have such
funny names.  I put the data back onto diskettes for him and off he
went happily.  He'll run out of hard disk space in a year or so, and
before then, his database will crowd everything else off his disk.
Still, that's his problem, or at least, it's his problem until it
happens, because I told him it would, and I told him I'd fix that for
him too when it happens.

Manny didn't have such a happy ending.  Manny has been trashed by his
tape streamer, and Manny is hopping mad.  Manny has been doing backups
onto tape every day now for eighteen months.  He's been rotating his
backups, and taking one out for storage every so often, and keeping it
off site.  Manny's been bitten before, and he knows how to do a backup
system.  So what happened to Manny?

Well, Manny is no fool, and he's never actually needed to use his
backup before.  The only time he ever did a restore was when he
installed the tape, just to check that it really was storing his data,
and wasn't just showing him a directory.  Manny knows where his towel
is (and where his data is).

He deleted the wrong file by mistake, the first time ever.  No
problem, he thought, I'll restore the backup.  This backup was very
current, so he just told it to restore the whole hard disk, rather
than mess about working out the syntax for restoring one file.  After
the restore, half his software didn't work, and most of his data was
naff.

I had a look.  It took me quite some time to understand that disk, but
eventually I realised that every cluster up to 1834 was fine, and
every cluster after that point was wrong.  It looked as if it had been
slid forward by one cluster, so the directory was pointing at the
wrong data.  But it wasn't as simple as that, otherwise I could just
have slid all the data back again.  If a file only needed the first x
bytes of a cluster, the rest of that cluster was just garbage.  That
would be OK for normal use, but it meant that I couldn't use the
sliding trick.

I installed the tape drive in my IBM, and did a tape directory -
everything looked fine.  Then I restored the backup.  Everything went
well, until the tape reached a file called LPT2, and then DOS said
"Error writing to device LPT2, Abort, Retry or Ignore?".  You could
see what had happened - there was a file on the disk called LPT2, and
the tape software had read it and written it to the tape, but when the
tape software read it from the tape and tried to write it via DOS, DOS
thought it should be sent to the printer.  So, should I abort, retry
or ignore?  I consulted the apprentices (I always get them to think
things through, before telling them the right answer) and they said
that Abort was out, because then the program would just stop.
Correct.  They said that we'd have to Ignore that file and continue.
Wrong.  That's what the last person to do a restore did, and it didn't
work.  I hooked up a printer on the second parallel port, and said
Retry.  A file got sent to the printer, and the tape restore
continued.

Nice try, but no coconut.  The files were still one cluster out of
true, with the end-of-file garbage ensuring that the obvious
cluster-slide wouldn't work.

LPT2 is not usually possible as a file name - you try creating it, and
you'll see what I mean.  But it is just possible - Manny did it, and I
had no trouble re-creating it.  It would be barely acceptable if the
tape drive lost just that file, but to corrupt all the data after
that?  Nice one, Cyril.
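
If you want to look for such names before a backup bites you, here is
a rough sketch in Python.  The list of reserved names is an assumption
taken from common DOS documentation (CON, PRN, AUX, NUL, CLOCK$,
COM1 to COM4 and LPT1 to LPT3), not from the tape software in this
story, and the function names are my own.  It also assumes that DOS
ignores the extension, so that LPT2.TXT counts as the device too.

```python
import os

# Device names that DOS reserves.  This list is an assumption based on
# common DOS documentation, not on any particular backup program.
RESERVED = {"CON", "PRN", "AUX", "NUL", "CLOCK$"}
RESERVED |= {f"COM{n}" for n in range(1, 5)}
RESERVED |= {f"LPT{n}" for n in range(1, 4)}


def is_device_name(filename):
    """True if the part before the first dot matches a reserved device name."""
    stem = filename.split(".", 1)[0].strip().upper()
    return stem in RESERVED


def find_device_names(root):
    """Walk a directory tree and return paths whose names clash with devices."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if is_device_name(name):
                hits.append(os.path.join(dirpath, name))
    return hits
```

Running find_device_names over the top of a disk gives a list of
suspects to rename before the next backup.  An empty list only means
that none of the names assumed above turned up.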

At that point, I knew I needed to know how to control that tape drive
myself, without using their software.  I tried several other unlikely
ideas, but none of them worked, and the next day, I rang the UK
distributor to get some technical information on the drive, like how
do you read it.  It only took me ten minutes to convince him that the
standard tape software wasn't what I needed, and then he told me he
hadn't the foggiest idea how to get what I wanted.  So I phoned the
company that makes the drive, in the US.  They told me that all the
support was done by another company.

So I phoned this other company, and they told me that they didn't have
any technical information, as they were just badge engineering the
drives from a third company.  So I phoned the third company, and the techie
told me that as I'd got the drive from the first company, I had to get
my support from them also.  I blew up at him, told him the cost of
transatlantic phone calls, told him my customer was looking for
someone to sue for losing his data, told him that I knew it wasn't his
fault personally, and asked him nicely if I could have a copy of the
manuals (which was all I actually wanted).  Yes, he said.

Unfortunately, it takes a long time for manuals to cross the Atlantic,
and I can't actually guarantee to Manny that I'll get his data back,
because it could be that the data was written already corrupt on the
tape, although I rather think that the tape is OK, and it's only the
restore process that is failing.  And meanwhile, Manny has a business
to run, so he's hired a couple of people to type in all his data from
his paper records.

So, I guess you're all waiting for me to tell you the name of the
offending hardware?  I'm not going to, for several reasons.  First, I
don't want to get sued, because what I'm saying here is that their
tape drive is actually worse than useless - Manny would be all right
today if he hadn't bought it.  And I can't prove what I say, unless I
buy one as evidence, and I'm blowed if I'll do that.  But more
importantly, I don't want you to know the name, because a lot of other
people are rebadging the same drive, and I bet the same problem exists
on lots of other brand tape drives.  All I can suggest is that you
check out the one you have, because otherwise, one day, you'll create
a file called LPT2 (or several other names) and then, when you
restore, you'll lose all your data.  And I mean all.