The programmer and his craft

Copyright Dr Alan Solomon, 1986-1995

Programming is a Craft.  It isn't a science, as there is too much
subjective judgment in it.  Nor is it an art, as there can be
objective grounds for saying "This program doesn't work".  It is a
Craft, reminiscent of the older crafts such as tanner, blacksmith or
tailor.

Some programmers work in Pascal, C or Basic.  Some work in Assembler;
they are usually regarded as the Master Craftsmen.  Others work in
dBase or Symphony macros - they are the journeymen.  But all
programmers, from the greenest Basic apprentice right up to the
wizards who write the microcode for the processor chips, are members
of the Craft, and even a Basic programmer can be a Master Craftsman.

First, let us watch a Master Craftsman at work.  His desk is very
untidy;  papers and printouts everywhere.  Manuals are open at various
places, with pencils holding them open at other places.  There's a
computer on his desk;  possibly two.  He doesn't seem to be doing
anything - occasionally he grunts a bit, and looks something up in the
manual.  Sometimes, he turns to another page in his printout, but
mostly he drinks coffee and seems to be elsewhere.  He doesn't seem to
be aware of what is going on around him, but his colleagues are well
aware of what he is doing, so they never address him by name.  Anything
else he can ignore, but he mustn't be spoken to directly.  So what is
he doing;  what is going on in his mind?

He's debugging.  His program doesn't do exactly what he intended.  He
hasn't made a mistake - that would be an oversimplification.  What's
happened is, he hasn't clarified his thoughts precisely enough, hasn't
thought of all the ramifications of his code.  He can't see exactly
what the computer is doing - all he knows is that it isn't what he
intended.  What he's trying to do is understand what it is that he
hasn't done.  He can see the output of the process, but he can't see
at what point the thing isn't working;  where the computer starts
doing what he said instead of what he intended.

So, he has in front of him a complete listing of his program, a
printout of the inputs and the outputs, a manual for the language and
a manual for the computer.  He's building a complex model in his mind
of the part of the program that he thinks is the faulty part;  he's
understanding what the computer is actually doing as it moves through
that code - what is happening to the data, to the machine state and to
his variables.  This model can take up to an hour to construct, and
requires a sustained effort of concentration.  If anyone breaks in,
even just to offer him more coffee, the whole house of cards could
collapse, and he'll have to start again.  Indeed, if there is a real
danger of such an interruption, he probably wouldn't be able to
achieve the necessary concentration.

Sometimes he'll seek help.  Explaining the problem to a fellow
programmer often helps, even if the other person isn't listening
properly.  Often, the explanation will suddenly tail off, as the
programmer sees what the problem is, and hurries off to fix it.

Sometimes, he can't see it, and tries making various changes in the
hope that one of them will work.  Or else he puts "debugging
statements" into the program, to show him what is happening at various
points.
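
As a rough illustration of such debugging statements, here is a
minimal sketch in Python (a language this article does not otherwise
discuss).  The routine running_total and its debug switch are
hypothetical names invented for the example;  the point is only that
the extra print line reports the state of the variables at a chosen
point, and can be switched off again once the fault is found.

```python
import sys


def running_total(values, debug=False):
    """Return the list of cumulative sums of values."""
    total = 0
    sums = []
    for index, value in enumerate(values):
        total += value
        if debug:
            # The debugging statement: show the state at this point.
            # It goes to stderr so it does not mix with real output.
            print(f"DEBUG step {index}: value={value} total={total}",
                  file=sys.stderr)
        sums.append(total)
    return sums


if __name__ == "__main__":
    print(running_total([1, 2, 3], debug=True))
```

The result is the same with or without the switch;  only the amount
of commentary on the way changes.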

Suddenly, he'll shout "AAAHHHH", a great, formless Eureka shout, as he
sees what is going wrong.  Once identified, the fault is usually easy
to fix;  he reruns the program, and it's on to the next problem.

These moments can be quite exhilarating;  there's a real feeling of
achievement in cracking a tough problem, and in programming, you know
when you've cracked it.  But the Craft of the programmer lies in
avoiding these moments.

It takes a long time to find a deeply embedded bug, and the process
can be exhausting.  After a major session rooting out something
particularly obscure, you might not be able to do anything else for
the rest of the day.  The Craft consists of the skill not to put the
bugs into the program in the first place.  Most Master Craftsmen have
evolved a way of doing this.

I'm not talking about a trivial two-day program here;  it probably
doesn't much matter how you go about that.  I'm talking about a major
work, what might be called a Master Piece (named after the piece of
work that would earn a Journeyman his Mastership).

The first thing to do is to work out what you want the program to do.
List the inputs and the outputs;  think about the size of the data and
the program, and how fast it will run.  Think about how data will be
fed in, and how it will come out.  I usually let the problem sit in my
mind for a few weeks, while I'm working on something else.

The next task is to choose the software and hardware.  What language
will you code in?  Assembler gives compactness and speed, but coding
in it is very slow.  Basic if you must, but much better is Turbo, C or
even Fortran.  Perhaps you should use an even higher level language like
dBase or Prolog, sacrificing run-time speed for speed of coding.  You
should also consider other higher level languages, such as Wizard.

The next thing to do is to design your data structure.  It is worth
doing this before you begin to write the program - it makes it much
easier.  The language you choose may limit the choice of data
structure.  For example, I recently wrote a major program in Turbo,
and wanted to use four-byte reals for storage of the database, and
eight-byte reals for writing a fake 123 spreadsheet.  But Turbo uses
six-byte reals, unless you use Turbo-87, which uses eight-byte reals.
This situation illustrates the way you frequently have to make design
decisions, trading off one factor for another.  I decided not to use
four-byte reals for storage, as this would have meant abandoning Turbo
(which would have made other parts of the program much harder to
write) or else writing a routine to translate four-byte to eight-byte
reals and back again, which would have slowed the program down.
Similarly, translating between Turbo's six-byte reals and 123's
eight-byte reals would have been slow, so the final decision was to
store all the data as eight-byte reals, and accept the space penalty.
Trading space against time is one of the most frequent design
decisions that have to be made - you can either go by experience, or
else do some timing trials.  When the program is finished, I'll
probably go back and write a routine for converting between four and
eight-byte reals, and do some timing trials on it, comparing the extra
time taken to do this with the time saved by needing to read fewer
bytes.
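
The kind of timing trial described above might look something like
the following sketch.  It is written in Python rather than Turbo, and
it is an assumption-laden stand-in, not the routine from the program
discussed:  the helper names pack_reals and unpack_reals, the sample
data and the repeat count are all invented for illustration.  It uses
the standard struct module, where the format code "f" is a four-byte
real and "d" is an eight-byte real, and it measures only the
conversion cost in memory, not the time spent reading from disk.

```python
import struct
import timeit

# Hypothetical sample data: one thousand reals.
values = [i / 7.0 for i in range(1000)]


def pack_reals(data, code):
    """Pack data as little-endian reals; 'f' = 4 bytes, 'd' = 8 bytes."""
    return struct.pack(f"<{len(data)}{code}", *data)


def unpack_reals(blob, code):
    """Reverse of pack_reals: turn the bytes back into a list of reals."""
    size = struct.calcsize("<" + code)
    return list(struct.unpack(f"<{len(blob) // size}{code}", blob))


four = pack_reals(values, "f")
eight = pack_reals(values, "d")

# The space side of the trade-off.
print(len(four), len(eight))  # prints: 4000 8000

# The price of four-byte storage: some precision is lost on the way.
worst = max(abs(a - b) for a, b in zip(values, unpack_reals(four, "f")))
print("worst four-byte round-trip error:", worst)

# The time side: a crude trial of the conversion cost alone.
t_four = timeit.timeit(lambda: unpack_reals(four, "f"), number=200)
t_eight = timeit.timeit(lambda: unpack_reals(eight, "d"), number=200)
print("four-byte unpack:", t_four, "eight-byte unpack:", t_eight)
```

The timings will differ from machine to machine, which is rather the
point of running the trial instead of guessing.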

With the overall design of the program complete, you write it down,
either as a flowchart, or more often as pseudocode (pseudocode is
something in between written English and a formal computer language).
Then you begin to code.

Opinions differ about whether to start at the top and work down, or at
the bottom and work up, so it is probably just as good either way.  I
usually do both at once.  With the bottom-up approach, it's pretty
clear that you're going to need some routines in all sorts of places,
so you can write those first and get them working.  Then you can write
the routines that use these, then the higher level routines, and so on
till you write the main program, usually as just a series of
subroutine calls and a few simple loops.  With the top-down approach,
you start writing the main program.  If anything looks difficult or
complex, you just call a subroutine that handles it.  When the main
program is done, you start writing the subroutines that you skipped
over before.  Again, anything that interrupts the smooth flow of code
is given its own subroutine, and so on.  Eventually, the last
subroutine is written.
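
To make the shape of the result concrete, here is a small sketch in
Python.  The task (totalling amounts per name from "name,amount"
lines) and every routine name are hypothetical, chosen only for the
example.  Whichever direction you work in, the main program ends up
as a short series of subroutine calls;  working top-down, main would
be written first and the three routines above it filled in later.

```python
def read_records(lines):
    """Parse 'name,amount' lines into (name, amount) pairs."""
    records = []
    for line in lines:
        name, amount = line.split(",")
        records.append((name.strip(), float(amount)))
    return records


def summarise(records):
    """Total the amounts for each name."""
    totals = {}
    for name, amount in records:
        totals[name] = totals.get(name, 0.0) + amount
    return totals


def format_report(totals):
    """Turn the totals into printable lines, sorted by name."""
    return [f"{name}: {totals[name]:.2f}" for name in sorted(totals)]


def main(lines):
    # The main program: just a series of subroutine calls.
    records = read_records(lines)
    totals = summarise(records)
    return format_report(totals)


if __name__ == "__main__":
    for row in main(["ann, 1.5", "bob, 2", "ann, 2.5"]):
        print(row)
```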

Both methods apply the divide-and-conquer strategy to programming,
and this is the only strategy that works for big programs.  Once the
problem has been divided into sub-problems, you can solve each smaller
problem more easily.  But an essential requirement of this approach is
that problems, once solved, do not become unsolved by the solution to
other problems.  This is what ultimately led to the downfall of
Napoleon, and to many a programmer since then.  It is crucial to
isolate your subroutines from each other, and allow them to
communicate via well-controlled routes.  Otherwise changing something
in one routine can stop another one from working.  The major problem
with Microsoft Basica and GWBasic is that they do not lend themselves
to this divide-and-conquer strategy, which is why Master Programmers
do not use Basic unless the client insists (and even then, I would
probably refuse, on the same grounds that a Master Armourer would
refuse to make a sword out of cast iron).
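
A small sketch of what isolation means in practice, again in Python
and with invented names throughout (scratch, load_coupled,
largest_coupled and largest are all hypothetical).  In the first
version two routines share a global list, so one of them quietly
rearranges data the rest of the program relies on.  In the second,
the routine communicates only through its argument and its return
value.

```python
# Coupled version: both routines work on the shared global 'scratch'.
scratch = []


def load_coupled(values):
    """Copy values into the shared global list."""
    scratch.clear()
    scratch.extend(values)


def largest_coupled():
    """Find the largest item, reordering the shared list as a side effect."""
    scratch.sort()
    return scratch[-1]


# Isolated version: data comes in as an argument, the answer goes out
# as a return value, and nothing else is touched.
def largest(values):
    """Return the largest item without modifying values."""
    return max(values)


if __name__ == "__main__":
    load_coupled([3, 1, 2])
    print(largest_coupled(), scratch)  # prints: 3 [1, 2, 3]

    data = [3, 1, 2]
    print(largest(data), data)  # prints: 3 [3, 1, 2]
```

Any other routine that depended on the original order of scratch is
now broken, although nobody changed a line of it.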

Another trick that all Master Programmers use is to recycle their
code.  Some of my routines appear in dozens of completely different
programs;  once you've got a routine fully debugged, you use it
whenever a program calls for anything like it.  Either you modify it
to do the new job, or else you make it more general, so it will be
more useful in future.  Every Master Craftsman makes tools for his own
use;  these subroutines are ours.
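
As an illustration of making a routine more general, consider a
hypothetical right_justify routine (the name and parameters are
invented for this sketch, in Python).  Suppose it was first written
to pad numbers with spaces for one report.  Giving it a fill
parameter with a default lets the old callers carry on unchanged,
while a new program can ask for leading zeros instead.

```python
def right_justify(text, width, fill=" "):
    """Pad text on the left to width characters; never truncates."""
    text = str(text)
    if len(fill) != 1:
        raise ValueError("fill must be a single character")
    return fill * max(0, width - len(text)) + text


if __name__ == "__main__":
    print(repr(right_justify("42", 5)))     # prints: '   42'
    print(repr(right_justify(42, 5, "0")))  # prints: '00042'
```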

But we also need to buy tools.  Very few people would want to write
their own compiler, so we buy it in.  Money is no object here - the
cost of struggling with an inferior compiler is immense.  My favourite
compilers are Turbo, Microsoft Fortran and the Microsoft Assembler.  I
don't speak C, but if I did, I'd get the Microsoft C compiler;  partly
for compatibility with the other Microsoft compilers I use (I also
have their Pascal) and partly because I've heard good things about the
Codeview debugger in version 4.  But I now do nearly everything in
Turbo.  My Basic compiler is also Microsoft, as a program compiled
with that will work on any MS-DOS computer (unlike the IBM Basic
compiler).  You can also run Codeview with Microsoft's Fortran
compiler, version 4 - I've tried it and it looks very nice.

You can also buy libraries of subroutines - C programmers are forced
to do this because of the incomplete nature of the language.  But it
can often be worthwhile for Pascal, Fortran or even Basic programmers
to buy a library rather than re-invent the wheel.  Try to get
libraries in source code format if possible, as then you can modify
and customise them.

We all use other tools - most especially an editor.  I use the IBM
Professional editor, because at the time I bought my editor, it was
cheap and good.  Now it is probably inferior to other, newer editors,
but I stick to it because it works well, and my fingers know it
backwards.

I also use a collection of Public Domain programming tools, available
from the User Group.  NEWDOSED is my favourite, as it lets me go back
through previous commands and edit them;  another good one is FASTKEYS,
which speeds up moving the cursor on the screen.  There are various
versions of GREP floating around - vital for finding all mentions of
a variable.  MAKE is something that you'll want to use if you are
developing major programs - since using the Microsoft MAKE I can't
imagine being without it (this MAKE isn't PD, of course, but there
is probably a PD MAKE).
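
For readers who have not met GREP, the job it does when hunting for a
variable can be sketched in a few lines of Python.  This is not any
particular GREP, only a hypothetical find_mentions routine using the
standard re module;  it matches the name as a whole word, so a search
for "total" does not also report "subtotal".

```python
import re


def find_mentions(name, lines):
    """Return (line_number, text) for each line using name as a whole word."""
    pattern = re.compile(r"\b" + re.escape(name) + r"\b")
    return [(number, text)
            for number, text in enumerate(lines, start=1)
            if pattern.search(text)]


if __name__ == "__main__":
    source = [
        "total := 0;",
        "subtotal := 1;",
        "total := total + subtotal;",
    ]
    for number, text in find_mentions("total", source):
        print(number, text)
```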

But the greatest tool used by the Master Programmer is his own brain;
his ability to reason logically and his power of concentration.
Without these, the most you can aspire to is Journeyman.